public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Michael J Gruber <git@drmicha•warpmail.net>
To: Jeff King <peff@peff•net>
Cc: git@vger•kernel.org, Mislav Marohnic <mislav@github•com>
Subject: Re: [RFH] eol=lf on existing mixed line-ending files
Date: Fri, 08 Apr 2011 11:36:20 +0200	[thread overview]
Message-ID: <4D9ED714.80307@drmicha.warpmail.net> (raw)
In-Reply-To: <20110407231556.GA10868@sigill.intra.peff.net>

Jeff King venit, vidit, dixit 08.04.2011 01:15:
> I investigated some odd git behavior with the EOL gitattributes today,
> and I'm curious to hear what others on the list think of what git does.
> In particular, index raciness means git produces non-deterministic
> results in this case.
> 
> The repo in question has a gitattributes file with "* crlf=input" (which
> we would spell "eol=lf" these days, but the results are the same), but
> still contains some files with mixed line endings. Which you can
> reproduce with:
> 
>   git init repo &&
>   cd repo &&
>   {
>     printf 'one\n' &&
>     printf 'two\r\n'
>   } >mixed &&
>   git add mixed &&
>   git commit -m one &&
>   echo '* eol=lf' >.gitattributes
> 
> Now if we run "git status" or "git diff", it will let us know that
> "mixed" is modified, insofar as adding and committing it would perform
> the LF conversion.
> 
> Now we come to the first confusing behavior. Generally one would expect
> the working directory to be clean after a "git reset --hard". But not
> here:
> 
>   git reset --hard &&
>   git status
> 
> will still show "mixed" as modified. Because of course we are checking
> out the version from HEAD into the index and working tree, which has the
> mixed line endings. So we rewrite the identical file.
> 
> So that kind of makes sense. But it isn't all that helpful, if I just
> want to reset my working tree to something sane without making a new
> commit (more on this later).
> 
> But here's an extra helping of confusion on top. Every once in a while,
> doing the reset _won't_ keep "mixed" as modified. I can trigger it
> reliably by inserting an extra sleep into git:
> 
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 500ebcf..735b13e 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -223,6 +223,7 @@ static int check_updates(struct unpack_trees_options *o)
>  		}
>  	}
>  	stop_progress(&progress);
> +	sleep(1);
>  	if (o->update)
>  		git_attr_set_direction(GIT_ATTR_CHECKIN, NULL);
>  	return errs != 0;
> 
> That puts a delay between when reset writes the "mixed" file, and when
> we write out the refreshed index. So next time we look at the index
> (e.g., in "status"), we will see that the "mixed" entry has up-to-date
> stat information and not look at its actual contents.
> 
> But in the original case (without the sleep), that doesn't happen.
> There, we usually end up writing the file and the index in the same
> second. So when status looks at the index, the "mixed" entry is racily
> clean, and we actually check it again.
> 
> So we get two different outcomes, depending on the index raciness. Which
> one is right, or is it right for it to be non-deterministic?
> 
> And one final question. Let's say I don't immediately convert this mixed
> file to the correct line-endings. Instead, it persists over a large
> number of commits, some of them even changing the "mixed" file but not
> fixing the line endings[1]. We can simulate that with:
> 
>   mv .gitattributes tmp
>   echo three >>mixed &&
>   git commit -a -m three &&
>   mv tmp .gitattributes
> 
> Now imagine I am somebody who has cloned this repo; the clone will tend
> to end the race condition in the "clean" state, since it will often take
> more than 1 second to write out all of the files (at least for a
> normal-sized project). We can simulate using our sleep-patched reset:
> 
>   git reset --hard
> 
> to get a "clean" repo. Now let's say I want to explore old history, so I
> go to a detached HEAD, but using normal git, not the sleep-patched one:
> 
>   git checkout HEAD^
> 
> And, of course, now we think "mixed" is modified. After I'm done
> exploring, I want to go back to "master", but I can't:
> 
>   $ git checkout master
>   error: Your local changes to the following files would be overwritten by checkout:
>           mixed
> 
> What is the best way out of this situation? You can't use "reset --hard"
> to fix the working tree. I guess "git checkout -f" is the best option.
> 
> Hopefully my example made sense and was reproducible. The real repo
> which triggered this puzzle was jquery. You can try:
> 
>   git clone git://github.com/jquery/jquery.git &&
>   cd jquery &&
>   git checkout 1.4.2 &&
>   git checkout master
> 
> which will fail (but may succeed racily on a slow enough machine).
> Obviously they need to fix the mixed line-ending files in their repo.
> But that fix would be on HEAD, and "git checkout 1.4.2" will be forever
> broken. Is there a way to fix that?
> 
> -Peff
> 
> [1] The one thing still puzzling me about the jquery repo is how they
> managed to make so many commits (including ones to mixed line ending
> files) without seeing the dirty working tree state and committing it. Is
> there some combination of config that makes this not happen?

When did they introduce the .gitattributes file?
Also, maybe they're jgit users.

Michael

  reply	other threads:[~2011-04-08  9:36 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-07 23:15 [RFH] eol=lf on existing mixed line-ending files Jeff King
2011-04-08  9:36 ` Michael J Gruber [this message]
2011-04-08 16:06   ` Jeff King
2011-04-09 18:58 ` Dmitry Potapov
2011-04-09 19:32   ` Jeff King
2011-04-09 20:09     ` Dmitry Potapov
2011-04-12 13:57 ` Jay Soffian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D9ED714.80307@drmicha.warpmail.net \
    --to=git@drmicha$(echo .)warpmail.net \
    --cc=git@vger$(echo .)kernel.org \
    --cc=mislav@github$(echo .)com \
    --cc=peff@peff$(echo .)net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox