From: Michael J Gruber <git@drmicha•warpmail.net>
To: Jeff King <peff@peff•net>
Cc: git@vger•kernel.org, Mislav Marohnic <mislav@github•com>
Subject: Re: [RFH] eol=lf on existing mixed line-ending files
Date: Fri, 08 Apr 2011 11:36:20 +0200
Message-ID: <4D9ED714.80307@drmicha.warpmail.net>
In-Reply-To: <20110407231556.GA10868@sigill.intra.peff.net>

Jeff King venit, vidit, dixit 08.04.2011 01:15:
> I investigated some odd git behavior with the EOL gitattributes today,
> and I'm curious to hear what others on the list think of what git does.
> In particular, index raciness means git produces non-deterministic
> results in this case.
>
> The repo in question has a gitattributes file with "* crlf=input" (which
> we would spell "eol=lf" these days, but the results are the same), but
> still contains some files with mixed line endings. You can reproduce
> that with:
>
>   git init repo &&
>   cd repo &&
>   {
>     printf 'one\n' &&
>     printf 'two\r\n'
>   } >mixed &&
>   git add mixed &&
>   git commit -m one &&
>   echo '* eol=lf' >.gitattributes
>
> Now if we run "git status" or "git diff", it will let us know that
> "mixed" is modified, insofar as adding and committing it would perform
> the LF conversion.
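
To see what git has actually recorded for "mixed", it can help to look at
the line endings in the index next to the attribute that applies. A minimal
sketch, assuming a git new enough to have "git ls-files --eol" (it postdates
this thread); the temporary directory and the example identity are
placeholders that only make the snippet self-contained:

```shell
# Recreate the example repo in a throwaway directory.
repo=$(mktemp -d) &&
cd "$repo" &&
git init -q . &&
git config core.autocrlf false &&
git config user.name example &&
git config user.email example@example.com &&
{
	printf 'one\n' &&
	printf 'two\r\n'
} >mixed &&
git add mixed &&
git commit -q -m one &&
echo '* eol=lf' >.gitattributes &&

# "i/..." describes the blob in the index, "w/..." the working tree file.
eol_info=$(git ls-files --eol mixed) &&
echo "$eol_info" &&

# Show which eol attribute git applies to the path.
eol_attr=$(git check-attr eol -- mixed) &&
echo "$eol_attr"
```

The index column is independent of stat data, so unlike "git status" it
should not be affected by the raciness discussed further down.
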
>
> Now we come to the first confusing behavior. Generally one would expect
> the working directory to be clean after a "git reset --hard". But not
> here:
>
>   git reset --hard &&
>   git status
>
> will still show "mixed" as modified. Because of course we are checking
> out the version from HEAD into the index and working tree, which has the
> mixed line endings. So we rewrite the identical file.
>
> So that kind of makes sense. But it isn't all that helpful, if I just
> want to reset my working tree to something sane without making a new
> commit (more on this later).
>
> But here's an extra helping of confusion on top. Every once in a while,
> doing the reset _won't_ keep "mixed" as modified. I can trigger it
> reliably by inserting an extra sleep into git:
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 500ebcf..735b13e 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -223,6 +223,7 @@ static int check_updates(struct unpack_trees_options *o)
>  		}
>  	}
>  	stop_progress(&progress);
> +	sleep(1);
>  	if (o->update)
>  		git_attr_set_direction(GIT_ATTR_CHECKIN, NULL);
>  	return errs != 0;
>
> That puts a delay between when reset writes the "mixed" file, and when
> we write out the refreshed index. So next time we look at the index
> (e.g., in "status"), we will see that the "mixed" entry has up-to-date
> stat information and not look at its actual contents.
>
> But in the original case (without the sleep), that doesn't happen.
> There, we usually end up writing the file and the index in the same
> second. So when status looks at the index, the "mixed" entry is racily
> clean, and we actually check it again.
>
> So we get two different outcomes, depending on the index raciness. Which
> one is right, or is it right for it to be non-deterministic?
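
When experimenting, one way to take the timing out of the picture is to make
the cached stat information useless on purpose, so that git has to compare
contents. A rough sketch, assuming that touching the file after the reset is
enough to force that comparison; it says nothing about which outcome is
"right", and the temporary directory and identity are placeholders:

```shell
# Throwaway copy of the example repo.
repo=$(mktemp -d) &&
cd "$repo" &&
git init -q . &&
git config core.autocrlf false &&
git config user.name example &&
git config user.email example@example.com &&
printf 'one\ntwo\r\n' >mixed &&
git add mixed &&
git commit -q -m one &&
echo '* eol=lf' >.gitattributes &&
git reset -q --hard &&

# Bump the mtime so the index entry can no longer be trusted from
# stat data alone; git then runs the LF conversion and compares.
touch mixed &&
st=$(git status --porcelain -- mixed) &&
echo "$st"
```
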
>
> And one final question. Let's say I don't immediately convert this mixed
> file to the correct line-endings. Instead, it persists over a large
> number of commits, some of them even changing the "mixed" file but not
> fixing the line endings[1]. We can simulate that with:
>
>   mv .gitattributes tmp &&
>   echo three >>mixed &&
>   git commit -a -m three &&
>   mv tmp .gitattributes
>
> Now imagine I am somebody who has cloned this repo; the clone will tend
> to end the race condition in the "clean" state, since it will often take
> more than 1 second to write out all of the files (at least for a
> normal-sized project). We can simulate using our sleep-patched reset:
>
>   git reset --hard
>
> to get a "clean" repo. Now let's say I want to explore old history, so I
> go to a detached HEAD, but using normal git, not the sleep-patched one:
>
>   git checkout HEAD^
>
> And, of course, now we think "mixed" is modified. After I'm done
> exploring, I want to go back to "master", but I can't:
>
>   $ git checkout master
>   error: Your local changes to the following files would be overwritten by checkout:
>           mixed
>
> What is the best way out of this situation? You can't use "reset --hard"
> to fix the working tree. I guess "git checkout -f" is the best option.
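
For the record, a sketch of the "checkout -f" route. It throws away the
phantom modification, and any real local changes along with it, so it is
only safe when the working tree holds nothing you care about. The setup
replays a simplified version of the steps above in a temporary directory
(the second commit is made before .gitattributes exists, instead of moving
the file away and back); the branch name is pinned to "master" and the
identity is a placeholder:

```shell
# Throwaway repo with two commits that both contain mixed line endings.
repo=$(mktemp -d) &&
cd "$repo" &&
git init -q . &&
git symbolic-ref HEAD refs/heads/master &&
git config core.autocrlf false &&
git config user.name example &&
git config user.email example@example.com &&
printf 'one\ntwo\r\n' >mixed &&
git add mixed &&
git commit -q -m one &&
echo three >>mixed &&
git commit -q -a -m three &&
echo '* eol=lf' >.gitattributes &&

# Visit old history; -f because "mixed" may already look modified.
git checkout -q -f HEAD^ &&

# Come back, discarding whatever git thinks changed in "mixed".
git checkout -q -f master &&
branch=$(git symbolic-ref --short HEAD) &&
echo "$branch"
```
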
>
> Hopefully my example made sense and was reproducible. The real repo
> which triggered this puzzle was jquery. You can try:
>
>   git clone git://github.com/jquery/jquery.git &&
>   cd jquery &&
>   git checkout 1.4.2 &&
>   git checkout master
>
> which will fail (but may succeed racily on a slow enough machine).
> Obviously they need to fix the mixed line-ending files in their repo.
> But that fix would be on HEAD, and "git checkout 1.4.2" will be forever
> broken. Is there a way to fix that?
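
For the "fix it on HEAD" half, later git versions (2.16 and up, if I recall
correctly, so well after this thread) have "git add --renormalize", which
re-applies the check-in conversion to tracked files. A minimal sketch on the
toy repo, with a placeholder identity and temporary directory. It does
nothing for old commits such as the 1.4.2 tag, which still carry the mixed
blob:

```shell
# Throwaway copy of the example repo.
repo=$(mktemp -d) &&
cd "$repo" &&
git init -q . &&
git config core.autocrlf false &&
git config user.name example &&
git config user.email example@example.com &&
printf 'one\ntwo\r\n' >mixed &&
git add mixed &&
git commit -q -m one &&
echo '* eol=lf' >.gitattributes &&

# --renormalize only touches tracked files, so add the new
# .gitattributes explicitly, then re-clean everything else.
git add .gitattributes &&
git add --renormalize . &&
git commit -q -m 'normalize line endings' &&

# The blob in the index should now be LF-only.
after=$(git ls-files --eol mixed) &&
echo "$after"
```
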
>
> -Peff
>
> [1] The one thing still puzzling me about the jquery repo is how they
> managed to make so many commits (including ones to mixed line ending
> files) without seeing the dirty working tree state and committing it. Is
> there some combination of config that makes this not happen?

When did they introduce the .gitattributes file?
Also, maybe they're jgit users.

Michael