public inbox for git@vger.kernel.org 
From: Junio C Hamano <gitster@pobox•com>
To: David Turner <dturner@twopensource•com>
Cc: git mailing list <git@vger•kernel.org>
Subject: Re: File owner/group and git
Date: Wed, 04 Nov 2015 18:38:40 -0800
Message-ID: <xmqqoaf9891r.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <1446689037.4131.47.camel@twopensource.com> (David Turner's message of "Wed, 04 Nov 2015 21:03:57 -0500")

David Turner <dturner@twopensource•com> writes:

> In unpack-trees.c, in verify_uptodate_1, we check ie_match_stat.  This
> returns OWNER_CHANGED if a file has changed ownership since the index
> was updated.  Do we actually care about that particular case?  Or really
> anything other than DATA_CHANGED?

That's 10-year-old code, and there aren't that many people left
who can explain the original rationale, I am afraid ;-)

In general, "Do we actually care?" is not the question we ask in
this area of the code.  "Does it help us to catch real changes, or
does it change spuriously to make it too unreliable a signal to be
useful?" is the question that drives the design of this part of the
system.

DATA_CHANGED is "we know the contents are different without even
looking at the data".  If the size is different from the last time
we hashed the data, the contents must have changed.  The inverse is
not true (and that is half of the "racy git" issue).
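To make the asymmetry concrete, here is a minimal sketch in Python (not
Git's actual C code).  The helper name size_changed() and the file name
are made up for illustration; the only thing it shows is that a size
mismatch proves a content change, while a size match proves nothing.

```python
import os
import tempfile


def size_changed(cached_size: int, path: str) -> bool:
    """Hypothetical helper: True when the on-disk size differs from the
    size recorded the last time the data was hashed."""
    return os.stat(path).st_size != cached_size


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.txt")
    with open(path, "w") as f:
        f.write("hello")
    cached_size = os.stat(path).st_size  # what an index entry would remember

    # A different length: the contents must have changed.
    with open(path, "w") as f:
        f.write("hello, world")
    grew_detected = size_changed(cached_size, path)

    # Different contents with the same length as "hello": size alone
    # cannot see this edit.
    with open(path, "w") as f:
        f.write("jello")
    same_size_detected = size_changed(cached_size, path)
```

The second case is why the other stat fields (and, ultimately, hashing
the data again) still matter.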

The other *_CHANGED bits are finely classified only because originally
we didn't really know which ones would be useful to treat as notable
change events, and the "changed" variable had a sufficient number of
bits to hold the different classifications, so that we could pick and
choose which ones we truly care about.  We knew MTIME was useful in
the sense that even if the size is the same, an updated mtime is a
good enough indication that the stuff has changed, even to the "make"
utility.
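The "one bit per kind of change" idea can be sketched as follows.  This
is Python rather than Git's C, the Snapshot type and classify() are
hypothetical, and the numeric bit values are picked for illustration
only; just the names mirror the *_CHANGED classification discussed
here.

```python
from dataclasses import dataclass, replace

# Illustrative bit values; only the names follow the discussion above.
MTIME_CHANGED = 0x01
CTIME_CHANGED = 0x02
OWNER_CHANGED = 0x04
INODE_CHANGED = 0x08
DATA_CHANGED = 0x10


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of the stat fields remembered for one path."""
    size: int
    mtime: int
    ctime: int
    uid: int
    gid: int
    ino: int


def classify(old: Snapshot, new: Snapshot) -> int:
    """Return a bitmask naming every kind of difference that was seen."""
    changed = 0
    if old.mtime != new.mtime:
        changed |= MTIME_CHANGED
    if old.ctime != new.ctime:
        changed |= CTIME_CHANGED
    if (old.uid, old.gid) != (new.uid, new.gid):
        changed |= OWNER_CHANGED
    if old.ino != new.ino:
        changed |= INODE_CHANGED
    if old.size != new.size:
        changed |= DATA_CHANGED
    return changed


old = Snapshot(size=5, mtime=100, ctime=100, uid=1000, gid=1000, ino=42)
new = replace(old, mtime=200, ctime=200)  # same size, touched later
changed = classify(old, new)
```

A caller that only wants to know "is the size different?" can test
`changed & DATA_CHANGED`, while one that treats any difference as
suspicious can test `changed != 0`.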

INODE and CTIME are not so stable on some filesystems (e.g. inum may
not be stable on a network share across remount) and in some
environments (e.g. some virus scanners touch ctime to mark scanned
files, cf. 1ce4790b), and would trigger false positives too often to
be useful.  We always paid attention to them initially, but there
are configurations to tell Git not to raise them these days.
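As a rough illustration of what such a knob does, the sketch below
masks out the unreliable bits before the caller looks at the result.
It is Python, not Git's implementation; the function significant(), its
parameters, and the bit values are assumptions made up for this
example.  In Git itself the relevant settings are, as far as I know,
core.trustctime and core.checkStat.

```python
# Illustrative bit values, not taken from Git's headers.
MTIME_CHANGED = 0x01
CTIME_CHANGED = 0x02
INODE_CHANGED = 0x08


def significant(changed: int, trust_ctime: bool = True,
                check_inode: bool = True) -> int:
    """Drop the bits the user has told us not to trust (hypothetical)."""
    ignore = 0
    if not trust_ctime:
        ignore |= CTIME_CHANGED
    if not check_inode:
        ignore |= INODE_CHANGED
    return changed & ~ignore


# A virus scanner touched ctime only: nothing is left to report.
scanner_only = significant(CTIME_CHANGED, trust_ctime=False)

# A real edit that also bumped ctime: the mtime bit still gets through.
real_edit = significant(MTIME_CHANGED | CTIME_CHANGED, trust_ctime=False)
```

The point of the design is that the noisy signals are filtered in one
place, and every caller keeps seeing the signals that remain reliable.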

OWNER probably falls into a category that is stable enough to be
useful, as the most likely way for it to change is not by running
"chown" on the file in-place (which does not change the contents),
but by running "mv" to drop another file owned by somebody else to
the original location (which likely does change the contents).  At
the same time, "mv" a different file into the path would likely
trigger changes to INODE and MTIME as well, so it cannot be more
than a belt-and-suspenders measure to catch modification.  In that
sense ignoring OWNER would not hurt too much.
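The "mv" case can be observed with a small Python sketch.  It cannot
show the owner changing (that needs a file owned by somebody else), so
it only demonstrates the other half of the argument: on typical
filesystems, renaming a different file into place gives the path a new
inode.  The file names are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "tracked.txt")
    other = os.path.join(tmp, "other.txt")
    with open(target, "w") as f:
        f.write("original")
    with open(other, "w") as f:
        f.write("replacement")

    before = os.stat(target)
    os.replace(other, target)  # like "mv other.txt tracked.txt"
    after = os.stat(target)

    # Both files existed at the same time, so they cannot have shared
    # an inode; the path now refers to the second file's inode.
    inode_changed = before.st_ino != after.st_ino
    size_differs = before.st_size != after.st_size
```

Since INODE (where it is trusted), MTIME, or the size would usually
flag such a replacement anyway, OWNER adds little on top of them.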

If it changes spuriously to make it too unreliable a signal to be
useful, it certainly is OK to introduce a knob to ignore it.  It
might even make sense to ignore it unconditionally if the false hit
happens too frequently, but offhand my gut reaction is that there
may be something wrong in the environment (i.e. system outside Git
in which Git runs) if owner/group changes spuriously to cause
issues.
