public inbox for git@vger.kernel.org 
From: Junio C Hamano <gitster@pobox•com>
To: Duy Nguyen <pclouds@gmail•com>
Cc: Michael J Gruber <git@drmicha•warpmail.net>, git@vger•kernel.org
Subject: Re: [PATCH v2] diff-lib.c: adjust position of i-t-a entries in diff
Date: Tue, 17 Mar 2015 10:57:06 -0700
Message-ID: <xmqq1tknpkwd.fsf@gitster.dls.corp.google.com>
In-Reply-To: <20150317140704.GA7248@lanh> (Duy Nguyen's message of "Tue, 17 Mar 2015 21:07:04 +0700")

Duy Nguyen <pclouds@gmail•com> writes:

> On Mon, Mar 16, 2015 at 09:05:45AM -0700, Junio C Hamano wrote:
>> The offending one came from eec3e7e4 (cache-tree: invalidate i-t-a
>> paths after generating trees, 2012-12-16), which was a fix to an
>> earlier bug where a cache-tree written out of an index with i-t-a
>> entries had incorrect information and still claimed it is fully
>> valid after write-tree rebuilt it.  The test probably should add
>> another path without i-t-a bit, run the same "diff --cached" with
>> updated expectation before write-tree, and run the "diff --cached"
>> again to make sure it produces a result that matches the updated
>> expectation.
>
> Would adding another non-i-t-a entry help? Before this patch
> "diff --cached" after write-tree shows the i-t-a entry only when
> eec3e7e4 is applied. But with this patch we don't show the i-t-a
> entry any more, before or after write-tree, so eec3e7e4 makes no
> visible difference.
>
> We could even revert eec3e7e4 and the outcome of "diff --cached" would
> be the same because we just sort of move the "invalidation" part from
> cache-tree to do_oneway_diff(). Not invalidating would speed up "diff
> --cached" when i-t-a entries are present. Still it may be a good idea
> to invalidate i-t-a paths to be on the safe side. Perhaps a patch like
> this to resurrect the test?

My understanding of what eec3e7e4 (cache-tree: invalidate i-t-a paths
after generating trees, 2012-12-16) fixed was that in this sequence:

    - You prepare an index.

    - You write-tree out of the index, which involves:

      - updating the cache-tree to match the shape of the tree that
        results from writing the index out.

      - creating tree objects matching all levels of the cache-tree
        as needed on disk.

      - reporting the top-level tree object name.

    - You run "diff-index --cached", which can and will take
      advantage of the fact that everything in a subtree below a
      known-to-be-valid cache-tree entry does not have to be checked
      one-by-one.  If a cache-tree says "everything under D/ in the
      index would hash to tree object T" and the HEAD has tree
      object T at D/, then the diff machinery will bypass the entire
      section in the index under D/, which is a valid optimization.

      However, when there is an i-t-a entry, that entry is excluded
      from the tree object computation, so its presence does not
      contribute to the tree object name; the old code nevertheless
      marked the cache-tree entries that contain it as valid by
      mistake.  This old bug was what the commit fixed, so an
      invocation of "diff --cached" after a write-tree, even if the
      index contains an i-t-a entry, will not see cache-tree entries
      that are marked valid when they are not.  Instead, "diff
      --cached" will bypass the optimization and compare the index
      entries one by one.
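
The sequence above can be illustrated with a small Python toy model. This is only a sketch under stated assumptions, not Git's actual implementation: the names `Entry`, `CacheTree`, `tree_id`, `write_tree` and `diff_cached` are hypothetical, the whole index is treated as a single directory, and a SHA-1 over path/blob pairs stands in for a real tree object name. The `invalidate_ita` flag switches between the old behaviour (cache-tree left marked valid) and the behaviour after eec3e7e4 (cache-tree invalidated when an i-t-a entry is present).

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    path: str
    blob: str
    intent_to_add: bool = False  # models "git add -N"


@dataclass
class CacheTree:
    tree: str    # stand-in for the tree object name
    valid: bool  # may "diff --cached" trust this entry?


def tree_id(entries):
    """Toy tree hash; i-t-a entries are skipped, as described above."""
    h = hashlib.sha1()
    for e in sorted(entries, key=lambda e: e.path):
        if not e.intent_to_add:
            h.update(f"{e.path}\0{e.blob}\n".encode())
    return h.hexdigest()


def write_tree(index, invalidate_ita):
    """Return the cache-tree state left behind by a toy write-tree."""
    has_ita = any(e.intent_to_add for e in index)
    # Assumption: the fix is modelled as "not valid if any i-t-a entry exists".
    return CacheTree(tree_id(index), valid=not (invalidate_ita and has_ita))


def diff_cached(head, index, cache_tree: Optional[CacheTree]):
    """List paths that differ between HEAD (a path -> blob dict) and the index."""
    head_tree = tree_id(Entry(p, b) for p, b in head.items())
    if cache_tree and cache_tree.valid and cache_tree.tree == head_tree:
        return []  # shortcut: trust the cache-tree and skip the whole subtree
    # Slow path: compare one by one (an i-t-a entry counts as an empty blob here).
    staged = {e.path: e.blob for e in index}
    return [p for p in sorted(set(head) | set(staged))
            if head.get(p) != staged.get(p)]


head = {"dir/foo": "foo-blob"}
index = [Entry("dir/foo", "foo-blob"), Entry("dir/bar", "", intent_to_add=True)]

before = diff_cached(head, index, None)
after_buggy = diff_cached(head, index, write_tree(index, invalidate_ita=False))
after_fixed = diff_cached(head, index, write_tree(index, invalidate_ita=True))
```

In this model, leaving the cache-tree valid makes the second diff come back empty even though neither the index nor HEAD changed, while invalidating it keeps the result identical before and after the write-tree step.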

So reverting the fix obviously is not the right thing to do.  If the
tests show different results from two invocations of "diff --cached"
with your patch applied, there is something that is broken by your
patch, because the index and the HEAD do not change across
write-tree in that test.

If on the other hand the tests show the same result from these two
"diff --cached" invocations and the result is different from what
the test expects, that means your patch changed the world order, i.e. an
i-t-a entry used to be treated as if it were adding an empty blob to
the index but it is now treated as non-existent, then that is a good
thing and the only thing we need to update is what the test expects.
I am guessing that instead of expecting dir/bar to be shown, it now
should expect no output?

Does adding a non-i-t-a entry help?  It does not hurt, and it makes
the test use a non-empty output, making its effect more visible,
which may or may not count as helping.
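
The two readings of an i-t-a entry discussed above, and the effect of adding a non-i-t-a path, can be sketched in Python. This is an illustrative model only, not the logic in diff-lib.c: `diff_cached_names` and its `ita_as_empty_blob` parameter are hypothetical names, blob contents are plain strings, and `dir/baz` is an invented path standing in for the extra non-i-t-a entry.

```python
def diff_cached_names(head, index, ita_paths, ita_as_empty_blob):
    """Toy "diff --cached --name-only" between HEAD and the index.

    head and index map path -> blob content; ita_paths is the set of
    index paths that carry the intent-to-add bit.
    """
    if ita_as_empty_blob:
        # Old reading: an i-t-a entry looks like an added empty blob.
        effective = dict(index)
    else:
        # New reading: an i-t-a entry is treated as not being in the index.
        effective = {p: b for p, b in index.items() if p not in ita_paths}
    return [p for p in sorted(set(head) | set(effective))
            if head.get(p) != effective.get(p)]


head = {"dir/foo": "foo"}
index = {"dir/foo": "foo", "dir/bar": ""}
ita = {"dir/bar"}

old_output = diff_cached_names(head, index, ita, ita_as_empty_blob=True)
new_output = diff_cached_names(head, index, ita, ita_as_empty_blob=False)

# Adding a regular (non-i-t-a) path keeps the expected output non-empty
# under the new reading, so a regression is easier to notice.
index_with_extra = {**index, "dir/baz": "baz"}
new_output_with_extra = diff_cached_names(
    head, index_with_extra, ita, ita_as_empty_blob=False)
```

Under these assumptions the old reading lists `dir/bar`, the new reading lists nothing, and with the extra path the new reading lists only `dir/baz`.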


     
