public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web•de>
To: Jeff King <peff@peff•net>
Cc: "D. Ben Knoble" <ben.knoble@gmail•com>, Git <git@vger•kernel.org>,
	Phillip Wood <phillip.wood@dunelm•org.uk>,
	Junio C Hamano <gitster@pobox•com>
Subject: Re: [PATCH v2] diff: disable rename detection with --quiet
Date: Sat, 22 Nov 2025 22:44:59 +0100	[thread overview]
Message-ID: <8cc12ef2-1d2c-4924-b130-bd740a975ce0@web.de> (raw)
In-Reply-To: <20251110175408.GB76603@coredump.intra.peff.net>

On 11/10/25 6:54 PM, Jeff King wrote:
> 
> I wonder if we should _also_ take the hunk from v1 that teaches
> can_quit_early() to avoid triggering when copy detection is on. It's
> probably redundant now, but it feels to me like that's the place where
> the correctness check should kick in. And the patch here is just
> optimizing out the unnecessary work, but also happens to align things
> for correctness downstream.
> 
> But I dunno. Maybe a check for a condition that we think can never be
> triggered becomes too confusing for later maintenance.
That check was only necessary because we queue unchanged filepairs for
rename detection as if they were changes.  With --quiet forcing rename
detection off we won't run into this anymore, but it still feels like a
trip hazard.  Adding a check would help, but we could also stop doing
that in the first place.  Patch below.

René


--- >8 ---
Subject: [PATCH] diff-index: don't queue unchanged filepairs with diff_change()

diff_cache() queues unchanged filepairs if the flag find_copies_harder
is set, and uses diff_change() for that.  This function does a few
things that are unnecessary for unchanged filepairs and always sets the
diff_flag has_changes, which is simply misleading in this case.

Add a new streamlined function for queuing unchanged filepairs and
use it in show_modified(), which is called by diff_cache() via
oneway_diff() and do_oneway_diff().  It allocates only one half of each
filepair, which has a measurable effect if there are a lot of them, like
in the Linux repo:

Benchmark 1: ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder
  Time (mean ± σ):      31.8 ms ±   0.2 ms    [User: 24.2 ms, System: 6.3 ms]
  Range (min … max):    31.5 ms …  32.3 ms    85 runs

Benchmark 2: ./git -C ../linux diff --cached --find-copies-harder
  Time (mean ± σ):      23.9 ms ±   0.2 ms    [User: 18.1 ms, System: 4.6 ms]
  Range (min … max):    23.5 ms …  24.4 ms    111 runs

Summary
  ./git -C ../linux diff --cached --find-copies-harder ran
    1.33 ± 0.01 times faster than ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder

Signed-off-by: René Scharfe <l.s.r@web•de>
---
 diff-lib.c | 13 ++++++-------
 diff.c     | 20 ++++++++++++++++++++
 diff.h     |  5 +++++
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index b8f8f3bc31..8e624f38c6 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -418,13 +418,12 @@ static int show_modified(struct rev_info *revs,
 	}
 
 	oldmode = old_entry->ce_mode;
-	if (mode == oldmode && oideq(oid, &old_entry->oid) && !dirty_submodule &&
-	    !revs->diffopt.flags.find_copies_harder)
-		return 0;
-
-	diff_change(&revs->diffopt, oldmode, mode,
-		    &old_entry->oid, oid, 1, !is_null_oid(oid),
-		    old_entry->name, 0, dirty_submodule);
+	if (mode != oldmode || !oideq(oid, &old_entry->oid) || dirty_submodule)
+		diff_change(&revs->diffopt, oldmode, mode,
+			    &old_entry->oid, oid, 1, !is_null_oid(oid),
+			    old_entry->name, 0, dirty_submodule);
+	else if (revs->diffopt.flags.find_copies_harder)
+		diff_same(&revs->diffopt, mode, oid, old_entry->name);
 	return 0;
 }
 
diff --git a/diff.c b/diff.c
index efa8d9773c..e2a2927f8c 100644
--- a/diff.c
+++ b/diff.c
@@ -7349,6 +7349,26 @@ void diff_change(struct diff_options *options,
 			  concatpath, old_dirty_submodule, new_dirty_submodule);
 }
 
+void diff_same(struct diff_options *options,
+	       unsigned mode,
+	       const struct object_id *oid,
+	       const char *concatpath)
+{
+	struct diff_filespec *one;
+
+	if (S_ISGITLINK(mode) && is_submodule_ignored(concatpath, options))
+		return;
+
+	if (options->prefix &&
+	    strncmp(concatpath, options->prefix, options->prefix_length))
+		return;
+
+	one = alloc_filespec(concatpath);
+	fill_filespec(one, oid, 1, mode);
+	one->count++;
+	diff_queue(&diff_queued_diff, one, one);
+}
+
 struct diff_filepair *diff_unmerge(struct diff_options *options, const char *path)
 {
 	struct diff_filepair *pair;
diff --git a/diff.h b/diff.h
index 31eedd5c0c..e80503aebb 100644
--- a/diff.h
+++ b/diff.h
@@ -572,6 +572,11 @@ void diff_change(struct diff_options *,
 		 const char *fullpath,
 		 unsigned dirty_submodule1, unsigned dirty_submodule2);
 
+void diff_same(struct diff_options *,
+	       unsigned mode,
+	       const struct object_id *oid,
+	       const char *fullpath);
+
 struct diff_filepair *diff_unmerge(struct diff_options *, const char *path);
 
 void compute_diffstat(struct diff_options *options, struct diffstat_t *diffstat,
-- 
2.52.0


  parent reply	other threads:[~2025-11-22 21:45 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-08 19:05 diff --cached --no-ext-diff --find-copies-harder --quiet exits with wrong status code D. Ben Knoble
2025-11-08 19:08 ` D. Ben Knoble
2025-11-08 19:12   ` D. Ben Knoble
2025-11-09 12:11 ` [PATCH] diff: disabled quick optimization with --find-copies-harder René Scharfe
2025-11-09 14:18   ` Phillip Wood
2025-11-09 16:43     ` René Scharfe
2025-11-09 16:43 ` [PATCH v2] diff: disable rename detection with --quiet René Scharfe
2025-11-09 17:34   ` D. Ben Knoble
2025-11-09 18:35     ` René Scharfe
2025-11-10 23:58       ` D. Ben Knoble
2025-11-10  9:42     ` Phillip Wood
2025-11-10 17:54   ` Jeff King
2025-11-10 19:13     ` Junio C Hamano
2025-11-22 21:44     ` René Scharfe [this message]
2025-11-23  7:09       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8cc12ef2-1d2c-4924-b130-bd740a975ce0@web.de \
    --to=l.s.r@web$(echo .)de \
    --cc=ben.knoble@gmail$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=peff@peff$(echo .)net \
    --cc=phillip.wood@dunelm$(echo .)org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox