From: "René Scharfe" <l.s.r@web•de>
To: Git List <git@vger•kernel.org>
Cc: "D. Ben Knoble" <ben.knoble@gmail•com>, Jeff King <peff@peff•net>,
	Phillip Wood <phillip.wood@dunelm•org.uk>,
	Junio C Hamano <gitster@pobox•com>
Subject: [PATCH v2] diff-index: don't queue unchanged filepairs with diff_change()
Date: Sun, 30 Nov 2025 12:47:17 +0100
Message-ID: <aa28974b-ec73-4562-bfc8-4745ad58b55a@web.de>

diff_cache() queues unchanged filepairs if the flag find_copies_harder
is set, and uses diff_change() for that.  This function allocates a
filespec for each side, does a few other things that are unnecessary for
unchanged filepairs and always sets the diff_flag has_changes, which is
simply misleading in this case.

Add a new streamlined function, diff_same(), for queuing unchanged
filepairs and use it in show_modified(), which is called by diff_cache()
via oneway_diff() and do_oneway_diff().  It allocates only a single
filespec for each filepair and uses it for both sides with reference
counting.  This has a measurable effect if there are a lot of unchanged
filepairs, like in the Linux repo:

Benchmark 1: ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder
  Time (mean ± σ):      31.8 ms ±   0.2 ms    [User: 24.2 ms, System: 6.3 ms]
  Range (min … max):    31.5 ms …  32.3 ms    85 runs

Benchmark 2: ./git -C ../linux diff --cached --find-copies-harder
  Time (mean ± σ):      23.9 ms ±   0.2 ms    [User: 18.1 ms, System: 4.6 ms]
  Range (min … max):    23.5 ms …  24.4 ms    111 runs

Summary
  ./git -C ../linux diff --cached --find-copies-harder ran
    1.33 ± 0.01 times faster than ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder

Signed-off-by: René Scharfe <l.s.r@web•de>
---
Changes since v1:
- Clearer description of memory usage in the commit message.
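
For readers unfamiliar with the filespec reference counting that the
commit message relies on, here is a minimal standalone sketch of the
idea.  All toy_* names are hypothetical stand-ins invented for this
illustration and are not Git's real structures or functions; the sketch
only assumes that a filespec starts with a reference count of one and
is released when the count drops to zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a filespec; not Git's struct diff_filespec. */
struct toy_filespec {
	char *path;
	unsigned mode;
	int count;	/* number of references held on this filespec */
};

/* Hypothetical stand-in for a filepair with an old and a new side. */
struct toy_filepair {
	struct toy_filespec *one;
	struct toy_filespec *two;
};

/* Allocate a filespec that starts out with a single reference. */
static struct toy_filespec *toy_alloc_filespec(const char *path, unsigned mode)
{
	struct toy_filespec *spec = malloc(sizeof(*spec));
	size_t len = strlen(path) + 1;

	if (!spec)
		abort();
	spec->path = malloc(len);
	if (!spec->path)
		abort();
	memcpy(spec->path, path, len);
	spec->mode = mode;
	spec->count = 1;
	return spec;
}

/* Drop one reference; return 1 if the memory was actually released. */
static int toy_free_filespec(struct toy_filespec *spec)
{
	if (--spec->count)
		return 0;
	free(spec->path);
	free(spec);
	return 1;
}

/* Build an unchanged pair: one allocation, referenced by both sides. */
static struct toy_filepair toy_queue_same(const char *path, unsigned mode)
{
	struct toy_filepair pair;
	struct toy_filespec *one = toy_alloc_filespec(path, mode);

	one->count++;	/* second reference, held by the other side */
	pair.one = one;
	pair.two = one;
	return pair;
}

/* Release both sides; return how many filespecs were really freed. */
static int toy_free_filepair(struct toy_filepair *pair)
{
	int freed = toy_free_filespec(pair->one);

	freed += toy_free_filespec(pair->two);
	return freed;
}
```

Releasing such a pair drops two references but frees only one
allocation, which is where the saving over two separate filespecs
comes from in this sketch.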

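The restructured condition in show_modified() can be checked with a
small truth table.  The sketch below is a simplified model, not the
real code: "unchanged" is assumed to stand for mode == oldmode, equal
object IDs and no dirty submodule, and the toy_* names are hypothetical.
It suggests that the only case whose handling differs is an unchanged
entry with find_copies_harder set.

```c
#include <assert.h>

/* Hypothetical labels for what gets queued; not part of Git. */
enum toy_action { TOY_NONE, TOY_CHANGE, TOY_SAME };

/* Model of the old logic: early return, otherwise diff_change(). */
static enum toy_action toy_old_dispatch(int unchanged, int find_copies_harder)
{
	if (unchanged && !find_copies_harder)
		return TOY_NONE;
	return TOY_CHANGE;
}

/* Model of the new logic: diff_change() only for real changes. */
static enum toy_action toy_new_dispatch(int unchanged, int find_copies_harder)
{
	if (!unchanged)
		return TOY_CHANGE;
	else if (find_copies_harder)
		return TOY_SAME;
	return TOY_NONE;
}
```
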
 diff-lib.c | 13 ++++++-------
 diff.c     | 20 ++++++++++++++++++++
 diff.h     |  5 +++++
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index b8f8f3bc31..8e624f38c6 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -418,13 +418,12 @@ static int show_modified(struct rev_info *revs,
 	}
 
 	oldmode = old_entry->ce_mode;
-	if (mode == oldmode && oideq(oid, &old_entry->oid) && !dirty_submodule &&
-	    !revs->diffopt.flags.find_copies_harder)
-		return 0;
-
-	diff_change(&revs->diffopt, oldmode, mode,
-		    &old_entry->oid, oid, 1, !is_null_oid(oid),
-		    old_entry->name, 0, dirty_submodule);
+	if (mode != oldmode || !oideq(oid, &old_entry->oid) || dirty_submodule)
+		diff_change(&revs->diffopt, oldmode, mode,
+			    &old_entry->oid, oid, 1, !is_null_oid(oid),
+			    old_entry->name, 0, dirty_submodule);
+	else if (revs->diffopt.flags.find_copies_harder)
+		diff_same(&revs->diffopt, mode, oid, old_entry->name);
 	return 0;
 }
 
diff --git a/diff.c b/diff.c
index 915317025f..63d33251cd 100644
--- a/diff.c
+++ b/diff.c
@@ -7348,6 +7348,26 @@ void diff_change(struct diff_options *options,
 			  concatpath, old_dirty_submodule, new_dirty_submodule);
 }
 
+void diff_same(struct diff_options *options,
+	       unsigned mode,
+	       const struct object_id *oid,
+	       const char *concatpath)
+{
+	struct diff_filespec *one;
+
+	if (S_ISGITLINK(mode) && is_submodule_ignored(concatpath, options))
+		return;
+
+	if (options->prefix &&
+	    strncmp(concatpath, options->prefix, options->prefix_length))
+		return;
+
+	one = alloc_filespec(concatpath);
+	fill_filespec(one, oid, 1, mode);
+	one->count++;
+	diff_queue(&diff_queued_diff, one, one);
+}
+
 struct diff_filepair *diff_unmerge(struct diff_options *options, const char *path)
 {
 	struct diff_filepair *pair;
diff --git a/diff.h b/diff.h
index 31eedd5c0c..e80503aebb 100644
--- a/diff.h
+++ b/diff.h
@@ -572,6 +572,11 @@ void diff_change(struct diff_options *,
 		 const char *fullpath,
 		 unsigned dirty_submodule1, unsigned dirty_submodule2);
 
+void diff_same(struct diff_options *,
+	       unsigned mode,
+	       const struct object_id *oid,
+	       const char *fullpath);
+
 struct diff_filepair *diff_unmerge(struct diff_options *, const char *path);
 
 void compute_diffstat(struct diff_options *options, struct diffstat_t *diffstat,
-- 
2.52.0
