From: "René Scharfe" <l.s.r@web•de>
To: Git List <git@vger•kernel.org>
Cc: "D. Ben Knoble" <ben.knoble@gmail•com>, Jeff King <peff@peff•net>,
Phillip Wood <phillip.wood@dunelm•org.uk>,
Junio C Hamano <gitster@pobox•com>
Subject: [PATCH v2] diff-index: don't queue unchanged filepairs with diff_change()
Date: Sun, 30 Nov 2025 12:47:17 +0100
Message-ID: <aa28974b-ec73-4562-bfc8-4745ad58b55a@web.de>
diff_cache() queues unchanged filepairs if the flag find_copies_harder
is set, and uses diff_change() for that. This function allocates a
filespec for each side, does a few other things that are unnecessary for
unchanged filepairs and always sets the diff_flag has_changes, which is
simply misleading in this case.
Add a new streamlined function for queuing unchanged filepairs and
use it in show_modified(), which is called by diff_cache() via
oneway_diff() and do_oneway_diff(). The new function allocates only a
single filespec for each filepair and uses it for both sides, with
reference counting. This has a measurable effect if there are a lot of
unchanged filepairs, like in the Linux repo:
Benchmark 1: ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder
Time (mean ± σ): 31.8 ms ± 0.2 ms [User: 24.2 ms, System: 6.3 ms]
Range (min … max): 31.5 ms … 32.3 ms 85 runs
Benchmark 2: ./git -C ../linux diff --cached --find-copies-harder
Time (mean ± σ): 23.9 ms ± 0.2 ms [User: 18.1 ms, System: 4.6 ms]
Range (min … max): 23.5 ms … 24.4 ms 111 runs
Summary
./git -C ../linux diff --cached --find-copies-harder ran
1.33 ± 0.01 times faster than ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder
Signed-off-by: René Scharfe <l.s.r@web•de>
---
Changes since v1:
- Clearer description of memory usage in the commit message.
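
The reference counting mentioned in the commit message can be illustrated
with a minimal standalone sketch. This is not the Git code: struct spec,
struct pair, spec_alloc(), spec_release(), pair_same(), pair_release() and
the live_specs counter are hypothetical stand-ins, assumed here only to show
how one allocation can serve both sides of an unchanged pair and still be
freed exactly once.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a filespec; only the refcount matters here. */
struct spec {
	char *path;
	int count;
};

/* Hypothetical stand-in for a filepair. */
struct pair {
	struct spec *one, *two;
};

static int live_specs; /* number of specs currently allocated */

static struct spec *spec_alloc(const char *path)
{
	struct spec *s = malloc(sizeof(*s));
	size_t len = strlen(path) + 1;

	if (!s)
		abort();
	s->path = malloc(len);
	if (!s->path)
		abort();
	memcpy(s->path, path, len);
	s->count = 1; /* the caller holds the first reference */
	live_specs++;
	return s;
}

static void spec_release(struct spec *s)
{
	if (--s->count > 0)
		return; /* someone else still holds a reference */
	free(s->path);
	free(s);
	live_specs--;
}

/* Build an "unchanged" pair: one allocation, referenced by both sides. */
static struct pair pair_same(const char *path)
{
	struct pair p;

	p.one = spec_alloc(path);
	p.one->count++; /* second reference for the other side */
	p.two = p.one;
	return p;
}

/* Drop both sides; the shared spec is freed on the second release. */
static void pair_release(struct pair *p)
{
	spec_release(p->one);
	spec_release(p->two);
}
```

In this sketch, code that releases each side of a pair independently needs
no special case for a shared spec: the first release only decrements the
counter, and the second one frees the memory.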
diff-lib.c | 13 ++++++-------
diff.c | 20 ++++++++++++++++++++
diff.h | 5 +++++
3 files changed, 31 insertions(+), 7 deletions(-)
diff --git a/diff-lib.c b/diff-lib.c
index b8f8f3bc31..8e624f38c6 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -418,13 +418,12 @@ static int show_modified(struct rev_info *revs,
 	}

 	oldmode = old_entry->ce_mode;
-	if (mode == oldmode && oideq(oid, &old_entry->oid) && !dirty_submodule &&
-	    !revs->diffopt.flags.find_copies_harder)
-		return 0;
-
-	diff_change(&revs->diffopt, oldmode, mode,
-		    &old_entry->oid, oid, 1, !is_null_oid(oid),
-		    old_entry->name, 0, dirty_submodule);
+	if (mode != oldmode || !oideq(oid, &old_entry->oid) || dirty_submodule)
+		diff_change(&revs->diffopt, oldmode, mode,
+			    &old_entry->oid, oid, 1, !is_null_oid(oid),
+			    old_entry->name, 0, dirty_submodule);
+	else if (revs->diffopt.flags.find_copies_harder)
+		diff_same(&revs->diffopt, mode, oid, old_entry->name);

 	return 0;
 }
diff --git a/diff.c b/diff.c
index 915317025f..63d33251cd 100644
--- a/diff.c
+++ b/diff.c
@@ -7348,6 +7348,26 @@ void diff_change(struct diff_options *options,
 		    concatpath, old_dirty_submodule, new_dirty_submodule);
 }

+void diff_same(struct diff_options *options,
+	       unsigned mode,
+	       const struct object_id *oid,
+	       const char *concatpath)
+{
+	struct diff_filespec *one;
+
+	if (S_ISGITLINK(mode) && is_submodule_ignored(concatpath, options))
+		return;
+
+	if (options->prefix &&
+	    strncmp(concatpath, options->prefix, options->prefix_length))
+		return;
+
+	one = alloc_filespec(concatpath);
+	fill_filespec(one, oid, 1, mode);
+	one->count++;
+	diff_queue(&diff_queued_diff, one, one);
+}
+
 struct diff_filepair *diff_unmerge(struct diff_options *options, const char *path)
 {
 	struct diff_filepair *pair;
diff --git a/diff.h b/diff.h
index 31eedd5c0c..e80503aebb 100644
--- a/diff.h
+++ b/diff.h
@@ -572,6 +572,11 @@ void diff_change(struct diff_options *,
 		 const char *fullpath,
 		 unsigned dirty_submodule1, unsigned dirty_submodule2);

+void diff_same(struct diff_options *,
+	       unsigned mode,
+	       const struct object_id *oid,
+	       const char *fullpath);
+
 struct diff_filepair *diff_unmerge(struct diff_options *, const char *path);

 void compute_diffstat(struct diff_options *options, struct diffstat_t *diffstat,
--
2.52.0