From: "Michael Montalbo via GitGitGadget" <gitgitgadget@gmail•com>
To: git@vger•kernel.org
Cc: Michael Montalbo <mmontalbo@gmail•com>,
Michael Montalbo <mmontalbo@gmail•com>
Subject: [PATCH v3 6/6] blame: consult diff process for no-hunk detection
Date: Fri, 29 May 2026 20:48:19 +0000 [thread overview]
Message-ID: <370e766978717e7185d7949632ed4126daf9f331.1780087700.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2120.v3.git.1780087700.gitgitgadget@gmail.com>
From: Michael Montalbo <mmontalbo@gmail•com>
When a diff process is configured via diff.<driver>.process,
consult it during blame's per-commit diffing. If the process
returns no hunks for a commit's changes to a file, treat the
commit as having no changes, causing blame to attribute lines
to earlier commits.
The consultation happens at the pass_blame_to_parent() callsite
using diff_process_fill_hunks(), matching how builtin_diff() in
diff.c uses the same function. A new diff_hunks_xpp() variant
accepts a pre-populated xpparam_t for this callsite, while the
existing diff_hunks() retains its original signature and behavior.
The copy-detection callsite is unaffected since it does not use
the diff process.
The subprocess is long-running (one startup cost amortized
across the blame traversal), but each commit in the file's
history incurs a round-trip to the tool.
Signed-off-by: Michael Montalbo <mmontalbo@gmail•com>
---
blame.c | 40 +++++++++++----
t/t4080-diff-process.sh | 106 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 137 insertions(+), 9 deletions(-)
diff --git a/blame.c b/blame.c
index a3c49d132e..79f02735a4 100644
--- a/blame.c
+++ b/blame.c
@@ -19,6 +19,8 @@
#include "tag.h"
#include "trace2.h"
#include "blame.h"
+#include "diff-process.h"
+#include "xdiff-interface.h"
#include "alloc.h"
#include "commit-slab.h"
#include "bloom.h"
@@ -314,17 +316,25 @@ static struct commit *fake_working_tree_commit(struct repository *r,
-static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b,
- xdl_emit_hunk_consume_func_t hunk_func, void *cb_data, int xdl_opts)
+static int diff_hunks_xpp(mmfile_t *file_a, mmfile_t *file_b,
+ xdl_emit_hunk_consume_func_t hunk_func,
+ void *cb_data, xpparam_t *xpp)
{
- xpparam_t xpp = {0};
xdemitconf_t xecfg = {0};
xdemitcb_t ecb = {NULL};
- xpp.flags = xdl_opts;
xecfg.hunk_func = hunk_func;
ecb.priv = cb_data;
- return xdi_diff(file_a, file_b, &xpp, &xecfg, &ecb);
+ return xdi_diff(file_a, file_b, xpp, &xecfg, &ecb);
+}
+
+static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b,
+ xdl_emit_hunk_consume_func_t hunk_func, void *cb_data, int xdl_opts)
+{
+ xpparam_t xpp = {0};
+
+ xpp.flags = xdl_opts;
+ return diff_hunks_xpp(file_a, file_b, hunk_func, cb_data, &xpp);
}
static const char *get_next_line(const char *start, const char *end)
@@ -1943,6 +1953,7 @@ static void pass_blame_to_parent(struct blame_scoreboard *sb,
struct blame_origin *parent, int ignore_diffs)
{
mmfile_t file_p, file_o;
+ xpparam_t xpp = {0};
struct blame_chunk_cb_data d;
struct blame_entry *newdest = NULL;
@@ -1961,10 +1972,21 @@ static void pass_blame_to_parent(struct blame_scoreboard *sb,
&sb->num_read_blob, ignore_diffs);
sb->num_get_patch++;
- if (diff_hunks(&file_p, &file_o, blame_chunk_cb, &d, sb->xdl_opts))
- die("unable to generate diff (%s -> %s)",
- oid_to_hex(&parent->commit->object.oid),
- oid_to_hex(&target->commit->object.oid));
+ xpp.flags = sb->xdl_opts;
+ /*
+ * If the diff process considers the files equivalent,
+ * skip the diff so blame looks past this commit.
+ */
+ if (diff_process_fill_hunks(&sb->revs->diffopt, target->path,
+ &file_p, &file_o, &xpp)
+ != DIFF_PROCESS_EQUIVALENT) {
+ if (diff_hunks_xpp(&file_p, &file_o, blame_chunk_cb,
+ &d, &xpp))
+ die("unable to generate diff (%s -> %s)",
+ oid_to_hex(&parent->commit->object.oid),
+ oid_to_hex(&target->commit->object.oid));
+ }
+ free(xpp.external_hunks);
/* The rest are the same as the parent */
blame_chunk(&d.dstq, &d.srcq, INT_MAX, d.offset, INT_MAX, 0,
parent, target, 0);
diff --git a/t/t4080-diff-process.sh b/t/t4080-diff-process.sh
index ee0c306abd..fdf6da1c34 100755
--- a/t/t4080-diff-process.sh
+++ b/t/t4080-diff-process.sh
@@ -551,4 +551,110 @@ test_expect_success PYTHON 'diff process fallback on overlapping hunks' '
test_grep "NEW5" actual
'
+#
+# Blame integration.
+#
+
+test_expect_success PYTHON 'blame uses tool-provided hunks' '
+ cat >blame-hunk.c <<-\EOF &&
+ line1
+ line2
+ line3
+ line4
+ original5
+ original6
+ line7
+ line8
+ line9
+ line10
+ EOF
+ git add blame-hunk.c &&
+ git commit -m "add blame-hunk.c" &&
+ ORIG=$(git rev-parse --short HEAD) &&
+
+ cat >blame-hunk.c <<-\EOF &&
+ line1
+ line2
+ line3
+ line4
+ changed5
+ changed6
+ line7
+ line8
+ changed9
+ changed10
+ EOF
+ git add blame-hunk.c &&
+ git commit -m "change blame-hunk.c" &&
+ CHANGE=$(git rev-parse --short HEAD) &&
+
+ # With fixed-hunk mode the tool reports only lines 5-6 as changed,
+ # so blame should attribute lines 9-10 to the original commit
+ # even though the builtin diff would show them as changed.
+ git -c diff.cdiff.process="$BACKEND --mode=fixed-hunk" \
+ blame blame-hunk.c >actual &&
+ sed -n "9p" actual >line9 &&
+ sed -n "10p" actual >line10 &&
+ test_grep "$ORIG" line9 &&
+ test_grep "$ORIG" line10 &&
+ sed -n "5p" actual >line5 &&
+ sed -n "6p" actual >line6 &&
+ test_grep "$CHANGE" line5 &&
+ test_grep "$CHANGE" line6
+'
+
+test_expect_success PYTHON 'blame skips commits with no hunks from diff process' '
+ cat >blame.c <<-\EOF &&
+ int main(void)
+ {
+ return 0;
+ }
+ EOF
+ git add blame.c &&
+ git commit -m "add blame.c" &&
+ ORIG_COMMIT=$(git rev-parse --short HEAD) &&
+
+ cat >blame.c <<-\EOF &&
+ int main(void)
+ {
+ return 0;
+ }
+ EOF
+ git add blame.c &&
+ git commit -m "reformat blame.c" &&
+ BLAME_COMMIT=$(git rev-parse --short HEAD) &&
+
+ # Without no-hunks mode, blame attributes the change.
+ git blame blame.c >without &&
+ test_grep "$BLAME_COMMIT" without &&
+
+ # With no-hunks mode, the process considers the files equivalent
+ # and blame skips the reformat commit, attributing to the original.
+ git -c diff.cdiff.process="$BACKEND --mode=no-hunks" \
+ blame blame.c >with &&
+ test_grep ! "$BLAME_COMMIT" with &&
+ test_grep "$ORIG_COMMIT" with
+'
+
+test_expect_success PYTHON 'blame --no-ext-diff bypasses diff process' '
+ rm -f backend.log &&
+ git -c diff.cdiff.process="$BACKEND --mode=no-hunks --log=backend.log" \
+ blame --no-ext-diff blame.c >actual &&
+ # Without the process, blame attributes the reformat commit normally.
+ test_grep "$BLAME_COMMIT" actual &&
+ test_path_is_missing backend.log
+'
+
+test_expect_success PYTHON 'blame --no-ext-diff uses builtin hunks' '
+ # fixed-hunk mode would narrow blame to lines 5-6, but
+ # --no-ext-diff should bypass it and use the builtin diff.
+ rm -f backend.log &&
+ git -c diff.cdiff.process="$BACKEND --mode=fixed-hunk --log=backend.log" \
+ blame --no-ext-diff blame-hunk.c >actual &&
+ # Builtin diff attributes lines 9-10 to the change commit.
+ sed -n "9p" actual >line9 &&
+ test_grep "$CHANGE" line9 &&
+ test_path_is_missing backend.log
+'
+
test_done
--
gitgitgadget
next prev parent reply other threads:[~2026-05-29 20:48 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-22 2:11 [PATCH 0/5] [RFC] diff: add diff.<driver>.process for external hunk providers Michael Montalbo via GitGitGadget
2026-05-22 2:11 ` [PATCH 1/5] xdiff: support external hunks via xpparam_t Michael Montalbo via GitGitGadget
2026-05-22 5:29 ` Junio C Hamano
2026-05-22 19:06 ` Michael Montalbo
2026-05-24 8:50 ` Junio C Hamano
2026-05-24 18:01 ` Michael Montalbo
2026-05-22 2:11 ` [PATCH 2/5] userdiff: add diff.<driver>.process config Michael Montalbo via GitGitGadget
2026-05-22 2:11 ` [PATCH 3/5] diff: add long-running diff process via diff.<driver>.process Michael Montalbo via GitGitGadget
2026-05-22 2:11 ` [PATCH 4/5] blame: consult diff process for zero-hunk detection Michael Montalbo via GitGitGadget
2026-05-22 2:11 ` [PATCH 5/5] diff-process-normalize: add built-in whitespace normalizer Michael Montalbo via GitGitGadget
2026-05-22 5:29 ` [PATCH 0/5] [RFC] diff: add diff.<driver>.process for external hunk providers Junio C Hamano
2026-05-22 17:19 ` Michael Montalbo
2026-05-25 18:29 ` [PATCH v2 0/4] " Michael Montalbo via GitGitGadget
2026-05-25 18:29 ` [PATCH v2 1/4] xdiff: support external hunks via xpparam_t Michael Montalbo via GitGitGadget
2026-05-25 18:29 ` [PATCH v2 2/4] userdiff: add diff.<driver>.process config Michael Montalbo via GitGitGadget
2026-05-25 18:29 ` [PATCH v2 3/4] diff: add long-running diff process via diff.<driver>.process Michael Montalbo via GitGitGadget
2026-05-26 1:56 ` Junio C Hamano
2026-05-29 0:51 ` Michael Montalbo
2026-05-26 2:26 ` Junio C Hamano
2026-05-29 0:55 ` Michael Montalbo
2026-05-25 18:29 ` [PATCH v2 4/4] blame: consult diff process for zero-hunk detection Michael Montalbo via GitGitGadget
2026-05-29 20:48 ` [PATCH v3 0/6] [RFC] diff: add diff.<driver>.process for external hunk providers Michael Montalbo via GitGitGadget
2026-05-29 20:48 ` [PATCH v3 1/6] xdiff: support external hunks via xpparam_t Michael Montalbo via GitGitGadget
2026-05-29 20:48 ` [PATCH v3 2/6] userdiff: add diff.<driver>.process config Michael Montalbo via GitGitGadget
2026-05-29 20:48 ` [PATCH v3 3/6] sub-process: separate process lifecycle from hashmap management Michael Montalbo via GitGitGadget
2026-05-29 20:48 ` [PATCH v3 4/6] diff: add long-running diff process via diff.<driver>.process Michael Montalbo via GitGitGadget
2026-05-29 20:48 ` [PATCH v3 5/6] diff: bypass diff process with --no-ext-diff and in format-patch Michael Montalbo via GitGitGadget
2026-05-29 20:48 ` Michael Montalbo via GitGitGadget [this message]
2026-05-31 10:44 ` [PATCH v3 0/6] [RFC] diff: add diff.<driver>.process for external hunk providers Junio C Hamano
2026-06-01 4:28 ` Michael Montalbo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=370e766978717e7185d7949632ed4126daf9f331.1780087700.git.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail$(echo .)com \
--cc=git@vger$(echo .)kernel.org \
--cc=mmontalbo@gmail$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox