* [PATCH] merge: use repo_in_merge_bases for octopus up-to-date check
@ 2026-05-12 6:11 Kristofer Karlsson via GitGitGadget
2026-05-18 12:20 ` Derrick Stolee
0 siblings, 1 reply; 3+ messages in thread
From: Kristofer Karlsson via GitGitGadget @ 2026-05-12 6:11 UTC (permalink / raw)
To: git; +Cc: Kristofer Karlsson, Kristofer Karlsson
From: Kristofer Karlsson <krka@spotify•com>
The octopus merge path checks whether each remote head is already
an ancestor of HEAD by computing all merge-bases via
repo_get_merge_bases() and comparing the first result's OID to
the remote head. This is more expensive than necessary:
repo_get_merge_bases() calls paint_down_to_common() with
min_generation=0, performs the full STALE drain, and may run
remove_redundant(), when all we need is a yes/no reachability
answer.
Replace this with repo_in_merge_bases(), which answers the
is-ancestor question directly. When generation numbers are
available, repo_in_merge_bases() uses can_all_from_reach() -- a
DFS bounded by generation number that stops as soon as the target
is found or ruled out, without entering paint_down_to_common() at
all. Without generation numbers, it still benefits from a tighter
min_generation floor.
Signed-off-by: Kristofer Karlsson <krka@spotify•com>
---
merge: use repo_in_merge_bases for octopus up-to-date check
While reviewing callers of repo_get_merge_bases() for a different patch,
I noticed the octopus up-to-date loop in builtin/merge.c computes full
merge-bases only to check whether each remote head is an ancestor of
HEAD.
The existing code calls repo_get_merge_bases(), takes the first result,
frees the list, and compares the OID to the remote head. This is
equivalent to an is-ancestor check, which repo_in_merge_bases() answers
directly.
Using repo_in_merge_bases() simplifies the code (-14/+4 lines) and
avoids unnecessary work: with generation numbers it uses
can_all_from_reach() instead of paint_down_to_common(), and without
generation numbers it still benefits from a tighter min_generation
floor. In practice this only matters for octopus merges on repos with
deep history, so the main value here is the simplification.
The comment "Here we have to calculate the individual merge_bases again"
dates from 2008 (1c7b76be, "Build in merge"). At the time,
in_merge_bases() was the same cost as computing merge bases. Stolee's
2018 generation number work (d7c1ec3e) and the later switch to
can_all_from_reach (6cc01743) made it significantly cheaper, but this
call site was never updated.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2110%2Fspkrka%2Fmerge-octopus-in-merge-bases-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2110/spkrka/merge-octopus-in-merge-bases-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2110
builtin/merge.c | 18 ++++--------------
t/t6408-merge-up-to-date.sh | 10 ++++++++++
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/builtin/merge.c b/builtin/merge.c
index 2cbce56f8d..862107cf41 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1735,21 +1735,11 @@ int cmd_merge(int argc,
struct commit_list *j;
for (j = remoteheads; j; j = j->next) {
- struct commit_list *common_one = NULL;
- struct commit *common_item;
-
- /*
- * Here we *have* to calculate the individual
- * merge_bases again, otherwise "git merge HEAD^
- * HEAD^^" would be missed.
- */
- if (repo_get_merge_bases(the_repository, head_commit,
- j->item, &common_one) < 0)
+ int ret = repo_in_merge_bases(the_repository,
+ j->item, head_commit);
+ if (ret < 0)
exit(128);
-
- common_item = common_one->item;
- commit_list_free(common_one);
- if (!oideq(&common_item->object.oid, &j->item->object.oid)) {
+ if (!ret) {
up_to_date = 0;
break;
}
diff --git a/t/t6408-merge-up-to-date.sh b/t/t6408-merge-up-to-date.sh
index 7763c1ba98..be0840efb6 100755
--- a/t/t6408-merge-up-to-date.sh
+++ b/t/t6408-merge-up-to-date.sh
@@ -89,4 +89,14 @@ test_expect_success 'merge fast-forward octopus' '
test "$expect" = "$current"
'
+test_expect_success 'merge octopus already up to date' '
+
+ git reset --hard c2 &&
+ test_tick &&
+ git merge c0 c1 &&
+ expect=$(git rev-parse c2) &&
+ current=$(git rev-parse HEAD) &&
+ test "$expect" = "$current"
+'
+
test_done
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] merge: use repo_in_merge_bases for octopus up-to-date check
2026-05-12 6:11 [PATCH] merge: use repo_in_merge_bases for octopus up-to-date check Kristofer Karlsson via GitGitGadget
@ 2026-05-18 12:20 ` Derrick Stolee
2026-05-18 14:38 ` Kristofer Karlsson
0 siblings, 1 reply; 3+ messages in thread
From: Derrick Stolee @ 2026-05-18 12:20 UTC (permalink / raw)
To: Kristofer Karlsson via GitGitGadget, git; +Cc: Kristofer Karlsson
On 5/12/2026 2:11 AM, Kristofer Karlsson via GitGitGadget wrote:
> From: Kristofer Karlsson <krka@spotify•com>
>
> The octopus merge path checks whether each remote head is already
> an ancestor of HEAD by computing all merge-bases via
> repo_get_merge_bases() and comparing the first result's OID to
> the remote head. This is more expensive than necessary:
> repo_get_merge_bases() calls paint_down_to_common() with
> min_generation=0, performs the full STALE drain, and may run
> remove_redundant(), when all we need is a yes/no reachability
> answer.
>
> Replace this with repo_in_merge_bases(), which answers the
> is-ancestor question directly. When generation numbers are
> available, repo_in_merge_bases() uses can_all_from_reach() -- a
> DFS bounded by generation number that stops as soon as the target
> is found or ruled out, without entering paint_down_to_common() at
> all. Without generation numbers, it still benefits from a tighter
> min_generation floor.
>
> Signed-off-by: Kristofer Karlsson <krka@spotify•com>
> ---
> merge: use repo_in_merge_bases for octopus up-to-date check
>
> While reviewing callers of repo_get_merge_bases() for a different patch,
> I noticed the octopus up-to-date loop in builtin/merge.c computes full
> merge-bases only to check whether each remote head is an ancestor of
> HEAD.
>
> The existing code calls repo_get_merge_bases(), takes the first result,
> frees the list, and compares the OID to the remote head. This is
> equivalent to an is-ancestor check, which repo_in_merge_bases() answers
> directly.
>
> Using repo_in_merge_bases() simplifies the code (-14/+4 lines) and
> avoids unnecessary work: with generation numbers it uses
> can_all_from_reach() instead of paint_down_to_common(), and without
> generation numbers it still benefits from a tighter min_generation
> floor. In practice this only matters for octopus merges on repos with
> deep history, so the main value here is the simplification.
The code change looks right to me. Do you have any performance numbers
to share? Or was this motivated mostly as an opportunity for code cleanup?
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] merge: use repo_in_merge_bases for octopus up-to-date check
2026-05-18 12:20 ` Derrick Stolee
@ 2026-05-18 14:38 ` Kristofer Karlsson
0 siblings, 0 replies; 3+ messages in thread
From: Kristofer Karlsson @ 2026-05-18 14:38 UTC (permalink / raw)
To: Derrick Stolee; +Cc: Kristofer Karlsson via GitGitGadget, git
Good question! No, this was intended only as a code cleanup and
semantic simplification.
The code path seems like an edge case, so benchmarking it did not seem
worthwhile.
The main win as I see it is clearer semantics for what it's doing and
the optimization is just a bonus.
Checking if repo_get_merge_bases(HEAD, J) == J is better expressed as
repo_in_merge_bases(J, HEAD)
repo_in_merge_bases should probably be renamed to something like
repo_is_ancestor, since its comment says:
Is "commit" an ancestor of (i.e. reachable from) the "reference"?
but I can understand that it may be painful to rename it in practice.
If I do a mental rename, this becomes simpler to reason about:
Checking if repo_get_merge_bases(HEAD, J) == J is better expressed as
repo_is_ancestor(J, HEAD)
- Kristofer
On Mon, 18 May 2026 at 14:20, Derrick Stolee <stolee@gmail•com> wrote:
>
> On 5/12/2026 2:11 AM, Kristofer Karlsson via GitGitGadget wrote:
> > From: Kristofer Karlsson <krka@spotify•com>
> >
> > The octopus merge path checks whether each remote head is already
> > an ancestor of HEAD by computing all merge-bases via
> > repo_get_merge_bases() and comparing the first result's OID to
> > the remote head. This is more expensive than necessary:
> > repo_get_merge_bases() calls paint_down_to_common() with
> > min_generation=0, performs the full STALE drain, and may run
> > remove_redundant(), when all we need is a yes/no reachability
> > answer.
> >
> > Replace this with repo_in_merge_bases(), which answers the
> > is-ancestor question directly. When generation numbers are
> > available, repo_in_merge_bases() uses can_all_from_reach() -- a
> > DFS bounded by generation number that stops as soon as the target
> > is found or ruled out, without entering paint_down_to_common() at
> > all. Without generation numbers, it still benefits from a tighter
> > min_generation floor.
> >
> > Signed-off-by: Kristofer Karlsson <krka@spotify•com>
> > ---
> > merge: use repo_in_merge_bases for octopus up-to-date check
> >
> > While reviewing callers of repo_get_merge_bases() for a different patch,
> > I noticed the octopus up-to-date loop in builtin/merge.c computes full
> > merge-bases only to check whether each remote head is an ancestor of
> > HEAD.
> >
> > The existing code calls repo_get_merge_bases(), takes the first result,
> > frees the list, and compares the OID to the remote head. This is
> > equivalent to an is-ancestor check, which repo_in_merge_bases() answers
> > directly.
> >
> > Using repo_in_merge_bases() simplifies the code (-14/+4 lines) and
> > avoids unnecessary work: with generation numbers it uses
> > can_all_from_reach() instead of paint_down_to_common(), and without
> > generation numbers it still benefits from a tighter min_generation
> > floor. In practice this only matters for octopus merges on repos with
> > deep history, so the main value here is the simplification.
>
> The code change looks right to me. Do you have any performance numbers
> to share? Or was this motivated mostly as an opportunity for code cleanup?
>
> Thanks,
> -Stolee
>
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-05-18 14:39 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-12 6:11 [PATCH] merge: use repo_in_merge_bases for octopus up-to-date check Kristofer Karlsson via GitGitGadget
2026-05-18 12:20 ` Derrick Stolee
2026-05-18 14:38 ` Kristofer Karlsson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox