From: Derrick Stolee <stolee@gmail•com>
To: Junio C Hamano <gitster@pobox•com>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail•com>
Cc: git@vger•kernel.org, vdye@github•com
Subject: Re: [PATCH v2 1/3] commit-reach: add get_branch_base_for_tip
Date: Tue, 13 Aug 2024 09:39:10 -0400 [thread overview]
Message-ID: <bc7fac52-a055-46e5-83de-218fbcbdd605@gmail.com> (raw)
In-Reply-To: <xmqqv8051o22.fsf@gitster.g>
On 8/12/24 4:30 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail•com> writes:
>> Repositories that use pull requests (or merge requests) to advance one or
>> more "protected" branches, the history of that reference can be recovered by
>> following the first-parent history in most cases.
>
> I cannot quite parse it, but perhaps "Repositories that" -> "In
> repositories that"?
That is an improvement, thanks.
>> Most are completed using
>> no-fast-forward merges, though squash merges are quite common. Less common
>> is rebase-and-merge, which still validates this assumption. Finally, the
>> case that breaks this assumption is the fast-forward update (with potential
>> rebasing). Even in this case, the previous commit commonly appears in the
>> first-parent history of the branch.
>
>> Given current command-line interface options, this optimization criteria is
>> not easy to detect directly. Even using the command
>>
>> git rev-list --count --first-parent <base>..<source>
>>
>> does not measure this count, as it uses full reachability from <base> to
>> determine which commits to remove from the range '<base>..<source>'.
>
> Makes me wonder if "--ancestry-path" would help.
One difficulty here is that we don't know the "first-parent merge base"
to supply to the --ancestry-path argument. You could first find this by
running
git rev-list --first-parent --boundary --reverse A...B
and pulling out the first boundary commit 'C'. Then, that could be used in
git rev-list --first-parent --count --ancestry-path=C B
I believe that this two-process-per-ref approach would provide an
existing way to compute these results.
>> The trickiest part of the integer slab is what happens when reaching a
>> collision among the histories of the bases and the history of the source.
>> This is noticed when viewing the first parent and seeing that it has a slab
>> value that differs in sign (negative or positive). In this case, the
>> collision commit is stored in the method variable 'branch_point' and its
>> slab value is set to -1. The index of the best base (so far) is stored in
>> the method variable 'best_index'. It is possible that there are multiple
>> commits that have the branch_point as its first parent, leading to multiple
>> updates of best_index. The result is determined when 'branch_point' is
>> visited in the commit walk, giving the guarantee that all commits that could
>> reach 'branch_point' were visited.
>
> OK.
>
>> +/*
>> + * This slab initializes integers to zero, so use "-1" for "tip is best" and
>> + * "i + 1" for "bases[i] is best".
>> + */
>> +define_commit_slab(best_branch_base, int);
>> +static struct best_branch_base best_branch_base;
>> +#define get_best(c) (*best_branch_base_at(&best_branch_base, c))
>> +#define set_best(c,v) (*best_branch_base_at(&best_branch_base, c) = v)
>
> Micronit. Prepare for macro arguments to be expressions, even if
> current callers don't use anything more complex, i.e., something
> like
>
> (*best_branch_base_at(&best_branch_base, (c)))
> (*best_branch_base_at(&best_branch_base, (c)) = (v))
Thanks. I should have caught this myself.
>> +
>> + /* Initialize queue and slab now that generations are guaranteed. */
>> + init_best_branch_base(&best_branch_base);
>> + set_best(tip, -1);
>> + prio_queue_put(&queue, tip);
>> +
>> + for (size_t i = 0; i < bases_nr; i++) {
>> + struct commit *c = bases[i];
>> +
>> + /* Has this already been marked as best by another commit? */
>> + if (get_best(c))
>> + continue;
>
> Oh, so this defines the tie-breaking behaviour, but simply removing
> it is a wrong solution if we wanted our tie-breaking to work as
> "last one wins", as we still do not want to put it in the queue, so
> this "if best is already found, skip the rest" is serving dual
> purposes. Good.
When trying to make a test case for the for-each-ref behavior around
non-commits, I noticed a bug here. If get_best(c) is -1, then 'c' is
equal to the base and should be selected. I will update the logic here
and add an appropriate test in this patch.
Thanks,
-Stolee
next prev parent reply other threads:[~2024-08-13 13:39 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-01 22:10 [PATCH 0/3] git for-each-ref: is-base atom and base branches Derrick Stolee via GitGitGadget
2024-08-01 22:10 ` [PATCH 1/3] commit-reach: add get_branch_base_for_tip Derrick Stolee via GitGitGadget
2024-08-01 22:10 ` [PATCH 2/3] for-each-ref: add 'is-base' token Derrick Stolee via GitGitGadget
2024-08-01 22:10 ` [PATCH 3/3] p1500: add is-base performance tests Derrick Stolee via GitGitGadget
2024-08-01 23:06 ` [PATCH 0/3] git for-each-ref: is-base atom and base branches Junio C Hamano
2024-08-02 14:32 ` Derrick Stolee
2024-08-02 16:55 ` Junio C Hamano
2024-08-02 17:30 ` Junio C Hamano
2024-08-11 17:34 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2024-08-11 17:34 ` [PATCH v2 1/3] commit-reach: add get_branch_base_for_tip Derrick Stolee via GitGitGadget
2024-08-12 20:30 ` Junio C Hamano
2024-08-13 13:39 ` Derrick Stolee [this message]
2024-08-11 17:34 ` [PATCH v2 2/3] for-each-ref: add 'is-base' token Derrick Stolee via GitGitGadget
2024-08-12 21:05 ` Junio C Hamano
2024-08-13 13:44 ` Derrick Stolee
2024-08-11 17:34 ` [PATCH v2 3/3] p1500: add is-base performance tests Derrick Stolee via GitGitGadget
2024-08-14 10:31 ` [PATCH v3 0/4] git for-each-ref: is-base atom and base branches Derrick Stolee via GitGitGadget
2024-08-14 10:31 ` [PATCH v3 1/4] commit-reach: add get_branch_base_for_tip Derrick Stolee via GitGitGadget
2024-08-14 10:31 ` [PATCH v3 2/4] commit: add gentle reference lookup method Derrick Stolee via GitGitGadget
2024-08-14 10:31 ` [PATCH v3 3/4] for-each-ref: add 'is-base' token Derrick Stolee via GitGitGadget
2024-08-14 10:31 ` [PATCH v3 4/4] p1500: add is-base performance tests Derrick Stolee via GitGitGadget
2024-08-19 19:52 ` [PATCH v3 0/4] git for-each-ref: is-base atom and base branches Junio C Hamano
2024-08-20 1:33 ` Derrick Stolee
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bc7fac52-a055-46e5-83de-218fbcbdd605@gmail.com \
--to=stolee@gmail$(echo .)com \
--cc=git@vger$(echo .)kernel.org \
--cc=gitgitgadget@gmail$(echo .)com \
--cc=gitster@pobox$(echo .)com \
--cc=vdye@github$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox