From: Phillip Wood <phillip.wood123@gmail•com>
To: Colin Stagner <ask+git@howdoi•land>,
git@vger•kernel.org, phillip.wood@dunelm•org.uk
Cc: Zach FettersMoore <zach.fetters@apollographql•com>,
Christian Couder <chriscool@tuxfamily•org>,
Patrik Weiskircher <patrik@pspdfkit•com>
Subject: Re: [PATCH] contrib/subtree: fix split with squashed subtrees
Date: Tue, 2 Sep 2025 14:22:34 +0100 [thread overview]
Message-ID: <773ed81e-34b4-4116-88de-7e4307b6c679@gmail.com> (raw)
In-Reply-To: <ee480c22-0dd3-4c45-a2bd-838c238f1d55@howdoi.land>
On 01/09/2025 21:43, Colin Stagner wrote:
> On 9/1/25 08:54, Phillip Wood wrote:
> Colin Stagner <ask+git@howdoi•land> writes:
>>> - if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
>>> + if test -n "$(git log -1 --grep="git-subtree-dir:" "$rev^!")"
>>
>> We could drop the "-1" as we're only considering a single commit.
>
> Concur.
>
>>> - if test -z "$(git log -1 --grep="git-subtree-mainline:"
>>> $rev)" &&
>>> - test -z "$(git log -1 --grep="git-subtree-dir:
>>> $arg_prefix$" $rev)"
>>> + if test -z "$(git log -1 --grep="git-subtree-mainline:"
>>> "$rev^!")" &&
>>> + test -z "$(git log -1 --grep="git-subtree-dir:
>>> $arg_prefix$" "$rev^!")"
>>
>> I'm less sure about this change. Is the second test checking
>> making sure we don't prune this commit if it has an ancestor
>> that is a subtree merge for the subtree we're interested in?
>
> The outer loop in git-subtree.sh:983 appears to iterate from the root
> commit forwards… and not from the HEAD backwards.
>
> git rev-list --topo-order --reverse --parents $rev $unrevs
> # ^^^^^^^^^
>
> Since the iteration is ancestor-first, I'm having difficulty seeing why
> `should_ignore_subtree_split_commit()` would want to do an ancestor
> traversal at all. It already sees the commits ancestor-first. But there
> could be a reason that I don't know.
Ah, I was only looking at this patch, not how it was called. That begs
the question "what's the point of these checks if we've already visited
all the ancestors anyway". I think the answer is that it is pruning the
recursion that happens in check_parent() and checking the commits that
come from that rev-list command is pointless. The regression test
introduced with this function only looks at $extracount which comes from
the recursion. I haven't looked too closely but it would be nice if we
could move this check so it is only run when check_parents() is recursing.
> Here is a more long-winded breakdown of these tests. From what I can
> determine:
>
> test -z "$(git log -1 --grep="git-subtree-mainline:" "$rev")
>
> excludes squashed commits created from
>
> git subtree merge --prefix subM --squash srcBranch
> > The --squash creates two commits:
>
> 1. A single-parent "Squashed 'subM/' content from", which
> squashes the changes from srcBranch. This commit's tree
> is like the one on srcBranch. It does not have the `subM/`
> prefix.
Yes, and the commit message comes from squash_msg() which adds the
"git-subtree-dir:" and "git-subtree-split:" trailers but not
"git-subtree-mainline:".
> 2. A merge commit which rewrites the tree in (1) to add
> the `subM/` leading directory, then merge it with the
> current branch. The merge commit doesn't have any
> `git-subtree:` trailers.
Indeed. Running
git subtree split --squash --rejoin --prefix subM
will create a squash commit as above and a merge with a commit message
created by rejoin_msg() and contains the "git-subtree-dir:",
"git-subtree-split:" and "git-subtree-mainline:" trailers.
> We must exclude (1) since the trees aren't actually compatible with
> HEAD. (They don't have the `subM` prefix). We must keep (2). The above
> `test -z` appears to do this.
So we'll exclude squashed merges but not squashed splits? I think you're
right that we don't want "git log" to walk the history here.
> I am *much* less certain about the second test:
>
> test -z "$(git log -1 \
> --grep="git-subtree-dir: $arg_prefix$" $rev)"
>
> I think this was intended to keep the mainline portion from a previous
> `git subtree split --rejoin`. But if I remove this `test -z`, all the
> unit tests still pass—including mine. There may not be any test coverage
> for this line. I will probably omit this `test` from v2.
I'm not very familiar with git-subtree but I thought this was ensuring
that we did not exclude the ancestors of a squash or split that involves
the subtree that we're interested in. I wouldn't be surprised if the
test coverage was lacking.
>> It would be very helpful if Zach could comment on what was intended here.
>
> Yes, this would aid my understanding a lot.
and mine too.
>> If it turns out that all three tests only want to consider a single
>> commit then it would be be more efficient to run a single git command
>> and check the output with something like
>>
>> git show -s --format='%(trailers:key=git-subtree-dir,key=git-
>> subtree-mainline' $rev | while read trailer
>> do
>> # check trailers here using case "$trailer"
>> done
>
> This is a cleaner approach, and I'll explore it for v2. Any objection to
> long options like `--no-patch` instead of `-s`? I find these are better
> for scripts since there's less hunting around in man pages.
Sure, I used '-s' out of habit but '--no-patch' would be clearer
(especially as I can't see any link between the short and long option
names in this case)
Thanks
Phillip
next prev parent reply other threads:[~2025-09-02 13:22 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-24 19:10 [PATCH] contrib/subtree: fix split with squashed subtrees Colin Stagner
2025-09-01 13:54 ` Phillip Wood
2025-09-01 20:43 ` Colin Stagner
2025-09-02 13:22 ` Phillip Wood [this message]
2025-09-02 14:57 ` Phillip Wood
2025-09-04 1:34 ` Colin Stagner
2025-09-05 2:27 ` [PATCH v2] " Colin Stagner
2025-09-08 15:21 ` Phillip Wood
2025-09-10 1:56 ` Colin Stagner
2025-09-10 2:02 ` Junio C Hamano
2025-09-10 3:00 ` Colin Stagner
2025-09-10 15:10 ` Junio C Hamano
2025-09-10 3:11 ` [PATCH v3] " Colin Stagner
2025-09-10 9:39 ` Phillip Wood
2025-09-11 16:01 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=773ed81e-34b4-4116-88de-7e4307b6c679@gmail.com \
--to=phillip.wood123@gmail$(echo .)com \
--cc=ask+git@howdoi$(echo .)land \
--cc=chriscool@tuxfamily$(echo .)org \
--cc=git@vger$(echo .)kernel.org \
--cc=patrik@pspdfkit$(echo .)com \
--cc=phillip.wood@dunelm$(echo .)org.uk \
--cc=zach.fetters@apollographql$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox