public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Phillip Wood <phillip.wood123@gmail•com>
To: Colin Stagner <ask+git@howdoi•land>,
	git@vger•kernel.org, phillip.wood@dunelm•org.uk
Cc: Zach FettersMoore <zach.fetters@apollographql•com>,
	Christian Couder <chriscool@tuxfamily•org>,
	Patrik Weiskircher <patrik@pspdfkit•com>
Subject: Re: [PATCH] contrib/subtree: fix split with squashed subtrees
Date: Tue, 2 Sep 2025 14:22:34 +0100	[thread overview]
Message-ID: <773ed81e-34b4-4116-88de-7e4307b6c679@gmail.com> (raw)
In-Reply-To: <ee480c22-0dd3-4c45-a2bd-838c238f1d55@howdoi.land>

On 01/09/2025 21:43, Colin Stagner wrote:
> On 9/1/25 08:54, Phillip Wood wrote:
> Colin Stagner <ask+git@howdoi•land> writes:
>>> -    if test -n "$(git log -1 --grep="git-subtree-dir:" $rev)"
>>> +    if test -n "$(git log -1 --grep="git-subtree-dir:" "$rev^!")"
>>
>> We could drop the "-1" as we're only considering a single commit.
> 
> Concur.
> 
>>> -        if test -z "$(git log -1 --grep="git-subtree-mainline:" 
>>> $rev)" &&
>>> -            test -z "$(git log -1 --grep="git-subtree-dir: 
>>> $arg_prefix$" $rev)"
>>> +        if test -z "$(git log -1 --grep="git-subtree-mainline:" 
>>> "$rev^!")" &&
>>> +            test -z "$(git log -1 --grep="git-subtree-dir: 
>>> $arg_prefix$" "$rev^!")"
>>
>> I'm less sure about this change. Is the second test checking
>> making sure we don't prune this commit if it has an ancestor
>> that is a subtree merge for the subtree we're interested in?
> 
> The outer loop in git-subtree.sh:983 appears to iterate from the root 
> commit forwards… and not from the HEAD backwards.
> 
>      git rev-list --topo-order --reverse --parents $rev $unrevs
>      #                         ^^^^^^^^^
> 
> Since the iteration is ancestor-first, I'm having difficulty seeing why 
> `should_ignore_subtree_split_commit()` would want to do an ancestor 
> traversal at all. It already sees the commits ancestor-first. But there 
> could be a reason that I don't know.

Ah, I was only looking at this patch, not how it was called. That begs 
the question "what's the point of these checks if we've already visited 
all the ancestors anyway". I think the answer is that it is pruning the 
recursion that happens in check_parent() and checking the commits that 
come from that rev-list command is pointless. The regression test 
introduced with this function only looks at $extracount which comes from 
the recursion. I haven't looked too closely but it would be nice if we 
could move this check so it is only run when check_parents() is recursing.

> Here is a more long-winded breakdown of these tests. From what I can 
> determine:
> 
>      test -z "$(git log -1 --grep="git-subtree-mainline:" "$rev")
> 
> excludes squashed commits created from
> 
>      git subtree merge --prefix subM --squash srcBranch
 > > The --squash creates two commits:
> 
> 1. A single-parent "Squashed 'subM/' content from", which
>     squashes the changes from srcBranch. This commit's tree
>     is like the one on srcBranch. It does not have the `subM/`
>     prefix.
Yes, and the commit message comes from squash_msg() which adds the 
"git-subtree-dir:" and "git-subtree-split:" trailers but not 
"git-subtree-mainline:".

> 2. A merge commit which rewrites the tree in (1) to add
>     the `subM/` leading directory, then merge it with the
>     current branch. The merge commit doesn't have any
>     `git-subtree:` trailers.

Indeed. Running

	git subtree split --squash --rejoin --prefix subM

will create a squash commit as above and a merge with a commit message 
created by rejoin_msg() and contains the "git-subtree-dir:", 
"git-subtree-split:" and "git-subtree-mainline:" trailers.

> We must exclude (1) since the trees aren't actually compatible with 
> HEAD. (They don't have the `subM` prefix). We must keep (2). The above 
> `test -z` appears to do this.

So we'll exclude squashed merges but not squashed splits? I think you're 
right that we don't want "git log" to walk the history here.

> I am *much* less certain about the second test:
> 
>     test -z "$(git log -1 \
>                --grep="git-subtree-dir: $arg_prefix$" $rev)"
> 
> I think this was intended to keep the mainline portion from a previous 
> `git subtree split --rejoin`. But if I remove this `test -z`, all the 
> unit tests still pass—including mine. There may not be any test coverage 
> for this line. I will probably omit this `test` from v2.

I'm not very familiar with git-subtree but I thought this was ensuring 
that we did not exclude the ancestors of a squash or split that involves 
the subtree that we're interested in. I wouldn't be surprised if the 
test coverage was lacking.
>> It would be very helpful if Zach could comment on what was intended here.
> 
> Yes, this would aid my understanding a lot.

and mine too.

>> If it turns out that all three tests only want to consider a single 
>> commit then it would be be more efficient to run a single git command 
>> and check the output with something like
>>
>>      git show -s --format='%(trailers:key=git-subtree-dir,key=git- 
>> subtree-mainline' $rev | while read trailer
>>          do
>>              # check trailers here using case "$trailer"
>>          done
> 
> This is a cleaner approach, and I'll explore it for v2. Any objection to 
> long options like `--no-patch` instead of `-s`? I find these are better 
> for scripts since there's less hunting around in man pages.

Sure, I used '-s' out of habit but '--no-patch' would be clearer 
(especially as I can't see any link between the short and long option 
names in this case)

Thanks

Phillip


  reply	other threads:[~2025-09-02 13:22 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-24 19:10 [PATCH] contrib/subtree: fix split with squashed subtrees Colin Stagner
2025-09-01 13:54 ` Phillip Wood
2025-09-01 20:43   ` Colin Stagner
2025-09-02 13:22     ` Phillip Wood [this message]
2025-09-02 14:57       ` Phillip Wood
2025-09-04  1:34         ` Colin Stagner
2025-09-05  2:27 ` [PATCH v2] " Colin Stagner
2025-09-08 15:21   ` Phillip Wood
2025-09-10  1:56     ` Colin Stagner
2025-09-10  2:02       ` Junio C Hamano
2025-09-10  3:00         ` Colin Stagner
2025-09-10 15:10           ` Junio C Hamano
2025-09-10  3:11 ` [PATCH v3] " Colin Stagner
2025-09-10  9:39   ` Phillip Wood
2025-09-11 16:01     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=773ed81e-34b4-4116-88de-7e4307b6c679@gmail.com \
    --to=phillip.wood123@gmail$(echo .)com \
    --cc=ask+git@howdoi$(echo .)land \
    --cc=chriscool@tuxfamily$(echo .)org \
    --cc=git@vger$(echo .)kernel.org \
    --cc=patrik@pspdfkit$(echo .)com \
    --cc=phillip.wood@dunelm$(echo .)org.uk \
    --cc=zach.fetters@apollographql$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox