From: Ben Knoble <ben.knoble@gmail•com>
To: "Esteban Küber" <esteban@kuber•com.ar>
Cc: git@vger•kernel.org
Subject: Re: Metadata for merge conflicts during rebase (to aid rustc) and potential for better user experience?
Date: Sat, 27 Dec 2025 13:22:17 -0500
Message-ID: <C9EB355F-1CDF-4ACA-8EDF-498B457E85C9@gmail.com>
In-Reply-To: <CAHnEOG29C1fRBZtpEkebat8znMst7D1JiWdqDAVJQceYqMZGkA@mail.gmail.com>


> On 24 Dec 2025, at 10:03, Esteban Küber <esteban@kuber•com.ar> wrote:
> 
> On Mon, Dec 22, 2025 at 1:56 PM D. Ben Knoble <ben.knoble@gmail•com> wrote:
>>> On Mon, Dec 22, 2025 at 9:31 AM Esteban Küber <esteban@kuber•com.ar> wrote:
>>> The questions I have are:
>>> - can I *avoid* `--points-at` in any way to identify what branch we're
>>>  rebasing onto?
>> According to "git help rebase", ORIG_HEAD is not reliable but @{1} should be.
> 
> After talking with other members of the compiler team, people have
> concerns about invoking git from the compiler, as it can be a vector
> for unwanted behavior. I would agree with that assessment, so I am
> trying to settle on a mechanism where I can parse git state myself
> (on a best-effort basis; this is only for diagnostics, so fully
> featured support for all environments is not necessary).
> 
>>> - is there already a better way to identify if the rebase was triggered by
>>>  `git rebase` or `git pull` (configured to rebase)?
>> I haven't studied the internals on this yet, but I think the common
>> pattern is to look at REBASE_HEAD vs. MERGE_HEAD.
> 
> Thank you for the additional information! That prompted me to look
> into the rest of the files once more, which gave me some hacky ideas
> on how to get the data I want, and this indeed seems to be
> sufficient to differentiate these two.

I think you will have issues with the reftable backend, then, which stores references differently. It is _probably_ simpler to access via git, though this might motivate contributing reftable support to libgit2; this was also mentioned recently on Discord.

> 
>>> - if neither of the above has a "yes" answer, would git consider *adding*
>>>  that information, both for third-parties as well as to extend its own UI?
>> I think "git status" already shows some of this (maybe not the
>> branches in question, but certainly the "it looks like you're in the
>> middle of a rebase/merge/cherry-pick/etc.").
> 
> I looked around again and arrived at the following conclusions:
> 
> - presence of .git/rebase-merge (and its files) is enough to
>  differentiate between a rebase and a merge
> - .git/rebase-merge/head-name is enough to identify one of the sections
> - identifying *at least* one of the sections is enough to make the
>  output clear enough (even if ideally you'd identify both)
> - the sha in FETCH_HEAD matching .git/rebase-merge/onto is enough
>  to identify that we're dealing with a `git pull --rebase`
> - there's information that is only present in MERGE_MSG in
>  free-form text, that isn't present anywhere else
> - I can extract the "missing" information for either the
>  identifying information of where we are merging, be it because of
>  a `git pull --no-rebase` or `git merge`; the only issue I see is
>  in having to rely on the output not changing from either of
>  "Merge branch 'main' into branch-name" and
>  "Merge branch 'main' of example.url:user/repo" (how much trouble
>  am I inviting if I were to try and rely on this text not changing
>  so that I can get 'main' and the remote url from here?)

I think you can also get just remote names here, and I’m not sure how to define “the remote URL” when, I think, a single remote can have multiple URLs, so it would be a lot of best-effort work IMO. Not that it isn’t worth it, but you’d want to decide which parts are worth the effort.
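
For illustration only, the file-based heuristics quoted above could be sketched roughly as below. The function names are hypothetical, the layout (rebase-merge/head-name, rebase-merge/onto, FETCH_HEAD, MERGE_HEAD) is taken from the list in this thread, and the sketch assumes the first FETCH_HEAD entry is the one that was pulled. It also assumes the git directory is passed in directly, so it ignores linked worktrees and anything specific to the ref backend.

```python
from pathlib import Path
from typing import Optional


def read_first_line(path: Path) -> Optional[str]:
    """Return the first line of a file, stripped, or None if unreadable."""
    try:
        with path.open(encoding="utf-8", errors="replace") as f:
            return f.readline().strip()
    except OSError:
        return None


def describe_operation(git_dir: Path) -> dict:
    """Best-effort guess at the in-progress operation, without running git."""
    rebase_dir = git_dir / "rebase-merge"
    if rebase_dir.is_dir():
        head_name = read_first_line(rebase_dir / "head-name")
        onto = read_first_line(rebase_dir / "onto")
        fetch_head = read_first_line(git_dir / "FETCH_HEAD")
        # Assumption: each FETCH_HEAD line starts with the fetched object id.
        fetched = fetch_head.split(None, 1)[0] if fetch_head else None
        kind = "pull --rebase" if onto and fetched == onto else "rebase"
        return {"kind": kind, "head_name": head_name, "onto": onto}
    merge_head = git_dir / "MERGE_HEAD"
    if merge_head.is_file():
        return {"kind": "merge", "merge_head": read_first_line(merge_head)}
    return {"kind": "none"}
```

A stale FETCH_HEAD from an earlier fetch could make a plain `git rebase` look like a `git pull --rebase`, so the result is only a hint for diagnostics.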

> 
> With this, I'm successfully able to identify at least one of the
> sections in the patches in all cases, which is "good enough" for my
> use-case, and with some hacks I can identify both for all but the
> `git rebase` case, without having to invoke git.
> 
> Beyond hearing any warnings about my relying on the textual
> format of MERGE_MSG or mistakes in the assumptions laid out above, I
> would like to suggest two changes to git that I think would be
> beneficial to devs and users.
> 
> First, the information present in MERGE_MSG should be available in a
> more structured format, to allow for tools to deal with git state in
> a less coupled way. (This might not be worth it, and the textual
> representation is already "stable enough" to rely on.)

I think specifying what information is valuable here would help inform a concrete proposal. My first thought is that it could be added to {human,machine}-readable git status, but that’s less accessible to programs that don’t want to invoke Git.
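
To make "what information" concrete, here is a rough sketch of the best-effort MERGE_MSG parsing discussed above. It only recognizes the two subject shapes quoted in this thread (plus the two combined), the regex and the `parse_merge_subject` name are hypothetical, and nothing here implies the text is a stable interface.

```python
import re
from typing import Optional

# Matches only the subject shapes quoted in the thread, e.g.
#   Merge branch 'main' into branch-name
#   Merge branch 'main' of example.url:user/repo
_MERGE_SUBJECT = re.compile(
    r"^Merge branch '(?P<branch>[^']+)'"
    r"(?: of (?P<source>\S+))?"
    r"(?: into (?P<target>\S+))?$"
)


def parse_merge_subject(line: str) -> Optional[dict]:
    """Return branch/source/target from a merge subject, or None if unrecognized."""
    match = _MERGE_SUBJECT.match(line.strip())
    return match.groupdict() if match else None
```

Any subject that does not match returns None, so a caller can fall back to the unannotated diagnostic.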

> 
> Secondly, and perhaps more importantly, when generating the diff
> markers that end up in the user files, their description includes
> only the full sha or HEAD, or the short-sha and the commit message.
> I would propose that the branch be identified as well in the
> generated code.  This could look something like:
> 
> `git rebase`:
> <<<<<<< HEAD [branch 'main']
> =======
> >>>>>>> e644375 (commit message) [branch 'name']
> 
> `git merge`:
> <<<<<<< HEAD [branch 'name']
> =======
> ------- between this marker and `>>>>>>>` is the code from branch 'main'
>   println!("Hello, main!");
> >>>>>>> [branch 'main']
> 
> `git pull --rebase`:
> <<<<<<< HEAD [local branch 'main']
> =======
> >>>>>>> 8191e7e4f9f82be45bdd4e71c37d2adcf4f88aa2 [branch 'main' of example.tld:user/repo]
> 
> `git pull --no-rebase`:
> <<<<<<< HEAD [branch 'main' of example.tld:user/repo]
> =======
> >>>>>>> ebbeec7 (commit message) [local branch 'main']
> 
> The format doesn't have to match the above exactly, but having the
> commit *and branch* information will make it much easier for people
> to identify things at a glance, at the cost of some additional
> verbosity in the generated code.

I suppose if no branch was used in the original operation, we could omit it.

I would probably say “from branch X,” since in a typical rebase only the very last commit is actually pointed at directly by the branch.
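
As a sketch of those two tweaks (omit the annotation when no branch was involved, and say "from branch X"), label construction might look like the following. This is purely a hypothetical format for discussion, not something git emits today, and `marker_label` and `conflict_markers` are made-up names.

```python
from typing import Optional


def marker_label(commit: str, subject: Optional[str] = None,
                 branch: Optional[str] = None) -> str:
    """Build the text that follows a conflict marker (hypothetical format)."""
    parts = [commit]
    if subject:
        parts.append(f"({subject})")
    if branch:
        # Omitted entirely when no branch was used in the original operation.
        parts.append(f"[from branch '{branch}']")
    return " ".join(parts)


def conflict_markers(ours: str, theirs: str) -> str:
    """Render an empty conflict region with the given labels."""
    return f"<<<<<<< {ours}\n=======\n>>>>>>> {theirs}"


print(conflict_markers(
    marker_label("HEAD", branch="main"),
    marker_label("e644375", "commit message", "name"),
))
```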

> The source of the issue is that where "our" and "their" code is in
> the patch depends on a somewhat "arbitrary" distinction (as far as
> a non-implementer is concerned) and it *swaps places* depending on
> whether we are rebasing or merging. Adding some context to the
> resulting patches would go a long way toward mitigating the confusion
> this causes.
> 
> Happy holidays,
> Esteban Küber

I can’t seem to puzzle it out, but it seems like perhaps you’d have a better solution than optionally sprinkling branch names if we addressed (somehow) this other issue about location of code? Idk.

Of course you won’t be able to assume “main” is always the effective “ours” ;) so maybe I’m confused about how those 2 things play together. 

Thread overview: 5+ messages
2025-12-22 14:31 Metadata for merge conflicts during rebase (to aid rustc) and potential for better user experience? Esteban Küber
2025-12-22 21:56 ` D. Ben Knoble
2025-12-24 15:03   ` Esteban Küber
2025-12-27 18:22     ` Ben Knoble [this message]
2026-01-06 14:29     ` Phillip Wood
