From: Gusted <gusted@codeberg•org>
To: Toon Claes <toon@iotcl•com>, git@vger•kernel.org
Subject: Re: git-last-modified weirdness
Date: Mon, 5 Jan 2026 12:52:01 +0100
Message-ID: <4b6fe686-bb3d-4d10-8a4d-7542b4c93e45@codeberg.org>
In-Reply-To: <87v7hgpbrk.fsf@iotcl.com>

On 1/5/26 11:57 AM, Toon Claes wrote:
> Gusted <gusted@codeberg•org> writes:
>
>> Hi,
>>
>> Resending this mail as it looks like it might not have arrived (couldn't
>> find it in the mailing list archive).
> Thanks for following up. I didn't see it yet.
>
>> For Forgejo, I wanted to look into using git-last-modified to gain extra
>> performance for larger repositories, where this can often be
>> one of the slowest Git operations. However, I noticed some problems that
>> look to be bugs.
>>
>> I ran all the following commands on the following Git repository, with
>> Git v2.52.0 (Arch Linux); my git config does not enable or disable any
>> feature that should affect any of the following observations.
>>
>> $ tmp=$(mktemp -d)
>> $ git clone https://codeberg.org/forgejo/forgejo $tmp
>> $ cd "$tmp"
>>
>> During some experiments I noticed it being slower for some files. An
>> example:
>>
>> $ hyperfine --warmup 5 'git log --max-count=1 DCO' 'git last-modified DCO'
>> Benchmark 1: git log --max-count=1 DCO
>> Time (mean ± σ): 86.9 ms ± 0.8 ms [User: 70.1 ms, System: 15.6 ms]
>> Range (min … max): 85.5 ms … 88.3 ms 34 runs
>>
>> Benchmark 2: git last-modified DCO
>> Time (mean ± σ): 151.3 ms ± 4.3 ms [User: 133.4 ms, System: 15.9 ms]
>> Range (min … max): 145.4 ms … 167.1 ms 19 runs
> In my local benchmarks I see similar results.
>
> I agree this isn't great, but git-log(1) is just very good at logging a
> single path. git-last-modified(1) is mostly designed to give commits
> for a bunch of paths. For example:
>
> $ hyperfine --warmup 5 'git ls-tree HEAD --name-only | xargs --max-args=1 git log --max-count=1 --format=oneline --' 'git last-modified'
> Benchmark 1: git ls-tree HEAD --name-only | xargs --max-args=1 git log --max-count=1 --format=oneline --
> Time (mean ± σ): 852.5 ms ± 9.2 ms [User: 703.8 ms, System: 141.9 ms]
> Range (min … max): 841.9 ms … 869.4 ms 10 runs
>
> Benchmark 2: git last-modified
> Time (mean ± σ): 141.2 ms ± 2.0 ms [User: 133.0 ms, System: 7.9 ms]
> Range (min … max): 137.7 ms … 146.0 ms 21 runs
>
> Summary
> git last-modified ran
> 6.04 ± 0.11 times faster than git ls-tree HEAD --name-only | xargs --max-args=1 git log --max-count=1 --format=oneline --

Only using git-last-modified when there are more than a few paths is
okay for how I want to use it. I was not really able to deduce this
from the manual; after reading the GitHub blog, the GitLab blog, and
the v2.52.0 release notes, my general impression was that it would be
a good replacement for git log -n1 in all cases.
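
A rough sketch of that kind of dispatch, for illustration only. The
function name `last_commit_cmd`, the `LAST_MODIFIED_THRESHOLD` variable
and its default of 3 paths are assumptions made up for this example,
not anything from the manual; the real cutoff would have to be measured
per repository. The helper only prints the commands it would choose
(paths are not shell-quoted in that output), so it can be tried without
a repository.

```shell
# Hypothetical helper: pick git-log(1) for a few paths and
# git-last-modified(1) for many. The threshold is an assumed value.
LAST_MODIFIED_THRESHOLD=${LAST_MODIFIED_THRESHOLD:-3}

last_commit_cmd() {
    if [ "$#" -eq 0 ]; then
        echo "usage: last_commit_cmd <path>..." >&2
        return 2
    fi
    if [ "$#" -lt "$LAST_MODIFIED_THRESHOLD" ]; then
        # Few paths: print one git-log invocation per path.
        for p in "$@"; do
            printf 'git log --max-count=1 --format=oneline -- %s\n' "$p"
        done
    else
        # Many paths: print a single git-last-modified invocation.
        printf 'git last-modified --'
        printf ' %s' "$@"
        printf '\n'
    fi
}
```

With the assumed default, `last_commit_cmd DCO` prints a single git-log
command, while `last_commit_cmd DCO LICENSE README.md` prints one
git-last-modified command covering all three paths.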
>> This might be me misunderstanding the feature, but it looks to me like this
>> cannot be used for paths that are inside a directory. The following two
>> commands yield the same output:
>>
>> $ git last-modified -- web_src
>> 24019ef5e83fd7bed7f31ad09dd8d5f26b4bdc69 web_src
>> $ git last-modified -- web_src/svg
>> 24019ef5e83fd7bed7f31ad09dd8d5f26b4bdc69 web_src
>>
>> Where I expected the latter command to return the last commit of
>> web_src/svg.
> I agree this is confusing. And I plan to propose a change to this
> behavior. But at the moment, this is what you're supposed to do in
> this situation:
>
> $ git last-modified -- web_src
> 28e0af23faf6c8e8f353ba2ae818ee0f83fd3e5c web_src
> $ git last-modified -r --max-depth=0 -- web_src/svg
> b8f15e4ea09c6571872607874ae099269ea4b201 web_src/svg
>
> I plan to change the default behavior to basically behave like `-r
> --max-depth=0`. But I'm happy to hear your input if you think it should
> be something else?
> There's some context here[1], but as said, I might shift direction a bit
> toward making the default more intuitive.
>
> [1]: https://lore.kernel.org/git/20251126-toon-last-modified-zzzz-v1-0-608350df0caa@iotcl.com/

Oh, there's a whole new option! That's exactly what I was looking for
to get that behavior. Only returning the root-level information by
default looks and feels silly, and it reminds me of git-diff-tree's
default, so I would agree with making -r --max-depth=0 the default.
Returning the information for exactly the paths given sounds most
reasonable.

That said, given you mention that this command works best for multiple
paths, I can also imagine -r --max-depth=1 as the default, to nudge
people to use it for that purpose.
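
As a stopgap until the default changes, the two candidate defaults can
be compared with a small wrapper. This is only a sketch: the function
name `last_modified_exact_cmd` and the `LAST_MODIFIED_DEPTH` variable
are hypothetical, and the `-r --max-depth=<n>` flags are taken as-is
from the example quoted above, without checking which release carries
them. It prints the command rather than running it.

```shell
# Hypothetical helper: print the git-last-modified invocation that
# recurses to an assumed depth (0 by default) below the given paths.
last_modified_exact_cmd() {
    depth=${LAST_MODIFIED_DEPTH:-0}
    if [ "$#" -eq 0 ]; then
        echo "usage: last_modified_exact_cmd <path>..." >&2
        return 2
    fi
    # Flags copied from the quoted example; not verified against a release.
    printf 'git last-modified -r --max-depth=%s --' "$depth"
    printf ' %s' "$@"
    printf '\n'
}
```

Running `last_modified_exact_cmd web_src/svg` prints the depth-0 form;
setting `LAST_MODIFIED_DEPTH=1` switches it to the depth-1 variant.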
>> I'm not sure why I tried this, but I can trigger a BUG when giving it some
>> nonsense input:
>>
>> $ git last-modified fb06ce04173d47aaaa498385621cba8b8dfd7584
>> BUG: builtin/last-modified.c:456: paths remaining beyond boundary in last-modified
>> [1] 690163 IOT instruction (core dumped) git last-modified
>>
>> `fb06ce04173d47aaaa498385621cba8b8dfd7584` is the tree object ID of
>> web_src. I suppose this should've returned a nice error message or blank
>> output. It does give a blank output when you specify a valid path:
>>
>> $ git last-modified fb06ce04173d47aaaa498385621cba8b8dfd7584 web_src
>>
> Hah, that sounds like a real bug. Thanks for reporting, I will look into
> it.
>
>> Kind regards,
>> Gusted
>>
>>
Kind Regards
Gusted