public inbox for git@vger.kernel.org 
From: Toon Claes <toon@iotcl•com>
To: Gusted <gusted@codeberg•org>, git@vger•kernel.org
Subject: Re: git-last-modified weirdness
Date: Mon, 05 Jan 2026 11:57:51 +0100
Message-ID: <87v7hgpbrk.fsf@iotcl.com>
In-Reply-To: <03f96860-29fc-42a7-a220-c3ec65eb8516@codeberg.org>

Gusted <gusted@codeberg•org> writes:

> Hi,
>
> Resending this mail as it looks like it might not have arrived (couldn't 
> find it in the mailing list archive).

Thanks for following up. I hadn't seen it yet.

> For Forgejo, I wanted to look into using git-last-modified to gain extra
> performance for larger repositories, where this can often be (one of)
> the slowest Git operations. However, I noticed some problems that
> look to be bugs.
>
> I've run all the following commands on the following Git repository, on Git
> v2.52.0 (Arch Linux), and my git config does not enable or disable any
> feature that should've impacted any of the following observations.
>
> $ tmp=$(mktemp -d)
> $ git clone https://codeberg.org/forgejo/forgejo $tmp
> $ cd $tmp
>
> During some experiments I noticed it being slower for some files. An 
> example:
>
> $ hyperfine --warmup 5 'git log --max-count=1 DCO' 'git last-modified DCO'
> Benchmark 1: git log --max-count=1 DCO
>    Time (mean ± σ):      86.9 ms ±   0.8 ms    [User: 70.1 ms, System: 15.6 ms]
>    Range (min … max):    85.5 ms …  88.3 ms    34 runs
>
> Benchmark 2: git last-modified DCO
>    Time (mean ± σ):     151.3 ms ±   4.3 ms    [User: 133.4 ms, System: 15.9 ms]
>    Range (min … max):   145.4 ms … 167.1 ms    19 runs

In my local benchmarks I see similar results.

I agree this isn't great, but git-log(1) is just very good at logging a
single path. git-last-modified(1) is mostly designed to give commits
for a bunch of paths. For example:

    $ hyperfine --warmup 5 'git ls-tree HEAD --name-only | xargs --max-args=1 git log --max-count=1 --format=oneline --' 'git last-modified'
    Benchmark 1: git ls-tree HEAD --name-only | xargs --max-args=1 git log --max-count=1 --format=oneline --
      Time (mean ± σ):     852.5 ms ±   9.2 ms    [User: 703.8 ms, System: 141.9 ms]
      Range (min … max):   841.9 ms … 869.4 ms    10 runs

    Benchmark 2: git last-modified
      Time (mean ± σ):     141.2 ms ±   2.0 ms    [User: 133.0 ms, System: 7.9 ms]
      Range (min … max):   137.7 ms … 146.0 ms    21 runs

    Summary
      git last-modified ran
        6.04 ± 0.11 times faster than git ls-tree HEAD --name-only | xargs --max-args=1 git log --max-count=1 --format=oneline --

> This might be me misunderstanding the feature, but it looks to me like this
> cannot be used for paths that are inside a directory. The following two
> commands yield the same output:
>
> $ git last-modified -- web_src
> 24019ef5e83fd7bed7f31ad09dd8d5f26b4bdc69        web_src
> $ git last-modified -- web_src/svg
> 24019ef5e83fd7bed7f31ad09dd8d5f26b4bdc69        web_src
>
> Where I expected the latter command to return the last commit of 
> web_src/svg.

I agree this is confusing, and I plan to propose a change to this
behavior. For now, this is what you're supposed to do in this
situation:

    $ git last-modified -- web_src
    28e0af23faf6c8e8f353ba2ae818ee0f83fd3e5c        web_src
    $ git last-modified -r --max-depth=0 -- web_src/svg
    b8f15e4ea09c6571872607874ae099269ea4b201        web_src/svg

I plan to change the default behavior to basically behave like `-r
--max-depth=0`, but I'm happy to hear your input if you think it should
be something else.

There's some context here[1], but as said, I might shift direction a bit
toward making the default more intuitive.

[1]: https://lore.kernel.org/git/20251126-toon-last-modified-zzzz-v1-0-608350df0caa@iotcl.com/
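
Until the default changes, the `-r --max-depth=0` workaround could be
wrapped in a small shell helper. This is only a sketch: the function
name `last_modified_of` is made up for illustration, and it assumes a
Git whose git-last-modified(1) accepts `-r` and `--max-depth` as in the
commands shown earlier in this mail.

```shell
# Hypothetical helper (the name is an assumption, not part of Git):
# print the last-modified entry for exactly one path, including paths
# below the top level such as web_src/svg.
last_modified_of() {
    if [ "$#" -ne 1 ]; then
        echo "usage: last_modified_of <path>" >&2
        return 2
    fi
    # Same flags as the workaround in this mail: -r recurses into trees,
    # and --max-depth=0 keeps the result at the given path itself.
    git last-modified -r --max-depth=0 -- "$1"
}
```

It would be called as `last_modified_of web_src/svg` from inside the
repository.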

> I'm not sure why I tried this, but I can trigger a BUG when giving it some
> nonsense input:
>
> $ git last-modified fb06ce04173d47aaaa498385621cba8b8dfd7584
> BUG: builtin/last-modified.c:456: paths remaining beyond boundary in
> last-modified
> [1]    690163 IOT instruction (core dumped)  git last-modified
>
> `fb06ce04173d47aaaa498385621cba8b8dfd7584` is the tree commit id of web_src.
> I suppose this should've returned a nice error message or blank output. It
> does give a blank output when you specify a valid path:
>
> $ git last-modified fb06ce04173d47aaaa498385621cba8b8dfd7584 web_src
>

Hah, that sounds like a real bug. Thanks for reporting, I will look into
it.
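
In the meantime, a caller could guard against this case before invoking
the command. The sketch below assumes the BUG is triggered because the
argument is taken as a revision that resolves to a tree rather than a
commit, which is not confirmed here. The function name `is_commit_ish`
is hypothetical; `git rev-parse --verify` and the `^{commit}` peel
syntax are existing Git features.

```shell
# Hypothetical guard (the name is an assumption, not part of Git):
# succeed only if the argument can be peeled to a commit.
is_commit_ish() {
    # <rev>^{commit} fails for objects that are not commit-ish, such as
    # a bare tree ID; all output is discarded, only the status matters.
    git rev-parse --verify --quiet "$1^{commit}" >/dev/null 2>&1
}
```

A script could then run `is_commit_ish "$rev" || exit 1` before passing
`$rev` on as a revision argument.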

> Kind regards,
> Gusted
>
>

-- 
Cheers,
Toon
