From: Jeff King <peff@peff•net>
To: Martin Fick <mfick@nvidia•com>
Cc: "brian m. carlson" <sandals@crustytoothpaste•net>,
"git@vger•kernel.org" <git@vger•kernel.org>
Subject: Re: Slow git pack-refs --all
Date: Tue, 6 Jan 2026 05:38:03 -0500
Message-ID: <20260106103803.GA69061@coredump.intra.peff.net>
In-Reply-To: <CH3PR12MB90260C4887067C88629BBE52C286A@CH3PR12MB9026.namprd12.prod.outlook.com>
On Mon, Jan 05, 2026 at 11:45:41PM +0000, Martin Fick wrote:
> By repacking to get one used, and one cruft pack only, and no loose
> objects, I have confirmed that pack-refs is still slow. This rules out the
> idea that the loose object, or pack file counts were making things slow.
OK, that is interesting. I'd still expect opening the objects to be the
dominating factor, but now the load would be on jumping around the
mmap'd packfile rather than open/read/close calls.
> OK, after discovering the strace -r and -T options, I have determined that
> the 29K writes were all very fast in themselves. However, most of the
> writes seem to follow each other with no other system calls in between.
> This explains why it looks like the writes are slow, even though they aren't.
>
> If I tally up the time between the previous system call, and each write(),
> it adds up to the bulk of the time (4mins out of 4m15s) that it takes to
> pack refs. This tells me that no visible I/O or system calls are the problem,
> but rather that the program itself is taking a long time between writes.
> I very much doubt that this is heavy CPU time, but rather I am going to
> guess that this is hidden system time spent accessing mmaped memory.
That would be consistent with reading object data from the packfile.
We'll jump around within the packfile to get that data.
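
The tally described in the quoted text can be sketched roughly as below. This is only an illustration, not a script anyone in the thread ran: the function name tally_gaps_before, the regular expression, and the assumed line shape (strace -r -T without -f, so no PID column) are all assumptions. It relies on -r reporting the time between the starts of successive system calls, so the time spent outside the kernel before a call is that delta minus the previous call's -T duration.

```python
import re

# Assumed line shape from `strace -r -T` (no -f, so no PID column):
#      0.010100 write(3, "abc"..., 4096) = 4096 <0.000020>
# group 1: time since the previous call started (-r)
# group 2: syscall name
# group 3: time spent inside this call (-T)
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s+(\w+)\(.*<(\d+\.\d+)>\s*$")


def tally_gaps_before(lines, syscall="write"):
    """Return (count, seconds) of time spent outside the kernel
    immediately before each occurrence of `syscall`."""
    total_gap = 0.0
    count = 0
    prev_duration = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            # Signals, unfinished/resumed calls, exit lines: skip, and do
            # not trust the gap for whatever call comes next.
            prev_duration = None
            continue
        delta = float(m.group(1))
        name = m.group(2)
        duration = float(m.group(3))
        if name == syscall and prev_duration is not None:
            # Gap = time since previous call began, minus the time the
            # previous call itself took.
            total_gap += max(0.0, delta - prev_duration)
            count += 1
        prev_duration = duration
    return count, total_gap
```

Feeding it the lines of a saved trace (for example, the contents of a file written with strace's -o option) gives the number of matching calls and the summed gap in seconds. A large sum with short individual write() durations is the pattern described above.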
> Could it be really slow reading the packed-refs file? I can see the
> packed-refs file is mmaped() before the writes start, and then
> munmapped after the writes are completed. If I had to guess, that likely
> means that the packed-refs file is being read in small increments by the
> kernel via mmap, and that is what is making things very slow over NFS.
The packed-refs file is mmap'd, but we'll be reading it sequentially. I
guess whether or not there is good read-ahead there may depend on the
NFS implementation.
> My alternative theory is that each ref is being looked up via a binary
> search, but I don't think git does this?
Git does binary search within the packed-refs file, but it shouldn't be
doing so here. The write-out phase of packing refs is a straight merge
between two lists: the existing packed-refs entries and the new entries
we are adding.
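
To make "straight merge" concrete, here is a minimal sketch of merging two name-sorted lists in one linear pass. It is not Git's actual code: the function name merge_refs, the (refname, oid) tuple layout, and the rule that the new entry wins on a name collision are assumptions for illustration, and ref deletions are not modeled.

```python
def merge_refs(packed, updates):
    """Merge two lists of (refname, oid) pairs, each sorted by refname.

    Assumption for this sketch: when a name appears in both lists, the
    entry from `updates` replaces the one from `packed`.
    """
    out = []
    i = j = 0
    while i < len(packed) and j < len(updates):
        pname = packed[i][0]
        uname = updates[j][0]
        if pname < uname:
            out.append(packed[i])
            i += 1
        elif pname > uname:
            out.append(updates[j])
            j += 1
        else:
            # Same refname in both lists: keep the new value only.
            out.append(updates[j])
            i += 1
            j += 1
    # At most one of these tails is non-empty.
    out.extend(packed[i:])
    out.extend(updates[j:])
    return out
```

Each input is walked once from front to back, so no per-ref binary search is needed and the existing file is consumed sequentially.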
I'd second Patrick's suggestion to use perf or similar to try to see
where the time is going.

You might also try building Git with NO_MMAP. That might make the I/O
costs more apparent via strace, because they'll be coming via pread().
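
As a rough illustration of why the two access styles look different under strace, the sketch below reads the same file once through a memory map and once through explicit pread() calls. It says nothing about how Git implements NO_MMAP; the function names read_via_mmap and read_via_pread and the 4096-byte chunk size are assumptions. With the mmap variant the per-chunk reads are page faults and produce no system calls after the initial mmap(), while the pread variant issues one traceable call per chunk.

```python
import mmap
import os


def read_via_mmap(path, chunk=4096):
    """Touch every chunk of the file through a read-only memory map.
    Returns the number of bytes touched."""
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # mmap refuses to map an empty file
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for off in range(0, size, chunk):
                # Slicing copies bytes out of the mapping; any I/O happens
                # via page faults, not via a visible read syscall.
                total += len(m[off:off + chunk])
    return total


def read_via_pread(path, chunk=4096):
    """Read the whole file with explicit pread() calls.
    Returns the number of bytes read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        off = 0
        while True:
            buf = os.pread(fd, chunk, off)  # one traceable syscall per chunk
            if not buf:
                break
            total += len(buf)
            off += len(buf)
    finally:
        os.close(fd)
    return total
```

Running each function under strace -T on a file that lives on the slow filesystem should show the cost attributed to individual pread() calls in the second case, whereas in the first case it only shows up as gaps between other calls.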
-Peff