public inbox for git@vger.kernel.org 
From: Patrick Steinhardt <ps@pks•im>
To: Martin Fick <mfick@nvidia•com>
Cc: Jeff King <peff@peff•net>,
	"brian m. carlson" <sandals@crustytoothpaste•net>,
	"git@vger•kernel.org" <git@vger•kernel.org>
Subject: Re: Slow git pack-refs --all
Date: Wed, 7 Jan 2026 12:42:56 +0100
Message-ID: <aV5GwOS_N2jyIFaz@pks.im>
In-Reply-To: <CH3PR12MB9026F1E4B99D32E138800EEBC287A@CH3PR12MB9026.namprd12.prod.outlook.com>

On Tue, Jan 06, 2026 at 11:02:19PM +0000, Martin Fick wrote:
> > From: Patrick Steinhardt <ps@pks•im> Sent: Monday, January 5, 2026 11:53 PM
> > On Mon, Jan 05, 2026 at 11:45:41PM +0000, Martin Fick wrote:
> > > OK, after discovering the strace -r and -T options, I have determined that
> > > the 29K writes were all very fast in themselves. However, most of the
> > > writes seem to follow each other with no other system calls in between.
> > > This explains why it looks like the writes are slow, even though they aren't.
> > >
> > > If I tally up the time between the previous system call, and each write(),
> > > it adds up to the bulk of the time (4mins out of 4m15s) that it takes to
> > > pack refs. This tells me that no visible I/O or system calls are the problem,
> > > but rather that the program itself is taking a long time between writes.
> > > I very much doubt that this is heavy CPU time, but rather I am going to
> > > guess that this is hidden system time spent accessing mmaped memory.
> > > Could it be really slow reading the packed-refs file? I can see the
> > > packed-refs file is mmaped() before the writes start, and then
> > > munmapped after the writes are completed. If I had to guess, that likely
> > > means that the packed-refs file is being read in small increments by the
> > > kernel via mmap, and that is what is making things very slow over NFS.
> > 
> > I wouldn't be surprised if NFS was the culprit. At GitLab we found it to
> > be a constant source of issues, which is why we eventually sunsetted the
> > use of it completely. Do you use any special flags for mounting the NFS
> > filesystem?
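
The tally described above (time spent before each write() that is not
accounted for by a system call) can be approximated with a small script.
This is only a sketch: it assumes the trace was taken with both -r and -T,
that -r prints the time since the start of the previous system call, and
that -T appends the call duration as <seconds>. The names tally_gaps,
CALL_RE and DUR_RE are made up for this example.

```python
import re
import sys

# Leading relative timestamp (strace -r), optionally preceded by a pid
# column (strace -f), followed by the syscall name.
CALL_RE = re.compile(r"^\s*(?:\d+\s+)?(\d+\.\d+)\s+(\w+)\(")
# Trailing "<seconds>" duration (strace -T).
DUR_RE = re.compile(r"<(\d+\.\d+)>\s*$")


def tally_gaps(lines, syscall="write"):
    """Return (count, gap_seconds, in_call_seconds) for the given syscall.

    gap_seconds estimates the time between the end of the previous system
    call and the start of each matching call, i.e. time the process spent
    outside any traced system call (CPU work, page faults on mmap, ...).
    """
    count = 0
    gap = 0.0
    in_call = 0.0
    prev_dur = 0.0
    for line in lines:
        m = CALL_RE.match(line)
        if not m:
            continue  # signals, unfinished/resumed fragments, exit lines
        rel = float(m.group(1))
        name = m.group(2)
        d = DUR_RE.search(line)
        dur = float(d.group(1)) if d else 0.0
        if name == syscall:
            count += 1
            # rel is measured from the START of the previous call, so
            # subtract that call's duration to get the idle gap.
            gap += max(rel - prev_dur, 0.0)
            in_call += dur
        prev_dur = dur
    return count, gap, in_call


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as trace:
        n, gap, in_call = tally_gaps(trace)
    print(f"{n} write() calls, {gap:.3f}s before them, {in_call:.3f}s inside them")
```

A large gap total next to a small in-call total would match the
observation that the writes themselves are fast.
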
> 
> I am open to alternatives to NFS. Do you know of any NFS alternatives that
> provide instantaneous replication to potentially hundreds of mirrors? I
> have used Gerrit and git-daemon for many years on NFS, and it generally
> has performed very well for us, and it solves many real performance issues
> which I have yet to find a viable alternative able to even come close to
> matching. NFS, with all its warts, it is for us (and likely will be for many) until
> there is a viable enterprise-ready alternative with low (zero) replication
> latency and high throughput.

Yeah, agreed, NFS can get you a long way, but you eventually start to
hit roadblocks once you reach a certain scale. Unfortunately
though, there isn't really a ready-made alternative solution that serves
your needs, or at least none that I know of. That's why GitLab
eventually settled on Gitaly Cluster with Praefect handling replication,
and why GitHub has its Spokes architecture that does basically the same
thing.

> That being said, NFS can cause many issues. In this case, I would say that
> something is particularly "broken" here with git, and I believe that it
> would be helpful to the git community to be aware of this fairly specific 
> broken case which clearly has a lot of room for improvement (as seen
> by the fact that jgit, in java, can do essentially the same thing more
> than 10x faster). While I have been mostly assuming that this is a
> particularly specific bad case since git daemon generally is fast for most
> users, this might actually be something that if improved would greatly 
> improve many parts of git (not just this use case).

Chances are that if we can improve the case for NFS, other filesystems
might benefit, as well. So if this is something that we can improve I
agree that we should. It's too early to tell though, as we don't really
know what the actual root cause is just yet.

> It would be nice to improve git to not hold the packed-refs.lock so long 
> to avoid this blocking behavior on servers. Of course, to be fair, this 
> likely only blocks Gerrit servers since Gerrit uses the packed-refs file to 
> perform atomic updates for many things, and most other servers use 
> loose refs instead. It would be great if git were optimized to avoid any 
> unnecessary reads while the lock is held.  In theory, almost all of the 
> data that git needs to read here (including tags for peeling) could be 
> read before acquiring the lock, and it would only need to double 
> check certain reads after it acquires the lock in case things changed. 
> That wouldn't make git pack-refs faster, but it would drastically 
> reduce the impact of any problematic I/O by not holding the lock for 
> almost the entire operation.

It can probably be improved, true. I think that it's a bit of a wasted
effort, as I'd rather invest the time into improving reftables as a more
future-proof solution. But as you are well aware I'm quite biased here,
and I'd welcome any efforts to also improve the files backend. I am just
unlikely to work on it myself :)
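
For illustration, the "read before the lock, re-check after" idea from the
quoted paragraph could look roughly like the following. This is a sketch
under stated assumptions, not how the files backend works: it treats the
file as an opaque blob, uses a stat() triple (inode, size, mtime) as the
change check, and the names snapshot and rewrite_with_short_lock are
hypothetical. On NFS, attribute caching may make such a stat()-based check
unreliable, so a real implementation would need something stronger.

```python
import os


def snapshot(path):
    """Cheap identity of the file's current state, or None if it is missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def rewrite_with_short_lock(path, transform, max_attempts=5):
    """Rewrite path as transform(old_bytes), holding path.lock only briefly.

    Raises FileExistsError if another process already holds the lock.
    Returns False if the file kept changing for max_attempts rounds.
    """
    lock_path = path + ".lock"
    for _ in range(max_attempts):
        # Slow part: read and compute the new contents WITHOUT the lock.
        before = snapshot(path)
        old = b""
        if before is not None:
            with open(path, "rb") as f:
                old = f.read()
        new = transform(old)

        # Fast part: O_EXCL makes creation fail if the lock already exists.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        with os.fdopen(fd, "wb") as lock_file:
            # Double check: did anything change since we started reading?
            stale = snapshot(path) != before
            if not stale:
                lock_file.write(new)
                lock_file.flush()
                os.fsync(lock_file.fileno())
        if stale:
            os.unlink(lock_path)  # drop the lock and redo the slow part
            continue
        os.replace(lock_path, path)  # atomic rename commits and unlocks
        return True
    return False
```

The lock is then held only for the re-check, the write and the rename,
at the cost of redoing the slow part when a concurrent update slips in.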

> > Did you try using perf(1) to profile the process and generate a flame
> > graph from it? That should likely make it immediately obvious where Git
> > is spending all of its time.
> 
> I will pursue this. Unfortunately this might be difficult on this 
> particular server.

True, on the server side this can be a bit tricky.

Patrick
