Re: [PATCH] config: retry acquiring config.lock for 100ms

public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed

From: Johannes Schindelin <Johannes.Schindelin@gmx•de>
To: Patrick Steinhardt <ps@pks•im>
Cc: "Jörg Thalheim" <joerg@thalheim•io>,
	"Junio C Hamano" <gitster@pobox•com>,
	git@vger•kernel.org
Subject: Re: [PATCH] config: retry acquiring config.lock for 100ms
Date: Sun, 17 May 2026 13:27:55 +0200 (CEST)	[thread overview]
Message-ID: <409d05a5-235b-6b19-5a33-a4e613dd447c@gmx.de> (raw)
In-Reply-To: <agGo9Prt8Hs2gbic@pks.im>

[-- Attachment #1: Type: text/plain, Size: 6855 bytes --]

Hi Patrick & Jörg,

On Mon, 11 May 2026, Patrick Steinhardt wrote:

> On Mon, May 11, 2026 at 09:06:00AM +0000, Jörg Thalheim wrote:
> > May 11, 2026 at 4:32 AM, "Junio C Hamano" <gitster@pobox•com
> > mailto:gitster@pobox•com?to=%22Junio%20C%20Hamano%22%20%3Cgitster%40pobox.com%3E
> > > wrote:
> > > Patrick Steinhardt <ps@pks•im> writes:
> > > > > This bites in practice when running `git worktree add -b` concurrently
> > > > >  against the same repository. Each invocation makes several writes to
> > > > >  ".git/config" to set up branch tracking, and tooling that creates
> > > > >  worktrees in parallel sees intermittent failures. Worse, `git worktree
> > > > >  add` does not propagate the failed config write to its exit code: the
> > > > >  worktree is created and the command exits 0, but tracking
> > > > >  configuration is silently dropped.
> > > > > 
> > > >  This very much sounds like a bug that is worth fixing independently.
> > > > 
> > > > > 
> > > > > The lock is held only for the duration of rewriting a small file, so
> > > > >  retrying for 100 ms papers over any realistic contention while still
> > > > >  failing fast if a stale lock has been left behind by a crashed
> > > > >  process. This mirrors what we already do for individual reference
> > > > >  locks (4ff0f01cb7 (refs: retry acquiring reference locks for 100ms,
> > > > >  2017-08-21)).
> > > > > 
> > > >  Famous last words :) Experience tells me that any timeout value that
> > > >  isn't excessive will eventually be hit in some production system. Which
> > > >  raises the question whether we want to make the timeout configurable,
> > > >  similar to "core.filesRefLockTimeout" and "core.packedRefsTimeout".
> > > >  ...
> > > >  Honestly though, I'm not really sure what to make with this. We could
> > > >  of course also add some validation that the configuration we want to set
> > > >  hasn't been modified meanwhile. But that would now lead to a situation
> > > >  where we have to update every single caller in our tree to make use of
> > > >  the new mechanism, which would be a bunch of work.
> > > > 
> > > >  And adding the timeout doesn't really change the status quo, either. We
> > > >  already have the case that we'll happily overwrite changes made by
> > > >  concurrent processes. The only thing that changes is that we make it
> > > >  more likely for concurrent changes to succeed.
> > > > 
> > > We haven't heard any response to these points raised in the message
> > > I am responding to. Should I still keep the patch in my tree,
> > > hoping that a responses may come some day? I am tempted to discard
> > > the topic as it has been quite a while since we last looked at it.
> > 
> > I am not really sure what you want me to do here.
> 
> In general, the idea here is to engage in a discussion that can
> ultimately lead to one of two outcomes:
> 
>   - The discussion surfaces an area the author hasn't thought about, so
>     the patch is adapted accordingly.
> 
>   - The discussion shows that the author already did think about the
>     issue, but hasn't documented the assumptions. In this case, it
>     should be the commit message that gets adapted.

For what it's worth, I meant to chime in earlier, but obligations kept
preventing me from setting aside the time to do so. Well, better late than
never.

> > I don't see how git can have this value configurable, given it's about
> > reading the configuration itself. Is the user supposed via command
> > line?
> 
> This is a fair point indeed. But if it's not possible to change via the
> configuration itself, then the next-best thing might be to introduce an
> environment variable that allows configuring it.

Well, given that the config is read first before it's written, it is
totally possible to configure a timeout via the config, and I have some
real-world proof that this works as intended (see below).

> The other aspect that wasn't discussed in the commit message is how
> concurrent writes are handled, both when they are non-conflicting
> (updating different keys) and when they are conflicting (updating the
> same key). After spending some more time in the code I think it's
> ultimately nothing we have to worry about too much, as we only start
> reading the configuration after we've locked it.

Correct. I had performed this analysis myself when writing a similar patch
to fix problems in Scalar's Functional Test suite, which wants to register
_many_ Scalar repositories with ~/.gitconfig concurrently. The current
iteration of the patch can be found here:

https://github.com/microsoft/git/commit/a1c2d97cb61bc3697086d1749de848586df2ec54

It does include the config setting, leaving the default as "off" (but I
missed the separate code path to rename sections, which has _independent_
code that also wants to lock the config file, which your patch did not
miss). The subsequent child commit

https://github.com/microsoft/git/commit/5d365c1f332b8d2214ae9c44970d6370ed9caffc

configures it to 150ms in Scalar repositories only. This is notably larger
than the 100ms you suggested, and it is rooted in the fact that NTFS I/O
characteristics are unfortunately in need of a wider margin. In other
words, the optimal value depends on the operating system (and the CPU
load, as Junio had pointed out).

For the record, feel free to adopt whatever you want from my patches for
your next iteration (but also feel free to ignore all of it).

> So in the semantically non-conflicting case there isn't really much of a
> race, because things already work as expected. But in the semantically
> conflicting case it's a bit different, as the latter writer will
> overwrite the result of the former one. In theory it would be possible
> to detect such conflicts by:
> 
>   - Reading the configuration file.
> 
>   - Taking the lock.
> 
>   - Rereading the configuration to check for conflicts.
> 
> But even that is racy as the first writer might have succeeded before we
> read the configuration the first time. So I'm not sure whether we can do
> anything about that in the first place, as the race basically exists in
> the outer loop controlled by the caller.
> 
> So there probably isn't much we can do about that, and unless I missed
> something I think your timeout is sensible. But ideally, such nuances
> would be discussed as part of the commit message so that reviewers and
> future readers are made aware of them.

I agree. Complex cases like this would require a sort of transactional
support to be added to `git config`, and that would in and of itself open
a can of worms I'm not sure we should open unless there is a concrete use
case that bites enough real-world scenarios to require acting upon.

Ciao,
Johannes

next prev parent reply	other threads:[~2026-05-17 11:28 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-03 10:01 [PATCH] config: retry acquiring config.lock for 100ms Joerg Thalheim
2026-04-03 17:53 ` Junio C Hamano
2026-04-08 10:34 ` Patrick Steinhardt
2026-05-11  2:32   ` Junio C Hamano
2026-05-11  7:33     ` Patrick Steinhardt
2026-05-11  9:06     ` Jörg Thalheim
2026-05-11 10:01       ` Patrick Steinhardt
2026-05-17 11:27         ` Johannes Schindelin [this message]
2026-05-17 13:21           ` [PATCH v2] config: retry acquiring config.lock, configurable via core.configLockTimeout Joerg Thalheim
2026-05-18  0:46             ` Junio C Hamano
2026-05-18  8:07               ` Patrick Steinhardt
2026-05-28 11:51             ` Johannes Schindelin
2026-05-28 12:23               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=409d05a5-235b-6b19-5a33-a4e613dd447c@gmx.de \
    --to=johannes.schindelin@gmx$(echo .)de \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=joerg@thalheim$(echo .)io \
    --cc=ps@pks$(echo .)im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox