From: Patrick Steinhardt <ps@pks•im>
To: Justin Tobler <jltobler@gmail•com>
Cc: git@vger•kernel.org,
"brian m. carlson" <sandals@crustytoothpaste•net>,
Karthik Nayak <karthik.188@gmail•com>,
K Jayatheerth <jayatheerthkulkarni2005@gmail•com>,
ryenus@gmail•com, Junio C Hamano <gitster@pobox•com>
Subject: Re: [PATCH 1/2] BreakingChanges: announce switch to "reftable" format
Date: Thu, 3 Jul 2025 07:00:21 +0200
Message-ID: <aGYOZVyaR_OYIhtl@pks.im>
In-Reply-To: <q6zyvqpyxobtp65ptrmkdg3kvc2plxmsltaurqf52hglitikir@5p5jpcqc577o>
On Wed, Jul 02, 2025 at 12:17:50PM -0500, Justin Tobler wrote:
> On 25/07/02 12:14PM, Patrick Steinhardt wrote:
> > diff --git a/Documentation/BreakingChanges.adoc b/Documentation/BreakingChanges.adoc
> > index c6bd94986c5..c96b5319cdd 100644
> > --- a/Documentation/BreakingChanges.adoc
> > +++ b/Documentation/BreakingChanges.adoc
> > @@ -118,6 +118,45 @@ Cf. <2f5de416-04ba-c23d-1e0b-83bb655829a7@zombino•com>,
> > <20170223155046.e7nxivfwqqoprsqj@LykOS•localdomain>,
> > <CA+EOSBncr=4a4d8n9xS4FNehyebpmX8JiUwCsXD47EQDE+DiUQ@mail•gmail.com>.
> >
> > +* The default storage format for references in newly created repositories will
> > + be changed from "files" to "reftable". The "reftable" format provides
> > + multiple advantages over the "files" format:
> > ++
> > + ** It is impossible to store two references that only differ in casing on
> > + case-insensitive filesystems with the "files" format. This issue is
> > + especially common on Windows, but also on older versions of macOS. As the
> > + "reftable" backend does not use filesystem paths anymore to encode
> > + reference names this problem goes away.
>
> I believe even modern macOS by default uses a case-insensitive
> file-system. Maybe we should instead say:
>
> This limitation is common on Windows and macOS platforms.
Okay, thanks for the clarification. I thought recent versions of macOS
were case-sensitive by default.
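
For illustration only, the collision discussed above can be modelled in a few lines of Python. This is a rough sketch, not Git's code: `fs_key` and `find_collisions` are hypothetical names, and `str.casefold()` is just an approximation of how a case-insensitive filesystem compares path names.

```python
def fs_key(refname: str) -> str:
    # Assumption: a case-insensitive filesystem treats names that differ
    # only in casing as the same path. casefold() approximates that.
    return refname.casefold()


def find_collisions(refnames):
    """Return pairs of distinct ref names that would map to the same path."""
    seen = {}
    collisions = []
    for name in refnames:
        key = fs_key(name)
        if key in seen and seen[key] != name:
            collisions.append((seen[key], name))
        else:
            seen[key] = name
    return collisions


refs = ["refs/heads/Feature", "refs/heads/feature", "refs/heads/main"]
print(find_collisions(refs))
```

With the "files" backend each loose ref is a file at such a path, so the first two names cannot coexist on a filesystem like this; a backend that does not encode ref names as paths has no such restriction.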
> > + ** Similarly, macOS normalizes path names that contain unicode characters,
> > + which has the consequence that you cannot store two names with unicode
> > + characters that are encoded differently with the "files" backend. Again,
> > + this is not an issue with the "reftable" backend.
> > + ** Deleting references with the "files" backend requires Git to rewrite the
> > + complete "packed-refs" file. In large repositories with many references
> > + this file can easily be dozens of megabytes in size; in extreme cases it
> > + may be gigabytes. The "reftable" backend uses tombstone markers for
> > + deleted references and thus does not have to rewrite all of its data.
> > + ** Repository housekeeping with the "files" backend typically performs
> > + all-into-one repacks of references. This can be quite expensive, and
> > + consequently housekeeping is a tradeoff between the number of loose
> > + references that accumulate and slow down operations that read references,
> > + and compressing those loose references into the "packed-refs" file. The
> > + "reftable" backend uses geometric compaction after every write, which
> > + amortizes costs and ensures that the backend is always in a
> > + well-maintained state.
> > + ** Operations that write multiple references at once are not atomic with the
> > + "files" backend. Consequently, Git may see in-between states when it reads
> > + references while a reference transaction is in the process of being
> > + committed to disk.
> > + ** Writing many references at once is slow with the "files" backend because
> > + every reference is created as a separate file. The "reftable" backend
> > + significantly outperforms the "files" backend by multiple orders of
> > + magnitude.
>
> The examples above do a good job of explaining individual technical
> benefits. I do wonder if we should include a more general statement
> aimed at users as to why the change to reftable is beneficial. Maybe
> something like:
>
> The reftable backend addresses several performance concerns as the
> number of references in a repository scales.
I think this would be a bit too handwavy. I'd rather point out the
specific cases where we know it to perform better.
Patrick