From: Jeff King <peff@peff•net>
To: Arijit Banerjee via GitGitGadget <gitgitgadget@gmail•com>
Cc: git@vger•kernel.org, "Ævar Arnfjörð Bjarmason" <avarab@gmail•com>,
    "Junio C Hamano" <gitster@pobox•com>,
    "Derrick Stolee" <stolee@gmail•com>,
    "Arijit Banerjee" <arijit91@gmail•com>,
    "Arijit Banerjee" <arijit@effectiveailabs•com>
Subject: Re: [PATCH v3] index-pack: retain child bases in delta cache
Date: Thu, 4 Jun 2026 03:12:04 -0400
Message-ID: <20260604071204.GA3196596@coredump.intra.peff.net>
In-Reply-To: <pull.2131.v3.git.1780445118653.gitgitgadget@gmail.com>
On Wed, Jun 03, 2026 at 12:05:17AM +0000, Arijit Banerjee via GitGitGadget wrote:

>  * Addressed Jeff King's review question by releasing cached base data
>    after all direct children have been dispatched, while keeping the
>    existing subtree bookkeeping intact.
>  * Re-ran t/t5302-pack-index.sh, p5302-pack-index.sh, and end-to-end
>    full clone spot checks with the precise-release version.

Thanks for humoring me. I fully expected the answer to be "it is hard to
do and doesn't show much improvement, so let's not bother". ;)

It was hard to see the difference between v2 and v3 performance (which I
tried to dig out from the range-diff below), but it looks like it was
basically none. I did my own run of p5302 between the two versions using
both git.git and linux.git, and likewise didn't find anything.

I guess it would make a difference only if we were routinely expiring
useful items out of the cache due to the limit. And even though
linux.git is a "large" repo compared to git.git, cache locality here is
mostly based on how wide the delta tree for a file gets (that is, how
often we go down one chain, caching bases, while still finding it useful
to keep earlier parts of the chain to go down a parallel path).

And that probably has less to do with overall repo size than with how we
tend to pack things. Though I guess a repo with a lot of large files
might see more cache pressure (just because each single entry "costs"
more). We could simulate that by dropping the cache size in p5302, but I
still couldn't find any effect even with a tiny cache.

(Actually, with a tiny cache it looked like things got ~1% slower; maybe
noise, but maybe extra thread contention due to the release code?).

So I am happy with either v2 or v3.

-Peff
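
The point above about delta-tree width can be illustrated with a toy model. The sketch below is not index-pack's code: the function name `count_rebuilds`, the LRU eviction policy, and the "one unit per base" cost are all assumptions made for illustration. It walks a delta tree depth-first with a bounded cache and counts how often a base that was already built has to be rebuilt because it was evicted before a parallel path needed it.

```python
from collections import OrderedDict


def count_rebuilds(children, root, limit):
    """Toy model: depth-first walk of a delta tree with a bounded base cache.

    `children` maps a node to its list of delta children. `limit` is the
    maximum number of cached bases (an assumed stand-in for a byte limit,
    must be >= 1). Returns how many times an already-built base had to be
    rebuilt after eviction.
    """
    parent = {c: p for p, kids in children.items() for c in kids}
    cache = OrderedDict()  # hypothetical LRU cache, oldest entry first
    seen = set()           # nodes that have been built at least once
    rebuilds = 0

    def ensure(node):
        nonlocal rebuilds
        if node in cache:
            cache.move_to_end(node)  # mark as recently used
            return
        if node in parent:
            ensure(parent[node])     # a delta needs its base first
        if node in seen:
            rebuilds += 1            # built before, then evicted
        seen.add(node)
        cache[node] = True
        while len(cache) > limit:
            cache.popitem(last=False)  # evict least recently used

    def walk(node):
        ensure(node)
        for child in children.get(node, []):
            walk(child)

    walk(root)
    return rebuilds


# A single chain never revisits earlier bases, so even a tiny cache is enough.
chain = {"root": ["a"], "a": ["b"], "b": ["c"]}
# A wider tree returns to "root" after finishing the first chain.
wide = {"root": ["a", "x"], "a": ["b"], "b": ["c"], "x": ["y"]}

print(count_rebuilds(chain, "root", 1))  # 0
print(count_rebuilds(wide, "root", 2))   # 1
print(count_rebuilds(wide, "root", 4))   # 0
```

In this model the cache limit only matters when the tree is wide enough that an early base is evicted before a sibling path needs it, which is consistent with seeing no measurable difference on repositories whose delta trees are mostly narrow.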