From: Derrick Stolee <stolee@gmail•com>
To: Taylor Blau <me@ttaylorr•com>, git@vger•kernel.org
Cc: Junio C Hamano <gitster@pobox•com>, Jeff King <peff@peff•net>,
Elijah Newren <newren@gmail•com>
Subject: Re: [PATCH 0/3] pack-objects: support bitmaps and delta-islands with `--path-walk`
Date: Fri, 29 May 2026 13:26:33 -0400
Message-ID: <22a7e32f-f645-4f00-bc5b-6b4309e483c2@gmail.com>
In-Reply-To: <a708e23d-e0c2-48c9-86e9-1227f12edd53@gmail.com>
On 5/28/26 11:28 AM, Derrick Stolee wrote:
> On 5/27/26 7:18 PM, Taylor Blau wrote:
> Do you have any end-to-end performance data to demonstrate that these
> changes are effective at scale? Are we still producing packfiles with the
> pack-file compression and now with .bitmap files? How does this impact
> the performance of a clone or fetch when using a bitmap index at read
> time?
Here's my attempt to use our existing performance tests to analyze the
impact of this series.
Running p5311 against the base of this topic and this topic with
GIT_TEST_PACK_PATH_WALK=1, I get this output:
Test                                        HEAD~3  HEAD
--------------------------------------------------------------------
5311.4: server (1 days) (lookup=true)       0.02    0.03    +50.0%
5311.5: size (1 days)                       6.8K    124.9K  +1730.9%
5311.6: client (1 days) (lookup=true)       0.02    0.01    -50.0%
5311.8: server (2 days) (lookup=true)       0.02    0.03    +50.0%
5311.9: size (2 days)                       6.8K    124.9K  +1730.9%
5311.10: client (2 days) (lookup=true)      0.02    0.01    -50.0%
5311.12: server (4 days) (lookup=true)      0.02    0.03    +50.0%
5311.13: size (4 days)                      6.8K    124.9K  +1730.9%
5311.14: client (4 days) (lookup=true)      0.02    0.01    -50.0%
5311.16: server (8 days) (lookup=true)      0.03    0.03    +0.0%
5311.17: size (8 days)                      37.3K   186.0K  +398.2%
5311.18: client (8 days) (lookup=true)      0.03    0.02    -33.3%
5311.20: server (16 days) (lookup=true)     0.02    0.03    +50.0%
5311.21: size (16 days)                     37.3K   186.0K  +398.2%
5311.22: client (16 days) (lookup=true)     0.03    0.02    -33.3%
5311.24: server (32 days) (lookup=true)     0.03    0.03    +0.0%
5311.25: size (32 days)                     46.5K   197.2K  +324.3%
5311.26: client (32 days) (lookup=true)     0.03    0.02    -33.3%
5311.28: server (64 days) (lookup=true)     0.24    0.16    -33.3%
5311.29: size (64 days)                     1.5M    5.1M    +239.8%
5311.30: client (64 days) (lookup=true)     0.42    0.35    -16.7%
5311.32: server (128 days) (lookup=true)    0.49    0.29    -40.8%
5311.33: size (128 days)                    4.1M    9.8M    +139.5%
5311.34: client (128 days) (lookup=true)    0.86    0.65    -24.4%
5311.38: server (1 days) (lookup=false)     0.02    0.03    +50.0%
5311.39: size (1 days)                      6.8K    124.9K  +1730.9%
5311.40: client (1 days) (lookup=false)     0.02    0.02    +0.0%
5311.42: server (2 days) (lookup=false)     0.02    0.03    +50.0%
5311.43: size (2 days)                      6.8K    124.9K  +1730.9%
5311.44: client (2 days) (lookup=false)     0.02    0.02    +0.0%
5311.46: server (4 days) (lookup=false)     0.02    0.03    +50.0%
5311.47: size (4 days)                      6.8K    124.9K  +1730.9%
5311.48: client (4 days) (lookup=false)     0.02    0.02    +0.0%
5311.50: server (8 days) (lookup=false)     0.02    0.03    +50.0%
5311.51: size (8 days)                      37.3K   186.0K  +398.2%
5311.52: client (8 days) (lookup=false)     0.03    0.02    -33.3%
5311.54: server (16 days) (lookup=false)    0.02    0.03    +50.0%
5311.55: size (16 days)                     37.3K   186.0K  +398.2%
5311.56: client (16 days) (lookup=false)    0.03    0.02    -33.3%
5311.58: server (32 days) (lookup=false)    0.03    0.03    +0.0%
5311.59: size (32 days)                     46.5K   197.2K  +324.3%
5311.60: client (32 days) (lookup=false)    0.03    0.02    -33.3%
5311.62: server (64 days) (lookup=false)    0.25    0.17    -32.0%
5311.63: size (64 days)                     1.5M    5.1M    +239.8%
5311.64: client (64 days) (lookup=false)    0.43    0.37    -14.0%
5311.66: server (128 days) (lookup=false)   0.50    0.29    -42.0%
5311.67: size (128 days)                    4.1M    9.8M    +138.6%
5311.68: client (128 days) (lookup=false)   0.87    0.67    -23.0%
It's important to realize that even with the test variable, the
path-walk logic is overriding the bitmap logic in the HEAD~3
case.
What's happening is that the path-walk mode without bitmaps (the
HEAD~3 column) is computing a smaller packfile for all of these
cases. Some are significantly smaller, but only when it's a very
small pack anyway. The bitmap case (the HEAD column) is faster
only for the larger fetches.
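
For anyone checking the table, the last column of the timing rows
is the relative change from HEAD~3 to HEAD. Here is a minimal
sketch of that arithmetic. It is not part of the perf suite:
`pct_change` and `server_rows` are names made up for illustration,
and the values are copied by hand from the lookup=true server rows
in the table.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (label, HEAD~3 seconds, HEAD seconds), copied by hand from the
# "server ... (lookup=true)" rows; hypothetical container name.
server_rows = [
    ("1 days", 0.02, 0.03),
    ("32 days", 0.03, 0.03),
    ("64 days", 0.24, 0.16),
    ("128 days", 0.49, 0.29),
]

# Round to one decimal place, matching the precision shown in the table.
changes = {
    label: round(pct_change(base, new), 1)
    for label, base, new in server_rows
}

for label, delta in changes.items():
    print(f"server ({label}): {delta:+.1f}%")
```

Only the two largest windows come out negative, which is the
"faster only for larger fetches" pattern. The size rows do not
recompute exactly from the printed values, presumably because the
sizes in the table are rounded.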
I did the same test without the path-walk feature and both columns
looked the same (as expected, no change due to this series) and
the data matched the path-walk test's HEAD column pretty closely.
So this shows that adding path-walk to bitmap-focused efforts is
not a regression on any of these dimensions.
This test was for my local copy of the Git repository, including
all the forks I fetch. I hoped the results would be different
for repositories that have data shapes that struggle with
name-hash collisions, but microsoft/fluentui is an example that
I've used for path-walk repacks before and it had similar data.
Do you have a good feeling for why the path-walk feature doesn't
make a huge change in these test scenarios?
Thanks,
-Stolee