From: Jeff King <peff@peff•net>
To: Taylor Blau <me@ttaylorr•com>
Cc: git@vger•kernel.org, Junio C Hamano <gitster@pobox•com>,
Elijah Newren <newren@gmail•com>,
Derrick Stolee <stolee@gmail•com>
Subject: Re: [PATCH 3/8] pack-bitmap: reuse stored selected bitmaps
Date: Wed, 27 May 2026 05:24:12 -0400 [thread overview]
Message-ID: <20260527092412.GD981444@coredump.intra.peff.net> (raw)
In-Reply-To: <6e1f6bef5f641481a6a875bc215b35fc56cef80c.1779207127.git.me@ttaylorr.com>
On Tue, May 19, 2026 at 12:12:41PM -0400, Taylor Blau wrote:
> Building bitmaps from scratch on the same test repository from the
> previous commits yields a significant speed-up:
>
> +------------------+-------------+-------------+---------------------+
> | | HEAD^ | HEAD | Delta |
> +------------------+-------------+-------------+---------------------+
> | elapsed | 562.8 s | 324.8 s | -237.9 s (-42.3%) |
> | cycles | 2,621.3 B | 1,508.6 B | -1,112.7 B (-42.4%) |
> | instructions | 2,348.9 B | 1,436.6 B | -912.3 B (-38.8%) |
> | CPI | 1.116 | 1.050 | -0.066 (-5.9%) |
> +------------------+-------------+-------------+---------------------+
Oh my, that's a rather nice speedup. I can reproduce here on linux.git
(~47% improvement).
> When `fill_bitmap_commit()` reaches an ancestor that was selected for
> its own bitmap and processed earlier, its object closure is already
> stored in `writer->bitmaps` as an EWAH bitmap. As a result, walking
> through that commit's tree and parents again is redundant.
>
> Teach `fill_bitmap_commit()` to notice that case. For non-root commits in
> the walk, look for a stored selected bitmap and OR it into the bitmap
> being built. If one exists, skip the commit, its tree, and its parents.
I feel like this _shouldn't_ be necessary, because the idea of the
current writing code is to go from the roots up, following inverted
parent pointers, and passing the bitmap up as we go. So whenever we
visit a commit we should in theory have all of the ancestor's bits set
in that bitmap. But I remember that the simple-and-stupid approach ended
up being too memory hungry, so we pick some focal points in the graph
and then fill them independently.
And I guess that's what you're showing here:
> In our testing repository, there are 1,261 commits selected for bitmap
> coverage, and 1,382 maximal commits induced as a result of that. Of the
> 1,382 calls made to `fill_bitmap_commit()` (one per maximal commit), 131
> of them can be short-circuited at some point during their traversal as a
> consequence of this change.
We'll end up seeing some of the same parts of history for various
maximal commits, and this lets us sometimes reuse the earlier efforts.
Anyway, it is hard to argue with the numbers.
> @@ -553,6 +559,28 @@ static int fill_bitmap_commit(struct bitmap_writer *writer,
> bitmap_free(remapped);
> }
>
> + /*
> + * If we encounter an ancestor for which we have already
> + * computed a bitmap during this build (i.e. a regular
> + * selected commit processed earlier in topo order), we can
> + * short-circuit the walk: its stored bitmap already covers
> + * the commit itself, its tree, and all of its ancestors.
> + */
> + if (c != commit) {
> + khiter_t hash_pos = kh_get_oid_map(writer->bitmaps,
> + c->object.oid);
> + if (hash_pos != kh_end(writer->bitmaps)) {
> + struct bitmapped_commit *stored =
> + kh_value(writer->bitmaps, hash_pos);
> + if (stored && stored->bitmap) {
> + fill_bitmap_commit_found_ancestor_nr++;
> + bitmap_or_ewah(ent->bitmap,
> + stored->bitmap);
> + continue;
> + }
> + }
> + }
OK, so we incur one hash lookup per commit as we walk, which seems like
a good tradeoff.
I wondered about "c != commit" here. "c" is the commit we're traversing,
and "commit" is the one for which we're trying to build the bitmap. So
we would not expect to ever have an entry in writer->bitmaps for "c"
yet, but the conditional is just short-circuiting the hash lookup.
The rest of the patch looks obviously correct. The trace2 bits aren't
strictly necessary, of course, but some metrics might help with further
tuning.
-Peff
next prev parent reply other threads:[~2026-05-27 9:24 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-19 16:12 [PATCH 0/8] pack-bitmap-write: speed up bitmap generation Taylor Blau
2026-05-19 16:12 ` [PATCH 1/8] pack-bitmap: pass object position to `fill_bitmap_tree()` Taylor Blau
2026-05-27 8:57 ` Jeff King
2026-05-27 14:36 ` Taylor Blau
2026-05-19 16:12 ` [PATCH 2/8] pack-bitmap: check subtree bits before recursing Taylor Blau
2026-05-27 9:03 ` Jeff King
2026-05-27 14:36 ` Taylor Blau
2026-05-19 16:12 ` [PATCH 3/8] pack-bitmap: reuse stored selected bitmaps Taylor Blau
2026-05-27 9:24 ` Jeff King [this message]
2026-05-27 14:40 ` Taylor Blau
2026-05-29 6:00 ` Jeff King
2026-05-19 16:12 ` [PATCH 4/8] pack-bitmap: consolidate `find_object_pos()` success path Taylor Blau
2026-05-20 14:42 ` SZEDER Gábor
2026-05-20 17:12 ` Taylor Blau
2026-05-27 9:27 ` Jeff King
2026-05-19 16:12 ` [PATCH 5/8] pack-bitmap: cache object positions during fill Taylor Blau
2026-05-27 9:45 ` Jeff King
2026-05-27 14:46 ` Taylor Blau
2026-05-19 16:12 ` [PATCH 6/8] pack-bitmap: sort bitmaps before XORing Taylor Blau
2026-05-27 10:04 ` Jeff King
2026-05-27 16:56 ` Taylor Blau
2026-05-29 8:26 ` Jeff King
2026-05-19 16:12 ` [PATCH 7/8] pack-bitmap: remember pseudo-merge parents Taylor Blau
2026-05-19 16:12 ` [PATCH 8/8] pack-bitmap: build pseudo-merge bitmaps after regular bitmaps Taylor Blau
2026-05-27 10:25 ` Jeff King
2026-05-27 19:24 ` Taylor Blau
2026-05-29 8:33 ` Jeff King
2026-05-27 10:27 ` [PATCH 0/8] pack-bitmap-write: speed up bitmap generation Jeff King
2026-05-27 19:55 ` [PATCH v2 " Taylor Blau
2026-05-27 19:55 ` [PATCH v2 1/8] pack-bitmap: pass object position to `fill_bitmap_tree()` Taylor Blau
2026-05-27 19:55 ` [PATCH v2 2/8] pack-bitmap: check subtree bits before recursing Taylor Blau
2026-05-27 19:55 ` [PATCH v2 3/8] pack-bitmap: reuse stored selected bitmaps Taylor Blau
2026-05-27 19:55 ` [PATCH v2 4/8] pack-bitmap: consolidate `find_object_pos()` success path Taylor Blau
2026-05-27 19:56 ` [PATCH v2 5/8] pack-bitmap: cache object positions during fill Taylor Blau
2026-05-27 19:56 ` [PATCH v2 6/8] pack-bitmap: sort bitmaps before XORing Taylor Blau
2026-05-27 19:56 ` [PATCH v2 7/8] pack-bitmap: remember pseudo-merge parents Taylor Blau
2026-05-27 19:56 ` [PATCH v2 8/8] pack-bitmap: build pseudo-merge bitmaps after regular bitmaps Taylor Blau
2026-05-29 8:34 ` [PATCH v2 0/8] pack-bitmap-write: speed up bitmap generation Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260527092412.GD981444@coredump.intra.peff.net \
--to=peff@peff$(echo .)net \
--cc=git@vger$(echo .)kernel.org \
--cc=gitster@pobox$(echo .)com \
--cc=me@ttaylorr$(echo .)com \
--cc=newren@gmail$(echo .)com \
--cc=stolee@gmail$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox