public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
* [PATCH] cat-file: only use bitmaps when filtering
@ 2026-01-06 10:25 Jeff King
  2026-01-07  8:20 ` Patrick Steinhardt
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2026-01-06 10:25 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt

Commit 8002e8ee18 (builtin/cat-file: use bitmaps to efficiently filter
by object type, 2025-04-02) introduced a performance regression when we
are not filtering objects: it uses bitmaps even when they won't help,
incurring extra costs. For example, running the new perf tests from this
commit, which check the performance of listing objects by oid:

  $ export GIT_PERF_LARGE_REPO=/path/to/linux.git
  $ git -C "$GIT_PERF_LARGE_REPO" repack -adb
  $ GIT_SKIP_TESTS=p1006.1 ./run 8002e8ee18^ 8002e8ee18 p1006-cat-file.sh
  [...]
  Test                                  8002e8ee18^       8002e8ee18
  -------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.48(1.44+0.04)   6.39(6.35+0.04) +331.8%
  1006.3: list all objects (unsorted)   3.01(2.97+0.04)   3.40(3.29+0.10) +13.0%
  1006.4: list blobs                    4.85(4.67+0.17)   1.68(1.58+0.10) -65.4%

An invocation that filters, like listing all blobs (1006.4), does
benefit from using the bitmaps; it now doesn't have to check the type of
each object from the pack data, so the tradeoff is worth it.

But for listing all objects in sorted idx order (1006.2), we otherwise
would never open the bitmap nor the revindex file. Worse, our sorting
step gets much worse. Normally we append into an array in pack .idx
order, and the sort step is trivial. But with bitmaps, we get the
objects in pack order, which is apparently random with respect to oid,
and have to sort the whole thing. (Note that this freshly-packed state
represents the best case for .idx sorting; if we had two packs, then
we'd have their objects one after the other and qsort would have to
interleave them).

The unsorted test in 1006.3 is interesting: there we are going in pack
order, so we load the revindex for the pack anyway. And though we don't
sort the result, we do use an oidset to check for duplicates. So we can
see in the 8002e8ee18^ timings that those two things cost ~1.5s over the
sorted case (mostly the oidset hash cost). We also incur the extra cost
to open the bitmap file as of 8002e8ee18, which seems to be ~400ms.
(This would probably be faster with a bitmap lookup table, but writing
that out is not yet the default).

So we know that bitmaps help when there's filtering to be done, but
otherwise make things worse. Let's only use them when there's a filter.

The perf script shows that we've fixed the regressions without hurting
the bitmap case:

  Test                                  8002e8ee18^       8002e8ee18                HEAD
  --------------------------------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.56(1.53+0.03)   6.44(6.37+0.06) +312.8%   1.62(1.54+0.06) +3.8%
  1006.3: list all objects (unsorted)   3.04(2.98+0.06)   3.45(3.38+0.07) +13.5%    3.04(2.99+0.04) +0.0%
  1006.4: list blobs                    5.14(4.98+0.15)   1.76(1.68+0.06) -65.8%    1.73(1.64+0.09) -66.3%

Note that there's another related case: we might have a filter that
cannot be used with bitmaps. That check is handled already for us in
for_each_bitmapped_object(), though we'd still load the bitmap and
revindex files pointlessly in that case. I don't think it can happen in
practice for cat-file, though, since it allows only blob:none,
blob:limit, and object:type filters, all of which work with bitmaps.

It would be easy-ish to insert an extra check like:

  can_filter_bitmap(&opt->objects_filter);

into the conditional, but I didn't bother here. It would be redundant
with the call in for_each_bitmapped_object(), and the can_filter helper
function is static local in the bitmap code (so we'd have to make it
public).

Signed-off-by: Jeff King <peff@peff•net>
---
There are some timing tests in 8002e8ee18 that claim the non-filter case
is not regressed, but it's not clear to me exactly which commands were
run. Given the size of the repo there and the fact that it's more I/O
bound, I'd guess it is using an output format that requires looking at
the packed objects.

You can see the mean user CPU does jump by almost 2 seconds in the
timings given there. So it may be that the problem was there but drowned
out by I/O noise. At any rate, the new t/perf tests isolate it better
and reproduce consistently for me.

 builtin/cat-file.c       |  8 +++++---
 t/perf/p1006-cat-file.sh | 14 ++++++++++++++
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 505ddaa12f..3cb725940d 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -846,12 +846,14 @@ static void batch_each_object(struct batch_options *opt,
 		.callback = callback,
 		.payload = _payload,
 	};
-	struct bitmap_index *bitmap = prepare_bitmap_git(the_repository);
+	struct bitmap_index *bitmap = NULL;
 
 	for_each_loose_object(the_repository->objects, batch_one_object_loose, &payload, 0);
 
-	if (bitmap && !for_each_bitmapped_object(bitmap, &opt->objects_filter,
-						 batch_one_object_bitmapped, &payload)) {
+	if (opt->objects_filter.choice != LOFC_DISABLED &&
+	    (bitmap = prepare_bitmap_git(the_repository)) &&
+	    !for_each_bitmapped_object(bitmap, &opt->objects_filter,
+				       batch_one_object_bitmapped, &payload)) {
 		struct packed_git *pack;
 
 		repo_for_each_pack(the_repository, pack) {
diff --git a/t/perf/p1006-cat-file.sh b/t/perf/p1006-cat-file.sh
index dcd8015379..da34ece242 100755
--- a/t/perf/p1006-cat-file.sh
+++ b/t/perf/p1006-cat-file.sh
@@ -9,4 +9,18 @@ test_perf 'cat-file --batch-check' '
 	git cat-file --batch-all-objects --batch-check
 '
 
+test_perf 'list all objects (sorted)' '
+	git cat-file --batch-all-objects --batch-check="%(objectname)"
+'
+
+test_perf 'list all objects (unsorted)' '
+	git cat-file --batch-all-objects --batch-check="%(objectname)" \
+		--unordered
+'
+
+test_perf 'list blobs' '
+	git cat-file --batch-all-objects --batch-check="%(objectname)" \
+		--unordered --filter=object:type=blob
+'
+
 test_done
-- 
2.52.0.664.g9f53c65b4c

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] cat-file: only use bitmaps when filtering
  2026-01-06 10:25 [PATCH] cat-file: only use bitmaps when filtering Jeff King
@ 2026-01-07  8:20 ` Patrick Steinhardt
  2026-01-16 16:45   ` Jeff King
  0 siblings, 1 reply; 3+ messages in thread
From: Patrick Steinhardt @ 2026-01-07  8:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Tue, Jan 06, 2026 at 05:25:58AM -0500, Jeff King wrote:
> There are some timing tests in 8002e8ee18 that claim the non-filter case
> is not regressed, but it's not clear to me exactly which commands were
> run. Given the size of the repo there and the fact that it's more I/O
> bound, I'd guess it is using an output format that requires looking at
> the packed objects.

I honestly can't remember anymore, either. I really should stick with
the actual command run in the Hyperfine benchmark names.

> You can see the mean user CPU does jump by almost 2 seconds in the
> timings given there. So it may be that the problem was there but drowned
> out by I/O noise. At any rate, the new t/perf tests isolate it better
> and reproduce consistently for me.

Could be, yeah.

> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 505ddaa12f..3cb725940d 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -846,12 +846,14 @@ static void batch_each_object(struct batch_options *opt,
>  		.callback = callback,
>  		.payload = _payload,
>  	};
> -	struct bitmap_index *bitmap = prepare_bitmap_git(the_repository);
> +	struct bitmap_index *bitmap = NULL;
>  
>  	for_each_loose_object(the_repository->objects, batch_one_object_loose, &payload, 0);
>  
> -	if (bitmap && !for_each_bitmapped_object(bitmap, &opt->objects_filter,
> -						 batch_one_object_bitmapped, &payload)) {
> +	if (opt->objects_filter.choice != LOFC_DISABLED &&
> +	    (bitmap = prepare_bitmap_git(the_repository)) &&
> +	    !for_each_bitmapped_object(bitmap, &opt->objects_filter,
> +				       batch_one_object_bitmapped, &payload)) {
>  		struct packed_git *pack;
>  
>  		repo_for_each_pack(the_repository, pack) {

Yeah, this seems like a reasonable change to me. I would've preferred to
avoid the assignment in the conditional, but other than that this looks
good to me.

Thanks!

Patrick

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] cat-file: only use bitmaps when filtering
  2026-01-07  8:20 ` Patrick Steinhardt
@ 2026-01-16 16:45   ` Jeff King
  0 siblings, 0 replies; 3+ messages in thread
From: Jeff King @ 2026-01-16 16:45 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Wed, Jan 07, 2026 at 09:20:59AM +0100, Patrick Steinhardt wrote:

> > -	if (bitmap && !for_each_bitmapped_object(bitmap, &opt->objects_filter,
> > -						 batch_one_object_bitmapped, &payload)) {
> > +	if (opt->objects_filter.choice != LOFC_DISABLED &&
> > +	    (bitmap = prepare_bitmap_git(the_repository)) &&
> > +	    !for_each_bitmapped_object(bitmap, &opt->objects_filter,
> > +				       batch_one_object_bitmapped, &payload)) {
> >  		struct packed_git *pack;
> >  
> >  		repo_for_each_pack(the_repository, pack) {
> 
> Yeah, this seems like a reasonable change to me. I would've preferred to
> avoid the assignment in the conditional, but other than that this looks
> good to me.

Yeah, I tried to rewrite this to avoid the assignment-in-conditional,
but the logic gets even more convoluted because we need to get to the
"else" clause from multiple places then.

I do think that the for_each_bitmapped_object() interface is making this
a bit harder. Before it was added, the main bitmap entry point was
always prepare_bitmap_walk(), which opened the bitmap file itself (and
only after doing the cheap can_filter_bitmap() check).

But here that doesn't quite work, because we need the bitmap_index to
persist after the for_each_bitmapped_object() call so that we can check
bitmap_index_contains_pack() on it.

I was tempted to suggest that for_each_bitmapped_object() should return
the bitmap_index itself, and then this code would become:

   if (filter.choice != LOFC_DISABLED)
	   bitmap = for_each_bitmapped_object(filter, cb, &payload);
   if (bitmap) {
	   /* we iterated those objects; check for other packs */
   } else {
	   /* we did nothing; look at all packs */
   }

   free_bitmap_index(bitmap);

which is not too bad. Mostly I wanted to make the fix as small as
possible, but I was also a little hesitant to tweak the API when we have
only one caller (and we don't know what a second caller might want).
But we could always revisit it on top.

-Peff

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-01-16 16:45 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-06 10:25 [PATCH] cat-file: only use bitmaps when filtering Jeff King
2026-01-07  8:20 ` Patrick Steinhardt
2026-01-16 16:45   ` Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox