From: Jeff King <peff@peff•net>
To: Aaron Plattner <aplattner@nvidia•com>
Cc: git@vger•kernel.org
Subject: Re: [PATCH] packfile: skip decompressing and hashing blobs in add_promisor_object()
Date: Fri, 5 Dec 2025 13:01:06 -0500
Message-ID: <20251205180106.GC18566@coredump.intra.peff.net>
In-Reply-To: <20251205174854.GA18566@coredump.intra.peff.net>

On Fri, Dec 05, 2025 at 12:48:54PM -0500, Jeff King wrote:

> OK, so we are checking the type up front and then skipping
> parse_object() if we can. But there is already some logic inside
> parse_object() for these kinds of optimizations. If we tell it we are
> not interested in checking the hash of the objects, then it knows it can
> skip loading the blob entirely.
>
> But it can _also_ use that flag for other things, like using the
> commit-graph rather than loading individual commit objects. So doing
> this:
>
> diff --git a/packfile.c b/packfile.c
> index 9cc11b6dc5..01b992a4e1 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -2310,7 +2310,8 @@ static int add_promisor_object(const struct object_id *oid,
>  		we_parsed_object = 0;
>  	} else {
>  		we_parsed_object = 1;
> -		obj = parse_object(pack->repo, oid);
> +		obj = parse_object_with_flags(pack->repo, oid,
> +					      PARSE_OBJECT_SKIP_HASH_CHECK);
>  	}
>  
>  	if (!obj)
>
> drops my linux.git case down to 49s. It's skipping the blobs (with no
> need for your patch) and loading the commits out of the graph file. Note
> that you may need to "git commit-graph write --reachable" to see the
> effect (I think we do generate graphs by default in git-gc these days,
> but I'm not sure if we do so right after cloning).

Oh, and obviously it is skipping the hash computation on the objects,
too. That's probably not as important as avoiding the object loads in
the first place, but it may also be making a measurable difference on
the ones we do load (notably trees here).

-Peff
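
As a rough illustration of the three savings discussed above (blobs not
loaded at all, commits served from the commit-graph, and no hash
computation on whatever is still loaded), here is a toy C model. It is
only a sketch under stated assumptions: every toy_* name, the stats
struct, and the in_graph parameter are hypothetical and do not exist in
git; the real logic lives inside parse_object_with_flags() and is more
involved.

```c
/* Toy model only: none of these names are git's real API. */

enum toy_type { TOY_BLOB, TOY_TREE, TOY_COMMIT };

/* Hypothetical stand-in for PARSE_OBJECT_SKIP_HASH_CHECK. */
#define TOY_SKIP_HASH_CHECK (1u << 0)

struct toy_stats {
	unsigned loads;      /* object bodies read and decompressed */
	unsigned hashes;     /* object hashes recomputed */
	unsigned graph_hits; /* commits answered from the graph cache */
};

/*
 * Decide how much work one object needs. Without the flag, every
 * object is loaded and hashed. With it, blobs need no load at all
 * (nothing inside a blob is parsed), commits covered by the graph
 * cache are served from there, and whatever is still loaded is
 * not hashed.
 */
static void toy_parse(enum toy_type type, int in_graph, unsigned flags,
		      struct toy_stats *st)
{
	int skip_hash = (flags & TOY_SKIP_HASH_CHECK) != 0;

	if (skip_hash && type == TOY_BLOB)
		return;
	if (skip_hash && type == TOY_COMMIT && in_graph) {
		st->graph_hits++;
		return;
	}
	st->loads++;
	if (!skip_hash)
		st->hashes++;
}

/* Run one blob, one tree and one graph-covered commit through toy_parse(). */
static struct toy_stats toy_run(unsigned flags)
{
	struct toy_stats st = { 0, 0, 0 };

	toy_parse(TOY_BLOB, 0, flags, &st);
	toy_parse(TOY_TREE, 0, flags, &st);
	toy_parse(TOY_COMMIT, 1, flags, &st);
	return st;
}
```

Calling toy_run(0) counts three loads and three hashes, while
toy_run(TOY_SKIP_HASH_CHECK) counts a single load (the tree), no hashes,
and one graph hit. That mirrors why the tree objects are the ones where
skipping the hash may still matter.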