From: Jeff King <peff@peff•net>
To: Junio C Hamano <gitster@pobox•com>
Cc: Taylor Blau <me@ttaylorr•com>,
git@vger•kernel.org, Elijah Newren <newren@gmail•com>,
Patrick Steinhardt <ps@pks•im>
Subject: Re: [PATCH 00/11] pack-bitmap: convert offset to ref deltas where possible
Date: Fri, 11 Oct 2024 03:54:51 -0400
Message-ID: <20241011075451.GD18010@coredump.intra.peff.net>
In-Reply-To: <xmqq8quvk8w9.fsf@gitster.g>

On Thu, Oct 10, 2024 at 01:20:06PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr•com> writes:
>
> >> So when you pick the copy of Y out of another pack, what's so
> >> different? After emitting Y to the resulting pack stream (and
> >> remembering where in the packstream you did so), when it is X's turn
> >> to be emitted, shouldn't you be able to compute the distance in the
> >> resulting packstream to represent X as an ofs-delta against Y, which
> >> should already be happening when you had both X and Y in the same
> >> original pack?
> >
> > Good question. The difference is that if you're reusing X and Y from
> > the same pack, you know that Y occurs some number of bytes *before* X
> > in the resulting pack.
> >
> > But if Y comes from a different pack, it may get pushed further back in
> > the MIDX pseudo-pack order. So in that case the assembled pack may list
> > X before Y, in which case X cannot be an OFS_DELTA of Y, since offset
> > deltas require that the base object appears first.
>
> That is what we have always done even before we started bitmap based
> optimization. If we happen to write Y before X, we consider doing
> ofs-delta for X, but otherwise we do ref-delta for X. We do reorder
> fairly late in the pipeline when we notice that X that we are about
> to write out depends on Y that we haven't emitted to avoid this,
> though. All of that the bitmap-based optimization code path should
> be able to imitate, I would think.

A small nitpick on your final sentence here. As you note, we do not ever
write X before Y, because compute_write_order() always places bases
before their deltas in the output pack (and we do not allow cycles of
deltas, of course).

And even with bitmaps we'd do the same, as long as those objects are
both fed to the regular pack-writing machinery.

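To make the ordering argument concrete, here is a toy model in Python.
It is not git's C code: write_order(), delta_kinds(), and the base_of
map are hypothetical names used only for illustration, and it assumes
delta chains have no cycles. Each base is emitted before anything that
deltas against it, so a delta can always point backwards at its base:

```python
def write_order(objects, base_of):
    """Return an order in which every delta base precedes its deltas.

    objects: object names in their preferred order.
    base_of: maps a delta's name to its base's name (toy model).
    """
    wanted = set(objects)
    order, written = [], set()

    def emit(name):
        if name in written:
            return
        base = base_of.get(name)
        if base in wanted:
            emit(base)  # write the base first; assumes no delta cycles
        written.add(name)
        order.append(name)

    for name in objects:
        emit(name)
    return order


def delta_kinds(order, base_of):
    """Classify each object by how it could refer to its base."""
    position = {name: i for i, name in enumerate(order)}
    kinds = {}
    for name in order:
        base = base_of.get(name)
        if base is None:
            kinds[name] = "full"
        elif base in position and position[base] < position[name]:
            kinds[name] = "OFS_DELTA"  # base already sits earlier in the pack
        else:
            kinds[name] = "REF_DELTA"  # base is later or not in this pack
    return kinds
```

With X stored as a delta against Y, write_order(["X", "Y"], {"X": "Y"})
puts Y first even though X was listed first, and X is then eligible to
be an OFS_DELTA.
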
It is only the special verbatim-pack-reuse[1] code that is trying to
blit out the start of an existing pack that is affected. And in theory
there it _could_ try to reorder to produce an ofs delta, but in practice
the whole point is to take a single very cheap pass over the start of
the pack (or multiple packs in the case of the midx). Doing any
reordering would be counterproductive to the "cheap" adjective there (it
does not even keep a list of object ids it is sending), so we are better
off leaving those objects for the regular output code (which does make
such a list).

Taylor's series introduces an in-between where we choose not to reorder,
but switch to REF_DELTA. That is still cheap on CPU on the generating
side, though the resulting pack is slightly larger.
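
As a rough illustration of that trade-off, here is a small Python
sketch. It is not the code from Taylor's series: choose_delta_type()
and base_reference_size() are hypothetical helpers, positions are
plain integers standing in for the order of objects in the output, the
base is assumed to be sent in the same pack, and hash_len=20 assumes
SHA-1. The offset encoding follows the variable-length scheme that the
pack format uses for OFS_DELTA:

```python
def encode_ofs_delta_distance(distance):
    """Encode the backwards distance to the base as OFS_DELTA stores it."""
    out = [distance & 0x7F]
    distance >>= 7
    while distance:
        distance -= 1
        out.append(0x80 | (distance & 0x7F))
        distance >>= 7
    return bytes(reversed(out))


def choose_delta_type(delta_pos, base_pos):
    """Decide without reordering: an offset delta needs an earlier base."""
    return "OFS_DELTA" if base_pos < delta_pos else "REF_DELTA"


def base_reference_size(kind, distance=None, hash_len=20):
    """Bytes spent naming the base (hash_len=20 assumes SHA-1)."""
    if kind == "OFS_DELTA":
        return len(encode_ofs_delta_distance(distance))
    return hash_len  # REF_DELTA stores the full object id of the base
```

For a base one million bytes back, the offset takes 3 bytes where a
REF_DELTA spends 20 on the base's object id, which is where the
"slightly larger" comes from.
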
-Peff

[1] I wish we had good names to distinguish the various cases, because
    the term "reuse" is kind of overloaded. The "slower" regular
    object-sending path may still reuse verbatim bytes found in an
    on-disk pack. But this "blit out matching parts of a pack without
    otherwise considering the objects" feature happens outside of that.
    We called it "pack reuse" back in 2013, but that was not a good name
    even then. I don't have a good suggestion, though.