From: Patrick Steinhardt <ps@pks•im>
To: Jeff King <peff@peff•net>
Cc: git@vger•kernel.org
Subject: Re: [PATCH 02/13] cat-file: make --allow-unknown-type a noop
Date: Fri, 16 May 2025 11:52:18 +0200 [thread overview]
Message-ID: <aCcK0p7wMURHppe7@pks.im> (raw)
In-Reply-To: <20250516044935.GB22242@coredump.intra.peff.net>
On Fri, May 16, 2025 at 12:49:35AM -0400, Jeff King wrote:
> The cat-file command has some minor support for handling objects with
> "unknown" types. I.e., strings that are not "blob", "commit", "tree", or
> "tag".
>
> In theory this could be used for debugging or experimenting with
> extensions to Git. But in practice this support is not very useful:
>
> 1. You can get the type and size of such objects, but nothing else.
> Not even the contents!
>
> 2. Only loose objects are supported, since packfiles use numeric ids
> for the types, rather than strings.
>
> 3. Likewise you cannot ever transfer objects between repositories,
> because they cannot be represented in the packfiles used for the
> on-the-wire protocol.
All of these are good reasons. To add one more: they would probably
prove to be a pain for pluggable object databases, too. No need to add
that to the list here though given that there isn't even a design doc
for those yet.
> The support for these unknown types complicates the object-parsing code,
> and has led to bugs such as b748ddb7a4 (unpack_loose_header(): fix
> infinite loop on broken zlib input, 2025-02-25). So let's drop it.
>
> The first step is to remove the user-facing parts, which are accessible
> only via cat-file. This is technically backwards-incompatible, but given
> the limitations listed above, these objects couldn't possibly be useful
> in any workflow.
I agree.
> However, we can't just rip out the option entirely. That would hurt a
> caller who ran:
>
> git cat-file -t --allow-unknown-object <oid>
>
> and fed it normal, well-formed objects. There --allow-unknown-type was
> doing nothing, but we wouldn't want to start bailing with an error. So
> to protect any such callers, we'll retain --allow-unknown-type as a
> noop.
Okay, that sounds reasonable to me.
> The code change is fairly small (but we'll able to clean up more code in
> follow-on patches). The test updates drop any use of the option. We
> still retain tests that feed the broken objects to cat-file without
> --allow-unknown-type, as we should continue to confirm that those
> objects are rejected. Note that in one spot we can drop a layer of loop,
> re-indenting the body; viewing the diff with "-w" helps there.
Shouldn't we have a test though that the option is still accepted, even
though it doesn't do anything?
> Signed-off-by: Jeff King <peff@peff•net>
> ---
> Documentation/git-cat-file.adoc | 6 +-
> builtin/cat-file.c | 18 +--
> t/t1006-cat-file.sh | 211 ++++++++------------------------
> 3 files changed, 56 insertions(+), 179 deletions(-)
>
> diff --git a/Documentation/git-cat-file.adoc b/Documentation/git-cat-file.adoc
> index fc4b92f104..cde79ad242 100644
> --- a/Documentation/git-cat-file.adoc
> +++ b/Documentation/git-cat-file.adoc
> @@ -9,8 +9,7 @@ SYNOPSIS
> --------
> [verse]
> 'git cat-file' <type> <object>
> -'git cat-file' (-e | -p) <object>
> -'git cat-file' (-t | -s) [--allow-unknown-type] <object>
> +'git cat-file' (-e | -p | -t | -s) <object>
> 'git cat-file' (--textconv | --filters)
> [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]
> 'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects]
> @@ -202,9 +201,6 @@ flush::
> only once, even if it is stored multiple times in the
> repository.
>
> ---allow-unknown-type::
> - Allow `-s` or `-t` to query broken/corrupt objects of unknown type.
> -
Should we maybe introduce a "deprecated" section and spell out that this
option is a no-op nowadays that will be removed for example in Git 3.0?
Patrick
next prev parent reply other threads:[~2025-05-16 9:52 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-16 4:49 [PATCH 0/13] dropping support for non-standard object types Jeff King
2025-05-16 4:49 ` [PATCH 01/13] object-file.h: fix typo in variable declaration Jeff King
2025-05-16 4:49 ` [PATCH 02/13] cat-file: make --allow-unknown-type a noop Jeff King
2025-05-16 9:52 ` Patrick Steinhardt [this message]
2025-05-19 6:16 ` Jeff King
2025-05-19 7:22 ` Patrick Steinhardt
2025-05-16 16:47 ` Junio C Hamano
2025-05-16 4:49 ` [PATCH 03/13] object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag Jeff King
2025-05-16 4:49 ` [PATCH 04/13] cat-file: use type enum instead of buffer for -t option Jeff King
2025-05-16 16:56 ` Junio C Hamano
2025-05-16 4:49 ` [PATCH 05/13] oid_object_info_convert(): stop using string for object type Jeff King
2025-05-16 4:49 ` [PATCH 06/13] fsck: stop using object_info->type_name strbuf Jeff King
2025-05-16 9:52 ` Patrick Steinhardt
2025-05-19 14:26 ` Junio C Hamano
2025-05-19 17:00 ` Jeff King
2025-05-16 4:49 ` [PATCH 07/13] oid_object_info(): drop type_name strbuf Jeff King
2025-05-19 14:58 ` Junio C Hamano
2025-05-16 4:49 ` [PATCH 08/13] t/helper: add zlib test-tool Jeff King
2025-05-19 15:03 ` Junio C Hamano
2025-05-19 17:03 ` Jeff King
2025-05-21 13:44 ` Junio C Hamano
2025-05-16 4:50 ` [PATCH 09/13] t: add lib-loose.sh Jeff King
2025-05-16 9:52 ` Patrick Steinhardt
2025-05-19 6:17 ` Jeff King
2025-05-19 15:12 ` Junio C Hamano
2025-05-16 4:50 ` [PATCH 10/13] hash-object: stop allowing unknown types Jeff King
2025-05-19 15:15 ` Junio C Hamano
2025-05-16 4:50 ` [PATCH 11/13] hash-object: merge HASH_* and INDEX_* flags Jeff King
2025-05-16 9:52 ` Patrick Steinhardt
2025-05-16 4:50 ` [PATCH 12/13] hash-object: handle --literally with OPT_NEGBIT Jeff King
2025-05-19 15:30 ` Junio C Hamano
2025-05-16 4:50 ` [PATCH 13/13] object-file: drop support for writing objects with unknown types Jeff King
2025-05-16 9:52 ` Patrick Steinhardt
2025-05-19 15:32 ` Junio C Hamano
2025-05-16 16:36 ` [PATCH 0/13] dropping support for non-standard object types Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aCcK0p7wMURHppe7@pks.im \
--to=ps@pks$(echo .)im \
--cc=git@vger$(echo .)kernel.org \
--cc=peff@peff$(echo .)net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox