public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Toon Claes <toon@iotcl•com>
To: Patrick Steinhardt <ps@pks•im>, git@vger•kernel.org
Cc: Karthik Nayak <karthik.188@gmail•com>,
	Taylor Blau <me@ttaylorr•com>, Junio C Hamano <gitster@pobox•com>
Subject: Re: [PATCH v2 02/10] builtin/cat-file: wire up an option to filter objects
Date: Tue, 01 Apr 2025 13:45:46 +0200	[thread overview]
Message-ID: <87r02cf6l1.fsf@iotcl.com> (raw)
In-Reply-To: <20250327-pks-cat-file-object-type-filter-v2-2-4bbc7085d7c5@pks.im>

Patrick Steinhardt <ps@pks•im> writes:

> In batch mode, git-cat-file(1) enumerates all objects and prints them
> by iterating through both loose and packed objects. This works without
> considering their reachability at all, and consequently most options to
> filter objects as they exist in e.g. git-rev-list(1) are not applicable.
> In some situations it may still be useful though to filter objects based
> on properties that are inherent to them. This includes the object size
> as well as its type.
>
> Such a filter already exists in git-rev-list(1) with the `--filter=`
> command line option. While this option supports a couple of filters that
> are not applicable to our usecase, some of them are quite a neat fit.
>
> Wire up the filter as an option for git-cat-file(1). This allows us to
> reuse the same syntax as in git-rev-list(1) so that we don't have to
> reinvent the wheel. For now, we die when any of the filter options has
> been passed by the user, but they will be wired up in subsequent
> commits.
>
> Further note that the filters that we are about to introduce don't
> significantly speed up the runtime of git-cat-file(1). While we can skip
> emitting a lot of objects in case they are uninteresting to us, the
> majority of time is spent reading the packfile, which is bottlenecked by
> I/O and not the processor. This will change though once we start to make
> use of bitmaps, which will allow us to skip reading the whole packfile.
>
> Signed-off-by: Patrick Steinhardt <ps@pks•im>
> ---
>  Documentation/git-cat-file.adoc |  6 ++++++
>  builtin/cat-file.c              | 37 +++++++++++++++++++++++++++++++++----
>  t/t1006-cat-file.sh             | 32 ++++++++++++++++++++++++++++++++
>  3 files changed, 71 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-cat-file.adoc b/Documentation/git-cat-file.adoc
> index d5890ae3686..f7f57b7f538 100644
> --- a/Documentation/git-cat-file.adoc
> +++ b/Documentation/git-cat-file.adoc
> @@ -81,6 +81,12 @@ OPTIONS
>  	end-of-line conversion, etc). In this case, `<object>` has to be of
>  	the form `<tree-ish>:<path>`, or `:<path>`.
>  
> +--filter=<filter-spec>::
> +--no-filter::
> +	Omit objects from the list of printed objects. This can only be used in
> +	combination with one of the batched modes. The '<filter-spec>' may be
> +	one of the following:
> +
>  --path=<path>::
>  	For use with `--textconv` or `--filters`, to allow specifying an object
>  	name and a path separately, e.g. when it is difficult to figure out
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 8e40016dd24..940900d92ad 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -936,10 +946,13 @@ int cmd_cat_file(int argc,
>  	int opt_cw = 0;
>  	int opt_epts = 0;
>  	const char *exp_type = NULL, *obj_name = NULL;
> -	struct batch_options batch = {0};
> +	struct batch_options batch = {
> +		.objects_filter = LIST_OBJECTS_FILTER_INIT,
> +	};
>  	int unknown_type = 0;
>  	int input_nul_terminated = 0;
>  	int nul_terminated = 0;
> +	int ret;
>  
>  	const char * const builtin_catfile_usage[] = {
>  		N_("git cat-file <type> <object>"),
> @@ -1000,6 +1013,8 @@ int cmd_cat_file(int argc,
>  			    N_("run filters on object's content"), 'w'),
>  		OPT_STRING(0, "path", &force_path, N_("blob|tree"),
>  			   N_("use a <path> for (--textconv | --filters); Not with 'batch'")),
> +		OPT_CALLBACK(0, "filter", &batch.objects_filter, N_("args"),
> +			     N_("object filtering"), opt_parse_list_objects_filter),

Because we've decided on `--filter` we can use
`OPT_PARSE_LIST_OBJECTS_FILTER` here now.

--
Toon

  reply	other threads:[~2025-04-01 11:46 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-21  7:47 [PATCH 0/9] builtin/cat-file: allow filtering objects in batch mode Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 1/9] builtin/cat-file: rename variable that tracks usage Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 2/9] builtin/cat-file: wire up an option to filter objects Patrick Steinhardt
2025-02-26 15:20   ` Toon Claes
2025-02-28 10:51     ` Patrick Steinhardt
2025-02-28 17:44       ` Junio C Hamano
2025-03-03 10:40         ` Patrick Steinhardt
2025-02-27 11:20   ` Karthik Nayak
2025-02-21  7:47 ` [PATCH 3/9] builtin/cat-file: support "blob:none" objects filter Patrick Steinhardt
2025-02-26 15:22   ` Toon Claes
2025-02-27 11:26   ` Karthik Nayak
2025-02-21  7:47 ` [PATCH 4/9] builtin/cat-file: support "blob:limit=" " Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 5/9] builtin/cat-file: support "object:type=" " Patrick Steinhardt
2025-02-26 15:23   ` Toon Claes
2025-02-28 10:51     ` Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 6/9] pack-bitmap: expose function to iterate over bitmapped objects Patrick Steinhardt
2025-02-24 18:05   ` Junio C Hamano
2025-02-25  6:59     ` Patrick Steinhardt
2025-02-25 16:59       ` Junio C Hamano
2025-02-27 23:26       ` Taylor Blau
2025-02-28 10:54         ` Patrick Steinhardt
2025-02-27 23:23     ` Taylor Blau
2025-02-27 23:32       ` Junio C Hamano
2025-02-27 23:39         ` Taylor Blau
2025-02-21  7:47 ` [PATCH 7/9] pack-bitmap: introduce function to check whether a pack is bitmapped Patrick Steinhardt
2025-02-27 23:33   ` Taylor Blau
2025-02-21  7:47 ` [PATCH 8/9] builtin/cat-file: deduplicate logic to iterate over all objects Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 9/9] builtin/cat-file: use bitmaps to efficiently filter by object type Patrick Steinhardt
2025-02-27 11:38   ` Karthik Nayak
2025-02-27 23:48   ` Taylor Blau
2025-03-27  9:43 ` [PATCH v2 00/10] builtin/cat-file: allow filtering objects in batch mode Patrick Steinhardt
2025-03-27  9:43   ` [PATCH v2 01/10] builtin/cat-file: rename variable that tracks usage Patrick Steinhardt
2025-04-01  9:51     ` Karthik Nayak
2025-04-02 11:13       ` Patrick Steinhardt
2025-04-07 20:25         ` Junio C Hamano
2025-03-27  9:43   ` [PATCH v2 02/10] builtin/cat-file: wire up an option to filter objects Patrick Steinhardt
2025-04-01 11:45     ` Toon Claes [this message]
2025-04-02 11:13       ` Patrick Steinhardt
2025-04-01 12:05     ` Karthik Nayak
2025-04-02 11:13       ` Patrick Steinhardt
2025-03-27  9:43   ` [PATCH v2 03/10] builtin/cat-file: support "blob:none" objects filter Patrick Steinhardt
2025-04-01 12:22     ` Karthik Nayak
2025-04-01 12:31       ` Karthik Nayak
2025-04-02 11:13         ` Patrick Steinhardt
2025-03-27  9:43   ` [PATCH v2 04/10] builtin/cat-file: support "blob:limit=" " Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 05/10] builtin/cat-file: support "object:type=" " Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 06/10] pack-bitmap: allow passing payloads to `show_reachable_fn()` Patrick Steinhardt
2025-04-01 12:17     ` Toon Claes
2025-04-02 11:13       ` Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 07/10] pack-bitmap: add function to iterate over filtered bitmapped objects Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 08/10] pack-bitmap: introduce function to check whether a pack is bitmapped Patrick Steinhardt
2025-04-01 11:46     ` Toon Claes
2025-04-02 11:13       ` Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 09/10] builtin/cat-file: deduplicate logic to iterate over all objects Patrick Steinhardt
2025-04-01 12:13     ` Toon Claes
2025-04-02 11:13       ` Patrick Steinhardt
2025-04-03 18:24         ` Toon Claes
2025-03-27  9:44   ` [PATCH v2 10/10] builtin/cat-file: use bitmaps to efficiently filter by object type Patrick Steinhardt
2025-04-02 11:13 ` [PATCH v3 00/11] builtin/cat-file: allow filtering objects in batch mode Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 01/11] builtin/cat-file: rename variable that tracks usage Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 02/11] builtin/cat-file: introduce function to report object status Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 03/11] builtin/cat-file: wire up an option to filter objects Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 04/11] builtin/cat-file: support "blob:none" objects filter Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 05/11] builtin/cat-file: support "blob:limit=" " Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 06/11] builtin/cat-file: support "object:type=" " Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 07/11] pack-bitmap: allow passing payloads to `show_reachable_fn()` Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 08/11] pack-bitmap: add function to iterate over filtered bitmapped objects Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 09/11] pack-bitmap: introduce function to check whether a pack is bitmapped Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 10/11] builtin/cat-file: deduplicate logic to iterate over all objects Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 11/11] builtin/cat-file: use bitmaps to efficiently filter by object type Patrick Steinhardt
2025-04-03  8:17   ` [PATCH v3 00/11] builtin/cat-file: allow filtering objects in batch mode Karthik Nayak
2025-04-08  0:32     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r02cf6l1.fsf@iotcl.com \
    --to=toon@iotcl$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=karthik.188@gmail$(echo .)com \
    --cc=me@ttaylorr$(echo .)com \
    --cc=ps@pks$(echo .)im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox