public inbox for git@vger.kernel.org 
From: Justin Tobler <jltobler@gmail•com>
To: git@vger•kernel.org
Cc: ps@pks•im, karthik.188@gmail•com, sunshine@sunshineco•com,
	Justin Tobler <jltobler@gmail•com>
Subject: [PATCH v4 0/7] builtin/repo: introduce stats subcommand
Date: Sat, 27 Sep 2025 09:50:42 -0500
Message-ID: <20250927145049.723341-1-jltobler@gmail.com>
In-Reply-To: <20250925232928.3846-1-jltobler@gmail.com>

Greetings,

The shape of a repository's history can have huge impacts on the
performance and health of the repository itself. Currently, Git lacks a
means to surface key stats/information regarding the shape of a
repository via a single command. Acquiring this information requires
users to be fairly knowledgeable about the structure of a Git repository
and how to identify the relevant data points. To fill this gap,
supplemental tools such as git-sizer(1) have been developed.

To allow users to more readily identify potential issues for a
repository, introduce the "stats" subcommand in git-repo(1) to output
stats for the repository that may be of interest to users. The goal of
this subcommand is to eventually provide similar functionality to
git-sizer(1), but in Git natively.

In this initial version, the "stats" subcommand only surfaces counts of
the various reference and object types in a repository. In a follow-up
series, I would like to introduce additional data points that are
present in git-sizer(1) such as largest objects, combined object sizes
by type, and other general repository shape information.

Some other general features that would be nice to introduce eventually:

- A "level of concern" meter for reported stats. This could indicate to
  users which stats may be worth looking into further.
- Links to OIDs of interesting objects that correspond to certain stats.
- Options to limit which references to use when evaluating the
  repository.

Changes since V3:

- Changed from using strlen() to utf8_strwidth() when computing column
  widths, since translatable strings may contain multi-byte characters
  whose byte length differs from their display width.
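
To illustrate why byte length is the wrong measure here, the following
is a minimal standalone sketch, not code from this series.
count_codepoints() is a hypothetical helper that counts UTF-8 code
points by skipping continuation bytes. Git's utf8_strwidth() goes
further and reports display columns, so this is only a simplified
approximation of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): count UTF-8 code points by
 * skipping continuation bytes, which have the bit pattern 10xxxxxx.
 * Assumes the input is valid UTF-8.
 */
static size_t count_codepoints(const char *s)
{
	size_t n = 0;

	assert(s);
	for (; *s; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}
```

For a translated title such as the German word "Größe", strlen()
reports 7 bytes while the helper reports 5 code points, so a column
sized from strlen() would be misaligned by two cells.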

Changes since V2:

- Added clang-format patch to address a false positive triggered in
  this series.
- Use varargs for stats_table_add() family of functions.
- Print to stdout directly instead of using strbuf.
- Add parse_options() earlier in the series.
- Use start_delayed_progress() instead of start_progress().
- Add test to validate --[no-]progress options.
- Some other small fixes.
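
The varargs change can be pictured with a standalone sketch. Everything
here is hypothetical and uses only the C standard library: struct table,
table_add() and the fixed-size row storage are illustrative stand-ins,
whereas the series itself builds rows with strbuf_vaddf() and a
string_list.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_ROWS 16
#define MAX_NAME 64

/* Hypothetical table: stores formatted row names and the widest one. */
struct table {
	char names[MAX_ROWS][MAX_NAME];
	size_t nr;
	size_t name_col_width;
};

/*
 * Format a row name printf-style and append it to the table.
 * Returns 0 on success, -1 if the table is full or the name would be
 * truncated. Width is measured in bytes here purely for brevity.
 */
static int table_add(struct table *t, const char *fmt, ...)
{
	va_list ap;
	int len;

	assert(t && fmt);
	if (t->nr >= MAX_ROWS)
		return -1;

	va_start(ap, fmt);
	len = vsnprintf(t->names[t->nr], MAX_NAME, fmt, ap);
	va_end(ap);
	if (len < 0 || len >= MAX_NAME)
		return -1;

	if ((size_t)len > t->name_col_width)
		t->name_col_width = (size_t)len;
	t->nr++;
	return 0;
}
```

A caller can then keep the layout in the format string and pass the
term itself as an argument, for example table_add(&t, "  * %s",
"Branches"), which keeps indentation and bullets out of the string
that translators see.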

Changes since V1:

- Translatable terms displayed in the table have formatting separated
  out.
- Squashed the `keyvalue` and `nul` output format patches into one.
- Added a progress meter to provide users with more feedback.
- Updated docs to outline reported data in a bulleted list.
- Combined similar tests together to reduce repetitive setup.
- Added patch to improve ref-filter interface so we don't have to create
  a dummy patterns array.
- Many other renames and cleanups to improve patch clarity.
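
As a rough picture of how the two machine-readable formats differ, here
is a standalone sketch. It assumes the stats output uses the same
delimiters documented for "git repo info" ("=" and newline for keyvalue,
newline and NUL for nul). format_entry(), the enum, and the key name
used in the note below are hypothetical and not taken from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the series may model formats differently. */
enum output_format { FORMAT_KEYVALUE, FORMAT_NUL };

/*
 * Write one entry into out and return the number of bytes written,
 * or 0 if the entry does not fit in cap bytes.
 *   keyvalue: key=value\n
 *   nul:      key\nvalue\0
 * Quoting of unusual values is deliberately left out of this sketch.
 */
static size_t format_entry(char *out, size_t cap, enum output_format fmt,
			   const char *key, const char *value)
{
	char key_delim = fmt == FORMAT_NUL ? '\n' : '=';
	char value_delim = fmt == FORMAT_NUL ? '\0' : '\n';
	size_t klen, vlen, total;

	assert(out && key && value);
	klen = strlen(key);
	vlen = strlen(value);
	total = klen + 1 + vlen + 1;
	if (total > cap)
		return 0;

	memcpy(out, key, klen);
	out[klen] = key_delim;
	memcpy(out + klen + 1, value, vlen);
	out[klen + 1 + vlen] = value_delim;
	return total;
}
```

With a made-up key "references.branches.count" and value "3", the
keyvalue form is one "key=value" line, while the nul form puts the key
and value on separate lines and terminates the value with a NUL byte,
which is easier for scripts to split safely.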

Thanks,
-Justin

Justin Tobler (7):
  builtin/repo: rename repo_info() to cmd_repo_info()
  ref-filter: allow NULL filter pattern
  clang-format: exclude control macros from SpaceBeforeParens
  builtin/repo: introduce stats subcommand
  builtin/repo: add object counts in stats output
  builtin/repo: add keyvalue and nul format for stats
  builtin/repo: add progress meter for stats

 .clang-format               |   2 +-
 Documentation/git-repo.adoc |  30 +++
 builtin/repo.c              | 374 +++++++++++++++++++++++++++++++++++-
 ref-filter.c                |   4 +-
 t/meson.build               |   1 +
 t/t1901-repo-stats.sh       | 129 +++++++++++++
 6 files changed, 534 insertions(+), 6 deletions(-)
 create mode 100755 t/t1901-repo-stats.sh

Range-diff against v3:
1:  ed04168562 = 1:  ed04168562 builtin/repo: rename repo_info() to cmd_repo_info()
2:  6aa76d1323 = 2:  6aa76d1323 ref-filter: allow NULL filter pattern
3:  02a3fcc5fb = 3:  02a3fcc5fb clang-format: exclude control macros from SpaceBeforeParens
4:  12cfbdc464 ! 4:  8ec9914886 builtin/repo: introduce stats subcommand
    @@ builtin/repo.c
      #include "strbuf.h"
     +#include "string-list.h"
      #include "shallow.h"
    ++#include "utf8.h"
      
      static const char *const repo_usage[] = {
      	"git repo info [--format=(keyvalue|nul)] [-z] [<key>...]",
    @@ builtin/repo.c: static int cmd_repo_info(int argc, const char **argv, const char
     +	size_t name_width;
     +
     +	strbuf_vaddf(&buf, format, ap);
    -+	formatted_name = strbuf_detach(&buf, &name_width);
    ++	formatted_name = strbuf_detach(&buf, NULL);
    ++	name_width = utf8_strwidth(formatted_name);
     +
     +	item = string_list_append_nodup(&table->rows, formatted_name);
     +	item->util = entry;
    @@ builtin/repo.c: static int cmd_repo_info(int argc, const char **argv, const char
     +	if (name_width > table->name_col_width)
     +		table->name_col_width = name_width;
     +	if (entry) {
    -+		size_t value_width = strlen(entry->value);
    ++		size_t value_width = utf8_strwidth(entry->value);
     +		if (value_width > table->value_col_width)
     +			table->value_col_width = value_width;
     +	}
    @@ builtin/repo.c: static int cmd_repo_info(int argc, const char **argv, const char
     +{
     +	const char *name_col_title = _("Repository stats");
     +	const char *value_col_title = _("Value");
    -+	size_t name_title_len = strlen(name_col_title);
    -+	size_t value_title_len = strlen(value_col_title);
    ++	size_t name_title_len = utf8_strwidth(name_col_title);
    ++	size_t value_title_len = utf8_strwidth(value_col_title);
     +	struct string_list_item *item;
     +	int name_col_width;
     +	int value_col_width;
5:  ab27340d58 = 5:  584d35f2c7 builtin/repo: add object counts in stats output
6:  f69110224d = 6:  76975b2eab builtin/repo: add keyvalue and nul format for stats
7:  cff5e183bb = 7:  1105346a3c builtin/repo: add progress meter for stats

base-commit: ca2559c1d630eb4f04cdee2328aaf1c768907a9e
-- 
2.51.0.193.g4975ec3473b


Thread overview: 92+ messages
2025-09-23  2:56 [PATCH 0/4] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-23  2:56 ` [PATCH 1/4] " Justin Tobler
2025-09-23 10:52   ` Patrick Steinhardt
2025-09-23 15:10     ` Justin Tobler
2025-09-23 15:26       ` Patrick Steinhardt
2025-09-23 15:22   ` Karthik Nayak
2025-09-23 15:55     ` Justin Tobler
2025-09-23  2:56 ` [PATCH 2/4] builtin/repo: add object counts in stats output Justin Tobler
2025-09-23 10:52   ` Patrick Steinhardt
2025-09-23 15:19     ` Justin Tobler
2025-09-23 15:30   ` Karthik Nayak
2025-09-23 15:56     ` Justin Tobler
2025-09-23  2:56 ` [PATCH 3/4] builtin/repo: add keyvalue format for stats Justin Tobler
2025-09-23 10:53   ` Patrick Steinhardt
2025-09-23 15:26     ` Justin Tobler
2025-09-23 15:39   ` Karthik Nayak
2025-09-23 15:59     ` Justin Tobler
2025-09-23  2:57 ` [PATCH 4/4] builtin/repo: add nul " Justin Tobler
2025-09-23 10:53   ` Patrick Steinhardt
2025-09-23 15:33     ` Justin Tobler
2025-09-24  4:48       ` Patrick Steinhardt
2025-09-23 15:41   ` Karthik Nayak
2025-09-23 16:02     ` Justin Tobler
2025-09-24 21:24 ` [PATCH v2 0/6] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-24 21:24   ` [PATCH v2 1/6] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-09-24 21:24   ` [PATCH v2 2/6] ref-filter: allow NULL filter pattern Justin Tobler
2025-09-24 21:24   ` [PATCH v2 3/6] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-25  5:38     ` Patrick Steinhardt
2025-09-25 13:01       ` Justin Tobler
2025-09-24 21:24   ` [PATCH v2 4/6] builtin/repo: add object counts in stats output Justin Tobler
2025-09-24 21:24   ` [PATCH v2 5/6] builtin/repo: add keyvalue and nul format for stats Justin Tobler
2025-09-25  5:39     ` Patrick Steinhardt
2025-09-25 13:16       ` Justin Tobler
2025-09-25 13:58         ` Patrick Steinhardt
2025-09-24 21:24   ` [PATCH v2 6/6] builtin/repo: add progress meter " Justin Tobler
2025-09-25  5:39     ` Patrick Steinhardt
2025-09-25 13:20       ` Justin Tobler
2025-09-25 23:29   ` [PATCH v3 0/7] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-25 23:29     ` [PATCH v3 1/7] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-09-25 23:29     ` [PATCH v3 2/7] ref-filter: allow NULL filter pattern Justin Tobler
2025-09-25 23:29     ` [PATCH v3 3/7] clang-format: exclude control macros from SpaceBeforeParens Justin Tobler
2025-09-25 23:29     ` [PATCH v3 4/7] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-25 23:51       ` Eric Sunshine
2025-09-26  1:38         ` Justin Tobler
2025-09-25 23:29     ` [PATCH v3 5/7] builtin/repo: add object counts in stats output Justin Tobler
2025-09-25 23:29     ` [PATCH v3 6/7] builtin/repo: add keyvalue and nul format for stats Justin Tobler
2025-09-25 23:29     ` [PATCH v3 7/7] builtin/repo: add progress meter " Justin Tobler
2025-09-27 14:50     ` Justin Tobler [this message]
2025-09-27 14:50       ` [PATCH v4 1/7] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-09-27 14:50       ` [PATCH v4 2/7] ref-filter: allow NULL filter pattern Justin Tobler
2025-09-27 14:50       ` [PATCH v4 3/7] clang-format: exclude control macros from SpaceBeforeParens Justin Tobler
2025-09-27 15:40         ` Junio C Hamano
2025-09-27 15:51           ` Justin Tobler
2025-09-27 23:49             ` Junio C Hamano
2025-09-27 14:50       ` [PATCH v4 4/7] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-27 16:32         ` Junio C Hamano
2025-10-09 22:09           ` Justin Tobler
2025-10-10  0:42             ` Justin Tobler
2025-10-10  6:53               ` Patrick Steinhardt
2025-10-10 14:34                 ` Justin Tobler
2025-10-13  6:13                   ` Patrick Steinhardt
2025-09-27 14:50       ` [PATCH v4 5/7] builtin/repo: add object counts in stats output Justin Tobler
2025-09-27 14:50       ` [PATCH v4 6/7] builtin/repo: add keyvalue and nul format for stats Justin Tobler
2025-09-27 14:50       ` [PATCH v4 7/7] builtin/repo: add progress meter " Justin Tobler
2025-09-27 16:33       ` [PATCH v4 0/7] builtin/repo: introduce stats subcommand Junio C Hamano
2025-10-15 21:12       ` [PATCH v5 0/6] builtin/repo: introduce structure subcommand Justin Tobler
2025-10-15 21:12         ` [PATCH v5 1/6] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-10-15 21:12         ` [PATCH v5 2/6] ref-filter: allow NULL filter pattern Justin Tobler
2025-10-15 21:12         ` [PATCH v5 3/6] builtin/repo: introduce structure subcommand Justin Tobler
2025-10-16 10:58           ` Patrick Steinhardt
2025-10-21 16:04             ` Justin Tobler
2025-10-15 21:12         ` [PATCH v5 4/6] builtin/repo: add object counts in structure output Justin Tobler
2025-10-15 21:12         ` [PATCH v5 5/6] builtin/repo: add keyvalue and nul format for structure stats Justin Tobler
2025-10-15 21:12         ` [PATCH v5 6/6] builtin/repo: add progress meter " Justin Tobler
2025-10-21 18:25         ` [PATCH v6 0/7] builtin/repo: introduce structure subcommand Justin Tobler
2025-10-21 18:25           ` [PATCH v6 1/7] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-10-21 18:25           ` [PATCH v6 2/7] ref-filter: allow NULL filter pattern Justin Tobler
2025-10-21 18:25           ` [PATCH v6 3/7] ref-filter: export ref_kind_from_refname() Justin Tobler
2025-10-21 18:25           ` [PATCH v6 4/7] builtin/repo: introduce structure subcommand Justin Tobler
2025-10-22  5:01             ` Patrick Steinhardt
2025-10-22 13:50               ` Justin Tobler
2025-10-22 20:15             ` Lucas Seiki Oshiro
2025-10-22 23:42               ` Justin Tobler
2025-10-21 18:25           ` [PATCH v6 5/7] builtin/repo: add object counts in structure output Justin Tobler
2025-10-21 18:26           ` [PATCH v6 6/7] builtin/repo: add keyvalue and nul format for structure stats Justin Tobler
2025-10-22 20:34             ` Lucas Seiki Oshiro
2025-10-23  0:03               ` Justin Tobler
2025-10-21 18:26           ` [PATCH v6 7/7] builtin/repo: add progress meter " Justin Tobler
2025-10-22 19:23           ` [PATCH v6 0/7] builtin/repo: introduce structure subcommand Lucas Seiki Oshiro
2025-10-23  0:05             ` Justin Tobler
2025-10-23 20:54           ` Junio C Hamano
2025-10-24  5:14             ` Patrick Steinhardt
