From: Justin Tobler <jltobler@gmail•com>
To: git@vger•kernel.org
Cc: ps@pks•im, karthik.188@gmail•com, sunshine@sunshineco•com,
Justin Tobler <jltobler@gmail•com>
Subject: [PATCH v4 0/7] builtin/repo: introduce stats subcommand
Date: Sat, 27 Sep 2025 09:50:42 -0500
Message-ID: <20250927145049.723341-1-jltobler@gmail.com>
In-Reply-To: <20250925232928.3846-1-jltobler@gmail.com>
Greetings,
The shape of a repository's history can have huge impacts on the
performance and health of the repository itself. Currently, Git lacks a
means to surface key stats/information regarding the shape of a
repository via a single command. Acquiring this information requires
users to be fairly knowledgeable about the structure of a Git repository
and how to identify the relevant data points. To fill this gap,
supplemental tools such as git-sizer(1) have been developed.
To allow users to more readily identify potential issues for a
repository, introduce the "stats" subcommand in git-repo(1) to output
stats for the repository that may be of interest to users. The goal of
this subcommand is to eventually provide similar functionality to
git-sizer(1), but in Git natively.
In this initial version, the "stats" subcommand only surfaces counts of
the various reference and object types in a repository. In a follow-up
series, I would like to introduce additional data points that are
present in git-sizer(1) such as largest objects, combined object sizes
by type, and other general repository shape information.
Some other general features that would be nice to introduce eventually:
- A "level of concern" meter for reported stats. This could indicate to
users which stats may be worth looking into further.
- Links to OIDs of interesting objects that correspond to certain stats.
- Options to limit which references to use when evaluating the
repository.
Changes since V3:
- Changed from using strlen() to utf8_strwidth() when computing column
widths, to take into consideration that translatable strings may have
characters that are more than one byte.
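To illustrate why this matters: strlen() counts bytes, so a translated label containing multi-byte UTF-8 characters is measured as wider than it is displayed, and the table padding ends up misaligned. The following is a minimal standalone sketch, not the code from this series. codepoint_count() and widest() are hypothetical helpers, and counting code points is only an approximation of what utf8_strwidth() computes, since it ignores double-width and zero-width characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: count UTF-8 code points by
 * skipping continuation bytes (those of the form 10xxxxxx). Assumes
 * the input is valid UTF-8.
 */
size_t codepoint_count(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	for (; *s; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}

/*
 * Hypothetical helper: return the larger of the current column width
 * and the width of a new entry, similar in spirit to the column-width
 * update done when a row is added to the table.
 */
size_t widest(size_t current, const char *s)
{
	size_t w = codepoint_count(s);

	return w > current ? w : current;
}
```

For the string "Größe", strlen() reports 7 bytes while codepoint_count() reports 5, so padding derived from strlen() would be off by two columns.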
Changes since V2:
- Added clang-format patch to address a false positive triggered by this
series.
- Use varargs for stats_table_add() family of functions.
- Print to stdout directly instead of using strbuf.
- Add parse_options() earlier in the series.
- Use start_delayed_progress() instead of start_progress().
- Add test to validate --[no-]progress options.
- Some other small fixes.
Changes since V1:
- Translatable terms displayed in the table have formatting separated
out.
- Squashed the `keyvalue` and `nul` output format patches into one.
- Added a progress meter to provide users with more feedback.
- Updated docs to outline reported data in a bulleted list.
- Combined similar tests together to reduce repetitive setup.
- Added patch to improve ref-filter interface so we don't have to create
a dummy patterns array.
- Many other renames and cleanups to improve patch clarity.
Thanks,
-Justin
Justin Tobler (7):
builtin/repo: rename repo_info() to cmd_repo_info()
ref-filter: allow NULL filter pattern
clang-format: exclude control macros from SpaceBeforeParens
builtin/repo: introduce stats subcommand
builtin/repo: add object counts in stats output
builtin/repo: add keyvalue and nul format for stats
builtin/repo: add progress meter for stats
.clang-format | 2 +-
Documentation/git-repo.adoc | 30 +++
builtin/repo.c | 374 +++++++++++++++++++++++++++++++++++-
ref-filter.c | 4 +-
t/meson.build | 1 +
t/t1901-repo-stats.sh | 129 +++++++++++++
6 files changed, 534 insertions(+), 6 deletions(-)
create mode 100755 t/t1901-repo-stats.sh
Range-diff against v3:
1: ed04168562 = 1: ed04168562 builtin/repo: rename repo_info() to cmd_repo_info()
2: 6aa76d1323 = 2: 6aa76d1323 ref-filter: allow NULL filter pattern
3: 02a3fcc5fb = 3: 02a3fcc5fb clang-format: exclude control macros from SpaceBeforeParens
4: 12cfbdc464 ! 4: 8ec9914886 builtin/repo: introduce stats subcommand
@@ builtin/repo.c
#include "strbuf.h"
+#include "string-list.h"
#include "shallow.h"
++#include "utf8.h"
static const char *const repo_usage[] = {
"git repo info [--format=(keyvalue|nul)] [-z] [<key>...]",
@@ builtin/repo.c: static int cmd_repo_info(int argc, const char **argv, const char
+ size_t name_width;
+
+ strbuf_vaddf(&buf, format, ap);
-+ formatted_name = strbuf_detach(&buf, &name_width);
++ formatted_name = strbuf_detach(&buf, NULL);
++ name_width = utf8_strwidth(formatted_name);
+
+ item = string_list_append_nodup(&table->rows, formatted_name);
+ item->util = entry;
@@ builtin/repo.c: static int cmd_repo_info(int argc, const char **argv, const char
+ if (name_width > table->name_col_width)
+ table->name_col_width = name_width;
+ if (entry) {
-+ size_t value_width = strlen(entry->value);
++ size_t value_width = utf8_strwidth(entry->value);
+ if (value_width > table->value_col_width)
+ table->value_col_width = value_width;
+ }
@@ builtin/repo.c: static int cmd_repo_info(int argc, const char **argv, const char
+{
+ const char *name_col_title = _("Repository stats");
+ const char *value_col_title = _("Value");
-+ size_t name_title_len = strlen(name_col_title);
-+ size_t value_title_len = strlen(value_col_title);
++ size_t name_title_len = utf8_strwidth(name_col_title);
++ size_t value_title_len = utf8_strwidth(value_col_title);
+ struct string_list_item *item;
+ int name_col_width;
+ int value_col_width;
5: ab27340d58 = 5: 584d35f2c7 builtin/repo: add object counts in stats output
6: f69110224d = 6: 76975b2eab builtin/repo: add keyvalue and nul format for stats
7: cff5e183bb = 7: 1105346a3c builtin/repo: add progress meter for stats
base-commit: ca2559c1d630eb4f04cdee2328aaf1c768907a9e
--
2.51.0.193.g4975ec3473b
Thread overview: 92+ messages
2025-09-23 2:56 [PATCH 0/4] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-23 2:56 ` [PATCH 1/4] " Justin Tobler
2025-09-23 10:52 ` Patrick Steinhardt
2025-09-23 15:10 ` Justin Tobler
2025-09-23 15:26 ` Patrick Steinhardt
2025-09-23 15:22 ` Karthik Nayak
2025-09-23 15:55 ` Justin Tobler
2025-09-23 2:56 ` [PATCH 2/4] builtin/repo: add object counts in stats output Justin Tobler
2025-09-23 10:52 ` Patrick Steinhardt
2025-09-23 15:19 ` Justin Tobler
2025-09-23 15:30 ` Karthik Nayak
2025-09-23 15:56 ` Justin Tobler
2025-09-23 2:56 ` [PATCH 3/4] builtin/repo: add keyvalue format for stats Justin Tobler
2025-09-23 10:53 ` Patrick Steinhardt
2025-09-23 15:26 ` Justin Tobler
2025-09-23 15:39 ` Karthik Nayak
2025-09-23 15:59 ` Justin Tobler
2025-09-23 2:57 ` [PATCH 4/4] builtin/repo: add nul " Justin Tobler
2025-09-23 10:53 ` Patrick Steinhardt
2025-09-23 15:33 ` Justin Tobler
2025-09-24 4:48 ` Patrick Steinhardt
2025-09-23 15:41 ` Karthik Nayak
2025-09-23 16:02 ` Justin Tobler
2025-09-24 21:24 ` [PATCH v2 0/6] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-24 21:24 ` [PATCH v2 1/6] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-09-24 21:24 ` [PATCH v2 2/6] ref-filter: allow NULL filter pattern Justin Tobler
2025-09-24 21:24 ` [PATCH v2 3/6] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-25 5:38 ` Patrick Steinhardt
2025-09-25 13:01 ` Justin Tobler
2025-09-24 21:24 ` [PATCH v2 4/6] builtin/repo: add object counts in stats output Justin Tobler
2025-09-24 21:24 ` [PATCH v2 5/6] builtin/repo: add keyvalue and nul format for stats Justin Tobler
2025-09-25 5:39 ` Patrick Steinhardt
2025-09-25 13:16 ` Justin Tobler
2025-09-25 13:58 ` Patrick Steinhardt
2025-09-24 21:24 ` [PATCH v2 6/6] builtin/repo: add progress meter " Justin Tobler
2025-09-25 5:39 ` Patrick Steinhardt
2025-09-25 13:20 ` Justin Tobler
2025-09-25 23:29 ` [PATCH v3 0/7] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-25 23:29 ` [PATCH v3 1/7] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-09-25 23:29 ` [PATCH v3 2/7] ref-filter: allow NULL filter pattern Justin Tobler
2025-09-25 23:29 ` [PATCH v3 3/7] clang-format: exclude control macros from SpaceBeforeParens Justin Tobler
2025-09-25 23:29 ` [PATCH v3 4/7] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-25 23:51 ` Eric Sunshine
2025-09-26 1:38 ` Justin Tobler
2025-09-25 23:29 ` [PATCH v3 5/7] builtin/repo: add object counts in stats output Justin Tobler
2025-09-25 23:29 ` [PATCH v3 6/7] builtin/repo: add keyvalue and nul format for stats Justin Tobler
2025-09-25 23:29 ` [PATCH v3 7/7] builtin/repo: add progress meter " Justin Tobler
2025-09-27 14:50 ` Justin Tobler [this message]
2025-09-27 14:50 ` [PATCH v4 1/7] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-09-27 14:50 ` [PATCH v4 2/7] ref-filter: allow NULL filter pattern Justin Tobler
2025-09-27 14:50 ` [PATCH v4 3/7] clang-format: exclude control macros from SpaceBeforeParens Justin Tobler
2025-09-27 15:40 ` Junio C Hamano
2025-09-27 15:51 ` Justin Tobler
2025-09-27 23:49 ` Junio C Hamano
2025-09-27 14:50 ` [PATCH v4 4/7] builtin/repo: introduce stats subcommand Justin Tobler
2025-09-27 16:32 ` Junio C Hamano
2025-10-09 22:09 ` Justin Tobler
2025-10-10 0:42 ` Justin Tobler
2025-10-10 6:53 ` Patrick Steinhardt
2025-10-10 14:34 ` Justin Tobler
2025-10-13 6:13 ` Patrick Steinhardt
2025-09-27 14:50 ` [PATCH v4 5/7] builtin/repo: add object counts in stats output Justin Tobler
2025-09-27 14:50 ` [PATCH v4 6/7] builtin/repo: add keyvalue and nul format for stats Justin Tobler
2025-09-27 14:50 ` [PATCH v4 7/7] builtin/repo: add progress meter " Justin Tobler
2025-09-27 16:33 ` [PATCH v4 0/7] builtin/repo: introduce stats subcommand Junio C Hamano
2025-10-15 21:12 ` [PATCH v5 0/6] builtin/repo: introduce structure subcommand Justin Tobler
2025-10-15 21:12 ` [PATCH v5 1/6] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-10-15 21:12 ` [PATCH v5 2/6] ref-filter: allow NULL filter pattern Justin Tobler
2025-10-15 21:12 ` [PATCH v5 3/6] builtin/repo: introduce structure subcommand Justin Tobler
2025-10-16 10:58 ` Patrick Steinhardt
2025-10-21 16:04 ` Justin Tobler
2025-10-15 21:12 ` [PATCH v5 4/6] builtin/repo: add object counts in structure output Justin Tobler
2025-10-15 21:12 ` [PATCH v5 5/6] builtin/repo: add keyvalue and nul format for structure stats Justin Tobler
2025-10-15 21:12 ` [PATCH v5 6/6] builtin/repo: add progress meter " Justin Tobler
2025-10-21 18:25 ` [PATCH v6 0/7] builtin/repo: introduce structure subcommand Justin Tobler
2025-10-21 18:25 ` [PATCH v6 1/7] builtin/repo: rename repo_info() to cmd_repo_info() Justin Tobler
2025-10-21 18:25 ` [PATCH v6 2/7] ref-filter: allow NULL filter pattern Justin Tobler
2025-10-21 18:25 ` [PATCH v6 3/7] ref-filter: export ref_kind_from_refname() Justin Tobler
2025-10-21 18:25 ` [PATCH v6 4/7] builtin/repo: introduce structure subcommand Justin Tobler
2025-10-22 5:01 ` Patrick Steinhardt
2025-10-22 13:50 ` Justin Tobler
2025-10-22 20:15 ` Lucas Seiki Oshiro
2025-10-22 23:42 ` Justin Tobler
2025-10-21 18:25 ` [PATCH v6 5/7] builtin/repo: add object counts in structure output Justin Tobler
2025-10-21 18:26 ` [PATCH v6 6/7] builtin/repo: add keyvalue and nul format for structure stats Justin Tobler
2025-10-22 20:34 ` Lucas Seiki Oshiro
2025-10-23 0:03 ` Justin Tobler
2025-10-21 18:26 ` [PATCH v6 7/7] builtin/repo: add progress meter " Justin Tobler
2025-10-22 19:23 ` [PATCH v6 0/7] builtin/repo: introduce structure subcommand Lucas Seiki Oshiro
2025-10-23 0:05 ` Justin Tobler
2025-10-23 20:54 ` Junio C Hamano
2025-10-24 5:14 ` Patrick Steinhardt