From: Justin Tobler <jltobler@gmail•com>
To: git@vger•kernel.org
Cc: ps@pks•im, gitster@pobox•com, worldhello.net@gmail•com,
	Justin Tobler <jltobler@gmail•com>
Subject: [PATCH v4 0/7] builtin/repo: add object size info to structure output
Date: Tue, 16 Dec 2025 11:38:35 -0600
Message-ID: <20251216173842.3357832-1-jltobler@gmail.com>
In-Reply-To: <20251215205639.2700270-1-jltobler@gmail.com>

Greetings,

This patch series extends the recently introduced "structure" subcommand
for git-repo(1) to collect object size information. More specifically,
it shows the total inflated and disk sizes of objects by object type. The
aim is to provide additional insight into the structure of a repository
that may be useful to users.

In addition to this change, this series also updates the table output
format to downscale larger output values along with the appropriate unit
prefix. This is done to make table output more human friendly. The
keyvalue and nul output formats are left the same since they are
intended more for machine parsing.
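
As a rough illustration of the downscaling described above, here is a
minimal standalone sketch of how a count could be reduced to a two-decimal
value plus an SI prefix. The rounding arithmetic follows the hunks shown in
the range-diff below, but `sketch_humanise_count()` and its signature are
hypothetical and are not the interface added by this series (which returns
an allocated value string and a translated unit).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not the function from the series: writes the
 * downscaled number into `value` and returns a constant SI prefix
 * ("k", "M" or "G"), or "" when the count is printed unscaled.
 */
static const char *sketch_humanise_count(unsigned long long count,
					 char *value, size_t len)
{
	assert(value && len > 0);

	if (count >= 1000000000ULL) {
		unsigned long long x = count + 5000000ULL; /* for rounding */
		snprintf(value, len, "%llu.%02llu", x / 1000000000ULL,
			 x % 1000000000ULL / 10000000ULL);
		return "G";
	} else if (count >= 1000000ULL) {
		unsigned long long x = count + 5000ULL; /* for rounding */
		snprintf(value, len, "%llu.%02llu", x / 1000000ULL,
			 x % 1000000ULL / 10000ULL);
		return "M";
	} else if (count >= 1000ULL) {
		unsigned long long x = count + 5ULL; /* for rounding */
		snprintf(value, len, "%llu.%02llu", x / 1000ULL,
			 x % 1000ULL / 10ULL);
		return "k";
	}

	/* Small values are printed as-is, without a prefix. */
	snprintf(value, len, "%llu", count);
	return "";
}
```

With this sketch, a count of 1500 yields the value "1.50" and the prefix
"k", while 999 is left as "999" with an empty prefix.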

Changes in V4:
- Unmark "byte" string in "t/helper/test-simple-ipc.c" for translation
  to avoid conflict with translated plural "byte/bytes" string.
- Remove some unnecessary translations and add comments to clarify some
  of the added translations.
- Some small changes to the tests in patch 7.

Changes in V3:
- Address potential localization regression by making the downscaled
  number format string also translatable. Also make the format string
  for how the values and unit prefixes are displayed via
  `strbuf_humanise_{bytes,rate}()` translatable to be more flexible.
- `strbuf_humanise_{bytes,count}_value()` have been renamed to
  `humanise_{bytes,count}()` and updated to provide both the value and
  unit prefix as separate strings.
- Unit prefix strings are no longer allocated and are instead constant.
- The humanise flags are now defined in an enum.
- Instead of using `OBJECT_INFO_FOR_PREFETCH`,
  `OBJECT_INFO_SKIP_FETCH_OBJECT` and `OBJECT_INFO_QUICK` are used
  explicitly.
- Tests now use git-rev-list(1) to verify disk size info.
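
To make the interface notes above more concrete, here is a minimal sketch
of a byte humaniser that takes its options from an enum and returns a
constant unit string separately from the value. Everything here is an
assumption for illustration: the flag names, the function name, the caller
provided buffer, and the truncating (non-rounding) arithmetic are not taken
from the series, and translation is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag names; the series defines its own enum in strbuf.h. */
enum sketch_humanise_flags {
	SKETCH_HUMANISE_RATE    = 1 << 0, /* append "/s" to the unit */
	SKETCH_HUMANISE_COMPACT = 1 << 1, /* use "B" instead of "byte(s)" */
};

/*
 * Hypothetical helper: writes the downscaled number into `value` and
 * returns a constant unit string. Fractions are truncated, not rounded.
 */
static const char *sketch_humanise_bytes(unsigned long long bytes,
					 char *value, size_t len,
					 unsigned flags)
{
	int rate = !!(flags & SKETCH_HUMANISE_RATE);

	assert(value && len > 0);

	if (bytes >= 1ULL << 30) {
		snprintf(value, len, "%llu.%02llu", bytes >> 30,
			 ((bytes & ((1ULL << 30) - 1)) * 100) >> 30);
		return rate ? "GiB/s" : "GiB";
	} else if (bytes >= 1ULL << 20) {
		snprintf(value, len, "%llu.%02llu", bytes >> 20,
			 ((bytes & ((1ULL << 20) - 1)) * 100) >> 20);
		return rate ? "MiB/s" : "MiB";
	} else if (bytes >= 1ULL << 10) {
		snprintf(value, len, "%llu.%02llu", bytes >> 10,
			 ((bytes & ((1ULL << 10) - 1)) * 100) >> 10);
		return rate ? "KiB/s" : "KiB";
	}

	snprintf(value, len, "%llu", bytes);
	if (flags & SKETCH_HUMANISE_COMPACT)
		return rate ? "B/s" : "B";
	/* English-only plural handling; real code would use gettext. */
	if (rate)
		return bytes == 1 ? "byte/s" : "bytes/s";
	return bytes == 1 ? "byte" : "bytes";
}
```

Returning the unit as a constant string means the caller only has to manage
the value buffer, which is the property the "no longer allocated" note
above is about.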

Changes in V2:
- Factor out and reuse existing logic from strbuf_humanise() to handle
  downscaling values and determining the appropriate unit prefix
  separately. This enables more control over how exactly the values are
  written to the structure output table which is useful for alignment
  reasons. I'm not sure about the interface used in patch 2. Feedback is
  most welcome.
- In the previous version, when checking object size on a missing object
  we would die. Instead we now ignore missing objects. This allows the
  structure command to work on partial clones.
- disk/inflated keyvalue names renamed to disk_size/inflated_size.
- Unit prefixes are marked for translation.
- The tests for keyvalue disk size values are updated to check against
  real expected values instead of skipping. Table output tests still
  skip verifying human-readable values though.
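
As an illustration of why separate value and unit strings help with
alignment, the sketch below pads the two parts independently so that the
numbers are right-aligned and the unit prefixes line up in a column. The
helper name and the column widths are hypothetical; this is not how
builtin/repo.c is known to lay out its table.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: formats "<value> <unit>" with the value
 * right-aligned in `value_width` columns and the unit left-aligned in
 * `unit_width` columns. A NULL unit is treated as an empty string.
 */
static void sketch_format_cell(char *out, size_t len,
			       const char *value, const char *unit,
			       int value_width, int unit_width)
{
	snprintf(out, len, "%*s %-*s", value_width, value,
		 unit_width, unit ? unit : "");
}
```

Calling `sketch_format_cell(buf, sizeof(buf), "1.50", "k", 6, 3)` and then
the same with `"999"` and a NULL unit gives two cells of equal width whose
digits end in the same column, which would not be possible if the value and
unit were only available as one preformatted string.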

Thanks,
-Justin

Justin Tobler (7):
  builtin/repo: group per-type object values into struct
  strbuf: split out logic to humanise byte values
  builtin/repo: humanise count values in structure output
  builtin/repo: add inflated object info to keyvalue structure output
  builtin/repo: add inflated object info to structure table
  builtin/repo: add disk size info to keyvalue stucture output
  builtin/repo: add object disk size info to structure table

 Documentation/git-repo.adoc |   2 +
 builtin/repo.c              | 175 ++++++++++++++++++++++++++++++------
 strbuf.c                    | 102 ++++++++++++++-------
 strbuf.h                    |  25 ++++++
 t/helper/test-simple-ipc.c  |   7 +-
 t/t1901-repo-structure.sh   | 118 ++++++++++++++++--------
 6 files changed, 331 insertions(+), 98 deletions(-)

Range-diff against v3:
1:  be14de68f6 = 1:  be14de68f6 builtin/repo: group per-type object values into struct
2:  1fa33f5906 ! 2:  0a145cfeec strbuf: split out logic to humanise byte values
    @@ Commit message
         determine the corresponding unit prefix into a separate humanise_bytes()
         function that provides seperate value and unit strings.
     
    +    Note that the "byte" string in "t/helper/test-simple-ipc.c" is unmarked
    +    for translation here so that it doesn't conflict with the newly defined
    +    plural "byte/bytes" translation and instead uses it.
    +
         Signed-off-by: Justin Tobler <jltobler@gmail•com>
     
      ## strbuf.c ##
    @@ strbuf.c: void strbuf_addstr_urlencode(struct strbuf *sb, const char *s,
     -					/* TRANSLATORS: IEC 80000-13:2008 byte/second */
     -					Q_("%u byte/s", "%u bytes/s", bytes),
     -				(unsigned)bytes);
    -+		*value = xstrfmt(_("%u"), (unsigned)bytes);
    ++		*value = xstrfmt("%u", (unsigned)bytes);
     +		*unit = humanise_rate ?
     +			       /* TRANSLATORS: IEC 80000-13:2008 byte/second */
     +			       Q_("byte/s", "bytes/s", bytes) :
    @@ strbuf.c: void strbuf_addstr_urlencode(struct strbuf *sb, const char *s,
     +	const char *unit;
     +
     +	humanise_bytes(bytes, &value, &unit, flags);
    ++
    ++	/*
    ++	 * TRANSLATORS: The first argument is the number string. The second
    ++	 * argument is the unit prefix string (i.e. "12.34 MiB/s").
    ++	 */
     +	strbuf_addf(buf, _("%s %s"), value, unit);
     +	free(value);
     +}
    @@ strbuf.h: void strbuf_addbuf_percentquote(struct strbuf *dst, const struct strbu
      /**
       * Append the given byte size as a human-readable string (i.e. 12.23 KiB,
       * 3.50 MiB).
    +
    + ## t/helper/test-simple-ipc.c ##
    +@@ t/helper/test-simple-ipc.c: int cmd__simple_ipc(int argc, const char **argv)
    + 		OPT_INTEGER(0, "bytecount", &cl_args.bytecount, N_("number of bytes")),
    + 		OPT_INTEGER(0, "batchsize", &cl_args.batchsize, N_("number of requests per thread")),
    + 
    +-		OPT_STRING(0, "byte", &bytevalue, N_("byte"), N_("ballast character")),
    ++		/*
    ++		 * The "byte" string here is not marked for translation and
    ++		 * instead relies on translation in strbuf.c:humanise_bytes() to
    ++		 * avoid conflict with the plural form.
    ++		 */
    ++		OPT_STRING(0, "byte", &bytevalue, "byte", N_("ballast character")),
    + 		OPT_STRING(0, "token", &cl_args.token, N_("token"), N_("command token to send to the server")),
    + 
    + 		OPT_END()
3:  8f09f6358e ! 3:  eebf0d917b builtin/repo: humanise count values in structure output
    @@ strbuf.c: void strbuf_addstr_urlencode(struct strbuf *sb, const char *s,
     +		size_t x = count + 5000000; /* for rounding */
     +		*value = xstrfmt(_("%u.%2.2u"), (unsigned)(x / 1000000000),
     +				 (unsigned)(x % 1000000000 / 10000000));
    ++		/* TRANSLATORS: SI decimal prefix symbol for 10^9 */
     +		*unit = _("G");
     +	} else if (count >= 1000000) {
     +		size_t x = count + 5000; /* for rounding */
     +		*value = xstrfmt(_("%u.%2.2u"), (unsigned)(x / 1000000),
     +				 (unsigned)(x % 1000000 / 10000));
    ++		/* TRANSLATORS: SI decimal prefix symbol for 10^6 */
     +		*unit = _("M");
     +	} else if (count >= 1000) {
     +		size_t x = count + 5; /* for rounding */
     +		*value = xstrfmt(_("%u.%2.2u"), (unsigned)(x / 1000),
     +				 (unsigned)(x % 1000 / 10));
    ++		/* TRANSLATORS: SI decimal prefix symbol for 10^3 */
     +		*unit = _("k");
     +	} else {
    -+		*value = xstrfmt(_("%u"), (unsigned)count);
    ++		*value = xstrfmt("%u", (unsigned)count);
     +		*unit = NULL;
     +	}
     +}
4:  3f4eabe94f = 4:  37f71cc1bc builtin/repo: add inflated object info to keyvalue structure output
5:  85d1052100 ! 5:  40edf4c20b builtin/repo: add inflated object info to structure table
    @@ strbuf.c
     @@ strbuf.c: void humanise_bytes(off_t bytes, char **value, const char **unit,
      		*unit = humanise_rate ? _("KiB/s") : _("KiB");
      	} else {
    - 		*value = xstrfmt(_("%u"), (unsigned)bytes);
    + 		*value = xstrfmt("%u", (unsigned)bytes);
     -		*unit = humanise_rate ?
     -			       /* TRANSLATORS: IEC 80000-13:2008 byte/second */
     -			       Q_("byte/s", "bytes/s", bytes) :
     -			       /* TRANSLATORS: IEC 80000-13:2008 byte */
     -			       Q_("byte", "bytes", bytes);
     +		if (flags & HUMANISE_COMPACT)
    ++			/* TRANSLATORS: IEC 80000-13:2008 byte/second and byte */
     +			*unit = humanise_rate ? _("B/s") : _("B");
     +		else
     +			*unit = humanise_rate ?
6:  e9fa9babec = 6:  ba861f37c9 builtin/repo: add disk size info to keyvalue stucture output
7:  df542c7bdf ! 7:  3118c17ae3 builtin/repo: add object disk size info to structure table
    @@ t/t1901-repo-structure.sh: test_description='test git repo structure'
     -		--filter-provided-objects
     +	disk_usage_opt="--disk-usage"
     +
    -+	if [ "$2" = "true" ]; then
    ++	if test "$2" = "true"
    ++	then
     +		disk_usage_opt="--disk-usage=human"
     +	fi
     +
    -+	if [ "$1" = "all" ]; then
    ++	if test "$1" = "all"
    ++	then
     +		git rev-list --all --objects $disk_usage_opt
     +	else
     +		git rev-list --all --objects $disk_usage_opt \
    @@ t/t1901-repo-structure.sh: test_expect_success SHA1 'repository with references
      		git notes add -m foo &&
      
     -		cat >expect <<-\EOF &&
    ++		# The tags disk size is handled specially due to the
    ++		# git-rev-list(1) --disk-usage=human option printing the full
    ++		# "byte/bytes" unit prefix instead of just "B".
     +		cat >expect <<-EOF &&
      		| Repository structure | Value      |
      		| -------------------- | ---------- |

base-commit: e85ae279b0d58edc2f4c3fd5ac391b51e1223985
-- 
2.52.0.209.ge85ae279b0

