public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: "Philip Oakley via GitGitGadget" <gitgitgadget@gmail•com>
To: git@vger•kernel.org
Cc: Johannes Schindelin <johannes.schindelin@gmx•de>,
	Philip Oakley <philipoakley@iee•email>
Subject: [PATCH 3/6] hash algorithms: use size_t for section lengths
Date: Thu, 04 Jun 2026 17:15:09 +0000	[thread overview]
Message-ID: <253d6f8004e710d05b5de1f8279d67d2220f83de.1780593313.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.2138.git.1780593313.gitgitgadget@gmail.com>

From: Philip Oakley <philipoakley@iee•email>

Continue walking the code path for the >4GB `hash-object --literally`
test to the hash algorithm step for LLP64 systems.

This patch lets the SHA1DC code use `size_t`, making it compatible with
LLP64 data models (as used e.g. by Windows).

The interested reader of this patch will note that we adjust the
signature of the `git_SHA1DCUpdate()` function without updating _any_
call site. This certainly puzzled at least one reviewer already, so here
is an explanation:

This function is never called directly, but always via the macro
`platform_SHA1_Update`, which is usually called via the macro
`git_SHA1_Update`. However, we never call `git_SHA1_Update()` directly
in `struct git_hash_algo`. Instead, we call `git_hash_sha1_update()`,
which is defined thusly:

    static void git_hash_sha1_update(git_hash_ctx *ctx,
                                     const void *data, size_t len)
    {
        git_SHA1_Update(&ctx->sha1, data, len);
    }

i.e. it contains an implicit downcast from `size_t` to `unsigned long`
(before this here patch). With this patch, there is no downcast anymore.

With this patch, finally, the t1007-hash-object.sh "files over 4GB hash
literally" test case is fixed.

Signed-off-by: Philip Oakley <philipoakley@iee•email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx•de>
---
 object-file.c          | 4 ++--
 sha1dc_git.c           | 3 +--
 sha1dc_git.h           | 2 +-
 t/t1007-hash-object.sh | 2 +-
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/object-file.c b/object-file.c
index 1f5f9daf24..c648cecd80 100644
--- a/object-file.c
+++ b/object-file.c
@@ -561,7 +561,7 @@ int odb_source_loose_read_object_info(struct odb_source *source,
 }
 
 static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
-			     const void *buf, unsigned long len,
+			     const void *buf, size_t len,
 			     struct object_id *oid,
 			     char *hdr, size_t *hdrlen)
 {
@@ -581,7 +581,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo,
 	/* Generate the header */
 	*hdrlen = format_object_header(hdr, *hdrlen, type, len);
 
-	/* Sha1.. */
+	/* Hash (function pointers) computation */
 	hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
 }
 
diff --git a/sha1dc_git.c b/sha1dc_git.c
index 9b675a046e..fe58d7962a 100644
--- a/sha1dc_git.c
+++ b/sha1dc_git.c
@@ -27,10 +27,9 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
 /*
  * Same as SHA1DCUpdate, but adjust types to match git's usual interface.
  */
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len)
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len)
 {
 	const char *data = vdata;
-	/* We expect an unsigned long, but sha1dc only takes an int */
 	while (len > INT_MAX) {
 		SHA1DCUpdate(ctx, data, INT_MAX);
 		data += INT_MAX;
diff --git a/sha1dc_git.h b/sha1dc_git.h
index f6f880cabe..0bcf1aa84b 100644
--- a/sha1dc_git.h
+++ b/sha1dc_git.h
@@ -15,7 +15,7 @@ void git_SHA1DCInit(SHA1_CTX *);
 #endif
 
 void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len);
 
 #define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */
 
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 7867fd1dbf..10382a815e 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -261,7 +261,7 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
 	test_cmp expect actual
 '
 
-test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
 		'files over 4GB hash literally' '
 	test-tool genzeros $((5*1024*1024*1024)) >big &&
 	test_oid large5GB >expect &&
-- 
gitgitgadget


  parent reply	other threads:[~2026-06-04 17:15 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-04 17:15 [PATCH 0/6] Support hashing objects larger than 4GB on Windows Johannes Schindelin via GitGitGadget
2026-06-04 17:15 ` [PATCH 1/6] hash-object: demonstrate a >4GB/LLP64 problem Philip Oakley via GitGitGadget
2026-06-04 17:15 ` [PATCH 2/6] object-file.c: use size_t for header lengths Philip Oakley via GitGitGadget
2026-06-04 17:15 ` Philip Oakley via GitGitGadget [this message]
2026-06-04 17:15 ` [PATCH 4/6] hash-object --stdin: verify that it works with >4GB/LLP64 Philip Oakley via GitGitGadget
2026-06-04 17:15 ` [PATCH 5/6] hash-object: add another >4GB/LLP64 test case Philip Oakley via GitGitGadget
2026-06-04 17:15 ` [PATCH 6/6] hash-object: add a >4GB/LLP64 test case using filtered input Philip Oakley via GitGitGadget
2026-06-04 21:56 ` [PATCH 0/6] Support hashing objects larger than 4GB on Windows Philip Oakley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=253d6f8004e710d05b5de1f8279d67d2220f83de.1780593313.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=johannes.schindelin@gmx$(echo .)de \
    --cc=philipoakley@iee$(echo .)email \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox