From: "Philip Oakley via GitGitGadget" <gitgitgadget@gmail•com>
To: git@vger•kernel.org
Cc: Johannes Schindelin <johannes.schindelin@gmx•de>,
Philip Oakley <philipoakley@iee•email>
Subject: [PATCH 3/6] hash algorithms: use size_t for section lengths
Date: Thu, 04 Jun 2026 17:15:09 +0000
Message-ID: <253d6f8004e710d05b5de1f8279d67d2220f83de.1780593313.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.2138.git.1780593313.gitgitgadget@gmail.com>
From: Philip Oakley <philipoakley@iee•email>
Continue walking the code path for the >4GB `hash-object --literally`
test to the hash algorithm step for LLP64 systems.

This patch lets the SHA1DC code use `size_t`, making it compatible with
LLP64 data models (as used e.g. by Windows).

The interested reader of this patch will note that we adjust the
signature of the `git_SHA1DCUpdate()` function without updating _any_
call site. This certainly puzzled at least one reviewer already, so here
is an explanation:

This function is never called directly, but always via the macro
`platform_SHA1_Update`, which is usually called via the macro
`git_SHA1_Update`. However, we never call `git_SHA1_Update()` directly
in `struct git_hash_algo`. Instead, we call `git_hash_sha1_update()`,
which is defined thusly:

	static void git_hash_sha1_update(git_hash_ctx *ctx,
					 const void *data, size_t len)
	{
		git_SHA1_Update(&ctx->sha1, data, len);
	}

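i.e. before this patch it contained an implicit downcast from `size_t`
to `unsigned long`, which on LLP64 systems (64-bit `size_t`, 32-bit
`unsigned long`) reduces any length of 4GB or more modulo 2^32. With
this patch, there is no downcast anymore.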

With this patch, finally, the t1007-hash-object.sh "files over 4GB hash
literally" test case passes, so it is switched from
`test_expect_failure` to `test_expect_success`.

Signed-off-by: Philip Oakley <philipoakley@iee•email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx•de>
---
object-file.c | 4 ++--
sha1dc_git.c | 3 +--
sha1dc_git.h | 2 +-
t/t1007-hash-object.sh | 2 +-
4 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/object-file.c b/object-file.c
index 1f5f9daf24..c648cecd80 100644
--- a/object-file.c
+++ b/object-file.c
@@ -561,7 +561,7 @@ int odb_source_loose_read_object_info(struct odb_source *source,
}
static void hash_object_body(const struct git_hash_algo *algo, struct git_hash_ctx *c,
- const void *buf, unsigned long len,
+ const void *buf, size_t len,
struct object_id *oid,
char *hdr, size_t *hdrlen)
{
@@ -581,7 +581,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo,
/* Generate the header */
*hdrlen = format_object_header(hdr, *hdrlen, type, len);
- /* Sha1.. */
+ /* Hash (function pointers) computation */
hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen);
}
diff --git a/sha1dc_git.c b/sha1dc_git.c
index 9b675a046e..fe58d7962a 100644
--- a/sha1dc_git.c
+++ b/sha1dc_git.c
@@ -27,10 +27,9 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
/*
* Same as SHA1DCUpdate, but adjust types to match git's usual interface.
*/
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len)
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len)
{
const char *data = vdata;
- /* We expect an unsigned long, but sha1dc only takes an int */
while (len > INT_MAX) {
SHA1DCUpdate(ctx, data, INT_MAX);
data += INT_MAX;
diff --git a/sha1dc_git.h b/sha1dc_git.h
index f6f880cabe..0bcf1aa84b 100644
--- a/sha1dc_git.h
+++ b/sha1dc_git.h
@@ -15,7 +15,7 @@ void git_SHA1DCInit(SHA1_CTX *);
#endif
void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len);
#define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index 7867fd1dbf..10382a815e 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -261,7 +261,7 @@ test_expect_success '--stdin outside of repository (uses default hash)' '
test_cmp expect actual
'
-test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
+test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
'files over 4GB hash literally' '
test-tool genzeros $((5*1024*1024*1024)) >big &&
test_oid large5GB >expect &&
--
gitgitgadget
Thread overview: 8+ messages
2026-06-04 17:15 [PATCH 0/6] Support hashing objects larger than 4GB on Windows Johannes Schindelin via GitGitGadget
2026-06-04 17:15 ` [PATCH 1/6] hash-object: demonstrate a >4GB/LLP64 problem Philip Oakley via GitGitGadget
2026-06-04 17:15 ` [PATCH 2/6] object-file.c: use size_t for header lengths Philip Oakley via GitGitGadget
2026-06-04 17:15 ` Philip Oakley via GitGitGadget [this message]
2026-06-04 17:15 ` [PATCH 4/6] hash-object --stdin: verify that it works with >4GB/LLP64 Philip Oakley via GitGitGadget
2026-06-04 17:15 ` [PATCH 5/6] hash-object: add another >4GB/LLP64 test case Philip Oakley via GitGitGadget
2026-06-04 17:15 ` [PATCH 6/6] hash-object: add a >4GB/LLP64 test case using filtered input Philip Oakley via GitGitGadget
2026-06-04 21:56 ` [PATCH 0/6] Support hashing objects larger than 4GB on Windows Philip Oakley