From: Junio C Hamano <gitster@pobox•com>
To: Atousa Duprat <atousa.p@gmail•com>
Cc: git@vger•kernel.org,
"Rafael Espíndola" <rafael.espindola@gmail•com>,
"Filipe Cabecinhas" <filcab@gmail•com>
Subject: Re: git fsck failure on OS X with files >= 4 GiB
Date: Thu, 29 Oct 2015 10:19:14 -0700
Message-ID: <xmqqlhalsict.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <CA+izobtdwszVrYsnKU=_ytLuNbPGyRe_7kXqyrQO7u5Lo+OdPg@mail.gmail.com> (Atousa Duprat's message of "Thu, 29 Oct 2015 09:02:49 -0700")
Atousa Duprat <atousa.p@gmail•com> writes:
> [PATCH] Limit the size of the data block passed to SHA1_Update()
>
> This avoids issues where OS-specific implementations use
> a 32-bit integer to specify block size. Limit currently
> set to 1GiB.
> ---
> cache.h | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/cache.h b/cache.h
> index 79066e5..c305985 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -14,10 +14,28 @@
> #ifndef git_SHA_CTX
> #define git_SHA_CTX SHA_CTX
> #define git_SHA1_Init SHA1_Init
> -#define git_SHA1_Update SHA1_Update
> #define git_SHA1_Final SHA1_Final
> #endif
>
> +#define SHA1_MAX_BLOCK_SIZE (1024*1024*1024)
> +
> +static inline int git_SHA1_Update(SHA_CTX *c, const void *data, size_t len)
> +{
> + size_t nr;
> + size_t total = 0;
> + char *cdata = (char*)data;
> + while(len > 0) {
> + nr = len;
> + if(nr > SHA1_MAX_BLOCK_SIZE)
> + nr = SHA1_MAX_BLOCK_SIZE;
> + SHA1_Update(c, cdata, nr);
> + total += nr;
> + cdata += nr;
> + len -= nr;
> + }
> + return total;
> +}
> +
I think the idea illustrated above is a good start, but there are
a few issues:

 * SHA1_Update() is used in fairly many places; it is unclear if it
   is a good idea to inline.

 * There is no need to punish implementations with working
   SHA1_Update by another level of wrapping.

 * What would you do when you find an implementation for which 1G is
   still too big?

Perhaps something like this in the header

    #ifdef SHA1_MAX_BLOCK_SIZE
    extern int SHA1_Update_Chunked(SHA_CTX *, const void *, size_t);
    #define git_SHA1_Update SHA1_Update_Chunked
    #endif

with compat/sha1_chunked.c that has

    #ifdef SHA1_MAX_BLOCK_SIZE
    int SHA1_Update_Chunked(SHA_CTX *c, const void *data, size_t len)
    {
        ... your looping implementation ...
    }
    #endif

in it, that is only triggered via a Makefile macro, might be a good
workaround, e.g.

diff --git a/Makefile b/Makefile
index 8466333..83348b8 100644
--- a/Makefile
+++ b/Makefile
@@ -139,6 +139,10 @@ all::
# Define PPC_SHA1 environment variable when running make to make use of
# a bundled SHA1 routine optimized for PowerPC.
#
+# Define SHA1_MAX_BLOCK_SIZE if your SHA1_Update() implementation can
+# hash only a limited amount of data in one call (e.g. APPLE_COMMON_CRYPTO
+# may want 'SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L' defined).
+#
# Define NEEDS_CRYPTO_WITH_SSL if you need -lcrypto when using -lssl (Darwin).
#
# Define NEEDS_SSL_WITH_CRYPTO if you need -lssl when using -lcrypto (Darwin).
@@ -1002,6 +1006,7 @@ ifeq ($(uname_S),Darwin)
ifndef NO_APPLE_COMMON_CRYPTO
APPLE_COMMON_CRYPTO = YesPlease
COMPAT_CFLAGS += -DAPPLE_COMMON_CRYPTO
+ SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L
endif
NO_REGEX = YesPlease
PTHREAD_LIBS =
@@ -1350,6 +1355,11 @@ endif
endif
endif
+ifdef SHA1_MAX_BLOCK_SIZE
+LIB_OBJS += compat/sha1_chunked.o
+BASIC_CFLAGS += -DSHA1_MAX_BLOCK_SIZE="$(SHA1_MAX_BLOCK_SIZE)"
+endif
+
ifdef NO_PERL_MAKEMAKER
export NO_PERL_MAKEMAKER
endif
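
For illustration only, the looping body of the chunked helper discussed
above could look roughly like the sketch below. It is not the code from
the thread. mock_ctx and mock_update() are hypothetical stand-ins for
SHA_CTX and the platform's SHA1_Update(), used so the chunking can be
observed without a real hash library, and update_chunked() is a
made-up name. The limit is set to a tiny value here so the behaviour is
easy to check; the value suggested in the Makefile comment is
1024L*1024L*1024L. Unlike the quoted patch, the sketch returns 1 rather
than the byte count, on the assumption that a count of several GiB does
not fit in an int; OpenSSL's SHA1_Update() also returns 1 on success.

```c
#include <stddef.h>

/* Demo value only; a real build would pass the limit in via -D. */
#define SHA1_MAX_BLOCK_SIZE 4

/* Hypothetical stand-in for SHA_CTX: records how it was fed. */
typedef struct {
    size_t calls;   /* number of update calls made */
    size_t total;   /* total bytes passed in */
    size_t largest; /* largest single length seen */
} mock_ctx;

/* Hypothetical stand-in for the platform's SHA1_Update(). */
static int mock_update(mock_ctx *c, const void *data, size_t len)
{
    (void)data; /* a real implementation would hash these bytes */
    c->calls++;
    c->total += len;
    if (len > c->largest)
        c->largest = len;
    return 1;
}

/*
 * Feed data to the underlying update function in pieces no larger
 * than SHA1_MAX_BLOCK_SIZE, so that a length narrowed to 32 bits by
 * the underlying implementation is never out of range.
 */
static int update_chunked(mock_ctx *c, const void *data, size_t len)
{
    const char *cdata = data;

    while (len > 0) {
        size_t nr = len;

        if (nr > (size_t)SHA1_MAX_BLOCK_SIZE)
            nr = (size_t)SHA1_MAX_BLOCK_SIZE;
        mock_update(c, cdata, nr);
        cdata += nr;
        len -= nr;
    }
    return 1;
}
```

With the demo limit of 4, a 10-byte buffer is passed down as three
calls of 4, 4 and 2 bytes, and a zero-length buffer results in no call
at all.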