From: "brian m. carlson" <sandals@crustytoothpaste•net>
To: Junio C Hamano <gitster@pobox•com>
Cc: git@vger•kernel.org, Patrick Steinhardt <ps@pks•im>
Subject: Re: [PATCH 02/10] hash: add a constant for the original hash algorithm
Date: Fri, 20 Jun 2025 20:43:07 +0000
Message-ID: <aFXH2_PpZrJxJRCs@fruit.crustytoothpaste.net>
In-Reply-To: <xmqq1prf89cd.fsf@gitster.g>
On 2025-06-20 at 01:56:02, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste•net> writes:
>
> > We have a variety of uses of GIT_HASH_SHA1 littered throughout our
> > code. Some of these really mean to represent specifically SHA-1, but
> > some actually represent the original hash algorithm used in Git which is
> > implied by older formats and protocols which do not contain hash
> > information. For instance, the bundle v1 and v2 formats do not contain
> > hash algorithm information, and thus SHA-1 is implied by the use of
> > these formats.
>
> Does that mean use of _ORIGINAL is a sign that these places should
> keep using SHA-1 and should not change?
Yes.
> I am having a hard time guessing/assessing the value of having _ORIGINAL
> that is a synonym for _SHA1; with redirection, it pretends as if the
> underlying value can be updated from SHA-1 to SHA-256 (and that is
> the very intention behind GIT_HASH_DEFAULT symbol that gives us a
> level of indirection), but it is hard to imagine we would ever want
> to change what _ORIGINAL means, as that word talks about a historical
> fact that will never change over time.
I agree. _ORIGINAL indicates a use of SHA-1 that is a historical fact:
a legacy decision, as opposed to one specified explicitly.
For instance, if we're setting the algorithm for bundle v1 and v2, then
we'd use _ORIGINAL because those formats did not specify a hash algorithm
when they were designed and, for legacy reasons, we cannot change that
fact. However, if a user specified @object-format=sha1 in a bundle v3,
then we'd use _SHA1, since that was an explicit, documented decision.
Similarly, _SHA1 represents extensions.objectFormat=sha1, which is an
intentional decision to use the older algorithm.
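
A minimal sketch of the bundle case, not the code from the patch. The
helper name bundle_hash_algo and its parameters are hypothetical, and the
numeric values of the constants are an assumption made so that the
example compiles on its own. A v3 header without an @object-format
capability is deliberately left undecided here.

```c
#include <assert.h>
#include <string.h>

/* Illustrative values; assumed, not copied from the patch. */
#define GIT_HASH_UNKNOWN 0
#define GIT_HASH_SHA1 1
#define GIT_HASH_SHA256 2
/* Same value as SHA-1; the name records "implied by a legacy format". */
#define GIT_HASH_ORIGINAL GIT_HASH_SHA1

/*
 * Hypothetical helper: choose the hash algorithm for a bundle header.
 * object_format is the value of an @object-format capability, or NULL
 * when the header carries none.
 */
static int bundle_hash_algo(int version, const char *object_format)
{
	assert(version >= 1 && version <= 3);

	/* v1 and v2 carry no hash information, so SHA-1 is implied. */
	if (version < 3)
		return GIT_HASH_ORIGINAL;

	/* This sketch does not decide the "v3 without capability" case. */
	if (!object_format)
		return GIT_HASH_UNKNOWN;

	/* The user wrote the algorithm down: an explicit choice. */
	if (!strcmp(object_format, "sha1"))
		return GIT_HASH_SHA1;
	if (!strcmp(object_format, "sha256"))
		return GIT_HASH_SHA256;

	return GIT_HASH_UNKNOWN;
}
```

Both the v2 path and the v3 "sha1" path return the same number; only the
constant's name tells a reader which reason applies.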
> > Add a constant for documentary purposes which indicates this value. It
> > will always be the same as SHA-1, since this is an essential part of
> > these formats, but its use indicates this particular reason and not any
> > other reason why SHA-1 might be used.
>
> I am not sure what this means. If we use GIT_HASH_SHA1 in such
> places explicitly (as opposed to GIT_HASH_DEFAULT), isn't it a sign
> enough that with different versions of Git, that particular code
> path should keep using SHA-1 no matter what the default is?
If we have a test helper that computes hashes and someone specified
"sha1" on the command line, that's GIT_HASH_SHA1. Someone said, "I'd
like to use SHA-1." Similarly, in the reftable code, we can read the
byte value indicating that the reftable is in SHA-1 and that's an
explicit decision.
If we default to SHA-1 because nobody specified extensions.objectFormat,
then that's GIT_HASH_ORIGINAL. Nobody made a decision or opted into an
algorithm; we just didn't think hard enough about cryptographic agility
in the original Git and we assumed SHA-1.
They're both the same numeric constant here and always will be (even if,
in a future version of Git, we get rid of SHA-1 altogether and we
otherwise die on that code). But there's a difference in intention: one
explicitly stated SHA-1 as opposed to a different algorithm and one just
got a default because that's the compatible legacy behaviour.
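
The same point as a small sketch for the repository-format case. The
names parse_hash_name and repo_hash_algo are hypothetical, and the
constant values are assumed for illustration only. The argument stands
in for the value of extensions.objectFormat, with NULL meaning that the
setting is absent.

```c
#include <assert.h>
#include <string.h>

/* Illustrative values; assumed, not copied from the patch. */
#define GIT_HASH_UNKNOWN 0
#define GIT_HASH_SHA1 1
#define GIT_HASH_SHA256 2
#define GIT_HASH_ORIGINAL GIT_HASH_SHA1

/* To the compiler the two names are interchangeable. */
_Static_assert(GIT_HASH_ORIGINAL == GIT_HASH_SHA1,
	       "the original hash algorithm is always SHA-1");

/* Hypothetical helper: map an explicitly given name to a constant. */
static int parse_hash_name(const char *name)
{
	assert(name != NULL);

	if (!strcmp(name, "sha1"))
		return GIT_HASH_SHA1;	/* someone asked for SHA-1 */
	if (!strcmp(name, "sha256"))
		return GIT_HASH_SHA256;
	return GIT_HASH_UNKNOWN;
}

/* Hypothetical helper: algorithm for a repository's object format. */
static int repo_hash_algo(const char *object_format)
{
	/* Nobody chose anything: the legacy, compatible behaviour. */
	if (!object_format)
		return GIT_HASH_ORIGINAL;

	return parse_hash_name(object_format);
}
```

A call with NULL and a call with "sha1" yield the same value, yet a
search for GIT_HASH_ORIGINAL finds only the places where SHA-1 is
implied rather than requested.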
--
brian m. carlson (they/them)
Toronto, Ontario, CA