public inbox for git@vger.kernel.org 
From: "brian m. carlson" <sandals@crustytoothpaste•net>
To: Junio C Hamano <gitster@pobox•com>
Cc: git@vger•kernel.org, Patrick Steinhardt <ps@pks•im>
Subject: Re: [PATCH 02/10] hash: add a constant for the original hash algorithm
Date: Fri, 20 Jun 2025 20:43:07 +0000
Message-ID: <aFXH2_PpZrJxJRCs@fruit.crustytoothpaste.net>
In-Reply-To: <xmqq1prf89cd.fsf@gitster.g>

On 2025-06-20 at 01:56:02, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste•net> writes:
> 
> > We have a variety of uses of GIT_HASH_SHA1 littered throughout our
> > code.  Some of these really mean to represent specifically SHA-1, but
> > some actually represent the original hash algorithm used in Git which is
> > implied by older formats and protocols which do not contain hash
> > information.  For instance, the bundle v1 and v2 formats do not contain
> > hash algorithm information, and thus SHA-1 is implied by the use of
> > these formats.
> 
> Does that mean use of _ORIGINAL is a sign that these places should
> keep using SHA-1 and should not change?

Yes.

> I am having a hard time guessing/assessing the value of having _ORIGINAL
> that is a synonym for _SHA1; with redirection, it pretends as if the
> underlying value can be updated from SHA-1 to SHA-256 (and that is
> the very intention behind GIT_HASH_DEFAULT symbol that gives us a
> level of indirection), but it is hard to imagine we would ever want
> to change what _ORIGINAL means, as that word talks about a historical
> fact that will never change over time.

I agree.  _ORIGINAL indicates a use of SHA-1 that is a historical fact:
a legacy decision, as opposed to one that someone specified explicitly.

For instance, if we're setting the algorithm for bundle v1 and v2, then
we'd use _ORIGINAL because those formats did not specify a hash
algorithm when they were designed and, for legacy reasons, we cannot
change that fact.  However, if a user specified @object-format=sha1 with
bundle v3, then we'd use _SHA1, since that was an explicit, documented
decision.  Similarly, _SHA1 represents extensions.objectFormat=sha1,
which is an intentional decision to use the older algorithm.
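
To make the bundle case concrete, here is a minimal sketch of the
distinction.  It is not the code from the patch: the enum values are
illustrative stand-ins rather than copies of Git's hash.h, and
bundle_hash_algo() is a hypothetical helper invented for this example.
What a v3 bundle without the capability should do is deliberately left
out of scope.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative stand-ins for this sketch; the names follow the thread,
 * but the values are not copied from Git's hash.h.
 */
enum {
	GIT_HASH_UNKNOWN = 0,
	GIT_HASH_SHA1 = 1,
	GIT_HASH_SHA256 = 2,
	/* Same value as SHA-1, different intent: implied by a legacy format. */
	GIT_HASH_ORIGINAL = GIT_HASH_SHA1
};

/*
 * Hypothetical helper (not a Git function): choose the algorithm for a
 * bundle from its version and the value of its @object-format
 * capability, or NULL if the capability is absent.
 */
static int bundle_hash_algo(int version, const char *object_format)
{
	/* v1 and v2 cannot name an algorithm, so SHA-1 is implied. */
	if (version == 1 || version == 2)
		return GIT_HASH_ORIGINAL;
	/* Anything else without an explicit name is out of scope here. */
	if (version != 3 || !object_format)
		return GIT_HASH_UNKNOWN;
	/* v3 with the capability: the writer made an explicit choice. */
	if (!strcmp(object_format, "sha1"))
		return GIT_HASH_SHA1;
	if (!strcmp(object_format, "sha256"))
		return GIT_HASH_SHA256;
	return GIT_HASH_UNKNOWN;
}
```

With these definitions, bundle_hash_algo(2, NULL) and
bundle_hash_algo(3, "sha1") return the same number, but a reader of the
code can tell which one was a choice and which one was inherited.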

> > Add a constant for documentary purposes which indicates this value.  It
> > will always be the same as SHA-1, since this is an essential part of
> > these formats, but its use indicates this particular reason and not any
> > other reason why SHA-1 might be used.
> 
> I am not sure what this means.  If we use GIT_HASH_SHA1 in such
> places explicitly (as opposed to GIT_HASH_DEFAULT), isn't it a sign
> enough that with different versions of Git, that particular code
> path should keep using SHA-1 no matter what the default is?

If we have a test helper that computes hashes and someone specified
"sha1" on the command line, that's GIT_HASH_SHA1.  Someone said, "I'd
like to use SHA-1."  Similarly, in the reftable code, we can read the
byte value indicating that the reftable is in SHA-1 and that's an
explicit decision.

If we default to SHA-1 because nobody specified extensions.objectFormat,
then that's GIT_HASH_ORIGINAL.  Nobody made a decision or opted into an
algorithm; we just didn't think hard enough about cryptographic agility
in the original Git and we assumed SHA-1.

They're both the same numeric constant here and always will be (even if,
in a future version of Git, we get rid of SHA-1 altogether and we
otherwise die on that code).  But there's a difference in intention: one
explicitly stated SHA-1 as opposed to a different algorithm and one just
got a default because that's the compatible legacy behaviour.
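
As a rough illustration of that difference in intention, consider the
sketch below.  Both function names are hypothetical and the enum values
are stand-ins chosen for the example, not Git's actual definitions; the
"configured" argument is assumed to be the value of
extensions.objectFormat, or NULL when it is unset.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in constants for this sketch; not copied from Git's hash.h. */
enum {
	GIT_HASH_UNKNOWN = 0,
	GIT_HASH_SHA1 = 1,
	GIT_HASH_SHA256 = 2,
	/* Numerically SHA-1, but marks "nobody chose this". */
	GIT_HASH_ORIGINAL = GIT_HASH_SHA1
};

/*
 * Hypothetical: map a name that someone wrote down explicitly (on a
 * command line, in a config file) to an algorithm.
 */
static int hash_algo_by_explicit_name(const char *name)
{
	if (!strcmp(name, "sha1"))
		return GIT_HASH_SHA1; /* "I'd like to use SHA-1." */
	if (!strcmp(name, "sha256"))
		return GIT_HASH_SHA256;
	return GIT_HASH_UNKNOWN;
}

/*
 * Hypothetical: pick the algorithm for a repository.  With no
 * configured value we fall back to the compatible legacy behaviour.
 */
static int repo_hash_algo(const char *configured)
{
	if (!configured)
		return GIT_HASH_ORIGINAL; /* nobody opted in to anything */
	return hash_algo_by_explicit_name(configured);
}
```

Here repo_hash_algo(NULL) and repo_hash_algo("sha1") compare equal, so
nothing changes at run time; the two spellings only record why SHA-1
was selected.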
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA
