public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox•com>
To: "brian m. carlson" <sandals@crustytoothpaste•net>
Cc: <git@vger•kernel.org>,  Patrick Steinhardt <ps@pks•im>
Subject: Re: [PATCH 05/10] setup: use the default algorithm to initialize repo format
Date: Fri, 20 Jun 2025 07:55:12 -0700	[thread overview]
Message-ID: <xmqqtt4a5upb.fsf@gitster.g> (raw)
In-Reply-To: <20250620011943.586596-6-sandals@crustytoothpaste.net> (brian m. carlson's message of "Fri, 20 Jun 2025 01:19:37 +0000")

"brian m. carlson" <sandals@crustytoothpaste•net> writes:

> When we define a new repository format with REPOSITORY_FORMAT_INIT, we
> always use GIT_HASH_SHA1, and this value ends up getting used as the
> default value to initialize a repository if none of the command line,
> environment, or config tell us to do otherwise.
>
> Because we might not always want to use SHA-1 as the default, let's
> instead specify the default hash algorithm constant so that we will use
> whatever the specified default is.  

All of the above hints the use of _DEFAULT instead of _SHA1 but ...

> However, to make sure we continue to
> read repositories without a specified hash algorithm as SHA-1, default
> the repository format to the original hash algorithm (SHA-1) when
> reading the repository format.

... this explains why we may want to

 - expect that we would be able to read the hash from repository
 - read from repository
 - if the repository specifies the hash, happily use it
 - otherwise it is a lagacy repository so use the SHA-1

Is that what is going on here?  Because I find some things that
happens in the code somewhat questionable.

> Signed-off-by: brian m. carlson <sandals@crustytoothpaste•net>
> ---
>  setup.c | 5 ++++-
>  setup.h | 2 +-
>  2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/setup.c b/setup.c
> index 641c857ed5..fb38824a2b 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -835,9 +835,12 @@ static void init_repository_format(struct repository_format *format)
>  int read_repository_format(struct repository_format *format, const char *path)
>  {
>  	clear_repository_format(format);
> +	format->hash_algo = GIT_HASH_ORIGINAL;

If we expect we can read from the config, and otherwise we fall back
to the hardcoded legacy SHA-1, do we need this assignment?  We
cleared and version is set to -1, and then we read from the config ...

>  	git_config_from_file(check_repo_format, path, format);

... so if the file said anything about "extensions.objectformat", we
would know it by now.  If not, wouldn't version be left to -1 as our
previous clal to clear_repository_format() set it via its call to
init_repository_format()?

Ahh, this code is prepared to handle a repository that claims to use
version 1 but does not set extensions.objectformat at all.  And in
such a case, we do want to use SHA-1.  OK, the above code makes
sense for that case.

> -	if (format->version == -1)
> +	if (format->version == -1) {

And if there is no core.repositoryformatversion set, we will come
here.  According to the comment before handle_extension_v0(), some
extensions.* should still be honored even in such a repository, and
the above call to git_config_from_file() should have handled them
just fine.

However, I do not understand why we clear all of what we read with
another call to clear_repository_format() here.

>  		clear_repository_format(format);
> +		format->hash_algo = GIT_HASH_ORIGINAL;
> +	}
>  	return format->version;
>  }

Admittedly, I find that the way how check_repo_format() does its
thing somewhat questionable.  Even though it reads into the .version
member the value of core.repositoryformatversion, it slurps in all
the extensions.* regardless of what .version the repository claims
to be in.  So even though there are two separate functions to handle
"historical compatibility" handle_extension_v0() and other extensions,
we still end up honoring extensions.objectformat in a repository that
does not say what format version it uses.  And clearing them with a
call to clear_repository_format() may make sense, but do we want to
clear things we read with handle_extension_v0() as well?

> diff --git a/setup.h b/setup.h
> index 18dc3b7368..8522fa8575 100644
> --- a/setup.h
> +++ b/setup.h
> @@ -149,7 +149,7 @@ struct repository_format {
>  { \
>  	.version = -1, \
>  	.is_bare = -1, \
> -	.hash_algo = GIT_HASH_SHA1, \
> +	.hash_algo = GIT_HASH_DEFAULT, \
>  	.ref_storage_format = REF_STORAGE_FORMAT_FILES, \
>  	.unknown_extensions = STRING_LIST_INIT_DUP, \
>  	.v1_only_extensions = STRING_LIST_INIT_DUP, \

  reply	other threads:[~2025-06-20 14:55 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-20  1:19 [PATCH 00/10] Add SHA-256 by default as a breaking change brian m. carlson
2025-06-20  1:19 ` [PATCH 01/10] hash: add a constant for the default hash algorithm brian m. carlson
2025-06-20  1:19 ` [PATCH 02/10] hash: add a constant for the original " brian m. carlson
2025-06-20  1:56   ` Junio C Hamano
2025-06-20 20:43     ` brian m. carlson
2025-07-01 11:35       ` Patrick Steinhardt
2025-06-20  1:19 ` [PATCH 03/10] builtin: use default hash when outside a repository brian m. carlson
2025-06-20 14:19   ` Junio C Hamano
2025-07-01 11:35   ` Patrick Steinhardt
2025-07-01 21:14     ` brian m. carlson
2025-07-02 15:08       ` Patrick Steinhardt
2025-06-20  1:19 ` [PATCH 04/10] Use original hash for legacy formats brian m. carlson
2025-06-20 14:26   ` Junio C Hamano
2025-06-20 20:51     ` brian m. carlson
2025-06-20 21:14       ` Junio C Hamano
2025-07-01 11:35         ` Patrick Steinhardt
2025-06-20  1:19 ` [PATCH 05/10] setup: use the default algorithm to initialize repo format brian m. carlson
2025-06-20 14:55   ` Junio C Hamano [this message]
2025-06-20 20:28     ` brian m. carlson
2025-06-20 21:05       ` Junio C Hamano
2025-06-20  1:19 ` [PATCH 06/10] t: default to compile-time default hash if not set brian m. carlson
2025-06-20  1:19 ` [PATCH 07/10] t1007: choose the built-in hash outside of a repo brian m. carlson
2025-06-20  1:19 ` [PATCH 08/10] t4042: " brian m. carlson
2025-06-20  1:19 ` [PATCH 09/10] t5300: " brian m. carlson
2025-06-20  1:19 ` [PATCH 10/10] Enable SHA-256 by default in breaking changes mode brian m. carlson
2025-06-20 14:58   ` Junio C Hamano
2025-06-20 19:18     ` brian m. carlson
2025-06-20 15:03   ` Junio C Hamano
2025-06-20 19:15     ` brian m. carlson
2025-06-20 20:42       ` Junio C Hamano
2025-06-20 21:06         ` brian m. carlson
2025-07-01 11:35   ` Patrick Steinhardt
2025-07-01 21:22 ` [PATCH v2 00/11] Add SHA-256 by default as a breaking change brian m. carlson
2025-07-01 21:22   ` [PATCH v2 01/11] hash: add a constant for the default hash algorithm brian m. carlson
2025-07-01 21:22   ` [PATCH v2 02/11] hash: add a constant for the legacy " brian m. carlson
2025-07-01 21:22   ` [PATCH v2 03/11] builtin: use default hash when outside a repository brian m. carlson
2025-07-01 21:22   ` [PATCH v2 04/11] Use legacy hash for legacy formats brian m. carlson
2025-07-01 21:22   ` [PATCH v2 05/11] setup: use the default algorithm to initialize repo format brian m. carlson
2025-07-01 21:22   ` [PATCH v2 06/11] t: default to compile-time default hash if not set brian m. carlson
2025-07-01 21:22   ` [PATCH v2 07/11] t1007: choose the built-in hash outside of a repo brian m. carlson
2025-07-01 21:22   ` [PATCH v2 08/11] t4042: " brian m. carlson
2025-07-01 21:22   ` [PATCH v2 09/11] t5300: " brian m. carlson
2025-07-01 21:22   ` [PATCH v2 10/11] help: add a build option for default hash brian m. carlson
2025-07-01 21:22   ` [PATCH v2 11/11] Enable SHA-256 by default in breaking changes mode brian m. carlson
2025-07-01 22:10   ` [PATCH v2 00/11] Add SHA-256 by default as a breaking change Junio C Hamano
2025-07-02 14:46   ` Patrick Steinhardt
2025-07-02 15:01     ` Kristoffer Haugsbakk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqtt4a5upb.fsf@gitster.g \
    --to=gitster@pobox$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=ps@pks$(echo .)im \
    --cc=sandals@crustytoothpaste$(echo .)net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox