public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Jeff King <peff@peff•net>
To: Taylor Blau <me@ttaylorr•com>
Cc: git@vger•kernel.org, Elijah Newren <newren@gmail•com>,
	Junio C Hamano <gitster@pobox•com>,
	"brian m. carlson" <sandals@crustytoothpaste•net>
Subject: Re: [PATCH 3/6] hash.h: introduce `unsafe_hash_algo()`
Date: Thu, 21 Nov 2024 04:37:31 -0500	[thread overview]
Message-ID: <20241121093731.GD602681@coredump.intra.peff.net> (raw)
In-Reply-To: <17f92dba34bee235177c8100daab49068fe37254.1732130001.git.me@ttaylorr.com>

On Wed, Nov 20, 2024 at 02:13:50PM -0500, Taylor Blau wrote:

> +static const struct git_hash_algo sha1_unsafe_algo = {
> +	.name = "sha1",
> +	.format_id = GIT_SHA1_FORMAT_ID,
> +	.rawsz = GIT_SHA1_RAWSZ,
> +	.hexsz = GIT_SHA1_HEXSZ,
> +	.blksz = GIT_SHA1_BLKSZ,
> +	.init_fn = git_hash_sha1_init_unsafe,
> +	.clone_fn = git_hash_sha1_clone_unsafe,
> +	.update_fn = git_hash_sha1_update_unsafe,
> +	.final_fn = git_hash_sha1_final_unsafe,
> +	.final_oid_fn = git_hash_sha1_final_oid_unsafe,
> +	.empty_tree = &empty_tree_oid,
> +	.empty_blob = &empty_blob_oid,
> +	.null_oid = &null_oid_sha1,
> +};

All of the non-function fields here naturally must match the ones in the
parent algo struct, or chaos ensues. That's a little fragile, but it's
not like we're adding new algorithm variants a lot. The biggest risk, I
guess, would be adding a new field to git_hash_algo which defaults to
zero-initialization here. But again, there are only three total and they
are defined near each other here, so I don't think it's a big risk
overall.

I think this struct is a potential landmine for hash_algo_by_ptr():

  static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
  {
          return p - hash_algos;
  }

It's undefined behavior to pass in sha1_unsafe_algo to this function
(but the compiler would not complain since the types are the same). I
don't find it incredibly likely that somebody would want to do that on
an unsafe variant, but I'm not thrilled about leaving that wart for
somebody to find.

If we don't care about the speed of this function, then an
implementation like:

  for (i = 0; i < GIT_HASH_NALGOS; i++) {
	if (p == &hash_algos[i] || p == hash_algos[i]->unsafe)
		return i;
  }
  return GIT_HASH_UNKNOWN;

would work. I'm not sure if that would be measurable. I was surprised at
the number of places that hash_algo_by_ptr() is called. Many low-level
oid functions need it because we store the integer id there rather than
a direct pointer (so oidread(), oidclr(), oid_object_info_extended(),
and so on). But I'd also expect the loop above to be pretty fast. So I
dunno.

>  const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
>  	{
>  		.name = NULL,
> @@ -241,6 +257,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
>  		.unsafe_update_fn = git_hash_sha1_update_unsafe,
>  		.unsafe_final_fn = git_hash_sha1_final_unsafe,
>  		.unsafe_final_oid_fn = git_hash_sha1_final_oid_unsafe,
> +		.unsafe = &sha1_unsafe_algo,
>  		.empty_tree = &empty_tree_oid,
>  		.empty_blob = &empty_blob_oid,
>  		.null_oid = &null_oid_sha1,

OK, and we can leave the sha256 pointer zero-initialized, since the
function handles that at runtime. Good.

-Peff

  reply	other threads:[~2024-11-21  9:37 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-20 19:13 [PATCH 0/6] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Taylor Blau
2024-11-20 19:13 ` [PATCH 1/6] csum-file: store the hash algorithm as a struct field Taylor Blau
2024-11-21  9:18   ` Jeff King
2024-11-20 19:13 ` [PATCH 2/6] csum-file.c: extract algop from hashfile_checksum_valid() Taylor Blau
2024-11-20 19:13 ` [PATCH 3/6] hash.h: introduce `unsafe_hash_algo()` Taylor Blau
2024-11-21  9:37   ` Jeff King [this message]
2024-11-22  0:39     ` brian m. carlson
2024-11-22  8:25       ` Jeff King
2024-11-22 20:37         ` brian m. carlson
2025-01-10 21:38     ` Taylor Blau
2025-01-11  2:45       ` Jeff King
2024-11-20 19:13 ` [PATCH 4/6] csum-file.c: use unsafe_hash_algo() Taylor Blau
2024-11-20 19:13 ` [PATCH 5/6] t/helper/test-hash.c: " Taylor Blau
2024-11-20 19:13 ` [PATCH 6/6] hash.h: drop unsafe_ function variants Taylor Blau
2024-11-21  9:41   ` Jeff King
2025-01-08 19:14 ` [PATCH v2 0/8] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Taylor Blau
2025-01-08 19:14   ` [PATCH v2 1/8] t/helper/test-tool: implement sha1-unsafe helper Taylor Blau
2025-01-08 19:14   ` [PATCH v2 2/8] csum-file: store the hash algorithm as a struct field Taylor Blau
2025-01-16 11:48     ` Patrick Steinhardt
2025-01-17 21:17       ` Taylor Blau
2025-01-08 19:14   ` [PATCH v2 3/8] csum-file.c: extract algop from hashfile_checksum_valid() Taylor Blau
2025-01-08 19:14   ` [PATCH v2 4/8] hash.h: introduce `unsafe_hash_algo()` Taylor Blau
2025-01-16 11:49     ` Patrick Steinhardt
2025-01-17 21:18       ` Taylor Blau
2025-01-08 19:14   ` [PATCH v2 5/8] csum-file.c: use unsafe_hash_algo() Taylor Blau
2025-01-08 19:14   ` [PATCH v2 6/8] t/helper/test-hash.c: " Taylor Blau
2025-01-08 19:14   ` [PATCH v2 7/8] csum-file: introduce hashfile_checkpoint_init() Taylor Blau
2025-01-10 10:37     ` Jeff King
2025-01-10 21:50       ` Taylor Blau
2025-01-17 21:30         ` Taylor Blau
2025-01-18 12:15           ` Jeff King
2025-01-08 19:14   ` [PATCH v2 8/8] hash.h: drop unsafe_ function variants Taylor Blau
2025-01-10 10:41   ` [PATCH v2 0/8] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Jeff King
2025-01-10 21:29     ` Taylor Blau
2025-01-11  2:42       ` Jeff King
2025-01-11  0:14   ` Junio C Hamano
2025-01-11 17:14     ` Taylor Blau
2025-01-17 22:03 ` [PATCH v3 " Taylor Blau
2025-01-17 22:03   ` [PATCH v3 1/8] t/helper/test-tool: implement sha1-unsafe helper Taylor Blau
2025-01-17 22:03   ` [PATCH v3 2/8] csum-file: store the hash algorithm as a struct field Taylor Blau
2025-01-17 22:03   ` [PATCH v3 3/8] csum-file.c: extract algop from hashfile_checksum_valid() Taylor Blau
2025-01-17 22:03   ` [PATCH v3 4/8] hash.h: introduce `unsafe_hash_algo()` Taylor Blau
2025-01-17 22:03   ` [PATCH v3 5/8] csum-file.c: use unsafe_hash_algo() Taylor Blau
2025-01-17 22:03   ` [PATCH v3 6/8] t/helper/test-hash.c: " Taylor Blau
2025-01-17 22:03   ` [PATCH v3 7/8] csum-file: introduce hashfile_checkpoint_init() Taylor Blau
2025-01-17 22:03   ` [PATCH v3 8/8] hash.h: drop unsafe_ function variants Taylor Blau
2025-01-18 12:28   ` [PATCH v3 0/8] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Jeff King
2025-01-18 12:43     ` Jeff King
2025-01-22 21:31       ` Junio C Hamano
2025-01-23 17:34 ` [PATCH v4 " Taylor Blau
2025-01-23 17:34   ` [PATCH v4 1/8] t/helper/test-tool: implement sha1-unsafe helper Taylor Blau
2025-01-23 17:34   ` [PATCH v4 2/8] csum-file: store the hash algorithm as a struct field Taylor Blau
2025-01-23 17:34   ` [PATCH v4 3/8] csum-file.c: extract algop from hashfile_checksum_valid() Taylor Blau
2025-01-23 17:34   ` [PATCH v4 4/8] hash.h: introduce `unsafe_hash_algo()` Taylor Blau
2025-01-23 17:34   ` [PATCH v4 5/8] csum-file.c: use unsafe_hash_algo() Taylor Blau
2025-01-23 17:34   ` [PATCH v4 6/8] t/helper/test-hash.c: " Taylor Blau
2025-01-23 17:34   ` [PATCH v4 7/8] csum-file: introduce hashfile_checkpoint_init() Taylor Blau
2025-01-23 17:34   ` [PATCH v4 8/8] hash.h: drop unsafe_ function variants Taylor Blau
2025-01-23 18:30   ` [PATCH v4 0/8] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Junio C Hamano
2025-01-23 18:50     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241121093731.GD602681@coredump.intra.peff.net \
    --to=peff@peff$(echo .)net \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=me@ttaylorr$(echo .)com \
    --cc=newren@gmail$(echo .)com \
    --cc=sandals@crustytoothpaste$(echo .)net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox