From: Jeff King <peff@peff•net>
To: Junio C Hamano <gitster@pobox•com>
Cc: Karthik Nayak <karthik.188@gmail•com>,
shejialuo <shejialuo@gmail•com>,
git@vger•kernel.org, Patrick Steinhardt <ps@pks•im>
Subject: Re: [PATCH v2 2/4] string-list: replace negative index encoding with "exact_match" parameter
Date: Thu, 9 Oct 2025 01:52:37 -0400 [thread overview]
Message-ID: <20251009055237.GC1614343@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqq5xd6irmu.fsf@gitster.g>
On Thu, Sep 25, 2025 at 06:33:13AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff•net> writes:
>
> > Yes, but it's much harder to wrap a size_t, especially if the code is
> > allocating as it goes (e.g., a loop expanding an array). Because if
> > expanding your allocation from "n" to "n+k" items will overflow, then
> > that implies the current allocation is within "k" items of filling up
> > the entire memory space.
>
> We'd be protecting ourselves by noticing that n+k wraps around with
> st_add() and friends, and relying on malloc() and realloc() to
> notice and signal an error. Use of size_t to count the number of
> things that are getting allocated is not making these any easier to
> do compared to the case you were counting in "int", no? Either way
> we'd need to be careful.
Yes and no. I agree it is probably good to check for overflow as a
general principle, even if using size_t. But it is also easy to miss
such spots, and I do think using size_t (or something similarly large)
can provide some backup safety.
If you are getting values "foo" and "bar" from the user and computing a
length like "size_t len = foo * bar", then the size of the type will not
help you, and you need to do checked arithmetic. But I think those cases
are relatively easy to spot. The much more insidious ones are those that
append to a data structure one item at a time. With a type that is close
to the practical size of memory, you will fail to grow your data
structure before you hit the overflow. With a much smaller type like
int (on an LP64 platform), it is much easier for malicious input to
cause funky wrapping[1].
If your position is "it would not matter if we were properly checking
for overflow", then I agree. I just think it's hard to catch all of the
spots. But maybe I am being overly pessimistic. _Most_ of that should go
through ALLOC_GROW() or similar, so the checks could be centralized-ish[2].
-Peff
[1] We have had (and I'd wager probably still have) bugs where an
attacker can wrap around to negative int. We get saved from a
vulnerability here because the negative value is eventually cast to
a gigantic size_t to allocate, which fails. In that sense I think
"unsigned int" is the _most_ dangerous type to use.
[2] I think there are some practical implementation questions here, too.
If we use st_add() etc in ALLOC_GROW(), then we may get caught by
accidental type promotions, where those functions say "sure, it is
OK to increase this len", but then return a size_t which is
truncated when we try to stuff it back in an "int". So to make
ALLOC_GROW() type independent, we probably need to do some more
macro hackery with our checked arithmetic to make sure we are
operating with the same size type that the caller is using. Not
impossible, but something we'd have to be careful to get right.
next prev parent reply other threads:[~2025-10-09 5:52 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-07 16:40 [PATCH 0/4] enhance string-list API to fix sign compare warnings shejialuo
2025-09-07 16:42 ` [PATCH 1/4] string-list: allow passing NULL for `get_entry_index` shejialuo
2025-09-09 6:22 ` Patrick Steinhardt
2025-09-07 16:42 ` [PATCH 2/4] string-list: replace negative index encoding with "exact_match" parameter shejialuo
2025-09-09 6:22 ` Patrick Steinhardt
2025-09-15 12:11 ` shejialuo
2025-09-07 16:42 ` [PATCH 3/4] string-list: change "string_list_find_insert_index" return type to "size_t" shejialuo
2025-09-09 6:23 ` Patrick Steinhardt
2025-09-09 19:21 ` Junio C Hamano
2025-09-10 4:57 ` Patrick Steinhardt
2025-09-07 16:42 ` [PATCH 4/4] refs: enable sign compare warnings check shejialuo
2025-09-09 6:23 ` Patrick Steinhardt
2025-09-07 16:43 ` [PATCH 0/4] enhance string-list API to fix sign compare warnings shejialuo
2025-09-17 9:18 ` [PATCH v2 " shejialuo
2025-09-17 9:19 ` [PATCH v2 1/4] string-list: use bool instead of int for "exact_match" shejialuo
2025-09-17 9:19 ` [PATCH v2 2/4] string-list: replace negative index encoding with "exact_match" parameter shejialuo
2025-09-23 8:14 ` Patrick Steinhardt
2025-10-05 13:31 ` shejialuo
2025-09-23 9:35 ` Karthik Nayak
2025-09-23 18:48 ` Junio C Hamano
2025-09-24 5:36 ` Jeff King
2025-09-24 13:20 ` Junio C Hamano
2025-09-25 2:50 ` Jeff King
2025-09-25 13:33 ` Junio C Hamano
2025-10-09 5:52 ` Jeff King [this message]
2025-10-08 1:49 ` Collin Funk
2025-10-09 5:55 ` Jeff King
2025-10-05 14:11 ` shejialuo
2025-10-05 14:06 ` shejialuo
2025-09-17 9:20 ` [PATCH v2 3/4] string-list: change "string_list_find_insert_index" return type to "size_t" shejialuo
2025-09-23 9:44 ` Karthik Nayak
2025-10-05 9:29 ` shejialuo
2025-09-17 9:20 ` [PATCH v2 4/4] refs: enable sign compare warnings check shejialuo
2025-10-06 6:28 ` [PATCH v3 0/4] enhance string-list API to fix sign compare warnings shejialuo
2025-10-06 6:32 ` [PATCH v3 1/4] string-list: use bool instead of int for "exact_match" shejialuo
2025-10-06 6:32 ` [PATCH v3 2/4] string-list: replace negative index encoding with "exact_match" parameter shejialuo
2025-10-06 6:32 ` [PATCH v3 3/4] string-list: change "string_list_find_insert_index" return type to "size_t" shejialuo
2025-10-09 6:03 ` Jeff King
2025-10-06 6:32 ` [PATCH v3 4/4] refs: enable sign compare warnings check shejialuo
2025-10-06 22:09 ` [PATCH v3 0/4] enhance string-list API to fix sign compare warnings Junio C Hamano
2025-10-08 1:52 ` Collin Funk
2025-10-08 15:56 ` Junio C Hamano
2025-10-08 8:11 ` Karthik Nayak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251009055237.GC1614343@coredump.intra.peff.net \
--to=peff@peff$(echo .)net \
--cc=git@vger$(echo .)kernel.org \
--cc=gitster@pobox$(echo .)com \
--cc=karthik.188@gmail$(echo .)com \
--cc=ps@pks$(echo .)im \
--cc=shejialuo@gmail$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox