From: Jeff King <peff@peff•net>
To: Junio C Hamano <gitster@pobox•com>
Cc: Phillip Wood <phillip.wood123@gmail•com>,
git@vger•kernel.org, Patrick Steinhardt <ps@pks•im>,
correctmost <cmlists@sent•com>, Taylor Blau <me@ttaylorr•com>
Subject: [PATCH 0/4] more robust functions for parsing int from buf
Date: Sun, 30 Nov 2025 08:13:51 -0500
Message-ID: <20251130131351.GA198697@coredump.intra.peff.net>
In-Reply-To: <xmqqldjsogip.fsf@gitster.g>
On Wed, Nov 26, 2025 at 09:22:38AM -0800, Junio C Hamano wrote:
> Jeff King <peff@peff•net> writes:
>
> > Hmm, I thought both of those things were reasonably clever. The other
> > obvious way to do it, AFAICT, is to use checked-operation intrinsics or
> > add unsigned_add_overflows() before every operation.
>
> Yup, but the thing is, I didn't want something "clever". I prefer
> "clean and obvious" if we add extra code for safety.

Yeah, that's fair. It turns out that one half of that is easy (checking
for overflow as we compute the number). And one half is hard: finding
the lower bound of the range. If you don't assume a twos-complement
style range where "min = -max - 1", then you are stuck using INT_MIN.
Which is OK for "int", but not for arbitrary types. We already make the
same assumption in git_parse_int(), etc.

So I went with that approach here, but it is at least documented
clearly.

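To make the two halves concrete, here is a minimal sketch in C. It is not the code from this series: the names sketch_parse_signed() and sketch_parse_int(), and their signatures, are hypothetical and chosen only for illustration. It checks for overflow before each accumulation step and derives the lower bound from the caller-supplied maximum, assuming the "min = -max - 1" range described above.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): parse a decimal integer from
 * buf[0..len), which need not be NUL-terminated. The caller passes the
 * maximum of the target type; the minimum is assumed to be -max - 1.
 * On success, store the value in *out and the number of bytes consumed
 * in *consumed, and return true.
 */
static bool sketch_parse_signed(const char *buf, size_t len, intmax_t max,
				intmax_t *out, size_t *consumed)
{
	uintmax_t val = 0, limit;
	size_t i = 0;
	bool negative = false;

	if (i < len && (buf[i] == '-' || buf[i] == '+')) {
		negative = (buf[i] == '-');
		i++;
	}
	if (i == len || buf[i] < '0' || buf[i] > '9')
		return false; /* no digits at all */

	/* Largest allowed magnitude: max, or max + 1 for negative values. */
	limit = (uintmax_t)max + (negative ? 1 : 0);

	for (; i < len && buf[i] >= '0' && buf[i] <= '9'; i++) {
		unsigned digit = (unsigned)(buf[i] - '0');

		/* Reject before computing if val * 10 + digit > limit. */
		if (digit > limit || val > (limit - digit) / 10)
			return false;
		val = val * 10 + digit;
	}

	if (!negative || val == 0)
		*out = (intmax_t)val;
	else
		*out = -(intmax_t)(val - 1) - 1; /* avoids negating max + 1 */
	*consumed = i;
	return true;
}

/* Hypothetical wrapper for plain "int", using INT_MAX as the bound. */
static bool sketch_parse_int(const char *buf, size_t len, int *out,
			     size_t *consumed)
{
	intmax_t v;

	if (!sketch_parse_signed(buf, len, INT_MAX, &v, consumed))
		return false;
	*out = (int)v;
	return true;
}
```

The sketch never reads past buf + len, so it does not depend on a terminating NUL the way strtol() does. Whether a trailing non-digit counts as an error is left to the caller, who can compare *consumed against len.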
> > It looks like you merged what I had into 'next'. Where do you want to go
> > from there? I am mostly content to let it be, but we can also try to
> > replace with something like your version.
>
> That is my preference. While the topic is still in 'next', or after
> the topic graduates to 'master'. Either is fine. And it is fine if
> such an update did not come, too. After all, this is to deal with
> contents in a locally generated file (.git/index), so a maliciously
> corrupt string that lacks the expected whitespace character after the
> digit string is a sign that you are trying to burn yourself and you
> have only yourself to blame, isn't it? An attacker that can put
> garbage in your .git/index has better ways to fool you by updating
> your .git/config file that sits next to it. Or teach the sanitizer
> that this code path is already OK somehow?

Yeah, I agree the stakes are low here. Though they were somewhat low to
begin with for the same reason! But I was grossed out enough by the
whole thing that I tried to put together a decent helper for parsing
integers from buffers, and converted both sites here.

I suspect it could be used in other places, too, but I didn't convert
any.

> > Or even, I guess, work on a
> > global strntoi() that could be used everywhere, if we think it is robust
> > enough. (Though technically that name is reserved by the standard, which
> > is a shame, because that is really what this thing is).
>
> Well, we already use plenty of names beginning with 'str' followed
> by a lowercase letter, like strbuf_foo() and string_list_init().

In the end it was sufficiently different from strtoi() that I decided
not to use that name. It was but one of many bike-sheddable decisions,
which I tried to document. So I guess let the flaming commence. ;)

This is built on top of jk/asan-bonanza.

  [1/4]: parse: prefer bool to int for boolean returns
  [2/4]: parse: add functions for parsing from non-string buffers
  [3/4]: cache-tree: use parse_int_from_buf()
  [4/4]: fsck: use parse_unsigned_from_buf() for parsing timestamp

 Makefile                   |   1 +
 cache-tree.c               |  28 ++-----
 compat/posix.h             |   2 +
 fsck.c                     |  20 +----
 parse.c                    | 162 +++++++++++++++++++++++++++++--------
 parse.h                    |  31 +++++--
 t/meson.build              |   1 +
 t/unit-tests/u-parse-int.c |  98 ++++++++++++++++++++++
 8 files changed, 263 insertions(+), 80 deletions(-)
 create mode 100644 t/unit-tests/u-parse-int.c

-Peff

Thread overview: 64+ messages
2025-11-12 7:55 [PATCH 0/9] asan bonanza Jeff King
2025-11-12 7:56 ` [PATCH 1/9] compat/mmap: mark unused argument in git_munmap() Jeff King
2025-11-12 8:01 ` [PATCH 2/9] pack-bitmap: handle name-hash lookups in incremental bitmaps Jeff King
2025-11-12 11:25 ` Patrick Steinhardt
2025-11-13 2:55 ` Taylor Blau
2025-11-18 8:59 ` Jeff King
2025-11-12 8:02 ` [PATCH 3/9] Makefile: turn on NO_MMAP when building with ASan Jeff King
2025-11-12 8:17 ` Collin Funk
2025-11-12 10:31 ` Jeff King
2025-11-12 20:06 ` Collin Funk
2025-11-12 11:26 ` Patrick Steinhardt
2025-11-13 3:12 ` Taylor Blau
2025-11-13 6:34 ` Patrick Steinhardt
2025-11-18 8:49 ` Jeff King
2025-11-13 16:30 ` Junio C Hamano
2025-11-14 7:00 ` Patrick Steinhardt
2025-11-15 2:13 ` Jeff King
2025-11-12 8:05 ` [PATCH 4/9] cache-tree: avoid strtol() on non-string buffer Jeff King
2025-11-12 11:26 ` Patrick Steinhardt
2025-11-13 3:09 ` Taylor Blau
2025-11-18 8:40 ` Jeff King
2025-11-18 8:38 ` Jeff King
2025-11-12 8:06 ` [PATCH 5/9] fsck: assert newline presence in fsck_ident() Jeff King
2025-11-12 8:06 ` [PATCH 6/9] fsck: avoid strcspn() " Jeff King
2025-11-12 8:06 ` [PATCH 7/9] fsck: remove redundant date timestamp check Jeff King
2025-11-12 8:10 ` [PATCH 8/9] fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated Jeff King
2025-11-12 11:25 ` Patrick Steinhardt
2025-11-12 19:36 ` Junio C Hamano
2025-11-15 2:12 ` Jeff King
2025-11-12 8:10 ` [PATCH 9/9] t: enable ASan's strict_string_checks option Jeff King
2025-11-13 3:17 ` [PATCH 0/9] asan bonanza Taylor Blau
2025-11-18 9:11 ` [PATCH v2 " Jeff King
2025-11-18 9:11 ` [PATCH v2 1/9] compat/mmap: mark unused argument in git_munmap() Jeff King
2025-11-18 9:12 ` [PATCH v2 2/9] pack-bitmap: handle name-hash lookups in incremental bitmaps Jeff King
2025-11-18 9:12 ` [PATCH v2 3/9] Makefile: turn on NO_MMAP when building with ASan Jeff King
2025-11-18 9:12 ` [PATCH v2 4/9] cache-tree: avoid strtol() on non-string buffer Jeff King
2025-11-18 14:30 ` Phillip Wood
2025-11-23 6:19 ` Junio C Hamano
2025-11-23 15:51 ` Phillip Wood
2025-11-23 18:06 ` Junio C Hamano
2025-11-24 22:30 ` Jeff King
2025-11-24 23:09 ` Junio C Hamano
2025-11-26 15:09 ` Jeff King
2025-11-26 17:22 ` Junio C Hamano
2025-11-30 13:13 ` Jeff King [this message]
2025-11-30 13:14 ` [PATCH 1/4] parse: prefer bool to int for boolean returns Jeff King
2025-12-04 11:23 ` Patrick Steinhardt
2025-11-30 13:15 ` [PATCH 2/4] parse: add functions for parsing from non-string buffers Jeff King
2025-11-30 13:46 ` my complaints with clar Jeff King
2025-12-01 14:16 ` Phillip Wood
2025-12-04 11:09 ` Patrick Steinhardt
2025-12-05 18:30 ` Jeff King
2025-12-04 11:23 ` [PATCH 2/4] parse: add functions for parsing from non-string buffers Patrick Steinhardt
2025-12-05 16:11 ` Phillip Wood
2026-01-20 20:54 ` Junio C Hamano
2026-01-21 5:27 ` Jeff King
2025-11-30 13:15 ` [PATCH 3/4] cache-tree: use parse_int_from_buf() Jeff King
2025-11-30 13:16 ` [PATCH 4/4] fsck: use parse_unsigned_from_buf() for parsing timestamp Jeff King
2025-11-18 9:12 ` [PATCH v2 5/9] fsck: assert newline presence in fsck_ident() Jeff King
2025-11-18 9:12 ` [PATCH v2 6/9] fsck: avoid strcspn() " Jeff King
2025-11-18 9:12 ` [PATCH v2 7/9] fsck: remove redundant date timestamp check Jeff King
2025-11-18 9:12 ` [PATCH v2 8/9] fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated Jeff King
2025-11-18 9:12 ` [PATCH v2 9/9] t: enable ASan's strict_string_checks option Jeff King
2025-11-23 5:49 ` [PATCH v2 0/9] asan bonanza Junio C Hamano