public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Jeff King <peff@peff•net>
To: git@vger•kernel.org
Cc: correctmost <cmlists@sent•com>, Taylor Blau <me@ttaylorr•com>
Subject: [PATCH 8/9] fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated
Date: Wed, 12 Nov 2025 03:10:40 -0500	[thread overview]
Message-ID: <20251112081040.GH979063@coredump.intra.peff.net> (raw)
In-Reply-To: <20251112075522.GA978866@coredump.intra.peff.net>

In fsck_ident(), we parse the timestamp with parse_timestamp(), which is
really an alias for strtoumax(). But since our buffer may not be
NUL-terminated, this can trigger a complaint from ASan's
strict_string_checks mode. This is a false positive, since we know that
the buffer contains a trailing newline (which we checked earlier in the
function), and that strtoumax() would stop there.

But it is worth working around ASan's complaint. One is because that
will let us turn on strict_string_checks by default, which has helped
catch other real problems. And two is that the safety of the current
code is very hard to reason about (it subtly depends on distant code
which could change).

One option here is to just parse the number left-to-right ourselves. But
we care about the size of a timestamp_t and detecting overflow, since
that's part of the point of these checks. And doing that correctly is
tricky. So we'll instead just pull the digits into a separate,
NUL-terminated buffer, and use that to call parse_timestamp().

Signed-off-by: Jeff King <peff@peff•net>
---
There's one step we could take after this commit, which is to annotate
all of the spots that look at "p" to do:

  -if (*p != ' ')
  +if (p >= ident_end || *p != ' ')
	return report(..., "expected space");

or similar. And then I think it would be safe to call fsck_ident() with
a buffer that is not NUL-terminated (and does not have our "safety"
newline). I stopped short for this series because we are just trying to
appease ASan (and not fixing real bugs), and because the result gets
rather unwieldy. But it might be worth it in the long run. We could
always do it later on top.

Again, I find the strto*() wrapping to be gross. Here we use the
extra-buffer trick. But we still don't get away with avoiding custom
logic (for example, if we ever want to support negative timestamps, the
fsck code will have to recognize "-" signs). But it feels like the best
we can do for now.

 fsck.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/fsck.c b/fsck.c
index 266c965cec..8e8083e7c6 100644
--- a/fsck.c
+++ b/fsck.c
@@ -860,13 +860,28 @@ static int verify_headers(const void *data, unsigned long size,
 		FSCK_MSG_UNTERMINATED_HEADER, "unterminated header");
 }
 
+static timestamp_t parse_timestamp_from_buf(const char **start, const char *end)
+{
+	const char *p = *start;
+	char buf[24]; /* big enough for 2^64 */
+	size_t i = 0;
+
+	while (p < end && isdigit(*p)) {
+		if (i >= ARRAY_SIZE(buf) - 1)
+			return TIME_MAX;
+		buf[i++] = *p++;
+	}
+	buf[i] = '\0';
+	*start = p;
+	return parse_timestamp(buf, NULL, 10);
+}
+
 static int fsck_ident(const char **ident, const char *ident_end,
 		      const struct object_id *oid, enum object_type type,
 		      struct fsck_options *options)
 {
 	const char *p = *ident;
 	const char *nl;
-	char *end;
 
 	nl = memchr(p, '\n', ident_end - p);
 	if (!nl)
@@ -918,11 +933,11 @@ static int fsck_ident(const char **ident, const char *ident_end,
 			      "invalid author/committer line - bad date");
 	if (*p == '0' && p[1] != ' ')
 		return report(options, oid, type, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-	if (date_overflows(parse_timestamp(p, &end, 10)))
+	if (date_overflows(parse_timestamp_from_buf(&p, ident_end)))
 		return report(options, oid, type, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-	if (*end != ' ')
+	if (*p != ' ')
 		return report(options, oid, type, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-	p = end + 1;
+	p++;
 	if ((*p != '+' && *p != '-') ||
 	    !isdigit(p[1]) ||
 	    !isdigit(p[2]) ||
-- 
2.52.0.rc1.260.g3e4993586f


  parent reply	other threads:[~2025-11-12  8:10 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-12  7:55 [PATCH 0/9] asan bonanza Jeff King
2025-11-12  7:56 ` [PATCH 1/9] compat/mmap: mark unused argument in git_munmap() Jeff King
2025-11-12  8:01 ` [PATCH 2/9] pack-bitmap: handle name-hash lookups in incremental bitmaps Jeff King
2025-11-12 11:25   ` Patrick Steinhardt
2025-11-13  2:55   ` Taylor Blau
2025-11-18  8:59     ` Jeff King
2025-11-12  8:02 ` [PATCH 3/9] Makefile: turn on NO_MMAP when building with ASan Jeff King
2025-11-12  8:17   ` Collin Funk
2025-11-12 10:31     ` Jeff King
2025-11-12 20:06       ` Collin Funk
2025-11-12 11:26   ` Patrick Steinhardt
2025-11-13  3:12     ` Taylor Blau
2025-11-13  6:34       ` Patrick Steinhardt
2025-11-18  8:49       ` Jeff King
2025-11-13 16:30     ` Junio C Hamano
2025-11-14  7:00       ` Patrick Steinhardt
2025-11-15  2:13         ` Jeff King
2025-11-12  8:05 ` [PATCH 4/9] cache-tree: avoid strtol() on non-string buffer Jeff King
2025-11-12 11:26   ` Patrick Steinhardt
2025-11-13  3:09     ` Taylor Blau
2025-11-18  8:40       ` Jeff King
2025-11-18  8:38     ` Jeff King
2025-11-12  8:06 ` [PATCH 5/9] fsck: assert newline presence in fsck_ident() Jeff King
2025-11-12  8:06 ` [PATCH 6/9] fsck: avoid strcspn() " Jeff King
2025-11-12  8:06 ` [PATCH 7/9] fsck: remove redundant date timestamp check Jeff King
2025-11-12  8:10 ` Jeff King [this message]
2025-11-12 11:25   ` [PATCH 8/9] fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated Patrick Steinhardt
2025-11-12 19:36     ` Junio C Hamano
2025-11-15  2:12     ` Jeff King
2025-11-12  8:10 ` [PATCH 9/9] t: enable ASan's strict_string_checks option Jeff King
2025-11-13  3:17 ` [PATCH 0/9] asan bonanza Taylor Blau
2025-11-18  9:11 ` [PATCH v2 " Jeff King
2025-11-18  9:11   ` [PATCH v2 1/9] compat/mmap: mark unused argument in git_munmap() Jeff King
2025-11-18  9:12   ` [PATCH v2 2/9] pack-bitmap: handle name-hash lookups in incremental bitmaps Jeff King
2025-11-18  9:12   ` [PATCH v2 3/9] Makefile: turn on NO_MMAP when building with ASan Jeff King
2025-11-18  9:12   ` [PATCH v2 4/9] cache-tree: avoid strtol() on non-string buffer Jeff King
2025-11-18 14:30     ` Phillip Wood
2025-11-23  6:19       ` Junio C Hamano
2025-11-23 15:51         ` Phillip Wood
2025-11-23 18:06           ` Junio C Hamano
2025-11-24 22:30         ` Jeff King
2025-11-24 23:09           ` Junio C Hamano
2025-11-26 15:09             ` Jeff King
2025-11-26 17:22               ` Junio C Hamano
2025-11-30 13:13                 ` [PATCH 0/4] more robust functions for parsing int from buf Jeff King
2025-11-30 13:14                   ` [PATCH 1/4] parse: prefer bool to int for boolean returns Jeff King
2025-12-04 11:23                     ` Patrick Steinhardt
2025-11-30 13:15                   ` [PATCH 2/4] parse: add functions for parsing from non-string buffers Jeff King
2025-11-30 13:46                     ` my complaints with clar Jeff King
2025-12-01 14:16                       ` Phillip Wood
2025-12-04 11:09                         ` Patrick Steinhardt
2025-12-05 18:30                           ` Jeff King
2025-12-04 11:23                     ` [PATCH 2/4] parse: add functions for parsing from non-string buffers Patrick Steinhardt
2025-12-05 16:11                     ` Phillip Wood
2026-01-20 20:54                       ` Junio C Hamano
2026-01-21  5:27                         ` Jeff King
2025-11-30 13:15                   ` [PATCH 3/4] cache-tree: use parse_int_from_buf() Jeff King
2025-11-30 13:16                   ` [PATCH 4/4] fsck: use parse_unsigned_from_buf() for parsing timestamp Jeff King
2025-11-18  9:12   ` [PATCH v2 5/9] fsck: assert newline presence in fsck_ident() Jeff King
2025-11-18  9:12   ` [PATCH v2 6/9] fsck: avoid strcspn() " Jeff King
2025-11-18  9:12   ` [PATCH v2 7/9] fsck: remove redundant date timestamp check Jeff King
2025-11-18  9:12   ` [PATCH v2 8/9] fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated Jeff King
2025-11-18  9:12   ` [PATCH v2 9/9] t: enable ASan's strict_string_checks option Jeff King
2025-11-23  5:49   ` [PATCH v2 0/9] asan bonanza Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251112081040.GH979063@coredump.intra.peff.net \
    --to=peff@peff$(echo .)net \
    --cc=cmlists@sent$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=me@ttaylorr$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox