public inbox for git@vger.kernel.org 
From: Jeff King <peff@peff•net>
To: git@vger•kernel.org
Cc: Patrick Steinhardt <ps@pks•im>, correctmost <cmlists@sent•com>,
	Taylor Blau <me@ttaylorr•com>
Subject: [PATCH v2 0/9] asan bonanza
Date: Tue, 18 Nov 2025 04:11:27 -0500
Message-ID: <20251118091127.GA4175601@coredump.intra.peff.net>
In-Reply-To: <20251112075522.GA978866@coredump.intra.peff.net>

On Wed, Nov 12, 2025 at 02:55:22AM -0500, Jeff King wrote:

> This series fixes a handful of issues that ASan finds in our test suite
> if we tweak a few options to let it look deeper.

Here's a v2 based on feedback:

  - Added the extra assertion in the midx code (an ASSERT() that each
    index visited while walking the base chain is a midx bitmap).

  - The meson changes are squashed into patch 3.

  - The cache-tree integer parsing is more robust around total garbage
    inputs (with no digits at all). I agree with the reviewers that it
    would be nice to have a robust, reusable integer parsing function.
    But I think it's non-trivial to do (and I left more comments in the
    thread). I'd like to stick here to just fixing the memory issues
    without making anything worse (which I think this version does).

    Note that since the new helper takes an out-parameter, we have to
    match the type more strictly to what the callers have. So it is now
    parse_int(), and not parse_long().
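
    To illustrate the out-parameter shape described above, here is a
    minimal sketch of a length-bounded integer parser. It is not the code
    from patch 4: the name parse_int_sketch(), the use of size_t, the
    single optional '-' and the overflow checks are all assumptions made
    for the example. The point is only that the return value reports
    success or failure while the parsed value travels through `out`, and
    that no byte past the given length is ever read.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical sketch, not the helper from the patch.
 *
 * Parse an optionally negative decimal integer from a buffer of *len_p
 * bytes that is not required to be NUL-terminated. On success, return 0,
 * store the value in *out, and advance *ptr / *len_p past the consumed
 * bytes. On failure (no digits, or the value does not fit in an int),
 * return -1 and leave *ptr, *len_p and *out untouched.
 */
static int parse_int_sketch(const char **ptr, size_t *len_p, int *out)
{
	const char *s = *ptr;
	size_t len = *len_p;
	const char *digits;
	int negative = 0;
	int ret = 0; /* accumulated as a negative number, see below */

	if (len && *s == '-') {
		negative = 1;
		s++;
		len--;
	}

	digits = s;
	while (len && *s >= '0' && *s <= '9') {
		int d = *s - '0';

		/*
		 * Accumulate downwards so that INT_MIN is representable;
		 * bail out before ret * 10 - d could overflow.
		 */
		if (ret < (INT_MIN + d) / 10)
			return -1;
		ret = ret * 10 - d;
		s++;
		len--;
	}

	/* Total garbage (no digits at all) is an error, not a zero. */
	if (s == digits)
		return -1;

	/* A positive result cannot exceed INT_MAX. */
	if (!negative && ret < -INT_MAX)
		return -1;

	*ptr = s;
	*len_p = len;
	*out = negative ? ret : -ret;
	return 0;
}
```

    A caller would still check the delimiter that follows the number
    itself, as read_one() does in the range diff with its
    "!size || *buf != ' '" test.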

Range diff is below.

  [1/9]: compat/mmap: mark unused argument in git_munmap()
  [2/9]: pack-bitmap: handle name-hash lookups in incremental bitmaps
  [3/9]: Makefile: turn on NO_MMAP when building with ASan
  [4/9]: cache-tree: avoid strtol() on non-string buffer
  [5/9]: fsck: assert newline presence in fsck_ident()
  [6/9]: fsck: avoid strcspn() in fsck_ident()
  [7/9]: fsck: remove redundant date timestamp check
  [8/9]: fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated
  [9/9]: t: enable ASan's strict_string_checks option

 Makefile      |  1 +
 cache-tree.c  | 50 ++++++++++++++++++++++++++----------
 compat/mmap.c |  2 +-
 fsck.c        | 71 ++++++++++++++++++++++++++++++++++++---------------
 meson.build   |  8 +++++-
 pack-bitmap.c | 29 ++++++++++++++++++---
 t/test-lib.sh |  1 +
 7 files changed, 122 insertions(+), 40 deletions(-)

 1:  e24015d41b =  1:  3ce5bd39b5 compat/mmap: mark unused argument in git_munmap()
 2:  e217fb0e3b !  2:  9908283c33 pack-bitmap: handle name-hash lookups in incremental bitmaps
    @@ pack-bitmap.c: static uint32_t bitmap_num_objects(struct bitmap_index *index)
     +static uint32_t bitmap_name_hash(struct bitmap_index *index, uint32_t pos)
     +{
     +	if (bitmap_is_midx(index)) {
    -+		while (index && pos < index->midx->num_objects_in_base)
    ++		while (index && pos < index->midx->num_objects_in_base) {
    ++			ASSERT(bitmap_is_midx(index));
     +			index = index->base;
    ++		}
     +
     +		if (!index)
     +			BUG("NULL base bitmap for object position: %"PRIu32, pos);
 3:  8c85dad3c5 <  -:  ---------- Makefile: turn on NO_MMAP when building with ASan
 -:  ---------- >  3:  fe3421f6ec Makefile: turn on NO_MMAP when building with ASan
 4:  38d42984da !  4:  5e228f2c90 cache-tree: avoid strtol() on non-string buffer
    @@ Commit message
              further. You'd mostly get stopped by seeing non-digits in the oid
              field (and if it is likewise truncated, there will still be 20 or
              more bytes of the index checksum). So it's possible, though
    -         unlikely, to see read off the end of the mmap'd buffer. Of course a
    +         unlikely, to read off the end of the mmap'd buffer. Of course a
              malicious index file can fake the oid and the index checksum to all
              (ASCII) 0's.
     
    @@ cache-tree.c: void cache_tree_write(struct strbuf *sb, struct cache_tree *root)
      	trace2_region_leave("cache_tree", "write", the_repository);
      }
      
    -+static long parse_long(const char **ptr, unsigned long *len_p)
    ++static int parse_int(const char **ptr, unsigned long *len_p, int *out)
     +{
     +	const char *s = *ptr;
     +	unsigned long len = *len_p;
    -+	long ret = 0;
    ++	int ret = 0;
     +	int sign = 1;
     +
     +	while (len && *s == '-') {
    @@ cache-tree.c: void cache_tree_write(struct strbuf *sb, struct cache_tree *root)
     +		s++;
     +		len--;
     +	}
    ++
    ++	if (s == *ptr)
    ++		return -1;
    ++
     +	*ptr = s;
     +	*len_p = len;
    -+	return sign * ret;
    ++	*out = sign * ret;
    ++	return 0;
     +}
     +
      static struct cache_tree *read_one(const char **buffer, unsigned long *size_p)
    @@ cache-tree.c: static struct cache_tree *read_one(const char **buffer, unsigned l
     -	cp = buf;
     -	it->entry_count = strtol(cp, &ep, 10);
     -	if (cp == ep)
    -+	it->entry_count = parse_long(&buf, &size);
    -+	if (!size || *buf != ' ')
    ++	if (parse_int(&buf, &size, &it->entry_count) < 0)
      		goto free_return;
     -	cp = ep;
     -	subtree_nr = strtol(cp, &ep, 10);
     -	if (cp == ep)
    --		goto free_return;
    ++	if (!size || *buf != ' ')
    + 		goto free_return;
     -	while (size && *buf && *buf != '\n') {
     -		size--;
     -		buf++;
     -	}
     -	if (!size)
     +	buf++; size--;
    -+	subtree_nr = parse_long(&buf, &size);
    ++	if (parse_int(&buf, &size, &subtree_nr) < 0)
    ++		goto free_return;
     +	if (!size || *buf != '\n')
      		goto free_return;
      	buf++; size--;
 5:  73e921a34e =  5:  1d6814233c fsck: assert newline presence in fsck_ident()
 6:  95e8961df9 =  6:  8cf8152449 fsck: avoid strcspn() in fsck_ident()
 7:  34baa85dae =  7:  563c3006e4 fsck: remove redundant date timestamp check
 8:  f5ff2dc8ef =  8:  6f88309d76 fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated
 9:  1b5c0e7ce7 =  9:  ad1a1f6a82 t: enable ASan's strict_string_checks option
