From: Jeff King <peff@peff•net>
To: Collin Funk <collin.funk1@gmail•com>
Cc: git@vger•kernel.org, correctmost <cmlists@sent•com>,
Taylor Blau <me@ttaylorr•com>
Subject: Re: [PATCH 3/9] Makefile: turn on NO_MMAP when building with ASan
Date: Wed, 12 Nov 2025 05:31:58 -0500 [thread overview]
Message-ID: <20251112103158.GA983233@coredump.intra.peff.net> (raw)
In-Reply-To: <87y0obis17.fsf@gmail.com>
On Wed, Nov 12, 2025 at 12:17:24AM -0800, Collin Funk wrote:
> I see that an interceptor was added in 2023 [1]. Maybe your compiler is
> older than that?
No, I'm using gcc 15.2.0 (from Debian unstable).
But I'm not sure if the linked code does anything useful for us.
One, it's not clear to me if it is even kicking in or not. It only does
anything if the region is "sanitizer managed", according to the details
at https://reviews.llvm.org/D154659. I'm not sure what that means
exactly, because I'm fuzzy on how the shadow map works.
But even when it does do something, it seems to round up to the nearest
page size. But we really want to know if we go even one byte over the
requested length, because if we touch the 1235th byte of a 1234-byte
buffer (which is going to be a NUL because of mmap rounding up the
pages), then there's probably another test case somewhere where we
access the 4097th byte of a 4096-byte buffer (which is going to
segfault).
> char *ptr = mmap (NULL, getpagesize (), PROT_READ | PROT_WRITE,
> MAP_ANONYMOUS, -1, 0);
> if (ptr == NULL)
> abort ();
I think you want to check for MAP_FAILED here, not NULL. And I think we
always get that, because MAP_ANONYMOUS needs to be OR-ed into MAP_SHARED
or MAP_PRIVATE. So here:
> $ gcc -fsanitize=address main.c && ./a.out 2>&1 | grep ^SUMMARY:
> SUMMARY: AddressSanitizer: SEGV (/home/collin/a.out+0x400554) (BuildId: 1b7a82189bfffb3f73d420e138b9859add25901a) in main
> $ clang -fsanitize=address main.c && ./a.out 2>&1 | grep ^SUMMARY:
> SUMMARY: AddressSanitizer: SEGV (/home/collin/a.out+0x4e9ee6) (BuildId: aca1d168eacebaa239082d8a45ab74c8470f4b31) in main
I don't think this is ASan finding a problem. It is just telling us that
we segfaulted for other reasons. And the fault here is because the
broken mmap() invocation returned MAP_FAILED, and we tried to access
that garbage pointer.
> ptr[getpagesize () + 1] = 'a';
This is also making a map that is a multiple of the page size, and then
touching a byte that's on the next page. That's the easy-ish case that
we can often already find, even without ASan (though it depends on what
comes after the mapped memory; it might be a valid page).
A more interesting test for Git is to actually map a file, like:
$ cat main.c
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <stdio.h>
static void die(const char *msg)
{
perror(msg);
exit(1);
}
int main (int argc, const char **argv)
{
struct stat st;
int fd;
char *ptr;
fd = open(argv[1], O_RDONLY);
if (fd < 0)
die("open");
if (fstat(fd, &st) < 0)
die("fstat");
ptr = mmap (NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
if (ptr == MAP_FAILED)
die("mmap");
printf("last byte: %d\n", ptr[st.st_size-1]);
printf("one byte after: %d\n", ptr[st.st_size]);
return 0;
}
$ yes | head -c 4096 >big
$ yes | head -c 372 >small
And ASan does often detect the problem for the "big" page-sized file,
but not consistently! If I do:
gcc -fsanitize=address main.c
while ./a.out big; do echo ok; done
I may get output like:
last byte: 10
one byte after: 127
ok
last byte: 10
one byte after: 0
ok
last byte: 10
one byte after: 0
ok
last byte: 10
=================================================================
==988617==ERROR: AddressSanitizer: unknown-crash on address 0x7efd40b9f000 at pc 0x564fe77b64eb bp 0x7ffff49e8160 sp 0x7ffff49e8158
READ of size 1 at 0x7efd40b9f000 thread T0
#0 0x564fe77b64ea in main (/home/peff/a.out+0x14ea) (BuildId: 8db121bb5c048cb336f8be729e8cefebd6f059a3)
#1 0x7efd41233ca7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#2 0x7efd41233d64 in __libc_start_main_impl ../csu/libc-start.c:360
#3 0x564fe77b6150 in _start (/home/peff/a.out+0x1150) (BuildId: 8db121bb5c048cb336f8be729e8cefebd6f059a3)
Address 0x7efd40b9f000 is a wild pointer inside of access range of size 0x000000000001.
So it worked three times without ASan noticing the problem (producing
two different outputs), and then ASan finally crashed. But it didn't
give us the usual information we get for a malloc overflow. It's just an
"unknown crash" from a "wild pointer". So I'm not sure if it's even
finding these through its own poisoning, and not just catching an
unlucky segfault.
If we switch to the small file, then ASan never reports anything! The OS
gives us a page-sized chunk, so we consistently read a "0" in from the
byte after our requested size.
If we swap out the mmap for:
ptr = malloc(st.st_size);
read(fd, ptr, st.st_size);
(which is roughly what our NO_MMAP wrapper is doing behind the scenes),
then ASan does catch it consistently, even for the "small" file:
$ ./a.out small
=================================================================
==1008630==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7c7d11fe01b4 at pc 0x55cf89b1b4c8 bp 0x7fff471b84e0 sp 0x7fff471b84d8
READ of size 1 at 0x7c7d11fe01b4 thread T0
#0 0x55cf89b1b4c7 in main (/home/peff/a.out+0x14c7) (BuildId: 2cebfcd0a00064eaaed750af010fcecdae2f5666)
#1 0x7f4d12e33ca7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#2 0x7f4d12e33d64 in __libc_start_main_impl ../csu/libc-start.c:360
#3 0x55cf89b1b150 in _start (/home/peff/a.out+0x1150) (BuildId: 2cebfcd0a00064eaaed750af010fcecdae2f5666)
0x7c7d11fe01b4 is located 0 bytes after 372-byte region [0x7c7d11fe0040,0x7c7d11fe01b4)
allocated by thread T0 here:
#0 0x7f4d1311a0ab in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:67
#1 0x55cf89b1b3a0 in main (/home/peff/a.out+0x13a0) (BuildId: 2cebfcd0a00064eaaed750af010fcecdae2f5666)
#2 0x7f4d12e33ca7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
-Peff
next prev parent reply other threads:[~2025-11-12 10:32 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-12 7:55 [PATCH 0/9] asan bonanza Jeff King
2025-11-12 7:56 ` [PATCH 1/9] compat/mmap: mark unused argument in git_munmap() Jeff King
2025-11-12 8:01 ` [PATCH 2/9] pack-bitmap: handle name-hash lookups in incremental bitmaps Jeff King
2025-11-12 11:25 ` Patrick Steinhardt
2025-11-13 2:55 ` Taylor Blau
2025-11-18 8:59 ` Jeff King
2025-11-12 8:02 ` [PATCH 3/9] Makefile: turn on NO_MMAP when building with ASan Jeff King
2025-11-12 8:17 ` Collin Funk
2025-11-12 10:31 ` Jeff King [this message]
2025-11-12 20:06 ` Collin Funk
2025-11-12 11:26 ` Patrick Steinhardt
2025-11-13 3:12 ` Taylor Blau
2025-11-13 6:34 ` Patrick Steinhardt
2025-11-18 8:49 ` Jeff King
2025-11-13 16:30 ` Junio C Hamano
2025-11-14 7:00 ` Patrick Steinhardt
2025-11-15 2:13 ` Jeff King
2025-11-12 8:05 ` [PATCH 4/9] cache-tree: avoid strtol() on non-string buffer Jeff King
2025-11-12 11:26 ` Patrick Steinhardt
2025-11-13 3:09 ` Taylor Blau
2025-11-18 8:40 ` Jeff King
2025-11-18 8:38 ` Jeff King
2025-11-12 8:06 ` [PATCH 5/9] fsck: assert newline presence in fsck_ident() Jeff King
2025-11-12 8:06 ` [PATCH 6/9] fsck: avoid strcspn() " Jeff King
2025-11-12 8:06 ` [PATCH 7/9] fsck: remove redundant date timestamp check Jeff King
2025-11-12 8:10 ` [PATCH 8/9] fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated Jeff King
2025-11-12 11:25 ` Patrick Steinhardt
2025-11-12 19:36 ` Junio C Hamano
2025-11-15 2:12 ` Jeff King
2025-11-12 8:10 ` [PATCH 9/9] t: enable ASan's strict_string_checks option Jeff King
2025-11-13 3:17 ` [PATCH 0/9] asan bonanza Taylor Blau
2025-11-18 9:11 ` [PATCH v2 " Jeff King
2025-11-18 9:11 ` [PATCH v2 1/9] compat/mmap: mark unused argument in git_munmap() Jeff King
2025-11-18 9:12 ` [PATCH v2 2/9] pack-bitmap: handle name-hash lookups in incremental bitmaps Jeff King
2025-11-18 9:12 ` [PATCH v2 3/9] Makefile: turn on NO_MMAP when building with ASan Jeff King
2025-11-18 9:12 ` [PATCH v2 4/9] cache-tree: avoid strtol() on non-string buffer Jeff King
2025-11-18 14:30 ` Phillip Wood
2025-11-23 6:19 ` Junio C Hamano
2025-11-23 15:51 ` Phillip Wood
2025-11-23 18:06 ` Junio C Hamano
2025-11-24 22:30 ` Jeff King
2025-11-24 23:09 ` Junio C Hamano
2025-11-26 15:09 ` Jeff King
2025-11-26 17:22 ` Junio C Hamano
2025-11-30 13:13 ` [PATCH 0/4] more robust functions for parsing int from buf Jeff King
2025-11-30 13:14 ` [PATCH 1/4] parse: prefer bool to int for boolean returns Jeff King
2025-12-04 11:23 ` Patrick Steinhardt
2025-11-30 13:15 ` [PATCH 2/4] parse: add functions for parsing from non-string buffers Jeff King
2025-11-30 13:46 ` my complaints with clar Jeff King
2025-12-01 14:16 ` Phillip Wood
2025-12-04 11:09 ` Patrick Steinhardt
2025-12-05 18:30 ` Jeff King
2025-12-04 11:23 ` [PATCH 2/4] parse: add functions for parsing from non-string buffers Patrick Steinhardt
2025-12-05 16:11 ` Phillip Wood
2026-01-20 20:54 ` Junio C Hamano
2026-01-21 5:27 ` Jeff King
2025-11-30 13:15 ` [PATCH 3/4] cache-tree: use parse_int_from_buf() Jeff King
2025-11-30 13:16 ` [PATCH 4/4] fsck: use parse_unsigned_from_buf() for parsing timestamp Jeff King
2025-11-18 9:12 ` [PATCH v2 5/9] fsck: assert newline presence in fsck_ident() Jeff King
2025-11-18 9:12 ` [PATCH v2 6/9] fsck: avoid strcspn() " Jeff King
2025-11-18 9:12 ` [PATCH v2 7/9] fsck: remove redundant date timestamp check Jeff King
2025-11-18 9:12 ` [PATCH v2 8/9] fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated Jeff King
2025-11-18 9:12 ` [PATCH v2 9/9] t: enable ASan's strict_string_checks option Jeff King
2025-11-23 5:49 ` [PATCH v2 0/9] asan bonanza Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251112103158.GA983233@coredump.intra.peff.net \
--to=peff@peff$(echo .)net \
--cc=cmlists@sent$(echo .)com \
--cc=collin.funk1@gmail$(echo .)com \
--cc=git@vger$(echo .)kernel.org \
--cc=me@ttaylorr$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox