From: Jeff King <peff@peff•net>
To: Junio C Hamano <gitster@pobox•com>
Cc: Git List <git@vger•kernel.org>, Wink Saville <wink@saville•com>
Subject: Re: [PATCH 03/14] tree-diff: clear parent array in path_appendnew()
Date: Fri, 10 Jan 2025 05:54:24 -0500 [thread overview]
Message-ID: <20250110105424.GA1014503@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqqo70fj0zu.fsf@gitster.g>
On Thu, Jan 09, 2025 at 10:28:05AM -0800, Junio C Hamano wrote:
> Jeff King <peff@peff•net> writes:
>
> > All of the other functions which allocate a combine_diff_path struct
> > zero out the parent array, but this code path does not. There's no bug,
> > since our caller will fill in most of the fields. But leaving the unused
> > fields (like combine_diff_parent.path) uninitialized makes working with
> > the struct more error-prone than it needs to be.
>
> OK. We however will still not use the array at all when we do not
> need it, so it would be between accessing uninitialized bytes vs
> accessing 0-bytes by mistake? With my devil's advocate hat on, I
> wonder if this would lead to more sloppy users saying "I am not
> following the pointer; I am merely stopping when I see a NULL
> pointer at the end of the array" or something silly like that
> without checking the validity of the array itself (which presumably
> can be inferred by inspecting some other member in the containing
> struct, right?)".
Yes, code may be equally wrong to look at uninitialized versus zero
bytes, depending on what it's doing. I don't think "stop when you see
NULL" is a danger here; this is an array of structs, one of which now
happens to be NULL (rather than an array of char pointers, which might
imply that NULL is the end).
Some of that sloppiness already exists. For instance, before my series,
check out intersect_paths(). If we are removing an element from the
list, we clean it up like this:
for (j = 0; j < num_parent; j++)
if (combined_all_paths &&
filename_changed(p->parent[j].status))
strbuf_release(&p->parent[j].path);
but if we allocated for 3 parents and have only gotten to the second
pass, all of parent[2] will never have been filled in. We zero
initialize the parents in that function, so there's no memory error. But
it is relying on the fact that filename_changed() will reject a zero
status to avoid calling strbuf_release() on a zero'd strbuf (which
incidentally also works, but violates the strbuf API).
Now in that case we are zero-ing, so it is not one of the uninitialized
cases that Wink ran into. But even if he had tried to be careful with:
if (filename_changed(p->parent[i].status))
/* ok to look at p->parent[i].path */
it would not have worked, because that status would have been
uninitialized, too.
> > Let's just zero the parent field to be consistent with the
> > combine_diff_path_new() allocator.
>
> But I like the "let's be consistent" reasoning, so I wouldn't
> complain ;-)
So yeah. This is the part that I think is really helping new code.
Changing the strbuf to a pointer makes it even simpler (you do not even
have to check the status at all), but this is the commit that is
preventing undefined behavior. ;)
-Peff
next prev parent reply other threads:[~2025-01-10 10:54 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-03 19:28 [BUGREPORT] git diff-tree --cc SEGFAUTs Wink Saville
2025-01-03 20:46 ` Jeff King
2025-01-03 23:34 ` Wink Saville
2025-01-04 0:31 ` Jeff King
2025-01-04 2:55 ` Junio C Hamano
2025-01-04 3:32 ` Jeff King
2025-01-04 18:09 ` Wink Saville
2025-01-05 22:13 ` Wink Saville
2025-01-09 8:27 ` [PATCH 0/14] combine-diff cleanups Jeff King
2025-01-09 8:28 ` [PATCH 01/14] run_diff_files(): delay allocation of combine_diff_path Jeff King
2025-01-09 17:57 ` Junio C Hamano
2025-01-09 8:32 ` [PATCH 02/14] combine-diff: add combine_diff_path_new() Jeff King
2025-01-09 18:05 ` Junio C Hamano
2025-01-13 15:40 ` Patrick Steinhardt
2025-01-14 9:29 ` Jeff King
2025-01-09 8:33 ` [PATCH 03/14] tree-diff: clear parent array in path_appendnew() Jeff King
2025-01-09 18:28 ` Junio C Hamano
2025-01-10 10:54 ` Jeff King [this message]
2025-01-09 8:42 ` [PATCH 04/14] combine-diff: use pointer for parent paths Jeff King
2025-01-09 18:49 ` Junio C Hamano
2025-01-09 8:42 ` [PATCH 05/14] diff: add a comment about combine_diff_path.parent.path Jeff King
2025-01-13 15:40 ` Patrick Steinhardt
2025-01-09 8:44 ` [PATCH 06/14] run_diff_files(): de-mystify the size of combine_diff_path struct Jeff King
2025-01-10 16:40 ` Junio C Hamano
2025-01-09 8:46 ` [PATCH 07/14] tree-diff: drop path_appendnew() alloc optimization Jeff King
2025-01-13 15:40 ` Patrick Steinhardt
2025-01-14 10:30 ` Jeff King
2025-01-09 8:49 ` [PATCH 08/14] tree-diff: pass whole path string to path_appendnew() Jeff King
2025-01-13 15:40 ` Patrick Steinhardt
2025-01-14 9:26 ` Jeff King
2025-01-09 8:49 ` [PATCH 09/14] tree-diff: inline path_appendnew() Jeff King
2025-01-11 0:41 ` Junio C Hamano
2025-01-09 8:50 ` [PATCH 10/14] combine-diff: drop public declaration of combine_diff_path_size() Jeff King
2025-01-09 8:51 ` [PATCH 11/14] tree-diff: drop list-tail argument to diff_tree_paths() Jeff King
2025-01-18 0:33 ` Junio C Hamano
2025-01-09 8:53 ` [PATCH 12/14] tree-diff: use the name "tail" to refer to list tail Jeff King
2025-01-09 8:54 ` [PATCH 13/14] tree-diff: simplify emit_path() list management Jeff King
2025-01-09 8:57 ` [PATCH 14/14] tree-diff: make list tail-passing more explicit Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250110105424.GA1014503@coredump.intra.peff.net \
--to=peff@peff$(echo .)net \
--cc=git@vger$(echo .)kernel.org \
--cc=gitster@pobox$(echo .)com \
--cc=wink@saville$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox