From: Junio C Hamano <gitster@pobox•com>
To: "René Scharfe" <l.s.r@web•de>
Cc: mqudsi@neosmart•net, git@vger•kernel.org,
Giuseppe Bilotta <giuseppe.bilotta@gmail•com>
Subject: Re: [PATCH] apply: avoid out-of-bounds access in fuzzy_matchlines()
Date: Sun, 12 Nov 2017 13:45:47 +0900
Message-ID: <xmqqinegcdfo.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <6ff27de7-ac22-3f2f-1f3b-2e0e6f10605a@web.de> ("René Scharfe"'s message of "Sat, 11 Nov 2017 15:10:19 +0100")

René Scharfe <l.s.r@web•de> writes:

> fuzzy_matchlines() uses pointers to the first and last characters of
> two lines to keep track while matching them. This makes it impossible
> to deal with empty strings. It accesses characters before the start of
> empty lines. It can also access characters after the end when checking
> for trailing whitespace in the main loop.
>
> Avoid that by using pointers to the first character and the one *after*
> the last one. This is well-defined as long as the latter is not
> dereferenced. Basically rewrite the function based on that premise; it
> becomes much simpler as a result. There is no need to check for
> leading whitespace outside of the main loop anymore.

I recall vaguely that we were bitten by a bug or two due to another
instance of <begin,end> that deviates from the usual "closed on the
left end, open on the right end" convention somewhere in the system
recently?
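
A minimal sketch of that half-open convention, not code taken from
apply.c: the helper name trimmed_len() is hypothetical. It keeps a
pointer to the first byte and one to the byte just past the last, so an
empty line (n == 0) is handled without reading before the start of the
buffer.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: return the length of
 * the line at s (n bytes) with trailing CR/LF bytes removed.
 */
static size_t trimmed_len(const char *s, size_t n)
{
	const char *end = s + n;	/* one past the last byte; never dereferenced */

	/* end[-1] is read only while the range [s, end) is non-empty */
	while (s < end && (end[-1] == '\r' || end[-1] == '\n'))
		end--;
	return (size_t)(end - s);
}
```

With the closed <first,last> form, last = s + n - 1 already points
before the buffer when n is 0, which is the situation the patch avoids.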

I think the fix of the function is correct, but at the same time, we
would want to clean it up after this fix lands by replacing the
function with the line comparison function we already have in the
xdiff/ layer, so that we (1) reduce the code duplication and (2)
more importantly, are no longer constrained by the (mistakenly
narrow) policy decision we currently seem to have to support only
"ignore-whitespace-change" and nothing else.  Of course, that should
not be done as part of this fix.  It is strictly a #leftoverbits item.

Thanks.

> Reported-by: Mahmoud Al-Qudsi <mqudsi@neosmart•net>
> Signed-off-by: Rene Scharfe <l.s.r@web•de>
> ---
> apply.c | 59 ++++++++++++++++++++---------------------------------------
> 1 file changed, 20 insertions(+), 39 deletions(-)
>
> diff --git a/apply.c b/apply.c
> index d676debd59..b8087bd29c 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -300,52 +300,33 @@ static uint32_t hash_line(const char *cp, size_t len)
>  static int fuzzy_matchlines(const char *s1, size_t n1,
>  			    const char *s2, size_t n2)
>  {
> -	const char *last1 = s1 + n1 - 1;
> -	const char *last2 = s2 + n2 - 1;
> -	int result = 0;
> +	const char *end1 = s1 + n1;
> +	const char *end2 = s2 + n2;
>
>  	/* ignore line endings */
> -	while ((*last1 == '\r') || (*last1 == '\n'))
> -		last1--;
> -	while ((*last2 == '\r') || (*last2 == '\n'))
> -		last2--;
> -
> -	/* skip leading whitespaces, if both begin with whitespace */
> -	if (s1 <= last1 && s2 <= last2 && isspace(*s1) && isspace(*s2)) {
> -		while (isspace(*s1) && (s1 <= last1))
> -			s1++;
> -		while (isspace(*s2) && (s2 <= last2))
> -			s2++;
> -	}
> -	/* early return if both lines are empty */
> -	if ((s1 > last1) && (s2 > last2))
> -		return 1;
> -	while (!result) {
> -		result = *s1++ - *s2++;
> -		/*
> -		 * Skip whitespace inside. We check for whitespace on
> -		 * both buffers because we don't want "a b" to match
> -		 * "ab"
> -		 */
> -		if (isspace(*s1) && isspace(*s2)) {
> -			while (isspace(*s1) && s1 <= last1)
> +	while (s1 < end1 && (end1[-1] == '\r' || end1[-1] == '\n'))
> +		end1--;
> +	while (s2 < end2 && (end2[-1] == '\r' || end2[-1] == '\n'))
> +		end2--;
> +
> +	while (s1 < end1 && s2 < end2) {
> +		if (isspace(*s1)) {
> +			/*
> +			 * Skip whitespace. We check on both buffers
> +			 * because we don't want "a b" to match "ab".
> +			 */
> +			if (!isspace(*s2))
> +				return 0;
> +			while (s1 < end1 && isspace(*s1))
>  				s1++;
> -			while (isspace(*s2) && s2 <= last2)
> +			while (s2 < end2 && isspace(*s2))
>  				s2++;
> -		}
> -		/*
> -		 * If we reached the end on one side only,
> -		 * lines don't match
> -		 */
> -		if (
> -		    ((s2 > last2) && (s1 <= last1)) ||
> -		    ((s1 > last1) && (s2 <= last2)))
> +		} else if (*s1++ != *s2++)
>  			return 0;
> -		if ((s1 > last1) && (s2 > last2))
> -			break;
>  	}
>
> -	return !result;
> +	/* If we reached the end on one side only, lines don't match. */
> +	return s1 == end1 && s2 == end2;
>  }
>
>  static void add_line_info(struct image *img, const char *bol, size_t len, unsigned flag)
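
For experimenting outside of git, the function as it reads with the
patch applied can be compiled on its own. The body below is assembled
from the context and '+' lines of the diff above. One assumption keeps
it standalone: isspace() from <ctype.h>, called through the
hypothetical wrapper is_ws(), stands in for git's own isspace() and may
classify a few bytes differently.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical wrapper: <ctype.h> needs an unsigned char value. */
static int is_ws(char c)
{
	return isspace((unsigned char)c);
}

static int fuzzy_matchlines(const char *s1, size_t n1,
			    const char *s2, size_t n2)
{
	const char *end1 = s1 + n1;
	const char *end2 = s2 + n2;

	/* ignore line endings */
	while (s1 < end1 && (end1[-1] == '\r' || end1[-1] == '\n'))
		end1--;
	while (s2 < end2 && (end2[-1] == '\r' || end2[-1] == '\n'))
		end2--;

	while (s1 < end1 && s2 < end2) {
		if (is_ws(*s1)) {
			/* whitespace must be present on both sides */
			if (!is_ws(*s2))
				return 0;
			while (s1 < end1 && is_ws(*s1))
				s1++;
			while (s2 < end2 && is_ws(*s2))
				s2++;
		} else if (*s1++ != *s2++)
			return 0;
	}

	/* If we reached the end on one side only, lines don't match. */
	return s1 == end1 && s2 == end2;
}
```

In this sketch two empty lines compare as equal, a line holding only
"\n" matches one holding only "\r\n", and "a b" does not match "ab".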