public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail•com>
To: Jeff King <peff@peff•net>
Cc: "Дилян Палаузов" <dilyan.palauzov@aegee•org>, git@vger•kernel.org
Subject: Re: git grep -P fatal: pcre_exec failed with error code -8
Date: Mon, 06 Nov 2017 15:04:02 +0100	[thread overview]
Message-ID: <877ev35wrx.fsf@evledraar.booking.com> (raw)
In-Reply-To: <20171106122411.dhi2ltyegzquebhk@sigill.intra.peff.net>


On Mon, Nov 06 2017, Jeff King jotted:

> On Mon, Nov 06, 2017 at 12:50:45PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> Some replies to the thread in general, didn't want to spread this out
>> into different replies.
>>
>>  * Yes this sucks.
>>
>>  * Just emitting a warning without an appropriate exit code would suck
>>    more, would break batch jobs & whatnot that expcept certain results
>>    from grep.
>
> That was my first thought, too, but something that does:
>
>   git grep foo && echo found
>
> would behave basically the same. Do you mean here scripts that actually
> do:
>
>   git grep foo
>   case "$?" in
>   0) echo found ;;
>   1) echo not found ;;
>   *) echo wtf? ;;
>   esac
>
> I agree it would be nice to at least have _some_ way to distinguish
> between those final two cases.

Maybe we should emit a different exit code, but I just had in mind that
we could continue but the same non-zero as when the grep fails now, but
with an additional warning.

> Though something like "git log --grep" is even more complicated. We
> perhaps _would_ want to distinguish between:
>
>   - match (in which case we print the commit)
>
>   - no match (in which case we do not)
>
>   - error (in which case we do not print, but also mark the exit code as
>     failing)
>
>>  * As you point out it would be nice to print out the file name we
>>    didn't match on, we'd need to pass the grep_source struct down
>>    further, it goes as far as grep_source_1 but stops there and isn't
>>    passed to e.g. look_ahead(), which calls patmatch() which calls the
>>    engine-specific matcher and would need to report the error. We could
>>    just do this, would slow down things a bit (probably trivally) but we
>>    could emit better error messages in genreal.
>
> I'm not sure if the grep_source has enough information for all cases.
> E.g., if you hit an error while grepping in commit headers, you'd
> probably want to mention the oid of the commit. There's an "identifier"
> field in the grep_source, but it's opaque.
>
> The caller may also want to do more things than just print an error
> (like the exit code adjustment I mentioned above). Which implies to me
> we should be passing the error information up, not trying to bring the
> context down.

Indeed, I was just thinking of the case where we're grepping a file in
the tree, not when the engine is used for --grep et al.

>>  * You can adjust these limts in PCRE in Git, although it doesn't help
>>    in this case, you just add (*LIMIT_MATCH=NUM) or
>>    (*LIMIT_RECURSION=NUM) (or both) to the start of the pattern. See
>>    pcresyntax(3) or pcre2syntax(3) man pages depending on what version
>>    you have installed.
>
> I saw that in the pcre manual, but I had the impression you can't
> actually raise the limits above the defaults with that, only lower them.

You can, you just can't set them to anything less conservative than
limits already set via the C API, but we don't set any of those.

I.e. PCRE allows you to say expose a regex field in a web form (as we do
with gitweb) with really conservative settings that can't be overriden
via a (*LIMIT) set in the pattern.

But since we don't use that C API PCRE runs in a mode where the user
gets set whatever limit they want in the pattern (or other
pattern-altering switch), which makes sense for interactive git-grep
use.

>>  * While regexec() won't return an error its version of dealing with
>>    this is (at least under glibc) to balloon CPU/memory use until the
>>    OOMkiller kills git (although not on this particular pattern).
>
> So in a sense our current behavior with pcre is the same. We just have
> to provoke the death ourselves. ;)

Indeed :)

      reply	other threads:[~2017-11-06 14:04 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-05  0:06 git grep -P fatal: pcre_exec failed with error code -8 Дилян Палаузов
2017-11-05  2:16 ` Jeff King
2017-11-05  9:41   ` Дилян Палаузов
2017-11-06 10:31     ` Jeff King
2017-11-06 11:50       ` Ævar Arnfjörð Bjarmason
2017-11-06 12:24         ` Jeff King
2017-11-06 14:04           ` Ævar Arnfjörð Bjarmason [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877ev35wrx.fsf@evledraar.booking.com \
    --to=avarab@gmail$(echo .)com \
    --cc=dilyan.palauzov@aegee$(echo .)org \
    --cc=git@vger$(echo .)kernel.org \
    --cc=peff@peff$(echo .)net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox