public inbox for git@vger.kernel.org 
From: Junio C Hamano <gitster@pobox•com>
To: Adrien Schildknecht <adrien+dev@schischi•me>
Cc: git@vger•kernel.org, Matthieu.Moy@grenoble-inp•fr
Subject: Re: [PATCH v2] userdiff: funcname and word patterns for sh
Date: Fri, 13 Mar 2015 22:13:09 -0700
Message-ID: <xmqqy4n0xiu2.fsf@gitster.dls.corp.google.com>
In-Reply-To: <1425944432-23642-1-git-send-email-adrien+dev@schischi.me> (Adrien Schildknecht's message of "Tue, 10 Mar 2015 00:40:32 +0100")

Adrien Schildknecht <adrien+dev@schischi•me> writes:

> Add regexp based on the "Shell Command Language" specifications.
> Because of the lax syntax of sh, some corner cases may not be
> handled properly.
>
> Signed-off-by: Adrien Schildknecht <adrien+dev@schischi•me>
> ---

Those of you who helped in the first round of review: any comments,
or a "this round looks good"?

> +PATTERNS("sh",
> +	"^([ \t]*(function[ \t]+)?[a-zA-Z_][a-zA-Z0-9_]*[ \t]*\\([ \t]*\\).*)$",
> +	/* -- */

I do not think it is wrong per se to try to be as precise as
possible, but I wonder if it is sufficient to cheat and make these
"what is a word?" expressions a bit looser, by declaring that it is
OK if a simpler pattern allows something that is syntactically
illegal in shell, as long as it splits valid shell constructs
correctly.  For example:

> +	 "[a-zA-Z0-9_]+"
> +	 "|[-+0-9]+"

The first one matches an identifier (e.g. if you have frotz="a b c"
and $frotz, both appearances of 'frotz' are matched).  The second
one, I think, is trying to catch possibly signed integers, but it
also matches 0+1+++2, so it is already loose (I do not think that is
a problem).  Perhaps it is sufficient to collapse the above into a
single "[-+a-zA-Z0-9_$]+"?
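
One rough way to compare the two spellings is to run each over a few
sample lines and look at the tokens that come out.  The sketch below
does that with POSIX regcomp()/regexec() and REG_EXTENDED.  The helper
name split_words, the WORD_MAX limit and the sample inputs are made up
for illustration; none of this is code from the patch or from
userdiff.c, and it only approximates how a word regex is applied.

```c
#include <regex.h>
#include <string.h>

#define WORD_MAX 32

/*
 * Hypothetical helper (not part of userdiff.c): cut "line" into
 * successive matches of the POSIX ERE "pattern", skipping whatever
 * does not match.  Up to "max" tokens are copied into "out"; the
 * return value is the number of tokens found, or -1 if the pattern
 * does not compile.
 */
static int split_words(const char *pattern, const char *line,
		       char out[][WORD_MAX], int max)
{
	const size_t cap = WORD_MAX - 1;
	regex_t re;
	regmatch_t m;
	int n = 0;

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	while (*line && !regexec(&re, line, 1, &m, 0)) {
		size_t len = (size_t)(m.rm_eo - m.rm_so);

		if (!len)
			break;	/* an empty match would loop forever */
		if (n < max) {
			size_t copy = len < cap ? len : cap;

			memcpy(out[n], line + m.rm_so, copy);
			out[n][copy] = '\0';
		}
		n++;
		line += m.rm_eo;	/* resume right after this token */
	}
	regfree(&re);
	return n;
}
```

Called on frotz="a b c", both "[a-zA-Z0-9_]+|[-+0-9]+" and the
collapsed "[-+a-zA-Z0-9_$]+" give the same four tokens (frotz, a, b,
c).  The difference shows on $frotz: the original pair yields frotz
and leaves the '$' unmatched, while the collapsed pattern yields
$frotz as a single word.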

> +	 "|[-+*/<>%&^|=!]=|>>=?|<<=?|\\+\\+|--|\\*\\*|&&|\\|\\||\\[\\[|\\]\\]"
> +	 "|>\\||[<>]+&|<>|<<-|;;"),

Likewise.  I wonder if something like "[-~!@#%^&*+=|;/]+" gives too
many false matches.
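
A quick way to get a feel for this is to check which operators the
looser pattern keeps together as one token.  The sketch below is only
an illustration under stated assumptions: matches_whole and loose_ops
are hypothetical names, not part of the patch, and the check is done
with plain POSIX regcomp()/regexec() rather than git's own machinery.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper (not part of userdiff.c): return 1 if the
 * first match of the POSIX ERE "pattern" in "s" spans all of "s",
 * 0 if it does not (or nothing matches), -1 on a bad pattern.
 */
static int matches_whole(const char *pattern, const char *s)
{
	regex_t re;
	regmatch_t m;
	int ret;

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	ret = !regexec(&re, s, 1, &m, 0) &&
	      m.rm_so == 0 && (size_t)m.rm_eo == strlen(s);
	regfree(&re);
	return ret;
}

/* The looser operator pattern suggested above, copied verbatim. */
static const char loose_ops[] = "[-~!@#%^&*+=|;/]+";
```

With this, matches_whole(loose_ops, "&&") and the same call for "||",
";;" and "+=" return 1.  For ">>=", "<<-" and "[[" it returns 0,
because '<', '>', '[' and ']' are not in the bracket expression as
written, so a pattern along these lines would presumably need them
added.  On the false-match side, "=-" (as in x=-1) is matched as one
run, which the more precise pattern would not do.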

>  { "default", NULL, -1, { NULL, 0 } },
>  };
>  #undef PATTERNS


Thread overview: 7+ messages
2015-03-09 16:36 [GSoC][PATCH] userdiff: funcname and word patterns for sh Adrien Schildknecht
2015-03-09 16:36 ` [PATCH] " Adrien Schildknecht
2015-03-09 20:34   ` Matthieu Moy
2015-03-09 23:40   ` [PATCH v2] " Adrien Schildknecht
2015-03-14  5:13     ` Junio C Hamano [this message]
2015-03-14 17:19     ` Matthieu Moy
2015-03-25 21:36     ` Junio C Hamano
