public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Johannes Sixt <j6t@kdbg•org>
To: "Scott L. Burson via GitGitGadget" <gitgitgadget@gmail•com>
Cc: "Junio C Hamano" <gitster@pobox•com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail•com>,
	"Jaydeep P Das" <jaydeepjd.8914@gmail•com>,
	"D. Ben Knoble" <ben.knoble@gmail•com>,
	"Scott L. Burson" <Scott@sympoiesis•com>,
	git@vger•kernel.org
Subject: Re: [PATCH v2 2/2] merge with Scheme regexp; fix bugs
Date: Thu, 27 Nov 2025 17:09:32 +0100	[thread overview]
Message-ID: <b6656e6d-d1e8-4ebe-821f-9211643a71ab@kdbg.org> (raw)
In-Reply-To: <86315aa3e36afa1ee741a2c9b9e95a71ca569302.1764211096.git.gitgitgadget@gmail.com>

Am 27.11.25 um 03:38 schrieb Scott L. Burson via GitGitGadget:
> From: "Scott L. Burson" <Scott@sympoiesis•com>
> 
> This commit merges (by disjoining) the new generic Lisp regexp into
> the existing Scheme regexp.  It also fixes two bugs: the new regexp
> was unintentionally allowing tabs, and the matching of "(def" should
> be case-insensitive.
> 
> Signed-off-by: Scott L. Burson <Scott@sympoiesis•com>
> ---
>  userdiff.c | 25 ++++++++++++-------------
>  1 file changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/userdiff.c b/userdiff.c
> index e127b4a1f1..b67dfddbef 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -249,14 +249,6 @@ PATTERNS("kotlin",
>  	 "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+)?[fFlLuU]?"
>  	 /* unary and binary operators */
>  	 "|[-+*/<>%&^|=!]==?|--|\\+\\+|<<=|>>=|&&|\\|\\||->|\\.\\*|!!|[?:.][.:]"),
> -PATTERNS("lisp",
> -	 /* Either an unindented left paren, or a slightly indented line
> -	  * starting with "(def" */
> -	 "^((\\(|:space:{1,2}\\(def).*)$",
> -	 /* Common Lisp symbol syntax allows arbitrary strings between vertical bars */
> -	 "\\|([^\\\\]|\\\\\\\\|\\\\\\|)*\\|"
> -	 /* All other words are delimited by spaces or parentheses/brackets/braces */
> -	 "|([^][(){} \t])+"),
>  PATTERNS("markdown",
>  	 "^ {0,3}#{1,6}[ \t].*",
>  	 /* -- */

You made this commit a fixup commit of the commit from the first round.
This isn't desirable as long as the earlier patch has not been
integrated in "next", yet.

You should have squashed the commits into one. The cover letter gives a
really good justification for this change and should be the commit's
message (with its subject line, ie, the PR title). However, don't write
"This commit does X", but write "Do X" instead: you give someone an
order to change the code. (Also, after squashing there is no bug to fix
anymore, of course.)

> @@ -352,14 +344,21 @@ PATTERNS("rust",
>  	 "|[0-9][0-9_a-fA-Fiosuxz]*(\\.([0-9]*[eE][+-]?)?[0-9_fF]*)?"
>  	 "|[-+*\\/<>%&^|=!:]=|<<=?|>>=?|&&|\\|\\||->|=>|\\.{2}=|\\.{3}|::"),
>  PATTERNS("scheme",
> -	 "^[\t ]*(\\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]|(library|module|struct|class)[*+ \t]).*)$",
> +	 /* A possibly indented left paren followed by a Scheme keyword. */
> +	 "^[\t ]*(\\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]|(library|module|struct|class)[*+ \t]).*)$\n"

Mental note how this RE is nested:

	[\t ]*(
		\((
			(
				define|def(
					struct|syntax|class|method
					|rules|record|proto|alias
				)?
			)[-*/ \t]
			|
			(
				library|module|struct|class
			)[*+ \t]
		).*
	)$

> +	 /*
> +	  * For other Lisp dialects: either an unindented left paren, or a
> +	  * slightly indented line starting with "(def".
> +	  */
> +	 "^((\\(| {1,2}\\([Dd][Ee][Ff]).*)$",

Here you are adding a very generous new pattern, the opening parenthesis
without indentation. This will not only apply to "other Lisp dialects",
as the comment says, but also Scheme code and will produce new matches.
It does not change the test cases in t/t4018/scheme-*, because all have
additional matches later.

As such it would possibly be more honest to extract it out into its own
(first) pattern and marked as applying to all dialects:

	/*
	 * An unindented opening parenthesis identifies a top-level
         * structure in all Lisp dialects.
	 */
	"^(\\(.*)$\n",

Note that the Scheme pattern excludes the indentation from the capture.
You may want to do so here, too (and simplify "one or two spaces" like
this):

	"^  ?(\\([Dd][Ee][Ff].*)$",

Would it be possible to have test cases of Lisp code that is not covered
by the Scheme pattern?

>  	 /*
> -	  * R7RS valid identifiers include any sequence enclosed
> -	  * within vertical lines having no backslashes
> +	  * The union of R7RS and Common Lisp symbol syntax: allows arbitrary
> +	  * strings between vertical bars, including escaped backslashes and
> +	  * vertical bars.
>  	  */
> -	 "\\|([^\\\\]*)\\|"
> +	 "\\|([^\\\\]|\\\\\\\\|\\\\\\|)*\\|"

Without the C quoting we have

	\|([^\\]|\\\\|\\\|)*\|

So, this is everthing from | up to the next |, except that \| does not
stop scanning and \\ is also considered so that \\| is not regarded as \
followed by \|. Good.

>  	 /* All other words should be delimited by spaces or parentheses */
> -	 "|([^][)(}{[ \t])+"),
> +	 "|([^][)(}{ \t])+"),

Here we have a single bracket expression. The removed opening [ does not
begin a new one, but is a duplicated character. Good.

-- Hannes


  reply	other threads:[~2025-11-27 16:09 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-15 10:17 [PATCH] diff: "lisp" userdiff_driver Scott L. Burson via GitGitGadget
2025-11-15 17:06 ` Johannes Sixt
2025-11-15 23:32   ` Scott L. Burson
2025-11-20 16:47     ` D. Ben Knoble
2025-11-27  2:10       ` Scott L. Burson
2025-11-16  5:30   ` Junio C Hamano
2025-11-17 23:23     ` Scott L. Burson
2025-11-18  4:38       ` Junio C Hamano
2025-11-27  2:38 ` [PATCH v2 0/2] userdiff: extend Scheme support to cover other Lisp dialects Scott L. Burson via GitGitGadget
2025-11-27  2:38   ` [PATCH v2 1/2] diff: "lisp" userdiff_driver Scott L. Burson via GitGitGadget
2025-11-27 10:32     ` Scott L. Burson
2025-11-27 10:51       ` Johannes Sixt
2025-11-27  2:38   ` [PATCH v2 2/2] merge with Scheme regexp; fix bugs Scott L. Burson via GitGitGadget
2025-11-27 16:09     ` Johannes Sixt [this message]
2025-12-02 10:27       ` Johannes Sixt
2026-01-14  6:18         ` Scott L. Burson
2026-01-14  8:40           ` Johannes Sixt
2026-01-15 23:18   ` [PATCH v3 0/2] userdiff: extend Scheme support to cover other Lisp dialects Scott L. Burson via GitGitGadget
2026-01-15 23:18     ` [PATCH v3 1/2] userdiff: tighten word-diff test case of the scheme driver Johannes Sixt via GitGitGadget
2026-01-15 23:18     ` [PATCH v3 2/2] userdiff: extend Scheme support to cover other Lisp dialects Scott L. Burson via GitGitGadget
2026-01-16  8:49       ` Johannes Sixt
2026-01-17  2:09         ` Scott L. Burson
2026-01-17  8:15           ` Johannes Sixt
2026-04-15  2:27     ` [PATCH v4 0/2] " Scott L. Burson via GitGitGadget
2026-04-15  2:27       ` [PATCH v4 1/2] userdiff: tighten word-diff test case of the scheme driver Johannes Sixt via GitGitGadget
2026-04-15  2:27       ` [PATCH v4 2/2] userdiff: extend Scheme support to cover other Lisp dialects Scott L. Burson via GitGitGadget
2026-04-15  6:54       ` [PATCH v4 0/2] " Johannes Sixt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b6656e6d-d1e8-4ebe-821f-9211643a71ab@kdbg.org \
    --to=j6t@kdbg$(echo .)org \
    --cc=Scott@sympoiesis$(echo .)com \
    --cc=avarab@gmail$(echo .)com \
    --cc=ben.knoble@gmail$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitgitgadget@gmail$(echo .)com \
    --cc=gitster@pobox$(echo .)com \
    --cc=jaydeepjd.8914@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox