From: Jeff King <peff@peff•net>
To: Jonatan Holmgren <jonatan@jontes•page>
Cc: git@vger•kernel.org
Subject: Re: [RFC] Support UTF-8 characters in Git alias names
Date: Mon, 9 Feb 2026 02:36:02 -0500
Message-ID: <20260209073602.GC585828@coredump.intra.peff.net>
In-Reply-To: <3124b359-2929-4f3f-9ac6-793277fe422b@jontes.page>

On Sun, Feb 08, 2026 at 04:30:02PM +0100, Jonatan Holmgren wrote:
> I think the best approach is to support UTF-8 specifically for alias.*
> variables, which would mean modifying the git_config_parse_key() fn to allow
> UTF-8 bytes and make non-ascii aliases case-sensitive to avoid complex
> locale-dependent case folding.
>
> The main pain point would be making sure all platforms handle this nicely,
> esp since mac uses NFD and not NFC Unicode.
>
> Before implementing this, I'd like to hear:
>
> 1. Is this a feature the project would like?
> 2. Is my implementation approach reasonable?
> 3. What concerns should be addressed in said design?
> 4. Any compat requirements I should be aware of?

I think supporting non-ascii aliases is a good goal.

However, I'm not sure that special-casing the parsing of alias config
keys is the best direction. Since it's a syntactic change, the special
case would have to be understood by all code that reads or writes
config, not just git_config_parse_key(). And then you'd potentially run
into problems with older versions of Git, or alternate implementations
(of which there are several).

Plus it doesn't solve all of the issues. E.g., should we allow new
characters like "_" (for a potential "git foo_bar")? That is doable, but
what about "." (for "git foo.bar")? I think that introduces new
ambiguities into the syntax.

Taking a step back, I think the root of the issue is that the schema for
alias keys is poorly designed. Git's config syntax allows for three
levels: section, subsection, and key. The section and key fields are
restricted to alnum and dash, but the subsection is designed to be
unrestricted (modulo NUL bytes).

And that's why we have:

  [branch "foo/bar"]
      remote = origin

for example, because branch names don't follow the same syntax rules as
config keys. And it's the same issue here: the alias.* schema is trying
to use one syntax (alnum config keys) to store another (command names).
They _usually_ overlap, but not always. The pager.* config has the same
problem.
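
To make the three levels concrete, here is a minimal standalone sketch (not Git's actual parser) that splits a dotted key at its first and last dot. The names `struct key_parts` and `split_config_key()` are made up for illustration, and the sketch assumes the key is already in the flattened "section.subsection.key" form. It shows why "branch.foo/bar.remote" works, and why a two-level "alias.foo.bar" cannot mean an alias called "foo.bar": the middle part is always read as a subsection.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the pieces of a dotted config key. */
struct key_parts {
	const char *section;	/* start of the input, not NUL-terminated */
	size_t section_len;
	const char *subsection;	/* NULL when the key has only two levels */
	size_t subsection_len;
	const char *key;	/* NUL-terminated tail of the input */
};

/*
 * Split "section[.subsection].key" at the first and the last dot.
 * Everything between the two dots is the subsection, so it may itself
 * contain dots, slashes or non-ASCII bytes.
 * Returns 0 on success, -1 if there is no dot or an outer part is empty.
 */
static int split_config_key(const char *full, struct key_parts *out)
{
	const char *first, *last;

	assert(full && out);
	first = strchr(full, '.');
	last = strrchr(full, '.');
	if (!first || first == full || !last[1])
		return -1;

	out->section = full;
	out->section_len = (size_t)(first - full);
	out->key = last + 1;
	if (first == last) {
		out->subsection = NULL;
		out->subsection_len = 0;
	} else {
		out->subsection = first + 1;
		out->subsection_len = (size_t)(last - first - 1);
	}
	return 0;
}
```

With this split, "branch.foo/bar.remote" yields section "branch", subsection "foo/bar" and key "remote", while "alias.foo" has no subsection at all and "alias.foo.bar" is read as subsection "foo" with key "bar".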

We've discussed this before, e.g., in:

  https://lore.kernel.org/git/20150206124528.GA18859@inner.h.apk.li/

There the immediate problem was that "git foo_bar" caused an error
message. We hacked around it by suppressing the error, but it was still
impossible to add an alias or pager config. We knew that was a
limitation, but punted until somebody came along who actually cared
about making it work. Now you get to be that somebody. ;)

So what I'd propose instead is introducing a new schema like:

  - setting "alias.foo.command" to "bar" would alias "git foo" to "bar";
    this should work for any command name, as it is just a byte stream

  - a given command subsection is matched verbatim. So alias.foo.command
    matches "git foo" but not "git Foo". Likewise, we do not do any
    normalization. You put what you want into your config, and it should
    match the command you invoke. This is perhaps less friendly, but it
    punts on any normalization or case-folding that we have to do, and
    matches how the rest of Git works (paths are likewise streams of
    bytes, and it is mostly up to the user to use them consistently).

  - leave "alias.foo" as a historical synonym for "alias.foo.command",
    so that existing config continues working

  - optionally add new keys within alias.foo.* sections. For example, we
    could allow alias.foo.help to provide text shown during "git help
    foo". For the most part that could come later, so I'm just
    illustrating possible eventual directions that the new schema would
    allow. But it might be worth pondering a little now to avoid
    painting ourselves into a corner. E.g., you could imagine a schema
    where alias.foo.shell is set to "true" instead of sticking a "!" at
    the front of the value of alias.foo.command. I don't know if that's
    a good idea or not, but if we were going to do stuff like that, we'd
    want to decide now before setting the alias.foo.command behavior in
    stone.

  - likewise, optionally do the same for pager.*

I hacked together some illustrative code below. Note that we do use
strcasecmp() currently to match command names (which kind of makes
sense, since if you had "alias.Foo" in your config, the parser would
downcase it to "alias.foo"). So probably that historical code should
continue to behave like that, but the new "alias.Foo.command" should be
more verbatim (the patch below just feeds them both to strcasecmp).
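
As a rough standalone sketch of that split policy (case-insensitive for the historical two-level form, byte-for-byte for the subsection form), something like the following could work. This is not code from Git: `alias_key_matches()` is a hypothetical helper, and it assumes keys arrive in canonical flattened form, with the "alias" section name and the final "command" key already lowercased by the parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/*
 * Return 1 if config key `key` defines the alias invoked as `name`.
 *
 *   "alias.<name>"          historical form, compared case-insensitively
 *   "alias.<name>.command"  subsection form, compared byte for byte
 *
 * Any other third-level key (e.g. a future "alias.<name>.help") is
 * ignored by this sketch.
 */
static int alias_key_matches(const char *key, const char *name)
{
	static const char prefix[] = "alias.";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *rest, *last_dot;
	size_t sub_len;

	assert(key && name);
	if (strncmp(key, prefix, prefix_len))
		return 0;
	rest = key + prefix_len;

	last_dot = strrchr(rest, '.');
	if (!last_dot)
		/* Two-level key: the parser already downcased it. */
		return !strcasecmp(rest, name);

	/* Three-level key: only ".command" is understood here. */
	if (strcmp(last_dot, ".command"))
		return 0;

	/* Compare lengths as well, so "foo" never matches "foobar". */
	sub_len = (size_t)(last_dot - rest);
	return strlen(name) == sub_len && !memcmp(rest, name, sub_len);
}
```

Under these assumptions "alias.foo" still answers to "git Foo", whereas "alias.foo.command" answers only to "git foo", and a name containing dots or non-ASCII bytes matches as long as the bytes are identical.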

-Peff

---
diff --git a/alias.c b/alias.c
index 1a1a141a0a..44bdde58af 100644
--- a/alias.c
+++ b/alias.c
@@ -17,19 +17,30 @@ static int config_alias_cb(const char *key, const char *value,
 			   const struct config_context *ctx UNUSED, void *d)
 {
 	struct config_alias_data *data = d;
-	const char *p;
+	const char *cmd, *p;
+	size_t cmd_len;

-	if (!skip_prefix(key, "alias.", &p))
+	if (parse_config_key(key, "alias", &cmd, &cmd_len, &p) < 0)
 		return 0;

+	if (cmd) {
+		/* The only 3-level key we understand is alias.*.command */
+		if (strcmp(p, "command"))
+			return 0;
+	} else {
+		/* alias.foo is the same as alias.foo.command */
+		cmd = p;
+		cmd_len = strlen(p);
+	}
+
 	if (data->alias) {
-		if (!strcasecmp(p, data->alias)) {
+		if (!strncasecmp(cmd, data->alias, cmd_len) && !data->alias[cmd_len]) {
 			FREE_AND_NULL(data->v);
 			return git_config_string(&data->v,
 						 key, value);
 		}
 	} else if (data->list) {
-		string_list_append(data->list, p);
+		string_list_append_nodup(data->list, xmemdupz(cmd, cmd_len));
 	}

 	return 0;