From: Jeff King <peff@peff•net>
To: Simon Richter <Simon.Richter@hogyros•de>
Cc: Junio C Hamano <gitster@pobox•com>,
Benson Muite <benson_muite@emailplus•org>,
git@vger•kernel.org
Subject: Re: Mirror repositories for submodules
Date: Thu, 4 Jun 2026 02:16:05 -0400
Message-ID: <20260604061605.GA3194609@coredump.intra.peff.net>
In-Reply-To: <d64e7f31-4e00-478c-ab31-b671242865fb@hogyros.de>

On Thu, Jun 04, 2026 at 02:11:38PM +0900, Simon Richter wrote:
> Cloning from our server will, depending on what upstream uses, either a
> relative URL (which will go to our server, but we have little control over
> what the name part of the repository base URL is going to be), or an
> absolute URL that instructs clients to pull from another place, which
> conflicts with our goal to have a self-contained archive.
>
> The idea posited earlier, to have a "repository identity" that remains the
> same across forks and clones, is somewhat appealing, but the best idea I can
> come up with is generating some kind of repository UUID, and adding a
> symlink -- not a great design because it pollutes outside the repo:
>
> $ mkdir myproject
> $ cd myproject
> $ git init
> $ ls -l ..
> lrwxrwxrwx 1 simon simon 9 Jun 4 14:05
> 12345678-9abc-def0-1234-56789abcdef0.git -> myproject
> drwxrwxr-x 2 simon simon 40 Jun 4 14:04 myproject
>
> On the other hand, this can be used to construct a stable relative submodule
> URL.

Here's a thought experiment. What if you put the UUID into a URL, like:

  repoid://123456789.git

Then your in-repo .gitmodules would point to that repo id and be
consistent. Of course you need some way to tell Git how to retrieve
repoid:// URLs. You could do so with a custom remote helper
(git-remote-repoid), but presumably that helper is eventually going to
end up going over one of the normal Git protocols.
So we just need to tell Git how to resolve repo id URLs into concrete
URLs. And indeed, we have url.*.insteadOf to do rewriting already. So
for example, you can add a submodule but convert it into a uuid like
this:

  $ git submodule add https://github.com/git/git.git
  $ git config -f .gitmodules submodule.git.url
  https://github.com/git/git.git
  $ git config -f .gitmodules submodule.git.url repoid://123456789.git
  $ git commit -am 'add submodule with magic repoid'

Now if somebody else comes along and clones it naively, the repo uuid is
not useful to git by itself:

  $ git clone --recurse-submodules repo
  Submodule 'git' (repoid://123456789.git) registered for path 'git'
  Cloning into '/home/peff/tmp/repo/git'...
  fatal: transport 'repoid' not allowed
  fatal: clone of 'repoid://123456789.git' into submodule path '/home/peff/tmp/repo/git' failed

But imagine that "somehow" they have learned that 123456789.git can be
found at some URL. You can do this:

  git -c url.https://github.com/git/git.git.insteadOf=repoid://123456789.git \
    clone --recurse-submodules repo.git

which would clone from the original URL. Or you could even imagine that
they have a cache of repositories named by uuid, and then:

  git -c url.https://my/cache/.insteadOf=repoid:// ...

would rewrite all repoid://'s automatically.

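One way to check such a mapping before attempting a real clone is to ask
git to expand the URL without touching the network. This is only a
minimal sketch; https://my/cache/ is the placeholder cache location from
the example above, not a real server.

```shell
# Expand a repoid:// URL through a one-shot insteadOf rule.
# --get-url only prints the rewritten URL; it does not contact the remote.
resolved=$(git -c url.https://my/cache/.insteadOf=repoid:// \
    ls-remote --get-url repoid://123456789.git)
echo "$resolved"
```

If the rule is picked up, this should print the cache URL with the repo
id appended, i.e. https://my/cache/123456789.git.
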
The use of "-c" here is mostly for illustration. It is a per-command
config, so when you later try to update the submodule, you'd run into
the same problem. Probably you'd want to stuff your mapping into on-disk
config (either ~/.gitconfig, or if you have a lot of them, perhaps some
file included from there).
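
A rough sketch of the on-disk variant, assuming a separate mapping file
that is included from the per-user config. The file name
.gitconfig-repoid is made up for illustration, and the sketch runs
against a throwaway HOME so it can be tried without touching real
settings.

```shell
# Use a scratch HOME so the sketch does not modify your real ~/.gitconfig.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

# Hypothetical mapping file: one url.<base>.insteadOf entry per known repo id.
mapfile="$HOME/.gitconfig-repoid"
git config -f "$mapfile" \
    url.https://github.com/git/git.git.insteadOf repoid://123456789.git

# Pull the mapping file into the per-user config via include.path.
git config --global include.path "$mapfile"

# The rewrite now applies to every later command, with no "-c" needed.
resolved=$(git ls-remote --get-url repoid://123456789.git)
echo "$resolved"
```

Because the mapping lives in per-user config, it is also seen by the
commands that run inside the submodule repository, which is where the
rewriting needs to happen.
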
It would be nice if you could use "git clone -c" (note "-c" as an option
to "clone", not to "git") to set a permanent per-repo config variable.
But sadly the URL rewriting happens in the submodule repository, not the
parent. So it has to be a per-user setting.
Now, all of that said, do we still need uuids at all? If the canonical
submodule name is https://github.com/git/git.git, then anybody can just
rewrite that locally in the same way using url.*.insteadOf config. And I
think this is a pretty standard way of using submodules. E.g., you might
rewrite https:// into ssh:// if you prefer that protocol. Or point to a
local server if it's faster for you.
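
As a sketch of that kind of local rewrite, assuming you want everything
under https://github.com/ to go over ssh instead (the git@github.com
user and host are an example; substitute whatever your server uses):

```shell
# Map the canonical https:// prefix onto an ssh:// prefix for this command.
# --get-url prints the rewritten URL without contacting the remote.
resolved=$(git -c url.ssh://git@github.com/.insteadOf=https://github.com/ \
    ls-remote --get-url https://github.com/git/git.git)
echo "$resolved"
```

The .gitmodules entry stays at the canonical https:// URL; only the
local configuration decides where the fetch actually goes.
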
Which makes me wonder if I am missing something about the original
request that started this thread. But it sounds to me like it is just
asking for the existing URL-rewriting feature.
-Peff