public inbox for git@vger.kernel.org 
* Re: [PATCH v2 2/2] refs: add GIT_REF_URI to specify reference backend and directory
@ 2025-11-28  5:21 Natee Korn
  0 siblings, 0 replies; 6+ messages in thread
From: Natee Korn @ 2025-11-28  5:21 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, jltobler, jn.avila, sunshine, toon


2..

* [PATCH v2 0/2] refs: allow setting the reference directory
@ 2025-11-26 11:11 Karthik Nayak
  2025-11-26 11:12 ` [PATCH v2 2/2] refs: add GIT_REF_URI to specify reference backend and directory Karthik Nayak
  0 siblings, 1 reply; 6+ messages in thread
From: Karthik Nayak @ 2025-11-26 11:11 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak, jltobler, gitster, toon, sunshine,
	Jean-Noël Avila

While Git allows users to select different reference backends, unlike
with objects, there is no flexibility in selecting the reference
directory. Currently, the reference format is obtained from the config
of the repository and the reference directory is set to the $GIT_DIR.

This patch series adds a new environment variable, 'GIT_REF_URI', which
takes the reference backend and path in URI form:

    <reference_backend>://<URI-for-resource>

For example, 'reftable:///foo' or 'files://$GIT_DIR/ref_migration.0xBsa0'.
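
To make the format concrete, here is a minimal shell sketch of how such
a value could be split and validated. It is not the C code from this
series: 'parse_ref_uri' is a hypothetical helper, the set of known
backends is assumed to be 'files' and 'reftable', and the error strings
are modeled on the ones the tests in t1423 grep for.

```shell
# Hypothetical helper, not part of the series: split a GIT_REF_URI-style
# value into backend and resource. On success, print the backend on the
# first line and the resource on the second.
parse_ref_uri () {
	uri=$1
	if test -z "$uri"
	then
		echo "error: reference backend uri is empty" >&2
		return 1
	fi

	# The backend name is everything before the first colon.
	case "$uri" in
	*:*) ;;
	*)
		echo "error: invalid reference backend uri format '$uri'" >&2
		return 1
		;;
	esac
	backend=${uri%%:*}
	rest=${uri#*:}

	# The remainder must start with '//'.
	case "$rest" in
	//*) ;;
	*)
		echo "error: invalid reference backend uri format '$uri'" >&2
		return 1
		;;
	esac
	resource=${rest#//}

	if test -z "$resource"
	then
		echo "error: invalid path in uri" >&2
		return 1
	fi

	# Assumption: only these two backends are known.
	case "$backend" in
	files|reftable) ;;
	*)
		echo "error: unknown reference backend" >&2
		return 1
		;;
	esac

	printf '%s\n%s\n' "$backend" "$resource"
}
```

With this sketch, 'reftable:///foo' yields the backend 'reftable' and the
resource '/foo', while 'reftable:' and 'reftable@/home/reftable' are
both rejected as having an invalid format.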

One use case for this is migration between different backends. On the
server side, migrating from the files backend to the newly introduced
reftable backend can be achieved by running 'git refs migrate'. However,
for large repositories with millions of references, this migration can
take from seconds to minutes.

For some background, at GitLab the main criterion for our migration was
to reduce the downtime of the migration, ideally to zero. Running 'git
refs migrate --ref-format=reftable' by itself wouldn't work, since its
runtime scales with the number of references and we have repositories
with millions of them. We also need to migrate without losing any
information. We came up with the following plan:

  1. Run git-pack-refs(1) and note timestamp of the generated packed-refs
     file.
  2. Run git refs migrate --dry-run.
  3. If there are no ongoing reference requests (read/write):
     a. Lock the repository by blocking incoming requests (done on a
        layer above git, in Gitaly [1]).
     b. If the timestamp of the packed-refs file has changed, unlock
        the repo and repeat from step 1.
     c. Apply all the loose refs to the dry-run reftable folder (this
        requires support in Git to write refs to arbitrary folder).
     d. Move the reftable dry-run folder into the GIT_DIR.
     e. Swap the repo config
     f. Unlock repo access
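
As a rough illustration of steps 1 and 3b, the timestamp check could be
sketched in shell as below. Both helpers and the stamp file are
hypothetical and not part of Git or Gitaly. Instead of storing the
timestamp as a number, this sketch copies the mtime of packed-refs onto
a stamp file and later asks find(1) whether packed-refs has become
newer.

```shell
# Hypothetical helpers; neither is part of Git or Gitaly.

# Step 1: remember the mtime of packed-refs by copying it onto a stamp
# file (created if it does not exist yet).
note_packed_refs () {
	touch -r "$1/packed-refs" "$2"
}

# Step 3b: succeed if packed-refs is now newer than the recorded stamp.
packed_refs_changed () {
	test -n "$(find "$1/packed-refs" -newer "$2")"
}

# Usage, assuming $GIT_DIR points at a repository using the files backend:
#
#	git pack-refs --all &&
#	note_packed_refs "$GIT_DIR" "$GIT_DIR/migration.stamp"
#	...
#	if packed_refs_changed "$GIT_DIR" "$GIT_DIR/migration.stamp"
#	then
#		: unlock the repository and repeat from step 1
#	fi
```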

Such a route scales much better, since the time for which the repository
is blocked is O(refs written between #1 and #3a) and not O(refs in
repo). But to do so, we need to be able to write to an arbitrary
reference backend + path, in order to add the missing references to the
dry-run reftable folder. This series achieves that.
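
For illustration, step 3c could look roughly like the sketch below. It
assumes the files backend, that every ref was packed in step 1 so that
only refs written afterwards exist as loose files, and that this series
is applied so that GIT_REF_URI is honored. 'list_loose_refs' is a
hypothetical helper; it skips symbolic refs and lock files for
simplicity.

```shell
# Hypothetical helper: print one "update <ref> <value>" line per loose
# ref found under <git_dir>/refs, in the format accepted by
# 'git update-ref --stdin'.
list_loose_refs () {
	git_dir=$1
	(
		cd "$git_dir" &&
		find refs -type f ! -name "*.lock" | sort |
		while read -r ref
		do
			value=$(cat "$ref")
			case "$value" in
			"ref: "*)
				# Symbolic ref, skipped in this sketch.
				continue
				;;
			esac
			printf 'update %s %s\n' "$ref" "$value"
		done
	)
}

# Usage, assuming $DRY_RUN_DIR is the directory reported by
# 'git refs migrate --dry-run --ref-format=reftable':
#
#	list_loose_refs "$GIT_DIR" |
#	GIT_REF_URI="reftable://$DRY_RUN_DIR" git update-ref --stdin
```

The amount of work here is proportional to the number of loose refs, not
to the total number of refs in the repository.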

The first commit adds the required changes to create a 'ref_store' for a
given path. The second commit parses the URI if available when creating
the main ref store.

This is based on top of 9a2fb147f2 (Git 2.52, 2025-11-17).

[1]: https://gitlab.com/gitlab-org/gitaly

---
Changes in v2:
- Added more clarification and proper intent in the cover message.
- Changed the format from '<ref_backend>://<path>' to
  `<ref_backend>://<URI-for-resource>` as it is much clearer.
- Added logic to check for the '//' in the provided URI and a test for
  the same.
- In the tests:
  - Use test_must_fail() instead of ! git
  - Fix looped tests not using the variables correctly and ensure that
    the test description is correct.
- Link to v1: https://patch.msgid.link/20251119-kn-alternate-ref-dir-v1-0-4cf4a94c8bed@gmail.com

---
 Documentation/git.adoc |   8 ++++
 environment.h          |   1 +
 refs.c                 |  71 +++++++++++++++++++++++++++--
 t/meson.build          |   1 +
 t/t1423-ref-backend.sh | 121 +++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 199 insertions(+), 3 deletions(-)

Karthik Nayak (2):
      refs: support obtaining ref_store for given dir
      refs: add GIT_REF_URI to specify reference backend and directory

Range-diff versus v1:

1:  f6e8aa37fe ! 1:  c925726efd refs: support obtaining ref_store for given dir
    @@ Commit message
         The refs subsystem uses the `get_main_ref_store()` to obtain the main
         ref_store for a given repository. In the upcoming patches we also want
         to create a ref_store for any given reference directory, which may exist
    -    in arbitrary paths. To support such behavior, extract out the core logic
    -    for creating out the ref_store from `get_main_ref_store()` into a new
    -    function `get_ref_store_for_dir()` which can provide the ref_store for a
    +    in arbitrary paths. For the files backend and the reftable backend, the
    +    reference directory is generally the $GIT_DIR.
    +
    +    To support such behavior, extract out the core logic for creating out
    +    the ref_store from `get_main_ref_store()` into a new function
    +    `get_ref_store_for_dir()` which can provide the ref_store for a
         given (repository, directory, reference format) combination.
     
         Signed-off-by: Karthik Nayak <karthik.188@gmail•com>
2:  5e30fa334e ! 2:  b859ebad64 refs: add GIT_REF_URI to specify reference backend and directory
    @@ Commit message
         Add a new environment variable 'GIT_REF_URI' that specifies both the
         reference backend and directory path using a URI format:
     
    -        <ref_backend>://<path>
    +        <ref_backend>://<URI-for-resource>
     
         When set, this variable is used to obtain the main reference store for
         all Git commands. The variable is checked in `get_main_ref_store()`
    @@ Commit message
         Add a new test file 't1423-ref-backend.sh' to test this environment
         variable.
     
    +    Helped-by: Jean-Noël Avila <jn.avila@free•fr>
         Signed-off-by: Karthik Nayak <karthik.188@gmail•com>
     
      ## Documentation/git.adoc ##
    @@ Documentation/git.adoc: double-quotes and respecting backslash escapes. E.g., th
      	See `--ref-format` in linkgit:git-init[1].
      
     +`GIT_REF_URI`::
    -+    Specify which reference backend and path to be used, if not specified the
    -+    backend is inferred from the configuration and $GIT_DIR is used as the
    -+    path.
    ++    Specify which reference backend to be used along with its URI. Reference
    ++    backends like the files, reftable backend use the $GIT_DIR as their URI.
     ++
    -+Expects the format '<ref_backend>://<path>', where the 'backend' specifies the
    -+reference backend and the 'path' specifies the directory used by the backend.
    ++Expects the format `<ref_backend>://<URI-for-resource>`, where the
    ++_<ref_backend>_ specifies the reference backend and the _<URI-for-resource>_
    ++specifies the URI used by the backend.
     +
      Git Commits
      ~~~~~~~~~~~
    @@ refs.c: static struct ref_store *get_ref_store_for_dir(struct repository *r,
     +	}
     +
     +	format_string = ref_backend_info.items[0].string;
    ++	if (!starts_with(ref_backend_info.items[1].string, "//")) {
    ++		error("invalid reference backend uri format '%s'", uri);
    ++		goto cleanup;
    ++	}
    ++	dir = ref_backend_info.items[1].string + 2;
    ++
    ++	format_string = ref_backend_info.items[0].string;
     +	dir = ref_backend_info.items[1].string + 2;
     +
     +	if (!dir || !dir[0]) {
    @@ t/t1423-ref-backend.sh (new)
     +		cd repo &&
     +		GIT_REF_URI="" &&
     +		export GIT_REF_URI &&
    -+		! git refs list 2>err &&
    ++		test_must_fail git refs list 2>err &&
     +		test_grep "reference backend uri is empty" err
     +	)
     +'
    @@ t/t1423-ref-backend.sh (new)
     +		cd repo &&
     +		GIT_REF_URI="reftable@/home/reftable" &&
     +		export GIT_REF_URI &&
    -+		! git refs list 2>err &&
    ++		test_must_fail git refs list 2>err &&
     +		test_grep "invalid reference backend uri format" err
     +	)
     +'
    @@ t/t1423-ref-backend.sh (new)
     +		cd repo &&
     +		GIT_REF_URI="reftable://" &&
     +		export GIT_REF_URI &&
    -+		! git refs list 2>err &&
    ++		test_must_fail git refs list 2>err &&
     +		test_grep "invalid path in uri" err
     +	)
     +'
     +
    ++test_expect_success 'uri ends at colon' '
    ++	test_when_finished "rm -rf repo" &&
    ++	git init --ref-format=files repo &&
    ++	(
    ++		cd repo &&
    ++		GIT_REF_URI="reftable:" &&
    ++		export GIT_REF_URI &&
    ++		test_must_fail git refs list 2>err &&
    ++		test_grep "invalid reference backend uri format" err
    ++	)
    ++'
    ++
     +test_expect_success 'unknown reference backend' '
     +	test_when_finished "rm -rf repo" &&
     +	git init --ref-format=files repo &&
    @@ t/t1423-ref-backend.sh (new)
     +		cd repo &&
     +		GIT_REF_URI="db://.git" &&
     +		export GIT_REF_URI &&
    -+		! git refs list 2>err &&
    ++		test_must_fail git refs list 2>err &&
     +		test_grep "unknown reference backend" err
     +	)
     +'
    @@ t/t1423-ref-backend.sh (new)
     +			continue
     +		fi
     +
    -+		test_expect_success 'read from other reference backend' '
    ++		test_expect_success "read from $to_format backend" '
     +			test_when_finished "rm -rf repo" &&
    -+			git init --ref-format=files repo &&
    ++			git init --ref-format=$from_format repo &&
     +			(
     +				cd repo &&
     +				test_commit 1 &&
     +				test_commit 2 &&
     +				test_commit 3 &&
     +
    -+				git refs migrate --dry-run --ref-format=reftable >out &&
    -+				REFTABLE_PATH=$(cat out | sed "s/.* ${SQ}\(.*\)${SQ}/\1/") &&
    ++				git refs migrate --dry-run --ref-format=$to_format >out &&
    ++				BACKEND_PATH=$(cat out | sed "s/.* ${SQ}\(.*\)${SQ}/\1/") &&
     +				git refs list >expect &&
    -+				GIT_REF_URI="reftable://$REFTABLE_PATH" git refs list >actual &&
    ++				GIT_REF_URI="$to_format://$BACKEND_PATH" git refs list >actual &&
     +				test_cmp expect actual
     +			)
     +		'
     +
    -+		test_expect_success 'write to other reference backend' '
    ++		test_expect_success "write to $to_format backend" '
     +			test_when_finished "rm -rf repo" &&
    -+			git init --ref-format=files repo &&
    ++			git init --ref-format=$from_format repo &&
     +			(
     +				cd repo &&
     +				test_commit 1 &&
     +				test_commit 2 &&
     +				test_commit 3 &&
     +
    -+				git refs migrate --dry-run --ref-format=reftable >out &&
    ++				git refs migrate --dry-run --ref-format=$to_format >out &&
     +				git refs list >expect &&
     +
    -+				REFTABLE_PATH=$(cat out | sed "s/.* ${SQ}\(.*\)${SQ}/\1/") &&
    -+				GIT_REF_URI="reftable://$REFTABLE_PATH" git tag -d 1 &&
    ++				BACKEND_PATH=$(cat out | sed "s/.* ${SQ}\(.*\)${SQ}/\1/") &&
    ++				GIT_REF_URI="$to_format://$BACKEND_PATH" git tag -d 1 &&
     +
     +				git refs list >actual &&
     +				test_cmp expect actual &&
     +
    -+				GIT_REF_URI="reftable://$REFTABLE_PATH" git refs list >expect &&
    ++				GIT_REF_URI="$to_format://$BACKEND_PATH" git refs list >expect &&
     +				git refs list >out &&
     +				cat out | grep -v "refs/tags/1" >actual &&
     +				test_cmp expect actual


base-commit: 9a2fb147f2c61d0cab52c883e7e26f5b7948e3ed
change-id: 20251105-kn-alternate-ref-dir-3e572e8cd0ef

Thanks
- Karthik



end of thread, other threads:[~2025-11-28  5:24 UTC | newest]

Thread overview: 6+ messages
2025-11-28  5:21 [PATCH v2 2/2] refs: add GIT_REF_URI to specify reference backend and directory Natee Korn
  -- strict thread matches above, loose matches on Subject: below --
2025-11-26 11:11 [PATCH v2 0/2] refs: allow setting the reference directory Karthik Nayak
2025-11-26 11:12 ` [PATCH v2 2/2] refs: add GIT_REF_URI to specify reference backend and directory Karthik Nayak
2025-11-26 16:17   ` Junio C Hamano
2025-11-27 14:52     ` Karthik Nayak
2025-11-27 20:02       ` Junio C Hamano
2025-11-27 21:45         ` Karthik Nayak
