* [GSoC][RFC v2] Proposal: Improve the new git repo command
@ 2026-03-16 13:04 Pushkar Singh
  2026-03-16 18:10 ` Karthik Nayak
  2026-03-18 13:42 ` [GSoC][RFC v3] " Pushkar Singh
  0 siblings, 2 replies; 4+ messages in thread
From: Pushkar Singh @ 2026-03-16 13:04 UTC
  To: git
  Cc: lucasseikioshiro, jltobler, karthik.188, siddharthasthana31,
	ayu.chandekar, christian.couder, peff, gitster

Hi Everyone,
This is the second version of my proposal for "Improve the new git repo command" in Google Summer of Code 2026. 

The Doc version:
https://docs.google.com/document/d/1HM1HNQqUrGdqFdUppc02BTmPuwXC2ozCw9mLrbaVUHc/edit?usp=sharing

I'd appreciate any feedback on this.

Thanks,
Pushkar
---------8<----------8<----------8<----------8<----------8<----------8<----------8<----------8<

GSoC 2026 @ Git | Pushkar Singh
Improve the new git repo command
---------------------------------------------------


Personal Information:
---------------------
Name: Pushkar Singh
E-mail: pushkarkumarsingh1970@gmail.com

Education: XIM University, Bhubaneswar, Odisha, India
Year: II/III
Degree: Bachelor's in Computer Science & Engineering

Time-Zone: UTC + 5:30 (IST)

Personal page: https://pushkarscripts.com/
Blog: https://medium.com/@pushkarscripts/
GitHub: https://github.com/pushkarscripts/


Pre-GSOC:
---------

I began exploring Git’s codebase by studying its documentation, 
reviewing prior mailing list discussions, and building Git from 
source. 
I focused on understanding the test framework, patch submission 
workflow using git send-email, versioned patch iteration, and 
the review culture on the mailing list.

After becoming familiar with the contribution process, I started 
submitting patches.


Contributions to Git (Chronological Order):
-------------------------------------------

* [PATCH v4] t1300: use test helpers instead of test builtins
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260104194812.15134-1-pushkarkumarsingh1970@gmail.com/t/#u
This was my first contribution, made to fulfill the microproject
criteria. It replaces legacy test -f and test -h checks with
test_path_is_file and test_path_is_symlink in the test suite.

* [PATCH v2] t1410: use test helpers in reflog rewind test
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260111191525.17087-1-pushkarkumarsingh1970@gmail.com/t/#u
Replaced raw file existence checks in the reflog rewind test 
with test_path_is_file and test_path_is_missing. The subject 
and commit message were refined in v2 following review feedback.

* [PATCH] Documentation/config: fix replacement for --get-urlmatch
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260115110832.15315-1-pushkarkumarsingh1970@gmail.com/T/#u
    Related Bug Report: https://lore.kernel.org/git/CAGJzqs=0Zr2iqsTUZdjdwpbtaS7kuBOf=E_XT=vbdfyNTKkjNQ@mail.gmail.com/t/#u
Corrected documentation that incorrectly suggested combining 
--url with --all for --get-urlmatch. Verified the behavior 
against the implementation and updated the documentation 
accordingly.

* [PATCH v3] path: refactor normalize_path_copy_len for clarity
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260221110511.1592-2-pushkarkumarsingh1970@gmail.com/t/#u
Proposed a refactor of normalize_path_copy_len to improve 
clarity while preserving existing control flow. The discussion
focused on maintaining readability and minimizing structural 
changes.

* [PATCH v4] subtree: validate --prefix against commit in split
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260203164815.68258-2-pushkarkumarsingh1970@gmail.com/T/#u
    Related Bug Report: https://lore.kernel.org/git/CAFePT4xDGegpEFuFemCXsH890E2WXnG3JzUZeiLi9KW8D8beOg@mail.gmail.com/T/#u
Updated git subtree split to validate --prefix against the 
specified commit rather than the working tree. The change 
addresses a mailing list report where --prefix was incorrectly 
validated against the current working directory instead of the 
given revision. Added regression tests and revised the patch 
across four versions following review and CI feedback before 
integration into next.

* [RFC] git repo info: expose repository paths
    Status: WIP, Under discussion
    Thread: https://lore.kernel.org/git/20260218183511.17195-1-pushkarkumarsingh1970@gmail.com/t/#mdd8548b634142f4916e2911f7025e736a4789a07
Proposed extending git repo info to expose additional repository
path-related values currently accessible via git rev-parse.
Initiated design discussion regarding path handling and output
format, incorporating feedback during iteration.

Additional Participation:

In addition to submitting patches, I have:
*  Reviewed patches from other contributors
    (1) https://lore.kernel.org/git/CALE2CrTzYbMam_fi5HszSUFVZADE1haLtpBqhUmd1ki9biM2hA@mail.gmail.com/T/#u
    (2) https://lore.kernel.org/git/20260202134657.15320-1-pushkarkumarsingh1970@gmail.com/T/#u
    (3) https://lore.kernel.org/git/CALE2CrQFZngj6_NDuf0S=_-nDrrf6b6r=C9jMyEVjwMqvh6J2w@mail.gmail.com/
    (4) https://lore.kernel.org/git/CALE2CrTuZkFm1R3Bb6gFmrN1trr88vdO_7Aw6ycBYvFpWMEEtA@mail.gmail.com/T/#u
    (5) https://lore.kernel.org/git/CALE2CrSu-JW___Lav0SnLPfwxB8QCRYMKQgsfbXCHrAQSEyDoA@mail.gmail.com/T/#u
    (6) https://lore.kernel.org/git/CALE2CrQTvHeu21yLXtRg=A6ak9AB_vvwPirQNFDjZ2AmhoTzTQ@mail.gmail.com/T/#u
    (7) https://lore.kernel.org/git/CALE2CrR_Xrei32pc_gJ16mArZPjZ-+bNWWFnsJ3i+OGqbxwPcg@mail.gmail.com/T/#u
*  Assisted in resolving a git rebase issue on the mailing list
    (1) https://lore.kernel.org/git/CALE2CrQ415Ewm_F-DLZu=JY2BTWofmGgorEOa0D=USr5d510SQ@mail.gmail.com/T/#madfc34c4334a7d62baa18b18e3c8fa83600f8455
*  Studied the original discussions on git repo
    (1) https://public-inbox.org/git/20250610152117.14826-1-lucasseikioshiro@gmail.com/t/#u
    (2) https://lore.kernel.org/git/20251207190532.67107-1-lucasseikioshiro@gmail.com/T/#u
    (3) https://lore.kernel.org/git/20260218211845.96009-1-lucasseikioshiro@gmail.com/T/#u
    (4) https://lore.kernel.org/git/20260203221758.1164434-1-jltobler@gmail.com/T/#u
*  Examined the implementation in builtin/repo.c


The Plan
--------

I will work on this project in review-driven increments. By
introducing every change in small, logically isolated patches,
I aim to keep the work clear, easy to review, and architecturally
stable.

I want to cover the foundational repository path keys first,
because they provide immediate value and map closely onto
existing rev-parse functionality.

For every key proposed or enhancement made, I will:

  - Ensure behavior matches existing helpers.
  - Clarify semantics (absolute vs relative paths, edge cases) by
    discussing on the mailing list before finalizing the behavior.
  - Add one key (or one closely tied family of keys) per patch.
  - Add targeted tests covering:
        * bare repositories
        * linked worktrees
        * submodules
        * shallow clones
  - Update documentation accordingly.

I will avoid large changes and focus on small, reviewable patches,
instead of rapidly expanding features.
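
To make the absolute-vs-relative question concrete, here is a small
Python sketch (not Git code). It renders the same directory both ways,
similar in spirit to what git rev-parse --path-format=(absolute|relative)
offers today. The function name render_path and the example paths are
hypothetical, and POSIX-style paths with an absolute cwd are assumed.

```python
import posixpath


def render_path(path, cwd, style):
    """Render `path` (absolute, or relative to `cwd`) in the requested style.

    Hypothetical helper for illustration only; POSIX paths are assumed
    and `cwd` must itself be absolute.
    """
    # Resolve against the caller's directory first, so both styles start
    # from the same canonical location.
    absolute = posixpath.normpath(posixpath.join(cwd, path))
    if style == "absolute":
        return absolute
    if style == "relative":
        return posixpath.relpath(absolute, start=cwd)
    raise ValueError(f"unknown style: {style}")


if __name__ == "__main__":
    # The same git directory, seen from a subdirectory of the worktree.
    print(render_path("/repo/.git", "/repo/src", "absolute"))
    print(render_path("/repo/.git", "/repo/src", "relative"))
```

Seen from /repo/src, the first call gives /repo/.git and the second
gives ../.git. Which of the two a key should report by default is
the kind of question I want to settle on the list first.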


Path Key Expansion
------------------

I will incrementally expose selected repository path values
currently accessible via:

  - git rev-parse
  - git rev-parse --git-path

My initial focus will be on foundational keys such as:

  path.git-dir
  path.common-dir
  path.toplevel
  path.superproject-working-tree

Subsequent patches may introduce additional --git-path
equivalents such as:

  path.index-file
  path.objects-dir
  path.config-file

Each key will be evaluated individually to ensure clarity,
necessity, and consistent semantics.
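
As a rough illustration of the intended parity, the Python sketch
below pairs each proposed key with the rev-parse invocation I would
expect it to mirror. The key names are the proposals listed above and
do not exist yet; the helper name rev_parse_args is hypothetical, and
the pairing for the --git-path keys is my assumption, still to be
confirmed on the list.

```python
# Hypothetical mapping from the proposed `git repo info` keys to the
# existing `git rev-parse` options they are meant to mirror.
# The key names are proposals, not an existing interface.
PROPOSED_KEYS = {
    "path.git-dir": ["--git-dir"],
    "path.common-dir": ["--git-common-dir"],
    "path.toplevel": ["--show-toplevel"],
    "path.superproject-working-tree": ["--show-superproject-working-tree"],
    "path.index-file": ["--git-path", "index"],
    "path.objects-dir": ["--git-path", "objects"],
    "path.config-file": ["--git-path", "config"],
}


def rev_parse_args(key):
    """Return the argv of the rev-parse command a proposed key mirrors."""
    try:
        return ["git", "rev-parse", *PROPOSED_KEYS[key]]
    except KeyError:
        raise ValueError(f"unknown key: {key}") from None


if __name__ == "__main__":
    # Print the comparison table used when checking parity by hand.
    for key in PROPOSED_KEYS:
        print(f"{key:32} {' '.join(rev_parse_args(key))}")
```

A table like this could also drive parity tests, comparing the output
of each new key against the corresponding rev-parse command.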


Optional: Category-Based Queries (If Aligned)
--------------------------------------------

If agreed upon through mailing list discussion, I will introduce
explicit grouped queries, such as:

  git repo info paths

The expansion will still be deterministic and predefined.
I'll not be introducing any implicit or dynamic grouping behavior.
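
A minimal Python sketch of what "deterministic and predefined" means
here: a category name expands through a fixed table into an ordered
list of keys, and anything not in the table is treated as a plain key.
The table contents and the function name expand are hypothetical;
the real design would be settled on the list.

```python
# Hypothetical, predefined category table: a category name expands to a
# fixed, ordered list of keys. Nothing is discovered at run time.
CATEGORIES = {
    "paths": [
        "path.git-dir",
        "path.common-dir",
        "path.toplevel",
        "path.superproject-working-tree",
    ],
}


def expand(requested):
    """Expand category names into keys, keeping order and dropping repeats."""
    seen = set()
    result = []
    for name in requested:
        # A name that is not a known category is passed through as a key.
        for key in CATEGORIES.get(name, [name]):
            if key not in seen:
                seen.add(key)
                result.append(key)
    return result


if __name__ == "__main__":
    print(expand(["paths"]))
    print(expand(["path.toplevel", "paths"]))
```

Because the table is static, the same request always yields the same
keys in the same order, which keeps the output stable for scripts.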


repo structure Enhancements
---------------------------

If maintainers deem it appropriate, I may tackle some
carefully scoped improvements to git repo structure.

Potential areas include:
  - Distribution-oriented metrics, only if aligned with the
    tool’s long-term direction.
  - Low-friction structural metrics (e.g., path depth),
    as long as they do not add excessive traversal cost.

Any such enhancement will be introduced in small,
standalone patches, taking performance, maintainability, 
and output stability into account. If scope or review
timelines demand, this stage will be delayed.


Architectural Considerations
----------------------------

Where appropriate, I will:

  - Prefer explicit repository context over global state.
  - Avoid duplicating logic already implemented in rev-parse. 
    Where possible, I'll reuse existing helper functions rather 
    than reimplementing path resolution logic.
  - Preserve conservative output stability.

Structural refactoring will only be undertaken when directly
relevant to git repo and supported through review discussion.


Timeline
--------

Keeping Git's iterative and review-driven workflow in mind, I've 
designed the timeline to focus on core enhancements in order to 
ensure that I can produce meaningful deliverables even if review 
cycles extend.


Pre-Coding Preparation (Before Official Start)

- Continue participating in git repo discussions.
- Refine and narrow the scope of path key expansion.
- Confirm semantics for absolute vs relative path handling.
- Define patch ordering to keep the submissions small
  and logically independent.


Community Bonding Period (May)

Primary objective: finalize scope and ordering.

- Confirm priority list of path keys.
- Align on output stability expectations.
- Clarify whether category-based queries are desirable
  in this cycle or deferred.
- Identify architectural considerations relevant
  to builtin/repo.c.

I will begin implementation once the semantics are reasonably
settled through mailing list discussion.


Phase 1 (Weeks 1–4): Foundational Path Keys

Objective: establish core path parity in git repo info
with essential rev-parse values.

* Weeks 1–2:
  - Submit path.git-dir
  - Submit path.common-dir

  I'll submit these foundational keys early to keep
  semantics consistent and stabilize output expectations.

* Week 3:
  - Submit path.toplevel
  - Submit path.superproject-working-tree

  These will extend inspection coverage to working-tree
  and submodule-aware contexts.

* Week 4:
  - Submit selected stable --git-path equivalents
    (e.g., path.index-file, path.objects-dir),
    introduced incrementally, one per patch.

I'll submit each key independently. Where semantics are
already agreed, I'll send further patches while earlier
ones are still under review, so that submission and
iteration can overlap.

Midpoint Goal:
 Deliver foundational path keys that are either merged or
 in next, with consensus on semantics.


Phase 2 (Weeks 5–8): Additional Path Keys & Refinement

- Finish the remaining agreed --git-path parity keys.
- Address changes from review cycles of Phase 1.
- Stabilize behaviour across edge-case environments.

This phase purposely leaves time for review-guided
iteration without expanding scope.


Phase 3 (Weeks 9–10): Optional Enhancements

Only if Phases 1 and 2 stabilize earlier than expected,
I'll begin:
- Introducing the grouped category queries (e.g., info paths),
  subject to prior agreement.
- Carefully extending repo structure with one metric 
  at a time.

I’m not going to attempt any bulk metric expansion here.


Final Weeks (Weeks 11–12): Consolidation

Over the last weeks of this program, I will:
- Address remaining review feedback.
- Adjust or rework patches as requested.
- Finalize documentation.
- Ensure CI stability and cross-platform behavior.

During this time no new features will be introduced.


Prioritization Under Constraints
--------------------------------

Considering Git’s iterative review process, I have structured the
project so that foundational improvements are delivered first.

If review cycles extend longer than anticipated, my priority will be:

1. Core path parity (path.git-dir, path.common-dir,
   path.toplevel, path.superproject-working-tree)
2. Additional agreed --git-path equivalents
3. Category-based queries
4. repo structure metric extensions

This ordering ensures that the most architecturally meaningful
enhancements are completed even if optional improvements
must be deferred.


Post-GSoC Continuation
----------------------

My involvement in Git is not limited to the GSoC period.

After the coding phase, I intend to:
- Continue refining git repo through incremental improvements.
- Address follow-up review feedback or deferred enhancements.
- Participate in reviewing related patches where appropriate.
- Contribute to ongoing efforts around repository introspection
  and gradual libification.

Over time, I hope to contribute not only through patches,
but also by helping new contributors navigate the mailing
list workflow and patch iteration process.

If given the opportunity in the future, I would be glad to
support mentoring efforts and help the community grow further.


Availability
------------

My end-semester examinations conclude on March 28.
Following this, I will not have academic obligations
during the GSoC coding period.

The project is expected to fall within the 175–350 hour
range. I am prepared to commit at the higher end of this
range.

During the official coding phase (approximately 12 weeks),
I will be available for 30–35 hours per week. This allows
for approximately 360–420 hours of focused development time,
comfortably covering the expected project scope.

I will also remain active on the mailing list during the
community bonding period and will use that time to refine
design decisions and prepare patch sequencing.

I do not anticipate any internships, travel, or major
commitments that would interfere with this schedule.


Blogging:
---------

For the past year I have been writing technical articles
on Medium, mostly about Git workflows, developer tooling,
and lessons from working with real codebases.

During the GSoC period I will share weekly updates documenting
progress and the mailing list discussions, for transparency
and, more importantly, to help future contributors.

Medium: https://medium.com/@pushkarscripts


Risk Assessment and Mitigation
------------------------------

1. Review Cycle Duration

Given Git’s iterative mailing list workflow, patches may
go through several revisions before being accepted.

Mitigation:
  The project is structured so that foundational path
  keys are delivered first. Independent patches allow
  parallel review and refinement.

2. Scope Creep

Expanding both path keys and structure metrics
may introduce unintended scope growth.

Mitigation:
  Optional enhancements (categories and additional
  metrics) are explicitly deferred until foundational
  work stabilizes.

3. Semantic Ambiguity

Path-related behavior (absolute vs relative,
worktree interactions, submodules) may require
careful alignment.

Mitigation:
  Semantics will be clarified during the bonding
  period and validated against existing helpers
  before implementation.

---

Thank you for your time and consideration. I look forward
to contributing further to the project and continuing to
learn through the review process.

Regards, 
Pushkar Singh

---------8<----------8<----------8<----------8<----------8<----------8<----------8<----------8<

Changes in v2:
- Updated status of my recent patch activities.
- Added recent patch reviews I made on the mailing list.
- Improved clarity and readability across sections. 


* Re: [GSoC][RFC v2] Proposal: Improve the new git repo command
  2026-03-16 13:04 [GSoC][RFC v2] Proposal: Improve the new git repo command Pushkar Singh
@ 2026-03-16 18:10 ` Karthik Nayak
  2026-03-17 17:20   ` Pushkar Singh
  2026-03-18 13:42 ` [GSoC][RFC v3] " Pushkar Singh
  1 sibling, 1 reply; 4+ messages in thread
From: Karthik Nayak @ 2026-03-16 18:10 UTC
  To: Pushkar Singh, git
  Cc: lucasseikioshiro, jltobler, siddharthasthana31, ayu.chandekar,
	christian.couder, peff, gitster

Pushkar Singh <pushkarkumarsingh1970@gmail.com> writes:

Hello Pushkar,

Thanks for your proposal.

[snip]


> The Plan
> --------
>
> I will work on this project in review-driven increments. By
> introducing every change in small, logically isolated patches,
> I aim to keep the work clear, easy to review, and architecturally
> stable.
>
> I want to cover the foundational repository path keys first,
> because they provide immediate value and map closely onto
> existing rev-parse functionality.
>
> For every key proposed or enhancement made, I will:
>
>   - Ensure behavior matches existing helpers.

Which existing helpers?

>   - Clarify semantics (absolute vs relative paths, edge cases) by
>     discussing on the mailing list before finalizing the behavior.

It would be nice if you explained a bit about this: what is the current
condition, what are your thoughts, and what do you plan to implement?

>   - Add one key (or one closely tied family of keys) per patch.
>   - Add targeted tests covering:
>         * bare repositories
>         * linked worktrees
>         * submodules
>         * shallow clones

I'd be very interested in what the current test scenario looks like and
how we'll improve on top of that.

>   - Update documentation accordingly.
>
> I will avoid large changes and focus on small, reviewable patches,
> instead of rapidly expanding features.
>

I agree; students often underestimate the time needed for iterating
and getting reviews on the mailing list. Having smaller, well-defined
patches helps.

>
> Path Key Expansion
> ------------------
>
> I will incrementally expose selected repository path values
> currently accessible via:
>
>   - git rev-parse
>   - git rev-parse --git-path
>
> My initial focus will be on foundational keys such as:
>
>   path.git-dir
>   path.common-dir
>   path.toplevel
>   path.superproject-working-tree
>
> Subsequent patches may introduce additional --git-path
> equivalents such as:
>
>   path.index-file
>   path.objects-dir
>   path.config-file
>
> Each key will be evaluated individually to ensure clarity,
> necessity, and consistent semantics.
>
>
> Optional: Category-Based Queries (If Aligned)
> --------------------------------------------
>
> If agreed upon through mailing list discussion, I will introduce
> explicit grouped queries, such as:
>
>   git repo info paths
>
> The expansion will still be deterministic and predefined.
> I'll not be introducing any implicit or dynamic grouping behavior.
>
>
> repo structure Enhancements
> ---------------------------
>
> If maintainers deem it appropriate, I may tackle some
> carefully scoped improvements to git repo structure.
>
> Potential areas include:
>   - Distribution-oriented metrics, only if aligned with the
>     tool’s long-term direction.
>   - Low-friction structural metrics (e.g., path depth),
>     as long as they do not add excessive traversal cost.
>
> Any such enhancement will be introduced in small,
> standalone patches, taking performance, maintainability,
> and output stability into account. If scope or review
> timelines demand, this stage will be delayed.
>

I'm curious to know about the timeline and how this fits into it.
Reading along.

>
> Architectural Considerations
> ----------------------------
>
> Where appropriate, I will:
>
>   - Prefer explicit repository context over global state.
>   - Avoid duplicating logic already implemented in rev-parse.
>     Where possible, I'll reuse existing helper functions rather
>     than reimplementing path resolution logic.
>   - Preserve conservative output stability.
>

I'm not sure what the last sentence here means.

> Structural refactoring will only be undertaken when directly
> relevant to git repo and supported through review discussion.
>
>
> Timeline
> --------
>
> Keeping Git's iterative and review-driven workflow in mind, I've
> designed the timeline to focus on core enhancements in order to
> ensure that I can produce meaningful deliverables even if review
> cycles extend.
>
>
> Pre-Coding Preparation (Before Official Start)
>
> - Continue participating in git repo discussions.
> - Refine and narrow the scope of path key expansion.
> - Confirm semantics for absolute vs relative path handling.
> - Define patch ordering to keep the submissions small
>   and logically independent.
>
>
> Community Bonding Period (May)
>
> Primary objective: finalize scope and ordering.
>
> - Confirm priority list of path keys.
> - Align on output stability expectations.
> - Clarify whether category-based queries are desirable
>   in this cycle or deferred.

In this cycle? If we do go with category-based queries, isn't that a
design choice which affects all 'git repo info' keys? Would we need to
specifically solve for path keys?

> - Identify architectural considerations relevant
>   to builtin/repo.c.
>

What do you mean by this?

> I will begin implementation once the semantics are reasonably
> settled through mailing list discussion.
>
>
> Phase 1 (Weeks 1–4): Foundational Path Keys
>
> Objective: establish core path parity in git repo info
> with essential rev-parse values.
>
> * Weeks 1–2:
>   - Submit path.git-dir
>   - Submit path.common-dir
>
>   I'll submit these foundational keys early to keep
>   semantics consistent and stabilize output expectations.
>
> * Week 3:
>   - Submit path.toplevel
>   - Submit path.superproject-working-tree
>
>   These will extend inspection coverage to working-tree
>   and submodule-aware contexts.
>
> * Week 4:
>   - Submit selected stable --git-path equivalents
>     (e.g., path.index-file, path.objects-dir),
>     introduced incrementally, one per patch.
>
> I'll submit each key independently. Where semantics are
> already agreed, I'll send further patches while earlier
> ones are still under review, so that submission and
> iteration can overlap.
>
> Midpoint Goal:
>  Deliver foundational path keys that are either merged or
>  in next, with consensus on semantics.
>
>
> Phase 2 (Weeks 5–8): Additional Path Keys & Refinement
>
> - Finish the remaining agreed --git-path parity keys.
> - Address changes from review cycles of Phase 1.
> - Stabilize behaviour across edge-case environments.
>
> This phase purposely leaves time for review-guided
> iteration without expanding scope.
>
>
> Phase 3 (Weeks 9–10): Optional Enhancements
>
> Only if Phases 1 and 2 stabilize earlier than expected,
> I'll begin:
> - Introducing the grouped category queries (e.g., info paths),
>   subject to prior agreement.
> - Carefully extending repo structure with one metric
>   at a time.
>
> I’m not going to attempt any bulk metric expansion here.
>
>
> Final Weeks (Weeks 11–12): Consolidation
>
> Over the last weeks of this program, I will:
> - Address remaining review feedback.
> - Adjust or rework patches as requested.
> - Finalize documentation.
> - Ensure CI stability and cross-platform behavior.
>
> During this time no new features will be introduced.
>

Something we'd also like to see is whether you have other events which might
affect the timeline, like exams at college. If not, it's worthwhile to call
that out.

>
> Prioritization Under Constraints
> --------------------------------
>
> Considering Git’s iterative review process, I have structured the
> project so that foundational improvements are delivered first.
>
> If review cycles extend longer than anticipated, my priority will be:
>
> 1. Core path parity (path.git-dir, path.common-dir,
>    path.toplevel, path.superproject-working-tree)
> 2. Additional agreed --git-path equivalents
> 3. Category-based queries
> 4. repo structure metric extensions
>
> This ordering ensures that the most architecturally meaningful
> enhancements are completed even if optional improvements
> must be deferred.
>
>
> Post-GSoC Continuation
> ----------------------
>
> My involvement in Git is not limited to the GSoC period.
>
> After the coding phase, I intend to:
> - Continue refining git repo through incremental improvements.
> - Address follow-up review feedback or deferred enhancements.
> - Participate in reviewing related patches where appropriate.
> - Contribute to ongoing efforts around repository introspection
>   and gradual libification.
>
> Over time, I hope to contribute not only through patches,
> but also by helping new contributors navigate the mailing
> list workflow and patch iteration process.
>
> If given the opportunity in the future, I would be glad to
> support mentoring efforts and help the community grow further.
>
>
> Availability
> ------------
>

Ah, seems like you do go over it.

> My end-semester examinations conclude on March 28.
> Following this, I will not have academic obligations
> during the GSoC coding period.
>
> The project is expected to fall within the 175–350 hour
> range. I am prepared to commit at the higher end of this
> range.
>
> During the official coding phase (approximately 12 weeks),
> I will be available for 30–35 hours per week. This allows
> for approximately 360–420 hours of focused development time,
> comfortably covering the expected project scope.
>
> I will also remain active on the mailing list during the
> community bonding period and will use that time to refine
> design decisions and prepare patch sequencing.
>
> I do not anticipate any internships, travel, or major
> commitments that would interfere with this schedule.
>
>
> Blogging:
> ---------
>
> For the past year I have been writing technical articles
> on Medium, mostly about Git workflows, developer tooling,
> and lessons from working with real codebases.
>
> During the GSoC period I will share weekly updates documenting
> progress and the mailing list discussions, for transparency
> and, more importantly, to help future contributors.
>
> Medium: https://medium.com/@pushkarscripts
>
>
> Risk Assessment and Mitigation
> ------------------------------
>
> 1. Review Cycle Duration
>
> Given Git’s iterative mailing list workflow, patches may
> go through several revisions before being accepted.
>
> Mitigation:
>   The project is structured so that foundational path
>   keys are delivered first. Independent patches allow
>   parallel review and refinement.
>
> 2. Scope Creep
>
> Expanding both path keys and structure metrics
> may introduce unintended scope growth.
>
> Mitigation:
>   Optional enhancements (categories and additional
>   metrics) are explicitly deferred until foundational
>   work stabilizes.
>
> 3. Semantic Ambiguity
>
> Path-related behavior (absolute vs relative,
> worktree interactions, submodules) may require
> careful alignment.
>
> Mitigation:
>   Semantics will be clarified during the bonding
>   period and validated against existing helpers
>   before implementation.
>
> ---
>
> Thank you for your time and consideration. I look forward
> to contributing further to the project and continuing to
> learn through the review process.
>
> Regards,
> Pushkar Singh
>
> ---------8<----------8<----------8<----------8<----------8<----------8<----------8<----------8<
>
> Changes in v2:
> - Updated status of my recent patch activities.
> - Added recent patch reviews I made on the mailing list.
> - Improved clarity and readability across sections.

Regards,
Karthik


* Re: [GSoC][RFC v2] Proposal: Improve the new git repo command
  2026-03-16 18:10 ` Karthik Nayak
@ 2026-03-17 17:20   ` Pushkar Singh
  0 siblings, 0 replies; 4+ messages in thread
From: Pushkar Singh @ 2026-03-17 17:20 UTC
  To: Karthik Nayak
  Cc: git, lucasseikioshiro, jltobler, siddharthasthana31,
	ayu.chandekar, christian.couder, peff, gitster

Hi Karthik,
Thanks for your review, this was very helpful.

> Which existing helpers?

I was referring to helpers used by git rev-parse and related path
resolution logic, such as repo_git_pathv() and repo_common_path().
I’ll make sure to explicitly clarify this and ensure reuse of existing
helpers instead of introducing new path handling logic.

> It would be nice if you explained a bit about this: what is the current
> condition, what are your thoughts, and what do you plan to implement?

Makes sense. I’ll expand this section to describe the current ambiguity
around absolute vs relative paths, and outline the approaches being
discussed along with what I plan to follow.

> I'd be very interested in what the current test scenario looks like
> and how we'll improve on top of that.

Got it. I’ll include the current coverage in t1900-repo-info.sh and
describe how I plan to extend it across different repository setups.

> I'm not sure what the last sentence here means.

Understood. I’ll rephrase this to make it more concrete.

> In this cycle? If we do go with category-based queries, isn't that a
> design choice which affects all git repo info keys? Would we need to
> specifically solve for path keys?

That’s a good point. I’ll clarify this and describe category-based
queries as a general design choice rather than something tied only to
path keys.

> What do you mean by this?

I meant identifying areas in builtin/repo.c where structural changes
may be required while adding new fields. I’ll make this more precise.

I’ll incorporate these changes and send a revised version (v3) shortly.

Regards,
Pushkar


* [GSoC][RFC v3] Proposal: Improve the new git repo command
  2026-03-16 13:04 [GSoC][RFC v2] Proposal: Improve the new git repo command Pushkar Singh
  2026-03-16 18:10 ` Karthik Nayak
@ 2026-03-18 13:42 ` Pushkar Singh
  1 sibling, 0 replies; 4+ messages in thread
From: Pushkar Singh @ 2026-03-18 13:42 UTC
  To: pushkarkumarsingh1970
  Cc: ayu.chandekar, christian.couder, git, gitster, jltobler,
	karthik.188, lucasseikioshiro, peff, siddharthasthana31

Hi Everyone,
This is the third version of my proposal for "Improve the new git repo command" in Google Summer of Code 2026, updated based on Karthik’s feedback.

The Doc version:
https://docs.google.com/document/d/1HM1HNQqUrGdqFdUppc02BTmPuwXC2ozCw9mLrbaVUHc/edit?usp=sharing

I'd appreciate any feedback on this.

Thanks,
Pushkar
---------8<----------8<----------8<----------8<----------8<----------8<----------8<----------8<

GSoC 2026 @ Git | Pushkar Singh
Improve the new git repo command
---------------------------------------------------


Personal Information:
---------------------
Name: Pushkar Singh
E-mail: pushkarkumarsingh1970@gmail.com

Education: XIM University, Bhubaneswar, Odisha, India
Year: II/III
Degree: Bachelor's in Computer Science & Engineering

Time-Zone: UTC + 5:30 (IST)

Personal page: https://pushkarscripts.com/
Blog: https://medium.com/@pushkarscripts/
GitHub: https://github.com/pushkarscripts/


Pre-GSOC:
---------

I began exploring Git’s codebase by studying its documentation, 
reviewing prior mailing list discussions, and building Git from 
source. 
I focused on understanding the test framework, patch submission 
workflow using git send-email, versioned patch iteration, and 
the review culture on the mailing list.

After becoming familiar with the contribution process, I started 
submitting patches.


Contributions to Git (Chronological Order):
-------------------------------------------

* [PATCH v4] t1300: use test helpers instead of test builtins
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260104194812.15134-1-pushkarkumarsingh1970@gmail.com/t/#u
This was my first contribution, made to fulfill the microproject
criteria. It replaces legacy test -f and test -h checks with
test_path_is_file and test_path_is_symlink in the test suite.

* [PATCH v2] t1410: use test helpers in reflog rewind test
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260111191525.17087-1-pushkarkumarsingh1970@gmail.com/t/#u
Replaced raw file existence checks in the reflog rewind test 
with test_path_is_file and test_path_is_missing. The subject 
and commit message were refined in v2 following review feedback.

* [PATCH] Documentation/config: fix replacement for --get-urlmatch
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260115110832.15315-1-pushkarkumarsingh1970@gmail.com/T/#u
    Related Bug Report: https://lore.kernel.org/git/CAGJzqs=0Zr2iqsTUZdjdwpbtaS7kuBOf=E_XT=vbdfyNTKkjNQ@mail.gmail.com/t/#u
Corrected documentation that incorrectly suggested combining 
--url with --all for --get-urlmatch. Verified the behavior 
against the implementation and updated the documentation 
accordingly.

* [PATCH v3] path: refactor normalize_path_copy_len for clarity
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260221110511.1592-2-pushkarkumarsingh1970@gmail.com/t/#u
Proposed a refactor of normalize_path_copy_len to improve 
clarity while preserving existing control flow. The discussion
focused on maintaining readability and minimizing structural 
changes.

* [PATCH v4] subtree: validate --prefix against commit in split
    Status: Merged into master
    Thread: https://lore.kernel.org/git/20260203164815.68258-2-pushkarkumarsingh1970@gmail.com/T/#u
    Related Bug Report: https://lore.kernel.org/git/CAFePT4xDGegpEFuFemCXsH890E2WXnG3JzUZeiLi9KW8D8beOg@mail.gmail.com/T/#u
Updated git subtree split to validate --prefix against the 
specified commit rather than the working tree. The change 
addresses a mailing list report where --prefix was incorrectly 
validated against the current working directory instead of the 
given revision. Added regression tests and revised the patch 
across four versions following review and CI feedback before 
integration into next.

* [RFC] git repo info: expose repository paths
    Status: WIP, Under discussion
    Thread: https://lore.kernel.org/git/20260218183511.17195-1-pushkarkumarsingh1970@gmail.com/t/#mdd8548b634142f4916e2911f7025e736a4789a07
Proposed extending git repo info to expose additional repository
path-related values currently accessible via git rev-parse.
Initiated design discussion regarding path handling and output
format, incorporating feedback during iteration.

Additional Participation:

In addition to submitting patches, I have:
*  Reviewed patches from other contributors
    (1) https://lore.kernel.org/git/CALE2CrTzYbMam_fi5HszSUFVZADE1haLtpBqhUmd1ki9biM2hA@mail.gmail.com/T/#u
    (2) https://lore.kernel.org/git/20260202134657.15320-1-pushkarkumarsingh1970@gmail.com/T/#u
    (3) https://lore.kernel.org/git/CALE2CrQFZngj6_NDuf0S=_-nDrrf6b6r=C9jMyEVjwMqvh6J2w@mail.gmail.com/
    (4) https://lore.kernel.org/git/CALE2CrTuZkFm1R3Bb6gFmrN1trr88vdO_7Aw6ycBYvFpWMEEtA@mail.gmail.com/T/#u
    (5) https://lore.kernel.org/git/CALE2CrSu-JW___Lav0SnLPfwxB8QCRYMKQgsfbXCHrAQSEyDoA@mail.gmail.com/T/#u
    (6) https://lore.kernel.org/git/CALE2CrQTvHeu21yLXtRg=A6ak9AB_vvwPirQNFDjZ2AmhoTzTQ@mail.gmail.com/T/#u
    (7) https://lore.kernel.org/git/CALE2CrR_Xrei32pc_gJ16mArZPjZ-+bNWWFnsJ3i+OGqbxwPcg@mail.gmail.com/T/#u
*  Assisted in resolving a git rebase issue on the mailing list
    (1) https://lore.kernel.org/git/CALE2CrQ415Ewm_F-DLZu=JY2BTWofmGgorEOa0D=USr5d510SQ@mail.gmail.com/T/#madfc34c4334a7d62baa18b18e3c8fa83600f8455
*  Studied the original discussions on git repo
    (1) https://public-inbox.org/git/20250610152117.14826-1-lucasseikioshiro@gmail.com/t/#u
    (2) https://lore.kernel.org/git/20251207190532.67107-1-lucasseikioshiro@gmail.com/T/#u
    (3) https://lore.kernel.org/git/20260218211845.96009-1-lucasseikioshiro@gmail.com/T/#u
    (4) https://lore.kernel.org/git/20260203221758.1164434-1-jltobler@gmail.com/T/#u
*  Examined the implementation in builtin/repo.c


The Plan
--------

I will work on this project in blocks, following a review-driven
approach. By introducing every change in small, logically isolated
patches, I'll ensure clarity, ease of review, and architectural
stability.

First, I want to cover the foundational repository path keys,
because they provide immediate structural value and map most
closely onto existing rev-parse functionality.

For every key proposed or enhancement made, I will:

  - Ensure behavior matches the helpers currently used by git
    rev-parse, such as repo_get_work_tree(), repo_get_common_dir(),
    repo_get_git_dir(), and the repo_git_path*() family, along with
    print_path() for formatting.
  - Currently, git rev-parse outputs paths based on options like
    --path-format and also varies depending on the specific command
    (e.g., --git-dir, --show-toplevel). This can lead to slightly
    different behaviors across commands. The expected behavior for
    git repo info is still under discussion (e.g., whether paths
    should always be absolute, or configurable).
    During the bonding period, I plan to:
      - Review existing discussions on path formatting
      - Align on a consistent default behavior
      - Ensure compatibility with existing expectations from rev-parse 
  - Category-based queries represent a broader design decision affecting 
    the overall interface of git repo info, not just path-related fields. 
    I will evaluate their impact and ensure they integrate cleanly with 
    existing key-based querying before proposing an implementation.
  - Add one key (or one closely tied family of keys) per patch.
  - The current test coverage in t1900-repo-info.sh focuses on:
      - Basic key-value retrieval
      - Output formats (lines, nul)
      - Error handling for invalid keys, formats, and flag combinations.
      - Repository configurations such as bare repositories, shallow 
        clones, object format, and reference format.
    However, it does not yet cover:
      - Path-related keys (e.g., path.git-dir, path.toplevel)
      - More complex repository setups such as linked worktrees and submodules
      - Edge cases involving relative vs absolute path behavior
    I plan to extend the test suite to include these scenarios, ensuring 
    consistent behavior across different repository layouts while 
    preserving existing invariants.
  - Update documentation accordingly.
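
To make the two output formats named in the coverage list concrete,
here is a small Python model (not Git's C code). It is a sketch under
these assumptions: the lines format prints one key=value record per
line, and the nul format puts a newline between key and value and a
NUL after each value. Quoting of unusual characters is left out, the
render() helper is hypothetical, and the path value is made up.

```python
def render(pairs, fmt="lines"):
    """Model of the two output formats; pairs is a list of (key, value)."""
    if fmt == "lines":
        # One "key=value" record per line.
        return "".join(f"{key}={value}\n" for key, value in pairs)
    if fmt == "nul":
        # Newline between key and value, NUL after each value.
        return "".join(f"{key}\n{value}\0" for key, value in pairs)
    raise ValueError(f"unknown format: {fmt}")


# Illustrative values only; path.git-dir is a proposed key, not an existing one.
pairs = [("layout.bare", "false"), ("path.git-dir", "/repo/.git")]
print(render(pairs, "lines"), end="")
print(repr(render(pairs, "nul")))
```

A path-key test would need to check both renderings, since a path
containing a newline is only unambiguous in the nul format.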

I will avoid large changes and focus on small, reviewable patches
rather than rapidly expanding the feature set.


Path Key Expansion
------------------

I will incrementally expose selected repository path values
currently accessible via:

  - git rev-parse
  - git rev-parse --git-path

My initial focus will be on foundational keys such as:

  path.git-dir
  path.common-dir
  path.toplevel
  path.superproject-working-tree

Subsequent patches may introduce additional --git-path
equivalents such as:

  path.index-file
  path.objects-dir
  path.config-file

Each key will be evaluated individually to ensure clarity,
necessity, and consistent semantics.
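
One way to keep track of parity is a table from each proposed key to
the rev-parse invocation that exposes a comparable value today. The
Python sketch below is only a planning aid: the key names are the
proposals listed above, the pairing of each key with a rev-parse
option is my assumption, and rev_parse_argv() is a hypothetical
helper. The rev-parse options themselves are existing ones.

```python
# Assumed mapping from proposed keys to the rev-parse arguments that
# expose a comparable value today. The keys are proposals, not existing ones.
REV_PARSE_EQUIVALENTS = {
    "path.git-dir": ["--git-dir"],
    "path.common-dir": ["--git-common-dir"],
    "path.toplevel": ["--show-toplevel"],
    "path.superproject-working-tree": ["--show-superproject-working-tree"],
    "path.index-file": ["--git-path", "index"],
    "path.objects-dir": ["--git-path", "objects"],
    "path.config-file": ["--git-path", "config"],
}


def rev_parse_argv(key, path_format=None):
    """Build the argv that a parity test could compare against."""
    argv = ["git", "rev-parse"]
    if path_format is not None:
        # --path-format must come before the option it should affect.
        argv.append(f"--path-format={path_format}")
    return argv + REV_PARSE_EQUIVALENTS[key]


print(rev_parse_argv("path.index-file", "absolute"))
```

A test could run the resulting command and compare its output with
what git repo info reports for the same key.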


repo structure Enhancements
---------------------------

If maintainers deem it appropriate, I will also tackle some
carefully scoped improvements to git repo structure.

Potential areas include:
  - Distribution-oriented metrics, only if aligned with the
    tool’s long-term direction.
  - Low-friction structural metrics (e.g., path depth),
    as long as they do not add excessive traversal cost.

Any such enhancement will be introduced in small,
standalone patches, taking performance, maintainability, 
and output stability into account. 

If scope or review timelines demand, I can push this work to 
later stages of GSoC or continue it after GSoC.


Architectural Considerations
----------------------------

Where appropriate, I will:

  - Prefer explicit repository context over global state.
  - Avoid duplicating logic already implemented in rev-parse. 
    Where possible, I'll reuse existing helper functions rather 
    than reimplementing path resolution logic.
  - Ensure that newly added fields do not change existing output 
    behavior (ordering, formats, or flag behavior), and remain 
    consistent with Git’s scripting-friendly output.

Structural refactoring will only be undertaken when directly
relevant to git repo and supported through review discussion.


Timeline
--------

Keeping Git's iterative and review-driven workflow in mind, I've 
designed the timeline to focus on core enhancements in order to 
ensure that I can produce meaningful deliverables even if review 
cycles extend.


Pre-Coding Preparation (Before Official Start)

- Continue participating in git repo discussions.
- Refine and narrow the scope of path key expansion.
- Confirm semantics for absolute vs relative path handling.
- Define patch ordering to keep the submissions small
  and logically independent.


Community Bonding Period (May)

Primary objective: finalize scope and ordering.

- Confirm priority list of path keys.
- Align on output stability expectations.
- Evaluate category-based queries as a broader interface design 
  decision affecting all keys, and determine whether they should 
  be introduced in this cycle or deferred.
- Identify specific areas in builtin/repo.c where new fields 
  can be integrated cleanly, reusing existing helpers and 
  maintaining consistency with the current key-to-field dispatch 
  structure.
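
As I understand it, the key-to-field dispatch amounts to a table of
(key, getter) pairs kept in sorted order and searched by key. The
Python model below illustrates that idea only; it is not the code in
builtin/repo.c. The getter names, the dict used as a repository
description, and the path.git-dir entry are all hypothetical.

```python
from bisect import bisect_left


# Hypothetical getters; each takes a repository description (a dict here).
def get_layout_bare(repo):
    return "true" if repo["bare"] else "false"


def get_object_format(repo):
    return repo["object_format"]


def get_path_git_dir(repo):  # proposed key, does not exist yet
    return repo["git_dir"]


# Table kept sorted by key so that lookup can use binary search.
FIELDS = [
    ("layout.bare", get_layout_bare),
    ("object.format", get_object_format),
    ("path.git-dir", get_path_git_dir),
]
KEYS = [key for key, _ in FIELDS]
assert KEYS == sorted(KEYS), "table must stay sorted when keys are added"


def lookup(key):
    """Return the getter for key, or None if the key is unknown."""
    i = bisect_left(KEYS, key)
    if i < len(KEYS) and KEYS[i] == key:
        return FIELDS[i][1]
    return None
```

Under this model, adding a key means adding one getter and one table
row in the right position, which fits the one-key-per-patch plan.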

I will begin implementation once the semantics are reasonably
aligned through mailing list discussion.


Phase 1 (Weeks 1–4): Foundational Path Keys

Objective: establish core path parity in git repo info
with essential rev-parse values.

* Weeks 1–2:
  - Submit path.git-dir
  - Submit path.common-dir

  I'll present these foundational keys early to keep
  semantics consistent and stabilize output expectations.

* Week 3:
  - Submit path.toplevel
  - Submit path.superproject-working-tree

  These will provide working-tree inspection coverage and
  support submodule-aware contexts.

* Week 4:
  - Submit selected stable --git-path equivalents
    (e.g., path.index-file, path.objects-dir),
    introduced incrementally, one per patch.

I'll submit each key independently. Where semantics are
already agreed, I'll send follow-up patches while earlier
ones are still under review, so that submission and
iteration can overlap.

Midpoint Goal:
 Deliver foundational path keys that are either merged or
 in next, with consensus on semantics.


Phase 2 (Weeks 5–8): Additional Path Keys & Refinement

- Finish the remaining agreed --git-path parity keys.
- Address changes from review cycles of Phase 1.
- Stabilize behaviour across edge-case environments.

This phase purposely leaves time for review-guided
iteration without expanding scope.


Phase 3 (Weeks 9–10): Optional Enhancements

Only if Phase 1 and 2 stabilize earlier than expected,
I'll begin:
- Introducing grouped category queries (e.g., info paths),
  subject to prior agreement.
- Carefully extending repo structure with one metric 
  at a time.

I’m not going to attempt any bulk metric expansion here.


Final Weeks (Weeks 11–12): Consolidation

Over the last weeks of this program, I will:
- Address remaining review feedback.
- Rework or adjust patches as requested.
- Finalize documentation.
- Ensure CI stability and cross-platform behavior.

During this time no new features will be introduced.


Prioritization Under Constraints
--------------------------------

Considering Git’s iterative review process, I have structured the
project so that foundational improvements are delivered first.

If review cycles extend longer than anticipated, my priority will be:

1. Core path parity (path.git-dir, path.common-dir,
   path.toplevel, path.superproject-working-tree)
2. Additional agreed --git-path equivalents
3. Category-based queries
4. repo structure metric extensions

This ordering ensures that the most architecturally meaningful
enhancements are completed even if optional improvements
must be deferred.


Post-GSoC Continuation
----------------------

My involvement in Git is not limited to the GSoC period.

After the coding phase, I intend to:
- Continue refining git repo through incremental improvements.
- Address follow-up review feedback or deferred enhancements.
- Participate in reviewing related patches where appropriate.
- Contribute to ongoing efforts around repository introspection
  and gradual libification.

Over time, I hope to contribute not only through patches,
but also by helping new contributors navigate the mailing
list workflow and patch iteration process.

If given the opportunity in the future, I would be glad to
support mentoring efforts and help the community grow further.


Availability
------------

My end-semester examinations conclude on March 28.
Following this, I will not have academic obligations
during the GSoC coding period.

The project is expected to fall within the 175–350 hour
range. I am prepared to commit at the higher end of this
range.

During the official coding phase (approximately 12 weeks),
I will be available for 30–35 hours per week. This allows
for approximately 360–420 hours of focused development time,
comfortably covering the expected project scope.

I will also remain active on the mailing list during the
community bonding period and will use that time to refine
design decisions and prepare patch sequencing.

I do not anticipate any internships, travel, or major
commitments that would interfere with this schedule.


Blogging:
---------

For the past year, I have been writing technical articles
on Medium, mostly related to Git workflows, developer tooling,
and lessons from working with real codebases.

I will share weekly updates during the GSoC period to document
progress and the discussions on the mailing list, both for
transparency and, more importantly, to help future contributors.

Medium: https://medium.com/@pushkarscripts


Risk Assessment and Mitigation
------------------------------

1. Review Cycle Duration

Given Git’s iterative mailing list workflow, patches may go
through several revisions before being accepted.

Mitigation:
  The project is structured so that foundational path
  keys are delivered first. Independent patches allow
  parallel review and refinement.

2. Scope Creep

Expanding both path keys and structure metrics
may introduce unintended scope growth.

Mitigation:
  Optional enhancements (categories and additional
  metrics) are explicitly deferred until foundational
  work stabilizes.

3. Semantic Ambiguity

Path-related behavior (absolute vs relative,
worktree interactions, submodules) may require
careful alignment.

Mitigation:
  Semantics will be clarified during the bonding
  period and validated against existing helpers
  before implementation.
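
The absolute-versus-relative question can be illustrated with a
short Python sketch. It only models the two choices offered by
rev-parse's --path-format option (absolute or relative to the
current directory). The format_path() helper and the directory
layout are made up, and it does not decide which default git repo
info should use.

```python
import posixpath


def format_path(path, cwd, path_format="absolute"):
    """Render an absolute path either as-is or relative to cwd."""
    if path_format == "absolute":
        return posixpath.normpath(path)
    if path_format == "relative":
        return posixpath.relpath(path, start=cwd)
    raise ValueError(f"unknown path format: {path_format}")


# Made-up layout: repository at /repo, command run from /repo/src/lib.
print(format_path("/repo/.git", "/repo/src/lib", "absolute"))  # /repo/.git
print(format_path("/repo/.git", "/repo/src/lib", "relative"))  # ../../.git
```

The relative form changes with the directory the command is run
from, which is why the default needs to be settled before tests for
linked worktrees and submodules are written.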

---

Thank you for your time and consideration. I look forward
to contributing further to the project and continuing to
learn through the review process.

Regards, 
Pushkar Singh

---------8<----------8<----------8<----------8<----------8<----------8<----------8<----------8<

Changes in v3:
- Clarified which existing helpers will be reused for path handling
- Expanded discussion on path semantics (absolute vs relative)
- Added details on current test coverage and planned improvements
- Clarified expectations around output behavior
- Reworked category-based queries as a broader design decision
- Improved clarity in architectural sections 


end of thread, other threads:[~2026-03-18 13:42 UTC | newest]

Thread overview: 4+ messages
2026-03-16 13:04 [GSoC][RFC v2] Proposal: Improve the new git repo command Pushkar Singh
2026-03-16 18:10 ` Karthik Nayak
2026-03-17 17:20   ` Pushkar Singh
2026-03-18 13:42 ` [GSoC][RFC v3] " Pushkar Singh
