From: Arnav Bhate <bhatearnav@gmail•com>
To: git@vger•kernel.org
Cc: Patrick Steinhardt <ps@pks•im>
Subject: [GSoC PROPOSAL v2] Refactoring in order to reduce Git’s global state
Date: Sun, 6 Apr 2025 00:11:16 +0530
Message-ID: <7116fc77-2280-4bd1-b2f2-131e1108b8ce@gmail.com>
In-Reply-To: <1077615a-1c31-416d-a754-58b36d404289@gmail.com>
## Personal Information
- Full name: Arnav Akshaya Bhate
- Email address: bhatearnav@gmail•com
- Mobile no.: +91 8291328838
- Time zone: UTC+05:30
- Education: IIT Bombay
- Year: Second year
- GitHub: https://github.com/arnavbhate
## About Me
I'm Arnav Bhate, a second-year UG student at the Indian Institute of
Technology Bombay. I love coding, so I am a member of IIT Bombay's
Developers' Community (DevCom), a group of roughly 40 people
developing software for use by students and staff of the institute. Most
of the software developed is not open source, so I cannot include
examples of my work there in this proposal. Being a member of DevCom has
exposed me to collaborative software development.
A common link in all software I have worked on is that Git has been used
for version control. I thus see this project as my way of giving back to
the Git community in particular and open source in general. This will be
my first significant contribution to the open source community, and I
wish to stick around afterwards.
## Overview
Git currently uses many global variables, most significantly
`the_repository`, which are included in roughly 290 files. Apart from
`the_repository`, there are many other global variables, some of which
logically belong in `struct repository`, as they represent information
specific to a repository. So even if all instances of `the_repository`
were converted into an extra repository argument for the function, there
would still be many global variables left.
The use of such variables assumes that Git will only operate on one
repository at a time, which renders multi-repository handling
impossible without kludges.
This project aims to move such variables from global scope into more
appropriate local contexts, mainly `struct repository` and
`struct repo_settings`. This will not only make the environment
repository-specific, allowing easy multi-repository handling, but also
make maintaining the code easier.
The project involves identifying suitable locations in
repository-specific structs for the global variables defined in
environment.c, moving them there and updating all the code affected by
the move.
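As a rough illustration of the kind of move described above, the sketch below contrasts a process-wide variable with the same value stored per repository. All names here (`example_abbrev_len`, `struct example_settings`, `struct example_repo`, `abbrev_len_for`) are hypothetical and chosen only for the example; they are not Git's actual definitions.

```c
#include <stddef.h>

/* Before: one process-wide value, implicitly tied to a single repository. */
int example_abbrev_len = 7;

int abbrev_len_global(void)
{
	return example_abbrev_len;
}

/* After: the value lives in a per-repository settings struct (hypothetical). */
struct example_settings {
	int abbrev_len;
};

struct example_repo {
	const char *gitdir;
	struct example_settings settings;
};

/* Callers say which repository they mean, so two repositories can differ. */
int abbrev_len_for(const struct example_repo *repo)
{
	return repo->settings.abbrev_len;
}
```

With the second form, two `struct example_repo` instances in the same process can carry different settings, which the single global cannot express.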
## Pre-GSoC
I first got into Git's codebase in February 2025, with my first
contribution in March. My first patch was on my microproject. Since
then I have submitted two more patches on a similar topic and one that
removes unused variables.
### Patches
- (Microproject) decorate: fix sign comparison warnings
Thread: https://lore.kernel.org/git/afa6b428-3190-42ae-9eac-540c95b576fd@gmail.com/
Status: Merged into master
Commit hash: 2bfd3b368572cbf1ce287de09db08b7e7e429ecd
Description: Refactoring of decorate.c to replace signed variables
with unsigned ones when they are used to iterate over arrays whose
sizes are represented by unsigned variables, and remove 2 unnecessary
variables which just hold the value of another variable without being
modified, replacing them with the variable whose value they were
holding.
- rm: fix sign comparison warnings
Thread: https://lore.kernel.org/git/38de63ce-6d4e-4f1f-95b1-049df78d9cfc@gmail.com/
Status: Under discussion
Description: Refactoring of rm.c to make iterators over arrays whose
sizes are represented by unsigned variables unsigned. In
`get_ours_cache_pos`, a signed variable was previously passed in and
then inverted inside the function; now the inversion is done at the
function call and the already inverted value is passed as an unsigned
variable.
- pathspec: fix sign comparison warnings
Thread: https://lore.kernel.org/git/a3aa5f99-63ce-4be5-8d64-fb6e226b3bf9@gmail.com/
Status: Under discussion
Description: Refactoring of pathspec.c to make array iterator
variables match the type of the variable storing the array's size.
Where replacing the variable's type is not possible, because of the
large-scale cascade replacements it would cause, an appropriate cast
has been added.
- environment.h: remove unused variables
Thread: https://lore.kernel.org/git/2c547567-2b72-476c-9fc5-71cac050fa15@gmail.com/
Status: Under discussion
Description: Removing two variables which did not have any references
in the codebase, as they had been moved to `struct repo_settings`, but
were not removed from environment.h.
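
The sign comparison fixes listed above follow two general patterns, sketched below. This is not the code from the patches; `struct example_list`, `sum_items` and `index_in_range` are hypothetical names used only to show the idea.

```c
#include <stddef.h>

struct example_list {
	const int *items;
	size_t nr; /* element count is unsigned */
};

/*
 * Declaring the index as "int i" would make "i < list->nr" compare a
 * signed value with an unsigned one, which -Wsign-compare reports.
 * Matching the index type to the size type avoids the warning.
 */
long sum_items(const struct example_list *list)
{
	long total = 0;

	for (size_t i = 0; i < list->nr; i++)
		total += list->items[i];
	return total;
}

/*
 * Where a signed value cannot be retyped without cascading changes,
 * check that it is non-negative and cast at the comparison.
 */
int index_in_range(int pos, const struct example_list *list)
{
	return pos >= 0 && (size_t)pos < list->nr;
}
```

The explicit `pos >= 0` check matters: casting a negative `int` to `size_t` yields a very large value, so the cast alone would not reject it reliably on every input.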
## Proposed Plan
- Identifying global variables in environment.c that should be moved and
identifying suitable locations for them. Some could be moved directly
into `struct repository`, some into its sub-structs that already exist
and some into newly created sub-structs.
- Identifying and updating occurrences of these variables to reference
their new locations.
- Identifying all occurrences of `the_repository` and updating them to
use a `struct repository` passed to the function.
The variables need not all be in the same struct. Separating them would
keep the codebase organised, and thus easier to maintain. It would also
make it easier to introduce these changes systematically, as a group of
related variables, combined together in a struct, could be introduced in
a single patch series.
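
The last step of the plan, replacing uses of `the_repository` with a `struct repository` passed to the function, can be sketched as follows. The struct and function names (`struct example_repo`, `example_the_repository`, `format_path_global`, `format_path`) are hypothetical stand-ins, not Git's real API.

```c
#include <stdio.h>

struct example_repo {
	const char *gitdir;
};

/* Before: a global pointer that every caller implicitly shares. */
struct example_repo *example_the_repository;

int format_path_global(char *buf, size_t len, const char *name)
{
	return snprintf(buf, len, "%s/%s",
			example_the_repository->gitdir, name);
}

/* After: the caller states which repository it means. */
int format_path(const struct example_repo *repo, char *buf, size_t len,
		const char *name)
{
	return snprintf(buf, len, "%s/%s", repo->gitdir, name);
}
```

The body of the function barely changes; most of the work in such a conversion is updating the callers so that each one has a repository to pass down.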
### Timeline
#### Pre-GSoC (Until May 8)
- Explore the codebase, identifying locations where global variables
from environment.c are used.
- Identify suitable locations for these global variables.
#### Community Bonding Period (May 8 - June 1)
- Interact with my mentor, discussing the locations I have decided on
and refining the plan if required.
- Start coding early, as my summer break will have started. (See coding
period)
#### Coding Period (June 2 - August 25)
- Move global variables to their new locations in various structs,
and refactor functions that depend on them to use their new locations.
  - Variables which represent settings from config (7 weeks)
    - Core (5 weeks)
    - Others (2 weeks)
  - Variables not from config (3 weeks)
- Modify functions to add a `struct repository` argument where they
depend on `the_repository` and replace all occurrences of it in the
function.
#### Final Week (August 25 - September 1)
- Fix any bugs that may be left.
- Write final report.
### Availability
My summer break from college lasts from May to July. I am currently
planning on taking a vacation of about 1 week during this period;
however, the dates have not been decided. Outside of this vacation, I
am not occupied during the break and can devote up to 60 hours a week
to the project. In August, once classes recommence, I will be
available for 20 hours a week.
## Post-GSoC
After completing my project, I plan on staying active by contributing
patches and starting to review code.
--
Regards,
Arnav Bhate
(He/Him)