From: Junio C Hamano <gitster@pobox•com>
To: Stefan Beller <sbeller@google•com>
Cc: Jens Lehmann <Jens.Lehmann@web•de>,
Jonathan Nieder <jrnieder@gmail•com>,
"git@vger.kernel.org" <git@vger•kernel.org>
Subject: Re: [RFC] On the --depth argument when fetching with submodules
Date: Fri, 05 Feb 2016 16:05:01 -0800
Message-ID: <xmqqoabubt5e.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <CAGZ79kbt2-Vm94eTQY0PmJrNwqyTa36FJy5Q+2YBsxu6uYdTmQ@mail.gmail.com> (Stefan Beller's message of "Fri, 5 Feb 2016 14:48:43 -0800")

Stefan Beller <sbeller@google•com> writes:

> Currently when cloning a project, including submodules, the --depth argument
> is passed on recursively, i.e. when cloning with "--depth 2", both the
> superproject as well as the submodule will have a depth of 2. It is not
> guaranteed that the commits as specified by the superproject are included
> in these 2 commits of the submodule.
>
> Illustration:
> (superproject with depth 2, so A would have more parents, not shown)
>
> superproject/master:   A <- B
>                       /       \
> submodule/master:    C <- D <- E <- F <- G
>
> (Current behavior is to fetch G and F)
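
To make the mismatch in the illustration concrete, here is a minimal sketch in Python. It is a toy model built from plain dictionaries, not a description of how git implements shallow fetches. The names SUBMODULE_PARENT, GITLINK and shallow_commits are made up for this example, and the submodule history is assumed to be linear, exactly as drawn.

```python
# Toy model of the illustration; these are plain dictionaries, not git objects.
# Each submodule commit maps to its single parent (None where the drawing ends).
SUBMODULE_PARENT = {"C": None, "D": "C", "E": "D", "F": "E", "G": "F"}

# Superproject commit -> submodule commit recorded in its tree (hypothetical name).
GITLINK = {"A": "C", "B": "E"}


def shallow_commits(tip, depth, parent):
    """Return the commits present after fetching `tip` truncated to `depth`."""
    commits = []
    current = tip
    while current is not None and len(commits) < depth:
        commits.append(current)
        current = parent[current]
    return commits


# Passing "--depth 2" down to the submodule fetches from the branch tip G.
fetched = shallow_commits("G", 2, SUBMODULE_PARENT)
missing = sorted(c for c in GITLINK.values() if c not in fetched)

print("fetched:", fetched)  # fetched: ['G', 'F']
print("missing:", missing)  # missing: ['C', 'E']
```

Under these assumptions neither commit recorded by the superproject is part of the shallow submodule history.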

I think the issue is deeper than merely "--depth 2", and you would
be better off stepping back and thinking about various use cases to
make sure that we know what kind of behaviour we want to support
before delving into one particular corner case. We currently pass
the depth recursively, and I do not think it makes much sense, but I
view it as a secondary question: "among the behaviours we want to
support, which one should be the default?" It may turn out that not
passing it recursively at all, or even passing a different depth, is
a better default, but we wouldn't know until we know what the
desirable behaviours are in various workflows.

If you are actively working on the superproject plus some submodules
but you are merely using the submodule you depicted above, not
working on changing it, even when you want the full history of the
superproject (i.e. no "--depth 2"), you may not want history of the
submodule. Even though we have a way to say "I am not interested in
this submodule AT ALL" by not doing "submodule init", not having
anything at all at the path submodule/ may not allow you to build
the whole thing, and we currently lack a way to express "I am not
interested in the history of this thing, but I need at least the
tree that matches the commit referred to by the superproject".

If you are working on a single submodule, trying to fix a bug in the
context of the whole project, you might want to have a single-depth
clone of the superproject and all other submodules, plus the whole
history of the single submodule.

In either of these examples, the top-level "--depth" does not have
much to do with what depth the user wants to use when cloning or
fetching the submodule repositories.

I have a feeling (but I would not be surprised if somebody who uses
submodules heavily has a counter-example from real life) that
regardless of "--depth" or full clone, fetching the tip of matching
branch is not a good default behaviour. In your picture, even when
depth is not given at all, there isn't much point fetching F or G.

> So to fetch the correct submodule commits, we need to
> * traverse the superproject and list all submodule commits.
> * fetch these submodule commits (C and E) by sha1

I do not think requiring C to be fetched when the superproject
is cloned with --depth=2 (hence A and B are present in the result)
is a good definition of "correct submodule commits". The initial
clone could be "superproject follows --depth, all submodules are
cloned with --depth=1 at the commits referenced by the superproject
tree"--by that definition, you need E but you do not want C.
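
The difference between the two definitions can be sketched with the same kind of toy model, written in Python. This is only an illustration under stated assumptions: SUPERPROJECT_HISTORY, GITLINK and both helper functions are hypothetical names, and nothing here corresponds to an existing git option.

```python
# Toy model: the shallow superproject history (newest first) and the
# submodule commit each superproject commit records. All names are hypothetical.
SUPERPROJECT_HISTORY = ["B", "A"]  # result of cloning with --depth=2
GITLINK = {"A": "C", "B": "E"}


def wanted_for_all_history(history, gitlink):
    """Submodule commits referenced by every superproject commit present."""
    return sorted({gitlink[commit] for commit in history})


def wanted_for_tip_only(history, gitlink):
    """Submodule commit referenced by the superproject tip tree alone."""
    return [gitlink[history[0]]]


print(wanted_for_all_history(SUPERPROJECT_HISTORY, GITLINK))  # ['C', 'E']
print(wanted_for_tip_only(SUPERPROJECT_HISTORY, GITLINK))     # ['E']
```

The first function models "list all submodule commits in the fetched superproject history"; the second models "submodules at --depth=1 at the commit referenced by the checked-out tree".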

As a specification of the behaviour, the above two might work, but I
do not think that should be the implementation. In other words,
"The implementation should behave as if it did the above two" is OK,
and it is also OK to qualify with further conditions to help the
implementation. For example, the current structure assumes that E
and C are reachable from "some" ref in submodule, so that at least a
whole clone of the submodule would give them to you--otherwise you
would not be able to even build the superproject at A or B. Perhaps
it is OK to further require that, when you are working in a single
branch mode and working on 'master', you are required to have
commits C and E reachable on the 'master' branch in the submodule,
and that may let you limit the need for such scanning of the
history?
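
As a rough sketch of that extra requirement, the check "every commit the superproject records is reachable from the submodule's 'master'" could look like the following Python toy model. The dictionaries and the reachable_from helper are assumptions made for illustration, and the history is assumed to have single parents as in the picture.

```python
# Toy model of the submodule from the illustration; names are hypothetical.
SUBMODULE_PARENT = {"C": None, "D": "C", "E": "D", "F": "E", "G": "F"}
SUBMODULE_BRANCH_TIP = {"master": "G"}
GITLINK = {"A": "C", "B": "E"}  # superproject commit -> recorded submodule commit


def reachable_from(tip, parent):
    """Return every commit reachable from `tip` by following parent links."""
    seen = set()
    current = tip
    while current is not None and current not in seen:
        seen.add(current)
        current = parent[current]
    return seen


on_master = reachable_from(SUBMODULE_BRANCH_TIP["master"], SUBMODULE_PARENT)
unreachable = sorted(c for c in GITLINK.values() if c not in on_master)

print("unreachable from master:", unreachable)  # unreachable from master: []
```

With the history as drawn, both C and E satisfy the condition, so a single-branch fetch of 'master' deep enough to reach them would suffice in this model.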