From: Junio C Hamano <gitster@pobox•com>
To: David Kastrup <dak@gnu•org>
Cc: git@vger•kernel.org
Subject: Re: [PATCH 3/3] builtin/blame.c: large-scale rewrite
Date: Mon, 27 Jan 2014 12:51:33 -0800 [thread overview]
Message-ID: <xmqqtxcpno6i.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <87a9eh8b0d.fsf@fencepost.gnu.org> (David Kastrup's message of "Mon, 27 Jan 2014 20:45:06 +0100")
David Kastrup <dak@gnu•org> writes:
> Junio C Hamano <gitster@pobox•com> writes:
>
>> David Kastrup <dak@gnu•org> writes:
>>
>>> The previous implementation uses a sorted linear list of struct
>>> blame_entry in a struct scoreboard for organizing all partial or
>>> completed work. Every task that is done requires going through the
>>> whole list where most entries are not relevant to the task at hand.
>>>
>>> This commit reorganizes the data structures in order to have each
>>> remaining subtask work with its own sorted linear list it can work off
>>> front to back. Subtasks are organized into "struct origin" chains
>>> hanging off particular commits. Commits are organized into a priority
>>> queue, processing them in commit date order in order to keep most of
>>> the work affecting a particular blob collated even in the presence of
>>> an extensive merge history. In that manner, linear searches can be
>>> basically avoided anywhere. They still are done for identifying one
>>> of multiple analyzed files in a given commit, but the degenerate case
>>> of a single large file being assembled from a multitude of smaller
>>> files in the past is not likely to occur often enough to warrant
>>> special treatment.
>>> ---
>>
>> Sign-off?
>
> Not while this is not fit for merging because of #if 0 etc and missing
> functionality. This is just for review.
That is not what Signing off a patch means, though ;-)
>> Actually, I'd like to take my previous suggestion to add this as
>> blame2 (to replace blame in the future) back. That was based on my
>> fear/hope to see an implementation based on a completely different
>> approach, but the basic premise of having one blame_entry record per
>> the lines of the final image in the scoreboard, using diff between
>> parents to the child to find common lines for passing the blame up,
>> etc. have not changed at all and the change is all about organizing
>> the pieces of data in a *much* *better* way to avoid needlessly
>> finding what we already have computed.
>
> Yes. Basically, the call graph outside of blame.c itself should be
> pretty much the same.
> ...
> Ok. I'm also certain to have a few "space between function name and
> arguments" left and will grep for those before submitting the next
> version.
>
> Next version will at least include -M option, possibly leaving -C for
> later.
OK. As we do not want to break the build in the middle of the
series, but we still want to keep the individual steps reviewable
incrementally, I would think the best structure would be have the
series that consists of multiple steps "the basic infrastructure is
there, but no rename following, and neither -M or -C works", "now
renames are followed", "now -M works", etc., with tests that are not
yet working (in the beginning, most of "git blame" test may no
longer work until the series is finished) marked with
s/test_expect_success/test_expect_failure/
and turn expect_failure into expect_success as the series advances.
That way, we may get a better picture of what each step achieves. I
dunno.
next prev parent reply other threads:[~2014-01-27 20:51 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-25 12:44 How to substructure rewrites? David Kastrup
2014-01-25 18:23 ` [PATCH 0/3] "Teaser" patch for rewriting blame for efficiency David Kastrup
2014-01-25 18:23 ` [PATCH 1/3] builtin/blame.c: struct blame_entry does not need a prev link David Kastrup
2014-01-25 18:23 ` [PATCH 2/3] Eliminate same_suspect function in builtin/blame.c David Kastrup
2014-01-25 18:23 ` [PATCH 3/3] builtin/blame.c: large-scale rewrite David Kastrup
2014-01-27 16:54 ` Junio C Hamano
2014-01-27 19:45 ` David Kastrup
2014-01-27 20:51 ` Junio C Hamano [this message]
2014-01-27 21:21 ` David Kastrup
2014-01-27 15:58 ` How to substructure rewrites? Junio C Hamano
2014-01-27 16:27 ` David Kastrup
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqtxcpno6i.fsf@gitster.dls.corp.google.com \
--to=gitster@pobox$(echo .)com \
--cc=dak@gnu$(echo .)org \
--cc=git@vger$(echo .)kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox