public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Ben Peart <peartben@gmail•com>
To: Alex Vandiver <alexmv@dropbox•com>, Jeff King <peff@peff•net>
Cc: Ben Peart <benpeart@microsoft•com>,
	git@vger•kernel.org, gitster@pobox•com, chriscool@tuxfamily•org,
	t.gummerer@gmail•com, l.s.r@web•de, jsorianopastor@gmail•com,
	Johannes Schindelin <johannes.schindelin@gmx•de>
Subject: Re: [PATCH v2] read_index_from(): Skip verification of the cache entry order to speed index loading
Date: Tue, 31 Oct 2017 09:01:45 -0400	[thread overview]
Message-ID: <f671ea09-d4aa-64aa-8225-c1fbf2eac175@gmail.com> (raw)
In-Reply-To: <alpine.DEB.2.10.1710301727160.10801@alexmv-linux>



On 10/30/2017 8:33 PM, Alex Vandiver wrote:
> On Mon, 30 Oct 2017, Jeff King wrote:
>> On Mon, Oct 30, 2017 at 08:48:48AM -0400, Ben Peart wrote:
>>
>>> Any updates or thoughts on this one?  While the patch has become quite
>>> trivial, it does results in a savings of 5%-15% in index load time.
>>
>> I like the general direction of avoiding the check during each read.
> 
> Same -- the savings here are well worth it, IMHO.
> 
>>> I thought the compromise of having this test only run when DEBUG is defined
>>> should limit it to developer builds (hopefully everyone developing on git is
>>> running DEBUG builds :)).  Since the test is trying to detect buggy code
>>> when writing the index, I thought that was the right time to test/catch any
>>> issues.
>>
>> I certainly don't build with DEBUG. It traditionally hasn't done
>> anything useful. But I'm also not convinced that this is a likely way to
>> find bugs in the first place, so I'm OK missing out on it.
> 
> I also don't compile with DEBUG -- there's no documentation that
> mentions it, and I don't think I'd considered going poking for what
> was `#ifdef`d.  I think it'd be reasonable to provide some
> configure-time setting that adds `CFLAGS="-ggdb3 -O0 -DDEBUG"` or
> similar, but that seems possibly moot for this particular change (see
> below).
> 
>> But what we probably _do_ need is to make sure that "git fsck" would
>> detect such an out-of-order index. So that developers and users alike
>> can diagnose suspected problems.
> 
> Agree -- that seems like a better home for this logic.

That is how version 1 of this patch worked but the feedback to that 
patch was to remove it "not only during the normal operation but also in 
fsck."

> 
>>> I am working on other, more substantial savings for index load times
>>> (stay tuned) but this seemed like a small simple way to help speed
>>> things up.
> 
> I'm interested to hear more about what direction you're looking in here.
>   - Alex
> 

I'm working on parallelizing the index load process across multiple 
threads/cpu cores.  Specifically the loop that calls create_from_disk() 
and set_index_entry().  The serial nature of the index formats makes 
that difficult but I believe I've come up with a way to make it work 
across all existing index formats.

  reply	other threads:[~2017-10-31 13:01 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-18 14:27 [PATCH v1] read_index_from(): Skip verification of the cache entry order to speed index loading Ben Peart
2017-10-19  5:22 ` Junio C Hamano
2017-10-19 15:12   ` Ben Peart
2017-10-19 16:05     ` Jeff King
2017-10-20  1:14       ` Junio C Hamano
2017-10-19 22:14     ` Stefan Beller
2017-10-20 12:47       ` Johannes Schindelin
2017-10-20 18:53         ` Stefan Beller
2017-10-21  1:55           ` Junio C Hamano
2017-10-24 14:45 ` [PATCH v2] " Ben Peart
2017-10-30 12:48   ` Ben Peart
2017-10-30 18:03     ` Jeff King
2017-10-31  0:33       ` Alex Vandiver
2017-10-31 13:01         ` Ben Peart [this message]
2017-10-31 17:10           ` Jeff King
2017-11-01  6:09             ` Junio C Hamano
2017-10-31  1:49     ` Junio C Hamano
2017-10-31 12:51       ` Ben Peart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f671ea09-d4aa-64aa-8225-c1fbf2eac175@gmail.com \
    --to=peartben@gmail$(echo .)com \
    --cc=alexmv@dropbox$(echo .)com \
    --cc=benpeart@microsoft$(echo .)com \
    --cc=chriscool@tuxfamily$(echo .)org \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=johannes.schindelin@gmx$(echo .)de \
    --cc=jsorianopastor@gmail$(echo .)com \
    --cc=l.s.r@web$(echo .)de \
    --cc=peff@peff$(echo .)net \
    --cc=t.gummerer@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox