public inbox for git@vger.kernel.org 
From: Junio C Hamano <gitster@pobox.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: git@vger.kernel.org
Subject: Re: [PATCH WIP] sha1-lookup: make selection of 'middle' less aggressive
Date: Sun, 30 Dec 2007 13:49:11 -0800
Message-ID: <7vodc77t0o.fsf@gitster.siamese.dyndns.org>
In-Reply-To: <alpine.LFD.1.00.0712301150120.32517@woody.linux-foundation.org> (Linus Torvalds's message of "Sun, 30 Dec 2007 11:58:02 -0800 (PST)")

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Sun, 30 Dec 2007, Junio C Hamano wrote:
>> 
>> With this patch, we actually see slight improvements in
>> execution time as well.  In the same partial kde repository
>> (3.0GB pack, 95MB idx; the numbers are from the same machine as
>> before, best of 5 runs):
>
> Ok, I tried this a year ago, and never got any real improvement.

Yes, I remember that one.

> and I decided it wasn't worth it. Yours looks much better, and seems to 
> get a real performance improvement, so go for it, but I doubt that the 
> actual object lookup is really ever the main issue. I've never seen it 
> stand out in the real profiles, although if it is able to cut down on IO 
> (and your minor fault numbers are promising!), it might be more important 
> than I'd otherwise think.

The cost of the key comparison done in each round is
insignificant compared to the actual cost of accessing the
object data through zlib.  The patch reduces the average number
of rounds in the search, and the only potential performance
benefit that could come from that is I/O reduction.
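
To make the rounds-versus-pages point concrete, here is a rough
sketch.  It is not the patch itself: it compares plain bisection
with a textbook interpolation search on the first 8 bytes of the
key, and counts rounds and distinct pages touched per lookup.
The 4096-byte page size, the table of bare 20-byte keys, and all
function names are assumptions made for illustration.

```python
import hashlib
import random

ENTRY_SIZE = 20    # bytes per key; assumes a table of bare 20-byte SHA-1s
PAGE_SIZE = 4096   # assumed page size

def make_table(n):
    """Build n sorted, evenly distributed 20-byte keys (stand-ins for object names)."""
    return sorted(hashlib.sha1(str(i).encode()).digest() for i in range(n))

def page_of(index):
    """Page of the (hypothetical) mapped table that holds entry `index`."""
    return index * ENTRY_SIZE // PAGE_SIZE

def prefix(key):
    """First 8 bytes of the key as an integer, used to guess a position."""
    return int.from_bytes(key[:8], "big")

def lookup(table, key, interpolate):
    """Search the sorted table; return (found, rounds, pages_touched)."""
    lo, hi = 0, len(table)
    lo_val, hi_val = 0, 1 << 64     # key-space bounds of table[lo:hi]
    want = prefix(key)
    rounds, pages = 0, set()
    while lo < hi:
        if interpolate and hi_val > lo_val:
            # Guess where the key should sit if keys are evenly spread.
            mi = lo + (want - lo_val) * (hi - lo) // (hi_val - lo_val)
            mi = min(max(mi, lo), hi - 1)
        else:
            mi = (lo + hi) // 2
        rounds += 1
        pages.add(page_of(mi))
        if table[mi] == key:
            return True, rounds, pages
        if table[mi] < key:
            lo, lo_val = mi + 1, prefix(table[mi])
        else:
            hi, hi_val = mi, prefix(table[mi])
    return False, rounds, pages

def average_cost(table, keys, interpolate):
    """Mean rounds and mean distinct pages per lookup, each lookup starting cold."""
    results = [lookup(table, k, interpolate) for k in keys]
    n = len(results)
    return (sum(r for _, r, _ in results) / n,
            sum(len(p) for _, _, p in results) / n)

if __name__ == "__main__":
    table = make_table(20000)
    keys = random.Random(0).sample(table, 500)
    print("bisection     (rounds, pages):", average_cost(table, keys, False))
    print("interpolation (rounds, pages):", average_cost(table, keys, True))
```

Fewer rounds means fewer probes scattered across the table, so
a cold lookup faults in fewer pages.  The comparison itself is
the same cheap operation in both cases.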

The only case I can think of where this may matter in real life
is accessing only a small number of objects in a history with a
huge pack.  Once you dig down the history deep enough to check
a large number of objects inside a single process, you would
need to touch every page of the mapped idx anyway, and the
minor-fault gain rapidly diminishes.
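
A back-of-the-envelope sketch of why the gain diminishes: every
successful lookup has to end on the page that holds the key, no
matter how cleverly the middle is chosen.  If each lookup ends
on a uniformly random page out of P, the expected number of
distinct pages touched after k lookups is P * (1 - (1 - 1/P)^k).
The snippet below assumes 4096-byte pages and, as a
simplification, treats the whole 95MB idx as the table being
searched; the function name is made up for illustration.

```python
def expected_pages_touched(total_pages, lookups):
    """Expected number of distinct pages hit when each lookup ends on one
    uniformly random page (a lower bound on what any search must touch)."""
    return total_pages * (1.0 - (1.0 - 1.0 / total_pages) ** lookups)

if __name__ == "__main__":
    PAGE_SIZE = 4096                       # assumed page size
    total = 95 * 1024 * 1024 // PAGE_SIZE  # 95MB idx, as quoted above
    for lookups in (100, 10_000, 100_000, 1_000_000):
        frac = expected_pages_touched(total, lookups) / total
        print(f"{lookups:>9} lookups: {frac:6.1%} of idx pages touched")
```

With few lookups almost every page is still cold, so saving
probes saves faults.  With many lookups the pages are resident
regardless of how each search got there.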

Accessing only a small number of objects in a huge history most
often happens when building near the tip of the history
(e.g. commit, rebase, merge), but these operations tend to deal
with very young objects, often unpacked.  We check packs first
and then loose objects, so the search for young loose objects
will benefit from the patch because the negative look-up to
notice that they do not live in any pack also becomes cheaper,
but I do not think it is such a big deal.
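
The look-up order described above can be sketched roughly as
follows.  This is only an illustration of the ordering, not
git's actual code: the pack indexes are modelled as sorted lists
of 20-byte names, loose objects as a set, and `name`, `in_pack`
and `locate` are hypothetical helpers.

```python
import bisect
import hashlib

def name(text):
    """Stand-in for an object name: the SHA-1 of some text."""
    return hashlib.sha1(text.encode()).digest()

def in_pack(idx, key):
    """Binary search one pack's sorted list of object names."""
    i = bisect.bisect_left(idx, key)
    return i < len(idx) and idx[i] == key

def locate(packs, loose, key):
    """Check every pack first, then loose objects.

    Returns (where, misses): `where` is ("pack", n), ("loose", None) or
    None, and `misses` counts the negative pack look-ups paid on the way.
    """
    misses = 0
    for n, idx in enumerate(packs):
        if in_pack(idx, key):
            return ("pack", n), misses
        misses += 1
    if key in loose:
        return ("loose", None), misses
    return None, misses

if __name__ == "__main__":
    packs = [sorted(name(f"old-{i}") for i in range(1000))]
    loose = {name("young")}
    print(locate(packs, loose, name("old-42")))
    print(locate(packs, loose, name("young")))
```

A young loose object pays one failed search per pack before it
is found, which is where a cheaper negative look-up helps.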
