public inbox for git@vger.kernel.org 
From: Karsten Blees <karsten.blees@gmail•com>
To: Stefan Zager <szager@chromium•org>
Cc: "Nguyễn Thái Ngọc" <pclouds@gmail•com>,
	"Git Mailing List" <git@vger•kernel.org>
Subject: Re: [PATCH] Enable index-pack threading in msysgit.
Date: Fri, 21 Mar 2014 21:01:45 +0100
Message-ID: <532C9AA9.1010102@gmail.com>
In-Reply-To: <CAHOQ7J-sUt3HGYNE7n=X3ZmV3Q-n+n9hMDAtzLbH3YU8iAqoqA@mail.gmail.com>

On 20.03.2014 22:56, Stefan Zager wrote:
> On Thu, Mar 20, 2014 at 2:35 PM, Karsten Blees <karsten.blees@gmail•com> wrote:
>> On 20.03.2014 17:08, Stefan Zager wrote:
>>
>>> Going forward, there is still a lot of performance that gets left on
>>> the table when you rule out threaded file access.  There are not so
>>> many calls to read, mmap, and pread in the code; it should be possible
>>> to rationalize them and make them thread-safe -- at least, thread-safe
>>> for posix-compliant systems and msysgit, which covers the great
>>> majority of git users, I would hope.
>>>
>>
>> IMO a "mostly" XSI compliant pread (or even the git_pread() emulation) is still better than forbidding the use of read() entirely. Switching from read to pread everywhere requires all callers to keep track of the file position, which means a _lot_ of code changes (read/xread/strbuf_read is used in ~70 places throughout git). And how do you plan to deal with platforms that don't have a thread-safe pread (HP, Cygwin)?
>>
>> Considering all that, Duy's solution of opening separate file descriptors per thread seems to be the best pattern for future multi-threaded work.
> 
> Does that mean you would endorse the (N threads) * (M pack files)
> approach to threading checkout and status?  That seems kind of
> crazy-town to me.  Not to mention that pack windows are not shared, so
> this approach to multi-threading can have the side-effect of blowing
> out memory consumption.  We have already had to dial back settings for
> pack.threads and core.deltaBaseCacheLimit, because threaded index-pack
> was causing OOM errors on 32-bit platforms.
> 

Opening more file descriptors doesn't significantly increase the memory footprint, so it shouldn't matter whether the threads read data via shared or private descriptors.
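
To illustrate the private-descriptor pattern, here is a minimal sketch (not git code). It assumes POSIX threads; read_file_threaded(), struct slice_job and MAX_THREADS are hypothetical names made up for this example. Each thread opens the file itself, so it gets its own file offset and plain lseek()/read() cannot race with the other threads:

```c
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_THREADS 16   /* arbitrary cap for this sketch */

/* Hypothetical per-thread job: one contiguous slice of the file. */
struct slice_job {
	const char *path;   /* file that every thread opens for itself */
	off_t offset;       /* where this thread's slice starts */
	size_t len;         /* number of bytes in the slice */
	char *dst;          /* destination buffer for the slice */
	int ok;             /* set to 1 if the whole slice was read */
};

static void *read_slice(void *arg)
{
	struct slice_job *job = arg;
	size_t done = 0;
	/* Private descriptor: its file offset is not shared with other threads. */
	int fd = open(job->path, O_RDONLY);

	job->ok = 0;
	if (fd < 0)
		return NULL;
	if (lseek(fd, job->offset, SEEK_SET) != (off_t)-1) {
		while (done < job->len) {
			ssize_t n = read(fd, job->dst + done, job->len - done);
			if (n <= 0)
				break;   /* error or unexpected EOF */
			done += (size_t)n;
		}
		job->ok = (done == job->len);
	}
	close(fd);
	return NULL;
}

/*
 * Read the first `len` bytes of `path` into `out` using `nthreads` threads.
 * Returns 0 on success, -1 on failure.
 */
int read_file_threaded(const char *path, char *out, size_t len, int nthreads)
{
	pthread_t tid[MAX_THREADS];
	struct slice_job job[MAX_THREADS];
	size_t chunk;
	int i, started = 0, ret = 0;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return -1;
	chunk = (len + (size_t)nthreads - 1) / (size_t)nthreads;

	for (i = 0; i < nthreads; i++) {
		size_t start = (size_t)i * chunk;

		if (start > len)
			start = len;
		job[i].path = path;
		job[i].offset = (off_t)start;
		job[i].len = (len - start < chunk) ? len - start : chunk;
		job[i].dst = out + start;
		job[i].ok = 0;
		if (pthread_create(&tid[i], NULL, read_slice, &job[i])) {
			ret = -1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		if (!job[i].ok)
			ret = -1;
	}
	return ret;
}
```

The per-thread cost here is one descriptor plus the small job struct; the data buffer is the same size whether the descriptors are shared or private.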

git-status with core.preloadindex is already multithreaded (at least the first part), and AFAIK doesn't read pack files at all.

I'm still not convinced that multi-threaded git-checkout is a good idea. According to my tests this is actually slower than sequential checkout. You'd have to be very careful to only multi-thread the parts that don't do any IO, such as unpacking / undeltifying.
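
A rough sketch of what I mean by threading only the non-IO parts (again not git code, assuming POSIX threads): the caller reads everything sequentially up front, the workers touch only in-memory buffers, and the caller writes the results sequentially afterwards. decode_parallel(), struct work_item and MAX_WORKERS are hypothetical names, and decode_in_place() is just a placeholder for the real CPU-bound step (inflate / apply delta):

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16   /* arbitrary cap for this sketch */

/* Hypothetical in-memory work item; the main thread has already read the data. */
struct work_item {
	unsigned char *buf;
	size_t len;
};

struct worker_arg {
	struct work_item *items;
	int first, last;   /* half-open range [first, last) handled by this worker */
};

/* Placeholder for the CPU-bound step; deliberately does no IO. */
static void decode_in_place(struct work_item *it)
{
	size_t i;

	for (i = 0; i < it->len; i++)
		it->buf[i] ^= 0x5a;
}

static void *worker(void *p)
{
	struct worker_arg *arg = p;
	int i;

	for (i = arg->first; i < arg->last; i++)
		decode_in_place(&arg->items[i]);
	return NULL;
}

/*
 * Decode `n` items using `nworkers` threads. The caller does all reads
 * before this call and all writes after it.
 * Returns 0 on success, -1 on failure.
 */
int decode_parallel(struct work_item *items, int n, int nworkers)
{
	pthread_t tid[MAX_WORKERS];
	struct worker_arg arg[MAX_WORKERS];
	int i, per, started = 0, ret = 0;

	if (n < 0 || nworkers < 1 || nworkers > MAX_WORKERS)
		return -1;
	per = (n + nworkers - 1) / nworkers;

	for (i = 0; i < nworkers; i++) {
		int first = i * per;
		int last = first + per;

		if (first > n)
			first = n;
		if (last > n)
			last = n;
		arg[i].items = items;
		arg[i].first = first;
		arg[i].last = last;
		if (pthread_create(&tid[i], NULL, worker, &arg[i])) {
			ret = -1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	return ret;
}
```

Whether this beats a sequential loop depends on how expensive the decode step is relative to thread startup, so it would need measuring.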
