From: Karsten Blees <karsten.blees@gmail•com>
To: Stefan Zager <szager@chromium•org>
Cc: "Nguyễn Thái Ngọc" <pclouds@gmail•com>,
"Git Mailing List" <git@vger•kernel.org>
Subject: Re: [PATCH] Enable index-pack threading in msysgit.
Date: Fri, 21 Mar 2014 21:01:45 +0100
Message-ID: <532C9AA9.1010102@gmail.com>
In-Reply-To: <CAHOQ7J-sUt3HGYNE7n=X3ZmV3Q-n+n9hMDAtzLbH3YU8iAqoqA@mail.gmail.com>

On 20.03.2014 22:56, Stefan Zager wrote:
> On Thu, Mar 20, 2014 at 2:35 PM, Karsten Blees <karsten.blees@gmail•com> wrote:
>> On 20.03.2014 17:08, Stefan Zager wrote:
>>
>>> Going forward, there is still a lot of performance that gets left on
>>> the table when you rule out threaded file access. There are not so
>>> many calls to read, mmap, and pread in the code; it should be possible
>>> to rationalize them and make them thread-safe -- at least, thread-safe
>>> for posix-compliant systems and msysgit, which covers the great
>>> majority of git users, I would hope.
>>>
>>
>> IMO a "mostly" XSI compliant pread (or even the git_pread() emulation) is still better than forbidding the use of read() entirely. Switching from read to pread everywhere requires that all callers have to keep track of the file position, which means a _lot_ of code changes (read/xread/strbuf_read is used in ~70 places throughout git). And how do you plan to deal with platforms that don't have a thread-safe pread (HP, Cygwin)?
>>
>> Considering all that, Duy's solution of opening separate file descriptors per thread seems to be the best pattern for future multi-threaded work.
>
> Does that mean you would endorse the (N threads) * (M pack files)
> approach to threading checkout and status? That seems kind of
> crazy-town to me. Not to mention that pack windows are not shared, so
> this approach to multi-threading can have the side-effect of blowing
> out memory consumption. We have already had to dial back settings for
> pack.threads and core.deltaBaseCacheLimit, because threaded index-pack
> was causing OOM errors on 32-bit platforms.
>

Opening more file descriptors doesn't significantly increase the memory footprint, so it shouldn't matter whether the threads read data via shared or private descriptors.

git-status with core.preloadindex is already multithreaded (at least the first part), and AFAIK doesn't read pack files at all.

I'm still not convinced that multi-threaded git-checkout is a good idea. According to my tests, this is actually slower than sequential checkout. You'd have to be very careful to only multi-thread the parts that don't do any IO, such as unpacking / undeltifying.
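
To make the "separate file descriptors per thread" pattern concrete, here is a minimal sketch in plain POSIX C. It is not code from git: `sum_file_threaded`, `sum_chunk`, `struct chunk_job` and `MAX_THREADS` are made-up names, and summing bytes merely stands in for real per-thread work. Each thread opens its own descriptor, so it can use plain lseek() and read() without sharing a file position with the other threads.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_THREADS 16	/* arbitrary limit for this sketch */

/* Hypothetical per-thread work item; not a git data structure. */
struct chunk_job {
	const char *path;	/* file to read */
	off_t start;		/* first byte of this thread's range */
	off_t len;		/* number of bytes in the range */
	unsigned long sum;	/* result: sum of the bytes read */
	int err;		/* non-zero if open/lseek/read failed */
};

static void *sum_chunk(void *arg)
{
	struct chunk_job *job = arg;
	unsigned char buf[4096];
	off_t left = job->len;
	/* Private descriptor: its file position belongs to this thread only. */
	int fd = open(job->path, O_RDONLY);

	job->sum = 0;
	job->err = 0;
	if (fd < 0 || lseek(fd, job->start, SEEK_SET) < 0) {
		job->err = 1;
		if (fd >= 0)
			close(fd);
		return NULL;
	}
	while (left > 0) {
		size_t want = left < (off_t)sizeof(buf) ? (size_t)left : sizeof(buf);
		/* Simplification: no EINTR retry, unlike git's xread(). */
		ssize_t got = read(fd, buf, want);
		if (got <= 0) {
			job->err = 1;	/* read error or unexpected EOF */
			break;
		}
		for (ssize_t i = 0; i < got; i++)
			job->sum += buf[i];
		left -= got;
	}
	close(fd);
	return NULL;
}

/* Returns 0 on success and stores the byte sum of the whole file in *total. */
int sum_file_threaded(const char *path, int nthreads, unsigned long *total)
{
	struct chunk_job jobs[MAX_THREADS];
	pthread_t tids[MAX_THREADS];
	struct stat st;
	int i, started = 0, err = 0;

	assert(nthreads > 0 && nthreads <= MAX_THREADS);
	if (stat(path, &st) < 0)
		return -1;
	off_t per = st.st_size / nthreads;
	for (i = 0; i < nthreads; i++) {
		jobs[i].path = path;
		jobs[i].start = per * i;
		/* The last thread also takes the remainder of the file. */
		jobs[i].len = (i == nthreads - 1) ? st.st_size - per * i : per;
		if (pthread_create(&tids[i], NULL, sum_chunk, &jobs[i])) {
			err = 1;
			break;
		}
		started++;
	}
	*total = 0;
	for (i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
		err |= jobs[i].err;
		*total += jobs[i].sum;
	}
	return err ? -1 : 0;
}
```

The only per-thread cost beyond the descriptor itself is the 4 KB stack buffer; nothing here maps the file, so this says nothing about pack window memory.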