From: Pete Wyckoff <pw@padd•com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail•com>
Cc: git@vger•kernel.org, Junio C Hamano <gitster@pobox•com>,
Eric Herman <eric@freesa•org>,
Sverre Rabbelier <srabbelier@gmail•com>,
Fernando Vezzosi <buccia@repnz•net>
Subject: Re: [PATCH] grep: detect number of CPUs for thread spawning
Date: Sun, 6 Nov 2011 09:50:50 -0500
Message-ID: <20111106145050.GA4219@arf.padd.com>
In-Reply-To: <1320502568-14085-1-git-send-email-avarab@gmail.com>
avarab@gmail•com wrote on Sat, 05 Nov 2011 14:16 +0000:
> From: Eric Herman <eric@freesa•org>
>
> Change the number of threads that we spawn from a hardcoded value of
> "8" to what online_cpus() returns.
>
> Back in v1.7.0-rc1~19^2 when threaded grep was introduced the number
> of threads was hardcoded at compile time to 8, but this value was only
> used if online_cpus() returned greater than 1.
>
> However just using 8 threads regardless of the actual number of CPUs
> is inefficient if we have more than 8 CPUs, and potentially wasteful
> if we have fewer than 8 CPUs.
I agree with the need to exploit >8 CPUs, but I lose a lot of
performance when limiting the threads to the number of physical
CPUs.
These tests are on master without your patch, changing only
"#define THREADS" from 8 to 2, on a 2-core Intel Core2 Duo.
Producing lots of output:
8 threads:
$ time ~/u/src/git/bin-wrappers/git grep f > /dev/null
0m14.02s user 0m3.64s sys 0m11.93s elapsed 148.07 %CPU
$ time ~/u/src/git/bin-wrappers/git grep f > /dev/null
0m13.86s user 0m3.70s sys 0m11.82s elapsed 148.57 %CPU
2 threads:
$ time ~/u/src/git/bin-wrappers/git grep f > /dev/null
0m15.14s user 0m3.52s sys 0m24.22s elapsed 77.05 %CPU
$ time ~/u/src/git/bin-wrappers/git grep f > /dev/null
0m14.85s user 0m3.79s sys 0m24.20s elapsed 77.05 %CPU
Producing no output:
8 threads:
$ time ~/u/src/git/bin-wrappers/git grep unfindable-string
0m1.14s user 0m3.68s sys 0m5.17s elapsed 93.22 %CPU
$ time ~/u/src/git/bin-wrappers/git grep unfindable-string
0m1.28s user 0m3.56s sys 0m5.15s elapsed 94.22 %CPU
2 threads:
$ time ~/u/src/git/bin-wrappers/git grep unfindable-string
0m1.36s user 0m3.64s sys 0m16.82s elapsed 29.75 %CPU
$ time ~/u/src/git/bin-wrappers/git grep unfindable-string
0m1.38s user 0m3.66s sys 0m16.81s elapsed 30.04 %CPU
My workdir is on NFS, where even though the repository is fully
cached, the open()s must go to the server. Using more threads
than CPUs makes it more likely that at least one thread is
runnable while the others wait on the server.
You could add a #threads knob, but then we'd have to get
everybody on NFS to set that properly. Or take a look at
preload_index() to see how it guesses at how many threads it
needs.
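As a rough sketch of what "guess from the work, not from the CPU
count" could look like: the function below sizes the pool from the
number of files to search and only uses the CPU count to raise the
upper bound. The names guess_grep_threads, FILES_PER_THREAD and
IO_THREAD_CAP are made up for illustration and the constants are
arbitrary; this is not code from git and has not been benchmarked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning constants, for illustration only. */
#define FILES_PER_THREAD 500	/* minimum work that justifies one more thread */
#define IO_THREAD_CAP 20	/* ceiling when threads mostly block on I/O */

/*
 * Guess a thread count from the amount of work instead of the number
 * of CPUs, so that a 2-core box on NFS can still keep several open()s
 * in flight at once.  Always returns at least 1.
 */
int guess_grep_threads(size_t nr_files, int ncpus)
{
	size_t cap, threads;

	assert(ncpus >= 1);

	/* Machines with many cores may exceed the I/O-oriented ceiling. */
	cap = (size_t)ncpus > IO_THREAD_CAP ? (size_t)ncpus : IO_THREAD_CAP;

	threads = nr_files / FILES_PER_THREAD;
	if (threads < 1)
		threads = 1;
	if (threads > cap)
		threads = cap;
	return (int)threads;
}
```

With these made-up constants, a 5000-file tree on a 2-core machine
would get 10 threads, and a tiny tree would stay single-threaded.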
-- Pete