public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Kjetil Barvik <barvik@broadpark•no>
To: Johannes Schindelin <Johannes.Schindelin@gmx•de>,
	Johannes Sixt <j.sixt@viscovery•net>
Cc: Junio C Hamano <gitster@pobox•com>, git@vger•kernel.org
Subject: Re: [PATCH/RFC v3 7/9] write_entry(): use fstat() instead of lstat() when file is open
Date: Fri, 06 Feb 2009 12:06:57 +0100	[thread overview]
Message-ID: <868wojq0xa.fsf@broadpark.no> (raw)
In-Reply-To: <alpine.DEB.1.00.0902051202440.7491@intel-tinevez-2-302>

* On Thu, 5 Feb 2009, Johannes Sixt wrote:
|
|> Kjetil Barvik schrieb:
|> >   And, yes, since each lstat() call cost approximately 44 microseconds
|> >   compared to 12-16 for each lstat() on my Linux box, there was a little
|>                                ^^^^^^^ fstat()?
|> >   performance gain from this patch.
|> 
|> This does look like a good gain. But do you have hard numbers that back
|> the claim?

  __________________
  Test description

  I have used the following 8 git binaries:

$ for g in ./git_t*; do printf "$g => $($g --version)\n"; done
./git_t0 => git version 1.6.1.80.g7eb5
./git_t1 => git version 1.6.1.85.gbda6       <= added lstat_cache()
./git_t2 => git version 1.6.1.2.306.gc0f6f
./git_t3 => git version 1.6.1.2.307.g55385
./git_t4 => git version 1.6.1.2.308.g052a
./git_t5 => git version 1.6.1.2.310.g40dd2   <= added schedule_dir_for_removal()
./git_t6 => git version 1.6.1.2.313.g9deee
./git_t7 => git version 1.6.1.2.314.g0ad9    <= added this patch (7/9)

  Except for git_t7 all commits should be in origin/pu, so people should
  be able to do git show/diff/log on the last hex chars on the version-
  strings to see the differences.

  CFLAGS="-mtune=core2 -O2 -fomit-frame-pointer -fno-stack-protector -g0 -s"

  Each git_t* binary is run with args "checkout -q my-v.2.6.2[57]" for a
  total of 100 times (50/50 to my-v2.6.25/my-v2.6.27).  Before each run
  the test script sleeps 20 seconds to allow the disk to finish and
  being idle.  Time is collected by /usr/bin/time -o output-file prog.

  While the test script was run, the laptop with an Intel Core2 Duo 2
  GHz processor, was run without X and was otherwise idle.  The test
  script took 9 hours and 45 minutes to complete.

  __________________
  Test numbers

$ for ((t=0; $t<=7; t++)); do echo "git_t$t => $(parse_usr_bin_time_lines.pl git_t$t*)"; done
git_t0 =>  5.646 user  8.322 sys  25.579 real  54.6% CPU  faults:  0 major 39587 minor
git_t1 =>  5.631 user  6.826 sys  23.941 real  52.1% CPU  faults:  0 major 39901 minor
git_t2 =>  5.622 user  6.854 sys  24.036 real  52.1% CPU  faults:  0 major 39298 minor
git_t3 =>  5.636 user  6.867 sys  24.088 real  52.0% CPU  faults:  0 major 39786 minor
git_t4 =>  5.640 user  6.818 sys  24.006 real  52.0% CPU  faults:  0 major 39629 minor
git_t5 =>  5.642 user  6.552 sys  23.805 real  51.4% CPU  faults:  0 major 39707 minor
git_t6 =>  5.629 user  6.518 sys  22.981 real  53.0% CPU  faults:  0 major 40029 minor
git_t7 =>  5.629 user  6.051 sys  23.013 real  50.9% CPU  faults:  0 major 39451 minor

|> If you can squeeze out a 10% improvement on Linux with this patch, we
|> should take it, and if it turns out that it doesn't work on Windows, we
|> could trivially throw in an #ifdef MINGW (or even #ifdef WIN32 if Cygwin
|> is affected, too) that skips the fstat() optimization.

  For the system used time, the improvement was (- 6.518 6.051) = 0.467
  seconds, or (/ (* (- 6.518 6.051) 100.0) 6.518) = 7.2%, so not 10%.

  Funny to see that in:
     http://article.gmane.org/gmane.comp.version-control.git/107281

  I guessed the improvement to be (quote):
     "(* 14403 (- 44 12)) = 460 896 microseconds system time"

  So I missed only by a little over 6ms. :-)

* Johannes Schindelin <Johannes.Schindelin@gmx•de> writes:
| Of course, what we _really_ would do is to provide a flag, say, 
| FSTAT_UNRELIABLE and test for _that_ (after defining it in the Makefile 
| for the appropriate platforms).

  Or, maybe

     #define FSTAT_RELIABLE 1

  for Linux only?  Then we can change the if-test inside this patch to
  the following:

-  if (state->refresh_cache && !to_tempfile && !state->base_dir_len) {
+  if (state->refresh_cache && !to_tempfile && !state->base_dir_len && 
+      FSTAT_RELIABLE) {

  Then we do not have to have #if-defines inside the source code, only
  in one header file.

  But, question: is this patch worth the extra lines added to the source
  code?

  -- kjetil

  PS!  Junio, I think this patch series should be inside pu some days
       more, since things obviously needs more refinement and tests.

  parent reply	other threads:[~2009-02-06 11:08 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-04 12:52 [PATCH/RFC v3 0/9] git checkout: more cleanups, optimisation, less lstat() calls Kjetil Barvik
2009-02-04 12:52 ` [PATCH/RFC v3 1/9] lstat_cache(): small cleanup and optimisation Kjetil Barvik
2009-02-04 12:52 ` [PATCH/RFC v3 2/9] lstat_cache(): generalise longest_match_lstat_cache() Kjetil Barvik
2009-02-04 12:52 ` [PATCH/RFC v3 3/9] lstat_cache(): swap func(length, string) into func(string, length) Kjetil Barvik
2009-02-04 12:52 ` [PATCH/RFC v3 4/9] unlink_entry(): introduce schedule_dir_for_removal() Kjetil Barvik
2009-02-04 20:54   ` Junio C Hamano
2009-02-04 20:55   ` Junio C Hamano
2009-02-04 21:32     ` Kjetil Barvik
2009-02-04 12:52 ` [PATCH/RFC v3 5/9] create_directories(): remove some memcpy() and strchr() calls Kjetil Barvik
2009-02-04 12:52 ` [PATCH/RFC v3 6/9] write_entry(): cleanup of some duplicated code Kjetil Barvik
2009-02-04 12:53 ` [PATCH/RFC v3 7/9] write_entry(): use fstat() instead of lstat() when file is open Kjetil Barvik
2009-02-04 14:01   ` Johannes Sixt
2009-02-04 18:41     ` Junio C Hamano
2009-02-04 19:53       ` Kjetil Barvik
2009-02-04 20:30         ` Junio C Hamano
2009-02-05  8:14         ` Johannes Sixt
2009-02-05 11:03           ` Johannes Schindelin
2009-02-05 17:45             ` Junio C Hamano
2009-02-06 11:06             ` Kjetil Barvik [this message]
2009-02-06 11:26               ` Johannes Schindelin
2009-02-14 17:50           ` Kjetil Barvik
2009-02-04 12:53 ` [PATCH/RFC v3 8/9] show_patch_diff(): remove a call to fstat() Kjetil Barvik
2009-02-04 12:53 ` [PATCH/RFC v3 9/9] lstat_cache(): print a warning if doing ping-pong between cache types Kjetil Barvik
  -- strict thread matches above, loose matches on Subject: below --
2009-02-05 10:46 [PATCH/RFC v3 7/9] write_entry(): use fstat() instead of lstat() when file is open Kjetil Barvik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=868wojq0xa.fsf@broadpark.no \
    --to=barvik@broadpark$(echo .)no \
    --cc=Johannes.Schindelin@gmx$(echo .)de \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=j.sixt@viscovery$(echo .)net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox