public inbox for git@vger.kernel.org 
From: Chris Mason <mason@suse•com>
To: Linus Torvalds <torvalds@osdl•org>
Cc: git@vger•kernel.org
Subject: Re: [PATCH] write-tree performance problems
Date: Tue, 19 Apr 2005 20:49:21 -0400
Message-ID: <200504192049.21947.mason@suse.com>
In-Reply-To: <Pine.LNX.4.58.0504191420060.19286@ppc970.osdl.org>

On Tuesday 19 April 2005 17:23, Linus Torvalds wrote:
> On Tue, 19 Apr 2005, Chris Mason wrote:
> > Regardless, putting it into the index somehow should be fastest, I'll see
> > what I can do.
>
> Start by putting it in at "read-tree" time, and adding the code to
> invalidate all parent directory indexes when somebody changes a file in
> the index (ie "update-cache" for anything but a "--refresh").
>
> That would be needed anyway, since those two are the ones that already
> change the index file.
>
> Once you're sure that you can correctly invalidate the entries (so that
> you could never use a stale tree entry by mistake), the second stage would
> be to update it at "write-tree" time.

This was much easier than I expected, and it seems to be working here.  It 
does slow write-tree down slightly because we have to write out the index 
file, but I can get around that with the change that puts the index file on 
tmpfs.

The original write-tree needs .54 seconds to run.

write-tree with the index speedup gets that down to .024s (same as my first 
patch) when nothing has changed.  When it has to rewrite the index file 
because something changed, it's .167s.

I'll finish off the patch once you ok the basics below.  My current code works 
like this:

1) read-tree will insert index entries for directories.  There is no index 
entry for the root.

2) update-cache removes index entries for all parents of the file you're 
updating.  So, if you update-cache fs/ext3/inode.c, I remove the index 
entries for fs and fs/ext3.

3) If write-tree finds a directory in the index, it uses the sha1 in the cache 
entry and skips all files/dirs under that directory.

4) If write-tree detects a subdir with no directory in the index, it calls 
write_tree the same way it used to.  It then inserts a new cache object with 
the calculated sha1.

5) Right before exiting, write-tree updates the index if it made any changes.
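
The five steps above can be sketched as a toy model.  This is a minimal 
Python sketch, not the actual patch (which is C): the Index class, the 
update_cache, write_tree and run_write_tree helpers, and the tree hash 
format are all hypothetical and chosen only for illustration.  read-tree 
(step 1) is not modeled; here the first write-tree run populates the 
directory entries instead.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


class Index:
    """Toy index: file entries plus optional cached directory entries."""

    def __init__(self):
        self.files = {}     # file path -> blob sha1 (hex)
        self.dirs = {}      # directory path -> cached tree sha1 (hex)
        self.dirty = False  # set when write_tree adds a directory entry


def update_cache(index, path, content):
    """Step 2: record the file and drop the entry of every parent directory."""
    index.files[path] = sha1_hex(content)
    parent = path.rpartition("/")[0]
    while parent:
        index.dirs.pop(parent, None)
        parent = parent.rpartition("/")[0]


def write_tree(index, prefix="", computed=None):
    """Steps 3 and 4: return the tree sha1 for `prefix` ("" is the root)."""
    # Step 3: a directory entry that is present is valid, so reuse it.
    if prefix and prefix in index.dirs:
        return index.dirs[prefix]

    # Step 4: no cached entry, so build the tree from its children.
    start = prefix + "/" if prefix else ""
    entries = {}
    for path, sha in index.files.items():
        if not path.startswith(start):
            continue
        name, _, rest = path[len(start):].partition("/")
        if rest:
            # Subdirectory: resolve it once, from the cache when possible.
            if name not in entries:
                sub = write_tree(index, start + name, computed)
                entries[name] = ("tree", sub)
        else:
            entries[name] = ("blob", sha)

    # Made-up serialization; this is NOT git's real tree object format.
    body = "".join(
        f"{kind} {name} {sha}\n" for name, (kind, sha) in sorted(entries.items())
    )
    tree_sha = sha1_hex(body.encode())
    if computed is not None:
        computed.append(prefix)

    # Insert a new directory entry; the root never gets one.
    if prefix:
        index.dirs[prefix] = tree_sha
        index.dirty = True
    return tree_sha


def run_write_tree(index):
    """Step 5: report whether the index changed and must be written back."""
    index.dirty = False
    computed = []
    root = write_tree(index, "", computed)
    return root, computed, index.dirty
```

In this model a second run_write_tree call with nothing changed recomputes 
only the root tree and leaves the index clean, while an update_cache of 
fs/ext3/inode.c drops the entries for fs and fs/ext3, so the next run 
rebuilds those two directories and marks the index for rewriting.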

The downside to this setup is that I've got to change other index users to 
deal with directory entries that are there sometimes and missing other times.  
The nice part is that I don't have to "invalidate" a directory entry: if it 
is present, it is valid.

-chris
