public inbox for git@vger.kernel.org 
From: Andreas Ericsson <ae@op5•se>
To: Nicolas Pitre <nico@cam•org>
Cc: "Shawn O. Pearce" <spearce@spearce•org>,
	Geert Bosch <bosch@adacore•com>, Andi Kleen <andi@firstfloor•org>,
	Ken Pratt <ken@kenpratt•net>,
	git@vger•kernel.org
Subject: Re: pack operation is thrashing my server
Date: Thu, 14 Aug 2008 08:33:59 +0200
Message-ID: <48A3D1D7.5030805@op5.se>
In-Reply-To: <alpine.LFD.1.10.0808131228270.4352@xanadu.home>

Nicolas Pitre wrote:
> On Wed, 13 Aug 2008, Shawn O. Pearce wrote:
> 
>> Nicolas Pitre <nico@cam•org> wrote:
>>> Well, we are talking about 50MB which is not that bad.
>> I think we're closer to 100MB here due to the extra overheads
>> I just alluded to above, and which weren't in your 104 byte
>> per object figure.
> 
> Sure.  That should still be workable on a machine with 256MB of RAM.
> 
>>> However there is a point where we should be realistic and just admit 
>>> that you need a sufficiently big machine if you have huge repositories 
>>> to deal with.  Git should be fine serving pull requests with relatively 
>>> little memory usage, but anything else such as the initial repack simply 
>>> requires enough RAM to be effective.
>> Yea.  But it would also be nice to be able to just concat packs
>> together.  Especially if the repository in question is an open source
>> one and everything published is already known to be in the wild,
>> as say it is also available over dumb HTTP.  Yea, I know people
>> like the 'security feature' of the packer not including objects
>> which aren't reachable.
> 
> It is not only that, even if it is a point I consider important.  If you 
> end up with 10 packs, it is likely that a base object in each of those 
> packs could simply be a delta against a single common base object, and 
> therefore the amount of data to transfer might be up to 10 times higher 
> than necessary.
> 

[cut]

>> This is also true for many internal corporate repositories.
>> Users probably have full read access to the object database anyway,
>> and maybe even have direct write access to it.  Doing the object
>> enumeration there is pointless as a security measure.
> 
> It is good for network bandwidth efficiency as I mentioned.
> 

As a corporate git user, I can say that I'm very rarely worried
about how much data gets sent over our in-office gigabit network.
My primary concern with server-side git is CPU- and I/O-heavy
operations, as we run the entire machine in a VMware guest OS,
which just plain sucks at such things.

With that in mind, a config variable in /etc/gitconfig would
work wonderfully for that situation, as our central watering
hole only ever serves locally.

>> I'm too busy to write a pack concat implementation proposal, so
>> I'll just shut up now.  But it wouldn't be hard if someone wanted
>> to improve at least the initial clone serving case.
> 
> A much better solution would consist of finding just _why_ object 
> enumeration is so slow.  This is indeed my biggest gripe with git 
> performance at the moment.
> 
> |nico@xanadu:linux-2.6> time git rev-list --objects --all > /dev/null
> |
> |real    0m21.742s
> |user    0m21.379s
> |sys     0m0.360s
> 
> That's way too long for 1030198 objects (roughly 48k objects/sec).  And 
> it gets even worse with the gcc repository:
> 
> |nico@xanadu:gcc> time git rev-list --objects --all > /dev/null
> |
> |real    1m51.591s
> |user    1m50.757s
> |sys     0m0.810s
> 
> That's for 1267993 objects, or about 11400 objects/sec.
> 
> Clearly something is not scaling here.
> 
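
For reference, the per-second rates quoted above can be reproduced from
the object counts and the wall-clock ("real") times. This is only a
sketch: `rate` is a hypothetical helper name, and the inputs are the
figures from the two timing runs, with 1m51.591s written as 111.591
seconds.

```shell
# Objects enumerated per second, truncated to an integer.
# "rate" is a hypothetical helper; arguments are object count and seconds.
rate() {
    awk -v n="$1" -v s="$2" 'BEGIN { printf "%d\n", n / s }'
}

rate 1030198 21.742     # linux-2.6
rate 1267993 111.591    # gcc (1m51.591s)
```

Dividing by the "user" time instead gives slightly higher numbers,
which is closer to the rounded figures in the quoted text. Either way
the gcc repository comes out roughly four times slower per object.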

What are the different packing options for the two repositories?
A longer delta chain and a larger pack window would increase the
enumeration time, wouldn't they?
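
One way to compare the two repositories is to look at the configured
`pack.window` and `pack.depth` values and at the delta chain histogram
that `git verify-pack -v` prints for each pack. The sketch below
assumes a non-bare repository with packs under `.git/objects/pack`;
`pack_report` is a hypothetical helper name.

```shell
# Print packing-related settings and delta chain statistics for a repo.
# "pack_report" is a hypothetical helper; $1 is the repository path
# (defaults to the current directory). Assumes a non-bare repository.
pack_report() {
    repo=${1:-.}
    echo "pack.window: $(git -C "$repo" config --get pack.window || echo unset)"
    echo "pack.depth: $(git -C "$repo" config --get pack.depth || echo unset)"
    for idx in "$repo"/.git/objects/pack/pack-*.idx; do
        [ -e "$idx" ] || continue    # skip if there are no packs yet
        # verify-pack -v ends with a histogram of delta chain lengths
        git verify-pack -v "$idx" | grep -E '^(non delta|chain length)' || true
    done
}

pack_report .
```

Note that the configuration only shows what a plain repack would use
now. A pack created earlier with explicit `--window` or `--depth`
arguments to `git repack` can look quite different, so the histogram is
the more reliable of the two.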

-- 
Andreas Ericsson                   andreas.ericsson@op5•se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

