public inbox for git@vger.kernel.org 
From: Shawn Pearce <spearce@spearce•org>
To: Linus Torvalds <torvalds@osdl•org>
Cc: git@vger•kernel.org
Subject: Re: 1.3.0 creating bigger packs than 1.2.3
Date: Thu, 20 Apr 2006 12:43:51 -0400
Message-ID: <20060420164351.GB31738@spearce.org>
In-Reply-To: <Pine.LNX.4.64.0604200857460.3701@g5.osdl.org>

Linus Torvalds <torvalds@osdl•org> wrote:
> 
> 
> On Thu, 20 Apr 2006, Shawn Pearce wrote:
> > 
> > So with 1.3.0.g56c1 "git repack -a -d -f" did worse:
> > 
> >   Total 46391, written 46391 (delta 6649), reused 39742 (delta 0)
> >   129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
> > 
> > I just tried -f on v1.2.3 and it did slightly better than before:
> > 
> >   Total 46391, written 46391 (delta 6847), reused 38012 (delta 0)
> >    59M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

Oddly enough repacking the v1.2.3 pack using 1.3.0.g56c1 created an
even smaller pack ("git-repack -a -d"):

  Total 46391, written 46391 (delta 8253), reused 44985 (delta 6847)
   49M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

and repacking again with "git-repack -a -d" chopped another 1M:

  Total 46391, written 46391 (delta 8258), reused 46386 (delta 8253)
   48M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

but then adding -f definitely gives us the 2x explosion again:

  Total 46391, written 46391 (delta 6649), reused 37894 (delta 0)
  129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
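
To put the figures above side by side, here is a small sketch that computes each pack's size relative to the 59M v1.2.3 pack and the share of objects stored as deltas. The numbers are copied from the repack output in this message; the labels and the `summarize` helper are made up for illustration.

```python
# Figures copied from the repack output quoted in this message.
# Each entry: (label, pack size in MB, number of objects stored as deltas).
TOTAL_OBJECTS = 46391
runs = [
    ("1.2.3 -f", 59, 6847),
    ("1.3.0.g56c1, reusing 1.2.3 deltas", 49, 8253),
    ("1.3.0.g56c1, second repack", 48, 8258),
    ("1.3.0.g56c1 -f", 129, 6649),
]

def summarize(runs, baseline_mb):
    """Return (label, size ratio vs. baseline, fraction of objects stored as deltas)."""
    return [
        (label, round(size_mb / baseline_mb, 2), round(deltas / TOTAL_OBJECTS, 3))
        for label, size_mb, deltas in runs
    ]

for label, ratio, delta_frac in summarize(runs, baseline_mb=59):
    print(f"{label:36s} size x{ratio:<5} deltas {delta_frac:.1%}")
```

The delta fraction moves by only a few percentage points between runs, while the pack size more than doubles.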

> Interesting. The bigger packs do generate fewer deltas, but they don't 
> seem to be _that_ much fewer. And the deltas themselves certainly 
> shouldn't be bigger.
> 
> It almost sounds like there's a problem with choosing what to delta 
> against, not necessarily a delta algorithm problem. Although that sounds a 
> bit strange, because I wouldn't have thought we actually changed the 
> packing algorithm noticeably since 1.2.3.
> 
> Hmm. Doing "gitk v1.2.3.. -- pack-objects.c" shows that I was wrong. Junio 
> did the "hash basename and direname a bit differently" thing, which would 
> appear to change the "find objects to delta against" a lot. That could be 
> it. 
> 
> You could try to revert that change:
> 
> 	git revert eeef7135fed9b8784627c4c96e125241c06c65e1
> 
> which needs a trivial manual fixup (remove the conflict entirely: 
> everything between the "<<<<" and ">>>>>" lines should go), and see if 
> that's it.

Whoa.  I did that revert and fixup on top of 'next'.  The pack
from "git-repack -a -d -f" is now even larger, with even fewer
deltas:

  Total 46391, written 46391 (delta 5148), reused 39565 (delta 0)
  171M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
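
As a rough illustration of why a name-derived sort key matters for "what to delta against", the sketch below compares two orderings of a handful of paths. Both key functions and the path list are hypothetical stand-ins; this is not the hash that pack-objects.c uses, before or after the commit in question.

```python
import posixpath

# Hypothetical paths, chosen only to make the effect visible.
paths = [
    "Documentation/Makefile",
    "Documentation/git.txt",
    "Makefile",
    "t/Makefile",
    "t/test-lib.sh",
    "templates/Makefile",
]

def by_full_path(path):
    """Sort key that compares the whole path as one string."""
    return path

def by_basename_then_dirname(path):
    """Sort key that compares the file name first, then its directory.
    An illustrative stand-in, NOT the hash used by pack-objects.c."""
    return (posixpath.basename(path), posixpath.dirname(path))

def adjacent_same_basename(ordered):
    """Count neighbouring pairs in the ordering that share a basename."""
    names = [posixpath.basename(p) for p in ordered]
    return sum(1 for a, b in zip(names, names[1:]) if a == b)

full = sorted(paths, key=by_full_path)
base = sorted(paths, key=by_basename_then_dirname)
print(adjacent_same_basename(full), adjacent_same_basename(base))
```

Under the basename-first key all four Makefiles end up next to each other, so a search that only looks at nearby entries is more likely to pair them up.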

> You can also try to see if
> 
> 	git repack -a -d -f --window=50
> 
> makes for a better pack (at the cost of a much slower repack). It makes 
> git try more objects to delta against, and can thus hide a bad sort order.

With --window=50 on 'next' (without the revert):

  Total 46391, written 46391 (delta 6666), reused 39723 (delta 0)
  129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

For added measure I tried --window=100 and 500 with pretty much
the same result (slightly higher delta count but still a 129M pack).
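
For readers unfamiliar with the window idea, here is a toy sketch of how a larger window can hide a bad sort order. It is a simplification under stated assumptions: objects are reduced to a "family" label, and finding an earlier object of the same family within the window stands in for finding a usable delta base. The real search in pack-objects also weighs sizes and attempts actual deltas; `count_with_candidate` is a made-up helper.

```python
def count_with_candidate(ordered_families, window):
    """For each object, look back at up to `window` earlier objects in the
    sorted list and count how many objects find one of the same family
    (a stand-in for "a plausible delta base")."""
    found = 0
    for i, family in enumerate(ordered_families):
        start = max(0, i - window)
        if family in ordered_families[start:i]:
            found += 1
    return found

# Toy data: three families of similar objects, four versions each.
good_order = ["a"] * 4 + ["b"] * 4 + ["c"] * 4   # similar objects adjacent
bad_order = ["a", "b", "c"] * 4                  # similar objects interleaved

for w in (1, 2, 3):
    print(w, count_with_candidate(good_order, w), count_with_candidate(bad_order, w))
```

With the good ordering a window of 1 already finds every candidate; with the interleaved ordering nothing is found until the window is wide enough to span the interleaving. In the runs reported above, widening the window made almost no difference to the pack size.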

-- 
Shawn.
