From: Ian Kumlien <pomac@vapor•com>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail•com>
Cc: git@vger•kernel.org
Subject: Re: [PATCH 2/2] index-pack: reduce memory usage when the pack has large blobs
Date: Fri, 24 Feb 2012 16:37:53 +0100
Message-ID: <20120224153753.GG9526@pomac.netswarm.net>
In-Reply-To: <1330086201-13916-2-git-send-email-pclouds@gmail.com>

On Fri, Feb 24, 2012 at 07:23:21PM +0700, Nguyễn Thái Ngọc Duy wrote:
> This command unpacks every non-delta object in order to:
> 
> 1. calculate sha-1
> 2. do byte-to-byte sha-1 collision test if we happen to have objects
>    with the same sha-1
> 3. validate object content in strict mode
> 
> All this requires the entire object to stay in memory, which is bad news for
> giant blobs. This patch lowers memory consumption by not saving the
> object in memory whenever possible, calculating SHA-1 while unpacking
> the object.
> 
> This patch assumes that the collision test is rarely needed. The
> collision test will be done later in second pass if necessary, which
> puts the entire object back in memory again (We could even do the
> collision test without putting the entire object back in memory, by
> comparing as we unpack it).
> 
> In strict mode, it always keeps non-blob objects in memory for
> validation (blobs do not need data validation). "--strict --verify"
> also keeps blobs in memory.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail•com>
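
For readers following along, the idea in the description above (hashing while
unpacking instead of buffering the whole object) can be sketched roughly as
below. This is only an illustration in Python, not the C code from the patch:
the function name hash_object_streaming, its parameters, and the chunk size are
all made up for the example, and it assumes the object type and inflated size
are already known from the pack entry header.

```python
import hashlib
import io
import zlib


def hash_object_streaming(compressed_stream, obj_type, size, chunk_size=8192):
    """Return the hex SHA-1 of a zlib-compressed object, inflating it in
    bounded chunks so the whole object is never held in memory at once.

    Hypothetical helper for illustration; obj_type is bytes such as b"blob".
    """
    hasher = hashlib.sha1()
    # Git hashes "<type> <size>\0" followed by the object content.
    hasher.update(b"%s %d\0" % (obj_type, size))

    inflater = zlib.decompressobj()
    seen = 0
    buf = compressed_stream.read(chunk_size)
    while buf:
        # max_length caps how much inflated data we get back per call.
        data = inflater.decompress(buf, chunk_size)
        seen += len(data)
        hasher.update(data)
        # Re-feed input zlib has not consumed yet, else read more.
        buf = inflater.unconsumed_tail or compressed_stream.read(chunk_size)

    # Collect anything still pending inside the inflater.
    tail = inflater.flush()
    seen += len(tail)
    hasher.update(tail)

    if seen != size:
        raise ValueError("inflated size %d does not match header size %d"
                         % (seen, size))
    return hasher.hexdigest()


if __name__ == "__main__":
    content = b"hello\n"
    stream = io.BytesIO(zlib.compress(content))
    print(hash_object_streaming(stream, b"blob", len(content)))
```

The collision test and strict-mode validation mentioned in the description
would still need the object contents, which is why the patch falls back to
keeping the object in memory in those cases.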

Finally, reapplied the patches and so on:
remote: Counting objects: 1425, done.
remote: Compressing objects: 100% (617/617), done.
remote: Total 1425 (delta 790), reused 1425 (delta 790)
Receiving objects: 100% (1425/1425), 56.06 MiB | 3.97 MiB/s, done.
Resolving deltas: 100% (790/790), done.

real	1m57.742s
user	0m29.950s
sys	0m6.308s

*YAY*

I wonder how the hell I could have missed several parts of the patch =(

But there seems to be some issue in gerrit 2.1.8, will have to check
against a newer gerrit to verify if it's still a problem.

FYI - it seems to hang doing nothing.

As for your patches:
Tested-by: Ian Kumlien <pomac@vapor•com>

;)
