From: Junio C Hamano <gitster@pobox•com>
To: Jeff King <peff@peff•net>
Cc: David Turner <novalis@novalis•org>,
Duy Nguyen <pclouds@gmail•com>,
Git Mailing List <git@vger•kernel.org>
Subject: Re: "disabling bitmap writing, as some objects are not being packed"?
Date: Wed, 08 Feb 2017 16:18:25 -0800
Message-ID: <xmqqbmuctdwu.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <20170208230057.hking37uuynf4cgd@sigill.intra.peff.net> (Jeff King's message of "Wed, 8 Feb 2017 18:00:57 -0500")

Jeff King <peff@peff•net> writes:

> In my experience, auto-gc has never been a low-maintenance operation on
> the server side (and I do think it was primarily designed with clients
> in mind).

I do not think auto-gc was ever tweaked to help server usage in its
history, since it was invented strictly to help end-users (mostly new
ones).

> At GitHub we disable it entirely, and do our own gc based on a throttled
> job queue ...
> I wish regular Git were more turn-key in that respect. Maybe it is for
> smaller sites, but we certainly didn't find it so. And I don't know that
> it's feasible to really share the solution. It's entangled with our
> database (to store last-pushed and last-maintenance values for repos)
> and our job scheduler.

Thanks for sharing the insights from the trenches ;-)

> Yeah, I'm certainly open to improving Git's defaults. If it's not clear
> from the above, I mostly just gave up for a site the size of GitHub. :)
>
>> Idea 1: when gc --auto would issue this message, instead it could create
>> a file named gc.too-much-garbage (instead of gc.log), with this message.
>> If that file exists, and it is less than one day (?) old, then we don't
>> attempt to do a full gc; instead we just run git repack -A -d. (If it's
>> more than one day old, we just delete it and continue anyway).
>
> I kind of wonder if this should apply to _any_ error. I.e., just check
> the mtime of gc.log and forcibly remove it when it's older than a day.
> You never want to get into a state that will fail to resolve itself
> eventually. That might still happen (e.g., corrupt repo), but at the
> very least it won't be because Git is too dumb to try again.

;-)

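As a rough illustration of the retry logic discussed above (this is not
how "git gc --auto" is implemented), here is a minimal Python sketch. It
assumes the marker files live directly under the git directory, uses the
"one day" threshold floated in the thread, and combines the generic
gc.log expiry with Idea 1's gc.too-much-garbage marker. The helper names
marker_is_fresh() and plan_auto_gc() are hypothetical, as is the choice
to return the command to run rather than run it.

```python
import os
import time

ONE_DAY = 24 * 60 * 60  # the "one day (?)" threshold floated in the thread


def marker_is_fresh(path, max_age=ONE_DAY, now=None):
    """Return True if `path` exists and is younger than `max_age` seconds.

    A marker that is `max_age` or older is deleted, so the next run
    retries instead of staying blocked forever.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return False
    if now - mtime < max_age:
        return True
    os.remove(path)  # stale: forget the old failure and try again
    return False


def plan_auto_gc(git_dir, now=None):
    """Pick what an auto-gc run should do (hypothetical policy helper).

    Returns None to skip this run, or the command to execute as a list.
    """
    # Generic variant: a recent gc.log blocks the run; an old one is removed.
    if marker_is_fresh(os.path.join(git_dir, "gc.log"), now=now):
        return None
    # Idea 1: a recent gc.too-much-garbage downgrades to a plain repack.
    if marker_is_fresh(os.path.join(git_dir, "gc.too-much-garbage"), now=now):
        return ["git", "repack", "-A", "-d"]
    return ["git", "gc"]
```

The point of deleting the stale marker inside the check is that the
repository can never stay stuck behind an old failure: at worst, one
failed attempt is repeated per day.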
>> Idea 2 : Like idea 1, but instead of repacking, just smash the existing
>> packs together into one big pack. In other words, don't consider
>> dangling objects, or recompute deltas. Twitter has a tool called "git
>> combine-pack" that does this:
>> https://github.com/dturner-tw/git/blob/dturner/journal/builtin/combine-pack.c
>
> We wrote something similar at GitHub, too, but we never ended up using
> it in production. We found that with a sane scheduler, it's not too big
> a deal to just do maintenance once in a while.

Thanks again for this. I've also been wondering about how effective
a "concatenate packs without paying reachability penalty" approach
would be.

> I'm still not sure if it's worth making the fatal/non-fatal distinction.
> Doing so is perhaps safer, but it does mean that somebody has to decide
> which errors are important enough to block a retry totally, and which
> are not. In theory, it would be safe to always _try_ and then the gc
> process can decide when something is broken and abort. And all you've
> wasted is some processing power each day.

Yup, and somebody or something needs to monitor so that repeated
failures can be dealt with.