public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: thomas <trast@student•ethz.ch>
To: Junio C Hamano <gitster@pobox•com>
Cc: "Stefan Zager" <szager@google•com>,
	git@vger•kernel.org, "Jeff King" <peff@peff•net>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail•com>,
	"Nicolas Pitre" <nico@fluxnic•net>
Subject: Re: [PATCH] sha1_file: remove recursion in packed_object_info
Date: Mon, 25 Mar 2013 10:27:16 +0100	[thread overview]
Message-ID: <87620faky3.fsf@linux-k42r.v.cablecom.net> (raw)
In-Reply-To: <7vtxo6f27l.fsf@alter.siamese.dyndns.org> (Junio C. Hamano's message of "Wed, 20 Mar 2013 09:47:10 -0700")

Junio C Hamano <gitster@pobox•com> writes:

> Thomas Rast <trast@student•ethz.ch> writes:
>
>> So here's a nonrecursive version.  Dijkstra is probably turning over
>> in his grave as we speak.
>>
>> I *think* I actually got it right.
>
> You seem to have lost the "if we cannot get delta base, this object
> is BAD" check where you measure the size of a deltified object,
> which would correspond to this check:
>
>> -static int packed_delta_info(struct packed_git *p,
>> -			     struct pack_window **w_curs,
>> -			     off_t curpos,
>> -			     enum object_type type,
>> -			     off_t obj_offset,
>> -			     unsigned long *sizep)
>> -{
>> -	off_t base_offset;
>> -
>> -	base_offset = get_delta_base(p, w_curs, &curpos, type, obj_offset);
>> -	if (!base_offset)
>> -		return OBJ_BAD;

True, I'll think about this.

> The following comment is also lost but...
>
>> -	/* We choose to only get the type of the base object and
>> -	 * ignore potentially corrupt pack file that expects the delta
>> -	 * based on a base with a wrong size.  This saves tons of
>> -	 * inflate() calls.
>> -	 */
>> -	if (sizep) {
>> -		*sizep = get_size_from_delta(p, w_curs, curpos);
>> -		if (*sizep == 0)
>> -			type = OBJ_BAD;
>
> ... is this check correct?  There is an equivalent check at the
> beginning of the new packed_object_info() to error out a deltified
> result.  Why is an object whose size is 0 bad?

Cc'ing Nicolas, but I think there are several reasons:

If it's a delta, then according to docs[1] it starts with the SHA1 of
the base object, plus the deflated data.  So it is at least 20 bytes.

If it's not a delta, then it must start with '<type> <size>\0', which
even after compression cannot possibly be 0 bytes.

Either way, get_size_from_delta() also uses 0 as the error return.

> The stack/recursion is used _only_ for error recovery, no?  If we do
> not care about retrying with a different copy of an object we find
> in the delta chain, we can just update obj_offset with base_offset
> and keep digging.  It almost makes me wonder if a logical follow-up
> to this patch may be to do so, and rewrite the error recovery
> codepath to just mark the bad copy and jump back to the very top,
> retrying everything from scratch.

I totally agree.  I'll try this again -- my last attempt just didn't
work out...



[1]  Documentation/technical/pack-format.txt

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

  reply	other threads:[~2013-03-25  9:28 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-15 22:42 regression in multi-threaded git-pack-index Stefan Zager
2013-03-16 11:41 ` Jeff King
2013-03-16 12:38   ` Duy Nguyen
2013-03-19  8:17   ` Thomas Rast
2013-03-19  9:30     ` Jeff King
2013-03-19  9:59       ` Jeff King
2013-03-19 10:08         ` Jeff King
2013-03-19 10:24           ` Jeff King
2013-03-19 10:29             ` Thomas Rast
2013-03-19 10:33               ` Jeff King
2013-03-19 10:45                 ` Thomas Rast
2013-03-19 10:47                   ` Jeff King
2013-03-19 10:58             ` [PATCH] index-pack: always zero-initialize object_entry list Jeff King
2013-03-19 15:33               ` Thomas Rast
2013-03-19 15:43                 ` Jeff King
2013-03-19 15:52                   ` Jeff King
2013-03-19 16:17                     ` [PATCH v2] " Jeff King
2013-03-19 16:27                       ` Thomas Rast
2013-03-19 17:13                         ` Junio C Hamano
2013-03-20 19:12                       ` Eric Sunshine
2013-03-20 19:13                         ` Jeff King
2013-03-20 19:14                           ` Eric Sunshine
2013-03-19 12:35     ` regression in multi-threaded git-pack-index Duy Nguyen
2013-03-19 13:01       ` [PATCH] index-pack: protect deepest_delta in multithread code Nguyễn Thái Ngọc Duy
2013-03-19 13:25         ` Jeff King
2013-03-19 13:50         ` Thomas Rast
2013-03-19 14:07           ` Duy Nguyen
2013-03-19 14:16             ` [PATCH v2] index-pack: guard nr_resolved_deltas reads by lock Thomas Rast
2013-03-19 15:53               ` Junio C Hamano
2013-03-19 15:41 ` regression in multi-threaded git-pack-index Thomas Rast
2013-03-19 15:45   ` Thomas Rast
2013-03-19 16:11     ` Thomas Rast
2013-03-19 17:58       ` Junio C Hamano
2013-03-19 22:08         ` [PATCH] sha1_file: remove recursion in packed_object_info Thomas Rast
2013-03-20 16:47           ` Junio C Hamano
2013-03-25  9:27             ` thomas [this message]
2013-03-25 18:07               ` [PATCH v2 0/3] Recursion-free unpack_entry and packed_object_info Thomas Rast
2013-03-25 18:07                 ` [PATCH v2 1/3] sha1_file: remove recursion in packed_object_info Thomas Rast
2013-03-25 18:07                 ` [PATCH v2 2/3] Refactor parts of in_delta_base_cache/cache_or_unpack_entry Thomas Rast
2013-03-25 23:15                   ` Junio C Hamano
2013-03-26 11:09                     ` thomas
2013-03-25 18:07                 ` [PATCH v2 3/3] sha1_file: remove recursion in unpack_entry Thomas Rast
2013-03-25 23:19                   ` Junio C Hamano
2013-03-26  3:37                 ` [PATCH v2 0/3] Recursion-free unpack_entry and packed_object_info Nicolas Pitre
2013-03-25 18:17               ` [PATCH] sha1_file: remove recursion in packed_object_info Junio C Hamano
2013-03-27 20:03               ` [PATCH v3 0/3] Recursion-free unpack_entry and packed_object_info Thomas Rast
2013-03-27 20:03                 ` [PATCH v3 1/3] sha1_file: remove recursion in packed_object_info Thomas Rast
2013-03-27 20:03                 ` [PATCH v3 2/3] Refactor parts of in_delta_base_cache/cache_or_unpack_entry Thomas Rast
2013-03-27 20:03                 ` [PATCH v3 3/3] sha1_file: remove recursion in unpack_entry Thomas Rast
2013-03-27 20:29                   ` Junio C Hamano
2013-03-20  1:17       ` regression in multi-threaded git-pack-index Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87620faky3.fsf@linux-k42r.v.cablecom.net \
    --to=trast@student$(echo .)ethz.ch \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=nico@fluxnic$(echo .)net \
    --cc=pclouds@gmail$(echo .)com \
    --cc=peff@peff$(echo .)net \
    --cc=szager@google$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox