public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox•com>
To: git@jeffhostetler•com
Cc: git@vger•kernel.org, peff@peff•net,
	Jeff Hostetler <jeffhost@microsoft•com>
Subject: Re: [PATCH v3] unpack-trees: avoid duplicate ODB lookups during checkout
Date: Thu, 13 Apr 2017 16:11:31 -0700	[thread overview]
Message-ID: <xmqqtw5roqcc.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20170413155852.4281-2-git@jeffhostetler.com> (git@jeffhostetler.com's message of "Thu, 13 Apr 2017 15:58:52 +0000")

git@jeffhostetler•com writes:

> +	/*
> +	 * Fetch the tree from the ODB for each peer directory in the
> +	 * n commits.
> +	 *
> +	 * For 2- and 3-way traversals, we try to avoid hitting the
> +	 * ODB twice for the same OID.  This should yield a nice speed
> +	 * up in checkouts and merges when the commits are similar.
> +	 *
> +	 * We don't bother doing the full O(n^2) search for larger n,
> +	 * because wider traversals don't happen that often and we
> +	 * avoid the search setup.
> +	 *
> +	 * When 2 peer OIDs are the same, we just copy the tree
> +	 * descriptor data.  This implicitly borrows the buffer
> +	 * data from the earlier cell.

This "borrowing" made me worried, but it turns out that this is
perfectly OK.

fill_tree_descriptor() uses read_sha1_file() to give a tree_desc its
own copy of the tree object data, so the code that calls into the
tree traversal API is responsible for freeing the buffer returned
from fill_tree_descriptor().  The updated code avoids double freeing
by setting buf[i] to NULL for borrowing [i].

> +	 */
>  	for (i = 0; i < n; i++, dirmask >>= 1) {
> -		const unsigned char *sha1 = NULL;
> -		if (dirmask & 1)
> -			sha1 = names[i].oid->hash;
> -		buf[i] = fill_tree_descriptor(t+i, sha1);
> +		if (i > 0 && are_same_oid(&names[i], &names[i - 1])) {
> +			t[i] = t[i - 1];
> +			buf[i] = NULL;
> +		} else if (i > 1 && are_same_oid(&names[i], &names[i - 2])) {
> +			t[i] = t[i - 2];
> +			buf[i] = NULL;
> +		} else {
> +			const unsigned char *sha1 = NULL;
> +			if (dirmask & 1)
> +				sha1 = names[i].oid->hash;
> +			buf[i] = fill_tree_descriptor(t+i, sha1);
> +		}
>  	}
>  
>  	bottom = switch_cache_bottom(&newinfo);

  reply	other threads:[~2017-04-13 23:11 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-13 15:58 [PATCH v3] unpack-trees: avoid duplicate ODB lookups during checkout git
2017-04-13 15:58 ` git
2017-04-13 23:11   ` Junio C Hamano [this message]
2017-04-14  0:59     ` Jeff King
2017-04-14 19:32       ` Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqtw5roqcc.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox$(echo .)com \
    --cc=git@jeffhostetler$(echo .)com \
    --cc=git@vger$(echo .)kernel.org \
    --cc=jeffhost@microsoft$(echo .)com \
    --cc=peff@peff$(echo .)net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox