public inbox for git@vger.kernel.org 
From: Junio C Hamano <gitster@pobox•com>
To: Jeff King <peff@peff•net>
Cc: Michael Haggerty <mhagger@alum•mit.edu>,
	git discussion list <git@vger•kernel.org>
Subject: Re: [PATCH 2/2] log: do not shorten decoration names too early
Date: Thu, 14 May 2015 10:37:39 -0700
Message-ID: <xmqq8ucr83h8.fsf@gitster.dls.corp.google.com>
In-Reply-To: <20150514063317.GA22509@peff.net> (Jeff King's message of "Thu, 14 May 2015 02:33:18 -0400")

Jeff King <peff@peff•net> writes:

> While we are on the subject of the name_decoration code, I had
> considered at one point replacing the use of the decorate.[ch] hash
> table with a commit_slab (you can't do it in the general case, because
> decorate.[ch] handles arbitrary objects, but the name_decoration code
> only does commits). It would in theory be faster, though I don't know if
> the time we spend on the hash table is actually measurable (we make a
> lot of queries on it, but it doesn't actually get that big in the first
> place).

Hmmm, but I do not know if commit_slab is a good fit for the usage
pattern.  I expect commit objects to be fairly sparsely decorated
(e.g. a tag or ref pointing at say 1-2% of commits or fewer).
Wouldn't the hashtable used by decorate.[ch], with the max load
factor capped to 66%, be a better economy?
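
To put rough numbers on that, here is a back-of-the-envelope sketch.
It is only an illustration under stated assumptions, not a measurement
of either implementation: dense_bytes() and sparse_bytes() are
hypothetical helpers, the dense case assumes the worst case where every
commit index ends up with a pointer-sized slot, and the sparse case
assumes each table entry holds a key pointer and a value pointer.

```c
#include <stddef.h>

/*
 * Hypothetical estimate: a flat per-commit array with one
 * pointer-sized slot for every commit index.
 */
static size_t dense_bytes(size_t nr_commits)
{
	return nr_commits * sizeof(void *);
}

/*
 * Hypothetical estimate: an open-addressing table kept at or below
 * a 2/3 load factor, where each entry is assumed to hold a key
 * pointer and a value pointer.
 */
static size_t sparse_bytes(size_t nr_decorated)
{
	size_t slots = (nr_decorated * 3 + 1) / 2; /* ceil(nr * 3 / 2) */

	return slots * 2 * sizeof(void *);
}
```

With 100,000 commits of which 2% (2,000) are decorated, and 8-byte
pointers, this gives 800,000 bytes for the dense layout against 48,000
bytes for the table.  A real table grows in steps and so would be
somewhat larger than this minimum.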

I notice that there is no API into commit_slab to ask "Does this
commit have data in the slab?"  *slabname##_at() may be the closest
thing, but that would allocate the space and then say "This is the
slot for that commit; go check if there is data there already."

In the context of using commit_slab in log-tree.c for decoration, it
would mean that we assign low slab indices to commits at the tips by
first calling "for_each_ref(add_ref_decoration)" and populate the
slab fairly densely at the beginning.  But when we check if a commit
that we encountered during a traversal is decorated or not, we would
ask *slabname##_at(), and that ends up enlarging the slab, even
though at that point the only thing we are interested in is whether
the commit is decorated, and we are not adding a new decoration for it.

For example, we have this in commit.c:

    const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep)
    {
            struct commit_buffer *v = buffer_slab_at(&buffer_slab, commit);
            if (sizep)
                    *sizep = v->size;
            return v->buffer;
    }

But if we do not have the "buffer" data cached for that commit (via
an earlier call to set_commit_buffer()), we don't have to enlarge
the slab, as we are not adding anything to the slab system with this
call.

Perhaps we want a new function *slabname##_peek() with the same
signature as *slabname##_at() that returns NULL when commit->index
is larger than the last existing element in the slab?  Then the
above would become more like:

    const void *get_cached_commit_buffer(const struct commit *commit, unsigned long *sizep)
    {
            struct commit_buffer *v = buffer_slab_peek(&buffer_slab, commit);
            if (!v)
                    return NULL;
            if (sizep)
                    *sizep = v->size;
            return v->buffer;
    }
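
To make the difference between the two accessors concrete, here is a
minimal, self-contained sketch.  It is not the commit-slab.h
implementation: "struct toy_slab", toy_slab_at() and toy_slab_peek()
are hypothetical names, and a single flat array indexed by a plain
number stands in for the real slab layout and for commit->index.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a slab of ints keyed by commit->index. */
struct toy_slab {
	int *slot;	/* one element per index */
	size_t alloc;	/* number of elements currently allocated */
};

/* Like *slabname##_at(): grows the array so the slot always exists. */
static int *toy_slab_at(struct toy_slab *s, size_t index)
{
	if (index >= s->alloc) {
		size_t nalloc = index + 1;
		int *p = realloc(s->slot, nalloc * sizeof(*p));

		assert(p != NULL); /* sketch only: no real error handling */
		/* zero-fill only the newly added tail */
		memset(p + s->alloc, 0, (nalloc - s->alloc) * sizeof(*p));
		s->slot = p;
		s->alloc = nalloc;
	}
	return &s->slot[index];
}

/* Proposed peek: never allocates; NULL means "beyond what exists". */
static int *toy_slab_peek(const struct toy_slab *s, size_t index)
{
	if (index >= s->alloc)
		return NULL;
	return &s->slot[index];
}
```

Note that a non-NULL return from the peek variant only says that the
slot exists; the caller still has to check whether the slot holds
anything, as a zero-filled slot looks the same as "no data".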

