public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Thomas Gummerer <t.gummerer@gmail•com>
To: Andreas Krey <a.krey@gmx•de>
Cc: Git Mailing List <git@vger•kernel.org>,
	Junio C Hamano <gitster@pobox•com>
Subject: Re: [PATCH] refs.c: get_ref_cache: use a bucket hash
Date: Mon, 16 Mar 2015 18:19:10 +0100	[thread overview]
Message-ID: <20150316171909.GA8618@hank> (raw)
In-Reply-To: <20150316142026.GJ7847@inner.h.apk.li>

Hi,

On 03/16, Andreas Krey wrote:
> get_ref_cache used a linear list, which obviously is O(n^2).
> Use a fixed bucket hash which just takes a factor of 100000
> (~ 317^2) out of the n^2 - which is enough.
>
> Signed-off-by: Andreas Krey <a.krey@gmx•de>
> ---
>
> This brings 'git clean -ndx' times down from 17 minutes
> to 11 seconds on one of our workspaces (which accumulated
> a lot of ignored directories). Actuallly using adaptive
> hashing or other structures seems overkill.
>
>  refs.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index e23542b..8198d9e 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -982,6 +982,8 @@ struct packed_ref_cache {
>  	struct stat_validity validity;
>  };
>
> +#define REF_CACHE_HASH 317
> +
>  /*
>   * Future: need to be in "struct repository"
>   * when doing a full libification.
> @@ -996,7 +998,7 @@ static struct ref_cache {
>  	 * is initialized correctly.
>  	 */
>  	char name[1];
> -} ref_cache, *submodule_ref_caches;
> +} ref_cache, *submodule_ref_caches[REF_CACHE_HASH];
>
>  /* Lock used for the main packed-refs file: */
>  static struct lock_file packlock;
> @@ -1065,18 +1067,19 @@ static struct ref_cache *create_ref_cache(const char *submodule)
>   */
>  static struct ref_cache *get_ref_cache(const char *submodule)
>  {
> -	struct ref_cache *refs;
> +	struct ref_cache *refs, **bucketp;
> +	bucketp = submodule_ref_caches + strhash(submodule) % REF_CACHE_HASH;
>

This breaks the test-suite for me, in the cases where submodule is
NULL.  How about something like this on top?

diff --git a/refs.c b/refs.c
index 8198d9e..311faf2 100644
--- a/refs.c
+++ b/refs.c
@@ -1068,7 +1068,9 @@ static struct ref_cache *create_ref_cache(const char *submodule)
 static struct ref_cache *get_ref_cache(const char *submodule)
 {
        struct ref_cache *refs, **bucketp;
-       bucketp = submodule_ref_caches + strhash(submodule) % REF_CACHE_HASH;
+       bucketp = submodule_ref_caches;
+       if (submodule)
+               bucketp += strhash(submodule) % REF_CACHE_HASH;

        if (!submodule || !*submodule)
                return &ref_cache;

>  	if (!submodule || !*submodule)
>  		return &ref_cache;
>
> -	for (refs = submodule_ref_caches; refs; refs = refs->next)
> +	for (refs = *bucketp; refs; refs = refs->next)
>  		if (!strcmp(submodule, refs->name))
>  			return refs;
>
>  	refs = create_ref_cache(submodule);
> -	refs->next = submodule_ref_caches;
> -	submodule_ref_caches = refs;
> +	refs->next = *bucketp;
> +	*bucketp = refs;
>  	return refs;
>  }
>
> --
> 2.3.2.223.g7a9409c
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger•kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2015-03-16 17:19 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-16 14:20 [PATCH] refs.c: get_ref_cache: use a bucket hash Andreas Krey
2015-03-16 17:19 ` Thomas Gummerer [this message]
2015-03-16 17:23 ` Junio C Hamano
2015-03-16 18:40   ` Andreas Krey
2015-03-17  2:40     ` Jeff King
2015-03-17  5:35       ` Junio C Hamano
2015-03-17  5:48         ` Jeff King
2015-11-13 15:29           ` Andreas Krey
2015-11-14  0:01             ` Jeff King
2015-11-14 13:22               ` Andreas Krey
2015-11-14 13:35               ` Andreas Krey
2015-11-16 16:31                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150316171909.GA8618@hank \
    --to=t.gummerer@gmail$(echo .)com \
    --cc=a.krey@gmx$(echo .)de \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox