From: Junio C Hamano <gitster@pobox•com>
To: David Turner <dturner@twopensource•com>
Cc: git@vger•kernel.org, mhagger@alum•mit.edu, neleai@seznam•cz,
David Turner <dturner@twitter•com>
Subject: Re: [PATCH v8] refs.c: SSE2 optimizations for check_refname_component
Date: Mon, 16 Jun 2014 17:06:31 -0700 [thread overview]
Message-ID: <xmqqha3ko0vs.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <1402946566-14923-1-git-send-email-dturner@twitter.com> (David Turner's message of "Mon, 16 Jun 2014 15:22:46 -0400")
David Turner <dturner@twopensource•com> writes:
> Optimize check_refname_component using SSE2 on x86_64.
>
> git rev-parse HEAD is a good test-case for this, since it does almost
> nothing except parse refs. For one particular repo with about 60k
> refs, almost all packed, the timings are:
>
> Look up table: 29 ms
> SSE2: 23 ms
>
> This cuts about 20% off of the runtime.
>
> The configure.ac changes include code from the GNU C Library written
> by Joseph S. Myers <joseph at codesourcery dot com>.
>
> Ondřej Bílka <neleai@seznam•cz> suggested an SSE2 approach to the
One e-mail address is obfuscated while the other not; intended?
> substring searches, which netted a speed boost over the SSE4.2 code I
> had initially written.
>
> Signed-off-by: David Turner <dturner@twitter•com>
> ---
> diff --git a/git-compat-util.h b/git-compat-util.h
> index f6d3a46..291d46b 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -668,6 +668,16 @@ void git_qsort(void *base, size_t nmemb, size_t size,
> #endif
> #endif
>
> +#if defined(__GNUC__) && defined(__x86_64__)
> +#include <emmintrin.h>
> +/* This is the system memory page size; it's used so that we can read
Style (there are other instances of the same kind).
/*
* This is the ...
> + * outside the bounds of an allocation without segfaulting.
> + */
> +static int check_refname_component_trailer(const char *cp, const char *refname, int flags)
> +{
> + if (cp == refname)
> + return 0; /* Component has zero length. */
> + if (refname[0] == '.') {
> + if (!(flags & REFNAME_DOT_COMPONENT))
> + return -1; /* Component starts with '.'. */
> + /*
> + * Even if leading dots are allowed, don't allow "."
> + * as a component (".." is prevented by a rule above).
> + */
> + if (refname[1] == '\0')
> + return -1; /* Component equals ".". */
> + }
> + if (cp - refname >= 5 && !memcmp(cp - 5, ".lock", 5))
> + return -1; /* Refname ends with ".lock". */
This is merely a moved code that retained the same comment, but it
is more like "the current refname component ends with .lock", I
suspect. In other words, we do not allow "refs/heads/foo.lock/bar".
Am I reading the patch correctly?
> +#if defined(__GNUC__) && defined(__x86_64__)
> +#define SSE_VECTOR_BYTES 16
> +
> +/* Vectorized version of check_refname_format. */
> +int check_refname_format(const char *refname, int flags)
> +{
> + const char *cp = refname;
> +
> + const __m128i dot = _mm_set1_epi8 ('.');
Style (there are other instances of the same kind). No SP between
function/macro name and opening parenthesis.
> + if (refname[0] == '.') {
> + if (refname[1] == '/' || refname[1] == '\0')
> + return -1;
> + if (!(flags & REFNAME_DOT_COMPONENT))
> + return -1;
> + }
> + while(1) {
> + __m128i tmp, tmp1, result;
> + uint64_t mask;
> +
> + if ((uintptr_t) cp % PAGE_SIZE > PAGE_SIZE - SSE_VECTOR_BYTES - 1)
OK, so we make sure we do not overrun by reading too much near the
end of the page, as the next page might be unmapped.
I am showing my ignorance but does cp (i.e. refname) upon entry to
this function need to be aligned in some way?
Thanks.
next prev parent reply other threads:[~2014-06-17 0:06 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-16 19:22 [PATCH v8] refs.c: SSE2 optimizations for check_refname_component David Turner
2014-06-17 0:06 ` Junio C Hamano [this message]
2014-06-17 0:51 ` David Turner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqha3ko0vs.fsf@gitster.dls.corp.google.com \
--to=gitster@pobox$(echo .)com \
--cc=dturner@twitter$(echo .)com \
--cc=dturner@twopensource$(echo .)com \
--cc=git@vger$(echo .)kernel.org \
--cc=mhagger@alum$(echo .)mit.edu \
--cc=neleai@seznam$(echo .)cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox