public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: "Pádraig Brady" <P@draigBrady•com>
To: Bug-coreutils@gnu•org,  Linus Torvalds <torvalds@linux-foundation•org>
Cc: Git Mailing List <git@vger•kernel.org>
Subject: Linus' sha1 is much faster!
Date: Sat, 15 Aug 2009 00:25:36 +0100	[thread overview]
Message-ID: <4A85F270.20703@draigBrady.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 1110 bytes --]

I've noticed before that coreutils hashing utils
were a little behind in performance, but was prompted
to look at it again when I noticed the recently
updated sha1 implementation in git:
http://git.kernel.org/?p=git/git.git;a=history;f=block-sha1;h=d3121f7;hb=pu

Testing that with the attached program which I wrote
in a couple of mins to try and match sha1sum's system calls
shows that it's around 33% faster, as shown below:

$ gcc $(rpm -q --qf="%{OPTFLAGS}\n" coreutils) linus-sha1.c sha1.c -o linus-sha1

$ time ./linus-sha1 300MB_file
df1e19e245fee4f53087b50ef953ca2c8d1644d7  300MB_file
real    0m2.742s
user    0m2.516s
sys     0m0.206s

$ time ~/git/coreutils/src/sha1sum 300MB_file
df1e19e245fee4f53087b50ef953ca2c8d1644d7  300MB_file

real    0m4.166s
user    0m3.846s
sys     0m0.298s

So, could we use that code in coreutils?
Think of all the dead fish it would save.

I've also attached a trivial block-sha1 patch which doesn't
affect performance, but does suppress a signed unsigned
comparison warning which occurs with -Wextra for example.

cheers,
Pádraig.

[-- Attachment #2: linus-sha1.c --]
[-- Type: text/x-csrc, Size: 681 bytes --]

/* gcc -O2 -Wall linus-sha1.c sha1.c -o linus-sha1 */
#include <stdio.h>
#include <stdlib.h>
#include "sha1.h"

int main(int argc, char** argv)
{
    if (argc != 2) return 1;
    const char* filename = argv[1];
    FILE *fp = fopen (filename, "r");
    if (!fp) return 1;

    #define BS 4096 /* match coreutils */

    blk_SHA_CTX ctx;
    blk_SHA1_Init(&ctx);
    size_t nr;
    char buf[BS];
    while ((nr=fread_unlocked(buf, 1, sizeof(buf), fp)))
        blk_SHA1_Update(&ctx, buf, nr);
    unsigned char hash[20];
    blk_SHA1_Final(hash, &ctx);
    int i;
    for (i=0; i<sizeof(hash); i++)
        printf("%02x",*(hash+i));
    printf("  %s\n", filename);

    return 0;
}

[-- Attachment #3: block-sha1-signed-warning.diff --]
[-- Type: text/x-patch, Size: 998 bytes --]

>From fa75e818836f763357ff9b7bbde3327e1aabbe47 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?P=C3=A1draig=20Brady?= <P@draigBrady•com>
Date: Sat, 15 Aug 2009 00:17:30 +0100
Subject: [PATCH] block-sha1: suppress signed unsigned comparison warning

* block-sha1/sha1.c: Use unsigned ints as the values
will never go negative.
---
 block-sha1/sha1.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index d3121f7..be763d8 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -231,13 +231,13 @@ void blk_SHA1_Init(blk_SHA_CTX *ctx)
 
 void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, unsigned long len)
 {
-	int lenW = ctx->size & 63;
+	unsigned int lenW = ctx->size & 63;
 
 	ctx->size += len;
 
 	/* Read the data into W and process blocks as they get full */
 	if (lenW) {
-		int left = 64 - lenW;
+		unsigned int left = 64 - lenW;
 		if (len < left)
 			left = len;
 		memcpy(lenW + (char *)ctx->W, data, left);
-- 
1.6.2.5


             reply	other threads:[~2009-08-14 23:29 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-14 23:25 Pádraig Brady [this message]
2009-08-15 20:02 ` Linus' sha1 is much faster! Bryan Donlan
2009-08-15 20:12   ` John Tapsell
2009-08-15 20:23     ` Linus Torvalds
2009-08-15 20:54       ` Linus Torvalds
2009-08-17  1:55         ` Nicolas Pitre
2009-08-26 11:39           ` Pádraig Brady
2017-04-20 21:35             ` galt
2017-04-20 21:38             ` galt
2009-08-17  8:22         ` Andreas Ericsson
2009-08-16  0:06     ` Theodore Tso
2009-08-16 19:25 ` Giuseppe Scrivano
2009-08-16 20:10   ` Linus Torvalds
2009-08-16 22:15     ` Giuseppe Scrivano
2009-08-16 22:47       ` Linus Torvalds
2009-08-17  1:53   ` Pádraig Brady
2009-08-17 10:51     ` Giuseppe Scrivano
2009-08-17 15:44       ` Steven Noonan
2009-08-17 16:22         ` Linus Torvalds
2009-08-17 21:43           ` Steven Noonan
2009-08-17 17:32         ` Giuseppe Scrivano
  -- strict thread matches above, loose matches on Subject: below --
2009-08-17  7:23 George Spelvin
2009-08-17 14:20 ` Nicolas Pitre
2009-08-17 17:06 ` Nicolas Pitre
2009-08-17 17:20   ` Paolo Bonzini
2009-08-17 18:54   ` George Spelvin
2009-08-17 19:34     ` Nicolas Pitre
2009-08-17 23:12       ` George Spelvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A85F270.20703@draigBrady.com \
    --to=p@draigbrady$(echo .)com \
    --cc=Bug-coreutils@gnu$(echo .)org \
    --cc=git@vger$(echo .)kernel.org \
    --cc=torvalds@linux-foundation$(echo .)org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox