From: Junio C Hamano <gitster@pobox•com>
To: Nicolas Pitre <nico@fluxnic•net>
Cc: git@vger•kernel.org
Subject: Re: [PATCH 09/23] pack v4: commit object encoding
Date: Tue, 27 Aug 2013 08:39:44 -0700 [thread overview]
Message-ID: <xmqqy57ndtxb.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <1377577567-27655-10-git-send-email-nico@fluxnic.net> (Nicolas Pitre's message of "Tue, 27 Aug 2013 00:25:53 -0400")
Nicolas Pitre <nico@fluxnic•net> writes:
> This goes as follows:
>
> - Tree reference: either variable length encoding of the index
> into the SHA1 table or the literal SHA1 prefixed by 0 (see
> add_sha1_ref()).
>
> - Parent count: variable length encoding of the number of parents.
> This is normally going to occupy a single byte but doesn't have to.
>
> - List of parent references: a list of add_sha1_ref() encoded references,
> or nothing if the parent count was zero.
>
> - Author reference: variable length encoding of an index into the author
> string dictionary table which also covers the time zone. To make the
> overall encoding efficient, the author table is already sorted by usage
> frequency so the most used names are first and require the shortest
> index encoding.
>
> - Author time stamp: variable length encoded. Year 2038 ready!
>
> - Committer reference: same as author reference.
>
> - Committer time stamp: same as author time stamp.
>
> The remainder of the canonical commit object content is then zlib
> compressed and appended to the above.
>
> Rationale: The most important commit object data is densely encoded while
> requiring no zlib inflate processing, and all SHA1 references are most
> likely to be direct indices into the pack index file requiring no SHA1
> search into the pack index file.
>
> Signed-off-by: Nicolas Pitre <nico@fluxnic•net>
> ---
> packv4-create.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 119 insertions(+)
>
> diff --git a/packv4-create.c b/packv4-create.c
> index bf33d15..cedbbd9 100644
> --- a/packv4-create.c
> +++ b/packv4-create.c
> @@ -13,6 +13,9 @@
> #include "tree-walk.h"
> #include "pack.h"
>
> +
> +static int pack_compression_level = Z_DEFAULT_COMPRESSION;
> +
> struct data_entry {
> unsigned offset;
> unsigned size;
> @@ -289,6 +292,122 @@ static unsigned char *add_sha1_ref(unsigned char *dst, const unsigned char *sha1
> return dst + 20;
> }
>
> +/*
> + * This converts a canonical commit object buffer into its
> + * tightly packed representation using the already populated
> + * and sorted commit_name_table dictionary. The parsing is
> + * strict so to ensure the canonical version may always be
> + * regenerated and produce the same hash.
> + */
> +void * conv_to_dict_commit(void *buffer, unsigned long *psize)
Drop SP between asterisk and "conv_"?
> +{
> + unsigned long size = *psize;
> + char *in, *tail, *end;
> + unsigned char *out;
> + unsigned char sha1[20];
> + int nb_parents, index, tz_val;
> + unsigned long time;
> + z_stream stream;
> + int status;
> +
> + /*
> + * It is guaranteed that the output is always going to be smaller
> + * than the input. We could even do this conversion in place.
> + */
> + in = buffer;
> + tail = in + size;
> + buffer = xmalloc(size);
> + out = buffer;
> +
> + /* parse the "tree" line */
> + if (in + 46 >= tail || memcmp(in, "tree ", 5) || in[45] != '\n')
> + goto bad_data;
> + if (get_sha1_hex(in + 5, sha1) < 0)
> + goto bad_data;
Is this strict enough to guarantee roundtrip hash identity? Because
get_sha1_hex() accepts hexadecimal represented with uppercase A-F,
you need to reject such a "broken" commit object, no?
Same for parent commit object names below that are parsed with the
same helper.
next prev parent reply other threads:[~2013-08-27 15:40 UTC|newest]
Thread overview: 83+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-27 4:25 [PATCH 00/23] Preliminary pack v4 support Nicolas Pitre
2013-08-27 4:25 ` [PATCH 01/23] pack v4: initial pack dictionary structure and code Nicolas Pitre
2013-08-27 15:08 ` Junio C Hamano
2013-08-27 16:13 ` Nicolas Pitre
2013-08-27 4:25 ` [PATCH 02/23] export packed_object_info() Nicolas Pitre
2013-08-27 4:25 ` [PATCH 03/23] pack v4: scan tree objects Nicolas Pitre
2013-08-27 4:25 ` [PATCH 04/23] pack v4: add tree entry mode support to dictionary entries Nicolas Pitre
2013-08-27 4:25 ` [PATCH 05/23] pack v4: add commit object parsing Nicolas Pitre
2013-08-27 15:26 ` Junio C Hamano
2013-08-27 16:47 ` Nicolas Pitre
2013-08-27 17:42 ` Junio C Hamano
2013-08-27 4:25 ` [PATCH 06/23] pack v4: split the object list and dictionary creation Nicolas Pitre
2013-08-27 4:25 ` [PATCH 07/23] pack v4: move to struct pack_idx_entry and get rid of our own struct idx_entry Nicolas Pitre
2013-08-27 4:25 ` [PATCH 08/23] pack v4: basic references encoding Nicolas Pitre
2013-08-27 15:29 ` Junio C Hamano
2013-08-27 15:53 ` Nicolas Pitre
2013-08-27 4:25 ` [PATCH 09/23] pack v4: commit object encoding Nicolas Pitre
2013-08-27 15:39 ` Junio C Hamano [this message]
2013-08-27 16:50 ` Nicolas Pitre
2013-08-27 19:59 ` Nicolas Pitre
2013-08-27 20:15 ` Junio C Hamano
2013-08-27 21:43 ` Nicolas Pitre
2013-09-02 20:48 ` Duy Nguyen
2013-09-03 6:30 ` Nicolas Pitre
2013-09-03 7:41 ` Duy Nguyen
2013-09-05 3:50 ` Nicolas Pitre
2013-08-27 4:25 ` [PATCH 10/23] pack v4: tree " Nicolas Pitre
2013-08-27 15:44 ` Junio C Hamano
2013-08-27 16:52 ` Nicolas Pitre
2013-08-27 4:25 ` [PATCH 11/23] pack v4: dictionary table output Nicolas Pitre
2013-08-27 4:25 ` [PATCH 12/23] pack v4: creation code Nicolas Pitre
2013-08-27 15:48 ` Junio C Hamano
2013-08-27 16:59 ` Nicolas Pitre
2013-08-27 4:25 ` [PATCH 13/23] pack v4: object headers Nicolas Pitre
2013-08-27 4:25 ` [PATCH 14/23] pack v4: object data copy Nicolas Pitre
2013-08-27 15:53 ` Junio C Hamano
2013-08-27 18:24 ` Nicolas Pitre
2013-08-27 4:25 ` [PATCH 15/23] pack v4: object writing Nicolas Pitre
2013-08-27 4:26 ` [PATCH 16/23] pack v4: tree object delta encoding Nicolas Pitre
2013-08-27 4:26 ` [PATCH 17/23] pack v4: load delta candidate for encoding tree objects Nicolas Pitre
2013-08-27 4:26 ` [PATCH 18/23] pack v4: honor pack.compression config option Nicolas Pitre
2013-08-27 4:26 ` [PATCH 19/23] pack v4: relax commit parsing a bit Nicolas Pitre
2013-08-27 4:26 ` [PATCH 20/23] pack index v3 Nicolas Pitre
2013-08-27 4:26 ` [PATCH 21/23] pack v4: normalize pack name to properly generate the pack index file name Nicolas Pitre
2013-08-27 4:26 ` [PATCH 22/23] pack v4: add progress display Nicolas Pitre
2013-08-27 4:26 ` [PATCH 23/23] initial pack index v3 support on the read side Nicolas Pitre
2013-08-31 11:45 ` Duy Nguyen
2013-09-03 6:09 ` Nicolas Pitre
2013-09-03 7:34 ` Duy Nguyen
2013-08-27 11:17 ` [PATCH] Document pack v4 format Nguyễn Thái Ngọc Duy
2013-08-27 18:25 ` Junio C Hamano
2013-08-27 18:53 ` Nicolas Pitre
2013-08-31 2:49 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2013-09-03 6:00 ` Nicolas Pitre
2013-09-03 6:46 ` Nicolas Pitre
2013-09-03 11:49 ` Duy Nguyen
2013-09-03 14:54 ` Duy Nguyen
2013-09-05 4:12 ` Nicolas Pitre
2013-09-05 4:19 ` Duy Nguyen
2013-09-05 4:40 ` Nicolas Pitre
2013-09-05 5:04 ` Duy Nguyen
2013-09-05 5:39 ` Nicolas Pitre
2013-09-05 16:52 ` Duy Nguyen
2013-09-05 17:14 ` Nicolas Pitre
2013-09-05 20:26 ` Junio C Hamano
2013-09-05 21:04 ` Nicolas Pitre
2013-09-06 4:18 ` Duy Nguyen
2013-09-06 13:19 ` Nicolas Pitre
2013-09-06 2:14 ` [PATCH v3] " Nguyễn Thái Ngọc Duy
2013-09-06 3:23 ` Nicolas Pitre
2013-09-06 9:48 ` Duy Nguyen
2013-09-06 13:25 ` Nicolas Pitre
2013-09-06 13:44 ` Duy Nguyen
2013-09-06 16:44 ` Nicolas Pitre
2013-09-07 4:57 ` Nicolas Pitre
2013-09-07 4:52 ` Nicolas Pitre
2013-09-07 8:05 ` Duy Nguyen
2013-08-27 15:03 ` [PATCH 00/23] Preliminary pack v4 support Junio C Hamano
2013-08-27 15:59 ` Nicolas Pitre
2013-08-27 16:44 ` Junio C Hamano
2013-08-28 2:30 ` Duy Nguyen
2013-08-28 2:58 ` Nicolas Pitre
2013-08-28 3:06 ` Duy Nguyen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqy57ndtxb.fsf@gitster.dls.corp.google.com \
--to=gitster@pobox$(echo .)com \
--cc=git@vger$(echo .)kernel.org \
--cc=nico@fluxnic$(echo .)net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox