From: Junio C Hamano <gitster@pobox•com>
To: Nicolas Pitre <nico@fluxnic•net>
Cc: git@vger•kernel.org
Subject: Re: [PATCH 12/23] pack v4: creation code
Date: Tue, 27 Aug 2013 08:48:24 -0700
Message-ID: <xmqqppszdtiv.fsf@gitster.dls.corp.google.com>
In-Reply-To: <1377577567-27655-13-git-send-email-nico@fluxnic.net> (Nicolas Pitre's message of "Tue, 27 Aug 2013 00:25:56 -0400")

Nicolas Pitre <nico@fluxnic•net> writes:

> Let's actually open the destination pack file and write the header and
> the tables.
>
> The header isn't much different from pack v3, except for the pack version
> number of course.
>
> The first table is the sorted SHA1 table normally found in the pack index
> file. With pack v4 we write this table in the main pack file instead as
> it is index referenced by subsequent objects in the pack. Doing so has
> many advantages:
>
> - The SHA1 references used to be duplicated on disk: once in the pack
> index file, and then at least once or more within commit and tree
> objects referencing them. The only SHA1 which is not being listed more
> than once this way is the one for a branch tip commit object and those
> are normally very few. Now all that SHA1 data is represented only once.
>

This tickles my curiosity. Why isn't this SHA-1 table sorted by
reference count the same way as the tree path and the people name
tables to keep the average length of varint references short?

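
To put rough numbers on the question, here is a minimal sketch of how
reference size depends on table position. It assumes a plain
7-bits-per-byte varint, which may not match the encoding pack v4
actually uses, and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed to store v with 7 payload bits per byte.
 * Assumed encoding for illustration only.
 */
static unsigned varint_len_sketch(uint64_t v)
{
	unsigned len = 1;

	while (v >>= 7)
		len++;
	return len;
}

/*
 * Total bytes spent on references when table entry i is referenced
 * refcount[i] times and each reference is the varint of index i.
 */
static uint64_t total_ref_bytes_sketch(const uint32_t *refcount, uint32_t nr)
{
	uint64_t total = 0;
	uint32_t i;

	assert(refcount || !nr);
	for (i = 0; i < nr; i++)
		total += (uint64_t)refcount[i] * varint_len_sketch(i);
	return total;
}
```

Under that assumption only the first 128 table slots get one-byte
references, so where the frequently referenced objects land in the
table decides the total.
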
> - The SHA1 references found in commit and tree objects can be obtained
> on disk directly without having to deflate those objects first.
>
> The SHA1 table size is obtained by multiplying the number of objects by 20.
>
> And then the commit and path dictionary tables are written right after
> the SHA1 table.
>
> Signed-off-by: Nicolas Pitre <nico@fluxnic•net>
> ---
> packv4-create.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 55 insertions(+), 5 deletions(-)
>
> diff --git a/packv4-create.c b/packv4-create.c
> index 2956fda..5211f9c 100644
> --- a/packv4-create.c
> +++ b/packv4-create.c
> @@ -605,6 +605,48 @@ static unsigned long write_dict_table(struct sha1file *f, struct dict_table *t)
>  	return hdrlen + datalen;
>  }
>
> +static struct sha1file * packv4_open(char *path)
> +{
> +	int fd;
> +
> +	fd = open(path, O_CREAT|O_EXCL|O_WRONLY, 0600);
> +	if (fd < 0)
> +		die_errno("unable to create '%s'", path);
> +	return sha1fd(fd, path);
> +}
> +
> +static unsigned int packv4_write_header(struct sha1file *f, unsigned nr_objects)
> +{
> +	struct pack_header hdr;
> +
> +	hdr.hdr_signature = htonl(PACK_SIGNATURE);
> +	hdr.hdr_version = htonl(4);
> +	hdr.hdr_entries = htonl(nr_objects);
> +	sha1write(f, &hdr, sizeof(hdr));
> +
> +	return sizeof(hdr);
> +}
> +
> +static unsigned long packv4_write_tables(struct sha1file *f, unsigned nr_objects,
> +					 struct pack_idx_entry *objs)
> +{
> +	unsigned i;
> +	unsigned long written = 0;
> +
> +	/* The sorted list of object SHA1's is always first */
> +	for (i = 0; i < nr_objects; i++)
> +		sha1write(f, objs[i].sha1, 20);
> +	written = 20 * nr_objects;
> +
> +	/* Then the commit dictionary table */
> +	written += write_dict_table(f, commit_name_table);
> +
> +	/* Followed by the path component dictionary table */
> +	written += write_dict_table(f, tree_path_table);
> +
> +	return written;
> +}
> +
>  static struct packed_git *open_pack(const char *path)
>  {
>  	char arg[PATH_MAX];
> @@ -658,9 +700,10 @@ static struct packed_git *open_pack(const char *path)
>  	return p;
>  }
>
> -static void process_one_pack(char *src_pack)
> +static void process_one_pack(char *src_pack, char *dst_pack)
>  {
>  	struct packed_git *p;
> +	struct sha1file *f;
>  	struct pack_idx_entry *objs, **p_objs;
>  	unsigned nr_objects;
>
> @@ -673,15 +716,22 @@ static void process_one_pack(char *src_pack)
>  	p_objs = sort_objs_by_offset(objs, nr_objects);
>
>  	create_pack_dictionaries(p, p_objs);
> +
> +	f = packv4_open(dst_pack);
> +	if (!f)
> +		die("unable to open destination pack");
> +	packv4_write_header(f, nr_objects);
> +	packv4_write_tables(f, nr_objects, objs);
>  }
>
>  int main(int argc, char *argv[])
>  {
> -	if (argc != 2) {
> -		fprintf(stderr, "Usage: %s <packfile>\n", argv[0]);
> +	if (argc != 3) {
> +		fprintf(stderr, "Usage: %s <src_packfile> <dst_packfile>\n", argv[0]);
>  		exit(1);
>  	}
> -	process_one_pack(argv[1]);
> -	dict_dump();
> +	process_one_pack(argv[1], argv[2]);
> +	if (0)
> +		dict_dump();
>  	return 0;
>  }
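
For reference, the layout the patch writes (a 12-byte header of three
big-endian 32-bit words, then nr_objects sorted 20-byte SHA-1s) could
be consumed roughly as follows. This is only a sketch: the function
and macro names are hypothetical, nothing here is taken from the
series' read side, and it assumes the table really is sorted in
memcmp() order so that a binary search applies:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PACK_SIGNATURE 0x5041434bu /* "PACK" */
#define SKETCH_HDR_SIZE 12                /* three 32-bit big-endian words */

/* Decode a 32-bit big-endian value, the inverse of htonl() in the patch. */
static uint32_t get_be32_sketch(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Validate signature and version; on success store the object count
 * and return 0, otherwise return -1.
 */
static int parse_v4_header_sketch(const unsigned char *buf, size_t len,
				  uint32_t *nr_objects)
{
	assert(buf && nr_objects);
	if (len < SKETCH_HDR_SIZE)
		return -1;
	if (get_be32_sketch(buf) != SKETCH_PACK_SIGNATURE)
		return -1;
	if (get_be32_sketch(buf + 4) != 4)
		return -1;
	*nr_objects = get_be32_sketch(buf + 8);
	return 0;
}

/*
 * Binary search the sorted SHA-1 table that follows the header.
 * Returns the entry's index, or -1 when the SHA-1 is not present.
 */
static long sha1_table_find_sketch(const unsigned char *table, uint32_t nr,
				   const unsigned char *sha1)
{
	uint32_t lo = 0, hi = nr;

	while (lo < hi) {
		uint32_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(sha1, table + (size_t)mid * 20, 20);

		if (!cmp)
			return (long)mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1;
}
```

The table would start at byte offset 12 and span 20 * nr_objects
bytes, matching the "written = 20 * nr_objects" accounting in
packv4_write_tables().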