From: Junio C Hamano <gitster@pobox•com>
To: Stefan Beller <sbeller@google•com>
Cc: Jeff King <peff@peff•net>,
"git\@vger.kernel.org" <git@vger•kernel.org>,
Jonathan Nieder <jrnieder@gmail•com>,
Johannes Sixt <j6t@kdbg•org>, Heiko Voigt <hvoigt@hvoigt•net>,
Jens Lehmann <jens.lehmann@web•de>
Subject: Re: [RFC PATCH 2/3] run-commands: add an async queue processor
Date: Mon, 24 Aug 2015 14:22:33 -0700 [thread overview]
Message-ID: <xmqqvbc42xgm.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <CAGZ79kbhRQ0SmSu6xiij6Gxdrtfh02knexz6+CNjihY4xbC83w@mail.gmail.com> (Stefan Beller's message of "Fri, 21 Aug 2015 16:40:09 -0700")
Stefan Beller <sbeller@google•com> writes:
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index 3f10840..159ee36 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -11,6 +11,7 @@
> #include "exec_cmd.h"
> #include "streaming.h"
> #include "thread-utils.h"
> +#include "run-command.h"
>
> static const char index_pack_usage[] =
> "git index-pack [-v] [-o <index-file>] [--keep | --keep=<msg>]
> [--verify] [--strict] (<pack-file> | --stdin [--fix-thin]
> [<pack-file>])";
> @@ -1075,7 +1076,7 @@ static void resolve_base(struct object_entry *obj)
> }
>
> #ifndef NO_PTHREADS
> -static void *threaded_second_pass(void *data)
> +static int threaded_second_pass(struct task_queue *tq, void *data)
> {
> set_thread_data(data);
> for (;;) {
> @@ -1096,7 +1097,7 @@ static void *threaded_second_pass(void *data)
>
> resolve_base(&objects[i]);
> }
> - return NULL;
> + return 0;
> }
> #endif
>
> @@ -1195,18 +1196,18 @@ static void resolve_deltas(void)
> nr_ref_deltas + nr_ofs_deltas);
>
> #ifndef NO_PTHREADS
> - nr_dispatched = 0;
> if (nr_threads > 1 || getenv("GIT_FORCE_THREADS")) {
> + nr_dispatched = 0;
> init_thread();
> - for (i = 0; i < nr_threads; i++) {
> - int ret = pthread_create(&thread_data[i].thread, NULL,
> - threaded_second_pass,
> thread_data + i);
> - if (ret)
> - die(_("unable to create thread: %s"),
> - strerror(ret));
> - }
> +
> + tq = create_task_queue(nr_threads);
> +
> for (i = 0; i < nr_threads; i++)
> - pthread_join(thread_data[i].thread, NULL);
> + add_task(tq, threaded_second_pass, thread_data + i);
> +
> + if (finish_task_queue(tq))
> + die("Not all threads have finished");
> +
> cleanup_thread();
> return;
> }
This looks quite straight-forward, but that is not too surprising,
as the "dispatcher" side naturally should have a similar logic to
manage threads by creating and joining them ;-)
> @@ -1075,28 +1067,24 @@ static void resolve_base(struct object_entry *obj)
> }
>
> #ifndef NO_PTHREADS
> -static void *threaded_second_pass(void *data)
> +static int threaded_second_pass(struct task_queue *tq, void *data)
> {
> - set_thread_data(data);
> - for (;;) {
> - int i;
> - counter_lock();
> - display_progress(progress, nr_resolved_deltas);
> - counter_unlock();
> - work_lock();
> - while (nr_dispatched < nr_objects &&
> - is_delta_type(objects[nr_dispatched].type))
> - nr_dispatched++;
> - if (nr_dispatched >= nr_objects) {
> - work_unlock();
> - break;
> - }
> - i = nr_dispatched++;
> - work_unlock();
> + if (!get_thread_data()) {
> + struct thread_local *t = xmalloc(sizeof(*t));
> + t->pack_fd = open(curr_pack, O_RDONLY);
> + if (t->pack_fd == -1)
> + die_errno(_("unable to open %s"), curr_pack);
>
> - resolve_base(&objects[i]);
> + set_thread_data(t);
> + /* TODO: I haven't figured out where to free this memory */
Sorry but it is hard to grok what is going on in the code with
squished indentation.
> Why did I not pick upload-pack?
> ========================
>
> I could not spot easily how to make it a typical queuing problem.
> We start in the threads, and once in a while we're like: "Uhg, this
> thread has more load than the other, let's shove a bit over there"
>
> So what we would need there is splitting the work in smallest chunks
> from the beginning and just load it into the queue via add_task
... or a way for the overload and tasks to communicate with each
other and rebalance. If I am not mistaken, it has a big negative
consequence for pack-objects to split the work to too small a chunk,
as the chunk boundary also becomes boundary of find delta bases.
next prev parent reply other threads:[~2015-08-24 21:22 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-21 1:40 [PATCH 1/3] submodule: implement `module_clone` as a builtin helper Stefan Beller
2015-08-21 1:40 ` [RFC PATCH 2/3] run-commands: add an async queue processor Stefan Beller
2015-08-21 19:05 ` Junio C Hamano
2015-08-21 19:44 ` Jeff King
2015-08-21 19:48 ` Stefan Beller
2015-08-21 19:51 ` Jeff King
2015-08-21 20:12 ` Stefan Beller
2015-08-21 20:41 ` Junio C Hamano
2015-08-21 23:40 ` Stefan Beller
2015-08-24 21:22 ` Junio C Hamano [this message]
2015-08-21 19:45 ` Stefan Beller
2015-08-21 20:47 ` Junio C Hamano
2015-08-21 20:56 ` Stefan Beller
2015-08-21 1:40 ` [WIP/PATCH 3/3] submodule: helper to run foreach in parallel Stefan Beller
2015-08-21 19:23 ` Junio C Hamano
2015-08-21 20:21 ` Stefan Beller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqvbc42xgm.fsf@gitster.dls.corp.google.com \
--to=gitster@pobox$(echo .)com \
--cc=git@vger$(echo .)kernel.org \
--cc=hvoigt@hvoigt$(echo .)net \
--cc=j6t@kdbg$(echo .)org \
--cc=jens.lehmann@web$(echo .)de \
--cc=jrnieder@gmail$(echo .)com \
--cc=peff@peff$(echo .)net \
--cc=sbeller@google$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox