From: "René Scharfe" <l.s.r@web•de>
To: Justin Tobler <jltobler@gmail•com>
Cc: Git List <git@vger•kernel.org>, Jeff King <peff@peff•net>
Subject: Re: [PATCH 1/3] commit: convert pop_most_recent_commit() to prio_queue
Date: Wed, 16 Jul 2025 11:39:29 +0200
Message-ID: <3f24210a-cd96-4c2c-9d55-1f0ebd7bbff5@web.de>
In-Reply-To: <d2myew7nonfrelrwplpypexvcrktmjdlsccobjvx3dydvhnlar@bin5ol2vj3xs>
On 7/15/25 10:47 PM, Justin Tobler wrote:
> On 25/07/15 04:51PM, René Scharfe wrote:
>> pop_most_recent_commit() calls commit_list_insert_by_date(), which and
>
> Did you mean?
>
> s/which and/which/
Oh, yes.
>> is itself called in a loop, which can lead to quadratic complexity.
>> Replace the commit_list with a prio_queue to ensure logarithmic worst
>> case complexity and convert all three users.
>
> If I understand correctly, `pop_most_recent_commit()` removes the most
> recent commit from a list of commits sorted by date and then inserts
> each of the removed commit's parents into the list while maintaining
> date order. Iterating through `struct commit_list` every time to find
> where to insert each parent leads to quadratic complexity. For
> repositories with many merge commits, this could scale poorly.
Right.
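
To make the difference concrete, here is a minimal standalone sketch of the
heap-based variant. It is not the code from the patch: the toy_* names, the
fixed-size array and the two-parent limit are all assumptions made for
illustration, and Git's real prio_queue takes a caller-supplied comparison
function instead of hard-coding the date order. The point is only that each
parent costs one O(log n) sift instead of an O(n) walk along a sorted list.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PARENTS 2
#define TOY_QUEUE_CAP 64

/* Hypothetical stand-in for Git's struct commit. */
struct toy_commit {
	unsigned long date;
	unsigned int flags;
	struct toy_commit *parents[TOY_MAX_PARENTS];
	size_t nr_parents;
};

/* Hypothetical fixed-size binary max-heap ordered by date. */
struct toy_queue {
	struct toy_commit *items[TOY_QUEUE_CAP];
	size_t nr;
};

static void toy_swap(struct toy_queue *q, size_t a, size_t b)
{
	struct toy_commit *tmp = q->items[a];
	q->items[a] = q->items[b];
	q->items[b] = tmp;
}

/* O(log n): append, then sift up while newer than the parent slot. */
static void toy_put(struct toy_queue *q, struct toy_commit *c)
{
	size_t i = q->nr;

	assert(i < TOY_QUEUE_CAP);
	q->items[i] = c;
	q->nr++;
	while (i && q->items[(i - 1) / 2]->date < q->items[i]->date) {
		toy_swap(q, i, (i - 1) / 2);
		i = (i - 1) / 2;
	}
}

/* O(log n): take the root, move the last item up, sift it down. */
static struct toy_commit *toy_get(struct toy_queue *q)
{
	struct toy_commit *top;
	size_t i = 0;

	if (!q->nr)
		return NULL;
	top = q->items[0];
	q->items[0] = q->items[--q->nr];
	for (;;) {
		size_t l = 2 * i + 1, r = l + 1, max = i;

		if (l < q->nr && q->items[l]->date > q->items[max]->date)
			max = l;
		if (r < q->nr && q->items[r]->date > q->items[max]->date)
			max = r;
		if (max == i)
			break;
		toy_swap(q, i, max);
		i = max;
	}
	return top;
}

/* Pop the newest commit and queue each parent not yet marked. */
static struct toy_commit *toy_pop_most_recent(struct toy_queue *q,
					      unsigned int mark)
{
	struct toy_commit *ret = toy_get(q);
	size_t i;

	for (i = 0; ret && i < ret->nr_parents; i++) {
		struct toy_commit *p = ret->parents[i];

		if (!(p->flags & mark)) {
			p->flags |= mark;
			toy_put(q, p);
		}
	}
	return ret;
}
```

Calling toy_pop_most_recent() in a loop until it returns NULL visits every
reachable commit once, newest first, as long as the caller marks and queues
the starting commits itself.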
>> Add a performance test that exercises one of them using a pathological
>> history that consists of 50% merges and 50% root commits to demonstrate
>> the speedup:
>>
>> Test                          v2.50.1           HEAD
>> ----------------------------------------------------------------------
>> 1501.2: rev-parse ':/65535'   2.48(2.47+0.00)   0.20(0.19+0.00) -91.9%
>>
>> Alas, sane histories don't benefit from the conversion much, and
>> traversing Git's own history takes a 1% performance hit on my machine:
>
> As "normal" repositories don't benefit here, it might be nice to more
> explicitly mention the types of repositories that do benefit.
Good idea.
>> diff --git a/commit.h b/commit.h
>> index 70c870dae4..9630c076d6 100644
>> --- a/commit.h
>> +++ b/commit.h
>> @@ -201,10 +201,10 @@ const char *repo_logmsg_reencode(struct repository *r,
>>
>> const char *skip_blank_lines(const char *msg);
>>
>> -/** Removes the first commit from a list sorted by date, and adds all
>> - * of its parents.
>> - **/
>> -struct commit *pop_most_recent_commit(struct commit_list **list,
>> +struct prio_queue;
>> +
>> +/* Removes the first commit from a prio_queue and adds its parents. */
>> +struct commit *pop_most_recent_commit(struct prio_queue *queue,
>> unsigned int mark);
>
> Previously, `pop_most_recent_commit()` would ensure commits inserted in
> the list were done in date order. Now this depends on how the caller has
> configured the `struct prio_queue`. This is fine though as previously
> the caller was required to ensure the list was sorted to begin with
> otherwise it wouldn't work properly.
Indeed.
René