public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web•de>
To: Lidong Yan <yldhome2d2@gmail•com>
Cc: Git List <git@vger•kernel.org>, Jeff King <peff@peff•net>,
	Junio C Hamano <gitster@pobox•com>,
	Justin Tobler <jltobler@gmail•com>
Subject: Re: [PATCH v2 1/3] commit: convert pop_most_recent_commit() to prio_queue
Date: Sun, 3 Aug 2025 11:54:26 +0200	[thread overview]
Message-ID: <c5d91cec-3fec-422e-86d4-78767d95f208@web.de> (raw)
In-Reply-To: <148541F5-DC9F-4A3A-B1B1-0FED8AA5A101@gmail.com>

On 7/21/25 4:02 PM, Lidong Yan wrote:
> René Scharfe <l.s.r@web•de> writes:
>>
>> +#
>> +# Creates lots of merges to make history traversal costly.  In
>> +# particular it creates 2^($max_level-1)-1 2-way merges on top of
>> +# 2^($max_level-1) root commits.  E.g., the commit history looks like
>> +# this for a $max_level of 3:
>> +#
>> +#     _1_
>> +#    /   \
>> +#   2     3
>> +#  / \   / \
>> +# 4   5 6   7
>> +#
>> +# The numbers are the fast-import marks, which also are the commit
>> +# messages.  1 is the HEAD commit and a merge, 2 and 3 are also merges,
>> +# 4-7 are the root commits.
>> +#
> 
> I feel that the reason there's no significant performance improvement is probably
> because mostly we are using the priority queue to sort O(siblings) nodes.
> For example, in this case, the most time-consuming operation is when the priority
> queue or commit list contains 4 and 5, and we then need to insert 6 and 7.
> 
> Assuming the maximum number of siblings is W and the number of nodes is N,
> the time complexity with a commit list is O(W² × N), while using a priority queue
> gives O(W log W × N). Perhaps in many projects, W isn't particularly large,
> which results in the performance improvement not being very significant.

Kinda.  While traversing the history we take a commit from to the
commit_list or prio_queue and put back its parents.  For single-parent
commits this sequence keeps the number of stored items the same.  Merges
increase that number.

We add and retrieve each commit in the (relevant part of) history.  That
takes O(N) and O(1) for the sorted list, and O(log N) and O(log N) for
the prio_queue, where N is the length of the list.

So the best-case history is a string of single-parent commits, keeping
only a single item on the list/queue throughout.  That requires no
sorting or heaping, making the additions and retrievals O(1).  The
overall complexity is then O(N) for both variants, N being the number
of commits in the history.

Worst-case history might be a single merge of all commits -- a
centipede or myriapod?  With all commits on the sorted list we get a
complexity of O(N²) for the traversal, and O(N log N) with a prio_queue.

René


  reply	other threads:[~2025-08-03  9:54 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-15 14:35 [PATCH 0/3] commit: convert pop_most_recent_commit() to prio_queue René Scharfe
2025-07-15 14:51 ` [PATCH 1/3] " René Scharfe
2025-07-15 19:23   ` Junio C Hamano
2025-07-15 20:47   ` Justin Tobler
2025-07-16  9:39     ` René Scharfe
2025-07-16  5:05   ` Jeff King
2025-07-16  9:39     ` René Scharfe
2025-07-17  8:22       ` René Scharfe
2025-07-19  6:55       ` Jeff King
2025-07-19  6:57         ` Jeff King
2025-07-19 11:15         ` René Scharfe
2025-07-20  0:03           ` Jeff King
2025-07-20  1:22             ` Junio C Hamano
2025-07-16 22:23   ` Junio C Hamano
2025-07-17  8:22     ` René Scharfe
2025-07-15 14:51 ` [PATCH 2/3] prio-queue: add prio_queue_replace() René Scharfe
2025-07-16  5:09   ` Jeff King
2025-07-16  9:38     ` René Scharfe
2025-07-17  9:20       ` René Scharfe
2025-07-19  7:02         ` Jeff King
2025-07-15 14:51 ` [PATCH 3/3] commit: use prio_queue_replace() in pop_most_recent_commit() René Scharfe
2025-07-15 20:43   ` Junio C Hamano
2025-07-16  9:38     ` René Scharfe
2025-07-16  0:07 ` [PATCH 0/3] commit: convert pop_most_recent_commit() to prio_queue Junio C Hamano
2025-07-16  5:15   ` Jeff King
2025-07-16  9:38     ` René Scharfe
2025-07-19  6:45       ` Jeff King
2025-07-16 14:49     ` Junio C Hamano
2025-07-18  9:09 ` [PATCH v2 " René Scharfe
2025-07-18  9:39   ` [PATCH v2 1/3] " René Scharfe
2025-07-21 14:02     ` Lidong Yan
2025-08-03  9:54       ` René Scharfe [this message]
2025-08-03 16:48         ` Junio C Hamano
2025-08-04 19:56           ` René Scharfe
2025-07-18  9:39   ` [PATCH v2 3/3] commit: use prio_queue_replace() in pop_most_recent_commit(),MIME-Version: 1.0 René Scharfe
2025-08-03 11:12     ` Johannes Schindelin
2025-08-03 11:33       ` René Scharfe
2025-07-18  9:39   ` [PATCH v2 2/3] prio-queue: add prio_queue_replace() René Scharfe
2025-07-19  7:04   ` [PATCH v2 0/3] commit: convert pop_most_recent_commit() to prio_queue Jeff King
2025-07-22  6:26   ` SZEDER Gábor
2025-07-22 14:27     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c5d91cec-3fec-422e-86d4-78767d95f208@web.de \
    --to=l.s.r@web$(echo .)de \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=jltobler@gmail$(echo .)com \
    --cc=peff@peff$(echo .)net \
    --cc=yldhome2d2@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox