public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web•de>
To: Git List <git@vger•kernel.org>
Cc: Jeff King <peff@peff•net>, Junio C Hamano <gitster@pobox•com>,
	Justin Tobler <jltobler@gmail•com>
Subject: [PATCH v2 0/3] commit: convert pop_most_recent_commit() to prio_queue
Date: Fri, 18 Jul 2025 11:09:04 +0200	[thread overview]
Message-ID: <8ff40c56-368a-4347-aeae-2aca2cb6a5b2@web.de> (raw)
In-Reply-To: <bc079b3c-a472-4f5d-95ca-390f9de25196@web.de>

Use prio_queue to improve worst-case performance at the cost of slightly
worse best-case performance.  Then add and use prio_queue_replace() to
recover that loss.

Changes since v2:
- Mention that a prio_queue improves performance for merge-heavy
  histories in the commit message.
- Add the new perf script to Meson build file.
- Mention which kind of history we are aiming for and show its shape in
  a comment in the perf script.
- Remove unnecessary quotes and use single quotes for the perf test
  code.
- Rename the variable delete_pending to get_pending to align it with
  the concrete function prio_queue_get() instead of referring to the
  abstract concept of deletion, to improve readability.

  commit: convert pop_most_recent_commit() to prio_queue
  prio-queue: add prio_queue_replace()
  commit: use prio_queue_replace() in pop_most_recent_commit()

 commit.c                          | 14 ++++--
 commit.h                          |  8 ++--
 fetch-pack.c                      | 13 +++---
 object-name.c                     | 10 ++---
 prio-queue.c                      | 45 ++++++++++++++------
 prio-queue.h                      |  8 ++++
 t/meson.build                     |  1 +
 t/perf/p1501-rev-parse-oneline.sh | 71 +++++++++++++++++++++++++++++++
 t/unit-tests/u-prio-queue.c       | 23 ++++++++++
 walker.c                          | 11 +++--
 10 files changed, 170 insertions(+), 34 deletions(-)
 create mode 100755 t/perf/p1501-rev-parse-oneline.sh

Range-diff against v1:
1:  d9c0d13801 ! 1:  5bff885d0f commit: convert pop_most_recent_commit() to prio_queue
    @@ Metadata
      ## Commit message ##
         commit: convert pop_most_recent_commit() to prio_queue
     
    -    pop_most_recent_commit() calls commit_list_insert_by_date(), which and
    -    is itself called in a loop, which can lead to quadratic complexity.
    -    Replace the commit_list with a prio_queue to ensure logarithmic worst
    -    case complexity and convert all three users.
    +    pop_most_recent_commit() calls commit_list_insert_by_date() for parent
    +    commits, which is itself called in a loop.  This can lead to quadratic
    +    complexity if there are many merges.  Replace the commit_list with a
    +    prio_queue to ensure logarithmic worst case complexity and convert all
    +    three users.
     
         Add a performance test that exercises one of them using a pathological
         history that consists of 50% merges and 50% root commits to demonstrate
    @@ object-name.c: static enum get_oid_result get_oid_with_context_1(struct reposito
      
      			free_commit_list(list);
     
    + ## t/meson.build ##
    +@@ t/meson.build: benchmarks = [
    +   'perf/p1450-fsck.sh',
    +   'perf/p1451-fsck-skip-list.sh',
    +   'perf/p1500-graph-walks.sh',
    ++  'perf/p1501-rev-parse-oneline.sh',
    +   'perf/p2000-sparse-operations.sh',
    +   'perf/p3400-rebase.sh',
    +   'perf/p3404-rebase-interactive.sh',
    +
      ## t/perf/p1501-rev-parse-oneline.sh (new) ##
     @@
     +#!/bin/sh
    @@ t/perf/p1501-rev-parse-oneline.sh (new)
     +
     +test_perf_fresh_repo
     +
    ++#
    ++# Creates lots of merges to make history traversal costly.  In
    ++# particular it creates 2^($max_level-1)-1 2-way merges on top of
    ++# 2^($max_level-1) root commits.  E.g., the commit history looks like
    ++# this for a $max_level of 3:
    ++#
    ++#     _1_
    ++#    /   \
    ++#   2     3
    ++#  / \   / \
    ++# 4   5 6   7
    ++#
    ++# The numbers are the fast-import marks, which also are the commit
    ++# messages.  1 is the HEAD commit and a merge, 2 and 3 are also merges,
    ++# 4-7 are the root commits.
    ++#
     +build_history () {
     +	local max_level="$1" &&
     +	local level="${2:-1}" &&
    @@ t/perf/p1501-rev-parse-oneline.sh (new)
     +	sed -n -e "s/^.* //p" -e "q" <commits >needle
     +'
     +
    -+test_perf "rev-parse ':/$(cat needle)'" "
    -+	git rev-parse ':/$(cat needle)' >actual
    -+"
    ++test_perf "rev-parse :/$(cat needle)" '
    ++	git rev-parse :/$(cat needle) >actual
    ++'
     +
     +test_expect_success 'verify result' '
     +	test_cmp expect actual
2:  a4011d850c = 2:  a2e57ca443 prio-queue: add prio_queue_replace()
3:  dc535e8b64 ! 3:  1cbea38524 commit: use prio_queue_replace() in pop_most_recent_commit()
    @@ commit.c: void commit_list_sort_by_date(struct commit_list **list)
      {
     -	struct commit *ret = prio_queue_get(queue);
     +	struct commit *ret = prio_queue_peek(queue);
    -+	int delete_pending = 1;
    ++	int get_pending = 1;
      	struct commit_list *parents = ret->parents;
      
      	while (parents) {
    @@ commit.c: void commit_list_sort_by_date(struct commit_list **list)
      		if (!repo_parse_commit(the_repository, commit) && !(commit->object.flags & mark)) {
      			commit->object.flags |= mark;
     -			prio_queue_put(queue, commit);
    -+			if (delete_pending)
    ++			if (get_pending)
     +				prio_queue_replace(queue, commit);
     +			else
     +				prio_queue_put(queue, commit);
    -+			delete_pending = 0;
    ++			get_pending = 0;
      		}
      		parents = parents->next;
      	}
    -+	if (delete_pending)
    ++	if (get_pending)
     +		prio_queue_get(queue);
      	return ret;
      }
-- 
2.50.1

  parent reply	other threads:[~2025-07-18  9:09 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-15 14:35 [PATCH 0/3] commit: convert pop_most_recent_commit() to prio_queue René Scharfe
2025-07-15 14:51 ` [PATCH 1/3] " René Scharfe
2025-07-15 19:23   ` Junio C Hamano
2025-07-15 20:47   ` Justin Tobler
2025-07-16  9:39     ` René Scharfe
2025-07-16  5:05   ` Jeff King
2025-07-16  9:39     ` René Scharfe
2025-07-17  8:22       ` René Scharfe
2025-07-19  6:55       ` Jeff King
2025-07-19  6:57         ` Jeff King
2025-07-19 11:15         ` René Scharfe
2025-07-20  0:03           ` Jeff King
2025-07-20  1:22             ` Junio C Hamano
2025-07-16 22:23   ` Junio C Hamano
2025-07-17  8:22     ` René Scharfe
2025-07-15 14:51 ` [PATCH 2/3] prio-queue: add prio_queue_replace() René Scharfe
2025-07-16  5:09   ` Jeff King
2025-07-16  9:38     ` René Scharfe
2025-07-17  9:20       ` René Scharfe
2025-07-19  7:02         ` Jeff King
2025-07-15 14:51 ` [PATCH 3/3] commit: use prio_queue_replace() in pop_most_recent_commit() René Scharfe
2025-07-15 20:43   ` Junio C Hamano
2025-07-16  9:38     ` René Scharfe
2025-07-16  0:07 ` [PATCH 0/3] commit: convert pop_most_recent_commit() to prio_queue Junio C Hamano
2025-07-16  5:15   ` Jeff King
2025-07-16  9:38     ` René Scharfe
2025-07-19  6:45       ` Jeff King
2025-07-16 14:49     ` Junio C Hamano
2025-07-18  9:09 ` René Scharfe [this message]
2025-07-18  9:39   ` [PATCH v2 1/3] " René Scharfe
2025-07-21 14:02     ` Lidong Yan
2025-08-03  9:54       ` René Scharfe
2025-08-03 16:48         ` Junio C Hamano
2025-08-04 19:56           ` René Scharfe
2025-07-18  9:39   ` [PATCH v2 3/3] commit: use prio_queue_replace() in pop_most_recent_commit(),MIME-Version: 1.0 René Scharfe
2025-08-03 11:12     ` Johannes Schindelin
2025-08-03 11:33       ` René Scharfe
2025-07-18  9:39   ` [PATCH v2 2/3] prio-queue: add prio_queue_replace() René Scharfe
2025-07-19  7:04   ` [PATCH v2 0/3] commit: convert pop_most_recent_commit() to prio_queue Jeff King
2025-07-22  6:26   ` SZEDER Gábor
2025-07-22 14:27     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8ff40c56-368a-4347-aeae-2aca2cb6a5b2@web.de \
    --to=l.s.r@web$(echo .)de \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitster@pobox$(echo .)com \
    --cc=jltobler@gmail$(echo .)com \
    --cc=peff@peff$(echo .)net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox