public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Jeff King <peff@peff•net>
To: Kristofer Karlsson via GitGitGadget <gitgitgadget@gmail•com>
Cc: git@vger•kernel.org, Kristofer Karlsson <krka@spotify•com>
Subject: Re: [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter
Date: Mon, 25 May 2026 02:47:55 -0400	[thread overview]
Message-ID: <20260525064755.GA2737798@coredump.intra.peff.net> (raw)
In-Reply-To: <pull.2124.git.1779644541.gitgitgadget@gmail.com>

On Sun, May 24, 2026 at 05:42:17PM +0000, Kristofer Karlsson via GitGitGadget wrote:

> paint_down_to_common() and ahead_behind() terminate when every commit in
> their priority queue is STALE. The current check, queue_has_nonstale(), does
> an O(n) linear scan of the queue on every iteration, costing O(n*m) total
> where n is the queue size and m is the number of commits processed. This
> series replaces that scan with an O(1) counter.

We faced a similar problem in limit_list() but solved it a bit
differently (mostly because I was worried about keeping the counter up
to date in all cases).

It's described in more detail in b6e8a3b540 (limit_list: avoid quadratic
behavior from still_interesting, 2015-04-17), but the general idea is to
just cache the interesting element we found, and invalidate the cache
when it gets removed from the queue or gets marked UNINTERESTING.

The equivalent code for the STALE flag here is something like this:

diff --git a/commit-reach.c b/commit-reach.c
index d3a9b3ed6f..d1621be89f 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -39,12 +39,25 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
 	return 0;
 }
 
-static int queue_has_nonstale(struct prio_queue *queue)
+static int queue_has_nonstale(struct prio_queue *queue,
+			      struct commit **nonstale_cache)
 {
+	if (*nonstale_cache) {
+		struct commit *commit = *nonstale_cache;
+		if (!(commit->object.flags & STALE))
+			return 1;
+	}
+
+	/*
+	 * This might also benefit from looking back-to-front, since
+	 * earlier commits are more likely to get popped sooner.
+	 */
 	for (size_t i = 0; i < queue->nr; i++) {
 		struct commit *commit = queue->array[i].data;
-		if (!(commit->object.flags & STALE))
+		if (!(commit->object.flags & STALE)) {
+			*nonstale_cache = commit;
 			return 1;
+		}
 	}
 	return 0;
 }
@@ -61,6 +74,7 @@ static int paint_down_to_common(struct repository *r,
 	int i;
 	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
 	struct commit_list **tail = result;
+	struct commit *nonstale_cache = NULL;
 
 	if (!min_generation && !corrected_commit_dates_enabled(r))
 		queue.compare = compare_commits_by_commit_date;
@@ -77,12 +91,15 @@ static int paint_down_to_common(struct repository *r,
 		prio_queue_put(&queue, twos[i]);
 	}
 
-	while (queue_has_nonstale(&queue)) {
+	while (queue_has_nonstale(&queue, &nonstale_cache)) {
 		struct commit *commit = prio_queue_get(&queue);
 		struct commit_list *parents;
 		int flags;
 		timestamp_t generation = commit_graph_generation(commit);
 
+		if (nonstale_cache == commit)
+			nonstale_cache = NULL;
+
 		if (min_generation && generation > last_gen)
 			BUG("bad generation skip %"PRItime" > %"PRItime" at %s",
 			    generation, last_gen,
@@ -1053,6 +1070,7 @@ void ahead_behind(struct repository *r,
 {
 	struct prio_queue queue = { .compare = compare_commits_by_gen_then_commit_date };
 	size_t width = DIV_ROUND_UP(commits_nr, BITS_IN_EWORD);
+	struct commit *nonstale_cache = NULL;
 
 	if (!commits_nr || !counts_nr)
 		return;
@@ -1074,11 +1092,14 @@ void ahead_behind(struct repository *r,
 		insert_no_dup(&queue, c);
 	}
 
-	while (queue_has_nonstale(&queue)) {
+	while (queue_has_nonstale(&queue, &nonstale_cache)) {
 		struct commit *c = prio_queue_get(&queue);
 		struct commit_list *p;
 		struct bitmap *bitmap_c = get_bit_array(c, width);
 
+		if (c == nonstale_cache)
+			nonstale_cache = NULL;
+
 		for (size_t i = 0; i < counts_nr; i++) {
 			int reach_from_tip = !!bitmap_get(bitmap_c, counts[i].tip_index);
 			int reach_from_base = !!bitmap_get(bitmap_c, counts[i].base_index);


I don't have a repo handy which reproduces the problem, so I can't see
if it improves things. But if it's easy to do, can you report on the
timing change with your monorepo?

I do think what I've shown here is a bit hacky (just like the
limit_list() one), as we are relying on heuristics about the order in
which items are taken from the queue. So even if it performs well, we
may still prefer the counter version for being truly O(1). But having
timing numbers would be useful for comparing the two approaches.

-Peff

  parent reply	other threads:[~2026-05-25  6:48 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-24 17:42 [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Kristofer Karlsson via GitGitGadget
2026-05-24 17:42 ` [PATCH 1/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
2026-05-24 23:40   ` Junio C Hamano
2026-05-25  1:43     ` Derrick Stolee
2026-05-25  6:50       ` Kristofer Karlsson
2026-05-25  7:17         ` Junio C Hamano
2026-05-25  7:53           ` Kristofer Karlsson
2026-05-25 10:02             ` Jeff King
2026-05-25  7:01   ` Jeff King
2026-05-25  7:15     ` Jeff King
2026-05-24 17:42 ` [PATCH 2/3] commit-reach: optimize queue scan " Kristofer Karlsson via GitGitGadget
2026-05-25  1:59   ` Derrick Stolee
2026-05-25  8:54     ` Kristofer Karlsson
2026-05-24 17:42 ` [PATCH 3/3] commit-reach: optimize queue scan in ahead_behind Kristofer Karlsson via GitGitGadget
2026-05-25  7:11   ` Jeff King
2026-05-25  6:47 ` Jeff King [this message]
2026-05-25  7:59   ` [PATCH 0/3] commit-reach: replace queue_has_nonstale with a counter Kristofer Karlsson
2026-05-25  8:38     ` Junio C Hamano
2026-05-25  9:55     ` Jeff King
2026-05-25 10:47       ` Kristofer Karlsson
2026-05-25 14:28 ` [PATCH v2 0/3] commit-reach: replace queue_has_nonstale() scan with O(1) tracking Kristofer Karlsson via GitGitGadget
2026-05-25 14:28   ` [PATCH v2 1/3] object.h: fix stale entries in object flag allocation table Kristofer Karlsson via GitGitGadget
2026-05-25 14:28   ` [PATCH v2 2/3] commit-reach: deduplicate queue entries in paint_down_to_common Kristofer Karlsson via GitGitGadget
2026-05-25 22:50     ` Junio C Hamano
2026-05-26  6:57       ` Kristofer Karlsson
2026-05-25 14:28   ` [PATCH v2 3/3] commit-reach: replace queue_has_nonstale() scan with O(1) tracking Kristofer Karlsson via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260525064755.GA2737798@coredump.intra.peff.net \
    --to=peff@peff$(echo .)net \
    --cc=git@vger$(echo .)kernel.org \
    --cc=gitgitgadget@gmail$(echo .)com \
    --cc=krka@spotify$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox