public inbox for git@vger.kernel.org 
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox•com>
To: Derrick Stolee <stolee@gmail•com>
Cc: Lidong Yan <yldhome2d2@gmail•com>,
	 502024330056@smail•nju.edu.cn, git@vger•kernel.org
Subject: Re: [PATCH v4 4/4] bloom: optimize multiple pathspec items in revision traversal
Date: Mon, 07 Jul 2025 08:14:56 -0700	[thread overview]
Message-ID: <xmqqikk458y7.fsf@gitster.g> (raw)
In-Reply-To: <ea144a72-0975-4ac9-b2e4-ae0f7fcb6837@gmail.com> (Derrick Stolee's message of "Mon, 7 Jul 2025 07:43:19 -0400")

Derrick Stolee <stolee@gmail•com> writes:

> On 7/4/2025 7:14 AM, Lidong Yan wrote:
>> To enable optimize multiple pathspec items in revision traversal,
>> return 0 if all pathspec item is literal in forbid_bloom_filters().
>> Add code to initialize and check each pathspec item's bloom_keyvec.
>> 
>> Add new function release_revisions_bloom_keyvecs() to free all bloom
>> keyvec owned by rev_info.
>> 
>> Add new test cases in t/t4216-log-bloom.sh to ensure
>>   - consistent results between the optimization for multiple pathspec
>>     items using bloom filter and the case without bloom filter
>>     optimization.
>>   - does not use bloom filter if any pathspec item is not literal.
>
> This would be a great time to add some performance statistics when
> using this feature with multiple pathspecs on some standard repos (git
> and the Linux kernel repo are two good examples).
>
> We don't have a great performance script for this, since each test
> repo will have different paths to use for comparisons, but you can
> use 'hyperfine' to assemble your own comparisons before and after
> this change and report them here (and in your cover letter). 

Excellent suggestion.  Very much appreciated.

> The size of this diff is unfortunate. I wonder if there could first
> be an extraction of this logic to operate on a single pathspec and
> bloom_keyvec in a way that would be an obvious code move, then this
> patch could call that method in a loop now that we have an array of
> bloom_keyvecs.
>
> I think it would make a cleaner patch and a cleaner final result.

Hmph.  Didn't think of that extra "preliminary clean-up" step
myself, but I think you have a point.  That would likely make
the resulting series easier to follow.

Thanks.

  parent reply	other threads:[~2025-07-07 15:14 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-25 12:55 [PATCH 0/2] bloom: use bloom filter given multiple pathspec Lidong Yan
2025-06-25 12:55 ` [PATCH 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-06-25 17:43   ` Junio C Hamano
2025-06-26  3:44     ` Lidong Yan
2025-06-25 12:55 ` [PATCH 2/2] bloom: enable multiple pathspec bloom keys Lidong Yan
2025-06-27 13:50   ` Junio C Hamano
2025-06-27 14:24     ` Lidong Yan
2025-06-27 18:09       ` Junio C Hamano
2025-07-01  5:52     ` Lidong Yan
2025-07-01 15:19       ` Junio C Hamano
2025-07-02  7:14         ` Lidong Yan
2025-07-02 15:48           ` Junio C Hamano
2025-07-03  1:52             ` Lidong Yan
2025-07-04 12:09             ` Lidong Yan
2025-07-01  8:50     ` SZEDER Gábor
2025-07-01 11:40       ` Lidong Yan
2025-07-01 15:43       ` Junio C Hamano
2025-06-27 20:39   ` Junio C Hamano
2025-06-28  2:54     ` Lidong Yan
2025-06-25 17:32 ` [PATCH 0/2] bloom: use bloom filter given multiple pathspec Junio C Hamano
2025-06-26  3:34   ` Lidong Yan
2025-06-26 14:15     ` Junio C Hamano
2025-06-27  6:21 ` [PATCH v2 0/2] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal Lidong Yan
2025-06-28  4:21   ` [PATCH v3 " Lidong Yan
2025-07-04 11:14     ` [PATCH v4 0/4] " Lidong Yan
2025-07-04 11:14       ` [PATCH v4 1/4] bloom: add test helper to return murmur3 hash Lidong Yan
2025-07-04 11:14       ` [PATCH v4 2/4] bloom: rename function operates on bloom_key Lidong Yan
2025-07-04 11:14       ` [PATCH v4 3/4] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-07 11:35         ` Derrick Stolee
2025-07-07 14:14           ` Lidong Yan
2025-07-04 11:14       ` [PATCH v4 4/4] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
2025-07-07 11:43         ` Derrick Stolee
2025-07-07 14:18           ` Lidong Yan
2025-07-07 15:14           ` Junio C Hamano [this message]
2025-07-10  8:48       ` [PATCH v5 0/4] bloom: enable bloom filter optimization for multiple pathspec elements " Lidong Yan
2025-07-10  8:48         ` [PATCH v5 1/4] bloom: add test helper to return murmur3 hash Lidong Yan
2025-07-10  8:48         ` [PATCH v5 2/4] bloom: rename function operates on bloom_key Lidong Yan
2025-07-10  8:48         ` [PATCH v5 3/4] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-10 16:17           ` Junio C Hamano
2025-07-11 12:46             ` Lidong Yan
2025-07-11 15:06               ` Junio C Hamano
2025-07-10  8:48         ` [PATCH v5 4/4] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
2025-07-10 13:51           ` [PATCH v5.1 3.5/4] revision: make helper for pathspec to bloom key Derrick Stolee
2025-07-10 15:42             ` Lidong Yan
2025-07-10 13:55           ` [PATCH v5.1 4/4] bloom: optimize multiple pathspec items in revision Derrick Stolee
2025-07-10 15:49             ` Lidong Yan
2025-07-10 13:49         ` [PATCH v5 0/4] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal Derrick Stolee
2025-07-12  9:35         ` [PATCH v6 0/5] " Lidong Yan
2025-07-12  9:35           ` [PATCH v6 1/5] bloom: add test helper to return murmur3 hash Lidong Yan
2025-07-12  9:35           ` [PATCH v6 2/5] bloom: rename function operates on bloom_key Lidong Yan
2025-07-12  9:35           ` [PATCH v6 3/5] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-12  9:35           ` [PATCH v6 4/5] revision: make helper for pathspec to bloom keyvec Lidong Yan
2025-07-12  9:35           ` [PATCH v6 5/5] To enable optimize multiple pathspec items in revision traversal, return 0 if all pathspec item is literal in forbid_bloom_filters(). Add for loops to initialize and check each pathspec item's bloom_keyvec when optimization is possible Lidong Yan
2025-07-12  9:47             ` Lidong Yan
2025-07-12  9:51               ` [PATCH v6 5/5] bloom: optimize multiple pathspec items in revision Lidong Yan
2025-07-14 16:51                 ` Derrick Stolee
2025-07-14 17:01                   ` Junio C Hamano
2025-07-15  1:37                     ` Lidong Yan
2025-07-15  2:56                       ` [RESEND][PATCH " Lidong Yan
2025-07-14 16:53           ` [PATCH v6 0/5] bloom: enable bloom filter optimization for multiple pathspec elements in revision traversal Derrick Stolee
2025-07-14 17:02             ` Junio C Hamano
2025-07-15  1:34             ` Lidong Yan
2025-07-15  2:48               ` Derrick Stolee
2025-07-15 15:09                 ` Junio C Hamano
2025-06-28  4:21   ` [PATCH v3 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-07-02 15:08     ` Patrick Steinhardt
2025-07-02 15:49       ` Lidong Yan
2025-07-02 18:28       ` Junio C Hamano
2025-07-03  1:41         ` Lidong Yan
2025-06-28  4:21   ` [PATCH v3 2/2] bloom: optimize multiple pathspec items in revision traversal Lidong Yan
2025-06-27  6:21 ` [PATCH v2 1/2] bloom: replace struct bloom_key * with struct bloom_keyvec Lidong Yan
2025-06-27  6:21 ` [PATCH v2 2/2] bloom: optimize multiple pathspec items in revision traversal Lidong Yan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqikk458y7.fsf@gitster.g \
    --to=gitster@pobox$(echo .)com \
    --cc=502024330056@smail$(echo .)nju.edu.cn \
    --cc=git@vger$(echo .)kernel.org \
    --cc=stolee@gmail$(echo .)com \
    --cc=yldhome2d2@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox