From: Junio C Hamano <gitster@pobox•com>
To: Stefan Beller <sbeller@google•com>
Cc: Duy Nguyen <pclouds@gmail•com>,
Git Mailing List <git@vger•kernel.org>,
Brandon Williams <bmwill@google•com>
Subject: Re: [PATCH 32/36] pathspec: allow querying for attributes
Date: Wed, 09 Nov 2016 14:25:13 -0800 [thread overview]
Message-ID: <xmqqy40s8f5i.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <CAGZ79kZ=9QeZLKKrH27U2iE9x3WxgVe4RvCZpbdzZriMArV6Sg@mail.gmail.com> (Stefan Beller's message of "Wed, 9 Nov 2016 10:08:45 -0800")
Stefan Beller <sbeller@google•com> writes:
> On Wed, Nov 9, 2016 at 1:45 AM, Duy Nguyen <pclouds@gmail•com> wrote:
>> On Fri, Oct 28, 2016 at 1:29 AM, Junio C Hamano <gitster@pobox•com> wrote:
>>> ...
>>> The strategy this round takes to make it unnecessary to punt
>>> preloading (i.e. dropping "pathspec: disable preload-index when
>>> attribute pathspec magic is in use" patch the old series had) is to
>>> make the attribute subsystem thread-safe. But that thread-safety in
>>> the initial round is based on a single Big Attribute Lock, so it may
>>> turn out that the end result performs better for this codepath if we
>>> did not to make any call into the attribute subsystem.
>>
>> It does sound good and I want to say "yes please do this", but is it
>> making pathspec api a bit more complex (to express "assume all
>> attr-related criteria match")? I guess we can have an api to simply
>> filter out attr-related magic (basically set attr_match_nr back to
>> zero) then pass a safe (but more relaxing) pathspec to the threaded
>> code. That would not add big maintenance burden.
>
> So with the current implementation, we already have the shortcut as:
>
> if (item->attr_match_nr && !match_attrs(name, namelen, item))
>
> i.e. if attr_match_nr is zero, we do not even look at the mutexes and such,
> so I am not sure what you intend to say in this email?
I am not sure what relevance the "we call into attribute subsystem
only when there is any need to check attributes" obvious short-cut
has to what is being discussed.
The issue is specific to what preloading is about. It is merely an
attempt to run cheap checks that could be easily multi-threaded with
multiple threads early in the program that we _know_ we would need
to eventually refresh the index before doing some interesting work.
A full refresh_index() will be done eventually, and because it needs
to trigger thread-unsafe part of the API, it needs to be done in the
main thread. Doing the preload allows us to mark index entries that
do not have to be scanned again in the upcoming refresh_index()
call. It is OK for preload-index.c::preload_thread() to skip and
not mark some index entries (iow, its sole purpose is to leave a
note in each index entry "this is fresh, you do not need to look at
it again", and it can choose to skip an entry, which essentially
means "this I didn't check, so you, refresh_index(), need to check
yourself").
preload_thread() for example skips index entries that needs to
trigger "racy Git avoidance" logic that is heavyweight (it has to go
to the filesystem and the object store), and it is a sensible thing
to skip because they are rather rare.
The message by Duy you are responding to was his response to me who
wondered if the attribute based pathspec match also falls into the
same category. Just like racy Git code was deemed too heavyweight
to be called from preloading codepath and CE_MATCH_RACY_IS_DIRTY bit
was added as a way to ask ie_match_stat() API to avoid it (and hence
we are skipping, the caller is also telling "if you suspect a racy,
without checking for real, just answer 'I cannot say it is
clean/fresh'"), if we invent a new flag and pass it through
match_pathspec() down to match_pathspec_item() and have that if()
statement you quoted also skip match_attrs() for a pathspec element
with attribute based narrowing (as we are skipping, the flag may
also have say "instead of checking the attributes, just pretend that
the path did not satisfy the attribute narrowing"), would it benefit
the overall performance?
To answer Duy's "would it make sense to force the caller to create a
new pathspec from an existing one by filtering out pathspec elements
with attr-based narrowing?" question, I do not think it does.
next prev parent reply other threads:[~2016-11-09 22:25 UTC|newest]
Thread overview: 81+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-22 23:31 [PATCHv2 00/36] Revamping the attr subsystem! Stefan Beller
2016-10-22 23:31 ` [PATCH 01/36] commit.c: use strchrnul() to scan for one line Stefan Beller
2016-10-22 23:31 ` [PATCH 02/36] attr.c: " Stefan Beller
2016-10-22 23:31 ` [PATCH 03/36] attr.c: update a stale comment on "struct match_attr" Stefan Beller
2016-10-22 23:31 ` [PATCH 04/36] attr.c: explain the lack of attr-name syntax check in parse_attr() Stefan Beller
2016-10-22 23:31 ` [PATCH 05/36] attr.c: complete a sentence in a comment Stefan Beller
2016-10-22 23:31 ` [PATCH 06/36] attr.c: mark where #if DEBUG ends more clearly Stefan Beller
2016-10-22 23:31 ` [PATCH 07/36] attr.c: simplify macroexpand_one() Stefan Beller
2016-10-22 23:31 ` [PATCH 08/36] attr.c: tighten constness around "git_attr" structure Stefan Beller
2016-10-22 23:31 ` [PATCH 09/36] attr.c: plug small leak in parse_attr_line() Stefan Beller
2016-10-22 23:31 ` [PATCH 10/36] attr: rename function and struct related to checking attributes Stefan Beller
2016-10-22 23:32 ` [PATCH 11/36] attr: (re)introduce git_check_attr() and struct git_attr_check Stefan Beller
2016-10-22 23:32 ` [PATCH 12/36] attr: convert git_all_attrs() to use "struct git_attr_check" Stefan Beller
2016-10-22 23:32 ` [PATCH 13/36] attr: convert git_check_attrs() callers to use the new API Stefan Beller
2016-10-22 23:32 ` [PATCH 14/36] attr: retire git_check_attrs() API Stefan Beller
2016-10-22 23:32 ` [PATCH 15/36] attr: add counted string version of git_check_attr() Stefan Beller
2016-10-22 23:32 ` [PATCH 16/36] attr: add counted string version of git_attr() Stefan Beller
2016-10-22 23:32 ` [PATCH 17/36] attr: expose validity check for attribute names Stefan Beller
2016-10-23 15:07 ` Ramsay Jones
2016-10-24 21:07 ` Stefan Beller
2016-10-27 20:57 ` Stefan Beller
2016-10-26 21:20 ` [PATCH] attr: expose error reporting function for invalid " Stefan Beller
2016-10-22 23:32 ` [PATCH 18/36] attr: support quoting pathname patterns in C style Stefan Beller
2016-10-22 23:32 ` [PATCH 19/36] attr.c: add push_stack() helper Stefan Beller
2016-10-22 23:32 ` [PATCH 20/36] attr.c: pass struct git_attr_check down the callchain Stefan Beller
2016-10-22 23:32 ` [PATCH 21/36] attr.c: rename a local variable check Stefan Beller
2016-10-22 23:32 ` [PATCH 22/36] attr.c: correct ugly hack for git_all_attrs() Stefan Beller
2016-10-22 23:32 ` [PATCH 23/36] attr.c: introduce empty_attr_check_elems() Stefan Beller
2016-10-22 23:32 ` [PATCH 24/36] attr.c: always pass check[] to collect_some_attrs() Stefan Beller
2016-10-22 23:32 ` [PATCH 25/36] attr.c: outline the future plans by heavily commenting Stefan Beller
2016-10-22 23:32 ` [PATCH 26/36] attr: make git_check_attr_counted static Stefan Beller
2016-10-22 23:32 ` [PATCH 27/36] attr: convert to new threadsafe API Stefan Beller
2016-10-24 18:55 ` Junio C Hamano
2016-10-24 19:18 ` Stefan Beller
2016-10-26 14:06 ` Duy Nguyen
2016-10-26 8:52 ` Johannes Schindelin
2016-10-26 9:35 ` Simon Ruderich
2016-10-26 12:15 ` Jeff King
2016-10-26 19:51 ` Stefan Beller
2016-10-26 20:20 ` Jeff King
2016-10-26 20:25 ` Johannes Sixt
2016-10-26 20:26 ` Jeff King
2016-10-26 20:40 ` Johannes Sixt
2016-10-26 20:46 ` Stefan Beller
2016-10-26 22:41 ` [PATCHv2 1/2] " Stefan Beller
2016-10-26 23:14 ` Junio C Hamano
2016-10-27 0:08 ` Stefan Beller
2016-10-27 0:16 ` Junio C Hamano
2016-10-27 0:22 ` Stefan Beller
2016-10-27 0:50 ` Junio C Hamano
2016-10-27 0:49 ` Junio C Hamano
2016-10-27 2:19 ` Stefan Beller
2016-10-27 4:13 ` Junio C Hamano
2016-10-27 5:44 ` Junio C Hamano
2016-10-27 22:15 ` [PATCH] " Stefan Beller
2016-10-28 8:55 ` Johannes Schindelin
2016-10-28 17:20 ` Junio C Hamano
2016-10-28 17:30 ` Junio C Hamano
2016-10-28 18:16 ` Johannes Sixt
2016-10-26 20:43 ` [PATCH 27/36] " Stefan Beller
2016-10-26 13:52 ` Duy Nguyen
2016-10-22 23:32 ` [PATCH 28/36] attr: keep attr stack for each check Stefan Beller
2016-10-23 15:10 ` Ramsay Jones
2016-10-26 23:10 ` Stefan Beller
2016-10-24 19:07 ` Junio C Hamano
2016-10-24 19:32 ` Stefan Beller
2016-10-24 20:29 ` Junio C Hamano
2016-10-22 23:32 ` [PATCH 29/36] Documentation: fix a typo Stefan Beller
2016-10-22 23:32 ` [PATCH 30/36] pathspec: move long magic parsing out of prefix_pathspec Stefan Beller
2016-10-22 23:32 ` [PATCH 31/36] pathspec: move prefix check out of the inner loop Stefan Beller
2016-10-22 23:32 ` [PATCH 32/36] pathspec: allow querying for attributes Stefan Beller
2016-10-26 13:33 ` Duy Nguyen
2016-10-27 21:32 ` Stefan Beller
2016-10-27 18:29 ` Junio C Hamano
2016-11-09 9:45 ` Duy Nguyen
2016-11-09 18:08 ` Stefan Beller
2016-11-09 22:25 ` Junio C Hamano [this message]
2016-10-22 23:32 ` [PATCH 33/36] pathspec: allow escaped query values Stefan Beller
2016-10-22 23:32 ` [PATCH 34/36] submodule update: add `--init-default-path` switch Stefan Beller
2016-10-22 23:32 ` [PATCH 35/36] clone: add --init-submodule=<pathspec> switch Stefan Beller
2016-10-22 23:32 ` [PATCH 36/36] completion: clone can initialize specific submodules Stefan Beller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqy40s8f5i.fsf@gitster.mtv.corp.google.com \
--to=gitster@pobox$(echo .)com \
--cc=bmwill@google$(echo .)com \
--cc=git@vger$(echo .)kernel.org \
--cc=pclouds@gmail$(echo .)com \
--cc=sbeller@google$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox