public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Vlad Buslov <vladbu@mellanox•com>
To: Cong Wang <xiyou.wangcong@gmail•com>, Jiri Pirko <jiri@resnulli•us>
Cc: Jiri Pirko <jiri@resnulli•us>,
	Linux Kernel Network Developers <netdev@vger•kernel.org>,
	Jamal Hadi Salim <jhs@mojatatu•com>,
	David Miller <davem@davemloft•net>,
	Alexei Starovoitov <ast@kernel•org>,
	Daniel Borkmann <daniel@iogearbox•net>
Subject: Re: [PATCH net-next v2 01/17] net: sched: refactor mini_qdisc_pair_swap() to use workqueue
Date: Mon, 14 Jan 2019 19:00:48 +0000	[thread overview]
Message-ID: <vbfef9fvzc3.fsf@mellanox.com> (raw)
In-Reply-To: <vbf1s6c5mlz.fsf@mellanox.com>


On Thu 20 Dec 2018 at 13:55, Vlad Buslov <vladbu@mellanox•com> wrote:
> On Wed 19 Dec 2018 at 04:27, Cong Wang <xiyou.wangcong@gmail•com> wrote:
>> On Mon, Dec 17, 2018 at 2:30 AM Jiri Pirko <jiri@resnulli•us> wrote:
>>>
>>> Sun, Dec 16, 2018 at 07:52:18PM CET, xiyou.wangcong@gmail•com wrote:
>>> >On Sun, Dec 16, 2018 at 8:32 AM Vlad Buslov <vladbu@mellanox•com> wrote:
>>> >>
>>> >> On Thu 13 Dec 2018 at 23:32, Cong Wang <xiyou.wangcong@gmail•com> wrote:
>>> >> > On Tue, Dec 11, 2018 at 2:19 AM Vlad Buslov <vladbu@mellanox•com> wrote:
>>> >> >>
>>> >> >> As a part of the effort to remove dependency on rtnl lock, cls API is being
>>> >> >> converted to use fine-grained locking mechanisms instead of global rtnl
>>> >> >> lock. However, chain_head_change callback for ingress Qdisc is a sleeping
>>> >> >> function and cannot be executed while holding a spinlock.
>>> >> >
>>> >> >
>>> >> > Why does it have to be a spinlock not a mutex?
>>> >> >
>>> >> > I've read your cover letter and this changelog, I don't find any
>>> >> > answer.
>>> >>
>>> >> My initial implementation used mutex. However, it was changed to
>>> >> spinlock by Jiri's request during internal review.
>>> >>
>>> >
>>> >So what's the answer to my question? :)
>>>
>>> Yeah, my concern agains mutexes was that it would be needed to have one
>>> per every block and per every chain. I find it quite heavy and I believe
>>> it is better to use spinlock in those cases. This patch is a side effect
>>> of that. Do you think it would be better to have mutexes instead of
>>> spinlocks?
>>
>> My only concern with spinlock is we have to go async as we
>> can't block. This is almost always error-prone especially
>> when dealing with tcf block. I had to give up with spinlock
>> for idrinfo->lock, please take a look at:
>>
>> commit 95278ddaa15cfa23e4a06ee9ed7b6ee0197c500b
>> Author: Cong Wang <xiyou.wangcong@gmail•com>
>> Date:   Tue Oct 2 12:50:19 2018 -0700
>>
>>     net_sched: convert idrinfo->lock from spinlock to a mutex
>>
>>
>> There are indeed some cases in kernel we do take multiple
>> mutex'es, for example,
>>
>> /*
>>  * bd_mutex locking:
>>  *
>>  *  mutex_lock(part->bd_mutex)
>>  *    mutex_lock_nested(whole->bd_mutex, 1)
>>  */
>>
>> So, how heavy are they comparing with spinlocks?
>>
>> Thanks.
>
> Hi Cong,
>
> I did quick-and-dirty mutex-based implementation of my cls API patchset
> (removed async miniqp refactoring, changed block and chain lock types to
> mutex lock) and performed some benchmarks to determine exact mutex
> impact on cls API performance (both rule and chains API).
>
> 1. Parallel flower insertion rate (create 5 million flower filters on
> same chain/prio with 10 tc instances) - same performance:
>
> SPINLOCK
> $ time ls add_sw* | xargs -n 1 -P 100 sudo tc -force -b
>
> real    0m12.183s
> user    0m35.013s
> sys     1m24.947s
>
> MUTEX
> $ time ls add_sw* | xargs -n 1 -P 100 sudo tc -force -b
>
> real    0m12.246s
> user    0m35.417s
> sys     1m25.330s
>
> 2. Parallel flower insertion rate (create 100k flower filters, each on
> new instance of chain/tp, 10 tc instances) - mutex implementation is
> ~17% slower:
>
> SPINLOCK:
> $ time ls add_sw* | xargs -n 1 -P 100 sudo tc -force -b
>
> real    2m19.285s
> user    0m1.863s
> sys     5m41.157s
>
> MUTEX:
> multichain$ time ls add_sw* | xargs -n 1 -P 100 sudo tc -force -b
>
> real    2m43.495s
> user    0m1.664s
> sys     8m32.483s
>
> 3. Chains insertion rate with cls chain API (create 100k empty chains, 1
> tc instance) - ~3% difference:
>
> SPINLOCK:
> $ time sudo tc -b add_chains
>
> real    0m46.822s
> user    0m0.239s
> sys     0m46.437s
>
> MUTEX:
> $ time sudo tc -b add_chains
>
> real    0m48.600s
> user    0m0.230s
> sys     0m48.227s
>
>
> Only case where performance is significantly impacted by mutex is second
> test. This happens because chain lookup is a linear search that is
> performed while holding highly contested block lock. Perf profile for
> mutex version confirms this:
>
> +   91.83%     0.00%  tc               [kernel.vmlinux]            [k] entry_SYSCALL_64_after_hwframe
> +   91.83%     0.00%  tc               [kernel.vmlinux]            [k] do_syscall_64
> +   91.80%     0.00%  tc               libc-2.25.so                [.] __libc_sendmsg
> +   91.78%     0.00%  tc               [kernel.vmlinux]            [k] __sys_sendmsg
> +   91.77%     0.00%  tc               [kernel.vmlinux]            [k] ___sys_sendmsg
> +   91.77%     0.00%  tc               [kernel.vmlinux]            [k] sock_sendmsg
> +   91.77%     0.00%  tc               [kernel.vmlinux]            [k] netlink_sendmsg
> +   91.77%     0.01%  tc               [kernel.vmlinux]            [k] netlink_unicast
> +   91.77%     0.00%  tc               [kernel.vmlinux]            [k] netlink_rcv_skb
> +   91.71%     0.01%  tc               [kernel.vmlinux]            [k] rtnetlink_rcv_msg
> +   91.69%     0.03%  tc               [kernel.vmlinux]            [k] tc_new_tfilter
> +   90.96%    26.45%  tc               [kernel.vmlinux]            [k] __tcf_chain_get
> +   64.30%     0.01%  tc               [kernel.vmlinux]            [k] __mutex_lock.isra.7
> +   39.36%    39.18%  tc               [kernel.vmlinux]            [k] osq_lock
> +   24.92%    24.81%  tc               [kernel.vmlinux]            [k] mutex_spin_on_owner
>
> Based on these results we can conclude that use-cases with significant
> amount of chains on single block will be affected by using mutex in cls
> API.
>
> Regards,
> Vlad

Hi Cong,

Any comments regarding benchmark results? Are you okay with
spinlock-based implementation?

Thanks,
Vlad

  reply	other threads:[~2019-01-14 19:00 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-11 10:11 [PATCH net-next v2 00/17] Refactor classifier API to work with chain/classifiers without rtnl lock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 01/17] net: sched: refactor mini_qdisc_pair_swap() to use workqueue Vlad Buslov
2018-12-12  3:23   ` kbuild test robot
2018-12-13 23:32   ` Cong Wang
2018-12-16 16:32     ` Vlad Buslov
2018-12-16 18:52       ` Cong Wang
2018-12-17 10:22         ` Jiri Pirko
2018-12-19  4:27           ` Cong Wang
2018-12-20 13:55             ` Vlad Buslov
2019-01-14 19:00               ` Vlad Buslov [this message]
2019-01-14 23:00                 ` Cong Wang
2018-12-19  8:12   ` kbuild test robot
2018-12-11 10:11 ` [PATCH net-next v2 02/17] net: sched: protect block state with spinlock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 03/17] net: sched: refactor tc_ctl_chain() to use block->lock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 04/17] net: sched: protect block->chain0 with block->lock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 05/17] net: sched: traverse chains in block with tcf_get_next_chain() Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 06/17] net: sched: protect chain template accesses with block lock Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 07/17] net: sched: lock the chain when accessing filter_chain list Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 08/17] net: sched: introduce reference counting for tcf proto Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 09/17] net: sched: traverse classifiers in chain with tcf_get_next_proto() Vlad Buslov
2018-12-14 10:15   ` kbuild test robot
2018-12-11 10:11 ` [PATCH net-next v2 10/17] net: sched: refactor tp insert/delete for concurrent execution Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 11/17] net: sched: prevent insertion of new classifiers during chain flush Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 12/17] net: sched: track rtnl lock status when validating extensions Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 13/17] net: sched: extend proto ops with 'put' callback Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 14/17] net: sched: extend proto ops to support unlocked classifiers Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 15/17] net: sched: add flags to Qdisc class ops struct Vlad Buslov
2018-12-11 10:11 ` [PATCH net-next v2 16/17] net: sched: refactor tcf_block_find() into standalone functions Vlad Buslov
2018-12-11 10:12 ` [PATCH net-next v2 17/17] net: sched: unlock rules update API Vlad Buslov
2018-12-20  0:15   ` kbuild test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=vbfef9fvzc3.fsf@mellanox.com \
    --to=vladbu@mellanox$(echo .)com \
    --cc=ast@kernel$(echo .)org \
    --cc=daniel@iogearbox$(echo .)net \
    --cc=davem@davemloft$(echo .)net \
    --cc=jhs@mojatatu$(echo .)com \
    --cc=jiri@resnulli$(echo .)us \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=xiyou.wangcong@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox