public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Vlad Buslov <vladbu@mellanox•com>
To: Cong Wang <xiyou.wangcong@gmail•com>
Cc: Linux Kernel Network Developers <netdev@vger•kernel.org>,
	Jamal Hadi Salim <jhs@mojatatu•com>,
	Jiri Pirko <jiri@resnulli•us>, David Miller <davem@davemloft•net>
Subject: Re: [PATCH net-next 01/12] net: sched: flower: don't check for rtnl on head dereference
Date: Mon, 25 Feb 2019 16:11:20 +0000	[thread overview]
Message-ID: <vbf36obu9y9.fsf@mellanox.com> (raw)
In-Reply-To: <CAM_iQpVHggO6vQDr6RaWdr-g1nZGScO_mkWNVqaDXQSVbdiOCA@mail.gmail.com>


On Fri 22 Feb 2019 at 19:32, Cong Wang <xiyou.wangcong@gmail•com> wrote:
> On Thu, Feb 21, 2019 at 9:45 AM Vlad Buslov <vladbu@mellanox•com> wrote:
>>
>>
>> On Wed 20 Feb 2019 at 22:33, Cong Wang <xiyou.wangcong@gmail•com> wrote:
>> > On Tue, Feb 19, 2019 at 1:46 AM Vlad Buslov <vladbu@mellanox•com> wrote:
>> >>
>> >>
>> >> On Mon 18 Feb 2019 at 19:08, Cong Wang <xiyou.wangcong@gmail•com> wrote:
>> >> > On Wed, Feb 13, 2019 at 11:47 PM Vlad Buslov <vladbu@mellanox•com> wrote:
>> >> >>
>> >> >> Flower classifier only changes root pointer during init and destroy. Cls
>> >> >> API implements reference counting for tcf_proto, so there is no danger of
>> >> >> concurrent access to tp when it is being destroyed, even without protection
>> >> >> provided by rtnl lock.
>> >> >
>> >> > How about atomicity? Refcnt doesn't guarantee atomicity, how do
>> >> > you make sure two concurrent modifications are atomic?
>> >>
>> >> In order to guarantee atomicity I lock shared flower classifier data
>> >> structures with tp->lock in following patches.
>> >
>> > Sure, I meant the atomicity of the _whole_ change, as you know
>> > the TC filters are stored in hierarchical structures: a block, a chain,
>> > a tp struct, some type-specific data structure like a hash table.
>> >
>> > Locking tp only solves a partial of the atomicity here. Are you
>> > going to restart the whole change from top down to the bottom?
>>
>> You mean in cases where there is a conflict? Yes, processing EAGAIN in
>> tc_new_tfilter() involves releasing and re-obtaining all locks and
>> references.
>
> Sure, restart only happens when a conflict is detected, this is
> why I called it worst scenario.
>
>
>>
>> >
>> >
>> >>
>> >> >
>> >> >
>> >> >>
>> >> >> Implement new function fl_head_dereference() to dereference tp->root
>> >> >> without checking for rtnl lock. Use it in all flower function that obtain
>> >> >> head pointer instead of rtnl_dereference().
>> >> >>
>> >> >
>> >> > So what lock protects RCU writers after this patch?
>> >>
>> >> I explained it in comment for fl_head_dereference(), but should have
>> >> copied this information to changelog as well:
>> >> Flower classifier only changes root pointer during init and destroy.
>> >> Cls API implements reference counting for tcf_proto, so there is no
>> >> danger of concurrent access to tp when it is being destroyed, even
>> >> without protection provided by rtnl lock.
>> >
>> > So you are saying an RCU pointer is okay to deference without
>> > any lock eve without RCU read lock, right?
>> >
>> >
>> >>
>> >> In initial version of this change I used tp->lock to protect tp->root
>> >> access and verified it with lockdep, but during internal review Jiri
>> >> noted that this is not needed in current flower implementation.
>> >
>> > Let's see what you have on top of your own branch
>> > unlocked_flower_cong_1:
>> >
>> > 1458 static int fl_change(struct net *net, struct sk_buff *in_skb,
>> > 1459                      struct tcf_proto *tp, unsigned long base,
>> > 1460                      u32 handle, struct nlattr **tca,
>> > 1461                      void **arg, bool ovr, bool rtnl_held,
>> > 1462                      struct netlink_ext_ack *extack)
>> > 1463 {
>> > 1464         struct cls_fl_head *head = fl_head_dereference(tp);
>> >
>> > At the point of line 1464, there is no lock taken, tp->lock is taken
>> > after it, block->lock or chain lock is already unlocked before ->change().
>> >
>> > So, what protects this RCU structure? According to RCU, it must be
>> > either RCU read lock or some writer lock. I see none here. :(
>> >
>> > What am I missing?
>>
>> Initially I had flower implementation that protected tp->root access
>> with tp->lock, re-obtaining lock and head reference each time it was
>> used, sometimes multiple times per function (head reference is usually
>> obtained in beginning of every flower API function and used multiple
>> times afterwards). This complicated the code and reduced parallelism.
>> However, in case of flower classifier tp->root is never reassigned after
>> init, so essentially there is no "CU" part in this "RCU" pointer. This
>> realization allowed me to refactor unlocked flower code to look much
>> nicer and perform better. Am I missing something in flower classifier
>> implementation?
>
> So if it is no longer RCU any more, why do you still use
> rcu_dereference_protected()? That is, why not just deref it as a raw
> pointer?
>
> And, I don't think I can buy your argument here. The RCU infrastructure
> should not be changed even after your patches, the fast path is still
> protocted by RCU read lock, while the slow path now is protected by
> some smaller-scope locks. What makes cls_flower so unique that
> it doesn't even need RCU here? tp->root is not reassigned but it is still
> freed via RCU infra, that is in fl_destroy_sleepable().
>
> Thanks.

My cls API patch set introduced reference counting for tcf_proto
structure. With that change tp->ops->destroy() (which calls fl_destroy()
and fl_destroy_sleepable(), in case of flower classifier) is only called
after last reference to tp is released. All slow path users of tp->ops
must obtain reference to tp, so concurrent call to fl_destroy() is not
possible. Before this change tcf_proto structure didn't have reference
counting support and required users to obtain rtnl mutex before calling
its ops callbacks. This was verified in flower by using rtnl_dereference
to obtain tp->root.

  reply	other threads:[~2019-02-25 16:11 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-14  7:47 [PATCH net-next 00/12] Refactor flower classifier to remove dependency on rtnl lock Vlad Buslov
2019-02-14  7:47 ` [PATCH net-next 01/12] net: sched: flower: don't check for rtnl on head dereference Vlad Buslov
2019-02-18 19:08   ` Cong Wang
2019-02-19  9:45     ` Vlad Buslov
2019-02-20 22:33       ` Cong Wang
2019-02-21 17:45         ` Vlad Buslov
2019-02-22 19:32           ` Cong Wang
2019-02-25 16:11             ` Vlad Buslov [this message]
2019-02-25 22:39               ` Cong Wang
2019-02-26 14:57                 ` Vlad Buslov
2019-02-28  0:49                   ` Cong Wang
2019-02-28 18:35                     ` Vlad Buslov
2019-03-02  0:51                       ` Cong Wang
2019-02-14  7:47 ` [PATCH net-next 02/12] net: sched: flower: refactor fl_change Vlad Buslov
2019-02-14 20:34   ` Stefano Brivio
2019-02-15 10:38     ` Vlad Buslov
2019-02-15 10:47       ` Stefano Brivio
2019-02-15 16:25         ` Vlad Buslov
2019-02-18 18:20           ` Stefano Brivio
2019-02-14  7:47 ` [PATCH net-next 03/12] net: sched: flower: introduce reference counting for filters Vlad Buslov
2019-02-14 20:34   ` Stefano Brivio
2019-02-15 11:22     ` Vlad Buslov
2019-02-15 12:32       ` Stefano Brivio
2019-02-14  7:47 ` [PATCH net-next 04/12] net: sched: flower: track filter deletion with flag Vlad Buslov
2019-02-14 20:49   ` Stefano Brivio
2019-02-15 15:54     ` Vlad Buslov
2019-02-14  7:47 ` [PATCH net-next 05/12] net: sched: flower: add reference counter to flower mask Vlad Buslov
2019-02-14  7:47 ` [PATCH net-next 06/12] net: sched: flower: handle concurrent mask insertion Vlad Buslov
2019-02-15 22:46   ` Stefano Brivio
2019-02-14  7:47 ` [PATCH net-next 07/12] net: sched: flower: protect masks list with spinlock Vlad Buslov
2019-02-14  7:47 ` [PATCH net-next 08/12] net: sched: flower: handle concurrent filter insertion in fl_change Vlad Buslov
2019-02-14  7:47 ` [PATCH net-next 09/12] net: sched: flower: handle concurrent tcf proto deletion Vlad Buslov
2019-02-18 20:47   ` Cong Wang
2019-02-19 14:08     ` Vlad Buslov
2019-02-14  7:47 ` [PATCH net-next 10/12] net: sched: flower: protect flower classifier state with spinlock Vlad Buslov
2019-02-14  7:47 ` [PATCH net-next 11/12] net: sched: flower: track rtnl lock state Vlad Buslov
2019-02-15 22:46   ` Stefano Brivio
2019-02-18  9:35     ` Vlad Buslov
2019-02-14  7:47 ` [PATCH net-next 12/12] net: sched: flower: set unlocked flag for flower proto ops Vlad Buslov
2019-02-18 19:27   ` Cong Wang
2019-02-19 10:15     ` Vlad Buslov
2019-02-20 22:36       ` Cong Wang
2019-02-18 19:15 ` [PATCH net-next 00/12] Refactor flower classifier to remove dependency on rtnl lock Cong Wang
2019-02-19 10:00   ` Vlad Buslov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=vbf36obu9y9.fsf@mellanox.com \
    --to=vladbu@mellanox$(echo .)com \
    --cc=davem@davemloft$(echo .)net \
    --cc=jhs@mojatatu$(echo .)com \
    --cc=jiri@resnulli$(echo .)us \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=xiyou.wangcong@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox