public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Petr Machata <petrm@mellanox•com>
To: Cong Wang <xiyou.wangcong@gmail•com>
Cc: Linux Kernel Network Developers <netdev@vger•kernel.org>,
	David Miller <davem@davemloft•net>,
	Jakub Kicinski <kuba@kernel•org>,
	Eric Dumazet <eric.dumazet@gmail•com>,
	Jiri Pirko <jiri@mellanox•com>,
	Ido Schimmel <idosch@mellanox•com>
Subject: Re: [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks
Date: Wed, 08 Jul 2020 18:21:12 +0200	[thread overview]
Message-ID: <87y2nugepz.fsf@mellanox.com> (raw)
In-Reply-To: <873662i3rc.fsf@mellanox.com>


Petr Machata <petrm@mellanox•com> writes:

> Cong Wang <xiyou.wangcong@gmail•com> writes:
>
>> On Tue, Jul 7, 2020 at 8:22 AM Petr Machata <petrm@mellanox•com> wrote:
>>>
>>>
>>> Cong Wang <xiyou.wangcong@gmail•com> writes:
>>>
>>> > On Fri, Jun 26, 2020 at 3:46 PM Petr Machata <petrm@mellanox•com> wrote:
>>> >> The function tcf_qevent_handle() should be invoked when qdisc hits the
>>> >> "interesting event" corresponding to a block. This function releases root
>>> >> lock for the duration of executing the attached filters, to allow packets
>>> >> generated through user actions (notably mirred) to be reinserted to the
>>> >> same qdisc tree.
>>> >
>>> > Are you sure releasing the root lock in the middle of an enqueue operation
>>> > is a good idea? I mean, it seems racy with qdisc change or reset path,
>>> > for example, __red_change() could update some RED parameters
>>> > immediately after you release the root lock.
>>>
>>> So I had mulled over this for a while. If packets are enqueued or
>>> dequeued while the lock is released, maybe the packet under
>>> consideration should have been hard_marked instead of prob_marked, or
>>> vice versa. (I can't really go to not marked at all, because the fact
>>> that we ran the qevent is a very observable commitment to put the packet
>>> in the queue with marking.) I figured that is not such a big deal.
>>>
>>> Regarding a configuration change, for a brief period after the change, a
>>> few not-yet-pushed packets could have been enqueued with ECN marking
>>> even as I e.g. disabled ECN. This does not seem like a big deal either,
>>> these are transient effects.
>>
>> Hmm, let's see:
>>
>> 1. red_enqueue() caches a pointer to child qdisc, child = q->qdisc
>> 2. root lock is released by tcf_qevent_handle().
>> 3. __red_change() acquires the root lock and then changes child
>> qdisc with a new one
>> 4. The old child qdisc is put by qdisc_put()
>> 5. tcf_qevent_handle() acquires the root lock again, and still uses
>> the cached but now freed old child qdisc.
>>
>> Isn't this a problem?
>
> I missed this. It is a problem, destroy gets called right away and then
> enqueue would use invalid data.
>
> Also qdisc_graft() calls cops->graft, which locks the tree and swaps the
> qdisc pointes, and then qdisc_put()s the original one. So dropping the
> root lock can lead to destruction of the qdisc whose enqueue is
> currently executed.
>
> So that's no good. The lock has to be held throughout.
>
>> Even if it really is not, why do we make tcf_qevent_handle() callers'
>> life so hard? They have to be very careful with race conditions like this.
>
> Do you have another solution in mind here? I think the deadlock (in both
> classification and qevents) is an issue, but really don't know how to
> avoid it except by dropping the lock.

Actually I guess I could qdisc_refcount_inc() the current qdisc so that
it doesn't go away. Then when enqueing I could access the child
directly, not relying on the now-obsolete cache from the beginning of
the enqueue function. I suppose that a similar approach could be used in
other users of tcf_classify() as well. What do you think?

  reply	other threads:[~2020-07-08 16:21 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-26 22:45 [PATCH net-next v1 0/5] TC: Introduce qevents Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 1/5] net: sched: Pass root lock to Qdisc_ops.enqueue Petr Machata
2020-07-06 19:21   ` Cong Wang
2020-07-07 15:25     ` Petr Machata
2020-07-07 19:41       ` Cong Wang
2020-06-26 22:45 ` [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks Petr Machata
2020-07-06 19:48   ` Cong Wang
2020-07-07 15:22     ` Petr Machata
2020-07-07 19:13       ` Cong Wang
2020-07-08 12:35         ` Petr Machata
2020-07-08 16:21           ` Petr Machata [this message]
2020-07-08 19:09             ` Cong Wang
2020-07-08 19:04           ` Cong Wang
2020-07-08 21:04             ` Petr Machata
2020-07-09  0:13               ` Petr Machata
2020-07-09 19:37                 ` Cong Wang
2020-07-10 14:40                   ` Petr Machata
2020-07-14  2:51                     ` Cong Wang
2020-07-14  9:12                       ` Petr Machata
2020-07-07 19:48   ` Cong Wang
2020-07-08  9:19     ` Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 3/5] net: sched: sch_red: Split init and change callbacks Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 4/5] net: sched: sch_red: Add qevents "early_drop" and "mark" Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 5/5] selftests: forwarding: Add a RED test for SW datapath Petr Machata
2020-06-26 22:45 ` [PATCH iproute2-next v1 1/4] uapi: pkt_sched: Add two new RED attributes Petr Machata
2020-06-26 22:45   ` [PATCH iproute2-next v1 2/4] tc: Add helpers to support qevent handling Petr Machata
2020-06-26 22:45   ` [PATCH iproute2-next v1 3/4] man: tc: Describe qevents Petr Machata
2020-06-26 22:45   ` [PATCH iproute2-next v1 4/4] tc: q_red: Add support for qevents "mark" and "early_drop" Petr Machata
2020-06-26 22:56 ` [PATCH net-next v1 0/5] TC: Introduce qevents Stephen Hemminger
2020-06-29 13:21   ` Petr Machata
2020-06-30  0:15 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y2nugepz.fsf@mellanox.com \
    --to=petrm@mellanox$(echo .)com \
    --cc=davem@davemloft$(echo .)net \
    --cc=eric.dumazet@gmail$(echo .)com \
    --cc=idosch@mellanox$(echo .)com \
    --cc=jiri@mellanox$(echo .)com \
    --cc=kuba@kernel$(echo .)org \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=xiyou.wangcong@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox