From: Jamal Hadi Salim <jhs@mojatatu.com>
To: Alex Gartrell <agartrell@fb.com>,
xiyou.wangcong@gmail.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, eric.dumazet@gmail.com,
kernel-team@fb.com, stable@vger.kernel.org
Subject: Re: [PATCH,v2 net] net: sched: validate that class is found in qdisc_tree_decrease_qlen
Date: Tue, 21 Jul 2015 06:04:41 -0400
Message-ID: <55AE1939.105@mojatatu.com>
In-Reply-To: <1437421248-2796139-1-git-send-email-agartrell@fb.com>
On 07/20/15 15:40, Alex Gartrell wrote:
> We have an application that invokes tc to delete the root every time the
> config changes. As a result we stress the cleanup code and were seeing the
> following panic:
>
> crash> bt
> PID: 630839 TASK: ffff8823c990d280 CPU: 14 COMMAND: "tc"
> [... snip ...]
> #8 [ffff8820ceec17a0] page_fault at ffffffff8160a8c2
> [exception RIP: htb_qlen_notify+24]
> RIP: ffffffffa0841718 RSP: ffff8820ceec1858 RFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88241747b400
> RDX: ffff88241747b408 RSI: 0000000000000000 RDI: ffff8811fb27d000
> RBP: ffff8820ceec1868 R8: ffff88120cdeff24 R9: ffff88120cdeff30
> R10: 0000000000000bd4 R11: ffffffffa0840919 R12: ffffffffa0843340
> R13: 0000000000000000 R14: 0000000000000001 R15: ffff8808dae5c2e8
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #9 [...] qdisc_tree_decrease_qlen at ffffffff81565375
> #10 [...] fq_codel_dequeue at ffffffffa084e0a0 [sch_fq_codel]
> #11 [...] fq_codel_reset at ffffffffa084e2f8 [sch_fq_codel]
> #12 [...] qdisc_destroy at ffffffff81560d2d
> #13 [...] htb_destroy_class at ffffffffa08408f8 [sch_htb]
> #14 [...] htb_put at ffffffffa084095c [sch_htb]
> #15 [...] tc_ctl_tclass at ffffffff815645a3
> #16 [...] rtnetlink_rcv_msg at ffffffff81552cb0
> [... snip ...]
>
> To my understanding, the following situation is taking place.
>
> tc_ctl_tclass
> -> htb_delete
> -> class is deleted from clhash
> -> htb_put
> -> qdisc_destroy
> -> fq_codel_reset
=========> this part looks suspicious. Why is reset invoking
a dequeue? Shouldn't a destroy just purge the queue?
> -> fq_codel_dequeue
> -> qdisc_tree_decrease_qlen
> -> cl = htb_get # returns NULL, removed in htb_delete
> -> htb_qlen_notify(sch, NULL) # BOOM
>
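For illustration only, the failure in the trace above can be modeled in
userspace C. This is a sketch under stated assumptions, not kernel code:
every toy_* name is hypothetical, the class table stands in for clhash,
and the guarded walk shows the kind of check the patch subject describes
(skip the notification when the class lookup returns NULL).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct toy_class {
	unsigned int classid;
	int active;	/* set while the class has queued packets */
	int in_hash;	/* cleared by the delete step */
};

#define TOY_NCLASSES 4
static struct toy_class toy_classes[TOY_NCLASSES];

/* Models the lookup step: returns NULL once the class has been deleted. */
static struct toy_class *toy_get(unsigned int classid)
{
	for (size_t i = 0; i < TOY_NCLASSES; i++)
		if (toy_classes[i].in_hash && toy_classes[i].classid == classid)
			return &toy_classes[i];
	return NULL;
}

/* Models the notify callback: it assumes a valid class pointer. */
static void toy_qlen_notify(struct toy_class *cl)
{
	assert(cl != NULL);	/* the unguarded caller would trip this */
	cl->active = 0;
}

/*
 * Guarded walk step: skip the notification when the lookup fails.
 * Returns 1 if the class was notified, 0 if it was skipped.
 */
static int toy_decrease_qlen(unsigned int classid)
{
	struct toy_class *cl = toy_get(classid);

	if (cl == NULL)
		return 0;
	toy_qlen_notify(cl);
	return 1;
}
```

Calling toy_decrease_qlen() for a class whose in_hash flag has been
cleared returns 0 instead of passing NULL to the callback.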
It is worrisome to fix the core code for this. The root cause seems to
be fq_codel. I don't have time, but in general reset would be something like:
struct fq_codel_sched_data *q = qdisc_priv(sch);
qdisc_reset(q)
or something along those lines...
But dequeue semantics certainly don't seem right there.
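To make the distinction concrete, here is a minimal userspace sketch of
the two paths. It is an assumption-laden model, not the fq_codel
implementation: the toy_* types and functions are hypothetical, and the
notifications counter merely stands in for the walk up the qdisc tree.
The dequeue path adjusts accounting and notifies the parent; the reset
path frees the packets directly and notifies nobody.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct toy_pkt {
	struct toy_pkt *next;
};

struct toy_queue {
	struct toy_pkt *head;
	struct toy_pkt *tail;
	unsigned int qlen;
	unsigned int notifications;	/* stands in for parent notifications */
};

/* Append one packet; returns 0 on success, -1 on allocation failure. */
static int toy_enqueue(struct toy_queue *q)
{
	struct toy_pkt *p = malloc(sizeof(*p));

	if (p == NULL)
		return -1;
	p->next = NULL;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->qlen++;
	return 0;
}

/* Dequeue path: removes one packet and notifies the parent. */
static struct toy_pkt *toy_dequeue(struct toy_queue *q)
{
	struct toy_pkt *p = q->head;

	if (p == NULL)
		return NULL;
	assert(q->qlen > 0);
	q->head = p->next;
	if (q->head == NULL)
		q->tail = NULL;
	q->qlen--;
	q->notifications++;	/* side effect that is unwanted during teardown */
	return p;
}

/* Reset path: purge every packet without touching the parent. */
static void toy_reset(struct toy_queue *q)
{
	struct toy_pkt *p = q->head;

	while (p) {
		struct toy_pkt *next = p->next;

		free(p);
		p = next;
	}
	q->head = q->tail = NULL;
	q->qlen = 0;
}
```

In this model toy_reset() leaves the notifications counter unchanged,
so a teardown never reaches a parent that may already be partly
dismantled.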
cheers,
jamal