From: Daniel Borkmann <daniel@iogearbox•net>
To: Roi Dayan <roid@mellanox•com>,
Cong Wang <xiyou.wangcong@gmail•com>,
netdev@vger•kernel.org
Cc: jiri@mellanox•com, John Fastabend <john.fastabend@gmail•com>
Subject: Re: [Patch net-next] net_sched: move the empty tp check from ->destroy() to ->delete()
Date: Thu, 24 Nov 2016 18:18:40 +0100
Message-ID: <583720F0.7090606@iogearbox.net>
In-Reply-To: <58370558.9070004@iogearbox.net>

On 11/24/2016 04:20 PM, Daniel Borkmann wrote:
> On 11/24/2016 12:01 PM, Roi Dayan wrote:
>> On 24/11/2016 12:14, Daniel Borkmann wrote:
>>> On 11/24/2016 09:29 AM, Roi Dayan wrote:
>>>> Hi,
>>>>
>>>> I'm testing this patch with KASAN enabled and got into a new kernel crash I didn't hit before.
>>>>
>>>> [ 1860.725065] ==================================================================
>>>> [ 1860.733893] BUG: KASAN: use-after-free in __netif_receive_skb_core+0x1ebe/0x29a0 at addr ffff880a68b04028
>>>> [ 1860.745415] Read of size 8 by task CPU 0/KVM/5334
>>>> [ 1860.751368] CPU: 8 PID: 5334 Comm: CPU 0/KVM Tainted: G O 4.9.0-rc3+ #18
>
> (Btw, your kernel is tainted with an out-of-tree module? Anything relevant?)
>
>>>> [ 1860.760547] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 07/01/2015
>>>> [ 1860.768036] Call Trace:
>>>> [ 1860.771307] [<ffffffffa9b6dc42>] dump_stack+0x63/0x81
>>>> [ 1860.777167] [<ffffffffa95fb751>] kasan_object_err+0x21/0x70
>>>> [ 1860.783826] [<ffffffffa95fb9dd>] kasan_report_error+0x1ed/0x4e0
>>>> [ 1860.790640] [<ffffffffa9b9b841>] ? csum_partial+0x11/0x20
>>>> [ 1860.796871] [<ffffffffaa44a6b9>] ? csum_partial_ext+0x9/0x10
>>>> [ 1860.803571] [<ffffffffaa453155>] ? __skb_checksum+0x115/0x8d0
>>>> [ 1860.810370] [<ffffffffa95fbe81>] __asan_report_load8_noabort+0x61/0x70
>>>> [ 1860.818263] [<ffffffffaa49c3fe>] ? __netif_receive_skb_core+0x1ebe/0x29a0
>>>> [ 1860.826215] [<ffffffffaa49c3fe>] __netif_receive_skb_core+0x1ebe/0x29a0
>>>> [ 1860.833991] [<ffffffffaa49a540>] ? netdev_info+0x100/0x100
>>>> [ 1860.840529] [<ffffffffaa671792>] ? udp4_gro_receive+0x802/0x1090
>>>> [ 1860.847783] [<ffffffffa9bb9a08>] ? find_next_bit+0x18/0x20
>>>> [ 1860.854126] [<ffffffffaa49cf04>] __netif_receive_skb+0x24/0x150
>>>> [ 1860.861695] [<ffffffffaa49d0d1>] netif_receive_skb_internal+0xa1/0x1d0
>>>> [ 1860.869366] [<ffffffffaa49d030>] ? __netif_receive_skb+0x150/0x150
>>>> [ 1860.876464] [<ffffffffaa49f7e9>] ? dev_gro_receive+0x969/0x1660
>>>> [ 1860.883924] [<ffffffffaa4a0e1f>] napi_gro_receive+0x1df/0x300
>>>> [ 1860.890744] [<ffffffffc02e885d>] mlx5e_handle_rx_cqe_rep+0x83d/0xd30 [mlx5_core]
>>>>
>>>> checking with gdb
>>>>
>>>> (gdb) l *(__netif_receive_skb_core+0x1ebe)
>>>> 0xffffffff8249c3fe is in __netif_receive_skb_core (net/core/dev.c:3937).
>>>> 3932 *pt_prev = NULL;
>>>> 3933 }
>>>> 3934
>>>> 3935 qdisc_skb_cb(skb)->pkt_len = skb->len;
>>>> 3936 skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
>>>> 3937 qdisc_bstats_cpu_update(cl->q, skb);
>>>> 3938
>>>> 3939 switch (tc_classify(skb, cl, &cl_res, false)) {
>>>> 3940 case TC_ACT_OK:
>>>> 3941 case TC_ACT_RECLASSIFY:
>>>
>>> Can you elaborate some more on your test-case? Adding/dropping ingress qdisc with
>>> some classifier on it in a loop while traffic goes through?
>>
>> I first delete the qdisc ingress from the relevant interface.
>> I start traffic on it, then I add the qdisc ingress to the relevant interface and start adding tc flower rules to match the traffic.
>
> Ok, strange, qdisc_destroy() calls into ops->destroy(), where ingress
> drops its entire chain via tcf_destroy_chain(), so that will be NULL
> eventually. The tps are freed by call_rcu() as well as qdisc itself
> later on via qdisc_rcu_free(), where it frees per-cpu bstats as well.
> Outstanding readers should either bail out due to if (!cl) or can still
> process the chain until read section ends, but during that time, cl->q
> resp. bstats should be good. Do you happen to know what's at address
> ffff880a68b04028? I was wondering wrt call_rcu() vs call_rcu_bh(), but
> at least on ingress (netif_receive_skb_internal()) we hold rcu_read_lock()
> here. The KASAN report is reliably happening at this location, right?

I tried to reproduce this on my physical machine on top of Cong's patch, but
have had no luck hitting the above so far. I have a KASAN-enabled kernel with
pktgen hitting ingress while an ingress qdisc + flower filter rules are
added/destroyed in a loop. Hmm, do you have a kernel config (in particular
the RCU settings) you could share?
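
To make the lifetime argument from the quoted paragraph concrete, here is a
minimal single-threaded userspace sketch of the expected ordering. It is only
a toy model under stated assumptions: every toy_* name is hypothetical, the
"grace period" is simplified to "no reader is active", and objects are marked
as freed instead of being passed to free() so the check itself stays safe.
It is not the kernel's RCU or tc code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins only; these are NOT the kernel's Qdisc / tcf_proto types. */
struct toy_qdisc { bool freed; unsigned long packets; };
struct toy_tp    { struct toy_qdisc *q; };

static struct toy_tp *toy_chain;          /* models the ingress filter chain head */
static int toy_readers;                   /* active read-side sections */
static struct toy_qdisc *toy_pending[8];  /* deferred "frees" */
static int toy_npending;

static void toy_read_lock(void) { toy_readers++; }

/* Run deferred frees; here "free" only sets a flag. */
static void toy_run_callbacks(void)
{
	for (int i = 0; i < toy_npending; i++)
		toy_pending[i]->freed = true;
	toy_npending = 0;
}

/* Simplification: the grace period ends when the last reader leaves. */
static void toy_read_unlock(void)
{
	assert(toy_readers > 0);
	if (--toy_readers == 0)
		toy_run_callbacks();
}

/* Defer the free while any reader is active, otherwise do it right away. */
static void toy_call_rcu(struct toy_qdisc *q)
{
	if (toy_readers == 0) {
		q->freed = true;
		return;
	}
	assert(toy_npending < 8);
	toy_pending[toy_npending++] = q;
}

/* Returns 0 if the ordering described in the mail holds in this model. */
static int toy_demo(void)
{
	struct toy_qdisc q = { false, 0 };
	struct toy_tp tp = { &q };

	toy_chain = &tp;

	toy_read_lock();
	struct toy_tp *cl = toy_chain;   /* reader samples the chain head */

	toy_chain = NULL;                /* teardown unpublishes the chain ... */
	toy_call_rcu(&q);                /* ... and defers freeing the qdisc */

	if (q.freed)
		return 1;                /* this would be the use-after-free */
	cl->q->packets++;                /* cl->q is still valid for this reader */
	toy_read_unlock();               /* last reader gone: deferred free runs */

	if (!q.freed)
		return 2;
	if (toy_chain != NULL)
		return 3;                /* new readers must bail out on !cl */
	return 0;
}

#ifdef TOY_RCU_DEMO_MAIN
int main(void)
{
	printf("toy_demo() = %d\n", toy_demo());
	return 0;
}
#endif
```

Building with -DTOY_RCU_DEMO_MAIN gives a standalone program. A nonzero
return from toy_demo() would correspond to the free running before the
reader has left its read section, which is the ordering the KASAN report
suggests and the one that should not be possible here.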