* [PATCH v2 net] mld: fix panic in mld_newpack()
@ 2021-05-16 14:44 Taehee Yoo
2021-05-17 21:10 ` patchwork-bot+netdevbpf
2021-05-20 17:20 ` Eric Dumazet
0 siblings, 2 replies; 4+ messages in thread
From: Taehee Yoo @ 2021-05-16 14:44 UTC (permalink / raw)
To: davem, kuba, dsahern, yoshfuji, netdev, eric.dumazet,
xiyou.wangcong
Cc: ap420073
mld_newpack() doesn't allow to allocate high order page,
only order-0 allocation is allowed.
If headroom size is too large, a kernel panic could occur in skb_put().
Test commands:
ip netns del A
ip netns del B
ip netns add A
ip netns add B
ip link add veth0 type veth peer name veth1
ip link set veth0 netns A
ip link set veth1 netns B
ip netns exec A ip link set lo up
ip netns exec A ip link set veth0 up
ip netns exec A ip -6 a a 2001:db8:0::1/64 dev veth0
ip netns exec B ip link set lo up
ip netns exec B ip link set veth1 up
ip netns exec B ip -6 a a 2001:db8:0::2/64 dev veth1
for i in {1..99}
do
let A=$i-1
ip netns exec A ip link add ip6gre$i type ip6gre \
local 2001:db8:$A::1 remote 2001:db8:$A::2 encaplimit 100
ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev ip6gre$i
ip netns exec A ip link set ip6gre$i up
ip netns exec B ip link add ip6gre$i type ip6gre \
local 2001:db8:$A::2 remote 2001:db8:$A::1 encaplimit 100
ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev ip6gre$i
ip netns exec B ip link set ip6gre$i up
done
Splat looks like:
kernel BUG at net/core/skbuff.c:110!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.12.0+ #891
Workqueue: ipv6_addrconf addrconf_dad_work
RIP: 0010:skb_panic+0x15d/0x15f
Code: 92 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 00 ae 79 83
41 57 41 56 41 55 48 8b 54 24 a6 26 f9 ff <0f> 0b 48 8b 6c 24 20 89
34 24 e8 4a 4e 92 fe 8b 34 24 48 c7 c1 20
RSP: 0018:ffff88810091f820 EFLAGS: 00010282
RAX: 0000000000000089 RBX: ffff8881086e9000 RCX: 0000000000000000
RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed1020123efb
RBP: ffff888005f6eac0 R08: ffffed1022fc0031 R09: ffffed1022fc0031
R10: ffff888117e00187 R11: ffffed1022fc0030 R12: 0000000000000028
R13: ffff888008284eb0 R14: 0000000000000ed8 R15: 0000000000000ec0
FS: 0000000000000000(0000) GS:ffff888117c00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8b801c5640 CR3: 0000000033c2c006 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600
? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600
skb_put.cold.104+0x22/0x22
ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600
? rcu_read_lock_sched_held+0x91/0xc0
mld_newpack+0x398/0x8f0
? ip6_mc_hdr.isra.26.constprop.46+0x600/0x600
? lock_contended+0xc40/0xc40
add_grhead.isra.33+0x280/0x380
add_grec+0x5ca/0xff0
? mld_sendpack+0xf40/0xf40
? lock_downgrade+0x690/0x690
mld_send_initial_cr.part.34+0xb9/0x180
ipv6_mc_dad_complete+0x15d/0x1b0
addrconf_dad_completed+0x8d2/0xbb0
? lock_downgrade+0x690/0x690
? addrconf_rs_timer+0x660/0x660
? addrconf_dad_work+0x73c/0x10e0
addrconf_dad_work+0x73c/0x10e0
Allowing high order page allocation could fix this problem.
Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations")
Signed-off-by: Taehee Yoo <ap420073@gmail•com>
---
v1 -> v2:
- Wait for mld-sleepable patchset to be merged.
net/ipv6/mcast.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 0d59efb6b49e..d36ef9d25e73 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1745,10 +1745,7 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
IPV6_TLV_PADN, 0 };
/* we assume size > sizeof(ra) here */
- /* limit our allocations to order-0 page */
- size = min_t(int, size, SKB_MAX_ORDER(0, 0));
skb = sock_alloc_send_skb(sk, size, 1, &err);
-
if (!skb)
return NULL;
--
2.17.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH v2 net] mld: fix panic in mld_newpack() 2021-05-16 14:44 [PATCH v2 net] mld: fix panic in mld_newpack() Taehee Yoo @ 2021-05-17 21:10 ` patchwork-bot+netdevbpf 2021-05-20 17:20 ` Eric Dumazet 1 sibling, 0 replies; 4+ messages in thread From: patchwork-bot+netdevbpf @ 2021-05-17 21:10 UTC (permalink / raw) To: Taehee Yoo Cc: davem, kuba, dsahern, yoshfuji, netdev, eric.dumazet, xiyou.wangcong Hello: This patch was applied to netdev/net.git (refs/heads/master): On Sun, 16 May 2021 14:44:42 +0000 you wrote: > mld_newpack() doesn't allow to allocate high order page, > only order-0 allocation is allowed. > If headroom size is too large, a kernel panic could occur in skb_put(). > > Test commands: > ip netns del A > ip netns del B > ip netns add A > ip netns add B > ip link add veth0 type veth peer name veth1 > ip link set veth0 netns A > ip link set veth1 netns B > > [...] Here is the summary with links: - [v2,net] mld: fix panic in mld_newpack() https://git.kernel.org/netdev/net/c/020ef930b826 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v2 net] mld: fix panic in mld_newpack() 2021-05-16 14:44 [PATCH v2 net] mld: fix panic in mld_newpack() Taehee Yoo 2021-05-17 21:10 ` patchwork-bot+netdevbpf @ 2021-05-20 17:20 ` Eric Dumazet 2021-05-21 16:11 ` Taehee Yoo 1 sibling, 1 reply; 4+ messages in thread From: Eric Dumazet @ 2021-05-20 17:20 UTC (permalink / raw) To: Taehee Yoo, davem, kuba, dsahern, yoshfuji, netdev, eric.dumazet, xiyou.wangcong On 5/16/21 4:44 PM, Taehee Yoo wrote: > mld_newpack() doesn't allow to allocate high order page, > only order-0 allocation is allowed. > If headroom size is too large, a kernel panic could occur in skb_put(). > > Test commands: > ip netns del A > ip netns del B > ip netns add A > ip netns add B > ip link add veth0 type veth peer name veth1 > ip link set veth0 netns A > ip link set veth1 netns B > > ip netns exec A ip link set lo up > ip netns exec A ip link set veth0 up > ip netns exec A ip -6 a a 2001:db8:0::1/64 dev veth0 > ip netns exec B ip link set lo up > ip netns exec B ip link set veth1 up > ip netns exec B ip -6 a a 2001:db8:0::2/64 dev veth1 > for i in {1..99} > do > let A=$i-1 > ip netns exec A ip link add ip6gre$i type ip6gre \ > local 2001:db8:$A::1 remote 2001:db8:$A::2 encaplimit 100 > ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev ip6gre$i > ip netns exec A ip link set ip6gre$i up > > ip netns exec B ip link add ip6gre$i type ip6gre \ > local 2001:db8:$A::2 remote 2001:db8:$A::1 encaplimit 100 > ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev ip6gre$i > ip netns exec B ip link set ip6gre$i up > done > > Splat looks like: > kernel BUG at net/core/skbuff.c:110! > invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI > CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.12.0+ #891 > Workqueue: ipv6_addrconf addrconf_dad_work > RIP: 0010:skb_panic+0x15d/0x15f > Code: 92 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 00 ae 79 83 > 41 57 41 56 41 55 48 8b 54 24 a6 26 f9 ff <0f> 0b 48 8b 6c 24 20 89 > 34 24 e8 4a 4e 92 fe 8b 34 24 48 c7 c1 20 > RSP: 0018:ffff88810091f820 EFLAGS: 00010282 > RAX: 0000000000000089 RBX: ffff8881086e9000 RCX: 0000000000000000 > RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed1020123efb > RBP: ffff888005f6eac0 R08: ffffed1022fc0031 R09: ffffed1022fc0031 > R10: ffff888117e00187 R11: ffffed1022fc0030 R12: 0000000000000028 > R13: ffff888008284eb0 R14: 0000000000000ed8 R15: 0000000000000ec0 > FS: 0000000000000000(0000) GS:ffff888117c00000(0000) > knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f8b801c5640 CR3: 0000000033c2c006 CR4: 00000000003706f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > ? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600 > ? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600 > skb_put.cold.104+0x22/0x22 > ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600 > ? rcu_read_lock_sched_held+0x91/0xc0 > mld_newpack+0x398/0x8f0 > ? ip6_mc_hdr.isra.26.constprop.46+0x600/0x600 > ? lock_contended+0xc40/0xc40 > add_grhead.isra.33+0x280/0x380 > add_grec+0x5ca/0xff0 > ? mld_sendpack+0xf40/0xf40 > ? lock_downgrade+0x690/0x690 > mld_send_initial_cr.part.34+0xb9/0x180 > ipv6_mc_dad_complete+0x15d/0x1b0 > addrconf_dad_completed+0x8d2/0xbb0 > ? lock_downgrade+0x690/0x690 > ? addrconf_rs_timer+0x660/0x660 > ? addrconf_dad_work+0x73c/0x10e0 > addrconf_dad_work+0x73c/0x10e0 > > Allowing high order page allocation could fix this problem. > > Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations") > Signed-off-by: Taehee Yoo <ap420073@gmail•com> > --- > > v1 -> v2: > - Wait for mld-sleepable patchset to be merged. > > net/ipv6/mcast.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c > index 0d59efb6b49e..d36ef9d25e73 100644 > --- a/net/ipv6/mcast.c > +++ b/net/ipv6/mcast.c > @@ -1745,10 +1745,7 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu) > IPV6_TLV_PADN, 0 }; > > /* we assume size > sizeof(ra) here */ > - /* limit our allocations to order-0 page */ > - size = min_t(int, size, SKB_MAX_ORDER(0, 0)); > skb = sock_alloc_send_skb(sk, size, 1, &err); > - > if (!skb) > return NULL; > > Sorry for being late to the party. This is forcing high-order allocations for devices with big mtu, even for non pathological cases. (lo has MTU 65535, so we attempt order-5 allocations :/ ) I think this could be smarter [1], addressing both the common case and syzbot-like abuses. XMIT_RECURSION_LIMIT being 8, I doubt the repro makes any sense in real world. Maybe it is time to limit netdev chains to 8 as well. Also, veth MTU being 1500, I fail to understand how your script was crashing your host. [1] diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index d36ef9d25e73cb14eed45701acafcaf78e08451e..420bf2038e810173fc6f86622378b5151463f204 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -1738,13 +1738,16 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu) const struct in6_addr *saddr; int hlen = LL_RESERVED_SPACE(dev); int tlen = dev->needed_tailroom; - unsigned int size = mtu + hlen + tlen; + unsigned int size; int err; u8 ra[8] = { IPPROTO_ICMPV6, 0, IPV6_TLV_ROUTERALERT, 2, 0, 0, IPV6_TLV_PADN, 0 }; - /* we assume size > sizeof(ra) here */ + /* We assume size > sizeof(ra) here. + * Also try to not allocate high-order pages for big MTU. + */ + size = min_t(int, mtu, PAGE_SIZE/2) + hlen + tlen; skb = sock_alloc_send_skb(sk, size, 1, &err); if (!skb) return NULL; ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v2 net] mld: fix panic in mld_newpack() 2021-05-20 17:20 ` Eric Dumazet @ 2021-05-21 16:11 ` Taehee Yoo 0 siblings, 0 replies; 4+ messages in thread From: Taehee Yoo @ 2021-05-21 16:11 UTC (permalink / raw) To: Eric Dumazet, davem, kuba, dsahern, yoshfuji, netdev, xiyou.wangcong On 5/21/21 2:20 AM, Eric Dumazet wrote: Hi Eric, Thank you for your review! > > > On 5/16/21 4:44 PM, Taehee Yoo wrote: >> mld_newpack() doesn't allow to allocate high order page, >> only order-0 allocation is allowed. >> If headroom size is too large, a kernel panic could occur in skb_put(). >> >> Test commands: >> ip netns del A >> ip netns del B >> ip netns add A >> ip netns add B >> ip link add veth0 type veth peer name veth1 >> ip link set veth0 netns A >> ip link set veth1 netns B >> >> ip netns exec A ip link set lo up >> ip netns exec A ip link set veth0 up >> ip netns exec A ip -6 a a 2001:db8:0::1/64 dev veth0 >> ip netns exec B ip link set lo up >> ip netns exec B ip link set veth1 up >> ip netns exec B ip -6 a a 2001:db8:0::2/64 dev veth1 >> for i in {1..99} >> do >> let A=$i-1 >> ip netns exec A ip link add ip6gre$i type ip6gre \ >> local 2001:db8:$A::1 remote 2001:db8:$A::2 encaplimit 100 >> ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev ip6gre$i >> ip netns exec A ip link set ip6gre$i up >> >> ip netns exec B ip link add ip6gre$i type ip6gre \ >> local 2001:db8:$A::2 remote 2001:db8:$A::1 encaplimit 100 >> ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev ip6gre$i >> ip netns exec B ip link set ip6gre$i up >> done >> >> Splat looks like: >> kernel BUG at net/core/skbuff.c:110! >> invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI >> CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.12.0+ #891 >> Workqueue: ipv6_addrconf addrconf_dad_work >> RIP: 0010:skb_panic+0x15d/0x15f >> Code: 92 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 00 ae 79 83 >> 41 57 41 56 41 55 48 8b 54 24 a6 26 f9 ff <0f> 0b 48 8b 6c 24 20 89 >> 34 24 e8 4a 4e 92 fe 8b 34 24 48 c7 c1 20 >> RSP: 0018:ffff88810091f820 EFLAGS: 00010282 >> RAX: 0000000000000089 RBX: ffff8881086e9000 RCX: 0000000000000000 >> RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed1020123efb >> RBP: ffff888005f6eac0 R08: ffffed1022fc0031 R09: ffffed1022fc0031 >> R10: ffff888117e00187 R11: ffffed1022fc0030 R12: 0000000000000028 >> R13: ffff888008284eb0 R14: 0000000000000ed8 R15: 0000000000000ec0 >> FS: 0000000000000000(0000) GS:ffff888117c00000(0000) >> knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: 00007f8b801c5640 CR3: 0000000033c2c006 CR4: 00000000003706f0 >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> Call Trace: >> ? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600 >> ? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600 >> skb_put.cold.104+0x22/0x22 >> ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600 >> ? rcu_read_lock_sched_held+0x91/0xc0 >> mld_newpack+0x398/0x8f0 >> ? ip6_mc_hdr.isra.26.constprop.46+0x600/0x600 >> ? lock_contended+0xc40/0xc40 >> add_grhead.isra.33+0x280/0x380 >> add_grec+0x5ca/0xff0 >> ? mld_sendpack+0xf40/0xf40 >> ? lock_downgrade+0x690/0x690 >> mld_send_initial_cr.part.34+0xb9/0x180 >> ipv6_mc_dad_complete+0x15d/0x1b0 >> addrconf_dad_completed+0x8d2/0xbb0 >> ? lock_downgrade+0x690/0x690 >> ? addrconf_rs_timer+0x660/0x660 >> ? addrconf_dad_work+0x73c/0x10e0 >> addrconf_dad_work+0x73c/0x10e0 >> >> Allowing high order page allocation could fix this problem. >> >> Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations") >> Signed-off-by: Taehee Yoo <ap420073@gmail•com> >> --- >> >> v1 -> v2: >> - Wait for mld-sleepable patchset to be merged. >> >> net/ipv6/mcast.c | 3 --- >> 1 file changed, 3 deletions(-) >> >> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c >> index 0d59efb6b49e..d36ef9d25e73 100644 >> --- a/net/ipv6/mcast.c >> +++ b/net/ipv6/mcast.c >> @@ -1745,10 +1745,7 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu) >> IPV6_TLV_PADN, 0 }; >> >> /* we assume size > sizeof(ra) here */ >> - /* limit our allocations to order-0 page */ >> - size = min_t(int, size, SKB_MAX_ORDER(0, 0)); >> skb = sock_alloc_send_skb(sk, size, 1, &err); >> - >> if (!skb) >> return NULL; >> >> > > Sorry for being late to the party. > > This is forcing high-order allocations for devices with big mtu, > even for non pathological cases. > > (lo has MTU 65535, so we attempt order-5 allocations :/ ) > > I think this could be smarter [1], addressing both the common case > and syzbot-like abuses. > > XMIT_RECURSION_LIMIT being 8, I doubt the repro makes any sense in real world. > Maybe it is time to limit netdev chains to 8 as well. > > Also, veth MTU being 1500, I fail to understand how your script > was crashing your host. The root problem is that dev->need_headroom can be abnormally big. Because when the tunneling interface begins to send a packet, it recalculates dev->needed_headroom. int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, struct flowi6 *fl6, int encap_limit, __u32 *pmtu, __u8 proto) { ... max_headroom = LL_RESERVED_SPACE(dst->dev) + sizeof(struct ipv6hdr) + dst->header_len + t->hlen; if (max_headroom > dev->needed_headroom) dev->needed_headroom = max_headroom; Every time an interface sends a packet, this function is called and it increases dev->needed_headroom. The first time, ip6gre1 will recalculate its own needed_headroom value. Then, when ip6gre2 is used and if a dst interface is ip6gre1, it increases ip6gre->needed_headroom. This logic will be repeated until it uses ip6gre100. Actually, most packets will be drop because of XMIT_RECURSION_LIMIT. But updating dev->needed_headroom will be updated regardless of the success of sending packets. So, the needed_headroom value can be too big. I think your suggestion can fix this problem more smartly because It can avoid high-order allocation by mtu and it allows high-order if hlen or tlen is too big. So, I think it deserves to be used. And If we deny setting ->needed_headrom to a too big value at the ip6_tnl_xmit(), it's safer I think. How do you think about it? > > [1] > diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c > index d36ef9d25e73cb14eed45701acafcaf78e08451e..420bf2038e810173fc6f86622378b5151463f204 100644 > --- a/net/ipv6/mcast.c > +++ b/net/ipv6/mcast.c > @@ -1738,13 +1738,16 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu) > const struct in6_addr *saddr; > int hlen = LL_RESERVED_SPACE(dev); > int tlen = dev->needed_tailroom; > - unsigned int size = mtu + hlen + tlen; > + unsigned int size; > int err; > u8 ra[8] = { IPPROTO_ICMPV6, 0, > IPV6_TLV_ROUTERALERT, 2, 0, 0, > IPV6_TLV_PADN, 0 }; > > - /* we assume size > sizeof(ra) here */ > + /* We assume size > sizeof(ra) here. > + * Also try to not allocate high-order pages for big MTU. > + */ > + size = min_t(int, mtu, PAGE_SIZE/2) + hlen + tlen; > skb = sock_alloc_send_skb(sk, size, 1, &err); > if (!skb) > return NULL; ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2021-05-21 16:12 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-05-16 14:44 [PATCH v2 net] mld: fix panic in mld_newpack() Taehee Yoo 2021-05-17 21:10 ` patchwork-bot+netdevbpf 2021-05-20 17:20 ` Eric Dumazet 2021-05-21 16:11 ` Taehee Yoo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox