From: Taehee Yoo <ap420073@gmail•com>
To: Eric Dumazet <eric.dumazet@gmail•com>,
davem@davemloft•net, kuba@kernel•org, dsahern@kernel•org,
yoshfuji@linux-ipv6•org, netdev@vger•kernel.org,
xiyou.wangcong@gmail•com
Subject: Re: [PATCH v2 net] mld: fix panic in mld_newpack()
Date: Sat, 22 May 2021 01:11:57 +0900 [thread overview]
Message-ID: <905ec725-3d38-e110-e1ef-1300da80c2de@gmail.com> (raw)
In-Reply-To: <00922799-f302-b17b-2f2c-032c4a562315@gmail.com>
On 5/21/21 2:20 AM, Eric Dumazet wrote:
Hi Eric,
Thank you for your review!
>
>
> On 5/16/21 4:44 PM, Taehee Yoo wrote:
>> mld_newpack() doesn't allow to allocate high order page,
>> only order-0 allocation is allowed.
>> If headroom size is too large, a kernel panic could occur in skb_put().
>>
>> Test commands:
>> ip netns del A
>> ip netns del B
>> ip netns add A
>> ip netns add B
>> ip link add veth0 type veth peer name veth1
>> ip link set veth0 netns A
>> ip link set veth1 netns B
>>
>> ip netns exec A ip link set lo up
>> ip netns exec A ip link set veth0 up
>> ip netns exec A ip -6 a a 2001:db8:0::1/64 dev veth0
>> ip netns exec B ip link set lo up
>> ip netns exec B ip link set veth1 up
>> ip netns exec B ip -6 a a 2001:db8:0::2/64 dev veth1
>> for i in {1..99}
>> do
>> let A=$i-1
>> ip netns exec A ip link add ip6gre$i type ip6gre \
>> local 2001:db8:$A::1 remote 2001:db8:$A::2 encaplimit 100
>> ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev ip6gre$i
>> ip netns exec A ip link set ip6gre$i up
>>
>> ip netns exec B ip link add ip6gre$i type ip6gre \
>> local 2001:db8:$A::2 remote 2001:db8:$A::1 encaplimit 100
>> ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev ip6gre$i
>> ip netns exec B ip link set ip6gre$i up
>> done
>>
>> Splat looks like:
>> kernel BUG at net/core/skbuff.c:110!
>> invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
>> CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.12.0+ #891
>> Workqueue: ipv6_addrconf addrconf_dad_work
>> RIP: 0010:skb_panic+0x15d/0x15f
>> Code: 92 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 00 ae 79 83
>> 41 57 41 56 41 55 48 8b 54 24 a6 26 f9 ff <0f> 0b 48 8b 6c 24 20 89
>> 34 24 e8 4a 4e 92 fe 8b 34 24 48 c7 c1 20
>> RSP: 0018:ffff88810091f820 EFLAGS: 00010282
>> RAX: 0000000000000089 RBX: ffff8881086e9000 RCX: 0000000000000000
>> RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed1020123efb
>> RBP: ffff888005f6eac0 R08: ffffed1022fc0031 R09: ffffed1022fc0031
>> R10: ffff888117e00187 R11: ffffed1022fc0030 R12: 0000000000000028
>> R13: ffff888008284eb0 R14: 0000000000000ed8 R15: 0000000000000ec0
>> FS: 0000000000000000(0000) GS:ffff888117c00000(0000)
>> knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f8b801c5640 CR3: 0000000033c2c006 CR4: 00000000003706f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>> ? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600
>> ? ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600
>> skb_put.cold.104+0x22/0x22
>> ip6_mc_hdr.isra.26.constprop.46+0x12a/0x600
>> ? rcu_read_lock_sched_held+0x91/0xc0
>> mld_newpack+0x398/0x8f0
>> ? ip6_mc_hdr.isra.26.constprop.46+0x600/0x600
>> ? lock_contended+0xc40/0xc40
>> add_grhead.isra.33+0x280/0x380
>> add_grec+0x5ca/0xff0
>> ? mld_sendpack+0xf40/0xf40
>> ? lock_downgrade+0x690/0x690
>> mld_send_initial_cr.part.34+0xb9/0x180
>> ipv6_mc_dad_complete+0x15d/0x1b0
>> addrconf_dad_completed+0x8d2/0xbb0
>> ? lock_downgrade+0x690/0x690
>> ? addrconf_rs_timer+0x660/0x660
>> ? addrconf_dad_work+0x73c/0x10e0
>> addrconf_dad_work+0x73c/0x10e0
>>
>> Allowing high order page allocation could fix this problem.
>>
>> Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations")
>> Signed-off-by: Taehee Yoo <ap420073@gmail•com>
>> ---
>>
>> v1 -> v2:
>> - Wait for mld-sleepable patchset to be merged.
>>
>> net/ipv6/mcast.c | 3 ---
>> 1 file changed, 3 deletions(-)
>>
>> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
>> index 0d59efb6b49e..d36ef9d25e73 100644
>> --- a/net/ipv6/mcast.c
>> +++ b/net/ipv6/mcast.c
>> @@ -1745,10 +1745,7 @@ static struct sk_buff *mld_newpack(struct
inet6_dev *idev, unsigned int mtu)
>> IPV6_TLV_PADN, 0 };
>>
>> /* we assume size > sizeof(ra) here */
>> - /* limit our allocations to order-0 page */
>> - size = min_t(int, size, SKB_MAX_ORDER(0, 0));
>> skb = sock_alloc_send_skb(sk, size, 1, &err);
>> -
>> if (!skb)
>> return NULL;
>>
>>
>
> Sorry for being late to the party.
>
> This is forcing high-order allocations for devices with big mtu,
> even for non pathological cases.
>
> (lo has MTU 65535, so we attempt order-5 allocations :/ )
>
> I think this could be smarter [1], addressing both the common case
> and syzbot-like abuses.
>
> XMIT_RECURSION_LIMIT being 8, I doubt the repro makes any sense in
real world.
> Maybe it is time to limit netdev chains to 8 as well.
>
> Also, veth MTU being 1500, I fail to understand how your script
> was crashing your host.
The root problem is that dev->need_headroom can be abnormally big.
Because when the tunneling interface begins to send a packet, it
recalculates dev->needed_headroom.
int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
struct flowi6 *fl6, int encap_limit, __u32 *pmtu,
__u8 proto)
{
...
max_headroom = LL_RESERVED_SPACE(dst->dev) + sizeof(struct ipv6hdr)
+ dst->header_len + t->hlen;
if (max_headroom > dev->needed_headroom)
dev->needed_headroom = max_headroom;
Every time an interface sends a packet, this function is called and it
increases dev->needed_headroom.
The first time, ip6gre1 will recalculate its own needed_headroom value.
Then, when ip6gre2 is used and if a dst interface is ip6gre1, it
increases ip6gre->needed_headroom.
This logic will be repeated until it uses ip6gre100.
Actually, most packets will be drop because of XMIT_RECURSION_LIMIT.
But updating dev->needed_headroom will be updated regardless of the
success of sending packets.
So, the needed_headroom value can be too big.
I think your suggestion can fix this problem more smartly because It can
avoid high-order allocation by mtu and it allows high-order if hlen or
tlen is too big.
So, I think it deserves to be used.
And If we deny setting ->needed_headrom to a too big value at the
ip6_tnl_xmit(), it's safer I think.
How do you think about it?
>
> [1]
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index
d36ef9d25e73cb14eed45701acafcaf78e08451e..420bf2038e810173fc6f86622378b5151463f204
100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -1738,13 +1738,16 @@ static struct sk_buff *mld_newpack(struct
inet6_dev *idev, unsigned int mtu)
> const struct in6_addr *saddr;
> int hlen = LL_RESERVED_SPACE(dev);
> int tlen = dev->needed_tailroom;
> - unsigned int size = mtu + hlen + tlen;
> + unsigned int size;
> int err;
> u8 ra[8] = { IPPROTO_ICMPV6, 0,
> IPV6_TLV_ROUTERALERT, 2, 0, 0,
> IPV6_TLV_PADN, 0 };
>
> - /* we assume size > sizeof(ra) here */
> + /* We assume size > sizeof(ra) here.
> + * Also try to not allocate high-order pages for big MTU.
> + */
> + size = min_t(int, mtu, PAGE_SIZE/2) + hlen + tlen;
> skb = sock_alloc_send_skb(sk, size, 1, &err);
> if (!skb)
> return NULL;
prev parent reply other threads:[~2021-05-21 16:12 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-16 14:44 [PATCH v2 net] mld: fix panic in mld_newpack() Taehee Yoo
2021-05-17 21:10 ` patchwork-bot+netdevbpf
2021-05-20 17:20 ` Eric Dumazet
2021-05-21 16:11 ` Taehee Yoo [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=905ec725-3d38-e110-e1ef-1300da80c2de@gmail.com \
--to=ap420073@gmail$(echo .)com \
--cc=davem@davemloft$(echo .)net \
--cc=dsahern@kernel$(echo .)org \
--cc=eric.dumazet@gmail$(echo .)com \
--cc=kuba@kernel$(echo .)org \
--cc=netdev@vger$(echo .)kernel.org \
--cc=xiyou.wangcong@gmail$(echo .)com \
--cc=yoshfuji@linux-ipv6$(echo .)org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox