From: Arjan van de Ven <arjan@linux.intel.com>
To: netdev@vger.kernel.org
Cc: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com,
	davem@davemloft.net, dsahern@kernel.org, edumazet@google.com,
	horms@kernel.org, kuba@kernel.org, linux-kernel@vger.kernel.org,
	pabeni@redhat.com, syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [net?] general protection fault in kernel_sock_shutdown (4)
Date: Fri, 24 Apr 2026 09:47:17 -0700
Message-ID: <20260424164733.356003-1-arjan@linux.intel.com>
In-Reply-To: <69ea344f.a00a0220.17a17.0040.GAE@google.com>
This report was analysed with the help of an automated kernel crash
analysis assistant. The analysis below is tentative and should be
reviewed by a human before any action is taken.
Decoded Backtrace
-----------------
1. kernel_sock_shutdown -- crash site (net/socket.c:3785)

  3783  int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how)
  3784  {
  3785          return READ_ONCE(sock->ops)->shutdown(sock, how);
                /* CRASH: sock->ops is NULL (R12 = 0x0); KASAN traps
                   null-ptr-deref at offset 0x68 = offsetof(proto_ops, shutdown) */
  3786  }

  Register context at crash:
    RBX = 0xffff888058587240  (struct socket *sock)
    R12 = 0x0000000000000000  (sock->ops, loaded from RBX+0x20 -- NULL)
    RDI = 0x0000000000000068  (= NULL + 0x68, address of shutdown fn ptr)
    RBP = 0x0000000000000002  (how = SHUT_RDWR)
2. udp_tunnel_sock_release (net/ipv4/udp_tunnel_core.c:196-202)

  196  void udp_tunnel_sock_release(struct socket *sock)
  197  {
  198          rcu_assign_sk_user_data(sock->sk, NULL);
  199          synchronize_rcu();
  200          kernel_sock_shutdown(sock, SHUT_RDWR);  /* <- calls crash site */
  201          sock_release(sock);
  202  }
3. rxe_release_udp_tunnel inlined (drivers/infiniband/sw/rxe/rxe_net.c:290-294)

  290  static void rxe_release_udp_tunnel(struct socket *sk)
  291  {
  292          if (sk)
  293                  udp_tunnel_sock_release(sk);
  294  }
4. rxe_sock_put (drivers/infiniband/sw/rxe/rxe_net.c:632-643)

  632  static void rxe_sock_put(struct sock *sk,
  633                           void (*set_sk)(struct net *, struct sock *),
  634                           struct net *net)
  635  {
  636          if (refcount_read(&sk->sk_refcnt) > SK_REF_FOR_TUNNEL) {
  637                  __sock_put(sk);
  638          } else {
  639                  rxe_release_udp_tunnel(sk->sk_socket);  /* <- release BEFORE clear */
  640                  sk = NULL;
  641                  set_sk(net, sk);                        /* <- clear AFTER (too late) */
  642          }
  643  }
Caller: rxe_net_del (rxe_net.c:644-666), triggered via:
nldev_dellink -> rxe_dellink -> rxe_net_del -> rxe_sock_put
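
To make the RDI value in frame 1 concrete, here is a minimal user-space
sketch of the address arithmetic. struct mock_proto_ops is a hypothetical
stand-in, not the real struct proto_ops: it only assumes, as the report
states, that shutdown sits at offset 0x68, which on an LP64 target means
13 pointer-sized members come before it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct proto_ops. The real member list is
 * not reproduced; 13 pointer-sized slots are placed before shutdown so
 * that it lands at 13 * 8 = 0x68 on an LP64 target. */
struct mock_proto_ops {
	void *before_shutdown[13];
	int (*shutdown)(void *sock, int how);
};

/* Sanity check of the mock layout, independent of pointer width. */
static_assert(offsetof(struct mock_proto_ops, shutdown) == 13 * sizeof(void *),
	      "shutdown must follow 13 pointer-sized members");

/* Address the CPU has to load from in order to fetch ops->shutdown. */
uintptr_t shutdown_slot_addr(const struct mock_proto_ops *ops)
{
	return (uintptr_t)ops + offsetof(struct mock_proto_ops, shutdown);
}
```

Calling shutdown_slot_addr(NULL) on a 64-bit build yields 0x68, the
value seen in RDI: the load faults on the function-pointer slot itself,
before any shutdown handler runs.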
Tentative Analysis
------------------
sock_release() (net/socket.c:726) sets sock->ops to NULL after calling
ops->release(sock). A NULL sock->ops in kernel_sock_shutdown() therefore
suggests that the socket had already been passed to sock_release() before
this call.
Two independent code paths can release the same UDP tunnel socket stored
in the per-network-namespace rxe_ns_sock structure:

  Path 1 -- namespace teardown (rxe_ns.c, rxe_ns_exit()):
    rcu_assign_pointer(ns_sk->rxe_sk4, NULL);  /* clears pointer FIRST */
    udp_tunnel_sock_release(sk->sk_socket);    /* then releases */

  Path 2 -- RDMA link delete (rxe_net.c, rxe_net_del() -> rxe_sock_put()):
    sk = rxe_ns_pernet_sk4(net);               /* reads pointer (no ownership) */
    rxe_release_udp_tunnel(sk->sk_socket);     /* releases FIRST */
    set_sk(net, NULL);                         /* clears AFTER */

The following TOCTOU (time-of-check to time-of-use) race is possible when
namespace teardown and RDMA link deletion run concurrently. Thread A reads
the pernet pointer, Thread B clears and releases the socket, and Thread A
then releases the same socket through its stale pointer:

  Thread A (rxe_net_del):
    rxe_ns_pernet_sk4() -> sk = X (non-NULL)

  Thread B (rxe_ns_exit):
    rcu_assign_pointer(sk4, NULL)
    udp_tunnel_sock_release(X->sk_socket)
      sock_release(X->sk_socket)
        X->sk_socket->ops = NULL       <- clears ops

  Thread A (rxe_net_del) continues:
    rxe_sock_put(sk=X, ...)
      rxe_release_udp_tunnel(X->sk_socket)
        kernel_sock_shutdown(X->sk_socket, SHUT_RDWR)
          READ_ONCE(sock->ops)->shutdown(...)
          <- CRASH: sock->ops == NULL
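
The interleaving above can be replayed deterministically in user space.
The sketch below is a single-threaded model, not kernel code: every type
and function in it (mock_socket, mock_tunnel_release(), replay_race() and
so on) is a hypothetical stand-in, and the only behaviour taken from the
report is that releasing a socket leaves its ops pointer NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_ops { int (*shutdown)(int how); };
struct mock_socket { const struct mock_ops *ops; };

static int mock_shutdown(int how) { (void)how; return 0; }
static const struct mock_ops live_ops = { .shutdown = mock_shutdown };

/* Stand-in for the shared per-namespace slot (rxe_sk4 in the report). */
static struct mock_socket *pernet_slot;

/* Models sock_release(): per the report, ops is NULL afterwards. */
static void mock_sock_release(struct mock_socket *s) { s->ops = NULL; }

/* Models udp_tunnel_sock_release(). Returns true where the kernel
 * would have dereferenced a NULL ops pointer, i.e. crashed. */
static bool mock_tunnel_release(struct mock_socket *s)
{
	if (!s->ops)
		return true;
	s->ops->shutdown(2 /* SHUT_RDWR */);
	mock_sock_release(s);
	return false;
}

/* Replays the A / B / A interleaving in program order. Returns true
 * when Thread A hits the NULL-ops case. */
bool replay_race(void)
{
	struct mock_socket x = { .ops = &live_ops };
	pernet_slot = &x;

	/* Thread A: reads the pointer, takes no ownership. */
	struct mock_socket *a = pernet_slot;

	/* Thread B: clears the slot first, then releases. */
	struct mock_socket *b = pernet_slot;
	pernet_slot = NULL;
	bool b_crashed = mock_tunnel_release(b);
	assert(!b_crashed);

	/* Thread A continues with its stale copy: release, then clear. */
	bool a_crashed = mock_tunnel_release(a);
	pernet_slot = NULL;

	return a_crashed;
}
```

In this model Thread B's clear-then-release ordering is harmless on its
own; the failure comes from Thread A acting on a pointer it read before
the slot was cleared.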

The bug appears to have been introduced by two commits from March 2026
that added per-network-namespace support to the Soft RoCE (RXE) driver:

  13f2a53c2a71e RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets
  f1327abd6abed RDMA/rxe: Support RDMA link creation and destruction per
                net namespace

Neither commit provides synchronisation between the two teardown paths.

Potential Solution
------------------
Replace the rxe_ns_pernet_sk4() calls in rxe_net_del() (and rxe_notify())
with an atomic exchange that reads and clears the pernet pointer in a
single step, so that only one of the two teardown paths can ever obtain a
non-NULL socket pointer:

  struct sock *rxe_ns_pernet_take_sk4(struct net *net)
  {
          struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);

          return unrcu_pointer(xchg(&ns_sk->rxe_sk4, RCU_INITIALIZER(NULL)));
  }

This only closes the race if rxe_ns_exit() takes the pointer through the
same exchange. Whichever path (rxe_ns_exit or rxe_net_del) wins the xchg
gets the socket and releases it; the loser gets NULL and skips the
release.
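
The take-ownership idea can be illustrated outside the kernel with C11
atomics. This is a sketch under assumptions, not the proposed patch:
atomic_exchange() stands in for the kernel's xchg(), and mock_sock,
pernet_sk4, publish_sk4(), take_sk4() and teardown_both_paths() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct mock_sock { int id; };

/* Stand-in for ns_sk->rxe_sk4: one shared slot per namespace. */
static _Atomic(struct mock_sock *) pernet_sk4;

/* Publish a socket into the slot (what link creation would do). */
void publish_sk4(struct mock_sock *sk)
{
	atomic_store(&pernet_sk4, sk);
}

/* Read and clear the slot in one atomic step. At most one caller
 * per published socket gets a non-NULL result. */
struct mock_sock *take_sk4(void)
{
	return atomic_exchange(&pernet_sk4, (struct mock_sock *)NULL);
}

/* Runs both teardown paths against one published socket and counts
 * how many of them end up owning, and so releasing, the socket. */
int teardown_both_paths(struct mock_sock *sk)
{
	int releases = 0;

	publish_sk4(sk);
	if (take_sk4())		/* e.g. the namespace-exit path */
		releases++;
	if (take_sk4())		/* e.g. the link-delete path */
		releases++;

	assert(releases == 1);
	return releases;
}
```

The two calls run sequentially here, but because the exchange is atomic
the same holds when they run on different threads: the order of the two
calls decides who wins, not whether there is exactly one winner.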
More information
----------------
Oops-Analysis: https://lore.kernel.org/r/69ea344f.a00a0220.17a17.0040.GAE@google.com
Assisted-by: linux-kernel-oops-x86 skill (Claude Sonnet 4.6)