From: Paolo Abeni <pabeni@redhat.com>
To: Xin Long <lucien.xin@gmail.com>,
	network dev <netdev@vger.kernel.org>,
	quic@lists.linux.dev
Cc: davem@davemloft.net, kuba@kernel.org,
	Eric Dumazet <edumazet@google.com>,
	Simon Horman <horms@kernel.org>,
	Stefan Metzmacher <metze@samba.org>,
	Moritz Buhl <mbuhl@openbsd.org>,
	Tyler Fanelli <tfanelli@redhat.com>,
	Pengtao He <hepengtao@xiaomi.com>,
	Thomas Dreibholz <dreibh@simula.no>,
	linux-cifs@vger.kernel.org, Steve French <smfrench@gmail.com>,
	Namjae Jeon <linkinjeon@kernel.org>,
	Paulo Alcantara <pc@manguebit.com>, Tom Talpey <tom@talpey.com>,
	kernel-tls-handshake@lists.linux.dev,
	Chuck Lever <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>,
	Steve Dickson <steved@redhat.com>, Hannes Reinecke <hare@suse.de>,
	Alexander Aring <aahringo@redhat.com>,
	David Howells <dhowells@redhat.com>,
	Matthieu Baerts <matttbe@kernel.org>,
	John Ericson <mail@johnericson.me>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	"D . Wythe" <alibuda@linux.alibaba.com>,
	Jason Baron <jbaron@akamai.com>,
	illiliti <illiliti@protonmail.com>,
	Sabrina Dubroca <sd@queasysnail.net>,
	Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
	Daniel Stenberg <daniel@haxx.se>,
	Andy Gospodarek <andrew.gospodarek@broadcom.com>
Subject: Re: [PATCH net-next v8 08/15] quic: add path management
Date: Thu, 29 Jan 2026 17:20:31 +0100
Message-ID: <a7b64f16-5ca9-4344-b7e8-c0d4508e43cc@redhat.com>
In-Reply-To: <2367f77787fa0a29913f48c91087397dcc82c35f.1769439073.git.lucien.xin@gmail.com>

On 1/26/26 3:51 PM, Xin Long wrote:
> This patch introduces 'quic_path_group' for managing paths, represented
> by 'struct quic_path'. A connection may use two paths simultaneously
> for connection migration.
> 
> Each path is associated with a UDP tunnel socket (sk), and a single
> UDP tunnel socket can be related to multiple paths from different sockets.
> These UDP tunnel sockets are wrapped in 'quic_udp_sock' structures and
> stored in a hash table.
> 
> It includes mechanisms to bind and unbind paths, detect alternative paths
> for migration, and swap paths to support seamless transition between
> networks.
> 
> - quic_path_bind(): Bind a path to a port and associate it with a UDP sk.
> 
> - quic_path_unbind(): Unbind a path from a port and disassociate it from a
>   UDP sk.
> 
> - quic_path_swap(): Swap two paths to facilitate connection migration.
> 
> - quic_path_detect_alt(): Determine if a packet is using an alternative
>   path, used for connection migration.
> 
> It also integrates basic support for Packetization Layer Path MTU
> Discovery (PLPMTUD), using PING frames and ICMP feedback to adjust path
> MTU and handle probe confirmation or resets during routing changes.
> 
> - quic_path_pl_recv(): state transition and pmtu update after the probe
>   packet is acked.
> 
> - quic_path_pl_toobig(): state transition and pmtu update after
>   receiving a toobig or needfrag icmp packet.
> 
> - quic_path_pl_send(): state transition and pmtu update after sending a
>   probe packet.
> 
> - quic_path_pl_reset(): restart the probing when path routing changes.
> 
> - quic_path_pl_confirm(): check if probe packet gets acked.
> 
> Signed-off-by: Tyler Fanelli <tfanelli@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
> v3:
>   - Fix annotation in quic_udp_sock_lookup() (noted by Paolo).
>   - Use inet_sk_get_local_port_range() instead of
>     inet_get_local_port_range() (suggested by Paolo).
>   - Adjust global UDP tunnel socket hashtable operations for the new
>     hashtable type.
>   - Delete quic_workqueue; use system_wq for UDP tunnel socket destroy.
> v4:
>   - Cache UDP tunnel socket pointer and its source address in struct
>     quic_path for RCU-protected lookup/access.
>   - Return -EAGAIN instead of -EINVAL in quic_path_bind() when UDP
>     socket is being released in workqueue.
>   - Move udp_tunnel_sock_release() out of the mutex_lock to avoid a
>     warning of lockdep in quic_udp_sock_put_work().
>   - Introduce quic_wq for UDP socket release work, so all pending works
>     can be flushed before destroying the hashtable in quic_exit().
> v5:
>   - Rename quic_path_free() to quic_path_unbind() (suggested by Paolo).
>   - Remove the 'serv' member from struct quic_path_group, since
>     quic_is_serv() defined in a previous patch now uses
>     sk->sk_max_ack_backlog for server-side detection.
>   - Use quic_ktime_get_us() to set skb_cb->time, as RTT is measured
>     in microseconds and jiffies_to_usecs() is not accurate enough.
> v6:
>   - Do not reset transport_header for QUIC in quic_udp_rcv(), allowing
>     removal of udph_offset and enabling access to the UDP header via
>     udp_hdr(); Pull skb->data in quic_udp_rcv() to allow access to the
>     QUIC header via skb->data.
> v7:
>   - Pass udp sk to quic_path_rcv() and move the call to skb_linearize()
>     and skb_set_owner_sk_safe() to .quic_path_rcv().
>   - Delete the call to skb_linearize() and skb_set_owner_sk_safe() from
>     quic_udp_err(), as it should not change skb in .encap_err_lookup()
>     (noted by AI review).
> v8:
>   - Remove indirect quic_path_rcv and late call quic_packet_rcv()
>     directly via extern (noted by Paolo).
>   - Add a comment in quic_udp_rcv() clarifying it must return 0.
>   - Add a comment in quic_udp_sock_put() clarifying the UDP socket
>     may be freed in atomic RX context during connection migration.
>   - Reorder some quic_path_group members to reduce struct size.
> ---
>  net/quic/Makefile   |   2 +-
>  net/quic/path.c     | 520 ++++++++++++++++++++++++++++++++++++++++++++
>  net/quic/path.h     | 170 +++++++++++++++
>  net/quic/protocol.c |  11 +
>  net/quic/socket.c   |   3 +
>  net/quic/socket.h   |   7 +
>  6 files changed, 712 insertions(+), 1 deletion(-)
>  create mode 100644 net/quic/path.c
>  create mode 100644 net/quic/path.h
> 
> diff --git a/net/quic/Makefile b/net/quic/Makefile
> index eee7501588d3..1565fb5cef9d 100644
> --- a/net/quic/Makefile
> +++ b/net/quic/Makefile
> @@ -5,4 +5,4 @@
>  
>  obj-$(CONFIG_IP_QUIC) += quic.o
>  
> -quic-y := common.o family.o protocol.o socket.o stream.o connid.o
> +quic-y := common.o family.o protocol.o socket.o stream.o connid.o path.o
> diff --git a/net/quic/path.c b/net/quic/path.c
> new file mode 100644
> index 000000000000..9556607a009e
> --- /dev/null
> +++ b/net/quic/path.c
> @@ -0,0 +1,520 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/* QUIC kernel implementation
> + * (C) Copyright Red Hat Corp. 2023
> + *
> + * This file is part of the QUIC kernel implementation
> + *
> + * Initialization/cleanup for QUIC protocol support.
> + *
> + * Written or modified by:
> + *    Xin Long <lucien.xin@gmail.com>
> + */
> +
> +#include <net/udp_tunnel.h>
> +#include <linux/quic.h>
> +
> +#include "common.h"
> +#include "family.h"
> +#include "path.h"
> +
> +static int quic_udp_rcv(struct sock *sk, struct sk_buff *skb)
> +{
> +	memset(skb->cb, 0, sizeof(skb->cb));
> +	QUIC_SKB_CB(skb)->seqno = -1;
> +	QUIC_SKB_CB(skb)->time = quic_ktime_get_us();
> +
> +	skb_pull(skb, sizeof(struct udphdr));
> +	skb_dst_force(skb);
> +	kfree_skb(skb);
> +	return 0; /* .encap_rcv must return 0 if skb was either consumed or dropped. */
> +}
> +
> +static int quic_udp_err(struct sock *sk, struct sk_buff *skb)
> +{
> +	return 0;
> +}
> +
> +static void quic_udp_sock_put_work(struct work_struct *work)
> +{
> +	struct quic_udp_sock *us = container_of(work, struct quic_udp_sock, work);
> +	struct quic_uhash_head *head;
> +	struct sock *sk = us->sk;
> +
> +	/* Hold the sock to safely access it in quic_udp_sock_lookup() even after
> +	 * udp_tunnel_sock_release(). The release must occur before __hlist_del()
> +	 * so a new UDP tunnel socket can be created for the same address and port
> +	 * if quic_udp_sock_lookup() fails to find one.
> +	 *
> +	 * Note: udp_tunnel_sock_release() cannot be called under the mutex due to
> +	 * some lockdep warnings.
> +	 */
> +	sock_hold(sk);
> +	udp_tunnel_sock_release(sk->sk_socket);
> +
> +	head = quic_udp_sock_head(sock_net(sk), ntohs(us->addr.v4.sin_port));
> +	mutex_lock(&head->lock);
> +	__hlist_del(&us->node);
> +	mutex_unlock(&head->lock);
> +
> +	sock_put(sk);
> +	kfree(us);
> +}
> +
> +static struct quic_udp_sock *quic_udp_sock_create(struct sock *sk, union quic_addr *a)
> +{
> +	struct udp_tunnel_sock_cfg tuncfg = {};
> +	struct udp_port_cfg udp_conf = {};
> +	struct net *net = sock_net(sk);
> +	struct quic_uhash_head *head;
> +	struct quic_udp_sock *us;
> +	struct socket *sock;
> +
> +	us = kzalloc(sizeof(*us), GFP_KERNEL);
> +	if (!us)
> +		return NULL;
> +
> +	quic_udp_conf_init(sk, &udp_conf, a);
> +	if (udp_sock_create(net, &udp_conf, &sock)) {
> +		pr_debug("%s: failed to create udp sock\n", __func__);
> +		kfree(us);
> +		return NULL;
> +	}
> +
> +	tuncfg.encap_type = 1;
> +	tuncfg.encap_rcv = quic_udp_rcv;
> +	tuncfg.encap_err_lookup = quic_udp_err;
> +	setup_udp_tunnel_sock(net, sock, &tuncfg);

You may need to adjust UDP_MAX_TUNNEL_TYPES in udp_offload.c. You
could check by running a kernel with QUIC enabled and geneve, vxlan,
FOU and xfrm all disabled.
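
To illustrate the concern, here is a rough userspace sketch, not kernel
code. It assumes, from my reading of net/ipv4/udp_offload.c, that
UDP_MAX_TUNNEL_TYPES is a sum over the enabled geneve, vxlan, FOU and
xfrm options with the weights shown; please verify that against the
tree. struct tunnel_cfg and udp_max_tunnel_types() are hypothetical
names used only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for IS_ENABLED(CONFIG_GENEVE) and friends:
 * each field is 1 if the option is built, 0 otherwise.
 */
struct tunnel_cfg {
	int geneve;
	int vxlan;
	int fou;
	int xfrm;
};

/* Assumed shape of the UDP_MAX_TUNNEL_TYPES computation: only the four
 * options above contribute, QUIC does not appear in the sum.
 */
static int udp_max_tunnel_types(const struct tunnel_cfg *c)
{
	assert(c != NULL);
	return c->geneve + c->vxlan * 2 + c->fou * 2 + c->xfrm * 2;
}
```

With only QUIC enabled every term is zero, so the computed limit is 0.
Whether that actually matters for a tunnel socket set up as in this
patch is what the test kernel should tell.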

> +
> +	refcount_set(&us->refcnt, 1);
> +	us->sk = sock->sk;
> +	memcpy(&us->addr, a, sizeof(*a));
> +	us->bind_ifindex = sk->sk_bound_dev_if;
> +
> +	head = quic_udp_sock_head(net, ntohs(a->v4.sin_port));
> +	hlist_add_head(&us->node, &head->head);
> +	INIT_WORK(&us->work, quic_udp_sock_put_work);

It is unclear to me whether the QUIC UDP socket lookup will be done
locklessly in a future series?
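
For reference, this is the kind of lockless lookup I have in mind,
written as a minimal userspace sketch with C11 atomics. It is not taken
from the series: struct udp_sock_entry, bucket_add(), bucket_lookup()
and entry_get_unless_zero() are hypothetical names. In the kernel I
would expect the equivalent to be built on hlist_for_each_entry_rcu(),
refcount_inc_not_zero() and an RCU-deferred free; the sketch never
frees entries, so it sidesteps reclamation entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct quic_udp_sock. */
struct udp_sock_entry {
	unsigned short port;
	atomic_int refcnt;			/* 0 means "being released" */
	struct udp_sock_entry *_Atomic next;
};

/* Take a reference unless the count already dropped to zero,
 * in the spirit of refcount_inc_not_zero().
 */
static bool entry_get_unless_zero(struct udp_sock_entry *e)
{
	int old = atomic_load_explicit(&e->refcnt, memory_order_relaxed);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak_explicit(&e->refcnt, &old,
							  old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

/* Writer side: publish a fully initialised entry at the bucket head.
 * Writers are assumed to be serialised by a lock that is not shown.
 */
static void bucket_add(struct udp_sock_entry *_Atomic *head,
		       struct udp_sock_entry *e)
{
	struct udp_sock_entry *first;

	first = atomic_load_explicit(head, memory_order_relaxed);
	atomic_store_explicit(&e->next, first, memory_order_relaxed);
	atomic_store_explicit(head, e, memory_order_release);
}

/* Reader side: walk the bucket without taking any lock and skip
 * entries whose refcount already hit zero.
 */
static struct udp_sock_entry *
bucket_lookup(struct udp_sock_entry *_Atomic *head, unsigned short port)
{
	struct udp_sock_entry *e;

	for (e = atomic_load_explicit(head, memory_order_acquire); e;
	     e = atomic_load_explicit(&e->next, memory_order_acquire)) {
		if (e->port == port && entry_get_unless_zero(e))
			return e;
	}
	return NULL;
}
```

The point of the refcount check is that a reader can race with the
release work: an entry that is still linked but already at zero must be
treated as absent, so the caller falls back to creating a new socket.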

/P


