Re: [PATCH net-next v4 04/15] quic: provide family ops for address and protocol

From: Xin Long <lucien.xin@gmail.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: network dev <netdev@vger.kernel.org>,
	quic@lists.linux.dev, davem@davemloft.net,  kuba@kernel.org,
	Eric Dumazet <edumazet@google.com>,
	Simon Horman <horms@kernel.org>,
	 Stefan Metzmacher <metze@samba.org>,
	Moritz Buhl <mbuhl@openbsd.org>,
	Tyler Fanelli <tfanelli@redhat.com>,
	 Pengtao He <hepengtao@xiaomi.com>,
	Thomas Dreibholz <dreibh@simula.no>,
	linux-cifs@vger.kernel.org,  Steve French <smfrench@gmail.com>,
	Namjae Jeon <linkinjeon@kernel.org>,
	 Paulo Alcantara <pc@manguebit.com>, Tom Talpey <tom@talpey.com>,
	kernel-tls-handshake@lists.linux.dev,
	 Chuck Lever <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>,
	 Steve Dickson <steved@redhat.com>,
	Hannes Reinecke <hare@suse.de>,
	Alexander Aring <aahringo@redhat.com>,
	 David Howells <dhowells@redhat.com>,
	Matthieu Baerts <matttbe@kernel.org>,
	 John Ericson <mail@johnericson.me>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	 "D . Wythe" <alibuda@linux.alibaba.com>,
	Jason Baron <jbaron@akamai.com>,
	 illiliti <illiliti@protonmail.com>,
	Sabrina Dubroca <sd@queasysnail.net>,
	 Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
	Daniel Stenberg <daniel@haxx.se>,
	 Andy Gospodarek <andrew.gospodarek@broadcom.com>
Subject: Re: [PATCH net-next v4 04/15] quic: provide family ops for address and protocol
Date: Wed, 5 Nov 2025 20:01:54 -0500
Message-ID: <CADvbK_cG-yAAqUjGMVcmewP1Cc-7HqRLWsn2j_yWu_hmxqP5Eg@mail.gmail.com>
In-Reply-To: <f557c3eb-9177-4e4f-b46e-e83bf938e2b0@redhat.com>

On Tue, Nov 4, 2025 at 5:27 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 10/29/25 3:35 PM, Xin Long wrote:
> > +static int quic_v4_flow_route(struct sock *sk, union quic_addr *da, union quic_addr *sa,
> > +                           struct flowi *fl)
> > +{
> > +     struct flowi4 *fl4;
> > +     struct rtable *rt;
> > +
> > +     if (__sk_dst_check(sk, 0))
> > +             return 1;
> > +
> > +     memset(fl, 0x00, sizeof(*fl));
> > +     fl4 = &fl->u.ip4;
> > +     fl4->saddr = sa->v4.sin_addr.s_addr;
> > +     fl4->fl4_sport = sa->v4.sin_port;
> > +     fl4->daddr = da->v4.sin_addr.s_addr;
> > +     fl4->fl4_dport = da->v4.sin_port;
> > +     fl4->flowi4_proto = IPPROTO_UDP;
> > +     fl4->flowi4_oif = sk->sk_bound_dev_if;
> > +
> > +     fl4->flowi4_scope = ip_sock_rt_scope(sk);
> > +     fl4->flowi4_dscp = inet_sk_dscp(inet_sk(sk));
> > +
> > +     rt = ip_route_output_key(sock_net(sk), fl4);
> > +     if (IS_ERR(rt))
> > +             return PTR_ERR(rt);
> > +
> > +     if (!sa->v4.sin_family) {
>
> The above check is strange. Any special reason to not use
> quic_v4_is_any_addr()?
>
quic_v4_is_any_addr() looks better; I will switch to it.

> > +             sa->v4.sin_family = AF_INET;
> > +             sa->v4.sin_addr.s_addr = fl4->saddr;
> > +     }
> > +     sk_setup_caps(sk, &rt->dst);
> > +     return 0;
> > +}
> > +
> > +static int quic_v6_flow_route(struct sock *sk, union quic_addr *da, union quic_addr *sa,
> > +                           struct flowi *fl)
> > +{
> > +     struct ipv6_pinfo *np = inet6_sk(sk);
> > +     struct ip6_flowlabel *flowlabel;
> > +     struct dst_entry *dst;
> > +     struct flowi6 *fl6;
> > +
> > +     if (__sk_dst_check(sk, np->dst_cookie))
> > +             return 1;
> > +
> > +     memset(fl, 0x00, sizeof(*fl));
> > +     fl6 = &fl->u.ip6;
> > +     fl6->saddr = sa->v6.sin6_addr;
> > +     fl6->fl6_sport = sa->v6.sin6_port;
> > +     fl6->daddr = da->v6.sin6_addr;
> > +     fl6->fl6_dport = da->v6.sin6_port;
> > +     fl6->flowi6_proto = IPPROTO_UDP;
> > +     fl6->flowi6_oif = sk->sk_bound_dev_if;
> > +
> > +     if (inet6_test_bit(SNDFLOW, sk)) {
> > +             fl6->flowlabel = (da->v6.sin6_flowinfo & IPV6_FLOWINFO_MASK);
> > +             if (fl6->flowlabel & IPV6_FLOWLABEL_MASK) {
> > +                     flowlabel = fl6_sock_lookup(sk, fl6->flowlabel);
> > +                     if (IS_ERR(flowlabel))
> > +                             return -EINVAL;
> > +                     fl6_sock_release(flowlabel);
> > +             }
> > +     }
> > +
> > +     dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, NULL);
> > +     if (IS_ERR(dst))
> > +             return PTR_ERR(dst);
> > +
> > +     if (!sa->v6.sin6_family) {
>
> (similar question here)
>
Right, I will make the same change here.

> [...]
> > +static int quic_v4_get_mtu_info(struct sk_buff *skb, u32 *info)
> > +{
> > +     struct icmphdr *hdr;
> > +
> > +     hdr = (struct icmphdr *)(skb_network_header(skb) - sizeof(struct icmphdr));
>
> Note that the above relies on headers being already pulled into the
> linear part. A later patch will do skb_linearize(), but that looks like
> overkill and should hurt performance badly. Instead you should use
> pskb_may_pull() && friends.
This path (the ICMP error path) doesn't need to parse frames or bundled
packets, so yes, we can use pskb_may_pull().

However, in the normal QUIC packet receive path:

- for the short header packet path, the packet format is:

Before decryption:

  UDP hdr | QUIC hdr | conn_id | encrypted text

After decryption:

  UDP hdr | QUIC hdr | conn_id | frame1 hdr | frame1 data | frame2 hdr
   | frame2 data ...

Parsing the frames is hard without linearizing the skb. In addition,
fields in these frame headers are always variable-length integers, which
makes the parsing even more difficult on a buffer that is not
linearized.

- for the long header (handshake) packet path, it is more complex, as
  packets can be bundled like:

  UDP hdr | QUIC hdr1 | encrypted text | QUIC hdr2 | encrypted text |
  ...
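
To illustrate the variable-length integer point, here is a minimal
userspace sketch (not code from this series) of decoding one QUIC varint
as defined in RFC 9000, Section 16. The name varint_decode() is made up
for illustration, and the sketch assumes the bytes are contiguous, which
is exactly what a non-linear skb does not guarantee:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one QUIC variable-length integer from a linear buffer.
 * The two most significant bits of the first byte select the encoded
 * length (1, 2, 4 or 8 bytes); the remaining bits hold the value in
 * network byte order.
 *
 * Returns the number of bytes consumed, or 0 if the buffer is too short.
 * Hypothetical helper for illustration only.
 */
static size_t varint_decode(const uint8_t *buf, size_t len, uint64_t *out)
{
	uint64_t v;
	size_t n, i;

	assert(buf && out);
	if (len < 1)
		return 0;

	n = (size_t)1 << (buf[0] >> 6);	/* 1, 2, 4 or 8 */
	if (len < n)
		return 0;

	v = buf[0] & 0x3f;		/* strip the 2-bit length prefix */
	for (i = 1; i < n; i++)
		v = (v << 8) | buf[i];

	*out = v;
	return n;
}
```

Each field decoded this way advances the parse offset by 1, 2, 4 or 8
bytes, so the position of the next field is only known after the
previous one has been read.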

>
> > +     if (hdr->type == ICMP_DEST_UNREACH && hdr->code == ICMP_FRAG_NEEDED) {
> > +             *info = ntohs(hdr->un.frag.mtu);
> > +             return 0;
> > +     }
> > +
> > +     /* Defer other types' processing to UDP error handler. */
> > +     return 1;
> > +}
> > +
> > +static int quic_v6_get_mtu_info(struct sk_buff *skb, u32 *info)
> > +{
> > +     struct icmp6hdr *hdr;
> > +
> > +     hdr = (struct icmp6hdr *)(skb_network_header(skb) - sizeof(struct icmp6hdr));
> > +     if (hdr->icmp6_type == ICMPV6_PKT_TOOBIG) {
> > +             *info = ntohl(hdr->icmp6_mtu);
> > +             return 0;
> > +     }
> > +
> > +     /* Defer other types' processing to UDP error handler. */
> > +     return 1;
> > +}
> > +
> > +static u8 quic_v4_get_msg_ecn(struct sk_buff *skb)
> > +{
> > +     return (ip_hdr(skb)->tos & INET_ECN_MASK);
> > +}
> > +
> > +static u8 quic_v6_get_msg_ecn(struct sk_buff *skb)
> > +{
> > +     return (ipv6_get_dsfield(ipv6_hdr(skb)) & INET_ECN_MASK);
> > +}
> > +
> > +static int quic_v4_get_user_addr(struct sock *sk, union quic_addr *a, struct sockaddr *addr,
> > +                              int addr_len)
> > +{
> > +     u32 len = sizeof(struct sockaddr_in);
> > +
> > +     if (addr_len < len || addr->sa_family != AF_INET)
> > +             return 1;
> > +     if (ipv4_is_multicast(quic_addr(addr)->v4.sin_addr.s_addr))
> > +             return 1;
> > +     memcpy(a, addr, len);
> > +     return 0;
> > +}
>
> It looks like the above function is not used in this series?!? (well
> it's called by quic_get_user_addr() which in turn is unused).
>
> Perhaps drop from here and add later as needed?
Sure, I will drop:

quic_seq_dump_addr()
quic_get_msg_ecn()
quic_get_user_addr()
quic_get_pref_addr()
quic_set_pref_addr()
quic_set_sk_addr()
quic_set_sk_ecn()

>
> Also the name sounds possibly misleading, I read it as it should copy
> data to user-space and return value could possibly be an errnum.
>
Maybe quic_get_addr_from_user()? I will also return -EINVAL instead
of 1 in the error path.
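
A rough userspace model of what the renamed helper could look like with
the -EINVAL return. This is only a sketch: get_addr_from_user() here is
a hypothetical stand-in that fills a plain struct sockaddr_in rather
than union quic_addr, and the multicast test is open-coded instead of
using ipv4_is_multicast():

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Validate a caller-supplied IPv4 address and copy it into dst.
 * Returns 0 on success or -EINVAL if the length or family is wrong
 * or the address is multicast (224.0.0.0/4).
 * Hypothetical userspace model, not the kernel function.
 */
static int get_addr_from_user(struct sockaddr_in *dst,
			      const struct sockaddr *addr, socklen_t addr_len)
{
	struct sockaddr_in in4;

	assert(dst && addr);
	if (addr_len < sizeof(in4) || addr->sa_family != AF_INET)
		return -EINVAL;

	memcpy(&in4, addr, sizeof(in4));
	if ((ntohl(in4.sin_addr.s_addr) & 0xf0000000u) == 0xe0000000u)
		return -EINVAL;

	*dst = in4;
	return 0;
}
```

Returning a negative errno keeps the helper consistent with the name,
which reads as "copy from the user and report an error code".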

> > +static void quic_v4_get_pref_addr(struct sock *sk, union quic_addr *addr, u8 **pp, u32 *plen)
> > +{
> > +     u8 *p = *pp;
> > +
> > +     memcpy(&addr->v4.sin_addr, p, QUIC_ADDR4_LEN);
> > +     p += QUIC_ADDR4_LEN;
> > +     memcpy(&addr->v4.sin_port, p, QUIC_PORT_LEN);
> > +     p += QUIC_PORT_LEN;
> > +     addr->v4.sin_family = AF_INET;
> > +     /* Skip over IPv6 address and port, not used for AF_INET sockets. */
> > +     p += QUIC_ADDR6_LEN;
> > +     p += QUIC_PORT_LEN;
> > +
> > +     if (!addr->v4.sin_port || quic_v4_is_any_addr(addr) ||
> > +         ipv4_is_multicast(addr->v4.sin_addr.s_addr))
> > +             memset(addr, 0, sizeof(*addr));
> > +     *plen -= (p - *pp);
> > +     *pp = p;
> > +}
>
> Similarly unused?
>
> > +static bool quic_v4_cmp_sk_addr(struct sock *sk, union quic_addr *a, union quic_addr *addr)
> > +{
> > +     if (a->v4.sin_port != addr->v4.sin_port)
> > +             return false;
> > +     if (a->v4.sin_family != addr->v4.sin_family)
> > +             return false;
> > +     if (a->v4.sin_addr.s_addr == htonl(INADDR_ANY) ||
> > +         addr->v4.sin_addr.s_addr == htonl(INADDR_ANY))
> > +             return true;
> > +     return a->v4.sin_addr.s_addr == addr->v4.sin_addr.s_addr;
> > +}
> > +
> > +static bool quic_v6_cmp_sk_addr(struct sock *sk, union quic_addr *a, union quic_addr *addr)
> > +{
> > +     if (a->v4.sin_port != addr->v4.sin_port)
> > +             return false;
> > +
> > +     if (a->sa.sa_family == AF_INET && addr->sa.sa_family == AF_INET) {
> > +             if (a->v4.sin_addr.s_addr == htonl(INADDR_ANY) ||
> > +                 addr->v4.sin_addr.s_addr == htonl(INADDR_ANY))
> > +                     return true;
> > +             return a->v4.sin_addr.s_addr == addr->v4.sin_addr.s_addr;
> > +     }
> > +
> > +     if (a->sa.sa_family != addr->sa.sa_family) {
> > +             if (ipv6_only_sock(sk))
> > +                     return false;
> > +             if (a->sa.sa_family == AF_INET6 && ipv6_addr_any(&a->v6.sin6_addr))
> > +                     return true;
> > +             if (a->sa.sa_family == AF_INET && addr->sa.sa_family == AF_INET6 &&
>
> Below, this code assumes that sa_family is either AF_INET or AF_INET6.
> If such an assumption holds, you should use it here, too, and drop the
> 'addr->sa.sa_family == AF_INET6' condition.
I agree.

>
> > +                 ipv6_addr_v4mapped(&addr->v6.sin6_addr) &&
> > +                 addr->v6.sin6_addr.s6_addr32[3] == a->v4.sin_addr.s_addr)
> > +                     return true;
> > +             if (addr->sa.sa_family == AF_INET && a->sa.sa_family == AF_INET6 &&
> > +                 ipv6_addr_v4mapped(&a->v6.sin6_addr) &&
> > +                 a->v6.sin6_addr.s6_addr32[3] == addr->v4.sin_addr.s_addr)
> > +                     return true;
>
> Note that this branch does not handle the
> 'ipv6_addr_any(&addr->v6.sin6_addr)' case.
>
Will add a helper:

static bool quic_v4_match_v6_addr(union quic_addr *a4, union quic_addr *a6)
{
        if (ipv6_addr_any(&a6->v6.sin6_addr))
                return true;
        if (ipv6_addr_v4mapped(&a6->v6.sin6_addr) &&
            a6->v6.sin6_addr.s6_addr32[3] == a4->v4.sin_addr.s_addr)
                return true;
        return false;
}

and change this branch to:

                if (a->sa.sa_family == AF_INET)
                        return quic_v4_match_v6_addr(a, addr);
                return quic_v4_match_v6_addr(addr, a);
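
For reference, the same matching logic can be modelled in userspace.
This is a sketch under assumptions: it uses plain struct sockaddr_in and
struct sockaddr_in6 in place of union quic_addr, and the POSIX
IN6_IS_ADDR_* macros in place of the kernel helpers. The names
v4_match_v6_addr() and match_str() are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Return true if the IPv4 address a4 is covered by the IPv6 address a6:
 * either a6 is the wildcard (::), or a6 is a v4-mapped address
 * (::ffff:a.b.c.d) carrying the same IPv4 address.
 */
static bool v4_match_v6_addr(const struct sockaddr_in *a4,
			     const struct sockaddr_in6 *a6)
{
	assert(a4->sin_family == AF_INET && a6->sin6_family == AF_INET6);

	if (IN6_IS_ADDR_UNSPECIFIED(&a6->sin6_addr))
		return true;
	if (IN6_IS_ADDR_V4MAPPED(&a6->sin6_addr) &&
	    memcmp(&a6->sin6_addr.s6_addr[12], &a4->sin_addr, 4) == 0)
		return true;
	return false;
}

/* Convenience wrapper taking textual addresses; false on parse failure. */
static bool match_str(const char *v4, const char *v6)
{
	struct sockaddr_in a4 = { .sin_family = AF_INET };
	struct sockaddr_in6 a6 = { .sin6_family = AF_INET6 };

	if (inet_pton(AF_INET, v4, &a4.sin_addr) != 1 ||
	    inet_pton(AF_INET6, v6, &a6.sin6_addr) != 1)
		return false;
	return v4_match_v6_addr(&a4, &a6);
}
```

With this shape the wildcard check is applied to the IPv6 side no matter
which of the two arguments it arrives in, which covers the case missed
above.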

Thanks.
