public inbox for quic@lists.linux.dev 
* Re: [PATCH net-next v2 00/15] net: introduce QUIC infrastructure and core subcomponents
       [not found]     ` <CADvbK_e9sNbvHSCNuvetOCFY5OQPG99tmZLW=odcRzcN9xK8rQ@mail.gmail.com>
@ 2025-08-27  4:45       ` John Ericson
  2025-08-27  4:52         ` Separate sockets for separate connections John Ericson
       [not found]         ` <9d4a3c5f-8b4b-4057-b550-e9158cbbc8bf@app.fastmail.com>
  0 siblings, 2 replies; 4+ messages in thread
From: John Ericson @ 2025-08-27  4:45 UTC (permalink / raw)
  To: quic; +Cc: Xin Long, mbuhl, draft-lxin-quic-socket-apis

On Tue, Aug 26, 2025, at 5:48 PM, Xin Long wrote:
> Hi, John,
> 
> Feel free to create a thread on quic@lists.linux.dev for this.
> 
> Thanks.

Kicking off the new Linux QUIC dev mailing list with this, as requested.

(The last email in netdev is
https://lore.kernel.org/netdev/CADvbK_e9sNbvHSCNuvetOCFY5OQPG99tmZLW=odcRzcN9xK8rQ@mail.gmail.com/,
for reference.)

> On Sun, Aug 24, 2025 at 1:57 PM Xin Long <lucien.xin@gmail.com> wrote:
> >
> > On Sat, Aug 23, 2025 at 11:21 AM John Ericson <mail@johnericson.me> wrote:
> > >
> > > (Note: This is an interface more than implementation question ---
> > > > apologies in advance if this is not the right place to ask. I
> > > originally sent this message to [0] about the IETF internet draft
> > > [1], but then I realized that is just an alias for the draft
> > > authors, and not a public mailing list, so I figured this would be
> > > better in order to have something in the public record.)
> > >
> > > ---
> > >
> > > I was surprised to see that (if I understand correctly) in the
> > > current design, all communication over one connection must happen
> > > with the same socket, and instead stream ids are the sole
> > > mechanism to distinguish between different streams (e.g. for
> > > sending and receiving).
> > >
> > > This does work, but it is bad for application programming which
> > > wants to take advantage of separate streams while being
> > > transport-agnostic. For example, it would be very nice to run an
> > > arbitrary program with stdout and stderr hooked up to separate
> > > QUIC streams. This can be elegantly accomplished if there is an
> > > option to create a fresh socket / file descriptor which is just
> > > > associated with a single stream. Then "regular" send/recv, or
> > > even read/write, can be used with multiple streams.
> > >
> > > I see that the SCTP socket interface has sctp_peeloff [2] for this
> > > purpose. Could something similar be included in this
> > > specification?

> > Hi, John,
> >
> > That is a bit different. In SCTP, sctp_peeloff() detaches an
> > association/connection from a one-to-many socket and returns it as a
> > new socket. It does not peel off a stream. Stream send/receive
> > operations in SCTP are actually quite similar to how QUIC handles
> > streams in the proposed QUIC socket API.

OK fair enough. sctp_peeloff() was the closest prior art I could find,
but I don't know much about SCTP. Rest assured, I did have the QUIC
semantics in mind. E.g. closing one of these QUIC per-stream peeled off
sockets should close just the stream in question, not the entire
connection.

> > For QUIC, supporting 'stream peeloff' might mean creating a new
> > socket type that carries a stream ID and maps its sendmsg/recvmsg to
> > the 'parent' QUIC socket.

Yes, exactly.

> > But there are details to sort out, like whether the 'parent-child
> > relationship' should be maintained.

What do you mean by this? I assume the answer is that it should be
maintained? e.g. if the connection is closed, then any child per-stream
sockets are also invalidated and must be closed.

> > We also need to consider whether this is worth implementing in the
> > kernel, or if a  similar API could be provided in libquic.

So this is sort of the crux of my argument. If it is in userland, then
any application that wants to act per-stream needs to know about QUIC.
But if it is in the kernel, just a tiny bit of QUIC-aware glue code is
needed to plug together QUIC-agnostic software, by passing stream sockets
to that software. (You could do it by passing pipes and a little userland
man-in-the-middle using *quic_sendmsg and quic_recvmsg*, of course, but
those extra context switches and copies are rather lousy.)
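
A minimal sketch of the glue-code idea, under loud assumptions: no
peeled-off QUIC stream sockets are used here. Two AF_UNIX socket pairs
stand in for them, and the helper name run_split_streams is invented for
illustration. The point it shows is only that the child program needs no
knowledge of what kind of socket sits behind its stdout and stderr
(Unix-only, since socket descriptors are handed to the child directly).

```python
import socket
import subprocess
import sys


def read_all(sock):
    """Read from sock until the peer side is fully closed."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            return b"".join(chunks)
        chunks.append(data)


def run_split_streams():
    # Stand-ins for two per-stream sockets (assumption: AF_UNIX pairs,
    # not real QUIC stream sockets).
    out_local, out_child = socket.socketpair()
    err_local, err_child = socket.socketpair()

    # The child is transport-agnostic: it only writes to fds 1 and 2.
    child_code = (
        "import sys; sys.stdout.write('to stdout'); "
        "sys.stderr.write('to stderr')"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdout=out_child.fileno(),
        stderr=err_child.fileno(),
    )
    # Drop the parent's copies so EOF arrives once the child exits.
    out_child.close()
    err_child.close()
    proc.wait()

    out = read_all(out_local)
    err = read_all(err_local)
    out_local.close()
    err_local.close()
    return out, err


if __name__ == "__main__":
    print(run_split_streams())
```

With real per-stream sockets, only the two socketpair() calls would
change; the child and the fd plumbing would stay the same.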

For what it's worth, I would go further in fact and say that this
"stream peeloff" system call should not be supported by QUIC alone.
It is very nice today how much code can be agnostic to TCP vs unix
domain sockets, for example. I would ideally want the same thing to be
true with QUIC too, via an "extended unix domain socket" that would
replicate the QUIC state machine(s) just as regular unix domain sockets
replicate the TCP state machine.

I bring up such an "extended unix domain socket" not to indulge in scope
creep, but just to point out that a good litmus test for a new socket
interface is that multiple domains could meaningfully support it, and
that litmus test is met in this case.

Cheers,

John


* Separate sockets for separate connections
  2025-08-27  4:45       ` [PATCH net-next v2 00/15] net: introduce QUIC infrastructure and core subcomponents John Ericson
@ 2025-08-27  4:52         ` John Ericson
       [not found]         ` <9d4a3c5f-8b4b-4057-b550-e9158cbbc8bf@app.fastmail.com>
  1 sibling, 0 replies; 4+ messages in thread
From: John Ericson @ 2025-08-27  4:52 UTC (permalink / raw)
  To: quic; +Cc: Xin Long, mbuhl, draft-lxin-quic-socket-apis

Aye, I did everything I meant to do email-wise but change the subject line to
something more appropriate. Let me just reply to myself right away doing that
before there are more messages.

John



* Re: Separate sockets for separate connections
       [not found]         ` <9d4a3c5f-8b4b-4057-b550-e9158cbbc8bf@app.fastmail.com>
@ 2025-08-30  0:42           ` Xin Long
  2025-09-10 20:50             ` John Ericson
  0 siblings, 1 reply; 4+ messages in thread
From: Xin Long @ 2025-08-30  0:42 UTC (permalink / raw)
  To: John Ericson; +Cc: quic, mbuhl, Stefan Metzmacher, draft-lxin-quic-socket-apis

On Wed, Aug 27, 2025 at 12:49 AM John Ericson <mail@johnericson.me> wrote:
>
> Aye, I did everything I meant to do email-wise but change the subject line to
> something more appropriate. Let me just reply to myself right away doing that
> before there are more messages.
>
Thanks for opening this thread with a new subject!

>
> On Wed, Aug 27, 2025, at 12:45 AM, John Ericson wrote:
>
> On Tue, Aug 26, 2025, at 5:48 PM, Xin Long wrote:
> > Hi, John,
> >
> > Feel free to create a thread on quic@lists.linux.dev for this.
> >
> > Thanks.
>
> Kicking off the new Linux QUIC dev mailing list with this, as requested.
>
> (The last email in netdev is
> https://lore.kernel.org/netdev/CADvbK_e9sNbvHSCNuvetOCFY5OQPG99tmZLW=odcRzcN9xK8rQ@mail.gmail.com/,
> for reference.)
>
> > On Sun, Aug 24, 2025 at 1:57 PM Xin Long <lucien.xin@gmail.com> wrote:
> > >
> > > On Sat, Aug 23, 2025 at 11:21 AM John Ericson <mail@johnericson.me> wrote:
> > > >
> > > > (Note: This is an interface more than implementation question ---
> > > > > apologies in advance if this is not the right place to ask. I
> > > > originally sent this message to [0] about the IETF internet draft
> > > > [1], but then I realized that is just an alias for the draft
> > > > authors, and not a public mailing list, so I figured this would be
> > > > better in order to have something in the public record.)
> > > >
> > > > ---
> > > >
> > > > I was surprised to see that (if I understand correctly) in the
> > > > current design, all communication over one connection must happen
> > > > with the same socket, and instead stream ids are the sole
> > > > mechanism to distinguish between different streams (e.g. for
> > > > sending and receiving).
> > > >
> > > > This does work, but it is bad for application programming which
> > > > wants to take advantage of separate streams while being
> > > > transport-agnostic. For example, it would be very nice to run an
> > > > arbitrary program with stdout and stderr hooked up to separate
> > > > QUIC streams. This can be elegantly accomplished if there is an
> > > > option to create a fresh socket / file descriptor which is just
> > > > > associated with a single stream. Then "regular" send/recv, or
> > > > even read/write, can be used with multiple streams.
> > > >
> > > > I see that the SCTP socket interface has sctp_peeloff [2] for this
> > > > purpose. Could something similar be included in this
> > > > specification?
>
> > > Hi, John,
> > >
> > > That is a bit different. In SCTP, sctp_peeloff() detaches an
> > > association/connection from a one-to-many socket and returns it as a
> > > new socket. It does not peel off a stream. Stream send/receive
> > > operations in SCTP are actually quite similar to how QUIC handles
> > > streams in the proposed QUIC socket API.
>
> OK fair enough. sctp_peeloff() was the closest prior art I could find,
> but I don't know much about SCTP. Rest assured, I did have the QUIC
> semantics in mind. E.g. closing one of these QUIC per-stream peeled off
> sockets should close just the stream in question, not the entire
> connection.
>
I wrote some code to explore this:
https://github.com/lxin/quic/pull/53

- A stream can be peeled off from a parent/connection socket using
  getsockopt(QUIC_SOCKOPT_STREAM_PEELOFF) with a stream_id, similar to
  SCTP's connection peeloff.

- For stream sockets, in addition to send(), recv(), and close(), support
  for poll() and shutdown() is also implemented. Note that close() and
  shutdown() send a FIN on the sending side, issue a STOP_SENDING on the
  receiving side, or both for bidirectional streams, as applicable.
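
The close() behaviour described above can be sketched as a small decision
function. This is an illustration, not code from the PR: the function
names are invented, and the only facts relied on are the stream ID type
bits from RFC 9000 section 2.1 (bit 0x1 set means server-initiated, bit
0x2 set means unidirectional) and the FIN / STOP_SENDING rule stated in
the list above.

```python
def stream_directions(stream_id, is_server):
    """Return (can_send, can_recv) for the local endpoint."""
    unidirectional = bool(stream_id & 0x2)
    initiated_by_server = bool(stream_id & 0x1)
    if not unidirectional:
        return True, True
    # A unidirectional stream carries data from its initiator only.
    locally_initiated = initiated_by_server == is_server
    return locally_initiated, not locally_initiated


def close_signals(stream_id, is_server):
    """Signals a close() on the stream socket would emit in this model."""
    can_send, can_recv = stream_directions(stream_id, is_server)
    signals = []
    if can_send:
        signals.append("FIN")           # end of our sending side
    if can_recv:
        signals.append("STOP_SENDING")  # ask the peer to stop sending
    return signals


if __name__ == "__main__":
    for sid in (0, 1, 2, 3):
        print(sid, close_signals(sid, is_server=False))
```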

There's also a sample test:
https://github.com/lxin/quic/blob/stream-peeloff/tests/peeloff_test.c

- Sender: Opens a stream with getsockopt(QUIC_SOCKOPT_STREAM_OPEN), peels
  it off with getsockopt(QUIC_SOCKOPT_STREAM_PEELOFF), and sends data via
  the new file descriptor.

- Receiver: Detects stream creation through a QUIC_EVENT_STREAM_UPDATE
  event, peels off the stream with getsockopt(QUIC_SOCKOPT_STREAM_PEELOFF),
  and receives data via the new file descriptor.
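
The sender/receiver flow above can be modelled in memory as follows. This
is a toy model only, not the kernel interface: Endpoint, StreamHandle,
open_stream, peel_off and the "STREAM_UPDATE" tuple are hypothetical names
standing in for the getsockopt() calls and the QUIC_EVENT_STREAM_UPDATE
event, and the model assumes the receiver learns of a stream when its
first data arrives.

```python
from collections import deque


class Endpoint:
    """Toy stand-in for one side's connection socket."""

    def __init__(self):
        self.peer = None
        self.events = deque()   # pending stream notifications
        self.inbound = {}       # stream_id -> deque of received chunks
        self._next_id = 0

    def open_stream(self):
        stream_id = self._next_id
        self._next_id += 4      # keeps the low two type bits unchanged
        return stream_id

    def peel_off(self, stream_id):
        return StreamHandle(self, stream_id)


class StreamHandle:
    """Toy stand-in for the peeled-off per-stream file descriptor."""

    def __init__(self, endpoint, stream_id):
        self.endpoint = endpoint
        self.stream_id = stream_id

    def send(self, data):
        peer = self.endpoint.peer
        if self.stream_id not in peer.inbound:
            # First data on this stream: notify the receiving side.
            peer.inbound[self.stream_id] = deque()
            peer.events.append(("STREAM_UPDATE", self.stream_id))
        peer.inbound[self.stream_id].append(data)
        return len(data)

    def recv(self):
        queue = self.endpoint.inbound.get(self.stream_id)
        return queue.popleft() if queue else b""


def connect():
    a, b = Endpoint(), Endpoint()
    a.peer, b.peer = b, a
    return a, b


if __name__ == "__main__":
    sender, receiver = connect()
    tx = sender.peel_off(sender.open_stream())
    tx.send(b"hello")
    _, stream_id = receiver.events.popleft()
    print(receiver.peel_off(stream_id).recv())
```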

> > > For QUIC, supporting 'stream peeloff' might mean creating a new
> > > socket type that carries a stream ID and maps its sendmsg/recvmsg to
> > > the 'parent' QUIC socket.
>
> Yes, exactly.
>
> > > But there are details to sort out, like whether the 'parent-child
> > > relationship' should be maintained.
>
> What do you mean by this? I assume the answer is that it should be
> maintained? e.g. if the connection is closed, then any child per-stream
> sockets are also invalidated and must be closed.
>
I aimed to make a peeled-off stream socket fully independent to keep the
design simple. However, since the connection socket may close at any time,
the stream socket must hold a reference to it. To keep the relationship
strictly one-way, the connection socket remains unaware of any peeled-off
stream sockets.
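
The one-way relationship can be illustrated with an explicit reference
count. This is a sketch under assumptions, not the PR's code:
ConnectionSock, StreamSock, hold() and put() are hypothetical names, and
the counter only models the idea that each stream socket pins the
connection while the connection keeps no list of its streams.

```python
class ConnectionSock:
    """Toy connection object with a manual reference count."""

    def __init__(self):
        self.refs = 1        # reference owned by the application's fd
        self.freed = False

    def hold(self):
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True   # last reference gone: release resources


class StreamSock:
    """Toy peeled-off stream; it knows its connection, not vice versa."""

    def __init__(self, conn, stream_id):
        conn.hold()             # pin the connection while we exist
        self.conn = conn
        self.stream_id = stream_id
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            self.conn.put()


if __name__ == "__main__":
    conn = ConnectionSock()
    stream = StreamSock(conn, 0)
    conn.put()                  # application closes the connection fd
    print(conn.freed)           # the stream still holds a reference
    stream.close()
    print(conn.freed)
```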

> > > We also need to consider whether this is worth implementing in the
> > > kernel, or if a  similar API could be provided in libquic.
>
> So this is sort of the crux of my argument. If it is in userland, then
> any application that wants to act per-stream needs to know about QUIC.
> But if it is in the kernel, just a tiny bit of QUIC-aware glue code is
> needed to plug together QUIC-agnostic software, by passing stream sockets
> to that software. (You could do it by passing pipes and a little userland
> man-in-the-middle using *quic_sendmsg and quic_recvmsg*, of course, but
> those extra context switches and copies are rather lousy.)
>
Right, providing it in libquic won't work for kernel consumers.

> For what it's worth, I would go further in fact and say that this
> "stream peeloff" system call should not be supported by QUIC alone.
> It is very nice today how much code can be agnostic to TCP vs unix
> domain sockets, for example. I would ideally want the same thing to be
> true with QUIC too, via an "extended unix domain socket" that would
> replicate the QUIC state machine(s) just as regular unix domain sockets
> replicate the TCP state machine.
>
> I bring up such an "extended unix domain socket" not to indulge in scope
> creep, but just to point out that a good litmus test for a new socket
> interface is that multiple domains could meaningfully support it, and
> that litmus test is met in this case.
>
That’s a good point. Stream peel-off is an idea I find quite interesting. I
originally based the QUIC stream API on SCTP’s model, but it now seems that
most real-world use cases are closer to applications that traditionally
relied on TCP rather than SCTP.

Please check out the stream socket interfaces in the PR above and comment
on anything that you think could be improved.

Thanks.


* Re: Separate sockets for separate connections
  2025-08-30  0:42           ` Xin Long
@ 2025-09-10 20:50             ` John Ericson
  0 siblings, 0 replies; 4+ messages in thread
From: John Ericson @ 2025-09-10 20:50 UTC (permalink / raw)
  To: Xin Long; +Cc: quic, mbuhl, Stefan Metzmacher, draft-lxin-quic-socket-apis

(Sorry for not responding to this sooner; I was attending NixCon 2025.)

On Fri, Aug 29, 2025, at 8:42 PM, Xin Long wrote:
> 
> I wrote some code to explore this:
> https://github.com/lxin/quic/pull/53
> 
> - A stream can be peeled off from a parent/connection socket using
>   getsockopt(QUIC_SOCKOPT_STREAM_PEELOFF) with a stream_id, similar to
>   SCTP's connection peeloff.
> 
> - For stream sockets, in addition to send(), recv(), and close(), support
>   for poll() and shutdown() is also implemented. Note that close() and
>   shutdown() send a FIN on the sending side, issue a STOP_SENDING on the
>   receiving side, or both for bidirectional streams, as applicable.
> 
> There's also a sample test:
> https://github.com/lxin/quic/blob/stream-peeloff/tests/peeloff_test.c
> 
> - Sender: Opens a stream with getsockopt(QUIC_SOCKOPT_STREAM_OPEN), peels
>   it off with getsockopt(QUIC_SOCKOPT_STREAM_PEELOFF), and sends data via
>   the new file descriptor.
> 
> - Receiver: Detects stream creation through a QUIC_EVENT_STREAM_UPDATE
>   event, peels off the stream with getsockopt(QUIC_SOCKOPT_STREAM_PEELOFF),
>   and receives data via the new file descriptor.

Fantastic! Thank you for investigating this; I really appreciate it.

> Please check out the stream socket interfaces in the PR above and comment
> on anything that you think could be improved.

Gladly! I looked at the linked PR and left some comments, but they are only "mild questions". The basic design here looks exactly like what I was hoping for. Hooray!

> I aimed to make a peeled-off stream socket fully independent to keep the
> design simple. However, since the connection socket may close at any time,
> the stream socket must hold a reference to it. To keep the relationship
> strictly one-way, the connection socket remains unaware of any peeled-off
> stream sockets.

That makes sense to me. Maybe someday someone will want "please close connection when the last stream socket is closed", but that can come later. (If I understand what `sock_hold` is doing with ref counts correctly, I'd hope that would be possible without any bidirectional references.)

> Right, providing it in libquic won't work for kernel consumers.

Just to be clear, I was thinking of one userland process sending the stream socket to another. But yes, a kernel consumer would face the exact same issues as userspace processes without this.

> > For what it's worth, I would go further in fact and say that this
> > "stream peeloff" system call should not be supported by QUIC alone.
> > It is very nice today how much code can be agnostic to TCP vs unix
> > domain sockets, for example. I would ideally want the same thing to be
> > true with QUIC too, via an "extended unix domain socket" that would
> > replicate the QUIC state machine(s) just as regular unix domain sockets
> > replicate the TCP state machine.
> >
> > I bring up such an "extended unix domain socket" not to indulge in scope
> > creep, but just to point out that a good litmus test for a new socket
> > interface is that multiple domains could meaningfully support it, and
> > that litmus test is met in this case.
> >
> That’s a good point.

Glad you like it! I had been thinking abstractly at that point. But putting on my hat as a Nix developer, yes 

> I originally based the QUIC stream API on SCTP’s model, but it now seems that
> most real-world use cases are closer to applications that traditionally
> relied on TCP rather than SCTP.

I don't really know enough about who is using SCTP to know for sure, but yes, I think that is right :).

Cheers,

John
