From: Jesper Dangaard Brouer <brouer@redhat•com>
To: Magnus Karlsson <magnus.karlsson@gmail•com>
Cc: "Alexei Starovoitov" <alexei.starovoitov@gmail•com>,
"Daniel Borkmann" <daniel@iogearbox•net>,
"Björn Töpel" <bjorn.topel@gmail•com>,
"Karlsson, Magnus" <magnus.karlsson@intel•com>,
"Alexander Duyck" <alexander.h.duyck@intel•com>,
"Alexander Duyck" <alexander.duyck@gmail•com>,
"John Fastabend" <john.fastabend@gmail•com>,
"Alexei Starovoitov" <ast@fb•com>,
"Willem de Bruijn" <willemdebruijn.kernel@gmail•com>,
"Michael S. Tsirkin" <mst@redhat•com>,
"Network Development" <netdev@vger•kernel.org>,
"Björn Töpel" <bjorn.topel@intel•com>,
michael.lundkvist@ericsson•com, "Brandeburg,
Jesse" <jesse.brandeburg@intel•com>,
"Singhai, Anjali" <anjali.singhai@intel•com>,
"Zhang, Qi Z" <qi.z.zhang@intel•com>,
brouer@redhat•com
Subject: Re: [PATCH bpf-next v3 00/15] Introducing AF_XDP support
Date: Mon, 7 May 2018 15:09:40 +0200 [thread overview]
Message-ID: <20180507150940.2578d6e3@redhat.com> (raw)
In-Reply-To: <CAJ8uoz2L+3dw5OOHJJ0CyV-y5w1ooNAjKHzHAcAiKhTjAq-vzQ@mail.gmail.com>
On Mon, 7 May 2018 11:13:58 +0200
Magnus Karlsson <magnus.karlsson@gmail•com> wrote:
> On Sat, May 5, 2018 at 2:34 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail•com> wrote:
> > On Fri, May 04, 2018 at 01:22:17PM +0200, Magnus Karlsson wrote:
> >> On Fri, May 4, 2018 at 1:38 AM, Alexei Starovoitov
> >> <alexei.starovoitov@gmail•com> wrote:
> >> > On Fri, May 04, 2018 at 12:49:09AM +0200, Daniel Borkmann wrote:
> >> >> On 05/02/2018 01:01 PM, Björn Töpel wrote:
> >> >> > From: Björn Töpel <bjorn.topel@intel•com>
> >> >> >
> >> >> > This patch set introduces a new address family called AF_XDP that is
> >> >> > optimized for high performance packet processing and, in upcoming
> >> >> > patch sets, zero-copy semantics. In this patch set, we have removed
> >> >> > all zero-copy related code in order to make it smaller, simpler and
> >> >> > hopefully more review friendly. This patch set only supports copy-mode
> >> >> > for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode
> >> >> > for RX using the XDP_DRV path. Zero-copy support requires XDP and
> >> >> > driver changes that Jesper Dangaard Brouer is working on. Some of his
> >> >> > work has already been accepted. We will publish our zero-copy support
> >> >> > for RX and TX on top of his patch sets at a later point in time.
> >> >>
> >> >> +1, would be great to see it land this cycle. Saw few minor nits here
> >> >> and there but nothing to hold it up, for the series:
> >> >>
> >> >> Acked-by: Daniel Borkmann <daniel@iogearbox•net>
> >> >>
> >> >> Thanks everyone!
> >> >
> >> > Great stuff!
> >> >
> >> > Applied to bpf-next, with one condition.
> >> > Upcoming zero-copy patches for both RX and TX need to be posted
> >> > and reviewed within this release window.
> >> > If netdev community as a whole won't be able to agree on the zero-copy
> >> > bits we'd need to revert this feature before the next merge window.
> >>
> >> Thanks everyone for reviewing this. Highly appreciated.
> >>
> >> Just so we understand the purpose correctly:
> >>
> >> 1: Do you want to see the ZC patches in order to verify that the user
> >> space API holds? If so, we can produce an additional RFC patch set
> >> using a big chunk of code that we had in RFC V1. We are not proud of
> >> this code since it is clunky, but it hopefully proves the point with
> >> the uapi being the same.
> >>
> >> 2: And/Or are you worried about us all (the netdev community) not
> >> agreeing on a way to implement ZC internally in the drivers and the
> >> XDP infrastructure? This is not going to be possible to finish during
> >> this cycle since we do not like the implementation we had in RFC V1.
> >> Too intrusive and now we also have nicer abstractions from Jesper that
> >> we can use and extend to provide a (hopefully) much cleaner and less
> >> intrusive solution.
> >
> > short answer: both.
> >
> > Cleanliness and performance of the ZC code is not as important as
> > getting API right. The main concern that during ZC review process
> > we will find out that existing API has issues, so we have to
> > do this exercise before the merge window.
> > And RFC won't fly. Send the patches for real. They have to go
> > through the proper code review. The hackers of netdev community
> > can accept a partial, or a bit unclean, or slightly inefficient
> > implementation, since it can be and will be improved later,
> > but API we cannot change once it goes into official release.
> >
> > Here is the example of API concern:
> > this patch set added shared umem concept. It sounds good in theory,
> > but will it perform well with ZC ? Earlier RFCs didn't have that
> > feature. If it won't perform well than it shouldn't be in the tree.
> > The key reason to let AF_XDP into the tree is its performance promise.
> > If it doesn't perform we should rip it out and redesign.
>
> That is a fair point. We will try to produce patch sets for zero-copy
> RX and TX using the latest interfaces within this merge window. Just
> note that we will focus on this for the next week(s) instead of the
> review items that you and Daniel Borkmann submitted. If we get those
> patch sets out in time and we agree that they are a possible way
> forward, then we produce patches with your fixes. It was mainly small
> items, so should be quick.
I would like to see that you create a new xdp_mem_type for this new
zero-copy type. This will allow other XDP redirect methods/types (e.g.
devmap and cpumap) to react appropriately when receiving a zero-copy
frame.
For devmap, I'm hoping we can allow/support using the ndo_xdp_xmit call
without (first) copying (into a newly allocated page). By arguing that
if an xsk-userspace app modify a frame it's not allowed to, then it is
simply a bug in the program. (Note, this would also allow using
ndo_xdp_xmit call for TX from xsk-userspace).
For cpumap, it is hard to avoid a copy, but I'm hoping we could delay
the copy (and alloc of mem dest area) until on the remote CPU. This is
already the principle of cpumap; of moving the allocation of the SKB to
the remote CPU.
For ZC to interact with XDP redirect-core and return API, the zero-copy
memory type/allocator, need to provide an area for the xdp_frame data
to be stored in (as we cannot allow using top-of-frame like
non-zero-copy variants), and extend xdp_frame with an ZC umem-id.
I imagine we can avoid any dynamic allocations, as we upfront (at bind
and XDP_UMEM_REG time) know the number of frames. (e.g. pre-alloc in
xdp_umem_reg() call, and have xdp_umem_get_xdp_frame lookup func).
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
next prev parent reply other threads:[~2018-05-07 13:09 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-02 11:01 [PATCH bpf-next v3 00/15] Introducing AF_XDP support Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 01/15] net: initial AF_XDP skeleton Björn Töpel
2018-05-23 22:50 ` Stephen Hemminger
2018-05-24 6:38 ` Björn Töpel
2018-05-24 17:57 ` Alexei Starovoitov
2018-05-02 11:01 ` [PATCH bpf-next v3 02/15] xsk: add user memory registration support sockopt Björn Töpel
2018-05-04 12:34 ` Daniel Borkmann
2018-05-02 11:01 ` [PATCH bpf-next v3 03/15] xsk: add umem fill queue support and mmap Björn Töpel
2018-05-04 12:49 ` Daniel Borkmann
2018-05-02 11:01 ` [PATCH bpf-next v3 04/15] xsk: add Rx queue setup and mmap support Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 05/15] xsk: add support for bind for Rx Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 06/15] xsk: add Rx receive functions and poll support Björn Töpel
2018-05-04 12:59 ` Daniel Borkmann
2018-05-22 7:42 ` Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 07/15] bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP Björn Töpel
2018-10-08 15:31 ` Eric Dumazet
2018-10-08 16:05 ` Björn Töpel
2018-10-08 16:52 ` Björn Töpel
2018-10-08 16:55 ` Eric Dumazet
2018-10-08 17:04 ` Björn Töpel
2018-10-08 17:40 ` [PATCH bpf] xsk: do not call synchronize_net() under RCU read lock Björn Töpel
2018-10-09 0:30 ` Song Liu
2018-10-11 8:22 ` Daniel Borkmann
2018-05-02 11:01 ` [PATCH bpf-next v3 08/15] xsk: wire up XDP_DRV side of AF_XDP Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 09/15] xsk: wire up XDP_SKB " Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 10/15] xsk: add umem completion queue support and mmap Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 11/15] xsk: add Tx queue setup and mmap support Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 12/15] dev: packet: make packet_direct_xmit a common function Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 13/15] xsk: support for Tx Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 14/15] xsk: statistics support Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 15/15] samples/bpf: sample application and documentation for AF_XDP sockets Björn Töpel
2018-05-02 20:59 ` Jesper Dangaard Brouer
2018-05-03 13:55 ` [PATCH bpf-next v3 00/15] Introducing AF_XDP support Willem de Bruijn
2018-05-03 15:07 ` David Miller
2018-05-03 22:49 ` Daniel Borkmann
2018-05-03 23:38 ` Alexei Starovoitov
2018-05-04 11:22 ` Magnus Karlsson
2018-05-05 0:34 ` Alexei Starovoitov
2018-05-07 9:13 ` Magnus Karlsson
2018-05-07 13:09 ` Jesper Dangaard Brouer [this message]
2018-05-07 19:47 ` Björn Töpel
2018-05-17 6:46 ` Björn Töpel
2018-05-18 3:38 ` Alexei Starovoitov
2018-05-18 13:43 ` Daniel Borkmann
2018-05-18 15:18 ` Björn Töpel
2018-05-18 16:17 ` Daniel Borkmann
2018-05-18 16:32 ` Björn Töpel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180507150940.2578d6e3@redhat.com \
--to=brouer@redhat$(echo .)com \
--cc=alexander.duyck@gmail$(echo .)com \
--cc=alexander.h.duyck@intel$(echo .)com \
--cc=alexei.starovoitov@gmail$(echo .)com \
--cc=anjali.singhai@intel$(echo .)com \
--cc=ast@fb$(echo .)com \
--cc=bjorn.topel@gmail$(echo .)com \
--cc=bjorn.topel@intel$(echo .)com \
--cc=daniel@iogearbox$(echo .)net \
--cc=jesse.brandeburg@intel$(echo .)com \
--cc=john.fastabend@gmail$(echo .)com \
--cc=magnus.karlsson@gmail$(echo .)com \
--cc=magnus.karlsson@intel$(echo .)com \
--cc=michael.lundkvist@ericsson$(echo .)com \
--cc=mst@redhat$(echo .)com \
--cc=netdev@vger$(echo .)kernel.org \
--cc=qi.z.zhang@intel$(echo .)com \
--cc=willemdebruijn.kernel@gmail$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox