From: Koichiro Den <den@klaipeden•com>
To: "Michael S. Tsirkin" <mst@redhat•com>, Jason Wang <jasowang@redhat•com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail•com>,
virtualization@lists•linux-foundation.org,
Network Development <netdev@vger•kernel.org>
Subject: Re: [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi
Date: Wed, 23 Aug 2017 23:28:24 +0900 [thread overview]
Message-ID: <1503498504.8694.26.camel@klaipeden.com> (raw)
In-Reply-To: <20170822204015-mutt-send-email-mst@kernel.org>
On Tue, 2017-08-22 at 20:55 +0300, Michael S. Tsirkin wrote:
> On Tue, Aug 22, 2017 at 10:50:41AM +0800, Jason Wang wrote:
> > > Perhaps the descriptor pool should also be
> > > revised to allow out of order completions. Then there is no need to
> > > copy zerocopy packets whenever they may experience delay.
> >
> > Yes, but as replied in the referenced thread, windows driver may treat out
> > of order completion as a bug.
>
> That would be a windows driver bug then, but I don't think it makes this
> assumption. What the referenced thread
> (https://patchwork.kernel.org/patch/3787671/) is saying is that host
> must use any buffers made available on a tx vq within a reasonable
> timeframe otherwise windows guests panic.
>
> Ideally we would detect that a packet is actually experiencing delay and
> trigger the copy at that point e.g. by calling skb_linearize. But it
> isn't easy to track these packets though and even harder to do a data
> copy without races.
>
> Which reminds me that skb_linearize in net core seems to be
> fundamentally racy - I suspect that if skb is cloned, and someone is
> trying to use the shared frags while another thread calls skb_linearize,
> we get some use after free bugs which likely mostly go undetected
> because the corrupted packets mostly go on wire and get dropped
> by checksum code.
>
Please let me make sure if I understand it correctly:
* always do copy with skb_orphan_frags_rx as Willem mentioned in the earlier
post, before the xmit_skb as opposed to my original patch, is safe but too
costly so cannot be adopted.
* as a generic solution, if we were to somehow overcome the safety issue, track
the delay and do copy if some threshold is reached could be an answer, but it's
hard for now.
* so things like the current vhost-net implementation of deciding whether or not
to do zerocopy beforehand referring the zerocopy tx error ratio is a point of
practical compromise.
Thanks.
next prev parent reply other threads:[~2017-08-23 14:28 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-19 6:38 [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi Koichiro Den
2017-08-20 20:49 ` Willem de Bruijn
2017-08-21 12:40 ` Koichiro Den
2017-08-22 12:11 ` Willem de Bruijn
2017-08-22 14:04 ` Koichiro Den
2017-08-22 17:19 ` Willem de Bruijn
2017-08-23 14:26 ` Koichiro Den
2017-08-21 12:33 ` Jason Wang
2017-08-21 12:58 ` Koichiro Den
2017-08-21 15:41 ` Willem de Bruijn
2017-08-22 2:50 ` Jason Wang
2017-08-22 3:10 ` Willem de Bruijn
2017-08-22 11:47 ` Jason Wang
2017-08-22 13:42 ` Koichiro Den
2017-08-22 17:16 ` Willem de Bruijn
2017-08-23 14:24 ` Koichiro Den
2017-08-22 17:55 ` Michael S. Tsirkin
2017-08-22 18:01 ` David Miller
2017-08-22 18:28 ` Eric Dumazet
2017-08-22 18:39 ` Michael S. Tsirkin
2017-08-23 14:28 ` Koichiro Den [this message]
2017-08-23 14:47 ` Koichiro Den
2017-08-23 15:20 ` Willem de Bruijn
2017-08-23 22:57 ` Michael S. Tsirkin
2017-08-24 3:28 ` Willem de Bruijn
2017-08-24 4:34 ` Michael S. Tsirkin
2017-08-24 13:50 ` Michael S. Tsirkin
2017-08-24 20:20 ` Willem de Bruijn
2017-08-24 20:50 ` Michael S. Tsirkin
2017-08-25 22:44 ` Willem de Bruijn
2017-08-25 23:32 ` Michael S. Tsirkin
2017-08-26 1:03 ` Willem de Bruijn
2017-08-29 19:35 ` Willem de Bruijn
2017-08-29 19:42 ` Michael S. Tsirkin
2017-08-29 19:53 ` Willem de Bruijn
2017-08-29 20:40 ` Michael S. Tsirkin
2017-08-29 22:55 ` Willem de Bruijn
2017-08-30 1:45 ` Jason Wang
2017-08-30 3:11 ` Willem de Bruijn
2017-09-01 3:08 ` Jason Wang
2017-08-31 14:30 ` Willem de Bruijn
2017-09-01 3:25 ` Jason Wang
2017-09-01 16:15 ` Willem de Bruijn
2017-09-01 16:17 ` Willem de Bruijn
2017-09-04 3:03 ` Jason Wang
2017-09-05 14:09 ` Willem de Bruijn
2017-09-06 3:27 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1503498504.8694.26.camel@klaipeden.com \
--to=den@klaipeden$(echo .)com \
--cc=jasowang@redhat$(echo .)com \
--cc=mst@redhat$(echo .)com \
--cc=netdev@vger$(echo .)kernel.org \
--cc=virtualization@lists$(echo .)linux-foundation.org \
--cc=willemdebruijn.kernel@gmail$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox