From: Vlad Yasevich <vyasevic@redhat•com>
To: Jason Wang <jasowang@redhat•com>,
Vladislav Yasevich <vyasevich@gmail•com>,
netdev@vger•kernel.org
Cc: virtio-dev@lists•oasis-open.org, maxime.coquelin@redhat•com,
virtualization@lists•linux-foundation.org, mst@redhat•com
Subject: Re: [PATCH RFC (resend) net-next 0/6] virtio-net: Add support for virtio-net header extensions
Date: Fri, 21 Apr 2017 09:08:21 -0400
Message-ID: <022c8af3-e7e7-5d35-5152-cb12e90359ef@redhat.com>
In-Reply-To: <2b70ab98-c5cc-25c7-dd42-4bb570b6aec6@redhat.com>
On 04/21/2017 12:05 AM, Jason Wang wrote:
>
>
> On 2017-04-20 23:34, Vlad Yasevich wrote:
>> On 04/17/2017 11:01 PM, Jason Wang wrote:
>>>
>>> On 2017-04-16 00:38, Vladislav Yasevich wrote:
>>>> Currently the virtio net header is a fixed size, and adding things to it is
>>>> rather difficult. This series attempts to add the infrastructure, as well as
>>>> some extensions that try to resolve some deficiencies we currently have.
>>>>
>>>> First, vnet header only has space for 16 flags. This may not be enough
>>>> in the future. The extensions will provide space for 32 possible extension
>>>> flags and 32 possible extensions. These flags will be carried in the
>>>> first pseudo extension header, the presence of which will be determined by
>>>> a flag in the virtio net header.
>>>>
>>>> The extensions themselves will immediately follow the extension header.
>>>> They will be added to the packet in the same order as they appear in the
>>>> extension flags. No padding is placed between the extensions, and any
>>>> extensions that were negotiated but are not used by a given packet will
>>>> convert to trailing padding.
>>> Do we need explicit padding (e.g. an extension) which could be controlled by each side?
>> I don't think so. The size of the vnet header is set based on the extensions negotiated.
>> The one part I am not crazy about is that when a packet does not use any extensions,
>> the data is still placed after the entire vnet header, which essentially adds a lot
>> of padding. However, that's really no different than if we simply grew the vnet header.
>>
>> The other thing I've tried before is putting extensions into their own sg buffer, but that
>> made it slower.
>
> Yes.
>
>>
>>>> For example:
>>>> | vnet mrg hdr | ext hdr | ext 1 | ext 2 | ext 5 | .. pad .. | packet data |
>>> Just some rough thoughts:
>>>
>>> - Is it better to use TLV instead of a bitmap here? One advantage of TLV is that the
>>> length is not limited by the length of the bitmap.
>> But the disadvantage is that we add at least 4 bytes of TL data per extension. That
>> makes this thing even longer.
>
> Yes, and it looks like the length is still limited by e.g. the length of T.
Not only that, but it is also limited by skb->cb as a whole. So putting
extensions into a TLV style means we have fewer extensions for now, until we get rid of
the skb->cb usage.
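
To put rough numbers on the bitmap vs. TLV trade-off, here is a minimal
sketch, not code from the series. The extension sizes, the 8-byte extension
header, and every function name below are assumptions made up for
illustration. Only the 12-byte mergeable vnet header and the "4 bytes of TL
per extension" figure come from existing code and from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VNET_MRG_HDR_LEN 12u /* sizeof(struct virtio_net_hdr_mrg_rxbuf) */
#define EXT_HDR_LEN       8u /* ASSUMPTION: 32 flag bits + 32 extension bits */
#define TL_OVERHEAD       4u /* "at least 4 bytes per extension of just TL data" */
#define NUM_EXT          32u

/* ASSUMPTION: hypothetical per-extension payload sizes, indexed by bit. */
static const uint8_t ext_size[NUM_EXT] = {
	[1] = 4,
	[2] = 4,
	[5] = 8,
};

/* Total payload bytes of every extension whose bit is set in mask. */
size_t ext_area_len(uint32_t mask)
{
	size_t len = 0;

	for (unsigned int bit = 0; bit < NUM_EXT; bit++)
		if (mask & (UINT32_C(1) << bit))
			len += ext_size[bit];
	return len;
}

/* Bitmap layout: the header length is fixed by what was negotiated. */
size_t bitmap_hdr_len(uint32_t negotiated)
{
	return VNET_MRG_HDR_LEN + EXT_HDR_LEN + ext_area_len(negotiated);
}

/*
 * Bitmap layout: extensions are packed in bit order with no padding, so
 * the offset of one extension is the sum of the used lower-numbered ones.
 */
size_t ext_offset(uint32_t used, unsigned int ext)
{
	assert(ext < NUM_EXT && (used & (UINT32_C(1) << ext)));
	return ext_area_len(used & ((UINT32_C(1) << ext) - 1));
}

/* Negotiated-but-unused extensions turn into trailing padding. */
size_t ext_pad_len(uint32_t negotiated, uint32_t used)
{
	assert((used & ~negotiated) == 0);
	return ext_area_len(negotiated) - ext_area_len(used);
}

/* TLV layout: each used extension also pays for its type/length pair. */
size_t tlv_area_len(uint32_t used)
{
	size_t len = 0;

	for (unsigned int bit = 0; bit < NUM_EXT; bit++)
		if (used & (UINT32_C(1) << bit))
			len += TL_OVERHEAD + ext_size[bit];
	return len;
}
```

With these made-up sizes, negotiating extensions 1, 2 and 5 gives a 16-byte
extension area. A packet using only 1 and 5 carries 12 bytes of extension
data plus 4 bytes of trailing padding in the bitmap layout, while the same
two extensions would take 20 bytes as TLVs.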
>
>>
>>> - For 1.1, do we really want something like the vnet header? AFAIK, it is not used by modern
>>> NICs. Is it better to pack all the metadata into the descriptor itself? This may need some
>>> changes in tun/macvtap, but looks more PCIe friendly.
>> That would really be ideal and I've looked at this. There are small issues of exposing
>> the 'net metadata' of the descriptor to taps so they can be filled in. The alternative
>> is to use a different control structure for tap->qemu|vhost channel (that can be
>> implementation specific) and have qemu|vhost populate the 'net metadata' of the descriptor.
>
> Yes, this needs some thought. For vhost, things look a little bit easier; we can probably
> use msg_control.
>
We can use msg_control in qemu as well, can't we? It really is a question of who is doing
the work and the number of copies.
I can take a closer look at how it would look if we extend the descriptor with
type-specific data. I don't know whether other users of virtio would benefit from it.
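
As a very rough sketch of what "descriptor with type specific data" could
mean, assuming nothing about what the 1.1 layout will actually be: the base
fields below mirror the existing split-ring descriptor, while the type tag,
the metadata union, and all the names are hypothetical. Endianness handling
is ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same fields as the existing split-ring descriptor (16 bytes). */
struct split_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/* HYPOTHETICAL: tag saying which device type owns the metadata. */
enum desc_meta_type {
	DESC_META_NONE = 0,
	DESC_META_NET  = 1,
};

/* HYPOTHETICAL: net metadata that today would travel in the vnet header. */
struct net_desc_meta {
	uint16_t vlan_tci;
	uint16_t vlan_proto;
	uint32_t ipv6_frag_id;
};

/* HYPOTHETICAL: descriptor extended with type-specific data. */
struct ext_desc {
	struct split_desc base;
	uint8_t meta_type;
	uint8_t reserved[7];
	union {
		struct net_desc_meta net;
		uint8_t raw[8];	/* opaque view for other device types */
	} meta;
};

/* Fill in the net metadata; the base descriptor fields are left alone. */
void desc_set_net_meta(struct ext_desc *d, uint16_t vlan_tci,
		       uint32_t ipv6_frag_id)
{
	assert(d != NULL);
	memset(&d->meta, 0, sizeof(d->meta));
	d->meta_type = DESC_META_NET;
	d->meta.net.vlan_tci = vlan_tci;
	d->meta.net.ipv6_frag_id = ipv6_frag_id;
}
```

On common 64-bit ABIs this comes out at 32 bytes per descriptor. The tag plus
union is what would let other virtio device types define their own metadata
without touching the base layout.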
-vlad
> Thanks
>
>> Thanks
>> -vlad
>>
>>> Thanks
>>>
>>>> Extensions proposed in this series are:
>>>> - IPv6 fragment id extension
>>>> * Currently, the guest generated fragment id is discarded and the host
>>>> generates an IPv6 fragment id if the packet has to be fragmented. The
>>>> code attempts to add time based perturbation to id generation to make
>>>> it harder to guess the next fragment id to be used. However, doing this
>>>> on the host may result in less perturbation (due to different timing)
>>>> and might make id guessing easier. Ideally, the ids generated by the
>>>> guest should be used. One could also argue that we are "violating" the
>>>> IPv6 protocol under a _strict_ interpretation of the spec.
>>>>
>>>> - VLAN header acceleration
>>>> * Currently virtio does not do vlan header acceleration and instead
>>>> uses software tagging. One of the first things that the host will do is
>>>> strip the vlan header out. When passing the packet to a guest, the
>>>> vlan header is re-inserted into the packet. We can skip all that work
>>>> if we can pass the vlan data in accelerated format. Then the host will
>>>> not do any extra work. However, so far, this yielded a very small
>>>> perf bump (only ~1%). I am still looking into this.
>>>>
>>>> - UDP tunnel offload
>>>> * Similar to vlan acceleration, with this extension we can pass additional
>>>> data to the host to support GSO with UDP tunnels and possibly other
>>>> encapsulations. This yields a significant performance improvement
>>>> (still testing remote checksum code).
>>>>
>>>> An additional extension that is unfinished (due to still testing for any
>>>> side-effects) is checksum passthrough to support drivers that set
>>>> CHECKSUM_COMPLETE. This would eliminate the need for guests to compute
>>>> the software checksum.
>>>>
>>>> This series only takes care of virtio net. I have additional patches for the
>>>> host side (vhost and tap/macvtap as well as qemu), but wanted to get feedback
>>>> on the general approach first.
>>>>
>>>> Vladislav Yasevich (6):
>>>> virtio-net: Remove the use the padded vnet_header structure
>>>> virtio-net: make header length handling uniform
>>>> virtio_net: Add basic skeleton for handling vnet header extensions.
>>>> virtio-net: Add support for IPv6 fragment id vnet header extension.
>>>> virtio-net: Add support for vlan acceleration vnet header extension.
>>>> virtio-net: Add support for UDP tunnel offload and extension.
>>>>
>>>> drivers/net/virtio_net.c | 132 +++++++++++++++++++++++++++++++++-------
>>>> include/linux/skbuff.h | 5 ++
>>>> include/linux/virtio_net.h | 91 ++++++++++++++++++++++++++-
>>>> include/uapi/linux/virtio_net.h | 38 ++++++++++++
>>>> 4 files changed, 242 insertions(+), 24 deletions(-)
>>>>
>