public inbox for netdev@vger.kernel.org 
From: "Michael S. Tsirkin" <mst@redhat•com>
To: Siwei Liu <loseweigh@gmail•com>
Cc: Sridhar Samudrala <sridhar.samudrala@intel•com>,
	Stephen Hemminger <stephen@networkplumber•org>,
	David Miller <davem@davemloft•net>,
	Netdev <netdev@vger•kernel.org>, Jiri Pirko <jiri@resnulli•us>,
	virtio-dev@lists•oasis-open.org, "Brandeburg,
	Jesse" <jesse.brandeburg@intel•com>,
	Alexander Duyck <alexander.h.duyck@intel•com>,
	Jakub Kicinski <kubakici@wp•pl>
Subject: Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when available
Date: Sun, 4 Mar 2018 06:04:56 +0200
Message-ID: <20180304060014-mutt-send-email-mst@kernel.org>
In-Reply-To: <CADGSJ22VUgJzi6B=Bh4M6Bado1CQEEJvRR1VJ=oC47G2SJ0DEA@mail.gmail.com>

On Fri, Mar 02, 2018 at 03:56:31PM -0800, Siwei Liu wrote:
> On Fri, Mar 2, 2018 at 1:36 PM, Michael S. Tsirkin <mst@redhat•com> wrote:
> > On Fri, Mar 02, 2018 at 01:11:56PM -0800, Siwei Liu wrote:
> >> On Thu, Mar 1, 2018 at 12:08 PM, Sridhar Samudrala
> >> <sridhar.samudrala@intel•com> wrote:
> >> > This patch enables virtio_net to switch over to a VF datapath when a VF
> >> > netdev is present with the same MAC address. It allows live migration
> >> > of a VM with a direct attached VF without the need to set up a bond/team
> >> > between a VF and virtio net device in the guest.
> >> >
> >> > The hypervisor needs to enable only one datapath at any time so that
> >> > packets don't get looped back to the VM over the other datapath. When a VF
> >> > is plugged, the virtio datapath link state can be marked as down. The
> >> > hypervisor needs to unplug the VF device from the guest on the source host
> >> > and reset the MAC filter of the VF to initiate failover of datapath to
> >> > virtio before starting the migration. After the migration is completed,
> >> > the destination hypervisor sets the MAC filter on the VF and plugs it back
> >> > to the guest to switch over to VF datapath.
> >> >
> >> > When BACKUP feature is enabled, an additional netdev (bypass netdev) is
> >> > created that acts as a master device and tracks the state of the 2 lower
> >> > netdevs. The original virtio_net netdev is marked as 'backup' netdev and a
> >> > passthru device with the same MAC is registered as 'active' netdev.
> >> >
> >> > This patch is based on the discussion initiated by Jesse on this thread.
> >> > https://marc.info/?l=linux-virtualization&m=151189725224231&w=2
> >> >
> >> > Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel•com>
> >> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel•com>
> >> > Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel•com>
> >> > ---
> >> >  drivers/net/virtio_net.c | 683 ++++++++++++++++++++++++++++++++++++++++++++++-
> >> >  1 file changed, 682 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >> > index bcd13fe906ca..f2860d86c952 100644
> >> > --- a/drivers/net/virtio_net.c
> >> > +++ b/drivers/net/virtio_net.c
> >> > @@ -30,6 +30,8 @@
> >> >  #include <linux/cpu.h>
> >> >  #include <linux/average.h>
> >> >  #include <linux/filter.h>
> >> > +#include <linux/netdevice.h>
> >> > +#include <linux/pci.h>
> >> >  #include <net/route.h>
> >> >  #include <net/xdp.h>
> >> >
> >> > @@ -206,6 +208,9 @@ struct virtnet_info {
> >> >         u32 speed;
> >> >
> >> >         unsigned long guest_offloads;
> >> > +
> >> > +       /* upper netdev created when BACKUP feature enabled */
> >> > +       struct net_device *bypass_netdev;
> >> >  };
> >> >
> >> >  struct padded_vnet_hdr {
> >> > @@ -2236,6 +2241,22 @@ static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp)
> >> >         }
> >> >  }
> >> >
> >> > +static int virtnet_get_phys_port_name(struct net_device *dev, char *buf,
> >> > +                                     size_t len)
> >> > +{
> >> > +       struct virtnet_info *vi = netdev_priv(dev);
> >> > +       int ret;
> >> > +
> >> > +       if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_BACKUP))
> >> > +               return -EOPNOTSUPP;
> >> > +
> >> > +       ret = snprintf(buf, len, "_bkup");
> >> > +       if (ret >= len)
> >> > +               return -EOPNOTSUPP;
> >> > +
> >> > +       return 0;
> >> > +}
> >> > +
> >>
> >> What if the systemd/udevd is not new enough to enforce the
> >> n<phys_port_name> naming? Would virtio_bypass get a different name
> >> than the original virtio_net?
> >
> > You mean people using ethX names? Any hardware config change breaks
> > these, I don't think that can be helped.
> 
> I don't like the way to rely on .ndo_get_phys_port_name - it's fragile
> and it does not completely solve the problem it tries to address.
> Imagine what can happen with an old udevd, or if users already
> have existing explicit udev rules around phys_port_name. It does not
> give you an ack saying "yes, I know you're the bypass and
> you're the backup, please continue and I will give you both correct
> names", or an unacknowledgment saying "no, I don't know what these
> extra interfaces are, please go back and leave the VF device alone".
> We need new udev API for both feature negotiation and naming, or may
> even completely hide the lower interfaces.

Go ahead and try to make this happen, but I won't hold my
breath.

> >
> >> Should we detect this earlier and fall
> >> back to legacy mode without creating the bypass netdev and enslaving
> >> the VF?
> >
> > I don't think we can do this with existing kernel/userspace APIs.
> 
> That's why I said earlier to make udev aware of this new type of combined
> device instead of doing hacks here and there.
> 
> Regards,
> -Siwei

We can add new interfaces on top but the main purpose here is to
make old userspace do new tricks.

> >
> > --
> > MST
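
For readers following the naming question, here is a minimal user-space sketch of the mechanism under discussion: the quoted callback reports "_bkup", and a naming policy appends "n<phys_port_name>" to a base name. Everything outside the quoted patch is an assumption made for illustration: the sketch_* helpers, the has_backup flag (standing in for the VIRTIO_NET_F_BACKUP feature check), the base name "ens3", and the simplified naming rule itself. It is not kernel or udev code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* User-space stand-in for the quoted virtnet_get_phys_port_name().
 * has_backup is a hypothetical substitute for
 * virtio_has_feature(vi->vdev, VIRTIO_NET_F_BACKUP). */
static int sketch_get_phys_port_name(bool has_backup, char *buf, size_t len)
{
	int ret;

	assert(buf != NULL);

	if (!has_backup)
		return -EOPNOTSUPP;

	/* snprintf returns the length it wanted to write; a value
	 * >= len means the suffix was truncated. */
	ret = snprintf(buf, len, "_bkup");
	if (ret < 0 || (size_t)ret >= len)
		return -EOPNOTSUPP;

	return 0;
}

/* Hypothetical naming policy: append "n<phys_port_name>" to the base
 * name when the device reports one, otherwise keep the base name. */
static int sketch_compose_name(const char *base, bool has_backup,
			       char *out, size_t len)
{
	char port[16];
	int ret;

	if (strlen(base) == 0)
		return -EINVAL;

	if (sketch_get_phys_port_name(has_backup, port, sizeof(port)) == 0)
		ret = snprintf(out, len, "%sn%s", base, port);
	else
		ret = snprintf(out, len, "%s", base);

	return (ret < 0 || (size_t)ret >= len) ? -ENAMETOOLONG : 0;
}
```

Under these assumptions, a device with the BACKUP feature and base name "ens3" would be composed as "ens3n_bkup", while a device without the feature keeps "ens3". A userspace that never queries phys_port_name sees no suffix at all, which is the case the thread is concerned about.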
