public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Or Gerlitz <ogerlitz@mellanox•com>
To: "Michael S. Tsirkin" <mst@redhat•com>
Cc: "Eric W. Biederman" <ebiederm@xmission•com>,
	<davem@davemloft•net>, <roland@kernel•org>,
	<netdev@vger•kernel.org>, Ali Ayoub <ali@mellanox•com>,
	<sean.hefty@intel•com>, Erez Shitrit <erezsh@mellanox•co.il>,
	Doug Ledford <dledford@redhat•com>,
	Shlomo Pongratz <shlomop@mellanox•com>
Subject: Re: [PATCH V2 09/12] net/eipoib: Add main driver functionality
Date: Thu, 23 Aug 2012 09:45:17 +0300	[thread overview]
Message-ID: <5035D17D.8090703@mellanox.com> (raw)
In-Reply-To: <20120820185625.GA5234@redhat.com>

On 20/08/2012 21:57, Michael S. Tsirkin wrote:
> On Sun, Aug 12, 2012 at 11:54:57PM +0300, Michael S. Tsirkin wrote:
>
>> If you want this, then you really want a limited form of IPoIB bridging.
>
> And to clarify that statement, here is how I would make such IPoIB "bridging" work:
>
> Guest side:
>
> - Implement virtio-ipoib. This would be a device like virtio-net,
>    but instead of ethernet packets, it would pass packets
>    that consist of:
> 	IPoIB destination address
> 	IP packet
>
> - this is passed to/from host without modifications, possibly with addition
>    of header such as virtio net header
>
> - flags such as broadcast can also be added to header
>
> - like virtio net get capabilities from host and expose
>    as netdev capabilities
>
> Host side:
> - create macvtap -passthrough like device that can sit on top of an
>    ipoib interface
> - expose this device QPN and GID to guest as hardware address
> - as we get packet forward it on UD QPN or CM as appropriate
>    depending on size,checksum and admin preference
> - expose capabilities such as TSO
> - can expose capability such as max MTU to guest too
>
> Above means hardware address changes with migration.
> So we need to notify guest when this happens.
>
> This can be addressed from host by notifying all neighbours.
>
> Alternatively guest can notify all neighbours.
>
> Notification can be done by broadcast.
> This second option seems preferable.
>
> this ipoib-vtap can support two modes
> 	- bridge like mode:
> 	  guest to guest and guest to host packets
> 	  can be detected by macvtap and passed
> 	  to/from guest directly like macvlan bridge mode
>
> 	- vepa like mode
> 	  guest to guest and guest to host packets
> 	  are sent out and looped back by IB switch
> 	  like macvlan vepa mode
>
> As compared to the custom protocol I sent, it has -
> Advantages: interoperates cleanly with ipoib
> Disadvantages: no support for legacy (ethernet-only) guest
>

Hi Michael,

As you mentioned, the approach doesn't address legacy guests, who either 
don't have the virtio
driver, or don't have a virtio driver patched to support virtio-ipoib  
--  which doesn't go inline
with a strong requirement I got.

Other than this giant obstacle, I liked the suggestion and it seems 
valid and viable -- BTW IB HW has
loopback capability, so the VM/VM packets wouldn't actually go to the IB 
switch, but remain within the
HCA.

Or.

  reply	other threads:[~2012-08-23  6:45 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-01 17:09 [PATCH V2 00/12] Add Ethernet IPoIB driver Or Gerlitz
2012-08-01 17:09 ` [PATCH V2 01/12] IB/ipoib: Add rtnl_link_ops support Or Gerlitz
2012-08-01 17:09 ` [PATCH V2 02/12] IB/ipoib: Add support for clones / multiple childs on the same partition Or Gerlitz
2012-08-01 17:09 ` [PATCH V2 03/12] include/linux: Add private flags for IPoIB interfaces Or Gerlitz
2012-08-01 17:09 ` [PATCH V2 04/12] IB/ipoib: Add support for acting as VIF Or Gerlitz
2012-08-01 17:09 ` [PATCH V2 05/12] net: Add ndo_set_vif_param operation to serve eIPoIB VIFs Or Gerlitz
2012-08-02  0:17   ` Ben Hutchings
2012-08-02  8:25     ` Erez Shitrit
2012-08-01 17:09 ` [PATCH V2 06/12] net/core: Add rtnetlink support to vif parameters Or Gerlitz
2012-08-02  0:20   ` Ben Hutchings
2012-08-02 15:29     ` Erez Shitrit
2012-08-01 17:09 ` [PATCH V2 07/12] net/eipoib: Add private header file Or Gerlitz
2012-08-01 17:09 ` [PATCH V2 08/12] net/eipoib: Add ethtool file support Or Gerlitz
2012-08-02  0:22   ` Ben Hutchings
2012-08-02  8:35     ` Erez Shitrit
2012-08-02 15:42       ` Ben Hutchings
2012-08-01 17:09 ` [PATCH V2 09/12] net/eipoib: Add main driver functionality Or Gerlitz
2012-08-02 17:15   ` Eric W. Biederman
2012-08-03 20:31     ` Ali Ayoub
2012-08-03 21:33       ` David Miller
2012-08-03 22:39         ` Ali Ayoub
2012-08-03 23:36           ` David Miller
2012-08-04 21:23             ` Or Gerlitz
2012-08-04 21:44               ` Or Gerlitz
2012-08-04 23:19                 ` Eric W. Biederman
2012-08-07  0:14             ` Ali Ayoub
2012-08-07  0:44               ` Eric W. Biederman
2012-08-07  1:21                 ` Re[2]: " Naoto MATSUMOTO
2012-08-15  9:10                   ` Re[3]: " Naoto MATSUMOTO
2012-08-07  3:33                 ` Eric W. Biederman
2012-08-08  6:04                   ` Or Gerlitz
2012-08-08  8:36                     ` Eric W. Biederman
2012-08-09  4:06                       ` Or Gerlitz
2012-08-12 14:05                         ` Michael S. Tsirkin
2012-08-07  3:37                 ` Joseph Glanville
2012-08-08  7:32                 ` Or Gerlitz
2012-08-08  9:17                   ` Eric W. Biederman
2012-08-09  4:34                     ` Or Gerlitz
2012-08-12 10:36                       ` Michael S. Tsirkin
2012-08-04  0:02           ` Ali Ayoub
2012-08-04  0:05             ` David Miller
2012-08-04  1:34             ` Eric W. Biederman
2012-08-04 21:33               ` Or Gerlitz
2012-08-05 18:50     ` Michael S. Tsirkin
2012-08-08  5:23       ` Or Gerlitz
2012-08-12 10:22         ` Michael S. Tsirkin
2012-08-12 13:09           ` Or Gerlitz
2012-08-12 13:41             ` Michael S. Tsirkin
2012-08-12 13:15           ` Or Gerlitz
2012-08-12 13:55             ` Michael S. Tsirkin
2012-08-12 14:13               ` Or Gerlitz
2012-08-12 20:54                 ` Michael S. Tsirkin
2012-08-14  8:44                   ` Or Gerlitz
2012-08-20 18:57                   ` Michael S. Tsirkin
2012-08-23  6:45                     ` Or Gerlitz [this message]
2012-08-14  7:41               ` Or Gerlitz
2012-08-12 10:54         ` Michael S. Tsirkin
2012-08-12 13:19           ` Or Gerlitz
2012-08-12 15:40         ` Eric W. Biederman
2012-08-13  8:33           ` Or Gerlitz
2012-08-13 16:08             ` Eric W. Biederman
2012-09-03 20:53       ` Or Gerlitz
2012-09-03 21:22         ` Michael S. Tsirkin
2012-09-04 18:50           ` Or Gerlitz
2012-09-04 19:31             ` Eric W. Biederman
2012-09-04 19:47               ` Or Gerlitz
2012-09-04 21:21             ` Michael S. Tsirkin
2012-09-04 18:57           ` Or Gerlitz
2012-08-01 17:09 ` [PATCH V2 10/12] net/eipoib: Add sysfs support Or Gerlitz
2012-08-01 17:09 ` [PATCH V2 11/12] net/eipoib: Add Makefile, Kconfig and MAINTAINERS entries Or Gerlitz
2012-08-01 17:09 ` [PATCH V2 12/12] IB/ipoib: Add support for transmission of skbs w.o dst/neighbour Or Gerlitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5035D17D.8090703@mellanox.com \
    --to=ogerlitz@mellanox$(echo .)com \
    --cc=ali@mellanox$(echo .)com \
    --cc=davem@davemloft$(echo .)net \
    --cc=dledford@redhat$(echo .)com \
    --cc=ebiederm@xmission$(echo .)com \
    --cc=erezsh@mellanox$(echo .)co.il \
    --cc=mst@redhat$(echo .)com \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=roland@kernel$(echo .)org \
    --cc=sean.hefty@intel$(echo .)com \
    --cc=shlomop@mellanox$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox