From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: [PATCHv8 0/3] vhost: a kernel-level virtio server
Date: Wed, 4 Nov 2009 17:52:34 +0200
Message-ID: <20091104155234.GA32673@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
To: netdev@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mingo@elte.hu, linux-mm@kvack.org, akpm@linux-founda
Return-path:
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
List-Id: netdev.vger.kernel.org

Ok, I think I've addressed all comments so far here.
Rusty, I'd like this to go into linux-next, through your tree, and
hopefully 2.6.33. What do you think?

---

This implements vhost: a kernel-level backend for virtio.
The main motivation for this work is to reduce virtualization
overhead for virtio by removing system calls on the data path,
without guest changes. For virtio-net, this removes up to
4 system calls per packet: vm exit for kick, reentry for kick,
iothread wakeup for packet, interrupt injection for packet.

This driver is pretty minimal, but it's fully functional (including
migration support interfaces), and already shows a performance
(especially latency) improvement over userspace.

A more detailed description is attached to the patch itself.

The patches apply to both 2.6.32-rc6 and kvm.git. I'd like them to go
into linux-next if possible. Please comment.

Changelog from v7:
- Add note on RCU usage, mirroring this in vhost/vhost.h
- Fix locking typo noted by Eric Dumazet
- Fix warnings on 32 bit

Changelog from v6:
- review comments by Daniel Walker addressed
- checkpatch cleanup
- fix build on 32 bit
- maintainers entry corrected

Changelog from v5:
- tun support
- backends with virtio net header support (enables GSO, checksum etc)
- 32 bit compat fixed
- support indirect buffers, tx exit mitigation,
  tx interrupt mitigation
- support write logging (allows migration without virtio ring code in userspace)

Changelog from v4:
- disable rx notification when have rx buffers
- addressed all comments from Rusty's review
- copy bugfixes from lguest commits:
	ebf9a5a99c1a464afe0b4dfa64416fc8b273bc5c
	e606490c440900e50ccf73a54f6fc6150ff40815

Changelog from v3:
- checkpatch fixes

Changelog from v2:
- Comments on RCU usage
- Compat ioctl support
- Make variable static
- Copied more idiomatic english from Rusty

Changes from v1:
- Move use_mm/unuse_mm from fs/aio.c to mm instead of copying.
- Reorder code to avoid need for forward declarations
- Kill a couple of debugging printks

Michael S. Tsirkin (3):
  tun: export underlying socket
  mm: export use_mm/unuse_mm to modules
  vhost_net: a kernel-level virtio server

 MAINTAINERS                |    9 +
 arch/x86/kvm/Kconfig       |    1 +
 drivers/Makefile           |    1 +
 drivers/net/tun.c          |  101 ++++-
 drivers/vhost/Kconfig      |   11 +
 drivers/vhost/Makefile     |    2 +
 drivers/vhost/net.c        |  633 +++++++++++++++++++++++++++++
 drivers/vhost/vhost.c      |  970 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/vhost.h      |  158 +++++++
 include/linux/Kbuild       |    1 +
 include/linux/if_tun.h     |   14 +
 include/linux/miscdevice.h |    1 +
 include/linux/vhost.h      |  126 ++++++
 mm/mmu_context.c           |    3 +
 14 files changed, 2012 insertions(+), 19 deletions(-)
 create mode 100644 drivers/vhost/Kconfig
 create mode 100644 drivers/vhost/Makefile
 create mode 100644 drivers/vhost/net.c
 create mode 100644 drivers/vhost/vhost.c
 create mode 100644 drivers/vhost/vhost.h
 create mode 100644 include/linux/vhost.h

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gregory Haskins
Subject: Re: [PATCHv8 0/3] vhost: a kernel-level virtio server
Date: Wed, 04 Nov 2009 11:02:15 -0500
Message-ID: <4AF1A587.8000509@gmail.com>
References: <20091104155234.GA32673@redhat.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig9CBB576151305611F6CF6CC2"
Cc: netdev@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mingo@elte.hu, linux-mm@kvack.org, akpm@linux-foundation.org, hpa@zytor.com, Rusty Russell , s.hetze@linux-ag.com, Daniel Walker , Eric Dumazet
To: "Michael S. Tsirkin"
Return-path:
In-Reply-To: <20091104155234.GA32673@redhat.com>
Sender: owner-linux-mm@kvack.org
List-Id: netdev.vger.kernel.org

Michael S. Tsirkin wrote:
> Ok, I think I've addressed all comments so far here.
> Rusty, I'd like this to go into linux-next, through your tree, and
> hopefully 2.6.33. What do you think?

I think the benchmark data is a prerequisite for merge consideration, IMO.

Do you have anything for us to look at?

I think comparisons that show the following are of interest:

throughput (e.g. netperf::TCP_STREAM): guest->host, guest->host->guest,
guest->host->remote, host->remote, remote->host->guest

latency (e.g. netperf::UDP_RR): same conditions as throughput

cpu-utilization

others?

Ideally, this should be at least between upstream virtio and vhost.
Bonus points if you include venet as well.

Kind regards,
-Greg

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: [PATCHv8 0/3] vhost: a kernel-level virtio server
Date: Wed, 4 Nov 2009 18:23:39 +0200
Message-ID: <20091104162339.GA311@redhat.com>
References: <20091104155234.GA32673@redhat.com> <4AF1A587.8000509@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netdev@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mingo@elte.hu, linux-mm@kvack.org, akpm@linux-foundation.org, hpa@zytor.com, Rusty Russell , s.hetze@linux-ag.com, Daniel Walker , Eric Dumazet
To: Gregory Haskins
Return-path:
Content-Disposition: inline
In-Reply-To: <4AF1A587.8000509@gmail.com>
Sender: owner-linux-mm@kvack.org
List-Id: netdev.vger.kernel.org

On Wed, Nov 04, 2009 at 11:02:15AM -0500, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > Ok, I think I've addressed all comments so far here.
> > Rusty, I'd like this to go into linux-next, through your tree, and
> > hopefully 2.6.33. What do you think?
>
> I think the benchmark data is a prerequisite for merge consideration, IMO.

Shirley Ma was kind enough to send me some measurement results showing
how kernel-level acceleration helps speed things up; you can find them here:
http://www.linux-kvm.org/page/VhostNet

Generally, I think that merging should happen *before* aggressive
benchmarking/performance tuning: otherwise there is a very substantial
risk that what is an optimization in one setup hurts performance in
another one. When code is upstream, people can bisect to debug
regressions. Another good reason is that I can stop spending time
rebasing and start profiling.

> Do you have anything for us to look at?

For guest to host, compared to latest qemu with userspace virtio
backend, latency drops by a factor of 6, bandwidth doubles, cpu
utilization drops slightly :)

> I think comparisons that show the following are of interest:
>
> throughput (e.g. netperf::TCP_STREAM): guest->host, guest->host->guest,
> guest->host->remote, host->remote, remote->host->guest
>
> latency (e.g. netperf::UDP_RR): same conditions as throughput
>
> cpu-utilization
>
> others?
>
> Ideally, this should be at least between upstream virtio and vhost.
> Bonus points if you include venet as well.

And vmxnet3 :)

> Kind regards,
> -Greg
>

--
MST

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gregory Haskins
Subject: Re: [PATCHv8 0/3] vhost: a kernel-level virtio server
Date: Wed, 04 Nov 2009 14:15:42 -0500
Message-ID: <4AF1D2DE.10705@gmail.com>
References: <20091104155234.GA32673@redhat.com> <4AF1A587.8000509@gmail.com> <20091104162339.GA311@redhat.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigE41C0DE2FBFBF3E83786E82A"
Cc: netdev@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mingo@elte.hu, linux-mm@kvack.org, akpm@linux-foundation.org, hpa@zytor.com, Rusty Russell , s.hetze@linux-ag.com, Daniel Walker , Eric Dumazet
To: "Michael S. Tsirkin"
Return-path:
In-Reply-To: <20091104162339.GA311@redhat.com>
Sender: owner-linux-mm@kvack.org
List-Id: netdev.vger.kernel.org

Michael S. Tsirkin wrote:
> On Wed, Nov 04, 2009 at 11:02:15AM -0500, Gregory Haskins wrote:
>> Michael S. Tsirkin wrote:
>>> Ok, I think I've addressed all comments so far here.
>>> Rusty, I'd like this to go into linux-next, through your tree, and
>>> hopefully 2.6.33. What do you think?
>> I think the benchmark data is a prerequisite for merge consideration, IMO.
>
> Shirley Ma was kind enough to send me some measurement results showing
> how kernel-level acceleration helps speed things up; you can find them here:
> http://www.linux-kvm.org/page/VhostNet

Thanks for the pointers. I will roll your latest v8 code into our test
matrix. What kernel/qemu trees do they apply to?

-Greg

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: [PATCHv8 0/3] vhost: a kernel-level virtio server
Date: Wed, 4 Nov 2009 21:19:26 +0200
Message-ID: <20091104191926.GC772@redhat.com>
References: <20091104155234.GA32673@redhat.com> <4AF1A587.8000509@gmail.com> <20091104162339.GA311@redhat.com> <4AF1D2DE.10705@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netdev@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mingo@elte.hu, linux-mm@kvack.org, akpm@linux-foundation.org, hpa@zytor.com, Rusty Russell , s.hetze@linux-ag.com, Daniel Walker , Eric Dumazet
To: Gregory Haskins
Return-path:
Content-Disposition: inline
In-Reply-To: <4AF1D2DE.10705@gmail.com>
Sender: owner-linux-mm@kvack.org
List-Id: netdev.vger.kernel.org

On Wed, Nov 04, 2009 at 02:15:42PM -0500, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > On Wed, Nov 04, 2009 at 11:02:15AM -0500, Gregory Haskins wrote:
> >> Michael S. Tsirkin wrote:
> >>> Ok, I think I've addressed all comments so far here.
> >>> Rusty, I'd like this to go into linux-next, through your tree, and
> >>> hopefully 2.6.33. What do you think?
> >> I think the benchmark data is a prerequisite for merge consideration, IMO.
> >
> > Shirley Ma was kind enough to send me some measurement results showing
> > how kernel-level acceleration helps speed things up; you can find them here:
> > http://www.linux-kvm.org/page/VhostNet
>
> Thanks for the pointers. I will roll your latest v8 code into our test
> matrix. What kernel/qemu trees do they apply to?
>
> -Greg
>

kernel 2.6.32-rc6, qemu-kvm 47e465f031fc43c53ea8f08fa55cc3482c6435c8.

You can also use my development git trees if you like.

kernel:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost
userspace:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu-kvm.git vhost

Please note I rebase, especially the userspace tree, now and then.

--
MST