From: Rusty Russell <rusty@rustcorp•com.au>
To: "Michael S. Tsirkin" <mst@redhat•com>
Cc: Krishna Kumar2 <krkumar2@in•ibm.com>,
David Miller <davem@davemloft•net>,
kvm@vger•kernel.org, Shirley Ma <mashirle@us•ibm.com>,
netdev@vger•kernel.org, steved@us•ibm.com
Subject: Re: Network performance with small packets
Date: Wed, 9 Feb 2011 11:07:20 +1030
Message-ID: <201102091107.20270.rusty@rustcorp.com.au>
In-Reply-To: <20110202044222.GC3818@redhat.com>
On Wed, 2 Feb 2011 03:12:22 pm Michael S. Tsirkin wrote:
> On Wed, Feb 02, 2011 at 10:09:18AM +0530, Krishna Kumar2 wrote:
> > > "Michael S. Tsirkin" <mst@redhat•com> 02/02/2011 03:11 AM
> > >
> > > On Tue, Feb 01, 2011 at 01:28:45PM -0800, Shirley Ma wrote:
> > > > On Tue, 2011-02-01 at 23:21 +0200, Michael S. Tsirkin wrote:
> > > > > Confused. We compare capacity to skb frags, no?
> > > > > That's sg I think ...
> > > >
> > > > Current guest kernel use indirect buffers, num_free returns how many
> > > > available descriptors not skb frags. So it's wrong here.
> > > >
> > > > Shirley
> > >
> > > I see. Good point. In other words when we complete the buffer
> > > it was indirect, but when we add a new one we
> > > can not allocate indirect so we consume.
> > > And then we start the queue and add will fail.
> > > I guess we need some kind of API to figure out
> > > whether the buf we complete was indirect?

I've finally read this thread... I think we need to get more serious
with our stats gathering to diagnose this kind of performance issue.
This is a start; it should tell us what is actually happening to the
virtio ring(s) without significant performance impact...
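
For readers skimming the quoted exchange, the capacity mismatch discussed above can be modelled roughly as follows. This is a userspace sketch, not the virtio_ring.c code: `struct ring_model`, `descs_needed()` and `add_buf()` are hypothetical names, and the sketch assumes that a multi-segment buffer takes one ring descriptor when an indirect table can be allocated and one descriptor per segment otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: only the free-descriptor count matters here. */
struct ring_model {
	unsigned int num_free;
};

/*
 * Assumption: a multi-segment buffer takes one ring descriptor when an
 * indirect table can be allocated, and one descriptor per segment otherwise.
 */
static unsigned int descs_needed(unsigned int nsegs, bool indirect_ok)
{
	assert(nsegs > 0);	/* an empty buffer is never added */
	return (indirect_ok && nsegs > 1) ? 1 : nsegs;
}

/* Returns the number of descriptors consumed, or -1 if the ring lacks room. */
static int add_buf(struct ring_model *r, unsigned int nsegs, bool indirect_ok)
{
	unsigned int need = descs_needed(nsegs, indirect_ok);

	if (r->num_free < need)
		return -1;
	r->num_free -= need;
	return (int)need;
}
```

With a single free descriptor (say, just after one indirect buffer completed), `add_buf(&r, 3, true)` succeeds while `add_buf(&r, 3, false)` fails. That is the situation described in the quote: the queue is restarted on the strength of one freed descriptor, and the next add fails because it cannot go indirect.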

Subject: virtio: CONFIG_VIRTIO_STATS

For performance problems we'd like to know exactly what the ring looks
like. This patch adds stats indexed by how-full-ring-is; we could extend
it to also record them by how-used-ring-is if we need.

Signed-off-by: Rusty Russell <rusty@rustcorp•com.au>

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -7,6 +7,14 @@ config VIRTIO_RING
tristate
depends on VIRTIO
+config VIRTIO_STATS
+ bool "Virtio debugging stats (EXPERIMENTAL)"
+ depends on VIRTIO_RING
+ select DEBUG_FS
+ ---help---
+ Virtio stats collected by how full the ring is at any time,
+ presented under debugfs/virtio-stats/<name>-<vq>/<num-used>/
+
config VIRTIO_PCI
tristate "PCI driver for virtio devices (EXPERIMENTAL)"
depends on PCI && EXPERIMENTAL
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -21,6 +21,7 @@
#include <linux/virtio_config.h>
#include <linux/device.h>
#include <linux/slab.h>
+#include <linux/debugfs.h>
/* virtio guest is communicating with a virtual "device" that actually runs on
* a host processor. Memory barriers are used to control SMP effects. */
@@ -95,6 +96,11 @@ struct vring_virtqueue
/* How to notify other side. FIXME: commonalize hcalls! */
void (*notify)(struct virtqueue *vq);
+#ifdef CONFIG_VIRTIO_STATS
+ struct vring_stat *stats;
+ struct dentry *statdir;
+#endif
+
#ifdef DEBUG
/* They're supposed to lock for us. */
unsigned int in_use;
@@ -106,6 +112,87 @@ struct vring_virtqueue
#define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
+#ifdef CONFIG_VIRTIO_STATS
+/* We have an array of these, indexed by how full the ring is. */
+struct vring_stat {
+ /* How many interrupts? */
+ size_t interrupt_nowork, interrupt_work;
+ /* How many non-notify kicks, how many notify kicks, how many add notify? */
+ size_t kick_no_notify, kick_notify, add_notify;
+ /* How many adds? */
+ size_t add_direct, add_indirect, add_fail;
+ /* How many gets? */
+ size_t get;
+ /* How many disable callbacks? */
+ size_t disable_cb;
+ /* How many enables? */
+ size_t enable_cb_retry, enable_cb_success;
+};
+
+static struct dentry *virtio_stats;
+
+static void create_stat_files(struct vring_virtqueue *vq)
+{
+ char name[80];
+ unsigned int i;
+
+ /* Racy in theory, but we don't care. */
+ if (!virtio_stats)
+ virtio_stats = debugfs_create_dir("virtio-stats", NULL);
+
+ snprintf(name, sizeof(name), "%s-%s", dev_name(&vq->vq.vdev->dev), vq->vq.name);
+ vq->statdir = debugfs_create_dir(name, virtio_stats);
+
+ for (i = 0; i < vq->vring.num; i++) {
+ struct dentry *dir;
+
+ sprintf(name, "%u", i);
+ dir = debugfs_create_dir(name, vq->statdir);
+ debugfs_create_size_t("interrupt_nowork", 0400, dir,
+ &vq->stats[i].interrupt_nowork);
+ debugfs_create_size_t("interrupt_work", 0400, dir,
+ &vq->stats[i].interrupt_work);
+ debugfs_create_size_t("kick_no_notify", 0400, dir,
+ &vq->stats[i].kick_no_notify);
+ debugfs_create_size_t("kick_notify", 0400, dir,
+ &vq->stats[i].kick_notify);
+ debugfs_create_size_t("add_notify", 0400, dir,
+ &vq->stats[i].add_notify);
+ debugfs_create_size_t("add_direct", 0400, dir,
+ &vq->stats[i].add_direct);
+ debugfs_create_size_t("add_indirect", 0400, dir,
+ &vq->stats[i].add_indirect);
+ debugfs_create_size_t("add_fail", 0400, dir,
+ &vq->stats[i].add_fail);
+ debugfs_create_size_t("get", 0400, dir,
+ &vq->stats[i].get);
+ debugfs_create_size_t("disable_cb", 0400, dir,
+ &vq->stats[i].disable_cb);
+ debugfs_create_size_t("enable_cb_retry", 0400, dir,
+ &vq->stats[i].enable_cb_retry);
+ debugfs_create_size_t("enable_cb_success", 0400, dir,
+ &vq->stats[i].enable_cb_success);
+ }
+}
+
+static void delete_stat_files(struct vring_virtqueue *vq)
+{
+ debugfs_remove_recursive(vq->statdir);
+}
+
+#define add_stat(vq, name) \
+ do { \
+ struct vring_virtqueue *_vq = (vq); \
+ _vq->stats[_vq->vring.num - _vq->num_free].name++; \
+ } while (0)
+
+#else
+#define add_stat(vq, name) do { } while (0)
+static void delete_stat_files(struct vring_virtqueue *vq)
+{
+}
+#endif
+
/* Set up an indirect table of descriptors and add it to the queue. */
static int vring_add_indirect(struct vring_virtqueue *vq,
struct scatterlist sg[],
@@ -121,6 +208,8 @@ static int vring_add_indirect(struct vri
if (!desc)
return -ENOMEM;
+ add_stat(vq, add_indirect);
+
/* Transfer entries from the sg list into the indirect page */
for (i = 0; i < out; i++) {
desc[i].flags = VRING_DESC_F_NEXT;
@@ -183,17 +272,22 @@ int virtqueue_add_buf_gfp(struct virtque
BUG_ON(out + in == 0);
if (vq->num_free < out + in) {
+ add_stat(vq, add_fail);
pr_debug("Can't add buf len %i - avail = %i\n",
out + in, vq->num_free);
/* FIXME: for historical reasons, we force a notify here if
* there are outgoing parts to the buffer. Presumably the
* host should service the ring ASAP. */
- if (out)
+ if (out) {
+ add_stat(vq, add_notify);
vq->notify(&vq->vq);
+ }
END_USE(vq);
return -ENOSPC;
}
+ add_stat(vq, add_direct);
+
/* We're about to use some buffers from the free list. */
vq->num_free -= out + in;
@@ -248,9 +342,12 @@ void virtqueue_kick(struct virtqueue *_v
/* Need to update avail index before checking if we should notify */
virtio_mb();
- if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY))
+ if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY)) {
+ add_stat(vq, kick_notify);
/* Prod other side to tell it about changes. */
vq->notify(&vq->vq);
+ } else
+ add_stat(vq, kick_no_notify);
END_USE(vq);
}
@@ -294,6 +391,8 @@ void *virtqueue_get_buf(struct virtqueue
START_USE(vq);
+ add_stat(vq, get);
+
if (unlikely(vq->broken)) {
END_USE(vq);
return NULL;
@@ -333,6 +432,7 @@ void virtqueue_disable_cb(struct virtque
{
struct vring_virtqueue *vq = to_vvq(_vq);
+ add_stat(vq, disable_cb);
vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
}
EXPORT_SYMBOL_GPL(virtqueue_disable_cb);
@@ -348,10 +448,12 @@ bool virtqueue_enable_cb(struct virtqueu
vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
virtio_mb();
if (unlikely(more_used(vq))) {
+ add_stat(vq, enable_cb_retry);
END_USE(vq);
return false;
}
+ add_stat(vq, enable_cb_success);
END_USE(vq);
return true;
}
@@ -387,10 +489,12 @@ irqreturn_t vring_interrupt(int irq, voi
struct vring_virtqueue *vq = to_vvq(_vq);
if (!more_used(vq)) {
+ add_stat(vq, interrupt_nowork);
pr_debug("virtqueue interrupt with no work for %p\n", vq);
return IRQ_NONE;
}
+ add_stat(vq, interrupt_work);
if (unlikely(vq->broken))
return IRQ_HANDLED;
@@ -451,6 +555,15 @@ struct virtqueue *vring_new_virtqueue(un
}
vq->data[i] = NULL;
+#ifdef CONFIG_VIRTIO_STATS
+ vq->stats = kcalloc(num + 1, sizeof(*vq->stats), GFP_KERNEL);
+ if (!vq->stats) {
+ kfree(vq);
+ return NULL;
+ }
+ create_stat_files(vq);
+#endif
+
return &vq->vq;
}
EXPORT_SYMBOL_GPL(vring_new_virtqueue);
@@ -458,6 +571,7 @@ EXPORT_SYMBOL_GPL(vring_new_virtqueue);
void vring_del_virtqueue(struct virtqueue *vq)
{
list_del(&vq->list);
+ delete_stat_files(to_vvq(vq));
kfree(to_vvq(vq));
}
EXPORT_SYMBOL_GPL(vring_del_virtqueue);
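
The bucketing idea in the patch (one set of counters per ring occupancy level) can be tried outside the kernel with a small model like the one below. It is a sketch under stated assumptions rather than the patch itself: `struct ring_model`, `ring_init()`, `ring_add()` and `ring_get()` are hypothetical names, only three of the counters are kept, and the bucket index is taken to be the number of descriptors in use (`num - num_free`), which ranges from 0 to `num` inclusive and so needs `num + 1` buckets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical subset of the patch's per-bucket counters. */
struct ring_stat {
	size_t add_direct, add_fail, get;
};

/* Hypothetical userspace stand-in for the ring's bookkeeping. */
struct ring_model {
	unsigned int num;		/* ring size */
	unsigned int num_free;		/* free descriptors */
	struct ring_stat *stats;	/* num + 1 buckets: 0..num in use */
};

/* Bump a counter in the bucket for the current number of used descriptors. */
#define add_stat(r, name)						\
	do {								\
		struct ring_model *_r = (r);				\
		_r->stats[_r->num - _r->num_free].name++;		\
	} while (0)

static int ring_init(struct ring_model *r, unsigned int num)
{
	r->num = num;
	r->num_free = num;
	r->stats = calloc((size_t)num + 1, sizeof(*r->stats));
	return r->stats ? 0 : -1;
}

/* Try to take ndesc descriptors; the event is counted before the ring changes. */
static int ring_add(struct ring_model *r, unsigned int ndesc)
{
	if (r->num_free < ndesc) {
		add_stat(r, add_fail);
		return -1;
	}
	add_stat(r, add_direct);
	r->num_free -= ndesc;
	return 0;
}

/* Return ndesc descriptors to the free pool. */
static void ring_get(struct ring_model *r, unsigned int ndesc)
{
	assert(ndesc <= r->num - r->num_free);
	add_stat(r, get);
	r->num_free += ndesc;
}
```

On a ring of size 4, two adds of two descriptors each land in buckets 0 and 2, and a third add then fails and is counted in bucket 4 (ring completely full). Bucket 4 is the reason this model allocates `num + 1` entries rather than `num`.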