public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Patrick McHardy <kaber@trash•net>
To: Eric Dumazet <eric.dumazet@gmail•com>
Cc: David Miller <davem@davemloft•net>, netdev@vger•kernel.org
Subject: Re: net_sched 00/07: classful multiqueue dummy scheduler
Date: Mon, 07 Sep 2009 16:23:27 +0200	[thread overview]
Message-ID: <4AA5175F.6030600@trash.net> (raw)
In-Reply-To: <4AA50ACF.9010400@trash.net>

[-- Attachment #1: Type: text/plain, Size: 2247 bytes --]

Patrick McHardy wrote:
> Eric Dumazet wrote:
>> Had very litle time to test this, but got problems very fast, if rate estimator configured.
> 
> I didn't test that, but I'll look into it.
> 
>> qdisc mq 1: root
>>  Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
>>  rate 177925Kbit 49pps backlog 0b 0p requeues 0
>> qdisc pfifo 8001: parent 1:1 limit 1000p
>>  Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
>>  rate 25400bit 21pps backlog 0b 0p requeues 0
>>
>> <<<crash>>>
> 
> Did you capture the crash?
> 
>> (On another term I had a "ping -i 0.1 192.168.20.120" that gave :
>>
>> 2009/08/07 14:53:42.498 64 bytes from 192.168.20.120: icmp_seq=1982 ttl=64 time=0.126 ms
>> 2009/08/07 14:53:42.598 64 bytes from 192.168.20.120: icmp_seq=1983 ttl=64 time=0.118 ms
>> 2009/08/07 14:53:42.698 64 bytes from 192.168.20.120: icmp_seq=1984 ttl=64 time=0.114 ms
>> 2009/08/07 14:53:42.798 64 bytes from 192.168.20.120: icmp_seq=1985 ttl=64 time=0.123 ms
>> 2009/08/07 14:53:42.898 64 bytes from 192.168.20.120: icmp_seq=1986 ttl=64 time=0.126 ms
>> 2009/08/07 14:53:42.998 64 bytes from 192.168.20.120: icmp_seq=1987 ttl=64 time=0.119 ms
>> 2009/08/07 14:53:43.098 64 bytes from 192.168.20.120: icmp_seq=1988 ttl=64 time=0.122 ms
>> 2009/08/07 14:53:43.198 64 bytes from 192.168.20.120: icmp_seq=1989 ttl=64 time=0.119 ms
>> 2009/08/07 14:53:43.298 64 bytes from 192.168.20.120: icmp_seq=1990 ttl=64 time=0.117 ms
>> 2009/08/07 14:53:43.398 64 bytes from 192.168.20.120: icmp_seq=1991 ttl=64 time=0.117 ms
>> ping: sendmsg: No buffer space available
> 
> Was this also with rate estimators? No buffer space available
> indicates that some class/qdisc isn't dequeued or the packets
> are leaking, so the output of tc -s -d qdisc show ... might be
> helpful.

I figured out the bug, which is likely responsible for both
problems. When grafting a mq class and creating a rate estimator,
the new qdisc is not attached to the device queue yet and also
doesn't have TC_H_ROOT as parent, so qdisc_create() selects
qdisc_root_sleeping_lock() for the estimator, which belongs to
the qdisc that is getting replaced.

This is a patch I used for testing, but I'll come up with
something more elegant (I hope) as a final fix :)



[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1497 bytes --]

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 2a78d54..428eb34 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -732,7 +732,8 @@ static struct lock_class_key qdisc_rx_lock;
  */
 
 static struct Qdisc *
-qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
+qdisc_create(struct net_device *dev, struct Qdisc *p,
+	     struct netdev_queue *dev_queue,
 	     u32 parent, u32 handle, struct nlattr **tca, int *errp)
 {
 	int err;
@@ -810,8 +811,9 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
 		if (tca[TCA_RATE]) {
 			spinlock_t *root_lock;
 
-			if ((sch->parent != TC_H_ROOT) &&
-			    !(sch->flags & TCQ_F_INGRESS))
+			if (((sch->parent != TC_H_ROOT) &&
+			     !(sch->flags & TCQ_F_INGRESS)) &&
+			    (!p || !p->ops->attach))
 				root_lock = qdisc_root_sleeping_lock(sch);
 			else
 				root_lock = qdisc_lock(sch);
@@ -1097,7 +1099,7 @@ create_n_graft:
 	if (!(n->nlmsg_flags&NLM_F_CREATE))
 		return -ENOENT;
 	if (clid == TC_H_INGRESS)
-		q = qdisc_create(dev, &dev->rx_queue,
+		q = qdisc_create(dev, p, &dev->rx_queue,
 				 tcm->tcm_parent, tcm->tcm_parent,
 				 tca, &err);
 	else {
@@ -1106,7 +1108,7 @@ create_n_graft:
 		if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue)
 			ntx = p->ops->cl_ops->select_queue(p, tcm);
 
-		q = qdisc_create(dev, netdev_get_tx_queue(dev, ntx),
+		q = qdisc_create(dev, p, netdev_get_tx_queue(dev, ntx),
 				 tcm->tcm_parent, tcm->tcm_handle,
 				 tca, &err);
 	}

  reply	other threads:[~2009-09-07 14:23 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-04 16:41 net_sched 00/07: classful multiqueue dummy scheduler Patrick McHardy
2009-09-04 16:41 ` net_sched 01/07: fix class grafting errno codes Patrick McHardy
2009-09-04 16:41 ` net_sched 02/07: make cls_ops->tcf_chain() optional Patrick McHardy
2009-09-05  8:13   ` Jarek Poplawski
2009-09-05 11:57     ` Jarek Poplawski
2009-09-05 12:32       ` Jarek Poplawski
2009-09-05 17:03         ` Patrick McHardy
2009-09-06  9:06           ` David Miller
2009-09-04 16:41 ` net_sched 03/07: make cls_ops->change and cls_ops->delete optional Patrick McHardy
2009-09-04 16:41 ` net_sched 04/07: remove some unnecessary checks in classful schedulers Patrick McHardy
2009-09-04 16:41 ` net_sched 05/07: reintroduce dev->qdisc for use by sch_api Patrick McHardy
2009-09-06 18:57   ` Jarek Poplawski
2009-09-07 13:16     ` Patrick McHardy
2009-09-07 16:49       ` Jarek Poplawski
2009-09-04 16:41 ` net_sched 06/07: move dev_graft_qdisc() to sch_generic.c Patrick McHardy
2009-09-04 16:41 ` net_sched 07/07: add classful multiqueue dummy scheduler Patrick McHardy
2009-09-06 20:04   ` Jarek Poplawski
2009-09-07 13:27     ` Patrick McHardy
2009-09-07 18:22       ` Jarek Poplawski
2009-09-07 19:24       ` Jarek Poplawski
2009-09-07 19:49         ` Eric Dumazet
2009-09-09 16:02           ` Patrick McHardy
2009-09-09 19:52             ` Jarek Poplawski
2009-09-10 11:28               ` Patrick McHardy
2009-09-11 21:38                 ` Jarek Poplawski
2009-09-11 22:10                   ` David Miller
2009-09-11 22:21                     ` Jarek Poplawski
2009-09-11 22:27                       ` David Miller
2009-09-09 16:01         ` Patrick McHardy
2009-09-04 16:42 ` net_sched 00/07: " Patrick McHardy
2009-09-07  8:50   ` David Miller
2009-09-07  9:46     ` Jarek Poplawski
2009-09-07 13:00     ` Eric Dumazet
2009-09-07 13:29       ` Patrick McHardy
2009-09-07 14:23         ` Patrick McHardy [this message]
2009-09-07 17:21           ` Eric Dumazet
2009-09-07 17:28             ` Patrick McHardy
2009-09-07 17:30               ` Eric Dumazet
2009-09-07 17:33                 ` Patrick McHardy
2009-09-07 17:38                   ` Eric Dumazet
2009-09-07 17:46                     ` Patrick McHardy
2009-09-08  9:31           ` David Miller
2009-09-08 15:53             ` Patrick McHardy
2009-09-05  7:27 ` David Miller
2009-09-05 17:02   ` Patrick McHardy
2009-09-06  9:01     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AA5175F.6030600@trash.net \
    --to=kaber@trash$(echo .)net \
    --cc=davem@davemloft$(echo .)net \
    --cc=eric.dumazet@gmail$(echo .)com \
    --cc=netdev@vger$(echo .)kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox