From: Eric Dumazet <dada1@cosmosbay•com>
To: David Miller <davem@davemloft•net>
Cc: jarkao2@gmail•com, vexwek@gmail•com, netdev@vger•kernel.org,
kaber@trash•net, devik@cdi•cz
Subject: [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps
Date: Tue, 19 May 2009 01:59:55 +0200 [thread overview]
Message-ID: <4A11F67B.3050805@cosmosbay.com> (raw)
In-Reply-To: <20090518.145233.212710505.davem@davemloft.net>
David Miller a écrit :
> From: Jarek Poplawski <jarkao2@gmail•com>
> Date: Mon, 18 May 2009 19:23:49 +0200
>
>> On Mon, May 18, 2009 at 06:40:56PM +0200, Eric Dumazet wrote:
>>> With a typical estimator "1sec 8sec", ewma_log value is 3
>>>
>>> At gigabit speeds, we are very close to overflow yes, since
>>> we only have 27 bits available, so 134217728 bytes per second
>>> or 1073741824 bits per second.
>>>
>>> So formula :
>>> e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
>>> is going to overflow.
>>>
>>> One way to avoid the overflow would be to use a smaller estimator, like "500ms 4sec"
>>>
>>> Or use a 64bits rate & avbps, this is needed fo 10Gb speeds I suppose...
>> Yes, I considered this too, but because of an overhead I decided to
>> fix as designed (according to the comment) for now. But probably you
>> are right, and we should go further, so I'm OK with your patch.
>
> I like this patch too, Eric can you submit this formally with
> proper signoffs etc.?
>
Sure, here it is. We might need a similar patch to get a correct pps value
too, since we currently are limited to ~ 2^21 packets per second.
[PATCH] pkt_sched: gen_estimator: use 64 bit intermediate counters for bps
gen_estimator can overflow bps (bytes per second) with Gb links, while
it was designed with a u32 API, with a theorical limit of 34360Mbit (2^32 bytes)
Using 64 bit intermediate avbps/brate counters can allow us to reach this
theorical limit.
Signed-off-by: Eric Dumazet <dada1@cosmosbay•com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail•com>
---
diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index 9cc9f95..ea28659 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -66,9 +66,9 @@
NOTES.
- * The stored value for avbps is scaled by 2^5, so that maximal
- rate is ~1Gbit, avpps is scaled by 2^10.
-
+ * avbps is scaled by 2^5, avpps is scaled by 2^10.
+ * both values are reported as 32 bit unsigned values. bps can
+ overflow for fast links : max speed being 34360Mbit/sec
* Minimal interval is HZ/4=250msec (it is the greatest common divisor
for HZ=100 and HZ=1024 8)), maximal interval
is (HZ*2^EST_MAX_INTERVAL)/4 = 8sec. Shorter intervals
@@ -86,9 +86,9 @@ struct gen_estimator
spinlock_t *stats_lock;
int ewma_log;
u64 last_bytes;
+ u64 avbps;
u32 last_packets;
u32 avpps;
- u32 avbps;
struct rcu_head e_rcu;
struct rb_node node;
};
@@ -115,6 +115,7 @@ static void est_timer(unsigned long arg)
rcu_read_lock();
list_for_each_entry_rcu(e, &elist[idx].list, list) {
u64 nbytes;
+ u64 brate;
u32 npackets;
u32 rate;
@@ -125,9 +126,9 @@ static void est_timer(unsigned long arg)
nbytes = e->bstats->bytes;
npackets = e->bstats->packets;
- rate = (nbytes - e->last_bytes)<<(7 - idx);
+ brate = (nbytes - e->last_bytes)<<(7 - idx);
e->last_bytes = nbytes;
- e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
+ e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
e->rate_est->bps = (e->avbps+0xF)>>5;
rate = (npackets - e->last_packets)<<(12 - idx);
next prev parent reply other threads:[~2009-05-19 0:00 UTC|newest]
Thread overview: 104+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <298f5c050905150745p13dc226eia1ff50ffa8c4b300@mail.gmail.com>
2009-05-15 14:49 ` HTB accuracy for high speed Antonio Almeida
2009-05-15 18:12 ` Stephen Hemminger
2009-05-18 10:01 ` Antonio Almeida
2009-05-18 10:45 ` Jarek Poplawski
2009-05-18 12:27 ` Antonio Almeida
2009-05-18 12:32 ` Jarek Poplawski
2009-05-18 16:13 ` Stephen Hemminger
2009-05-18 18:03 ` Antonio Almeida
2009-05-18 22:02 ` Stephen Hemminger
2009-05-19 11:48 ` Antonio Almeida
2009-05-19 13:08 ` Antonio Almeida
2009-05-16 8:31 ` Jarek Poplawski
2009-05-18 10:39 ` Antonio Almeida
2009-05-18 11:14 ` Jarek Poplawski
2009-05-18 12:05 ` Antonio Almeida
2009-05-16 14:14 ` Jarek Poplawski
2009-05-18 14:36 ` Antonio Almeida
2009-05-18 23:14 ` Vladimir Ivashchenko
2009-05-18 23:27 ` Vladimir Ivashchenko
2009-05-19 11:03 ` Jarek Poplawski
2009-05-19 14:04 ` Vladimir Ivashchenko
2009-05-19 20:10 ` Jarek Poplawski
2009-05-20 22:07 ` Vladimir Ivashchenko
2009-05-20 22:46 ` Eric Dumazet
2009-05-21 7:20 ` Jarek Poplawski
2009-05-21 7:44 ` Vladimir Ivashchenko
2009-05-21 8:28 ` Jarek Poplawski
2009-05-21 9:07 ` Eric Dumazet
2009-05-21 9:22 ` Jarek Poplawski
2009-05-23 10:37 ` HTB accuracy for high speed (and bonding) Vladimir Ivashchenko
2009-05-23 14:34 ` Jarek Poplawski
2009-05-23 15:06 ` Vladimir Ivashchenko
2009-05-23 15:35 ` Jarek Poplawski
2009-05-23 15:53 ` Vladimir Ivashchenko
2009-05-23 16:02 ` Jarek Poplawski
2009-05-18 16:40 ` HTB accuracy for high speed Eric Dumazet
2009-05-18 17:23 ` Jarek Poplawski
2009-05-18 21:52 ` David Miller
2009-05-18 23:59 ` Eric Dumazet [this message]
2009-05-19 2:27 ` [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps David Miller
2009-05-19 7:02 ` Jarek Poplawski
2009-05-19 7:31 ` Eric Dumazet
2009-05-19 7:42 ` Jarek Poplawski
2009-05-19 7:57 ` Jarek Poplawski
2009-05-19 18:03 ` Eric Dumazet
2009-05-19 19:09 ` [PATCH] pkt_sched: gen_estimator: Fix signed integers right-shifts Jarek Poplawski
2009-05-26 5:47 ` David Miller
2009-05-19 8:18 ` [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps David Miller
2009-05-17 20:15 ` HTB accuracy for high speed Jarek Poplawski
2009-05-18 6:56 ` [PATCH iproute2] " Jarek Poplawski
2009-05-18 16:54 ` Antonio Almeida
2009-05-18 17:16 ` Antonio Almeida
2009-05-21 8:51 ` Jarek Poplawski
2009-05-22 17:42 ` Antonio Almeida
2009-05-23 7:32 ` Jarek Poplawski
2009-05-28 18:13 ` Antonio Almeida
2009-05-28 21:12 ` Jarek Poplawski
2009-05-29 17:02 ` Antonio Almeida
2009-05-29 17:28 ` Stephen Hemminger
2009-05-29 19:58 ` Jarek Poplawski
2009-05-29 19:46 ` Jarek Poplawski
2009-05-29 20:49 ` Stephen Hemminger
2009-05-29 20:59 ` Jarek Poplawski
2009-05-30 20:07 ` Jarek Poplawski
2009-06-02 10:12 ` Antonio Almeida
2009-06-02 11:45 ` Antonio Almeida
2009-06-02 12:36 ` Jarek Poplawski
2009-06-02 12:45 ` Patrick McHardy
2009-06-02 13:08 ` Jarek Poplawski
2009-06-02 13:20 ` Patrick McHardy
2009-06-02 21:37 ` Jarek Poplawski
2009-06-02 21:50 ` Jarek Poplawski
2009-06-03 7:06 ` Patrick McHardy
2009-06-03 7:40 ` Jarek Poplawski
2009-06-03 7:53 ` Patrick McHardy
2009-06-03 8:01 ` Jarek Poplawski
2009-06-03 8:29 ` Patrick McHardy
2009-06-03 8:45 ` Jarek Poplawski
2009-06-03 9:54 ` Jarek Poplawski
2009-06-03 10:01 ` Patrick McHardy
2009-06-03 10:05 ` Patrick McHardy
2009-06-03 10:06 ` Patrick McHardy
2009-06-03 10:27 ` Jarek Poplawski
2009-06-04 13:50 ` Antonio Almeida
[not found] ` <20090604193013.GA2755@ami.dom.local>
[not found] ` <4A282216.20203@trash.net>
[not found] ` <20090604194203.GB2755@ami.dom.local>
2009-06-09 5:25 ` Badalian Vyacheslav
2009-06-09 5:49 ` Jarek Poplawski
2009-06-04 4:53 ` David Miller
2009-06-04 7:50 ` Jarek Poplawski
2009-05-18 17:53 ` Jarek Poplawski
2009-05-18 18:23 ` Antonio Almeida
2009-05-18 18:32 ` Jarek Poplawski
2009-05-18 18:56 ` Antonio Almeida
2009-05-18 19:05 ` Jarek Poplawski
2009-05-19 10:55 ` Antonio Almeida
2009-05-19 11:04 ` Denys Fedoryschenko
2009-05-19 11:18 ` Jarek Poplawski
2009-05-19 11:21 ` Denys Fedoryschenko
2009-05-19 11:28 ` Jarek Poplawski
2009-05-19 14:31 ` Antonio Almeida
2009-05-19 11:09 ` Jarek Poplawski
2009-05-19 13:18 ` Jesper Dangaard Brouer
2009-05-19 19:35 ` Jarek Poplawski
2009-05-18 7:01 ` [PATCH iproute2 v2] " Jarek Poplawski
2009-05-17 20:29 ` Vladimir Ivashchenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4A11F67B.3050805@cosmosbay.com \
--to=dada1@cosmosbay$(echo .)com \
--cc=davem@davemloft$(echo .)net \
--cc=devik@cdi$(echo .)cz \
--cc=jarkao2@gmail$(echo .)com \
--cc=kaber@trash$(echo .)net \
--cc=netdev@vger$(echo .)kernel.org \
--cc=vexwek@gmail$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox