public inbox for netdev@vger.kernel.org 
From: Andrew Gallatin <gallatin@myri•com>
To: Eric Dumazet <dada1@cosmosbay•com>
Cc: Herbert Xu <herbert@gondor•apana.org.au>,
	David Miller <davem@davemloft•net>,
	brice@myri•com, sgruszka@redhat•com, netdev@vger•kernel.org
Subject: Re: [PATCH] myr10ge: again fix lro_gen_skb() alignment
Date: Wed, 29 Apr 2009 10:18:39 -0400
Message-ID: <49F861BF.7060403@myri.com>
In-Reply-To: <49F85BF1.1020501@cosmosbay.com>

Eric Dumazet wrote:
 > Andrew Gallatin wrote:
 >> Andrew Gallatin wrote:
 >>> For variety, I grabbed a different "slow" receiver.  This is another
 >>> 2 CPU machine, but a dual-socket single-core opteron (Tyan S2895)
 >>>
 >>> processor       : 0
 >>> vendor_id       : AuthenticAMD
 >>> cpu family      : 15
 >>> model           : 37
 >>> model name      : AMD Opteron(tm) Processor 252
 >> <...>
 >>> The sender was an identical machine running an ancient RHEL4 kernel
 >>> (2.6.9-42.ELsmp) and our downloadable (backported) driver.
 >>> (http://www.myri.com/ftp/pub/Myri10GE/myri10ge-linux.1.4.4.tgz)
 >>> I disabled LRO on the sender.
 >>>
 >>> Binding the IRQ to CPU0, and the netserver to CPU1 I see 8.1Gb/s with
 >>> LRO and 8.0Gb/s with GRO.
 >> With the recent patch to fix idle CPU time accounting from LKML applied,
 >> it is again possible to trust netperf's service demand (based on %CPU).
 >> So here is raw netperf output for LRO and GRO, bound as above.
 >>
 >> TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
 >> hail1-m.sw.myri.com (10.0.130.167) port 0 AF_INET : cpu bind
 >> Recv   Send    Send                          Utilization       Service Demand
 >> Socket Socket  Message  Elapsed              Send     Recv     Send     Recv
 >> Size   Size    Size     Time     Throughput  local    remote   local    remote
 >> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB    us/KB
 >>
 >> LRO:
 >>  87380  65536  65536    60.00      8279.36   8.10     77.55    0.160    1.535
 >> GRO:
 >>  87380  65536  65536    60.00      8053.19   7.86     85.47    0.160    1.739
 >>
 >> The difference is bigger if you disable TCP timestamps (and thus shrink
 >> the packet headers so they require fewer cachelines):
 >> LRO:
 >>  87380  65536  65536    60.02      7753.55   8.01     74.06    0.169    1.565
 >> GRO:
 >>  87380  65536  65536    60.02      7535.12   7.27     84.57    0.158    1.839
 >>
 >>
 >> As you can see, even though the raw bandwidth is very close, the
 >> service demand makes it clear that GRO is more expensive
 >> than LRO.  I just wish I understood why.
 >>
 >
 > What are the "vmstat 1" outputs for both tests? Any difference in, say,
 > context switches?

Not much difference is apparent from vmstat, except for a
lower load and slightly higher IRQ rate from LRO:

LRO:
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
  1  0      0 676960  19280 209812    0    0     0     0 14817   24  0 73 27  0  0
  1  0      0 677084  19280 209812    0    0     0     0 14834   20  0 73 27  0  0
  1  0      0 676916  19280 209812    0    0     0     0 14833   16  0 74 26  0  0


GRO:
  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
  1  0      0 678244  18008 209784    0    0     0    24 14288   32  0 84 16  0  0
  1  0      0 678268  18008 209788    0    0     0     0 14403   22  0 85 15  0  0
  1  0      0 677956  18008 209788    0    0     0     0 14331   20  0 84 16  0  0




The real difference is visible mainly from mpstat on the CPU handling the
interrupts, where %soft is about 66% with GRO versus about 45% with LRO:

LRO:
07:15:16     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
07:15:17       0    0.00    0.00    0.00    0.00    0.00   45.00    0.00   55.00  12907.92
07:15:18       0    0.00    0.00    1.00    0.00    2.00   43.00    0.00   54.00  12707.92
07:15:19       0    0.00    0.00    1.00    0.00    0.00   46.00    0.00   53.00  12825.00


GRO:
07:11:59     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
07:12:00       0    0.00    0.00    0.00    0.00    0.99   66.34    0.00   32.67  12242.57
07:12:01       0    0.00    0.00    0.00    0.00    1.01   66.67    0.00   32.32  12220.00
07:12:02       0    0.00    0.00    0.99    0.00    0.99   65.35    0.00   32.67  12336.00


So it looks like something GRO does in softirq context is more
expensive than what LRO does.
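
As a rough cross-check of the figures above, the receive-side service
demand can be recomputed from the reported throughput and remote CPU
utilization. This is only a sketch: the formula, the assumption that the
utilization is a fraction of both CPUs on this 2 CPU receiver, and the
use of 1 KB = 1024 bytes are guesses that happen to reproduce the
reported us/KB values, not something taken from the netperf source. The
function and variable names are made up for illustration.

```python
# Sketch: recompute netperf-style receive service demand (us of CPU per KB)
# from the throughput and remote CPU utilization quoted in this thread.
# Assumptions (not confirmed from netperf itself): utilization is a
# percentage of all CPUs, the receiver has 2 CPUs, and 1 KB = 1024 bytes.

N_CPUS = 2  # dual-socket single-core Opteron receiver


def service_demand_us_per_kb(cpu_percent, n_cpus, throughput_mbps):
    """CPU microseconds consumed per KB transferred."""
    cpu_us_per_sec = cpu_percent / 100.0 * n_cpus * 1e6
    kb_per_sec = throughput_mbps * 1e6 / 8 / 1024
    return cpu_us_per_sec / kb_per_sec


# (label, throughput in 10^6 bits/s, remote CPU %) copied from the runs above
runs = {
    "timestamps on": {"LRO": (8279.36, 77.55), "GRO": (8053.19, 85.47)},
    "timestamps off": {"LRO": (7753.55, 74.06), "GRO": (7535.12, 84.57)},
}

for case, results in runs.items():
    demand = {
        name: service_demand_us_per_kb(cpu, N_CPUS, mbps)
        for name, (mbps, cpu) in results.items()
    }
    extra = (demand["GRO"] / demand["LRO"] - 1.0) * 100.0
    print(f"{case}: LRO {demand['LRO']:.3f} us/KB, "
          f"GRO {demand['GRO']:.3f} us/KB, GRO costs {extra:.1f}% more")

# Mean %soft on the interrupt CPU, from the three mpstat samples per run.
soft_lro = [45.00, 43.00, 46.00]
soft_gro = [66.34, 66.67, 65.35]
mean_lro = sum(soft_lro) / len(soft_lro)
mean_gro = sum(soft_gro) / len(soft_gro)
print(f"mean %soft: LRO {mean_lro:.2f}, GRO {mean_gro:.2f}")
```

Under these assumptions the recomputed values agree with the Recv
service demand column to three decimals, which suggests the GRO penalty
is real CPU cost per byte rather than an artifact of the slightly lower
throughput.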

Drew
