From: Jesper Dangaard Brouer <brouer@redhat•com>
To: Eric Dumazet <eric.dumazet@gmail•com>
Cc: Rick Jones <rick.jones2@hpe•com>,
netdev@vger•kernel.org, Saeed Mahameed <saeedm@mellanox•com>,
Tariq Toukan <tariqt@mellanox•com>,
brouer@redhat•com
Subject: Re: Netperf UDP issue with connected sockets
Date: Thu, 17 Nov 2016 22:19:41 +0100 [thread overview]
Message-ID: <20161117221941.3b525181@redhat.com> (raw)
In-Reply-To: <1479408683.8455.273.camel@edumazet-glaptop3.roam.corp.google.com>
On Thu, 17 Nov 2016 10:51:23 -0800
Eric Dumazet <eric.dumazet@gmail•com> wrote:
> On Thu, 2016-11-17 at 19:30 +0100, Jesper Dangaard Brouer wrote:
>
> > The point is I can see a socket Send-Q forming, thus we do know the
> > application have something to send. Thus, and possibility for
> > non-opportunistic bulking. Allowing/implementing bulk enqueue from
> > socket layer into qdisc layer, should be fairly simple (and rest of
> > xmit_more is already in place).
>
>
> As I said, you are fooled by TX completions.
>
> Please make sure to increase the sndbuf limits !
>
> echo 2129920 >/proc/sys/net/core/wmem_default
>
> lpaa23:~# sar -n DEV 1 10|grep eth1
> 10:49:25 eth1 7.00 9273283.00 0.61 2187214.90 0.00 0.00 0.00
> 10:49:26 eth1 1.00 9230795.00 0.06 2176787.57 0.00 0.00 1.00
> 10:49:27 eth1 2.00 9247906.00 0.17 2180915.45 0.00 0.00 0.00
> 10:49:28 eth1 3.00 9246542.00 0.23 2180790.38 0.00 0.00 1.00
> 10:49:29 eth1 1.00 9239218.00 0.06 2179044.83 0.00 0.00 0.00
> 10:49:30 eth1 3.00 9248775.00 0.23 2181257.84 0.00 0.00 1.00
> 10:49:31 eth1 4.00 9225471.00 0.65 2175772.75 0.00 0.00 0.00
> 10:49:32 eth1 2.00 9253536.00 0.33 2182666.44 0.00 0.00 1.00
> 10:49:33 eth1 1.00 9265900.00 0.06 2185598.40 0.00 0.00 0.00
> 10:49:34 eth1 1.00 6949031.00 0.06 1638889.63 0.00 0.00 1.00
> Average: eth1 2.50 9018045.70 0.25 2126893.82 0.00 0.00 0.50
>
>
> lpaa23:~# ethtool -S eth1|grep more; sleep 1;ethtool -S eth1|grep more
> xmit_more: 2251366909
> xmit_more: 2256011392
>
> lpaa23:~# echo 2256011392-2251366909 | bc
> 4644483
xmit more not happen that frequently for my setup, it does happen
sometimes. And I do monitor with "ethtool -S".
~/git/network-testing/bin/ethtool_stats.pl --sec 2 --dev mlx5p2
Show adapter(s) (mlx5p2) statistics (ONLY that changed!)
Ethtool(mlx5p2 ) stat: 92900913 ( 92,900,913) <= tx0_bytes /sec
Ethtool(mlx5p2 ) stat: 36073 ( 36,073) <= tx0_nop /sec
Ethtool(mlx5p2 ) stat: 1548349 ( 1,548,349) <= tx0_packets /sec
Ethtool(mlx5p2 ) stat: 1 ( 1) <= tx0_xmit_more /sec
Ethtool(mlx5p2 ) stat: 92884899 ( 92,884,899) <= tx_bytes /sec
Ethtool(mlx5p2 ) stat: 99297696 ( 99,297,696) <= tx_bytes_phy /sec
Ethtool(mlx5p2 ) stat: 1548082 ( 1,548,082) <= tx_csum_partial /sec
Ethtool(mlx5p2 ) stat: 1548082 ( 1,548,082) <= tx_packets /sec
Ethtool(mlx5p2 ) stat: 1551527 ( 1,551,527) <= tx_packets_phy /sec
Ethtool(mlx5p2 ) stat: 99076658 ( 99,076,658) <= tx_prio1_bytes /sec
Ethtool(mlx5p2 ) stat: 1548073 ( 1,548,073) <= tx_prio1_packets /sec
Ethtool(mlx5p2 ) stat: 92936078 ( 92,936,078) <= tx_vport_unicast_bytes /sec
Ethtool(mlx5p2 ) stat: 1548934 ( 1,548,934) <= tx_vport_unicast_packets /sec
Ethtool(mlx5p2 ) stat: 1 ( 1) <= tx_xmit_more /sec
(after several attempts I got:)
$ ethtool -S mlx5p2|grep more; sleep 1;ethtool -S mlx5p2|grep more
tx_xmit_more: 14048
tx0_xmit_more: 14048
tx_xmit_more: 14049
tx0_xmit_more: 14049
This was with:
$ grep -H . /proc/sys/net/core/wmem_default
/proc/sys/net/core/wmem_default:2129920
> PerfTop: 76969 irqs/sec kernel:96.6% exact: 100.0% [4000Hz cycles:pp], (all, 48 CPUs)
> ---------------------------------------------------------------------------------------------
>
> 11.64% [kernel] [k] skb_set_owner_w
> 6.21% [kernel] [k] queued_spin_lock_slowpath
> 4.76% [kernel] [k] _raw_spin_lock
> 4.40% [kernel] [k] __ip_make_skb
> 3.10% [kernel] [k] sock_wfree
> 2.87% [kernel] [k] ipt_do_table
> 2.76% [kernel] [k] fq_dequeue
> 2.71% [kernel] [k] mlx4_en_xmit
> 2.50% [kernel] [k] __dev_queue_xmit
> 2.29% [kernel] [k] __ip_append_data.isra.40
> 2.28% [kernel] [k] udp_sendmsg
> 2.01% [kernel] [k] __alloc_skb
> 1.90% [kernel] [k] napi_consume_skb
> 1.63% [kernel] [k] udp_send_skb
> 1.62% [kernel] [k] skb_release_data
> 1.62% [kernel] [k] entry_SYSCALL_64_fastpath
> 1.56% [kernel] [k] dev_hard_start_xmit
> 1.55% udpsnd [.] __libc_send
> 1.48% [kernel] [k] netif_skb_features
> 1.42% [kernel] [k] __qdisc_run
> 1.35% [kernel] [k] sk_dst_check
> 1.33% [kernel] [k] sock_def_write_space
> 1.30% [kernel] [k] kmem_cache_alloc_node_trace
> 1.29% [kernel] [k] __local_bh_enable_ip
> 1.21% [kernel] [k] copy_user_enhanced_fast_string
> 1.08% [kernel] [k] __kmalloc_reserve.isra.40
> 1.08% [kernel] [k] SYSC_sendto
> 1.07% [kernel] [k] kmem_cache_alloc_node
> 0.95% [kernel] [k] ip_finish_output2
> 0.95% [kernel] [k] ktime_get
> 0.91% [kernel] [k] validate_xmit_skb
> 0.88% [kernel] [k] sock_alloc_send_pskb
> 0.82% [kernel] [k] sock_sendmsg
I'm more interested in why I see fib_table_lookup() and
__ip_route_output_key_hash() when you don't ?!? There must be some
mistake in my setup!
Maybe you can share your udp flood "udpsnd" program source?
Maybe I'm missing some important sysctl /proc/net/sys/ ?
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
p.s. I placed my testing software here:
https://github.com/netoptimizer/network-testing/tree/master/src
next prev parent reply other threads:[~2016-11-17 21:19 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-03 14:59 High perf top ip_idents_reserve doing netperf UDP_STREAM Jesper Dangaard Brouer
2014-09-03 15:17 ` Eric Dumazet
2016-11-16 12:16 ` Netperf UDP issue with connected sockets Jesper Dangaard Brouer
2016-11-16 17:46 ` Rick Jones
2016-11-16 22:40 ` Jesper Dangaard Brouer
2016-11-16 22:50 ` Rick Jones
2016-11-17 0:34 ` Eric Dumazet
2016-11-17 8:16 ` Jesper Dangaard Brouer
2016-11-17 13:20 ` Eric Dumazet
2016-11-17 13:42 ` Jesper Dangaard Brouer
2016-11-17 14:17 ` Eric Dumazet
2016-11-17 14:57 ` Jesper Dangaard Brouer
2016-11-17 16:21 ` Eric Dumazet
2016-11-17 18:30 ` Jesper Dangaard Brouer
2016-11-17 18:51 ` Eric Dumazet
2016-11-17 21:19 ` Jesper Dangaard Brouer [this message]
2016-11-17 21:44 ` Eric Dumazet
2016-11-17 23:08 ` Rick Jones
2016-11-18 0:37 ` Julian Anastasov
2016-11-18 0:42 ` Rick Jones
2016-11-18 17:12 ` Jesper Dangaard Brouer
2016-11-21 16:03 ` Jesper Dangaard Brouer
2016-11-21 18:10 ` Eric Dumazet
2016-11-29 6:58 ` [WIP] net+mlx4: auto doorbell Eric Dumazet
2016-11-30 11:38 ` Jesper Dangaard Brouer
2016-11-30 15:56 ` Eric Dumazet
2016-11-30 19:17 ` Jesper Dangaard Brouer
2016-11-30 19:30 ` Eric Dumazet
2016-11-30 22:30 ` Jesper Dangaard Brouer
2016-11-30 22:40 ` Eric Dumazet
2016-12-01 0:27 ` Eric Dumazet
2016-12-01 1:16 ` Tom Herbert
2016-12-01 2:32 ` Eric Dumazet
2016-12-01 2:50 ` Eric Dumazet
2016-12-02 18:16 ` Eric Dumazet
2016-12-01 5:03 ` Tom Herbert
2016-12-01 19:24 ` Willem de Bruijn
2016-11-30 13:50 ` Saeed Mahameed
2016-11-30 15:44 ` Eric Dumazet
2016-11-30 16:27 ` Saeed Mahameed
2016-11-30 17:28 ` Eric Dumazet
2016-12-01 12:05 ` Jesper Dangaard Brouer
2016-12-01 14:24 ` Eric Dumazet
2016-12-01 16:04 ` Jesper Dangaard Brouer
2016-12-01 17:04 ` Eric Dumazet
2016-12-01 19:17 ` Jesper Dangaard Brouer
2016-12-01 20:11 ` Eric Dumazet
2016-12-01 20:20 ` David Miller
2016-12-01 22:10 ` Eric Dumazet
2016-12-02 14:23 ` Eric Dumazet
2016-12-01 21:32 ` Alexander Duyck
2016-12-01 22:04 ` Eric Dumazet
2016-11-17 17:34 ` Netperf UDP issue with connected sockets David Laight
2016-11-17 22:39 ` Alexander Duyck
2016-11-17 17:42 ` Rick Jones
2016-11-28 18:33 ` Rick Jones
2016-11-28 18:40 ` Rick Jones
2016-11-30 10:43 ` Jesper Dangaard Brouer
2016-11-30 17:42 ` Rick Jones
2016-11-30 18:11 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161117221941.3b525181@redhat.com \
--to=brouer@redhat$(echo .)com \
--cc=eric.dumazet@gmail$(echo .)com \
--cc=netdev@vger$(echo .)kernel.org \
--cc=rick.jones2@hpe$(echo .)com \
--cc=saeedm@mellanox$(echo .)com \
--cc=tariqt@mellanox$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox