From: Eric Dumazet <eric.dumazet@gmail•com>
To: Oleksandr Natalenko <oleksandr@natalenko•name>,
Eric Dumazet <edumazet@google•com>
Cc: "David S . Miller" <davem@davemloft•net>,
netdev <netdev@vger•kernel.org>,
Neal Cardwell <ncardwell@google•com>,
Yuchung Cheng <ycheng@google•com>,
Soheil Hassas Yeganeh <soheil@google•com>
Subject: Re: [PATCH net-next 0/6] tcp: remove non GSO code
Date: Tue, 20 Feb 2018 10:57:42 -0800
Message-ID: <1519153062.55655.24.camel@gmail.com>
In-Reply-To: <1519141172.55655.21.camel@gmail.com>
On Tue, 2018-02-20 at 07:39 -0800, Eric Dumazet wrote:
> On Tue, 2018-02-20 at 10:32 +0100, Oleksandr Natalenko wrote:
> > Hi.
> >
> > 19.02.2018 20:56, Eric Dumazet wrote:
> > > Switching TCP to GSO mode, relying on core networking layers
> > > to perform eventual adaptation for dumb devices was overdue.
> > >
> > > 1) Most TCP developments are done with TSO in mind.
> > > 2) Less high-resolution timers needs to be armed for TCP-pacing
> > > 3) GSO can benefit of xmit_more hint
> > > 4) Receiver GRO is more effective (as if TSO was used for real on
> > > sender)
> > > -> less ACK packets and overhead.
> > > 5) Write queues have less overhead (one skb holds about 64KB of
> > > payload)
> > > 6) SACK coalescing just works. (no payload in skb->head)
> > > 7) rtx rb-tree contains less packets, SACK is cheaper.
> > > 8) Removal of legacy code. Less maintenance hassles.
> > >
> > > Note that I have left the sendpage/zerocopy paths, but they probably
> > > can
> > > benefit from the same strategy.
> > >
> > > Thanks to Oleksandr Natalenko for reporting a performance issue for
> > > BBR/fq_codel,
> > > which was the main reason I worked on this patch series.
> >
> > Thanks for dealing with this that fast.
> >
> > Does this mean that the option to optimise internal TCP pacing is still
> > an open question?
>
> It is not an optimization that is needed, but taking into account that
> highres timers can have latencies of ~2 usec or more.
>
> When sending 64KB TSO packets, having extra 2 usec after every ~54 usec
> (at 10Gbit) has no big impact, since TCP computes a slightly inflated
> pacing rate anyway.
>
> But when sending one MSS/packet every usec, this definitely can
> demonstrate a big slowdown.
>
> But the answer is yes, I will take a look at this timer drift.
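
The arithmetic in the quoted estimate can be checked with a minimal sketch. The helper names wire_time_ns() and paced_rate_permille() are hypothetical and not kernel functions. The model assumes every timer expiry is late by a fixed latency, and it ignores headers, framing, and the slightly inflated pacing rate that TCP computes, so the numbers are only indicative.

```c
#include <assert.h>
#include <stdint.h>

/* Time on the wire, in nanoseconds, for `bytes` of payload at `rate_bps`
 * bits per second. Headers and framing are ignored (assumption). */
uint64_t wire_time_ns(uint64_t bytes, uint64_t rate_bps)
{
	assert(rate_bps > 0);
	return bytes * 8 * 1000000000ULL / rate_bps;
}

/* Share of the intended pacing rate that remains, in per-mille, when each
 * pacing interval is stretched by a fixed timer latency (assumed model). */
uint64_t paced_rate_permille(uint64_t interval_ns, uint64_t latency_ns)
{
	assert(interval_ns + latency_ns > 0);
	return interval_ns * 1000 / (interval_ns + latency_ns);
}
```

At 10 Gbit/s, a 65536-byte burst takes about 52 usec under this model, close to the ~54 usec quoted above, and a 2 usec latency leaves roughly 963 per-mille of the intended rate. A single 1448-byte segment takes about 1.16 usec, and the same latency leaves only about 366 per-mille.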

Actually, timer drifts are not horrible (at least on my lab hosts).

But BBR has a pessimistic way to sense the burst size, as it is tied to
TSO/GSO being there.

The following patch helps a lot.

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b2bca373f8bee35267df49b5947a6793fed71a12..6818042cd8a9a1778f54637861647091afd9a769 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1730,7 +1730,7 @@ u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
 	 */
 	segs = max_t(u32, bytes / mss_now, min_tso_segs);

-	return min_t(u32, segs, sk->sk_gso_max_segs);
+	return segs;
 }
 EXPORT_SYMBOL(tcp_tso_autosize);

@@ -1742,9 +1742,10 @@ static u32 tcp_tso_segs(struct sock *sk, unsigned int mss_now)
 	const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
 	u32 tso_segs = ca_ops->tso_segs_goal ? ca_ops->tso_segs_goal(sk) : 0;

-	return tso_segs ? :
-		tcp_tso_autosize(sk, mss_now,
-				 sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs);
+	if (!tso_segs)
+		tso_segs = tcp_tso_autosize(sk, mss_now,
+				sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs);
+	return min_t(u32, tso_segs, sk->sk_gso_max_segs);
 }

 /* Returns the portion of skb which can be sent right away */
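
For readers who want to see the effect of moving the clamp, here is a minimal userspace sketch of the two versions of tcp_tso_segs(). It is not kernel code: struct model_sock, its fields, and the functions tso_segs_before() and tso_segs_after() are hypothetical stand-ins, and the autosize byte budget is passed in directly rather than derived from the pacing rate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few values the patch touches. */
struct model_sock {
	uint32_t gso_max_segs;   /* models sk->sk_gso_max_segs */
	uint32_t min_tso_segs;   /* models sysctl_tcp_min_tso_segs */
	uint32_t ca_goal;        /* models ca_ops->tso_segs_goal(sk); 0 = none */
	uint32_t autosize_bytes; /* models the byte budget used by autosizing */
};

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Before the patch: the clamp lives in the autosize path only, so a
 * goal supplied by the congestion control module is returned as-is. */
uint32_t tso_segs_before(const struct model_sock *sk, uint32_t mss_now)
{
	uint32_t segs;

	assert(mss_now > 0);
	if (sk->ca_goal)
		return sk->ca_goal;
	segs = max_u32(sk->autosize_bytes / mss_now, sk->min_tso_segs);
	return min_u32(segs, sk->gso_max_segs);
}

/* After the patch: autosizing is unclamped, and the clamp is applied
 * once to whichever value was chosen. */
uint32_t tso_segs_after(const struct model_sock *sk, uint32_t mss_now)
{
	uint32_t segs = sk->ca_goal;

	assert(mss_now > 0);
	if (!segs)
		segs = max_u32(sk->autosize_bytes / mss_now, sk->min_tso_segs);
	return min_u32(segs, sk->gso_max_segs);
}
```

In this model the two versions agree whenever there is no congestion-control goal. They differ when a goal is set and exceeds gso_max_segs: the old version returns the goal unchanged, the new one limits it. The other visible consequence of the diff is that any caller of tcp_tso_autosize() now sees a value that is no longer limited by sk_gso_max_segs.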