public inbox for netdev@vger.kernel.org 
From: Greg Kroah-Hartman <gregkh@linuxfoundation•org>
To: Stefan Schmidt <stefan@datenfreihafen•org>
Cc: linux-kernel@vger•kernel.org, netdev@vger•kernel.org,
	stable@vger•kernel.org, Eric Dumazet <edumazet@google•com>,
	Kirill Tkhai <ktkhai@virtuozzo•com>,
	Herbert Xu <herbert@gondor•apana.org.au>,
	Florian Westphal <fw@strlen•de>,
	Jesper Dangaard Brouer <brouer@redhat•com>,
	Alexander Aring <alex.aring@gmail•com>,
	Stefan Schmidt <stefan@osg•samsung.com>,
	"David S. Miller" <davem@davemloft•net>
Subject: Re: [PATCH 4.9 50/71] inet: frags: use rhashtables for reassembly units
Date: Thu, 29 Nov 2018 13:54:22 +0100
Message-ID: <20181129125422.GO3149@kroah.com>
In-Reply-To: <62bd748b-20a8-d021-7b3b-32146df8beb8@datenfreihafen.org>

On Fri, Oct 26, 2018 at 03:39:47PM +0200, Stefan Schmidt wrote:
> Hello Greg.
> 
> [Hope I am not too late for this]
> 
> On 16/10/2018 19:09, Greg Kroah-Hartman wrote:
> > 4.9-stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Eric Dumazet <edumazet@google•com>
> > 
> > Some applications still rely on IP fragmentation, and to be fair linux
> > reassembly unit is not working under any serious load.
> > 
> > It uses static hash tables of 1024 buckets, and up to 128 items per bucket (!!!)
> > 
> > A work queue is supposed to garbage collect items when host is under memory
> > pressure, and doing a hash rebuild, changing seed used in hash computations.
> > 
> > This work queue blocks softirqs for up to 25 ms when doing a hash rebuild,
> > occurring every 5 seconds if host is under fire.
> > 
> > Then there is the problem of sharing this hash table for all netns.
> > 
> > It is time to switch to rhashtables, and allocate one of them per netns
> > to speedup netns dismantle, since this is a critical metric these days.
> > 
> > Lookup is now using RCU. A followup patch will even remove
> > the refcount hold/release left from prior implementation and save
> > a couple of atomic operations.
> > 
> > Before this patch, 16 cpus (16 RX queue NIC) could not handle more
> > than 1 Mpps frags DDOS.
> > 
> > After the patch, I reach 9 Mpps without any tuning, and can use up to 2GB
> > of storage for the fragments (exact number depends on frags being evicted
> > after timeout)
> > 
> > $ grep FRAG /proc/net/sockstat
> > FRAG: inuse 1966916 memory 2140004608
> > 
> > A followup patch will change the limits for 64bit arches.
> > 
> > Signed-off-by: Eric Dumazet <edumazet@google•com>
> > Cc: Kirill Tkhai <ktkhai@virtuozzo•com>
> > Cc: Herbert Xu <herbert@gondor•apana.org.au>
> > Cc: Florian Westphal <fw@strlen•de>
> > Cc: Jesper Dangaard Brouer <brouer@redhat•com>
> > Cc: Alexander Aring <alex.aring@gmail•com>
> > Cc: Stefan Schmidt <stefan@osg•samsung.com>
> > Signed-off-by: David S. Miller <davem@davemloft•net>
> > (cherry picked from commit 648700f76b03b7e8149d13cc2bdb3355035258a9)
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation•org>
> > ---
> >  Documentation/networking/ip-sysctl.txt  |    7 
> >  include/net/inet_frag.h                 |   81 +++----
> >  include/net/ipv6.h                      |   16 -
> >  net/ieee802154/6lowpan/6lowpan_i.h      |   26 --
> >  net/ieee802154/6lowpan/reassembly.c     |   91 +++-----
> >  net/ipv4/inet_fragment.c                |  349 ++++++--------------------------
> >  net/ipv4/ip_fragment.c                  |  112 ++++------
> >  net/ipv6/netfilter/nf_conntrack_reasm.c |   51 +---
> >  net/ipv6/reassembly.c                   |  110 ++++------
> >  9 files changed, 267 insertions(+), 576 deletions(-)
> > 
> 
> When this patch hit master a while back we had to address a regression
> in the ieee802154 6lowpan layer. It seems this fix is missing in the
> backport series (only looking at your patchset here, not the full tree).
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=f18fa5de5ba7f1d6650951502bb96a6e4715a948
> 
> I would appreciate it if you could pull this into this series as well.

Now queued up for 4.14 and 4.9 as well, thanks.

greg k-h
