public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Stephen Hemminger <stephen@networkplumber•org>
To: davem@davemloft•net, gregkh@linuxfoundation•org
Cc: netdev@vger•kernel.org, stable@vger•kernel.org, edumazet@google•com
Subject: [PATCH v3 18/30] inet: frags: get rid of ipfrag_skb_cb/FRAG_CB
Date: Thu, 13 Sep 2018 07:58:50 -0700	[thread overview]
Message-ID: <20180913145902.17531-19-sthemmin@microsoft.com> (raw)
In-Reply-To: <20180913145902.17531-1-sthemmin@microsoft.com>

From: Eric Dumazet <edumazet@google•com>

ip_defrag uses skb->cb[] to store the fragment offset, and unfortunately
this integer is currently in a different cache line than skb->next,
meaning that we use two cache lines per skb when finding the insertion point.

By aliasing skb->ip_defrag_offset and skb->dev, we pack all the fields
in a single cache line and save precious memory bandwidth.

Note that after the fast path added by Changli Gao in commit
d6bebca92c66 ("fragment: add fast path for in-order fragments")
this change wont help the fast path, since we still need
to access prev->len (2nd cache line), but will show great
benefits when slow path is entered, since we perform
a linear scan of a potentially long list.

Also, note that this potential long list is an attack vector,
we might consider also using an rb-tree there eventually.

Signed-off-by: Eric Dumazet <edumazet@google•com>
Signed-off-by: David S. Miller <davem@davemloft•net>
(cherry picked from commit bf66337140c64c27fa37222b7abca7e49d63fb57)
---
 include/linux/skbuff.h |  1 +
 net/ipv4/ip_fragment.c | 35 ++++++++++++++---------------------
 2 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6dd77767fd5b..f4749678b7ee 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -678,6 +678,7 @@ struct sk_buff {
 		 * UDP receive path is one user.
 		 */
 		unsigned long		dev_scratch;
+		int			ip_defrag_offset;
 	};
 	/*
 	 * This is the control buffer. It is free to use for every
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 88fa8ffc5558..5331a0d68374 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -57,14 +57,6 @@
  */
 static const char ip_frag_cache_name[] = "ip4-frags";
 
-struct ipfrag_skb_cb
-{
-	struct inet_skb_parm	h;
-	int			offset;
-};
-
-#define FRAG_CB(skb)	((struct ipfrag_skb_cb *)((skb)->cb))
-
 /* Describe an entry in the "incomplete datagrams" queue. */
 struct ipq {
 	struct inet_frag_queue q;
@@ -353,13 +345,13 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 	 * this fragment, right?
 	 */
 	prev = qp->q.fragments_tail;
-	if (!prev || FRAG_CB(prev)->offset < offset) {
+	if (!prev || prev->ip_defrag_offset < offset) {
 		next = NULL;
 		goto found;
 	}
 	prev = NULL;
 	for (next = qp->q.fragments; next != NULL; next = next->next) {
-		if (FRAG_CB(next)->offset >= offset)
+		if (next->ip_defrag_offset >= offset)
 			break;	/* bingo! */
 		prev = next;
 	}
@@ -370,7 +362,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 	 * any overlaps are eliminated.
 	 */
 	if (prev) {
-		int i = (FRAG_CB(prev)->offset + prev->len) - offset;
+		int i = (prev->ip_defrag_offset + prev->len) - offset;
 
 		if (i > 0) {
 			offset += i;
@@ -387,8 +379,8 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 
 	err = -ENOMEM;
 
-	while (next && FRAG_CB(next)->offset < end) {
-		int i = end - FRAG_CB(next)->offset; /* overlap is 'i' bytes */
+	while (next && next->ip_defrag_offset < end) {
+		int i = end - next->ip_defrag_offset; /* overlap is 'i' bytes */
 
 		if (i < next->len) {
 			int delta = -next->truesize;
@@ -401,7 +393,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 			delta += next->truesize;
 			if (delta)
 				add_frag_mem_limit(qp->q.net, delta);
-			FRAG_CB(next)->offset += i;
+			next->ip_defrag_offset += i;
 			qp->q.meat -= i;
 			if (next->ip_summed != CHECKSUM_UNNECESSARY)
 				next->ip_summed = CHECKSUM_NONE;
@@ -425,7 +417,13 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 		}
 	}
 
-	FRAG_CB(skb)->offset = offset;
+	/* Note : skb->ip_defrag_offset and skb->dev share the same location */
+	dev = skb->dev;
+	if (dev)
+		qp->iif = dev->ifindex;
+	/* Makes sure compiler wont do silly aliasing games */
+	barrier();
+	skb->ip_defrag_offset = offset;
 
 	/* Insert this fragment in the chain of fragments. */
 	skb->next = next;
@@ -436,11 +434,6 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 	else
 		qp->q.fragments = skb;
 
-	dev = skb->dev;
-	if (dev) {
-		qp->iif = dev->ifindex;
-		skb->dev = NULL;
-	}
 	qp->q.stamp = skb->tstamp;
 	qp->q.meat += skb->len;
 	qp->ecn |= ecn;
@@ -516,7 +509,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
 	}
 
 	WARN_ON(!head);
-	WARN_ON(FRAG_CB(head)->offset != 0);
+	WARN_ON(head->ip_defrag_offset != 0);
 
 	/* Allocate a new buffer for the datagram. */
 	ihlen = ip_hdrlen(head);
-- 
2.18.0

  parent reply	other threads:[~2018-09-13 20:09 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-13 14:58 [PATCH v3 00/30] backport of IP fragmentation fixes Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 01/30] inet: frags: change inet_frags_init_net() return value Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 02/30] inet: frags: add a pointer to struct netns_frags Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 03/30] inet: frags: refactor ipfrag_init() Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 04/30] inet: frags: Convert timers to use timer_setup() Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 05/30] inet: frags: refactor ipv6_frag_init() Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 06/30] inet: frags: refactor lowpan_net_frag_init() Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 07/30] ipv6: export ip6 fragments sysctl to unprivileged users Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 08/30] rhashtable: add schedule points Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 09/30] inet: frags: use rhashtables for reassembly units Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 10/30] inet: frags: remove some helpers Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 11/30] inet: frags: get rif of inet_frag_evicting() Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 12/30] inet: frags: remove inet_frag_maybe_warn_overflow() Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 13/30] inet: frags: break the 2GB limit for frags storage Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 14/30] inet: frags: do not clone skb in ip_expire() Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 15/30] ipv6: frags: rewrite ip6_expire_frag_queue() Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 16/30] rhashtable: reorganize struct rhashtable layout Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 17/30] inet: frags: reorganize struct netns_frags Stephen Hemminger
2018-09-13 14:58 ` Stephen Hemminger [this message]
2018-09-13 14:58 ` [PATCH v3 19/30] inet: frags: fix ip6frag_low_thresh boundary Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 20/30] ip: discard IPv4 datagrams with overlapping segments Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 21/30] net: speed up skb_rbtree_purge() Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 22/30] net: modify skb_rbtree_purge to return the truesize of all purged skbs Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 23/30] ipv6: defrag: drop non-last frags smaller than min mtu Stephen Hemminger
2019-01-10 19:30   ` Tom Herbert
2019-01-10 22:22     ` Florian Westphal
2019-01-11 10:57       ` Eric Dumazet
2019-01-11 12:21         ` Michal Kubecek
2019-01-11 12:27           ` Eric Dumazet
2019-01-11 12:52             ` Michal Kubecek
2019-01-11 13:07               ` Eric Dumazet
     [not found]                 ` <CAOSSMjUODMbBuW=GgwcEt6avKoyYD5A9CzdBtE6NR6dz4pnD6w@mail.gmail.com>
2019-01-11 14:09                   ` Eric Dumazet
2019-01-11 14:21                   ` Michal Kubecek
     [not found]                     ` <CAOSSMjVMVWxzkT5M2LHgf0+GPHdaWHV01a6mBqbGRVXOaQ04PQ@mail.gmail.com>
2019-01-11 17:09                       ` Peter Oskolkov
2019-01-11 18:10                         ` Michal Kubecek
2019-01-12  3:21                           ` Tom Herbert
2018-09-13 14:58 ` [PATCH v3 24/30] net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 25/30] net: add rb_to_skb() and other rb tree helpers Stephen Hemminger
2018-09-13 14:58 ` [PATCH v3 26/30] net: sk_buff rbnode reorg Stephen Hemminger
2018-10-18 16:01   ` Christoph Paasch
2018-09-13 14:58 ` [PATCH v3 27/30] ipv4: frags: precedence bug in ip_expire() Stephen Hemminger
2018-09-13 14:59 ` [PATCH v3 28/30] ip: add helpers to process in-order fragments faster Stephen Hemminger
2018-09-13 14:59 ` [PATCH v3 29/30] ip: process in-order fragments efficiently Stephen Hemminger
2018-09-13 14:59 ` [PATCH v3 30/30] ip: frags: fix crash in ip_do_fragment() Stephen Hemminger
2018-09-17 12:47 ` [PATCH v3 00/30] backport of IP fragmentation fixes Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180913145902.17531-19-sthemmin@microsoft.com \
    --to=stephen@networkplumber$(echo .)org \
    --cc=davem@davemloft$(echo .)net \
    --cc=edumazet@google$(echo .)com \
    --cc=gregkh@linuxfoundation$(echo .)org \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=stable@vger$(echo .)kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox