Re: [PATCH net-next 3/3] udp: keep the sk_receive

public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed

* Re: [PATCH net-next 3/3] udp: keep the sk_receive_queue held when splicing
       [not found] <1494865864.3586.19.camel@redhat.com>
@ 2017-05-15 20:22 ` Paolo Abeni
  0 siblings, 0 replies; 3+ messages in thread
From: Paolo Abeni @ 2017-05-15 20:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, David S. Miller

On Mon, 2017-05-15 at 09:11 -0700, Eric Dumazet wrote:
> On Mon, 2017-05-15 at 11:01 +0200, Paolo Abeni wrote:
> > On packet reception, when we are forced to splice the
> > sk_receive_queue, we can keep the related lock held, so
> > that we can avoid re-acquiring it, if fwd memory
> > scheduling is required.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat•com>
> > ---
> >  net/ipv4/udp.c | 36 ++++++++++++++++++++++++++----------
> >  1 file changed, 26 insertions(+), 10 deletions(-)
> > 
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 492c76b..d698973 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -1164,7 +1164,8 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
> >  }
> >  
> >  /* fully reclaim rmem/fwd memory allocated for skb */
> > -static void udp_rmem_release(struct sock *sk, int size, int partial)
> > +static void udp_rmem_release(struct sock *sk, int size, int partial,
> > +			     int rx_queue_lock_held)
> 
> Looks good, but please use a bool rx_queue_lock_held ?

oops, I wrongly omitted some most recipients in my initial reply, re-
sending. Eric, I'm sorry for the duplicates.

ok, it sounds cleaner, I will submit a v2.

Thank you for reviewing this!

Cheers,

Paolo

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH net-next 0/3] udp: scalability improvements
@ 2017-05-15  9:01 Paolo Abeni
  2017-05-15  9:01 ` [PATCH net-next 3/3] udp: keep the sk_receive_queue held when splicing Paolo Abeni
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Abeni @ 2017-05-15  9:01 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Eric Dumazet

This patch series implement an idea suggested by Eric Dumazet to
reduce the contention of the udp sk_receive_queue lock when the socket is
under flood.

An ancillary queue is added to the udp socket, and the socket always
tries first to read packets from such queue. If it's empty, we splice
the content from sk_receive_queue into the ancillary queue.

The first patch introduces some helpers to keep the udp code small, and the
following two implement the ancillary queue strategy. The code is split
to hopefully help the reviewing process.

The measured overall gain under udp flood is up to the 30% depending on
the numa layout and the number of ingress queue used by the relevant nic.

The performance numbers have been gathered using pktgen as sender, with 64
bytes packets, random src port on a host b2b connected via a 10Gbs link
with the dut.

The receiver used the udp_sink program by Jesper [1] and an h/w l4 rx hash on
the ingress nic, so that the number of ingress nic rx queues hit by the udp
traffic could be controlled via ethtool -L.

The udp_sink program was bound to the first idle cpu, to get more
stable numbers.

On a single numa node receiver:

nic rx queues           vanilla                 patched kernel
1                       1820 kpps               1900 kpps
2                       1950 kpps               2500 kpps
16                      1670 kpps               2120 kpps

When using a single nic rx queue, busy polling was also enabled,
elsewhere, in the above scenario, the bh processing becomes the bottle-neck
and this produces large artifacts in the measured performances (e.g.
improving the udp sink run time, decreases the overall tput, since more
action from the scheduler comes into play).

[1] https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c

No changes since the RFC.

Paolo Abeni (3):
  net/sock: factor out dequeue/peek with offset code
  udp: use a separate rx queue for packet reception
  udp: keep the sk_receive_queue held when splicing

 include/linux/skbuff.h |   7 +++
 include/linux/udp.h    |   3 +
 include/net/sock.h     |   4 +-
 include/net/udp.h      |   9 +--
 include/net/udplite.h  |   2 +-
 net/core/datagram.c    |  90 +++++++++++++++------------
 net/ipv4/udp.c         | 162 +++++++++++++++++++++++++++++++++++++++++++------
 net/ipv6/udp.c         |   3 +-
 8 files changed, 211 insertions(+), 69 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH net-next 3/3] udp: keep the sk_receive_queue held when splicing
  2017-05-15  9:01 [PATCH net-next 0/3] udp: scalability improvements Paolo Abeni
@ 2017-05-15  9:01 ` Paolo Abeni
  2017-05-15 16:11   ` Eric Dumazet
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Abeni @ 2017-05-15  9:01 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Eric Dumazet

On packet reception, when we are forced to splice the
sk_receive_queue, we can keep the related lock held, so
that we can avoid re-acquiring it, if fwd memory
scheduling is required.

Signed-off-by: Paolo Abeni <pabeni@redhat•com>
---
 net/ipv4/udp.c | 36 ++++++++++++++++++++++++++----------
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 492c76b..d698973 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1164,7 +1164,8 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
 }
 
 /* fully reclaim rmem/fwd memory allocated for skb */
-static void udp_rmem_release(struct sock *sk, int size, int partial)
+static void udp_rmem_release(struct sock *sk, int size, int partial,
+			     int rx_queue_lock_held)
 {
 	struct udp_sock *up = udp_sk(sk);
 	struct sk_buff_head *sk_queue;
@@ -1181,9 +1182,13 @@ static void udp_rmem_release(struct sock *sk, int size, int partial)
 	}
 	up->forward_deficit = 0;
 
-	/* acquire the sk_receive_queue for fwd allocated memory scheduling */
+	/* acquire the sk_receive_queue for fwd allocated memory scheduling,
+	 * if the called don't held it already
+	 */
 	sk_queue = &sk->sk_receive_queue;
-	spin_lock(&sk_queue->lock);
+	if (!rx_queue_lock_held)
+		spin_lock(&sk_queue->lock);
+
 
 	sk->sk_forward_alloc += size;
 	amt = (sk->sk_forward_alloc - partial) & ~(SK_MEM_QUANTUM - 1);
@@ -1197,7 +1202,8 @@ static void udp_rmem_release(struct sock *sk, int size, int partial)
 	/* this can save us from acquiring the rx queue lock on next receive */
 	skb_queue_splice_tail_init(sk_queue, &up->reader_queue);
 
-	spin_unlock(&sk_queue->lock);
+	if (!rx_queue_lock_held)
+		spin_unlock(&sk_queue->lock);
 }
 
 /* Note: called with reader_queue.lock held.
@@ -1207,10 +1213,16 @@ static void udp_rmem_release(struct sock *sk, int size, int partial)
  */
 void udp_skb_destructor(struct sock *sk, struct sk_buff *skb)
 {
-	udp_rmem_release(sk, skb->dev_scratch, 1);
+	udp_rmem_release(sk, skb->dev_scratch, 1, 0);
 }
 EXPORT_SYMBOL(udp_skb_destructor);
 
+/* as above, but the caller held the rx queue lock, too */
+void udp_skb_dtor_locked(struct sock *sk, struct sk_buff *skb)
+{
+	udp_rmem_release(sk, skb->dev_scratch, 1, 1);
+}
+
 /* Idea of busylocks is to let producers grab an extra spinlock
  * to relieve pressure on the receive_queue spinlock shared by consumer.
  * Under flood, this means that only one producer can be in line
@@ -1325,7 +1337,7 @@ void udp_destruct_sock(struct sock *sk)
 		total += skb->truesize;
 		kfree_skb(skb);
 	}
-	udp_rmem_release(sk, total, 0);
+	udp_rmem_release(sk, total, 0, 1);
 
 	inet_sock_destruct(sk);
 }
@@ -1397,7 +1409,7 @@ static int first_packet_length(struct sock *sk)
 	}
 	res = skb ? skb->len : -1;
 	if (total)
-		udp_rmem_release(sk, total, 1);
+		udp_rmem_release(sk, total, 1, 0);
 	spin_unlock_bh(&rcvq->lock);
 	return res;
 }
@@ -1471,16 +1483,20 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, unsigned int flags,
 				goto busy_check;
 			}
 
-			/* refill the reader queue and walk it again */
+			/* refill the reader queue and walk it again
+			 * keep both queues locked to avoid re-acquiring
+			 * the sk_receive_queue lock if fwd memory scheduling
+			 * is needed.
+			 */
 			_off = *off;
 			spin_lock(&sk_queue->lock);
 			skb_queue_splice_tail_init(sk_queue, queue);
-			spin_unlock(&sk_queue->lock);
 
 			skb = __skb_try_recv_from_queue(sk, queue, flags,
-							udp_skb_destructor,
+							udp_skb_dtor_locked,
 							peeked, &_off, err,
 							&last);
+			spin_unlock(&sk_queue->lock);
 			spin_unlock_bh(&queue->lock);
 			if (skb) {
 				*off = _off;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net-next 3/3] udp: keep the sk_receive_queue held when splicing
  2017-05-15  9:01 ` [PATCH net-next 3/3] udp: keep the sk_receive_queue held when splicing Paolo Abeni
@ 2017-05-15 16:11   ` Eric Dumazet
  0 siblings, 0 replies; 3+ messages in thread
From: Eric Dumazet @ 2017-05-15 16:11 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: netdev, David S. Miller, Eric Dumazet

On Mon, 2017-05-15 at 11:01 +0200, Paolo Abeni wrote:
> On packet reception, when we are forced to splice the
> sk_receive_queue, we can keep the related lock held, so
> that we can avoid re-acquiring it, if fwd memory
> scheduling is required.
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat•com>
> ---
>  net/ipv4/udp.c | 36 ++++++++++++++++++++++++++----------
>  1 file changed, 26 insertions(+), 10 deletions(-)
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 492c76b..d698973 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1164,7 +1164,8 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
>  }
>  
>  /* fully reclaim rmem/fwd memory allocated for skb */
> -static void udp_rmem_release(struct sock *sk, int size, int partial)
> +static void udp_rmem_release(struct sock *sk, int size, int partial,
> +			     int rx_queue_lock_held)

Looks good, but please use a bool rx_queue_lock_held ?

Thanks !

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-05-15 20:23 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1494865864.3586.19.camel@redhat.com>
2017-05-15 20:22 ` [PATCH net-next 3/3] udp: keep the sk_receive_queue held when splicing Paolo Abeni
2017-05-15  9:01 [PATCH net-next 0/3] udp: scalability improvements Paolo Abeni
2017-05-15  9:01 ` [PATCH net-next 3/3] udp: keep the sk_receive_queue held when splicing Paolo Abeni
2017-05-15 16:11   ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox