public inbox for netdev@vger.kernel.org 
From: Thomas Petazzoni <thomas.petazzoni@free-electrons•com>
To: Willy Tarreau <w@1wt•eu>
Cc: netdev@vger•kernel.org,
	Gregory CLEMENT <gregory.clement@free-electrons•com>,
	Eric Dumazet <edumazet@google•com>
Subject: Re: [PATCH] net: convert mvneta to build_skb()
Date: Fri, 5 Jul 2013 09:50:38 +0200
Message-ID: <20130705095038.018de378@skate>
In-Reply-To: <20130705074330.GB25188@1wt.eu>

Dear Willy Tarreau,

On Fri, 5 Jul 2013 09:43:30 +0200, Willy Tarreau wrote:

> > Thanks Willy. Sorry for asking such a stupid question, but I'm not very
> > familiar with how this mechanism works. Can you explain why a single
> > 'frag_size' field per port is sufficient? My concern is that this
> > frag_size seems to be a per-packet information, but we have potentially
> > multiple packets being received, and multiple RX queues. Is one single
> > 'frag_size' per network interface sufficient?
> 
> I had the exact same question when Eric sent me an experimental patch to
> do this on mv643xx_eth a few months ago. Then I checked how the frag_size
> is computed. As you can see below, it does not depend on a per-packet size
> but on a per-port size which is in fact the MTU ("pkt_size" is the misleading
> part here) :
> 
>       skb_size = SKB_DATA_ALIGN(pp->pkt_size + MVNETA_MH_SIZE + NET_SKB_PAD) +
>                  SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> 
> So if skb_size depends solely on pp (struct mvneta_port), then it makes
> sense to have frag_size stored at the same place.
> 
> In practice we don't really need to store the frag_size in the struct, we
> just need to know if the data were allocated using netdev_alloc_frag()
> or kmalloc() to know how to free them. So a single bit would be enough,
> and I thought about just doing a +1 on the pointer when we need to free
> using kmalloc(). But that would add unneeded extra work in the fast path
> to fix the pointers. And since we need to pass frag_size to build_skb()
> it was a reasonable solution in my opinion.

Aah, okay, that makes sense. So now the question that comes up is why
this skb_size calculation is done in every call of mvneta_rx_refill().
Shouldn't it be computed once, at the same time pkt_size is calculated,
and stored in the mvneta_port structure? Then you just need to test
whether it fits within PAGE_SIZE to decide whether to use
netdev_alloc_frag() vs. kmalloc().

I.e., I would turn all the:

	if (pp->frag_size)
		...
	else
		...

into:

	if (pp->skb_size <= PAGE_SIZE)
		...
	else
		...

Of course, you can always hide this test behind a small macro or inline
function, to make it something like:

	if (mvneta_uses_small_skbs(pp))
		...
	else
		...

A better name than mvneta_uses_small_skbs() would of course be useful.
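
As a rough illustration of the idea, here is a userspace sketch rather
than driver code. The constants are placeholder values standing in for
the kernel's, SHINFO_SIZE is a made-up stand-in for
sizeof(struct skb_shared_info), and the skb_size field and
mvneta_set_pkt_size() are hypothetical names. The point is only that the
size is computed in one place, whenever pkt_size changes, and the test
becomes a pure read of pp:

```c
/* Placeholder values: the real ones come from kernel headers and vary by arch. */
#define PAGE_SIZE        4096u
#define SMP_CACHE_BYTES  64u
#define NET_SKB_PAD      64u
#define MVNETA_MH_SIZE   2u
#define SHINFO_SIZE      320u  /* stand-in for sizeof(struct skb_shared_info) */

/* Round up to a cache-line multiple, mimicking the kernel macro of the same name. */
#define SKB_DATA_ALIGN(x) (((x) + (SMP_CACHE_BYTES - 1)) & ~(SMP_CACHE_BYTES - 1))

struct mvneta_port {
	unsigned int pkt_size;  /* derived from the MTU */
	unsigned int skb_size;  /* hypothetical field: cached RX buffer size */
};

/* Hypothetical helper: call it wherever pkt_size is (re)computed. */
static void mvneta_set_pkt_size(struct mvneta_port *pp, unsigned int pkt_size)
{
	pp->pkt_size = pkt_size;
	pp->skb_size = SKB_DATA_ALIGN(pp->pkt_size + MVNETA_MH_SIZE + NET_SKB_PAD) +
		       SKB_DATA_ALIGN(SHINFO_SIZE);
}

/* Read-only test: nothing in pp is written from the RX fast path. */
static int mvneta_uses_small_skbs(const struct mvneta_port *pp)
{
	return pp->skb_size <= PAGE_SIZE;
}
```

With these placeholder numbers, a pkt_size of 1536 gives a buffer that
fits in one page, while a pkt_size of 9000 does not.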

> > For example, in mvneta_rx_refill(), you store the skb_size in
> > pp->frag_size, and then you later re-use it in mvneta_rxq_drop_pkts.
> > What guarantees you that mvneta_rx_refill() hasn't been called in the
> > meantime for a different packet, in a different rxq, for the same
> > network interface, and the value of pp->frag_size has been overwritten?
> 
> It's not a problem since the refill() applies pp->pkt_size which doesn't
> change between calls. It's only changed in mvneta_change_mtu() which
> first stops the device. So I think it's safe.

Yeah, sure, now that I see that pp->frag_size is constant for a given
MTU configuration, it looks safe. But I find the current code, which
retests and reassigns pp->frag_size at every call of mvneta_rx_refill(),
very confusing.
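
What I would find easier to read is an allocate path and a free path
that both derive their decision from the same constant per-port value,
with no assignment in between. A userspace sketch under assumptions:
malloc()/free() stand in for the kernel allocators, the counters exist
only to make the chosen path observable, and the skb_size field and all
helper names are hypothetical:

```c
#include <stdlib.h>

#define PAGE_SIZE 4096u  /* placeholder; the kernel provides the real value */

struct mvneta_port {
	unsigned int skb_size;  /* hypothetical field, set once per MTU change */
	unsigned int frag_allocs, kmalloc_allocs;  /* observability only */
	unsigned int frag_frees, kmalloc_frees;
};

static int mvneta_uses_small_skbs(const struct mvneta_port *pp)
{
	return pp->skb_size <= PAGE_SIZE;
}

/* Value to hand to build_skb(): 0 tells it the data came from kmalloc(). */
static unsigned int mvneta_frag_size(const struct mvneta_port *pp)
{
	return mvneta_uses_small_skbs(pp) ? pp->skb_size : 0;
}

/* Refill path: pick the allocator from the cached size. */
static void *mvneta_frag_alloc(struct mvneta_port *pp)
{
	if (mvneta_uses_small_skbs(pp))
		pp->frag_allocs++;     /* driver: netdev_alloc_frag() */
	else
		pp->kmalloc_allocs++;  /* driver: kmalloc() */
	return malloc(pp->skb_size);
}

/* Drop path: same test, so the release always matches the allocator. */
static void mvneta_frag_free(struct mvneta_port *pp, void *data)
{
	if (mvneta_uses_small_skbs(pp))
		pp->frag_frees++;      /* driver: release the page fragment */
	else
		pp->kmalloc_frees++;   /* driver: kfree() */
	free(data);
}
```

Since skb_size only changes while the device is stopped, both paths see
the same answer for every buffer that is in flight.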

Thanks!

Thomas
-- 
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
