From: Alexander Duyck <alexander.h.duyck@intel•com>
To: Ian Campbell <Ian.Campbell@citrix•com>
Cc: Eric Dumazet <eric.dumazet@gmail•com>,
"netdev@vger•kernel.org" <netdev@vger•kernel.org>,
David Miller <davem@davemloft•net>,
"Michael S. Tsirkin" <mst@redhat•com>,
"Wei Liu (Intern)" <wei.liu2@citrix•com>,
"xen-devel@lists•xen.org" <xen-devel@lists•xen.org>
Subject: Re: [PATCH 05/10] net: move destructor_arg to the front of sk_buff.
Date: Wed, 11 Apr 2012 09:31:38 -0700 [thread overview]
Message-ID: <4F85B1EA.9000600@intel.com> (raw)
In-Reply-To: <1334131241.12209.199.camel@dagon.hellion.org.uk>
On 04/11/2012 01:00 AM, Ian Campbell wrote:
> On Tue, 2012-04-10 at 20:15 +0100, Alexander Duyck wrote:
>> On 04/10/2012 11:41 AM, Eric Dumazet wrote:
>>> On Tue, 2012-04-10 at 11:33 -0700, Alexander Duyck wrote:
>>>
>>>> Have you checked this for 32 bit as well as 64? Based on my math your
>>>> next patch will still mess up the memset on 32 bit with the structure
>>>> being split somewhere just in front of hwtstamps.
>>>>
>>>> Why not just take frags and move it to the start of the structure? It
>>>> is already an unknown value because it can be either 16 or 17 depending
>>>> on the value of PAGE_SIZE, and since you are making changes to frags the
>>>> changes wouldn't impact the alignment of the other values later on since
>>>> you are aligning the end of the structure. That way you would be
>>>> guaranteed that all of the fields that will be memset would be in the
>>>> last 64 bytes.
>>>>
>>> Now when a fragmented packet is copied in pskb_expand_head(), you access
>>> two separate zones of memory to copy the shinfo. But its supposed to be
>>> slow path.
>>>
>>> Problem with this is that the offsets of often used fields will be big
>>> (instead of being < 127) and code will be bigger on x86.
>> Actually now that I think about it my concerns go much further than the
>> memset. I'm convinced that this is going to cause a pretty significant
>> performance regression on multiple drivers, especially on non x86_64
>> architecture. What we have right now on most platforms is a
>> skb_shared_info structure in which everything up to and including frag 0
>> is all in one cache line. This gives us pretty good performance for igb
>> and ixgbe since that is our common case when jumbo frames are not
>> enabled is to split the head and place the data in a page.
> With all the changes in this series it is still possible to fit a
> maximum standard MTU frame and the shinfo on the same 4K page while also
> have the skb_shared_info up to and including frag [0] aligned to the
> same 64 byte cache line.
>
> The only exception is destructor_arg on 64 bit which is on the preceding
> cache line but that is not a field used in any hot path.
The problem I have is that this is only true on x86_64. Proper work
hasn't been done to guarantee this on any other architectures.
I think what I would like to see is instead of just setting things up
and hoping it comes out cache aligned on nr_frags why not take steps to
guarantee it? You could do something like place and size the structure
based on:
SKB_DATA_ALIGN(sizeof(skb_shared_info) - offsetof(struct
skb_shared_info, nr_frags)) + offsetof(struct skb_shared_info, nr_frags)
That way you would have your alignment still guaranteed based off of the
end of the structure, but anything placed before nr_frags would be
placed on the end of the previous cache line.
>> However the change being recommend here only resolves the issue for one
>> specific architecture, and that is what I don't agree with. What we
>> need is a solution that also works for 64K pages or 32 bit pointers and
>> I am fairly certain this current solution does not.
> I think it does work for 32 bit pointers. What issue to do you see with
> 64K pages?
>
> Ian.
With 64K pages the MAX_SKB_FRAGS value drops from 17 to 16. That will
undoubtedly mess up the alignment.
Thanks,
Alex
next prev parent reply other threads:[~2012-04-11 16:31 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-10 14:26 [PATCH v4 0/10] skb paged fragment destructors Ian Campbell
2012-04-10 14:26 ` [PATCH 01/10] net: add and use SKB_ALLOCSIZE Ian Campbell
2012-04-10 14:57 ` Eric Dumazet
2012-04-10 14:26 ` [PATCH 02/10] net: Use SKB_WITH_OVERHEAD in build_skb Ian Campbell
2012-04-10 14:58 ` Eric Dumazet
2012-04-10 14:26 ` [PATCH 03/10] chelsio: use SKB_WITH_OVERHEAD Ian Campbell
2012-04-10 14:59 ` Eric Dumazet
2012-04-10 14:26 ` [PATCH 04/10] net: pad skb data and shinfo as a whole rather than individually Ian Campbell
2012-04-10 15:01 ` Eric Dumazet
2012-04-10 14:26 ` [PATCH 05/10] net: move destructor_arg to the front of sk_buff Ian Campbell
2012-04-10 15:05 ` Eric Dumazet
2012-04-10 15:19 ` Ian Campbell
2012-04-10 18:33 ` Alexander Duyck
2012-04-10 18:41 ` Eric Dumazet
2012-04-10 19:15 ` Alexander Duyck
2012-04-11 8:00 ` Ian Campbell
2012-04-11 16:31 ` Alexander Duyck [this message]
2012-04-11 17:00 ` Ian Campbell
2012-04-11 8:20 ` Eric Dumazet
2012-04-11 16:05 ` Alexander Duyck
2012-04-11 7:56 ` Ian Campbell
2012-04-10 14:26 ` [PATCH 06/10] net: add support for per-paged-fragment destructors Ian Campbell
2012-04-26 20:44 ` [Xen-devel] " Konrad Rzeszutek Wilk
2012-04-10 14:26 ` [PATCH 07/10] net: only allow paged fragments with the same destructor to be coalesced Ian Campbell
2012-04-10 20:11 ` Ben Hutchings
2012-04-11 7:45 ` Ian Campbell
2012-04-10 14:26 ` [PATCH 08/10] net: add skb_orphan_frags to copy aside frags with destructors Ian Campbell
[not found] ` <1334067965.5394.22.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>
2012-04-10 14:26 ` [PATCH 09/10] net: add paged frag destructor support to kernel_sendpage Ian Campbell
2012-04-10 14:26 ` [PATCH 10/10] sunrpc: use SKB fragment destructors to delay completion until page is released by network stack Ian Campbell
2012-04-10 14:58 ` [PATCH v4 0/10] skb paged fragment destructors Michael S. Tsirkin
2012-04-10 15:00 ` Michael S. Tsirkin
2012-04-10 15:46 ` Bart Van Assche
2012-04-10 15:50 ` Ian Campbell
2012-04-11 10:02 ` Bart Van Assche
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F85B1EA.9000600@intel.com \
--to=alexander.h.duyck@intel$(echo .)com \
--cc=Ian.Campbell@citrix$(echo .)com \
--cc=davem@davemloft$(echo .)net \
--cc=eric.dumazet@gmail$(echo .)com \
--cc=mst@redhat$(echo .)com \
--cc=netdev@vger$(echo .)kernel.org \
--cc=wei.liu2@citrix$(echo .)com \
--cc=xen-devel@lists$(echo .)xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox