From: Jeremy Fitzhardinge <jeremy@goop•org>
To: Eric Dumazet <eric.dumazet@gmail•com>
Cc: netdev@vger•kernel.org, Rusty Russell <rusty@rustcorp•com.au>,
xen-devel <xen-devel@lists•xensource.com>,
Ian Campbell <Ian.Campbell@citrix•com>
Subject: Re: SKB paged fragment lifecycle on receive
Date: Fri, 24 Jun 2011 11:21:15 -0700 [thread overview]
Message-ID: <4E04D59B.8060301@goop.org> (raw)
In-Reply-To: <1308938183.2532.8.camel@edumazet-laptop>
On 06/24/2011 10:56 AM, Eric Dumazet wrote:
> Le vendredi 24 juin 2011 à 10:29 -0700, Jeremy Fitzhardinge a écrit :
>> On 06/24/2011 08:43 AM, Ian Campbell wrote:
>>> We've previously looked into solutions using the skb destructor callback
>>> but that falls over if the skb is cloned since you also need to know
>>> when the clone is destroyed. Jeremy Fitzhardinge and I subsequently
>>> looked at the possibility of a no-clone skb flag (i.e. always forcing a
>>> copy instead of a clone) but IIRC honouring it universally turned into a
>>> very twisty maze with a number of nasty corner cases etc. It also seemed
>>> that the proportion of SKBs which get cloned at least once appeared as
>>> if it could be quite high which would presumably make the performance
>>> impact unacceptable when using the flag. Another issue with using the
>>> skb destructor is that functions such as __pskb_pull_tail will eat (and
>>> free) pages from the start of the frag array such that by the time the
>>> skb destructor is called they are no longer there.
>>>
>>> AIUI Rusty Russell had previously looked into a per-page destructor in
>>> the shinfo but found that it couldn't be made to work (I don't remember
>>> why, or if I even knew at the time). Could that be an approach worth
>>> reinvestigating?
>>>
>>> I can't really think of any other solution which doesn't involve some
>>> sort of driver callback at the time a page is free()d.
> This reminds me the packet mmap (tx path) games we play with pages.
>
> net/packet/af_packet.c : tpacket_destruct_skb(), poking
> TP_STATUS_AVAILABLE back to user to tell him he can reuse space...
Yes. Its similar in the sense that its a tx from a page which isn't
being handed over entirely to the network stack, but has some other
longer-term lifetime.
>> One simple approach would be to simply make sure that we retain a page
>> reference on any granted pages so that the network stack's put pages
>> will never result in them being released back to the kernel. We can
>> also install an skb destructor. If it sees a page being released with a
>> refcount of 1, then we know its our own reference and can free the page
>> immediately. If the refcount is > 1 then we can add it to a queue of
>> pending pages, which can be periodically polled to free pages whose
>> other references have been dropped.
>>
>> However, the question is how large will this queue get? If it remains
>> small then this scheme could be entirely practical. But if almost every
>> page ends up having transient stray references, it could become very
>> awkward.
>>
>> So it comes down to "how useful is an skb destructor callback as a
>> heuristic for page free"?
>>
> Dangerous I would say. You could have a skb1 page transferred to another
> skb2, and call skb1 destructor way before page being released.
Under what circumstances would that happen?
> TCP stack could do that in tcp_collapse() [ it currently doesnt play
> with pages ]
Do you mean "dangerous" in the sense that many pages could end up being
tied up in the pending-release queue? We'd always check the page
refcount, so it should never release pages prematurely.
Thanks,
J
next prev parent reply other threads:[~2011-06-24 18:21 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-06-24 15:43 SKB paged fragment lifecycle on receive Ian Campbell
2011-06-24 17:29 ` Jeremy Fitzhardinge
2011-06-24 17:56 ` Eric Dumazet
2011-06-24 18:21 ` Jeremy Fitzhardinge [this message]
2011-06-24 19:46 ` David Miller
2011-06-24 20:11 ` Jeremy Fitzhardinge
2011-06-24 20:27 ` David Miller
2011-06-25 11:58 ` Ian Campbell
2011-06-27 20:51 ` Jeremy Fitzhardinge
2011-06-28 10:25 ` Ian Campbell
2011-06-27 14:42 ` Ian Campbell
2011-06-27 22:49 ` David Miller
2011-06-28 10:24 ` Ian Campbell
2011-06-24 22:44 ` Ian Campbell
2011-06-24 22:48 ` Jeremy Fitzhardinge
2011-06-26 10:25 ` Michael S. Tsirkin
2011-06-27 9:41 ` Ian Campbell
2011-06-27 10:21 ` Michael S. Tsirkin
2011-06-27 10:54 ` Ian Campbell
2011-06-27 11:19 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E04D59B.8060301@goop.org \
--to=jeremy@goop$(echo .)org \
--cc=Ian.Campbell@citrix$(echo .)com \
--cc=eric.dumazet@gmail$(echo .)com \
--cc=netdev@vger$(echo .)kernel.org \
--cc=rusty@rustcorp$(echo .)com.au \
--cc=xen-devel@lists$(echo .)xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox