From: Vasily Averin <vvs@parallels•com>
To: David Miller <davem@davemloft•net>
Cc: netdev@vger•kernel.org, kuznet@ms2•inr.ac.ru, jmorris@namei•org,
yoshfuji@linux-ipv6•org, kaber@trash•net, eric.dumazet@gmail•com
Subject: Re: [PATCH v2] ipv4: dst_entry leak in ip_append_data()
Date: Wed, 15 Oct 2014 11:48:02 +0400
Message-ID: <543E26B2.3010206@parallels.com>
In-Reply-To: <20141014.161225.1399177558139744041.davem@davemloft.net>
On 15.10.2014 00:12, David Miller wrote:
> From: Vasily Averin <vvs@parallels•com>
> Date: Tue, 14 Oct 2014 08:57:14 +0400
>
>> v2: adjust the indentation of the arguments __ip_append_data() call
>>
>> Fixes: 2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()")
>>
>> If sk_write_queue is empty ip_append_data() executes ip_setup_cork()
>> that "steals" dst entry from rt to cork. Later it calls __ip_append_data()
>> that creates skb and adds it to sk_write_queue.
>>
>> If skb was added successfully following ip_push_pending_frames() call
>> reassign dst entries from cork to skb, and kfree_skb frees dst_entry.
>>
>> However nobody frees stolen dst_entry if skb was not added into sk_write_queue.
>>
>> Signed-off-by: Vasily Averin <vvs@parallels•com>
>
> Why doesn't ip_make_skb() need the same fix? It seems to do the same
> thing.
It seems to me that ip_make_skb() works (almost) correctly,
but the refcounting may be incorrect if the queue can be non-empty
(please see details below).
If __ip_append_data() returns an error, ip_make_skb() calls
__ip_flush_pending_frames(), which calls ip_cork_release()
and frees the stolen dst_entry.
If __ip_append_data() succeeds, no dst refcount changes are required.
In this case the skb is created and added to the queue (so the queue is not empty).
Later, in __ip_make_skb(), this skb gets the dst reference,
and the refcount is decremented during kfree_skb().
I do not like such an unclear dependency between these functions,
but it seems to work correctly at present.
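
To make the two outcomes above easier to compare, here is a small
user-space sketch. It is a toy model only, not kernel code: the
toy_ names are hypothetical, and the release_cork_on_error flag is my
own way to contrast ip_make_skb() (which releases the cork on error)
with the leak described in the quoted patch (where nothing does).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_cork {
	bool holds_dst;		/* true while the cork owns the stolen reference */
};

/*
 * Returns how many dst references are still outstanding after one
 * "steal, append, clean up" sequence that starts with an empty queue.
 */
int toy_outstanding_refs(bool append_fails, bool release_cork_on_error)
{
	int refcnt = 1;				/* taken by the caller's route lookup */
	struct toy_cork cork = { true };	/* cork setup steals that reference */

	if (append_fails) {
		if (release_cork_on_error && cork.holds_dst) {
			/* models ip_cork_release() reached via the flush */
			cork.holds_dst = false;
			refcnt--;
		}
		return refcnt;
	}

	/* success: the reference moves from the cork to the skb ... */
	cork.holds_dst = false;
	/* ... and is dropped when the skb is freed */
	refcnt--;
	assert(refcnt >= 0);
	return refcnt;
}
```

With these assumptions, toy_outstanding_refs(true, true) and
toy_outstanding_refs(false, true) both return 0, while
toy_outstanding_refs(true, false) returns 1, i.e. one reference that
nobody drops.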
However, I am afraid dst refcounting can go wrong if sk_write_queue
is not empty at the moment ip_append_data() is called.
This does not happen in the ip_send_unicast_reply() case, but it probably
can happen in other places.
Let's count the dst refcount changes in this case.

First packet:
the dst refcount is incremented in the ip_append_data() caller, during the rt lookup
- ip_append_data():
-- sk_write_queue is empty, ip_setup_cork() steals the dst entry
-- __ip_append_data() adds an skb to the queue; the queue is not flushed, waiting for the next packets.
ip_rt_put() in the ip_append_data() caller has no effect, because the dst reference was stolen.
dst refcount here: +1

Then we want to send the 2nd packet:
the dst refcount is incremented in the ip_append_data() caller
- ip_append_data():
-- sk_write_queue is NOT empty, the dst is not stolen
-- __ip_append_data() adds an skb to the queue
ip_rt_put() in the ip_append_data() caller decrements the dst refcount, because it was not stolen
dst refcount here: +1

Then we handle new packets; all of them are added to the queue.
dst refcount is still +1

Then the queue is flushed.
Each packet in the queue gets a dst reference from the cork,
each kfree_skb() decrements the dst refcount, and it may become negative.
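
The same arithmetic can be written as a short user-space sketch. It is
a toy model of the accounting in this message, not of what the kernel
really does: in particular it assumes that on flush every queued skb
takes the cork's dst pointer without an extra hold and that every
kfree_skb() drops one reference. Whether that assumption holds is
exactly what I am asking about. The function name is hypothetical.

```c
#include <assert.h>

/*
 * Toy model only. Returns the net dst refcount change after appending
 * npackets packets to one corked queue and then flushing it.
 */
int toy_refcount_after_flush(int npackets)
{
	int refcnt = 0;
	int queue_len = 0;

	assert(npackets >= 0);

	for (int i = 0; i < npackets; i++) {
		refcnt++;		/* caller's route lookup takes a reference */
		if (queue_len > 0)
			refcnt--;	/* not stolen: caller's ip_rt_put() drops it */
		/* else: stolen by the cork, caller's ip_rt_put() has no effect */
		queue_len++;		/* skb added to the queue */
	}

	/* flush: ASSUMED one drop per queued skb, with no matching hold */
	for (int i = 0; i < queue_len; i++)
		refcnt--;

	return refcnt;
}
```

Under that assumption a single packet is balanced
(toy_refcount_after_flush(1) returns 0), but three queued packets give
toy_refcount_after_flush(3) == -2.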

Or am I wrong?