From: Vasily Averin <vvs@parallels•com>
To: David Miller <davem@davemloft•net>
Cc: netdev@vger•kernel.org, kuznet@ms2•inr.ac.ru, jmorris@namei•org,
	yoshfuji@linux-ipv6•org, kaber@trash•net, eric.dumazet@gmail•com
Subject: Re: [PATCH v2] ipv4: dst_entry leak in ip_append_data()
Date: Wed, 15 Oct 2014 11:48:02 +0400
Message-ID: <543E26B2.3010206@parallels.com>
In-Reply-To: <20141014.161225.1399177558139744041.davem@davemloft.net>



On 15.10.2014 00:12, David Miller wrote:
> From: Vasily Averin <vvs@parallels•com>
> Date: Tue, 14 Oct 2014 08:57:14 +0400
> 
>> v2: adjust the indentation of the arguments __ip_append_data() call
>>
>> Fixes: 2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()")
>>
>> If sk_write_queue is empty ip_append_data() executes ip_setup_cork()
>> that "steals" dst entry from rt to cork. Later it calls __ip_append_data()
>> that creates skb and adds it to sk_write_queue.
>>
>> If skb was added successfully following ip_push_pending_frames() call
>> reassign dst entries from cork to skb, and kfree_skb frees dst_entry.
>>
>> However nobody frees stolen dst_entry if skb was not added into sk_write_queue.
>>
>> Signed-off-by: Vasily Averin <vvs@parallels•com>
> 
> Why doesn't ip_make_skb() need the same fix?  It seems to do the same
> thing.

It seems to me that ip_make_skb() works (almost) correctly,
but the refcounting may be incorrect if the queue can be non-empty
(please see the details below).

If __ip_append_data() returns an error, ip_make_skb() calls
__ip_flush_pending_frames(), which calls ip_cork_release() internally
and frees the stolen dst_entry.

If __ip_append_data() succeeds, no dst refcount changes are required.
In this case an skb is created and added to the queue (so the queue is not empty).
Later, in __ip_make_skb(), the queued skb gets the dst reference,
and the refcount is decremented during kfree_skb().

I do not like having such an unclear dependency between functions,
but it seems to work correctly at present.

However, I am afraid that dst refcounting can go wrong if sk_write_queue
is not empty when ip_append_data() is called.
This does not happen in the ip_send_unicast_reply() case, but it probably
can happen in other places.

Let's count the dst refcount changes in this case.

First packet:

The dst refcount was incremented in the ip_append_data() caller, during the rt lookup.
- ip_append_data():
-- sk_write_queue is empty, ip_setup_cork() steals the dst entry
-- __ip_append_data() adds the skb to the queue; the queue is not flushed, waiting for the next packets.
ip_rt_put() in the ip_append_data() caller does nothing, because the dst reference was stolen.

dst refcount here: +1

Then we want to send the 2nd packet:
The dst refcount was incremented in the ip_append_data() caller.
- ip_append_data():
-- sk_write_queue is NOT empty, the dst is not stolen
-- __ip_append_data() adds the skb to the queue
ip_rt_put() in the ip_append_data() caller decrements the dst refcount, because it was not stolen.

dst refcount here: +1

Then we handle new packets; all of them are added to the queue.

The dst refcount is still +1.

Then the queue is flushed.
Each packet in the queue gets a dst reference from the cork,
and each kfree_skb() decrements the dst refcount, so it may become negative.
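
To make the counting above easier to check, here is a toy userspace model in C.
It is only a sketch, not kernel code: the toy_* names are hypothetical, the
refcount is tracked as a delta from its value before the first packet, and the
flush step encodes the assumption made above that every queued skb ends up
holding the cork's dst and drops one reference in kfree_skb(). If that
assumption does not hold in the real flush path, the result does not apply.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for dst_entry and the cork; NOT the kernel structures. */
struct toy_dst  { int refcnt; };                      /* delta from the initial refcount */
struct toy_cork { struct toy_dst *dst; int queued; }; /* queued = skbs in sk_write_queue */

/* One ip_append_data() call as seen from its caller. */
static void toy_send_one(struct toy_dst *dst, struct toy_cork *cork)
{
	struct toy_dst *rt = dst;

	rt->refcnt++;                 /* caller: reference taken during rt lookup */

	if (cork->queued == 0) {      /* queue empty: ip_setup_cork() steals the dst */
		cork->dst = rt;
		rt = NULL;
	}
	cork->queued++;               /* __ip_append_data() queues an skb */

	if (rt != NULL)               /* caller: ip_rt_put() is a no-op if dst was stolen */
		rt->refcnt--;
}

/* ASSUMPTION: each queued skb gets the cork's dst, and kfree_skb()
 * drops one reference per skb. */
static void toy_flush(struct toy_cork *cork)
{
	while (cork->queued > 0) {
		cork->dst->refcnt--;
		cork->queued--;
	}
	cork->dst = NULL;
}

/* Refcount delta after queueing n_packets and then flushing the queue. */
int toy_refcnt_after_flush(int n_packets)
{
	struct toy_dst dst = { 0 };
	struct toy_cork cork = { NULL, 0 };
	int i;

	assert(n_packets >= 1);
	for (i = 0; i < n_packets; i++)
		toy_send_one(&dst, &cork);
	toy_flush(&cork);
	return dst.refcnt;
}
```

Under this model a single queued packet leaves the delta balanced at 0, while
n queued packets leave it at 1 - n, which is the negative refcount I am
worried about.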

Or am I wrong?
