From: Nicolas Dichtel <nicolas.dichtel@6wind•com>
To: Eric Dumazet <eric.dumazet@gmail•com>
Cc: netdev <netdev@vger•kernel.org>, Octavian Purdila <opurdila@ixiacom•com>
Subject: Re: [PATCH] ipv4: remove all rt cache entries on UNREGISTER event
Date: Tue, 28 Sep 2010 18:45:58 +0200
Message-ID: <4CA21BC6.5070300@6wind.com>
In-Reply-To: <1285691629.3154.80.camel@edumazet-laptop>
Eric Dumazet wrote:
> On Tuesday 28 September 2010 at 17:24 +0200, Nicolas Dichtel wrote:
>> Hi,
>>
>> I hit a problem when I try to remove an interface:
>> netdev_wait_allrefs() complains about the refcount.
>>
>> Here is a trivial scenario to reproduce the problem:
>> # ip tunnel add mode ipip remote 10.16.0.164 local 10.16.0.72 dev eth0
>> # ./a.out tunl1
>> # ip tunnel del tunl1
>>
>> Note: the a.out binary creates an IPv4 raw socket, binds it to tunl1
>> (SO_BINDTODEVICE), enables multicast loopback (IP_MULTICAST_LOOP), sets
>> the multicast interface to tunl1 (IP_MULTICAST_IF), builds the IP header
>> itself (IP_HDRINCL) and then sends a single packet (192.168.6.1 -> 224.0.0.18).
>>
>> Note2: when a.out is executed, tunl1 has no IP address and is down.
>>
>
> CC Octavian Purdila, the patch author.
>
> I am just wondering why this route is created in the first place.
At first I asked myself the same question, but it seems that sending a
packet through this kind of socket is allowed even if the interface is
down. The packet will be dropped by the noop qdisc.
But I agree that it is strange to perform a route lookup and everything
else only to drop the packet at the end ...
Maybe raw_sendmsg() could drop it directly ;-) ... or maybe
ip_route_output_flow().
Any suggestions are welcome.
Regards,
Nicolas
>
> Maybe a fix would be to forbid this?
>
> Some machines have a giant route cache, so it's very important to avoid
> expensive scans.
>
>> Then I get a series of "kernel:[1206699.728010] unregister_netdevice:
>> waiting for tunl1 to become free. Usage count = 3" messages and, after
>> some time, the interface is removed.
>>
>> The problem is that route cache entries are only invalidated on the
>> UNREGISTER event, not removed (behaviour introduced by commit
>> e2ce146848c81af2f6d42e67990191c284bf0c33). We must wait for
>> rt_check_expire() to remove the remaining route cache entries.
>>
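To make the invalidate-versus-remove distinction concrete, here is a toy userspace model. It is not kernel code, and every toy_* name is made up for illustration. It only mirrors the delay convention of rt_cache_flush() as I read it: a negative delay just invalidates, a delay of zero or more also frees the entries and drops the device references they hold.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SIZE 8

struct toy_dev {
    int refcnt;                 /* what netdev_wait_allrefs() would wait on */
};

struct toy_rt {
    struct toy_dev *dev;        /* each cached route pins its device */
    int genid;                  /* generation the entry was created in */
    int in_use;
};

struct toy_cache {
    struct toy_rt slot[TOY_CACHE_SIZE];
    int genid;                  /* current generation */
};

/* Cache a route through dev; the entry takes a device reference. */
int toy_rt_add(struct toy_cache *c, struct toy_dev *dev)
{
    for (int i = 0; i < TOY_CACHE_SIZE; i++) {
        if (!c->slot[i].in_use) {
            c->slot[i].dev = dev;
            c->slot[i].genid = c->genid;
            c->slot[i].in_use = 1;
            dev->refcnt++;
            return 0;
        }
    }
    return -1;                  /* cache full */
}

/* delay < 0: invalidate only; delay >= 0: invalidate and free now. */
void toy_cache_flush(struct toy_cache *c, int delay)
{
    c->genid++;                 /* stale entries no longer match lookups */
    if (delay < 0)
        return;                 /* entries and their refs linger until GC */

    for (int i = 0; i < TOY_CACHE_SIZE; i++) {
        if (c->slot[i].in_use) {
            c->slot[i].dev->refcnt--;
            c->slot[i].in_use = 0;
        }
    }
}
```

In this model, toy_cache_flush(&cache, -1) after one toy_rt_add() leaves the device refcount at 1, which is the "waiting for tunl1 to become free" situation; toy_cache_flush(&cache, 0) brings it back to 0.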
>> To fix the problem, I propose to remove a part of the previous commit.
>>
>> Regards,
>> Nicolas
>> attachment, file differences
>> (0001-ipv4-remove-all-rt-cache-entries-on-UNREGISTER-even.patch)
>> From 3344e2e0431fe803c4dac8757a8746908357d780 Mon Sep 17 00:00:00 2001
>> From: Nicolas Dichtel <nicolas.dichtel@6wind•com>
>> Date: Tue, 28 Sep 2010 16:38:19 +0200
>> Subject: [PATCH] ipv4: remove all rt cache entries on UNREGISTER event
>>
>> Commit e2ce146848c81af2f6d42e67990191c284bf0c33 (ipv4: factorize cache clearing
>> for batched unregister operations) added a new parameter to fib_disable_ip() to
>> only invalidate route cache entries on the unregister event.
>> This is wrong: we should ensure that all cache entries are removed on the
>> unregister event, otherwise netdev_wait_allrefs() may complain. A cache entry
>> can be created between the DOWN and UNREGISTER events.
>>
>> So, I revert a part of the patch.
>>
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind•com>
>> ---
>> net/ipv4/fib_frontend.c | 10 +++++-----
>> 1 files changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
>> index 7d02a9f..377e815 100644
>> --- a/net/ipv4/fib_frontend.c
>> +++ b/net/ipv4/fib_frontend.c
>> @@ -917,11 +917,11 @@ static void nl_fib_lookup_exit(struct net *net)
>> net->ipv4.fibnl = NULL;
>> }
>>
>> -static void fib_disable_ip(struct net_device *dev, int force, int delay)
>> +static void fib_disable_ip(struct net_device *dev, int force)
>> {
>> if (fib_sync_down_dev(dev, force))
>> fib_flush(dev_net(dev));
>> - rt_cache_flush(dev_net(dev), delay);
>> + rt_cache_flush(dev_net(dev), 0);
>> arp_ifdown(dev);
>> }
>>
>> @@ -944,7 +944,7 @@ static int fib_inetaddr_event(struct notifier_block *this, unsigned long event,
>> /* Last address was deleted from this interface.
>> Disable IP.
>> */
>> - fib_disable_ip(dev, 1, 0);
>> + fib_disable_ip(dev, 1);
>> } else {
>> rt_cache_flush(dev_net(dev), -1);
>> }
>> @@ -959,7 +959,7 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
>> struct in_device *in_dev = __in_dev_get_rtnl(dev);
>>
>> if (event == NETDEV_UNREGISTER) {
>> - fib_disable_ip(dev, 2, -1);
>> + fib_disable_ip(dev, 2);
>> return NOTIFY_DONE;
>> }
>>
>> @@ -977,7 +977,7 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
>> rt_cache_flush(dev_net(dev), -1);
>> break;
>> case NETDEV_DOWN:
>> - fib_disable_ip(dev, 0, 0);
>> + fib_disable_ip(dev, 0);
>> break;
>> case NETDEV_CHANGEMTU:
>> case NETDEV_CHANGE:
>
>