From: Chris Clayton <chris2553@googlemail•com>
To: Eric Dumazet <eric.dumazet@gmail•com>
Cc: David Miller <davem@davemloft•net>,
netdev@vger•kernel.org, gpiez@web•de,
Dave Jones <davej@redhat•com>, Julian Anastasov <ja@ssi•bg>
Subject: Re: [PATCH] ipv4: add a fib_type to fib_info
Date: Thu, 04 Oct 2012 14:08:31 +0100 [thread overview]
Message-ID: <506D8A4F.3020008@googlemail.com> (raw)
In-Reply-To: <1349349926.16011.48.camel@edumazet-glaptop>
On 10/04/12 12:25, Eric Dumazet wrote:
> On Tue, 2012-10-02 at 17:48 +0200, Eric Dumazet wrote:
>> From: Eric Dumazet <edumazet@google•com>
>>
>> On Tue, 2012-10-02 at 17:35 +0200, Eric Dumazet wrote:
>>> On Mon, 2012-10-01 at 22:04 +0200, Eric Dumazet wrote:
>>>> On Mon, 2012-10-01 at 16:01 -0400, David Miller wrote:
>>>
>>>>> If you can find a way to more reliably trigger the case, that would
>>>>> help us immensely.
>>>>
>>>> I am building a KMEMCHECK kernel, as a last try before my night ;)
>>>
>>> This was a total disaster. KMEMCHECK dies horribly on my machine
>>>
>>> David, shouldnt we use a nh_rth_forward instead of a nh_rth_input in
>>> __mkroute_input() ?
>>>
>>> (And change rt_cache_route() as well ?)
>>>
>>> I am testing a patch right now.
>>
>
> OK so I implemented David idea and it seems to work.
>
> Testers are needed, thanks ! ;)
>
I've tested 3.6.0 with this patch applied and networking in a WinXP KVM
client is now working fine. The patch applies cleanly to 3.6.0, so I
assume the patch will be forwarded to stable in due course.
Tested-by: Chris Clayton <chris2553@googlemail•com>
> [PATCH] ipv4: add a fib_type to fib_info
>
> commit d2d68ba9fe8 (ipv4: Cache input routes in fib_info nexthops.)
> introduced a regression for forwarding.
>
> This was hard to reproduce but the symptom was that packets were
> delivered to local host instead of being forwarded.
>
> David suggested to add fib_type to fib_info so that we dont
> inadvertently share same fib_info for different purposes.
>
> With help from Julian Anastasov who provided very helpful
> hints, reproduced here :
>
> <quote>
> Can it be a problem related to fib_info reuse
> from different routes. For example, when local IP address
> is created for subnet we have:
>
> broadcast 192.168.0.255 dev DEV proto kernel scope link src
> 192.168.0.1
> 192.168.0.0/24 dev DEV proto kernel scope link src 192.168.0.1
> local 192.168.0.1 dev DEV proto kernel scope host src 192.168.0.1
>
> The "dev DEV proto kernel scope link src 192.168.0.1" is
> a reused fib_info structure where we put cached routes.
> The result can be same fib_info for 192.168.0.255 and
> 192.168.0.0/24. RTN_BROADCAST is cached only for input
> routes. Incoming broadcast to 192.168.0.255 can be cached
> and can cause problems for traffic forwarded to 192.168.0.0/24.
> So, this patch should solve the problem because it
> separates the broadcast from unicast traffic.
>
> And the ip_route_input_slow caching will work for
> local and broadcast input routes (above routes 1 and 3) just
> because they differ in scope and use different fib_info.
>
> </quote>
>
> Many thanks to Chris Clayton for his patience and help.
>
> Reported-by: Chris Clayton <chris2553@googlemail•com>
> Bisected-by: Chris Clayton <chris2553@googlemail•com>
> Reported-by: Dave Jones <davej@redhat•com>
> Signed-off-by: Eric Dumazet <edumazet@google•com>
> Cc: Julian Anastasov <ja@ssi•bg>
> ---
> include/net/ip_fib.h | 1 +
> net/ipv4/fib_semantics.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
> index 926142e..9497be1 100644
> --- a/include/net/ip_fib.h
> +++ b/include/net/ip_fib.h
> @@ -102,6 +102,7 @@ struct fib_info {
> unsigned char fib_dead;
> unsigned char fib_protocol;
> unsigned char fib_scope;
> + unsigned char fib_type;
> __be32 fib_prefsrc;
> u32 fib_priority;
> u32 *fib_metrics;
> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index 3509065..2677530 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -314,6 +314,7 @@ static struct fib_info *fib_find_info(const struct fib_info *nfi)
> nfi->fib_scope == fi->fib_scope &&
> nfi->fib_prefsrc == fi->fib_prefsrc &&
> nfi->fib_priority == fi->fib_priority &&
> + nfi->fib_type == fi->fib_type &&
> memcmp(nfi->fib_metrics, fi->fib_metrics,
> sizeof(u32) * RTAX_MAX) == 0 &&
> ((nfi->fib_flags ^ fi->fib_flags) & ~RTNH_F_DEAD) == 0 &&
> @@ -833,6 +834,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
> fi->fib_flags = cfg->fc_flags;
> fi->fib_priority = cfg->fc_priority;
> fi->fib_prefsrc = cfg->fc_prefsrc;
> + fi->fib_type = cfg->fc_type;
>
> fi->fib_nhs = nhs;
> change_nexthops(fi) {
>
>
next prev parent reply other threads:[~2012-10-04 13:08 UTC|newest]
Thread overview: 59+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-09-17 15:44 Possible networking regression in 3.6.0 Chris Clayton
2012-09-18 14:21 ` Chris Clayton
2012-09-18 14:31 ` Chris Clayton
2012-09-18 14:40 ` Eric Dumazet
2012-09-18 15:51 ` Chris Clayton
2012-09-19 15:26 ` Chris Clayton
2012-09-22 6:26 ` Chris Clayton
2012-09-27 11:50 ` Chris Clayton
2012-09-27 12:14 ` Eric Dumazet
2012-09-27 18:05 ` Chris Clayton
2012-09-27 21:03 ` Eric Dumazet
2012-09-27 21:17 ` Eric Dumazet
2012-09-28 6:53 ` David Miller
2012-09-28 9:14 ` Chris Clayton
2012-09-28 9:22 ` Chris Clayton
2012-09-28 11:26 ` Eric Dumazet
2012-09-28 14:28 ` Chris Clayton
2012-09-30 15:26 ` Chris Clayton
2012-09-30 19:45 ` Eric Dumazet
2012-10-01 8:36 ` Chris Clayton
2012-10-01 9:15 ` Eric Dumazet
2012-10-01 15:13 ` Chris Clayton
2012-10-01 15:31 ` Eric Dumazet
2012-10-01 16:19 ` Chris Clayton
2012-10-01 16:37 ` Eric Dumazet
2012-10-01 18:28 ` Chris Clayton
2012-10-01 18:34 ` Captain Obvious
2012-10-01 19:21 ` Eric Dumazet
2012-10-01 19:55 ` Chris Clayton
2012-10-01 19:22 ` Chris Clayton
2012-10-01 19:34 ` Dave Jones
2012-10-01 20:01 ` David Miller
2012-10-01 20:04 ` Eric Dumazet
2012-10-02 15:27 ` Edivaldo de Araújo Pereira
2012-10-02 15:35 ` Eric Dumazet
2012-10-02 15:48 ` Eric Dumazet
2012-10-02 15:57 ` Dave Jones
2012-10-02 16:06 ` Eric Dumazet
2012-10-02 18:25 ` David Miller
2012-10-02 21:14 ` Alexander Duyck
2012-10-02 21:35 ` Eric Dumazet
2012-10-02 23:24 ` Julian Anastasov
2012-10-03 3:10 ` David Miller
2012-10-03 15:01 ` Chris Clayton
2012-10-03 20:57 ` Julian Anastasov
2012-10-03 7:28 ` [PATCH] udp: increment UDP_MIB_NOPORTS in mcast receive Eric Dumazet
2012-10-03 12:45 ` David Stevens
2012-10-03 13:15 ` Eric Dumazet
2012-10-03 14:09 ` David Stevens
2012-10-03 15:29 ` Eric Dumazet
2012-10-03 17:31 ` David Stevens
2012-10-03 19:30 ` David Miller
2012-10-03 17:39 ` Rick Jones
2012-10-03 2:55 ` Possible networking regression in 3.6.0 David Miller
2012-10-04 11:25 ` [PATCH] ipv4: add a fib_type to fib_info Eric Dumazet
2012-10-04 13:08 ` Chris Clayton [this message]
2012-10-04 13:32 ` Eric Dumazet
2012-10-04 18:14 ` David Miller
2012-09-18 14:44 ` Possible networking regression in 3.6.0 Chris Clayton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=506D8A4F.3020008@googlemail.com \
--to=chris2553@googlemail$(echo .)com \
--cc=davej@redhat$(echo .)com \
--cc=davem@davemloft$(echo .)net \
--cc=eric.dumazet@gmail$(echo .)com \
--cc=gpiez@web$(echo .)de \
--cc=ja@ssi$(echo .)bg \
--cc=netdev@vger$(echo .)kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox