From: Guillaume Nault <gnault@redhat•com>
To: David Ahern <dsahern@kernel•org>
Cc: David Miller <davem@davemloft•net>,
Jakub Kicinski <kuba@kernel•org>,
netdev@vger•kernel.org,
Hideaki YOSHIFUJI <yoshfuji@linux-ipv6•org>
Subject: Re: [PATCH net] ipv4: fix route lookups when handling ICMP redirects and PMTU updates
Date: Mon, 28 Feb 2022 21:54:40 +0100
Message-ID: <20220228205440.GA24680@debian.home>
In-Reply-To: <922b4932-fcd5-d362-4679-6689046560c7@kernel.org>
On Mon, Feb 28, 2022 at 10:31:58AM -0700, David Ahern wrote:
> On 2/28/22 10:16 AM, Guillaume Nault wrote:
> > Fixes: d3a25c980fc2 ("ipv4: Fix nexthop exception hash computation.")
>
> That does not seem related to tos in the flow struct at all.
Ouch, copy/paste mistake.
I meant 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions."), which is
the next commit with 'git log -- net/ipv4/route.c'.
Really sorry :/, and thanks a lot for catching that!
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index f33ad1f383b6..d5d058de3664 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -499,6 +499,15 @@ void __ip_select_ident(struct net *net, struct iphdr *iph, int segs)
> > }
> > EXPORT_SYMBOL(__ip_select_ident);
> >
> > +static void ip_rt_fix_tos(struct flowi4 *fl4)
>
> make this a static inline in include/net/flow.h and update
> flowi4_init_output and flowi4_update_output to use it. That should cover
> a few of the cases below leaving just ...
Hum, I didn't think about this option, but it looks risky to me. As I
put it in note 1, ip_route_output_key_hash() unconditionally sets
->flowi4_scope, assuming it can infer the scope from the RTO_ONLINK bit
of ->flowi4_tos. If we sanitise these fields in flowi4_init_output()
(and flowi4_update_output()), then ip_route_output_key_hash() would
sometimes work on already sanitised values and sometimes not. So it
wouldn't know if it should initialise ->flowi4_scope.
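To make the concern concrete, here is a minimal userspace sketch, not
kernel code. struct flowi4_sketch, fix_tos_sketch() and scope_after()
are hypothetical names; the helper only mirrors the two assignments that
the quoted hunk removes from ip_route_output_key_hash(), and the
constants are assumed to match the kernel headers of that time. It shows
that the sanitisation is not idempotent: a second pass no longer sees
RTO_ONLINK and resets the scope.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel definitions (not taken from this thread). */
#define IPTOS_RT_MASK     0x1c	/* IPTOS_TOS_MASK (0x1e) & ~3 */
#define RTO_ONLINK        0x01
#define RT_SCOPE_UNIVERSE 0
#define RT_SCOPE_LINK     253

/* Hypothetical stand-in for the two struct flowi4 fields involved. */
struct flowi4_sketch {
	uint8_t tos;
	uint8_t scope;
};

/* Mirrors the assignments removed from ip_route_output_key_hash(). */
static void fix_tos_sketch(struct flowi4_sketch *fl4)
{
	uint8_t tos = fl4->tos & (IPTOS_RT_MASK | RTO_ONLINK);

	fl4->tos = tos & IPTOS_RT_MASK;
	fl4->scope = (tos & RTO_ONLINK) ? RT_SCOPE_LINK : RT_SCOPE_UNIVERSE;
}

/* Returns the scope after sanitising the same flow 'passes' times. */
static uint8_t scope_after(uint8_t tos, int passes)
{
	struct flowi4_sketch fl4 = { .tos = tos, .scope = RT_SCOPE_UNIVERSE };

	assert(passes >= 0);
	while (passes-- > 0)
		fix_tos_sketch(&fl4);
	return fl4.scope;
}
```

With RTO_ONLINK set, scope_after(tos, 1) gives RT_SCOPE_LINK, while
scope_after(tos, 2) falls back to RT_SCOPE_UNIVERSE, because the first
pass has already stripped the bit the second pass relies on.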
We could decide to let ip_route_output_key_hash() initialise
->flowi4_scope only when the RTO_ONLINK bit is set, which
guarantees that we don't have sanitised values. But before that, we'd
need to audit all other callers, to verify that they correctly
initialise the ->flowi4_scope with RT_SCOPE_UNIVERSE, since
ip_route_output_key_hash() isn't going to do it for them anymore.
I'll audit all these callers, but that should be something for
net-next.
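A rough userspace sketch of that alternative, again with hypothetical
names (struct flowi4_sketch, fix_tos_onlink_only(),
scope_after_onlink_only()) and constants assumed to match the kernel
headers. The scope is only written when RTO_ONLINK is still present, so
repeated calls are harmless, but a caller that never initialised the
scope keeps whatever value it had.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel definitions (not taken from this thread). */
#define IPTOS_RT_MASK     0x1c	/* IPTOS_TOS_MASK (0x1e) & ~3 */
#define RTO_ONLINK        0x01
#define RT_SCOPE_UNIVERSE 0
#define RT_SCOPE_SITE     200	/* used below as an example of a stale value */
#define RT_SCOPE_LINK     253

/* Hypothetical stand-in for the two struct flowi4 fields involved. */
struct flowi4_sketch {
	uint8_t tos;
	uint8_t scope;
};

/* Only touch the scope when the caller still asks for it via RTO_ONLINK. */
static void fix_tos_onlink_only(struct flowi4_sketch *fl4)
{
	if (fl4->tos & RTO_ONLINK)
		fl4->scope = RT_SCOPE_LINK;
	fl4->tos &= IPTOS_RT_MASK;
}

/* Returns the scope after 'passes' calls, starting from the caller's scope. */
static uint8_t scope_after_onlink_only(uint8_t tos, uint8_t caller_scope,
				       int passes)
{
	struct flowi4_sketch fl4 = { .tos = tos, .scope = caller_scope };

	assert(passes >= 0);
	while (passes-- > 0)
		fix_tos_onlink_only(&fl4);
	return fl4.scope;
}
```

The last case is the reason for the audit: without RTO_ONLINK, a caller
that passes a scope other than RT_SCOPE_UNIVERSE gets it back unchanged.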
> > @@ -2613,9 +2625,7 @@ struct rtable *ip_route_output_key_hash(struct net *net, struct flowi4 *fl4,
> > struct rtable *rth;
> >
> > fl4->flowi4_iif = LOOPBACK_IFINDEX;
> > - fl4->flowi4_tos = tos & IPTOS_RT_MASK;
> > - fl4->flowi4_scope = ((tos & RTO_ONLINK) ?
> > - RT_SCOPE_LINK : RT_SCOPE_UNIVERSE);
> > + ip_rt_fix_tos(fl4);
>
> ... this one to call the new helper.
BTW, here's a bit more about the context around this patch.
I found the problem while working on removing the use of RTO_ONLINK, so
that ->flowi4_tos could be converted to dscp_t.
The objective is to modify callers so that they'd set ->flowi4_scope
directly, instead of using RTO_ONLINK to mark their intention (and that's
why I said I'd have to audit them anyway).
Once that's done, ip_rt_fix_tos() won't have to touch the scope
anymore. And once ->flowi4_tos is converted to dscp_t, we'll be able to
remove that function entirely, since dscp_t ensures ECN bits are cleared
(IPTOS_RT_MASK also clears the high order bits, but that's redundant
with the RT_TOS() calls already done by callers, which aren't really
desirable anyway).
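For reference, a small userspace sketch of how these masks relate. The
mask values are assumed to match the kernel headers, and dscp_sketch_t
with dsfield_to_dscp_sketch() is a hypothetical stand-in for dscp_t and
its conversion helper, not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel definitions (not taken from this thread). */
#define IPTOS_TOS_MASK 0x1e
#define IPTOS_RT_MASK  (IPTOS_TOS_MASK & ~3)	/* 0x1c */
#define RT_TOS(tos)    ((tos) & IPTOS_TOS_MASK)
#define INET_DSCP_MASK 0xfc			/* upper six bits, no ECN */

/* Hypothetical stand-in for dscp_t: a distinct type, ECN bits always 0. */
typedef struct {
	uint8_t v;
} dscp_sketch_t;

/* The only way to build the type masks out the two ECN bits. */
static dscp_sketch_t dsfield_to_dscp_sketch(uint8_t dsfield)
{
	return (dscp_sketch_t){ .v = (uint8_t)(dsfield & INET_DSCP_MASK) };
}

/*
 * Aborts if applying RT_TOS() first ever changes the result of masking
 * with IPTOS_RT_MASK, i.e. if the two masks were not nested.
 */
static void check_rt_tos_redundant(void)
{
	for (unsigned int v = 0; v <= 0xff; v++)
		assert((RT_TOS(v) & IPTOS_RT_MASK) == (v & IPTOS_RT_MASK));
}
```

For a DS field of 0xb9, the DSCP conversion keeps 0xb8 and only drops
the ECN bits, whereas RT_TOS() followed by IPTOS_RT_MASK leaves 0x18,
which also loses the three high order bits.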
> >
> > rcu_read_lock();
> > rth = ip_route_output_key_hash_rcu(net, fl4, &res, skb);
>