From: Roopa Prabhu <roopa@cumulusnetworks•com>
To: David Ahern <dsa@cumulusnetworks•com>
Cc: netdev@vger•kernel.org, ddutt@cumulusnetworks•com
Subject: Re: [PATCH net-next v2 1/3] net: ipv6: Allow shorthand delete of all nexthops in multipath route
Date: Mon, 16 Jan 2017 07:48:55 -0800 [thread overview]
Message-ID: <587CEB67.3040807@cumulusnetworks.com> (raw)
In-Reply-To: <1484510826-2723-2-git-send-email-dsa@cumulusnetworks.com>
On 1/15/17, 12:07 PM, David Ahern wrote:
> IPv4 allows multipath routes to be deleted using just the prefix and
> length. For example:
> $ ip ro ls vrf red
> unreachable default metric 8192
> 1.1.1.0/24
> nexthop via 10.100.1.254 dev eth1 weight 1
> nexthop via 10.11.200.2 dev eth11.200 weight 1
> 10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3
> 10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3
>
> $ ip ro del 1.1.1.0/24 vrf red
>
> $ ip ro ls vrf red
> unreachable default metric 8192
> 10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3
> 10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3
>
> The same notation does not work with IPv6 because of how multipath routes
> are implemented for IPv6. For IPv6 only the first nexthop of a multipath
> route is deleted if the request contains only a prefix and length. This
> leads to unnecessary complexity in userspace dealing with IPv6 multipath
> routes.
>
> This patch allows all nexthops to be deleted without specifying each one
> in the delete request by passing a new flag, RTM_F_ALL_NEXTHOPS, in
> rtm_flags. Internally, this is done by walking the sibling list of the
> route matching the specifications given (prefix, length, metric, protocol,
> etc).
>
> With this patch (and an updated iproute2 command):
> $ ip -6 ro ls vrf red
> 2001:db8::/120 via 2001:db8:1::62 dev eth1 metric 256 pref medium
> 2001:db8::/120 via 2001:db8:1::61 dev eth1 metric 256 pref medium
> 2001:db8::/120 via 2001:db8:1::60 dev eth1 metric 256 pref medium
> 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium
> ...
>
> $ ip -6 ro del vrf red 1111::1/120
> $ ip -6 ro ls vrf red
> 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium
> ...
>
> The flag is added to fib6_config by converting fc_type to a u16 (as
> noted fc_type only uses 8 bits). The new u16 hole is a bitmap with
> fc_delete_all_nexthop as the first bit.
>
> Suggested-by: Dinesh Dutt <ddutt@cumulusnetworks•com>
> Signed-off-by: David Ahern <dsa@cumulusnetworks•com>
> ---
> v2
> - switched example to rfc 3849 documentation address per request
> - changed delete loop to explicitly look at siblings list for
> first route matching specs given (metric, protocol, etc)
>
> include/net/ip6_fib.h | 4 +++-
> include/uapi/linux/rtnetlink.h | 1 +
> net/ipv6/route.c | 28 +++++++++++++++++++++++++---
> 3 files changed, 29 insertions(+), 4 deletions(-)
>
> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> index a74e2aa40ef4..11ab99e87c5f 100644
> --- a/include/net/ip6_fib.h
> +++ b/include/net/ip6_fib.h
> @@ -37,7 +37,9 @@ struct fib6_config {
> int fc_ifindex;
> u32 fc_flags;
> u32 fc_protocol;
> - u32 fc_type; /* only 8 bits are used */
> + u16 fc_type; /* only 8 bits are used */
> + u16 fc_delete_all_nexthop : 1,
> + __unused : 15;
>
> struct in6_addr fc_dst;
> struct in6_addr fc_src;
> diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
> index 8c93ad1ef9ab..7fb206bc42f9 100644
> --- a/include/uapi/linux/rtnetlink.h
> +++ b/include/uapi/linux/rtnetlink.h
> @@ -276,6 +276,7 @@ enum rt_scope_t {
> #define RTM_F_EQUALIZE 0x400 /* Multipath equalizer: NI */
> #define RTM_F_PREFIX 0x800 /* Prefix addresses */
> #define RTM_F_LOOKUP_TABLE 0x1000 /* set rtm_table to FIB lookup result */
> +#define RTM_F_ALL_NEXTHOPS 0x2000 /* delete all nexthops (IPv6) */
>
Do we really need the flag ?. It seems like delete with just prefix should delete all the routes in a multipath
route by default... (understand that you have it there to preserve existing behavior...for people who maybe relying on it. But this seems more like a bug fix. route replace went through a few such bug fixes "ipv6: fix ECMP route replacement"). ok with either approach.
next prev parent reply other threads:[~2017-01-16 15:48 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-01-15 20:07 [PATCH net-next 0/3] net: ipv6: Improve user experience with multipath routes David Ahern
2017-01-15 20:07 ` [PATCH net-next v2 1/3] net: ipv6: Allow shorthand delete of all nexthops in multipath route David Ahern
2017-01-16 15:48 ` Roopa Prabhu [this message]
2017-01-16 15:58 ` David Ahern
2017-01-17 0:51 ` David Miller
2017-01-17 1:27 ` David Ahern
2017-01-17 1:37 ` David Miller
2017-01-17 1:38 ` David Ahern
2017-01-15 20:07 ` [PATCH net-next 2/3] net: ipv6: remove nowait arg to rt6_fill_node David Ahern
2017-01-15 20:07 ` [PATCH net-next 3/3] net: ipv6: Add option to dump multipath routes via RTA_MULTIPATH attribute David Ahern
2017-01-16 17:40 ` David Ahern
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=587CEB67.3040807@cumulusnetworks.com \
--to=roopa@cumulusnetworks$(echo .)com \
--cc=ddutt@cumulusnetworks$(echo .)com \
--cc=dsa@cumulusnetworks$(echo .)com \
--cc=netdev@vger$(echo .)kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox