public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Jamal Hadi Salim <jhs@mojatatu•com>
To: davem@davemloft•net, stephen@networkplumber•org
Cc: netdev@vger•kernel.org, vyasevic@redhat•com,
	sfeldma@cumulusnetworks•com, john.r.fastabend@intel•com,
	roopa@cumulusnetworks•com
Subject: Re: [net-next PATCH 2/2] bridge netlink dump interface at par with brctl Actually better than brctl showmacs because we can filter by bridge port in the kernel
Date: Sun, 01 Jun 2014 08:16:41 -0400	[thread overview]
Message-ID: <538B19A9.4050607@mojatatu.com> (raw)
In-Reply-To: <1401623780-4297-2-git-send-email-jhs@emojatatu.com>


This is mostly to you Vlad since you brought it up earlier.
I ended using ifm instead of ndm. Currently there is lack of
symettry - we send requests with ifm and get responses with
ndms. Unfortunately after spending 2-3 hours I came to the
conclusion i cant change it without breaking old iproute2s that
were expecting this behavior. What we have here is a magnitude
better filtering but we could have done slightly better if we
were able to use an ndm. A little acrobatics later on to filter
by vlans may work..

cheers,
jamal

On 06/01/14 07:56, Jamal Hadi Salim wrote:
> From: Jamal Hadi Salim <jhs@mojatatu•com>
>
> The current bridge netlink interface doesnt scale when you have many bridges each
> with large fdbs or even bridges with many bridge ports
>
> Example usage:
>
> Lets start with two bridges each with a port...
>
> root@moja-mojo:bridge# ./bridge link
> 8: eth1 state DOWN : <BROADCAST,MULTICAST> mtu 1500 master br0 state disabled priority 32 cost 19
> 17: sw1-p1 state DOWN : <BROADCAST,NOARP> mtu 1500 master sw1 state disabled priority 32 cost 100
>
> show all...
> root@moja-mojo:bridge# ./bridge fdb show
> 33:33:00:00:00:01 dev bond0 self permanent
> 33:33:00:00:00:01 dev dummy0 self permanent
> 33:33:00:00:00:01 dev ifb0 self permanent
> 33:33:00:00:00:01 dev ifb1 self permanent
> 33:33:00:00:00:01 dev eth0 self permanent
> 01:00:5e:00:00:01 dev eth0 self permanent
> 33:33:ff:22:01:01 dev eth0 self permanent
> 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 dev eth1 self permanent
> 33:33:00:00:00:01 dev eth1 self permanent
> 33:33:00:00:00:01 dev gretap0 self permanent
> 33:33:00:00:00:01 dev br0 self permanent
> 33:33:00:00:00:01 dev sw1 self permanent
> a2:fb:21:4c:47:25 dev sw1-p1 vlan 0 master sw1 permanent
> 33:33:00:00:00:01 dev sw1-p1 self permanent
>
> Lets see a port that is not attached to a bridge
> root@moja-mojo:bridge# ./bridge fdb show brport eth0
> 33:33:00:00:00:01 self permanent
> 01:00:5e:00:00:01 self permanent
> 33:33:ff:22:01:01 self permanent
>
> Lets see a port that is attached to a bridge
> root@moja-mojo:bridge# ./bridge fdb show brport eth1
> 02:00:00:12:01:02 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 self permanent
> 33:33:00:00:00:01 self permanent
>
> Specify the correct bridge and you get good stuff
> root@moja-mojo:bridge# ./bridge fdb show brport eth1 br br0
> 02:00:00:12:01:02 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 self permanent
> 33:33:00:00:00:01 self permanent
>
> Specify the wrong bridge and you get good nada
> root@moja-mojo:bridge# ./bridge fdb show brport eth1 br sw1
>
> dump only br0
> root@moja-mojo:bridge# ./bridge fdb show br br0
> 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 dev eth1 self permanent
> 33:33:00:00:00:01 dev eth1 self permanent
>
> Lets move a port from one bridge to another for shits-and-giggles
> (as they say in New Brunswick)
> root@moja-mojo:bridge# ip link set sw1-p1 master br0
>
> Now dump again br0
> root@moja-mojo:bridge# ./bridge fdb show br br0
> 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent
> 00:17:42:8a:b4:07 dev eth1 self permanent
> 33:33:00:00:00:01 dev eth1 self permanent
> a2:fb:21:4c:47:25 dev sw1-p1 vlan 0 master br0 permanent
> 33:33:00:00:00:01 dev sw1-p1 self permanent
>
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu•com>
> ---
>   net/core/rtnetlink.c |   68 +++++++++++++++++++++++++++++++++++++++++---------
>   1 file changed, 56 insertions(+), 12 deletions(-)
>
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 064418e..71e6bc8 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -2508,26 +2508,70 @@ EXPORT_SYMBOL(ndo_dflt_fdb_dump);
>
>   static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
>   {
> -	int idx = 0;
> -	struct net *net = sock_net(skb->sk);
>   	struct net_device *dev;
> +	struct net_device *br_dev;
> +	struct nlattr *tb[IFLA_MAX+1];
> +	const struct net_device_ops *ops;
> +	struct ifinfomsg *ifm = nlmsg_data(cb->nlh);
> +	struct net *net = sock_net(skb->sk);
> +	int brport_idx = 0;
> +	int br_idx = 0;
> +	int idx = 0;
> +
> +	if (nlmsg_parse(cb->nlh, sizeof(struct ifinfomsg), tb, IFLA_MAX,
> +			ifla_policy) == 0) {
> +		if (tb[IFLA_MASTER])
> +			br_idx = nla_get_u32(tb[IFLA_MASTER]);
> +	}
> +	
> +	brport_idx = ifm->ifi_index;
>
>   	rcu_read_lock();
>   	for_each_netdev_rcu(net, dev) {
> -		if (dev->priv_flags & IFF_BRIDGE_PORT) {
> -			struct net_device *br_dev;
> -			const struct net_device_ops *ops;
>
> -			br_dev = netdev_master_upper_dev_get(dev);
> +		if (brport_idx && (dev->ifindex != brport_idx))
> +			continue;
> +
> +		if (!br_idx) {
> +			if (dev->priv_flags & IFF_BRIDGE_PORT) {
> +				br_dev = netdev_master_upper_dev_get(dev);
> +				ops = br_dev->netdev_ops;
> +				if (ops->ndo_fdb_dump)
> +					idx = ops->ndo_fdb_dump(skb, cb, br_dev,
> +								dev, idx);
> +			}
> +
> +			/* all of bridge fdb entries are dumped via brports fdb
> +			 * therefore only allow for selfies for bridges
> +			*/
> +			if (!(dev->priv_flags & IFF_EBRIDGE) &&
> +			      dev->netdev_ops->ndo_fdb_dump)
> +				idx = dev->netdev_ops->ndo_fdb_dump(skb, cb, dev,
> +								    NULL, idx);
> +			else
> +				idx = ndo_dflt_fdb_dump(skb, cb, dev, NULL, idx);
> +
> +		} else {
> +			if (!(dev->priv_flags & IFF_BRIDGE_PORT))
> +				continue;
> +
> +			br_dev = __dev_get_by_index(net, br_idx);
> +			if (!br_dev)
> +				return -ENODEV;
> +
> +			if (br_dev != netdev_master_upper_dev_get(dev))
> +				continue;
> +
>   			ops = br_dev->netdev_ops;
>   			if (ops->ndo_fdb_dump)
> -				idx = ops->ndo_fdb_dump(skb, cb, dev, NULL, idx);
> -		}
> +				idx = ops->ndo_fdb_dump(skb, cb, br_dev, dev, idx);
>
> -		if (dev->netdev_ops->ndo_fdb_dump)
> -			idx = dev->netdev_ops->ndo_fdb_dump(skb, cb, dev, NULL, idx);
> -		else
> -			idx = ndo_dflt_fdb_dump(skb, cb, dev, NULL, idx);
> +			if (dev->netdev_ops->ndo_fdb_dump)
> +				idx = dev->netdev_ops->ndo_fdb_dump(skb, cb, dev,
> +								    NULL, idx);
> +			else
> +				idx = ndo_dflt_fdb_dump(skb, cb, dev, NULL, idx);
> +		}
>   	}
>   	rcu_read_unlock();
>
>

  reply	other threads:[~2014-06-01 12:16 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-01 11:56 [net-next PATCH 1/2] bridge fdb dumping takes a filter device Dumping a bridge fdb dumps every fdb entry held. With this change we are going to filter on selected bridge port Jamal Hadi Salim
2014-06-01 11:56 ` [net-next PATCH 2/2] bridge netlink dump interface at par with brctl Actually better than brctl showmacs because we can filter by bridge port in the kernel Jamal Hadi Salim
2014-06-01 12:16   ` Jamal Hadi Salim [this message]
2014-06-01 12:24     ` Jamal Hadi Salim
2014-06-02 15:34   ` Vlad Yasevich
2014-06-02 22:17     ` Jamal Hadi Salim
2014-06-05  7:15 ` [net-next PATCH 1/2] bridge fdb dumping takes a filter device Dumping a bridge fdb dumps every fdb entry held. With this change we are going to filter on selected bridge port David Miller
2014-06-07 12:41   ` Jamal Hadi Salim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=538B19A9.4050607@mojatatu.com \
    --to=jhs@mojatatu$(echo .)com \
    --cc=davem@davemloft$(echo .)net \
    --cc=john.r.fastabend@intel$(echo .)com \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=roopa@cumulusnetworks$(echo .)com \
    --cc=sfeldma@cumulusnetworks$(echo .)com \
    --cc=stephen@networkplumber$(echo .)org \
    --cc=vyasevic@redhat$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox