From: Robert Shearman <rshearma@brocade•com>
To: roopa <roopa@cumulusnetworks•com>
Cc: <netdev@vger•kernel.org>,
"Eric W. Biederman" <ebiederm@xmission•com>,
Thomas Graf <tgraf@suug•ch>,
Dinesh Dutt <ddutt@cumulusnetworks•com>,
"Vivek Venkatraman" <vivek@cumulusnetworks•com>
Subject: Re: [RFC net-next 2/3] ipv4: storing and retrieval of per-nexthop encap
Date: Tue, 2 Jun 2015 17:35:18 +0100 [thread overview]
Message-ID: <556DDB46.9010701@brocade.com> (raw)
In-Reply-To: <556DD368.8070806@cumulusnetworks.com>
On 02/06/15 17:01, roopa wrote:
> On 6/1/15, 9:46 AM, Robert Shearman wrote:
>> Parse RTA_ENCAP attribute for one path and multipath routes. The encap
>> length is stored in a newly added field to fib_nh, nh_encap_len,
>> although this is added to a padding hole in the structure so that it
>> doesn't increase the size at all. The encap data itself is stored at
>> the end of the array of nexthops. Whilst this means that retrieval
>> isn't optimal, especially if there are multiple nexthops, this avoids
>> the memory cost of an extra pointer, as well as any potential change
>> to the cache or instruction layout that could cause a performance
>> impact.
>>
>> Currently, the dst structure allocated to represent the destination of
>> the packet and used for retrieving the encap by the encap-type
>> interface has been grown through the addition of the rt_encap_len and
>> rt_encap fields. This isn't desirable and could be fixed by defining a
>> new destination type with operations copied from the normal case,
>> other than the addition of the get_encap operation.
>>
>> Signed-off-by: Robert Shearman <rshearma@brocade•com>
...
>> @@ -434,6 +445,83 @@ static int fib_detect_death(struct fib_info *fi,
>> int order,
>> return 1;
>> }
>> +static int fib_total_encap(struct fib_config *cfg)
>> +{
>> + struct net *net = cfg->fc_nlinfo.nl_net;
>> + int total_encap_len = 0;
>> +
>> + if (cfg->fc_mp) {
>> + int remaining = cfg->fc_mp_len;
>> + struct rtnexthop *rtnh = cfg->fc_mp;
>> +
>> + while (rtnh_ok(rtnh, remaining)) {
>> + struct nlattr *nla, *attrs = rtnh_attrs(rtnh);
>> + int attrlen;
>> +
>> + attrlen = rtnh_attrlen(rtnh);
>> + nla = nla_find(attrs, attrlen, RTA_ENCAP);
>> + if (nla) {
>> + struct net_device *dev;
>> + int len;
>> +
>> + dev = __dev_get_by_index(net,
>> + rtnh->rtnh_ifindex);
>> + if (!dev)
>> + return -EINVAL;
>> +
>> + /* Determine space required */
>> + len = rtnl_parse_encap(dev, nla, NULL);
>> + if (len < 0)
>> + return len;
>> +
>> + total_encap_len += len;
>> + }
>> +
>> + rtnh = rtnh_next(rtnh, &remaining);
>> + }
>> + } else {
>> + if (cfg->fc_encap) {
>> + struct net_device *dev;
>> + int len;
>> +
>> + dev = __dev_get_by_index(net, cfg->fc_oif);
>> + if (!dev)
>> + return -EINVAL;
>> +
>> + /* Determine space required */
>> + len = rtnl_parse_encap(dev, cfg->fc_encap, NULL);
>> + if (len < 0)
>> + return len;
>> +
>> + total_encap_len += len;
>> + }
>> + }
>> +
>> + return total_encap_len;
>> +}
> we could avoid parsing and finding this device twice, if fib_nh just
> held a pointer to the encap_info
> (or tunnel info) ?. And the encap_info/tun_info could be refcounted and
> shared between
> nexthops ?. In my implementation i have just a pointer to parsed encap
> state
> in fib_nh
Right - I took the approach here to avoid any extra memory use if encap
isn't in use for the nexthop/route, and to avoid any potential
performance penalty caused by extra cache misses. It would certainly
make things simpler if those weren't concerns. I'd appreciate input from
the community on this.
>> @@ -1403,6 +1418,15 @@ static void rt_set_nexthop(struct rtable *rt,
>> __be32 daddr,
>> #ifdef CONFIG_IP_ROUTE_CLASSID
>> rt->dst.tclassid = nh->nh_tclassid;
>> #endif
>> +
>> + nh_encap = fib_get_nh_encap(fi, nh);
>> + if (unlikely(nh_encap)) {
>> + rt->rt_encap = kmemdup(nh_encap, nh->nh_encap_len,
>> + GFP_KERNEL);
>> + if (rt->rt_encap)
>> + rt->rt_encap_len = nh->nh_encap_len;
>> + }
>> +
>
> And..., you could make the rtable point to the same encap info.
Ack.
Thanks,
Rob
next prev parent reply other threads:[~2015-06-02 16:36 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-01 16:46 [RFC net-next 0/3] IP imposition of per-nh MPLS encap Robert Shearman
2015-06-01 16:46 ` [RFC net-next 1/3] net: infra for per-nexthop encap data Robert Shearman
2015-06-02 18:15 ` Eric W. Biederman
2015-06-01 16:46 ` [RFC net-next 2/3] ipv4: storing and retrieval of per-nexthop encap Robert Shearman
2015-06-02 16:01 ` roopa
2015-06-02 16:35 ` Robert Shearman [this message]
2015-06-01 16:46 ` [RFC net-next 3/3] mpls: new ipmpls device for encapsulating IP packets as mpls Robert Shearman
2015-06-02 16:15 ` roopa
2015-06-02 16:33 ` Robert Shearman
2015-06-02 18:57 ` roopa
2015-06-02 21:06 ` Robert Shearman
2015-06-03 18:43 ` Vivek Venkatraman
2015-06-04 18:46 ` Robert Shearman
2015-06-04 21:38 ` Vivek Venkatraman
2015-06-02 18:26 ` Eric W. Biederman
2015-06-02 21:37 ` Thomas Graf
2015-06-02 22:48 ` Eric W. Biederman
2015-06-02 23:23 ` Eric W. Biederman
2015-06-03 9:50 ` Thomas Graf
2015-06-02 0:06 ` [RFC net-next 0/3] IP imposition of per-nh MPLS encap Thomas Graf
2015-06-02 13:28 ` Robert Shearman
2015-06-02 21:43 ` Thomas Graf
2015-06-03 13:30 ` Robert Shearman
2015-06-02 15:31 ` roopa
2015-06-02 18:30 ` Eric W. Biederman
2015-06-02 18:39 ` roopa
2015-06-02 18:11 ` Eric W. Biederman
2015-06-02 20:57 ` Robert Shearman
2015-06-02 21:10 ` Eric W. Biederman
2015-06-02 22:15 ` Robert Shearman
2015-06-02 22:58 ` Eric W. Biederman
2015-06-04 15:12 ` Nicolas Dichtel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=556DDB46.9010701@brocade.com \
--to=rshearma@brocade$(echo .)com \
--cc=ddutt@cumulusnetworks$(echo .)com \
--cc=ebiederm@xmission$(echo .)com \
--cc=netdev@vger$(echo .)kernel.org \
--cc=roopa@cumulusnetworks$(echo .)com \
--cc=tgraf@suug$(echo .)ch \
--cc=vivek@cumulusnetworks$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox