From: John Fastabend <john.fastabend@gmail•com>
To: Eric Dumazet <eric.dumazet@gmail•com>
Cc: xiyou.wangcong@gmail•com, davem@davemloft•net, jhs@mojatatu•com,
netdev@vger•kernel.org, paulmck@linux•vnet.ibm.com,
brouer@redhat•com
Subject: Re: [net-next PATCH v3 09/15] net: sched: make cls_u32 lockless
Date: Tue, 09 Sep 2014 09:35:23 -0700 [thread overview]
Message-ID: <540F2C4B.6060404@gmail.com> (raw)
In-Reply-To: <1410268807.11872.177.camel@edumazet-glaptop2.roam.corp.google.com>
On 09/09/2014 06:20 AM, Eric Dumazet wrote:
> On Mon, 2014-09-08 at 22:58 -0700, John Fastabend wrote:
>> Make cls_u32 classifier safe to run without holding lock. This patch
>> converts statistics that are kept in read section u32_classify into
>> per cpu counters.
>>
>> This patch was tested with a tight u32 filter add/delete loop while
>> generating traffic with pktgen. By running pktgen on vlan devices
>> created on top of a physical device we can hit the qdisc layer
>> correctly. For ingress qdisc's a loopback cable was used.
>>
>> for i in {1..100}; do
>> q=`echo $i%8|bc`;
>> echo -n "u32 tos: iteration $i on queue $q";
>> tc filter add dev p3p2 parent $p prio $i u32 match ip tos 0x10 0xff \
>> action skbedit queue_mapping $q;
>> sleep 1;
>> tc filter del dev p3p2 prio $i;
>>
>> echo -n "u32 tos hash table: iteration $i on queue $q";
>> tc filter add dev p3p2 parent $p protocol ip prio $i handle 628: u32 divisor 1
>> tc filter add dev p3p2 parent $p protocol ip prio $i u32 \
>> match ip protocol 17 0xff link 628: offset at 0 mask 0xf00 shift 6 plus 0
>> tc filter add dev p3p2 parent $p protocol ip prio $i u32 \
>> ht 628:0 match ip tos 0x10 0xff action skbedit queue_mapping $q
>> sleep 2;
>> tc filter del dev p3p2 prio $i
>> sleep 1;
>> done
>>
>
>
> Note it might be easier to split this patch in 2 parts.
>
> (The percpu stuff could be done in a first step, then rcu conversion)
Sure, I'll split it up.
>
>> Signed-off-by: John Fastabend <john.r.fastabend@intel•com>
>> ---
[...]
>>
>> static inline unsigned int u32_hash_fold(__be32 key,
>> @@ -96,7 +103,7 @@ static int u32_classify(struct sk_buff *skb, const struct tcf_proto *tp, struct
>> unsigned int off;
>> } stack[TC_U32_MAXDEPTH];
>>
>> - struct tc_u_hnode *ht = tp->root;
>> + struct tc_u_hnode *ht = rcu_dereference_bh(tp->root);
>> unsigned int off = skb_network_offset(skb);
>> struct tc_u_knode *n;
>> int sdepth = 0;
>> @@ -108,23 +115,23 @@ static int u32_classify(struct sk_buff *skb, const struct tcf_proto *tp, struct
>> int i, r;
>>
>> next_ht:
>> - n = ht->ht[sel];
>> + n = rcu_dereference_bh(ht->ht[sel]);
>>
>> next_knode:
>> if (n) {
>> struct tc_u32_key *key = n->sel.keys;
>>
>> #ifdef CONFIG_CLS_U32_PERF
>> - n->pf->rcnt += 1;
>> + this_cpu_inc(n->pf->rcnt);
>
> As we run in BH, we are not preemptable, we can use instead :
> __this_cpu_inc() (on all occurrences)
>
> Using a macro would also reduce the #ifdef mess of this file
>
I'll go ahead and do the macro conversion.
>> j = 0;
>> #endif
>>
>> #ifdef CONFIG_CLS_U32_MARK
>> - if ((skb->mark & n->mark.mask) != n->mark.val) {
>> - n = n->next;
>> + if ((skb->mark & n->mask) != n->val) {
>> + n = rcu_dereference_bh(n->next);
>> goto next_knode;
>> } else {
>> - n->mark.success++;
>> + this_cpu_inc(*n->pcpu_success);
>> }
>> #endif
>>
[...]
>> @@ -326,11 +342,11 @@ static int u32_init(struct tcf_proto *tp)
>> }
>>
>> tp_c->refcnt++;
>> - root_ht->next = tp_c->hlist;
>> - tp_c->hlist = root_ht;
>> + rcu_assign_pointer(root_ht->next, tp_c->hlist);
>
> RCU_INIT_POINTER() or root_ht->next = tp_c->hlist;
>
I'll fix all these as well.
>> + rcu_assign_pointer(tp_c->hlist, root_ht);
>> root_ht->tp_c = tp_c;
>>
>> - tp->root = root_ht;
>> + rcu_assign_pointer(tp->root, root_ht);
>> tp->data = tp_c;
>> return 0;
>> }
>> @@ -342,25 +358,33 @@ static int u32_destroy_key(struct tcf_proto *tp, struct tc_u_knode *n)
>> if (n->ht_down)
>> n->ht_down->refcnt--;
>> #ifdef CONFIG_CLS_U32_PERF
>> - kfree(n->pf);
>> + free_percpu(n->pf);
>> #endif
>> kfree(n);
>> return 0;
>> }
>>
>> +void __u32_delete_key(struct rcu_head *rcu)
>
> Can we consistently use _rcu, as in u32_delete_key_rcu() ?
Yes, I'll rename these calls.
Thanks,
John
--
John Fastabend Intel Corporation
next prev parent reply other threads:[~2014-09-09 16:35 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-09 5:53 [net-next PATCH v3 00/15] net/sched use rcu filters John Fastabend
2014-09-09 5:54 ` [net-next PATCH v3 01/15] net: qdisc: use rcu prefix and silence sparse warnings John Fastabend
2014-09-09 12:05 ` Eric Dumazet
2014-09-09 5:54 ` [net-next PATCH v3 02/15] net: rcu-ify tcf_proto John Fastabend
2014-09-09 12:20 ` Eric Dumazet
2014-09-09 15:15 ` John Fastabend
2014-09-09 15:20 ` Eric Dumazet
2014-09-09 5:55 ` [net-next PATCH v3 03/15] net: sched: cls_basic use RCU John Fastabend
2014-09-09 12:29 ` Eric Dumazet
2014-09-09 5:55 ` [net-next PATCH v3 04/15] net: sched: cls_cgroup " John Fastabend
2014-09-09 12:34 ` Eric Dumazet
2014-09-09 5:56 ` [net-next PATCH v3 05/15] net: sched: cls_flow " John Fastabend
2014-09-09 12:41 ` Eric Dumazet
2014-09-09 16:09 ` John Fastabend
2014-09-09 16:20 ` Eric Dumazet
2014-09-09 5:56 ` [net-next PATCH v3 06/15] net: sched: fw " John Fastabend
2014-09-09 12:48 ` Eric Dumazet
2014-09-09 5:57 ` [net-next PATCH v3 07/15] net: sched: RCU cls_route John Fastabend
2014-09-09 12:59 ` Eric Dumazet
2014-09-09 16:24 ` John Fastabend
2014-09-09 5:57 ` [net-next PATCH v3 08/15] net: sched: RCU cls_tcindex John Fastabend
2014-09-09 5:58 ` [net-next PATCH v3 09/15] net: sched: make cls_u32 lockless John Fastabend
2014-09-09 13:20 ` Eric Dumazet
2014-09-09 16:35 ` John Fastabend [this message]
2014-09-09 18:02 ` Cong Wang
2014-09-10 4:53 ` John Fastabend
2014-09-09 5:58 ` [net-next PATCH v3 10/15] net: sched: rcu'ify cls_rsvp John Fastabend
2014-09-09 13:27 ` Eric Dumazet
2014-09-09 16:46 ` John Fastabend
2014-09-09 5:59 ` [net-next PATCH v3 11/15] net: sched: rcu'ify cls_bpf John Fastabend
2014-09-09 5:59 ` [net-next PATCH v3 12/15] net: sched: make tc_action safe to walk under RCU John Fastabend
2014-09-09 6:00 ` [net-next PATCH v3 13/15] net: sched: make bstats per cpu and estimator RCU safe John Fastabend
2014-09-09 6:00 ` [net-next PATCH v3 14/15] net: sched: make qstats per cpu John Fastabend
2014-09-09 6:01 ` [net-next PATCH v3 15/15] net: sched: drop ingress qdisc lock John Fastabend
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=540F2C4B.6060404@gmail.com \
--to=john.fastabend@gmail$(echo .)com \
--cc=brouer@redhat$(echo .)com \
--cc=davem@davemloft$(echo .)net \
--cc=eric.dumazet@gmail$(echo .)com \
--cc=jhs@mojatatu$(echo .)com \
--cc=netdev@vger$(echo .)kernel.org \
--cc=paulmck@linux$(echo .)vnet.ibm.com \
--cc=xiyou.wangcong@gmail$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox