From: Daniel Borkmann <daniel@iogearbox•net>
To: Jamal Hadi Salim <jhs@mojatatu•com>,
Shmulik Ladkani <shmulik.ladkani@gmail•com>
Cc: "David S. Miller" <davem@davemloft•net>,
WANG Cong <xiyou.wangcong@gmail•com>,
Eric Dumazet <edumazet@google•com>,
netdev@vger•kernel.org, Florian Westphal <fw@strlen•de>
Subject: Re: [PATCH net-next 4/4] net/sched: act_mirred: Implement ingress actions
Date: Sun, 25 Sep 2016 18:26:47 +0200
Message-ID: <57E7FAC7.6090904@iogearbox.net>
In-Reply-To: <6d2bd45a-a8a0-846d-5934-5e246522cab8@mojatatu.com>

On 09/25/2016 03:05 PM, Jamal Hadi Salim wrote:
> On 16-09-23 11:40 AM, Shmulik Ladkani wrote:
>> On Fri, 23 Sep 2016 08:48:33 -0400 Jamal Hadi Salim <jhs@mojatatu•com> wrote:
>>>> Even today, one may create loops using existing 'egress redirect',
>>>> e.g. this ridiculously erroneous construct:
>>>>
>>>> # ip l add v0 type veth peer name v0p
>>>> # tc filter add dev v0p parent ffff: basic \
>>>> action mirred egress redirect dev v0
>>>
>>> I think we actually recover from this one by eventually
>>> dropping (there's a ttl field).
>>
>> [off topic]
>>
>> Don't know about that :) cpu fan got very noisy, 3 of 4 cores at 100%,
>> and after one second I got:
>>
>> # ip -s l show type veth
>> 16: v0p@v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
>> link/ether a2:64:ff:10:dd:85 brd ff:ff:ff:ff:ff:ff
>> RX: bytes packets errors dropped overrun mcast
>> 71660305923 469890864 0 0 0 0
>> TX: bytes packets errors dropped carrier collsns
>> 3509 24 0 0 0 0
>> 17: v0@v0p: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
>> link/ether 52:a2:34:f6:7c:ec brd ff:ff:ff:ff:ff:ff
>> RX: bytes packets errors dropped overrun mcast
>> 3509 24 0 0 0 0
>> TX: bytes packets errors dropped carrier collsns
>> 71660713017 469893555 0 0 0 0
>>
>
> I think this is still on topic!
>
> Now I realize that code we took out around 4.2.x is still useful
> for such a use case (I wasn't thinking about veth when Florian was
> slimming the skb);
> +Cc Florian W.
>
> This snippet from 4.2:
> -------------
> static int ing_filter(struct sk_buff *skb, struct netdev_queue *rxq)
> {
>         struct net_device *dev = skb->dev;
>         u32 ttl = G_TC_RTTL(skb->tc_verd);
>         int result = TC_ACT_OK;
>         struct Qdisc *q;
>
>         if (unlikely(MAX_RED_LOOP < ttl++)) {
>                 net_warn_ratelimited("Redir loop detected Dropping packet (%d->%d)\n",
>                                      skb->skb_iif, dev->ifindex);
>                 return TC_ACT_SHOT;
>         }
>
>         skb->tc_verd = SET_TC_RTTL(skb->tc_verd, ttl);
>         skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
>
>         q = rcu_dereference(rxq->qdisc);
>         if (q != &noop_qdisc) {
>                 spin_lock(qdisc_lock(q));
>                 if (likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
>                         result = qdisc_enqueue_root(skb, q);
>                 spin_unlock(qdisc_lock(q));
>         }
>
>         return result;
> }
> --------------------
>
> MAX_RED_LOOP (stands for "Maximum Redirect loop") still exists in
> current code. The idea above was that we would increment the rttl
> counter once and if we saw it again up to MAX_RED_LOOP we would assume
> a loop and drop the packet (at the time I didn't think it was wise to
> let the actions be in charge of setting the RTTL; it had to be central
> core code - but it may not be necessary).
>
> Florian, when we discussed this I said it was fine to reclaim those
> 3 bits on tc verdict for RTTL at the time because I had taken out the
> feature and never added it back. Your comment at the time was we can
> add it back when someone shows up with the feature.
> Shmulik is looking to add it.

Why not just reuse xmit_recursion, which is what we did in tc cls_bpf
programs (for example, see __bpf_tx_skb())? Would be a pity to waste
3 bits on this in the skb.
>> Similarly to all constructs injecting skbs to device rx (bond/team,
>> vlan, macvlan, tunnels, ifb, __dev_forward_skb callers, etc..), we are
>> obligated to assign 'skb2->dev' as the new rx device.
>>
>> Regarding 'skb2->skb_iif', original act_mirred code already has:
>>
>> skb2->skb_iif = skb->dev->ifindex; <--- THIS IS ORIG DEV IIF
>> skb2->dev = dev; <--- THIS IS TARGET DEV
>> err = dev_queue_xmit(skb2);
>>
>> I'm preserving this; OTOH the suggested modification in the patch is
>>
>> - err = dev_queue_xmit(skb2);
>> + if (tcf_mirred_act_direction(m->tcfm_eaction) & AT_EGRESS)
>> + err = dev_queue_xmit(skb2);
>> + else
>> + netif_receive_skb(skb2);
>>
>> now, the call to 'netif_receive_skb' will eventually override skb_iif to
>> the target RX dev's index, upon entry to __netif_receive_skb_core.
>>
>> I think this IS the expected behavior - as done by other "rx injection"
>> constructs.
>
> Sounds fine.
> I am wondering if we can have a tracing feature to show the lifetime of
> the packet as it is being cycled around the kernel? It would help
> debugging if some policy misbehaves.
>
>> My doubts were around whether we should call 'dev_forward_skb' instead
>> of 'netif_receive_skb'.
>> The former does some things I assumed we're not interested in, like
>> testing 'is_skb_forwardable' and re-running 'eth_type_trans'.
>> OTOH, it DOES scrub the skb.
>> Maybe we should scrub it as well prior to the netif_receive_skb call?
>
> Scrubbing the skb could be a bad idea if it gets rid of global state
> like the RTTL if you add it back.
>
> cheers,
> jamal
>