public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Alexei Starovoitov <ast@plumgrid•com>
To: Eric Dumazet <eric.dumazet@gmail•com>
Cc: "David S. Miller" <davem@davemloft•net>,
	Michael Holzheu <holzheu@linux•vnet.ibm.com>,
	Daniel Borkmann <daniel@iogearbox•net>,
	netdev@vger•kernel.org
Subject: Re: [PATCH net-next 1/2] bpf: introduce bpf_skb_vlan_push/pop() helpers
Date: Fri, 17 Jul 2015 11:17:33 -0700	[thread overview]
Message-ID: <55A946BD.4030008@plumgrid.com> (raw)
In-Reply-To: <1437120738.1026.16.camel@edumazet-glaptop2.roam.corp.google.com>

On 7/17/15 1:12 AM, Eric Dumazet wrote:
> On Thu, 2015-07-16 at 19:58 -0700, Alexei Starovoitov wrote:
>> In order to let eBPF programs call skb_vlan_push/pop via helper functions
>
> Why should eBPF program do such thing ?
>
> Are BPF users in the kernel expecting skb being changed, and are we sure
> they reload all cached values when/if needed ?

well, classic BPF and even extended BPF with socket filters cannot use
these helpers. They are for TC ingress/egress only. There different
actions can already change skb->data while classifiers/actions are
running. btw, before we started discussing this topic at nfws,
I thought that bpf programs will never be able to change skb->data from
inside the programs, but turned out it's only JITs that needed
re-caching. Programs cannot see skb->data. They can access packet
only via ld_abs/ld_ind instructions and helper functions.
So any changes to internal fields of skb are invisible to programs.
skb->data/hlen cache that is part of JIT is also invisible to the
programs. It's an optimization that some JITs do. arm64 JIT doesn't
do this optimization, for example.
I'll reword commit log to better explain this.

One of the use cases is this phys2virt gateway I presented. There I
need to do vlan-learning and src mac forwarding. Currently I'm
creating as many as I can vlan netdevs on top of regular eth0
and attach tc+bpf to all of them. That's very inefficient.
With these helpers I'll attach tc+bpf to eth0 only and skip creation
of thousands of vlan netdevs.

>> eBPF JITs need to recognize helpers that change skb->data, since
>> skb->data and hlen are cached as part of JIT code generation.
>> - arm64 JIT is using bpf_load_pointer() without caching, so it's ok as-is.
>> - x64 JIT recognizes bpf_skb_vlan_push/pop() calls and re-caches skb->data/hlen
>>    after such calls (experiments showed that conditional re-caching is slower).
>> - s390 JIT falls back to interpreter for now when bpf_skb_vlan_push() is present
>>    in the program (re-caching is tbd).
>
>
>
>> +static u64 bpf_skb_vlan_push(u64 r1, u64 vlan_proto, u64 vlan_tci, u64 r4, u64 r5)
>> +{
>> +	struct sk_buff *skb = (struct sk_buff *) (long) r1;
>> +
>> +	if (unlikely(vlan_proto != htons(ETH_P_8021Q) &&
>> +		     vlan_proto != htons(ETH_P_8021AD)))
>> +		vlan_proto = htons(ETH_P_8021Q);
>
>
> This would raise sparse error, as vlan_proto is u64, and htons() __be16
>
> make C=2 CF=-D__CHECK_ENDIAN__ net/core/filter.o

yes.
When I wrote these lines I thought of the same, so I did run the sparse
and it spewed a lot of false positives and stopped on 'too many errors'
before reaching these lines.
So I downloaded the latest sparse, hacked it a little and tried again.
Still it didn't complain about the endianness. That was puzzling,
so I left the above lines as-is.
but since your eagle eyes caught it, I will add casts :)

  reply	other threads:[~2015-07-17 18:17 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-17  2:58 [PATCH net-next 0/2] bpf: introduce bpf_skb_vlan_push/pop() helpers Alexei Starovoitov
2015-07-17  2:58 ` [PATCH net-next 1/2] " Alexei Starovoitov
2015-07-17  8:12   ` Eric Dumazet
2015-07-17 18:17     ` Alexei Starovoitov [this message]
2015-07-17  2:58 ` [PATCH net-next 2/2] test_bpf: add bpf_skb_vlan_push/pop() tests Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55A946BD.4030008@plumgrid.com \
    --to=ast@plumgrid$(echo .)com \
    --cc=daniel@iogearbox$(echo .)net \
    --cc=davem@davemloft$(echo .)net \
    --cc=eric.dumazet@gmail$(echo .)com \
    --cc=holzheu@linux$(echo .)vnet.ibm.com \
    --cc=netdev@vger$(echo .)kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox