public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox•net>
To: Alexei Starovoitov <alexei.starovoitov@gmail•com>,
	David Miller <davem@davemloft•net>
Cc: brouer@redhat•com, kubakici@wp•pl, netdev@vger•kernel.org,
	xdp-newbies@vger•kernel.org
Subject: Re: [PATCH v3 net-next RFC] Generic XDP
Date: Tue, 18 Apr 2017 01:33:48 +0200	[thread overview]
Message-ID: <58F550DC.3050200@iogearbox.net> (raw)
In-Reply-To: <20170417230436.GA96258@ast-mbp.thefacebook.com>

On 04/18/2017 01:04 AM, Alexei Starovoitov wrote:
> On Mon, Apr 17, 2017 at 03:49:55PM -0400, David Miller wrote:
>> From: Jesper Dangaard Brouer <brouer@redhat•com>
>> Date: Sun, 16 Apr 2017 22:26:01 +0200
>>
>>> The bpf tail-call use-case is a very good example of why the
>>> verifier cannot deduct the needed HEADROOM upfront.
>>
>> This brings up a very interesting question for me.
>>
>> I notice that tail calls are implemented by JITs largely by skipping
>> over the prologue of that destination program.
>>
>> However, many JITs preload cached SKB values into fixed registers in
>> the prologue.  But they only do this if the program being JITed needs
>> those values.
>>
>> So how can it work properly if a program that does not need the SKB
>> values tail calls into one that does?
>
> For x86 JIT it's fine, since caching of skb values is not part of the prologue:
>    emit_prologue(&prog);
>    if (seen_ld_abs)
>            emit_load_skb_data_hlen(&prog);
> and tail_call jumps into the next program as:
>    EMIT4(0x48, 0x83, 0xC0, PROLOGUE_SIZE);   /* add rax, prologue_size */
>    EMIT2(0xFF, 0xE0);                        /* jmp rax */
> whereas inside emit_prologue() we have:
> B  UILD_BUG_ON(cnt != PROLOGUE_SIZE);
>
> arm64 has similar proplogue skipping code and it's even
> simpler than x86, since it doesn't try to optimize LD_ABS/IND in assembler
> and instead calls into bpf_load_pointer() from generated code,
> so no caching of skb values at all.
>
> s390 jit has partial skipping of prologue, since bunch
> of registers are save/restored during tail_call and it looks fine
> to me as well.

And ppc64 does unwinding/tearing down the stack of the prog before
jumping into the other program. Thus, no skipping of others prologue;
looks fine, too.

> It's very hard to extend test_bpf.ko with tail_calls, since maps need
> to be allocated and populated with file descriptors which are
> not feasible to do from .ko. Instead we need a user space based test for it.
> We've started building one in tools/testing/selftests/bpf/test_progs.c
> much more tests need to be added. Thorough testing of tail_calls
> is on the todo list.

  reply	other threads:[~2017-04-17 23:33 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-12 18:54 [PATCH v3 net-next RFC] Generic XDP David Miller
2017-04-12 19:54 ` David Ahern
2017-04-13  2:08   ` David Miller
2017-04-13  2:16     ` David Ahern
2017-04-12 21:30 ` Stephen Hemminger
2017-04-12 21:49   ` Eric Dumazet
2017-04-13  1:55     ` David Miller
2017-04-13  1:54   ` David Miller
2017-04-13  4:20 ` Alexei Starovoitov
2017-04-13  6:10   ` Johannes Berg
2017-04-13 15:38     ` David Miller
2017-04-14 19:41       ` Alexei Starovoitov
2017-04-18  9:47         ` Johannes Berg
2017-04-18 23:09           ` Alexei Starovoitov
2017-04-13 15:37   ` David Miller
2017-04-13 19:22     ` Johannes Berg
2017-04-13 20:01       ` David Miller
2017-04-14  8:07         ` Johannes Berg
2017-04-14 19:09           ` Alexei Starovoitov
2017-04-14  9:05     ` Jesper Dangaard Brouer
2017-04-14 19:28       ` Alexei Starovoitov
2017-04-14 22:18         ` Daniel Borkmann
2017-04-14 22:30         ` Jakub Kicinski
2017-04-15  0:46           ` Alexei Starovoitov
2017-04-15  1:47             ` Jakub Kicinski
2017-04-16 20:26             ` Jesper Dangaard Brouer
2017-04-17 19:49               ` David Miller
2017-04-17 23:04                 ` Alexei Starovoitov
2017-04-17 23:33                   ` Daniel Borkmann [this message]
2017-04-18 18:46                   ` David Miller
2017-04-18 23:05                     ` Alexei Starovoitov
2017-04-13  6:48 ` Michael Chan
2017-04-13 15:38   ` David Miller
2017-04-13 15:57 ` Daniel Borkmann
2017-04-13 16:04   ` David Miller
2017-04-13 17:13 ` aa5c2fd79f: net/core/dev.c:#suspicious_rcu_dereference_check()usage kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58F550DC.3050200@iogearbox.net \
    --to=daniel@iogearbox$(echo .)net \
    --cc=alexei.starovoitov@gmail$(echo .)com \
    --cc=brouer@redhat$(echo .)com \
    --cc=davem@davemloft$(echo .)net \
    --cc=kubakici@wp$(echo .)pl \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=xdp-newbies@vger$(echo .)kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox