Re: prog ID and next steps. Was: [RFC net-next 0/2] Introduce bpf_prog ID and iteration

public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed

From: Daniel Borkmann <daniel@iogearbox•net>
To: Hannes Frederic Sowa <hannes@stressinduktion•org>,
	Alexei Starovoitov <ast@fb•com>, Martin KaFai Lau <kafai@fb•com>,
	netdev@vger•kernel.org
Cc: kernel-team@fb•com, "David S. Miller" <davem@davemloft•net>,
	Jesper Dangaard Brouer <brouer@redhat•com>,
	John Fastabend <john.fastabend@gmail•com>,
	Thomas Graf <tgraf@suug•ch>
Subject: Re: prog ID and next steps. Was: [RFC net-next 0/2] Introduce bpf_prog ID and iteration
Date: Fri, 28 Apr 2017 20:24:02 +0200	[thread overview]
Message-ID: <590388C2.7080000@iogearbox.net> (raw)
In-Reply-To: <44cdb2d2-9f5c-5d28-2966-3e43e6d2a2ef@stressinduktion.org>

On 04/28/2017 01:50 PM, Hannes Frederic Sowa wrote:
> On 28.04.2017 03:11, Alexei Starovoitov wrote:
[...]
>> i disagree re: kallsyms. The goal of prog_tag is to let program writers
>> understand which program is running in a stable way.
>
> But exactly it doesn't let program writers do that, it just confuses them:
>
> ---
>
> jit on:
>
> perf record -e bpf_redirect -agR
>
> The unwinder walks the stack, extracts address of upper function and
> sends it to user space (perf) or handles it inside the kernel/kallsyms
> (ftrace).
>
> User takes tag of bpf program and wants to inspect related maps to the
> program. Unfortunately the tag is not unique and thus we need to expand
> the tag back to all possible programs with the same tag and expand that
> to the union of all possible maps that those programs reference again.
>
> That is what we present to the application developer. I would seriously
> be very confused.
>
> If application developer doesn't trust perf and uses instruction pointer
> value from the stack directly he can't find out which program there is,
> because fdinfo e.g. doesn't show the actual address of where the program
> is allocated. I would use /dev/kmem now.

I don't think it would be reasonable to let fdinfo unconditionally
dump the address of the program including unprivileged progs. We
probably could add a run-time check into bpf_prog_show_fdinfo() and
show it dynamically when user has cap_sys_admin.

> ---
>
> jit off:
>
> perf probe -a '__bpf_prog_run ctx insn'
> perf probe -a 'bpf_redirect flags ifindex'
> perf record -e bpf_redirect -agR
>
> Situation doesn't change. We do get the insn pointer thus have a unique
> id for the program. That's it, no further introspection. I can read
> /dev/kmem now.
>
> ---
>
> Personally I wouldn't rely on such infrastructure.
>
> My proposal would be to maybe hash a map id into the program, so instead
> of replacing the user space file descriptor with zero, take a map id
> (like discussed below) or an inode number of the map into the register
> and hash with that, so that those program have unique identifiers.

I don't think that proposal would work, f.e. placing dev + inode number
(inode itself wouldn't be sufficient either; map would also have to be
pinned as anonymous inode from fd wouldn't work) or map id into insn
won't give you out of a sudden a unique prog id, since maps can be shared
among multiple progs, but also the same prog can be attached to, say,
multiple attachment points.

> Otherwise construct kallsym entries with prog id instead of tag.
>
> I think that the hash should try to reassemble some kind of identity
> function and mapping two programs to the same tag, that do something
> completely differently is not good (based on we don't include the map).
>
> Also I do think in future the difference between non-jit and jit
> operation in regards to tracing should also be lifted. We could add a
> manual tracing point into the interpreter for reporting the same event
> as if the program was jitted.
>
> Debugging should not be that different based on the sysctl flags.

With regards to tracing it's quite useful to see whether a program was
JITed or not JITed (aka __bpf_prog_run()), so I don't think it makes
sense to e.g. have everything named __bpf_prog_run(), at least the other
way around wouldn't work for interpreter as far as I see.

But lets assume JIT is off for a moment, and you only see __bpf_prog_run().
Then, in the stack trace you'll also see related functions that call this
in the first place, for example, mlx4_en_poll_rx_cq() / mlx4_en_process_rx_cq()
in case of XDP, meaning, you get the call path context as well, for which
you later on (with the proposed infrastructure for getting fds from
attachment points + dumping them) can return the attached prog fd and
with that also dump the code or map data.

next prev parent reply	other threads:[~2017-04-28 18:24 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-27  6:24 [RFC net-next 0/2] Introduce bpf_prog ID and iteration Martin KaFai Lau
2017-04-27  6:24 ` [RFC net-next 1/2] bpf: Introduce bpf_prog ID Martin KaFai Lau
2017-05-15  8:14   ` [lkp-robot] [bpf] de05014aba: BUG:sleeping_function_called_from_invalid_context_at_mm/slab.h kernel test robot
2017-05-15 18:57     ` David Miller
2017-04-27  6:24 ` [RFC net-next 2/2] bpf: Test for bpf_prog ID and BPF_PROG_GET_NEXT_ID Martin KaFai Lau
2017-04-27  7:23   ` Alexander Alemayhu
2017-04-27 21:10     ` Martin KaFai Lau
2017-04-27 13:36 ` [RFC net-next 0/2] Introduce bpf_prog ID and iteration Hannes Frederic Sowa
2017-04-27 21:14   ` Martin KaFai Lau
2017-04-28  1:11   ` prog ID and next steps. Was: " Alexei Starovoitov
2017-04-28 11:50     ` Hannes Frederic Sowa
2017-04-28 18:24       ` Daniel Borkmann [this message]
2017-04-28 18:51         ` Hannes Frederic Sowa
2017-04-28 18:57           ` Hannes Frederic Sowa
2017-04-28 19:31       ` Alexei Starovoitov
2017-04-28 21:13         ` Hannes Frederic Sowa
2017-04-30  2:06           ` Alexei Starovoitov
2017-04-30  2:36             ` Hannes Frederic Sowa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=590388C2.7080000@iogearbox.net \
    --to=daniel@iogearbox$(echo .)net \
    --cc=ast@fb$(echo .)com \
    --cc=brouer@redhat$(echo .)com \
    --cc=davem@davemloft$(echo .)net \
    --cc=hannes@stressinduktion$(echo .)org \
    --cc=john.fastabend@gmail$(echo .)com \
    --cc=kafai@fb$(echo .)com \
    --cc=kernel-team@fb$(echo .)com \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=tgraf@suug$(echo .)ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox