From: "Toke Høiland-Jørgensen" <toke@redhat•com>
To: Jiri Pirko <jiri@resnulli•us>
Cc: Jamal Hadi Salim <jhs@mojatatu•com>,
John Fastabend <john.fastabend@gmail•com>,
Jamal Hadi Salim <hadi@mojatatu•com>,
Willem de Bruijn <willemb@google•com>,
Stanislav Fomichev <sdf@google•com>,
Jakub Kicinski <kuba@kernel•org>,
netdev@vger•kernel.org, kernel@mojatatu•com,
deb.chatterjee@intel•com, anjali.singhai@intel•com,
namrata.limaye@intel•com, khalidm@nvidia•com, tom@sipanda•io,
pratyush@sipanda•io, xiyou.wangcong@gmail•com,
davem@davemloft•net, edumazet@google•com, pabeni@redhat•com,
vladbu@nvidia•com, simon.horman@corigine•com,
stefanc@marvell•com, seong.kim@amd•com, mattyk@nvidia•com,
dan.daly@intel•com, john.andy.fingerhut@intel•com
Subject: Re: [PATCH net-next RFC 00/20] Introducing P4TC
Date: Tue, 31 Jan 2023 18:01:27 +0100
Message-ID: <87357qvdso.fsf@toke.dk>
In-Reply-To: <Y9kn6bh8z11xWsDh@nanopsycho>

Jiri Pirko <jiri@resnulli•us> writes:
> Tue, Jan 31, 2023 at 01:17:14PM CET, toke@redhat•com wrote:
>>Jamal Hadi Salim <jhs@mojatatu•com> writes:
>>
>>> Toke, i dont think i have managed to get across that there is an
>>> "autonomous" control built into the kernel. It is not just things that
>>> come across netlink. It's about the whole infra.
>>
>>I'm not disputing the need for the TC infra to configure the pipelines
>>and their relationship in the hardware. I'm saying that your
>>implementation *of the SW path* is the wrong approach and it would be
>>better done by using BPF (not talking about the existing TC-BPF,
>>either).
>>
>>It's a bit hard to know your thinking for sure here, since your patch
>>series doesn't include any of the offload control bits. But from the
>>slides and your hints in this series, AFAICT, the flow goes something
>>like:
>>
>>hw_pipeline_id = devlink_program_hardware(dev, p4_compiled_blob);
>>sw_pipeline_id = `tc p4template create ...` (etc, this is generated by P4C)
>>
>>tc_act = tc_act_create(hw_pipeline_id, sw_pipeline_id)
>>
>>which will turn into something like:
>>
>>struct p4_cls_offload ofl = {
>> .classid = classid,
>> .pipeline_id = hw_pipeline_id
>>};
>>
>>if (check_sw_and_hw_equivalence(hw_pipeline_id, sw_pipeline_id)) /* some magic check here */
>> return -EINVAL;
>>
>>netdev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_P4, &ofl);
>>
>>
>>I.e, all that's being passed to the hardware is the ID of the
>>pre-programmed pipeline, because that programming is going to be
>>out-of-band via devlink anyway.
>>
>>In which case, you could just as well replace the above:
>>
>>sw_pipeline_id = `tc p4template create ...` (etc, this is generated by P4C)
>>
>>with
>>
>>sw_pipeline_id = bpf_prog_load(BPF_PROG_TYPE_P4TC, "my_obj_file.o"); /* my_obj_file is created by P4c */
>>
>>and achieve exactly the same.
>>
>>Having all the P4 data types and concepts exist inside the kernel
>>*might* make sense if the kernel could then translate those into the
>>hardware representations and manage their lifecycle in a uniform way.
>>But as far as I can tell from the slides and what you've been saying in
>>this thread that's not going to be possible anyway, so why do you need
>>anything more granular than the pipeline ID?
>
> Toke, I understand that what you describe above is applicable for the P4
> program instantiation (pipeline definition).
>
> What is the suggestion for the actual "rule insertions" ? Would it make
> sense to use TC iface (Jamal's or similar) to insert rules to both BPF SW
> path and offloaded HW path?

Hmm, so by "rule insertions" here you're referring to populating what P4
calls 'tables', right?

I could see a couple of ways this could be bridged between the BPF side
and the HW side:

- Create a new BPF map type that is backed by the TC-internal data
  structure, so updates from userspace go via the TC interface, but BPF
  programs access the contents via the bpf_map_*() helpers (or we could
  allow updating via the bpf() syscall as well)

- Expose the TC data structures to BPF via their own set of kfuncs,
  similar to what we did for conntrack

- Scrap the TC interface entirely and make this an offload-enabled BPF
  map type (using the BPF ndo and bpf_map_dev_ops operations to update
  it). Userspace would then populate it via the bpf() syscall like any
  other map.
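
To make the shape of the first and third options a bit more concrete,
here is a minimal userspace sketch of the idea they share: one backing
table, written through a control-plane path that also mirrors each entry
to hardware, and read through a separate datapath lookup. This is a model
under stated assumptions, not kernel code: every type and function name
below (p4_table, p4_table_update, demo_hw_sync and so on) is hypothetical,
and the fixed-size array merely stands in for whatever data structure TC
or a new map type would actually use.

```c
/* Userspace model only. Every name below is hypothetical; none is a kernel API. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define P4_TABLE_SLOTS 16
#define DEMO_HW_MAX_ACTION 255

struct p4_entry {
	uint32_t key;     /* match key, e.g. an IPv4 destination address */
	uint32_t action;  /* action ID selected by the control plane */
	int in_use;
};

/* Stand-in for the driver hook that mirrors one entry into hardware. */
typedef int (*hw_sync_fn)(uint32_t key, uint32_t action);

struct p4_table {
	struct p4_entry slots[P4_TABLE_SLOTS];
	hw_sync_fn hw_sync;  /* NULL means software-only */
};

static unsigned int demo_hw_writes;

/* Pretend hardware that only supports a limited range of action IDs. */
static int demo_hw_sync(uint32_t key, uint32_t action)
{
	(void)key;
	if (action > DEMO_HW_MAX_ACTION)
		return -1;
	demo_hw_writes++;
	return 0;
}

/* Control-plane path: where an update arriving via TC or bpf() would land. */
static int p4_table_update(struct p4_table *t, uint32_t key, uint32_t action)
{
	struct p4_entry *slot = NULL;
	size_t i;

	assert(t != NULL);
	for (i = 0; i < P4_TABLE_SLOTS; i++) {
		struct p4_entry *e = &t->slots[i];

		if (e->in_use && e->key == key) {
			slot = e;  /* existing entry: overwrite in place */
			break;
		}
		if (!e->in_use && !slot)
			slot = e;  /* remember the first free slot */
	}
	if (!slot)
		return -1;  /* table full */
	if (t->hw_sync && t->hw_sync(key, action) != 0)
		return -2;  /* hardware refused: leave software untouched */

	slot->key = key;
	slot->action = action;
	slot->in_use = 1;
	return 0;
}

/* Datapath path: what a bpf_map_lookup_elem()-style read would resolve to. */
static const struct p4_entry *p4_table_lookup(const struct p4_table *t, uint32_t key)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < P4_TABLE_SLOTS; i++)
		if (t->slots[i].in_use && t->slots[i].key == key)
			return &t->slots[i];
	return NULL;
}
```

In this model a caller would zero-initialise a struct p4_table, set
hw_sync to demo_hw_sync (or leave it NULL for a software-only table), and
then call p4_table_update() from the control side and p4_table_lookup()
from the packet path. Committing to software only after the hardware hook
succeeds is one possible way of keeping the two views from diverging; a
real implementation might well choose differently.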

I suspect the map interface is the most straightforward to use from the
BPF side, but it might be a good idea to let what existing implementations
do (thinking of the P4->XDP compiler in particular) inform the choice?

-Toke