From: Nicolas Dichtel <nicolas.dichtel@6wind•com>
To: Mahesh Bandewar <mahesh@bandewar•net>,
David Miller <davem@davemloft•net>
Cc: Mahesh Bandewar <maheshb@google•com>,
Eric Dumazet <edumazet@google•com>,
netdev <netdev@vger•kernel.org>,
"Eric W. Biederman" <ebiederm@xmission•com>,
Cong Wang <cwang@twopensource•com>
Subject: Re: [PATCH next v2 0/7] Introduce l3_dev pointer for L3 processing
Date: Thu, 10 Mar 2016 10:47:02 +0100
Message-ID: <56E14296.5010103@6wind.com>
In-Reply-To: <1457560189-12870-1-git-send-email-mahesh@bandewar.net>
On 09/03/2016 22:49, Mahesh Bandewar wrote:
> From: Mahesh Bandewar <maheshb@google•com>
>
> One of the major requests (for enhancement) that I have received
> from various users of IPvlan in L3 mode concerns its inability to
> handle IPtables.
>
> While looking at the code and how we handle ingress, the problem
> can be attributed to the asymmetry in the way packets get processed
> for IPvlan devices configured in L3 mode. L3 mode is supposed to
> be restrictive and all the L3 decisions need to be taken for the
> traffic in master's ns. This does happen as expected for egress
> traffic; however, on ingress traffic, the IPvlan packet-handler
> changes the skb->dev and this forces the packet to be processed with
> the IPvlan slave and its associated ns. This causes the above-mentioned
> problem and a few others which are not yet reported / attempted, e.g.
> IPsec with L3 mode or even ingress routing.
>
> This could have been solved if we had a way to hand over the packet to
> the slave and its associated ns after completing the L3 phase. This is a
> non-trivial issue to fix especially looking at IPsec code.
>
> This patch series attempts to solve this problem by introducing the
> device pointer l3_dev which resides in net_device structure in the
> RX cache line. We initialize the l3_dev to self. This would mean
> there is no complex logic to when-and-how-to initialize it. Now
> the stack will use this dev pointer during the L3 phase. This should
> not alter any existing properties / behavior and also there should
> not be any additional penalties since it resides in the same RX
> cache line.
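To make the quoted proposal concrete, here is a minimal userspace sketch of a self-initialized l3_dev pointer. It is not kernel code: the reduced struct, the helpers netdev_init() and l3_device(), and the idea that an IPvlan L3-mode slave would point l3_dev at its master are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for the kernel's struct net_device. */
struct net_device {
    const char *name;
    struct net_device *l3_dev;  /* device whose context is used for L3 processing */
};

/* Initialize l3_dev to self, as the cover letter describes. */
static void netdev_init(struct net_device *dev, const char *name)
{
    assert(dev != NULL);
    dev->name = name;
    dev->l3_dev = dev;
}

/* Hypothetical helper: the device the stack would consult during the L3 phase. */
static struct net_device *l3_device(const struct net_device *dev)
{
    assert(dev != NULL && dev->l3_dev != NULL);
    return dev->l3_dev;
}
```

With this model, an ordinary device resolves to itself, so existing behavior is unchanged; only a device that explicitly repoints l3_dev (for example a slave set up with slave.l3_dev = &master) would have its L3 phase run in another device's context.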
If I understand correctly (and as Cong already said), information is leaking
between netns during the input phase. On the tx side, skb_scrub_packet() is
called, but not on the rx side. I think that's wrong. There should be an
explicit boundary.
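A rough userspace sketch of what such an explicit boundary could look like on the rx side follows. It is only loosely modelled on skb_scrub_packet(skb, xnet); struct pkt, its fields, scrub_packet() and rx_handoff() are hypothetical names for this illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced packet model; not the kernel's struct sk_buff. */
struct pkt {
    int      netns_id;       /* namespace currently owning the packet */
    uint32_t mark;           /* firewall mark set by the owning namespace */
    bool     has_dst;        /* cached routing decision attached */
    bool     has_conntrack;  /* netfilter state attached */
};

/* Drop per-namespace state; clear the mark only when crossing netns. */
static void scrub_packet(struct pkt *p, bool xnet)
{
    assert(p != NULL);
    p->has_dst = false;
    p->has_conntrack = false;
    if (xnet)
        p->mark = 0;
}

/* Hand a received packet to its destination netns through one explicit boundary. */
static void rx_handoff(struct pkt *p, int dst_netns_id)
{
    assert(p != NULL);
    scrub_packet(p, p->netns_id != dst_netns_id);
    p->netns_id = dst_netns_id;
}
```

The point of routing every rx handoff through a single function is that nothing set in the master's netns (mark, cached route, netfilter state) can reach the slave's netns by accident.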
Another small comment: a name other than l3_dev might help to avoid
confusion with the existing l3mdev.