public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Patrick McHardy <kaber@trash•net>
To: hadi@cyberus•ca
Cc: Netfilter Development Mailinglist
	<netfilter-devel@vger•kernel.org>,
	Linux Netdev List <netdev@vger•kernel.org>,
	containers@lists•linux-foundation.org,
	Ben Greear <greearb@candelatech•com>
Subject: Re: RFC: netfilter: nf_conntrack: add support for "conntrack zones"
Date: Fri, 15 Jan 2010 11:15:22 +0100	[thread overview]
Message-ID: <4B50403A.6010507@trash.net> (raw)
In-Reply-To: <1263490403.23480.109.camel@bigi>

jamal wrote:
> On Thu, 2010-01-14 at 16:37 +0100, Patrick McHardy wrote:
>> jamal wrote:
> 
>>> Agreed that this would be a main driver of such a feature.
>>> Which means that you need zones (or whatever noun other people use) to
>>> work on not just netfilter, but also routing, ipsec etc.
>> Routing already works fine. I believe IPsec should also work already,
>> but I haven't tried it.
> 
> maybe further discussion  would clarify this point..
> 
>> The zone is set based on some other criteria (in this case the
>> incoming device).
> 
> If you are using a netdev as a reference point, then I take it 
> if you add vlans should be possible to do multiple zones on a single
> physical netdev? Or is there some other way to satisfy that?

Yes, you can assign a zone to each netdev. macvlan will also work.

Using a netfilter target for the raw table might be a better choice
on second thought though, it provides more flexibility and avoids
the netfilter-specific device setting. I'll probably change that.

>>  The packets make one pass through the stack
>> to a veth device and are SNATed in POSTROUTING to non-clashing
>> addresses. 
> 
> Ok - makes sense. 
> i.e NAT would work; and policy routing as well as arp would be fine.
> Also it looks to be sufficiently useful to fit a specific use case you
> are interested in.
> But back to my question on routing, ipsec etc (and you may not be
> interested in solving this problem, but it is what i was getting to
> earlier). Lets take for example: 
> a) network tables like SAD/SPD tables: how you would separate those on a
> per-zone basis? i.e 10.0.0.1/zone1 could use different
> policy/association than 10.0.0.1/zone2

The selectors include an ifindex, which could be used to
distinguish both based on the interface.

> b) dynamic protocols (routing, IKE etc): how do you do that without 
> making both sides understand what is going on?

In case of IPsec the outer addresses are different, its only the
selectors which will have similar addresses. A keying deamon should
have no trouble with this. The ifindex would be needed in the
selectors though to make sure each policy is used for the correct
traffic.

A routing daemon is unrealistic to be used in this scenario, at
least a single one for all the overlapping networks.

>>> This is a valid concern against the namespace approach. Existing tools
>>> of course could be taught to know about namespaces - and one could
>>> argue that if you can resolve the overlap IP address issue, then you
>>> _have to_ modify user space anyways.
>> I don't think thats true. 
> 
> Refer to my statements above for an example.
> 
>> In any case its completely impractical
>> to modify every userspace tool that does something with networking
>> and potentially make complex configuration changes to have all
>> those namespaces interact nicely. 
> 
> Agreed. But the major ones like iproute2 etc could be taught. We have
> namespaces in the kernel already, over a period of time I think changing
> the user space tools would a sensible evolution.

Yes, that might be useful in any case. But I don't think it would
even work for iproute or other standalone programs, a process can't
associate to an existing namespace except through clone(). So it
needs to run as child of a process already associated with the
namespace.

>> Currently they are simply not
>> very well suited for virtualizing selected parts of networking.
> 
> My contention is that it is a lot less headache to just virtualize 
> all the network stack and then use what you want than it is to go and
> selectively changing the network objects.
> Note: if i wanted today i could run racoon on every namespace 
> unchanged and it would work or i could modify racoon to understand
> namespaces...

See above.

>> I'm not sure whether there is a typical user for overlapping
>> networks :) I know of setups with ~150 overlapping networks.
>>
>> The number of conntracks per zone doesn't matter since the
>> table is shared between all zones. network namespaces would
>> allocate 150 tables, each of the same size, which might be
>> quite large.
> 
> Thats what i was looking for ..
> So the difference, to pick the 150 zones example so as to put a number
> around it, is namespaces will consume 150.X bytes (where X is the
> overhead of a conntrack table) and you approach will be (X + 152) bytes,
> correct?
> What is the typical sizeof X?

No, to give some correct number. Assuming a conntrack table of
10MB (large, but reasonable depending on the number of connections)
we get an overhead of:

namespaces: 150 * 10MB memory use
"zones": 152 bytes increased code size

Both approaches additionally need one extra connection tracking
entry of ~300 bytes per connection that is actually handled twice.

>>> You may also wanna look as a metric at code complexity/maintainability
>>> of this scheme vs namespace (which adds zero changes to the kernel).
>> There's not a lot of complexity, its basically passing a numeric
>> identifier around in a few spots and comparing it. Something like
>> TOS handling in the routing code.
> 
> I think the challenge is whether zones will have to encroach on other
> net stack objects or not. You are already touching structure netdev...

That will go away once I add a target for classification. I completely
agree that its undesirable to add this in more spots, but this is meant
purely for being able to pass traffic through conntrack/NAT more than
once.

  reply	other threads:[~2010-01-15 10:15 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-14 14:05 RFC: netfilter: nf_conntrack: add support for "conntrack zones" Patrick McHardy
2010-01-14 15:05 ` jamal
2010-01-14 15:37   ` Patrick McHardy
2010-01-14 17:33     ` jamal
2010-01-15 10:15       ` Patrick McHardy [this message]
2010-01-15 15:19         ` jamal
2010-02-22 20:46           ` Eric W. Biederman
2010-02-22 21:55             ` jamal
2010-02-22 23:17               ` Eric W. Biederman
     [not found]                 ` <m1wry46es9.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2010-02-23 13:27                   ` jamal
2010-02-23 14:07                     ` Eric W. Biederman
2010-02-23 14:20                       ` jamal
2010-02-23 20:00                         ` Eric W. Biederman
2010-02-23 23:09                           ` jamal
2010-02-24  1:43                             ` Eric W. Biederman
2010-02-25 20:57                             ` [RFC][PATCH] ns: Syscalls for better namespace sharing control Eric W. Biederman
2010-02-25 21:31                               ` Daniel Lezcano
2010-02-25 21:49                                 ` Eric W. Biederman
2010-02-25 22:13                                   ` Daniel Lezcano
2010-02-25 22:31                                     ` Eric W. Biederman
2010-02-26 20:35                                   ` Eric W. Biederman
2010-02-25 21:46                               ` Matt Helsley
2010-02-25 21:54                                 ` Eric W. Biederman
2010-02-26  0:53                                 ` Eric W. Biederman
2010-02-26  1:09                               ` Matt Helsley
2010-02-26  1:26                                 ` Eric W. Biederman
2010-02-26  3:15                               ` [RFC][PATCH] ns: Syscalls for better namespace sharing control. v2 Eric W. Biederman
     [not found]                                 ` <m18wagy9f3.fsf_-_-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2010-03-03 20:29                                   ` Jonathan Corbet
2010-03-03 20:50                                     ` Eric W. Biederman
     [not found]                               ` <m1pr3t2fvl.fsf_-_-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2010-02-26 21:13                                 ` [RFC][PATCH] ns: Syscalls for better namespace sharing control Pavel Emelyanov
2010-02-26 21:24                                   ` Eric W. Biederman
2010-02-26 21:34                                     ` Pavel Emelyanov
2010-02-26 21:42                                       ` Eric W. Biederman
2010-02-26 21:58                                         ` Oren Laadan
2010-02-26 22:16                                           ` Eric W. Biederman
2010-02-26 22:52                                             ` Oren Laadan
2010-02-26 23:13                                               ` Eric W. Biederman
2010-02-27  8:30                                         ` Pavel Emelyanov
2010-02-27  9:04                                           ` Eric W. Biederman
     [not found]                                             ` <m1mxyvrqvk.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2010-02-27  9:21                                               ` Pavel Emelyanov
2010-02-27  9:42                                                 ` Eric W. Biederman
2010-02-27 16:16                                                   ` Pavel Emelyanov
2010-02-27 19:08                                                     ` Eric W. Biederman
2010-02-27 19:29                                                       ` Pavel Emelyanov
2010-02-27 19:44                                                         ` Eric W. Biederman
2010-02-28 22:05                                                           ` Daniel Lezcano
2010-03-01 19:24                                                             ` Eric W. Biederman
2010-03-01 21:42                                                             ` Eric W. Biederman
2010-03-02 13:10                                                               ` Cedric Le Goater
2010-03-02 15:03                                                             ` Pavel Emelyanov
2010-03-02 15:14                                                               ` Jan Engelhardt
2010-03-02 21:45                                                                 ` Eric W. Biederman
2010-03-02 21:19                                                               ` Sukadev Bhattiprolu
2010-03-02 22:13                                                                 ` Eric W. Biederman
2010-03-03  0:07                                                                   ` Sukadev Bhattiprolu
2010-03-03  0:46                                                                     ` Eric W. Biederman
2010-03-03 15:38                                                                       ` Serge E. Hallyn
2010-03-03 19:47                                                                         ` Eric W. Biederman
2010-03-04 21:45                                                                           ` Eric W. Biederman
2010-03-04 22:55                                                                             ` Jan Engelhardt
2010-03-03 16:50                                                                       ` Pavel Emelyanov
2010-03-03 20:16                                                                         ` Eric W. Biederman
2010-03-05 19:18                                                                           ` Pavel Emelyanov
2010-03-05 20:26                                                                             ` Eric W. Biederman
2010-03-06 14:47                                                                               ` Daniel Lezcano
     [not found]                                                                                 ` <4B926B1B.5070207-GANU6spQydw@public.gmane.org>
2010-03-06 20:48                                                                                   ` Eric W. Biederman
     [not found]                                                                                     ` <m1aaulyy5c.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2010-03-06 21:26                                                                                       ` Daniel Lezcano
2010-03-08  8:32                                                                                         ` Eric W. Biederman
2010-03-08 16:54                                                                                           ` Daniel Lezcano
2010-03-08 17:29                                                                                             ` Eric W. Biederman
2010-03-08 19:57                                                                                               ` Daniel Lezcano
2010-03-08 20:24                                                                                                 ` Eric W. Biederman
2010-03-08 20:42                                                                                                   ` Daniel Lezcano
2010-03-08 20:47                                                                                                     ` Eric W. Biederman
2010-03-08 21:12                                                                                                       ` Daniel Lezcano
2010-03-08 21:25                                                                                                         ` Eric W. Biederman
2010-03-08 21:49                                                                                                           ` Serge E. Hallyn
2010-03-08 22:24                                                                                                             ` Eric W. Biederman
2010-03-09 10:03                                                                                                           ` Daniel Lezcano
2010-03-09 10:13                                                                                                             ` Eric W. Biederman
2010-03-09 10:26                                                                                                               ` Daniel Lezcano
2010-03-10 21:16                                                                                                           ` Daniel Lezcano
2010-03-08 17:07                                                                                           ` Serge E. Hallyn
2010-03-08 17:35                                                                                             ` Eric W. Biederman
2010-03-08 17:47                                                                                               ` Serge E. Hallyn
2010-03-03 20:59                                                             ` Oren Laadan
2010-03-03 21:05                                                               ` Eric W. Biederman
     [not found]                                     ` <m1bpfbwuze.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2010-02-26 21:35                                       ` Pavel Emelyanov
2010-02-26 21:49                                         ` Eric W. Biederman
2010-02-23 23:49                           ` RFC: netfilter: nf_conntrack: add support for "conntrack zones" Matt Helsley
2010-02-24  1:32                             ` Eric W. Biederman
2010-02-24  1:39                               ` Serge E. Hallyn
2010-01-14 18:32   ` Ben Greear
2010-01-15 15:03     ` jamal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B50403A.6010507@trash.net \
    --to=kaber@trash$(echo .)net \
    --cc=containers@lists$(echo .)linux-foundation.org \
    --cc=greearb@candelatech$(echo .)com \
    --cc=hadi@cyberus$(echo .)ca \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=netfilter-devel@vger$(echo .)kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox