public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: ebiederm@xmission•com (Eric W. Biederman)
To: Benjamin LaHaise <bcrl@kvack•org>
Cc: rsa <ravi.mlists@gmail•com>, netdev@vger•kernel.org
Subject: Re: switching network namespace midway
Date: Thu, 01 Nov 2012 23:18:58 -0700	[thread overview]
Message-ID: <87k3u4il2l.fsf@xmission.com> (raw)
In-Reply-To: <20121102022542.GD18091@kvack.org> (Benjamin LaHaise's message of "Thu, 1 Nov 2012 22:25:43 -0400")

Benjamin LaHaise <bcrl@kvack•org> writes:

> On Thu, Oct 25, 2012 at 09:15:34AM -0700, Eric W. Biederman wrote:
>> > I've read IPv4 gre code, and it appears to do the right thing on rx, but it 
>> > does *not* appear to handle namespaces correctly on transmit.  In general, 
>> > I would expect pretty much all code to get namespace handling correct for 
>> > the rx case.  I'll have a closer look at fixing this tomorrow if nobody else 
>> > beats me to it.
>> 
>> It will be interesting to see what you come up with. 
>
> Well, I finally had some time to work on the ip_gre module a bit today, 
> and here's what I came up with.  The basic idea is to store the network 
> namespace in the ip_tunnel structure at creation time for use in sending 
> and receiving packets, allowing the gre network device to be safely moved 
> into another network namespace.  This works for me in moving a gre tunnel 
> into an lxc container, and survives module unload and namespace 
> destruction.  I'll try to spend a bit more time adding similar support to 
> the other ip_tunnel devices over the next few days.  Comments/thoughts?
>
> 		-ben

You need a per network namespace exit function to delete the tunnel when
the xmit direction goes away.  Otherwise we have a very nasty race if
the original network namespace exits.

NETNS_LOCAL may make sense on the reference device that is used to
support ioctls for creating devices.

ipgre_open ?  It looks like it needs to be handled.  Probably that
ip_route_output_gre needs to be moved.

ipv6?

Eric

> -- 
> "Thought is the essence of where you are now."
> 
> 
> diff --git a/include/net/ipip.h b/include/net/ipip.h
> index ddc077c..9cfba92 100644
> --- a/include/net/ipip.h
> +++ b/include/net/ipip.h
> @@ -19,6 +19,7 @@ struct ip_tunnel_6rd_parm {
>  struct ip_tunnel {
>  	struct ip_tunnel __rcu	*next;
>  	struct net_device	*dev;
> +	struct net		*net;		/* Namespace for packet i/o */
>  
>  	int			err_count;	/* Number of arrived ICMP errors */
>  	unsigned long		err_time;	/* Time when the last ICMP error arrived */
> diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> index 7240f8e..705dc66 100644
> --- a/net/ipv4/ip_gre.c
> +++ b/net/ipv4/ip_gre.c
> @@ -461,6 +461,7 @@ static struct ip_tunnel *ipgre_tunnel_locate(struct net *net,
>  	dev_net_set(dev, net);
>  
>  	nt = netdev_priv(dev);
> +	nt->net = net;
>  	nt->parms = *parms;
>  	dev->rtnl_link_ops = &ipgre_link_ops;
>  
> @@ -484,8 +485,10 @@ failed_free:
>  
>  static void ipgre_tunnel_uninit(struct net_device *dev)
>  {
> -	struct net *net = dev_net(dev);
> -	struct ipgre_net *ign = net_generic(net, ipgre_net_id);
> +	struct ip_tunnel *tunnel = netdev_priv(dev);
> +	struct ipgre_net *ign;
> +
> +	ign = net_generic(tunnel->net, ipgre_net_id);
>  
>  	ipgre_tunnel_unlink(ign, netdev_priv(dev));
>  	dev_put(dev);
> @@ -837,7 +840,7 @@ static netdev_tx_t ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev
>  			tos = ipv6_get_dsfield((const struct ipv6hdr *)old_iph);
>  	}
>  
> -	rt = ip_route_output_gre(dev_net(dev), &fl4, dst, tiph->saddr,
> +	rt = ip_route_output_gre(tunnel->net, &fl4, dst, tiph->saddr,
>  				 tunnel->parms.o_key, RT_TOS(tos),
>  				 tunnel->parms.link);
>  	if (IS_ERR(rt)) {
> @@ -1010,7 +1013,7 @@ static int ipgre_tunnel_bind_dev(struct net_device *dev)
>  		struct flowi4 fl4;
>  		struct rtable *rt;
>  
> -		rt = ip_route_output_gre(dev_net(dev), &fl4,
> +		rt = ip_route_output_gre(tunnel->net, &fl4,
>  					 iph->daddr, iph->saddr,
>  					 tunnel->parms.o_key,
>  					 RT_TOS(iph->tos),
> @@ -1341,7 +1344,6 @@ static void ipgre_tunnel_setup(struct net_device *dev)
>  	dev->flags		= IFF_NOARP;
>  	dev->iflink		= 0;
>  	dev->addr_len		= 4;
> -	dev->features		|= NETIF_F_NETNS_LOCAL;
>  	dev->priv_flags		&= ~IFF_XMIT_DST_RELEASE;
>  
>  	dev->features		|= GRE_FEATURES;
> @@ -1432,6 +1434,7 @@ static void ipgre_destroy_tunnels(struct ipgre_net *ign, struct list_head *head)
>  static int __net_init ipgre_init_net(struct net *net)
>  {
>  	struct ipgre_net *ign = net_generic(net, ipgre_net_id);
> +	struct ip_tunnel *tunnel;
>  	int err;
>  
>  	ign->fb_tunnel_dev = alloc_netdev(sizeof(struct ip_tunnel), "gre0",
> @@ -1445,6 +1448,9 @@ static int __net_init ipgre_init_net(struct net *net)
>  	ipgre_fb_tunnel_init(ign->fb_tunnel_dev);
>  	ign->fb_tunnel_dev->rtnl_link_ops = &ipgre_link_ops;
>  
> +	tunnel = netdev_priv(ign->fb_tunnel_dev);
> +	tunnel->net = net;
> +
>  	if ((err = register_netdev(ign->fb_tunnel_dev)))
>  		goto err_reg_dev;
>  

  reply	other threads:[~2012-11-02  6:19 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-23 17:49 switching network namespace midway rsa
2012-10-24 21:11 ` Eric W. Biederman
2012-10-24 21:21   ` Benjamin LaHaise
2012-10-25  1:37     ` Eric W. Biederman
2012-10-25 14:38       ` Benjamin LaHaise
2012-10-25 16:21         ` Stephen Hemminger
2012-10-28  5:43           ` Eric W. Biederman
2012-10-29 14:23             ` Stephen Hemminger
2012-10-30  0:21               ` Eric W. Biederman
2012-10-30  8:55                 ` James Chapman
2012-10-25 15:12     ` rsa
2012-10-25 15:29     ` rsa
2012-10-25 15:59       ` Benjamin LaHaise
2012-10-25 16:15         ` Eric W. Biederman
2012-11-02  2:25           ` Benjamin LaHaise
2012-11-02  6:18             ` Eric W. Biederman [this message]
2012-11-02 14:03               ` Benjamin LaHaise
2012-11-02 20:45                 ` Eric W. Biederman
2013-06-24 14:13                   ` [RFC PATCH net-next 0/2] sit: allow to switch netns during encap/decap Nicolas Dichtel
2013-06-24 14:13                     ` [RFC PATCH net-next 1/2] dev: introduce dev_cleanup_skb() Nicolas Dichtel
2013-06-24 18:13                       ` Ben Hutchings
2013-06-24 19:05                         ` Eric W. Biederman
2013-06-24 14:13                     ` [RFC PATCH net-next 2/2] sit: add support of x-netns Nicolas Dichtel
2013-06-24 19:28                       ` Eric W. Biederman
2013-06-24 21:11                         ` Nicolas Dichtel
2013-06-24 22:42                           ` Eric W. Biederman
2013-06-25 14:10                             ` Nicolas Dichtel
2013-06-25 14:24                               ` [PATCH v2 net-next 0/2] sit: allow to switch netns during encap/decap Nicolas Dichtel
2013-06-25 14:24                                 ` [PATCH v2 net-next 1/2] dev: introduce skb_scrub_packet() Nicolas Dichtel
2013-06-25 14:24                                 ` [PATCH v2 net-next 2/2] sit: add support of x-netns Nicolas Dichtel
2013-06-25 23:56                                   ` David Miller
2013-06-26  1:35                                     ` Eric W. Biederman
2013-06-26  5:48                                       ` David Miller
2013-06-26 10:03                                         ` Eric W. Biederman
2013-06-26 10:22                                           ` Eric Dumazet
2013-06-26 12:15                                             ` Nicolas Dichtel
2013-06-26 14:11                                               ` [PATCH v3 net-next 0/2] sit: allow to switch netns during encap/decap Nicolas Dichtel
2013-06-26 14:11                                                 ` [PATCH v3 net-next 1/2] dev: introduce skb_scrub_packet() Nicolas Dichtel
2013-06-26 14:11                                                 ` [PATCH v3 net-next 2/2] sit: add support of x-netns Nicolas Dichtel
2013-06-28  5:36                                                 ` [PATCH v3 net-next 0/2] sit: allow to switch netns during encap/decap David Miller
2013-07-03 15:00                                                   ` [PATCH net-next 0/3] ipip/ip6tnl: " Nicolas Dichtel
2013-07-03 15:00                                                     ` [PATCH net-next 1/3] sit: fix tunnel update via netlink Nicolas Dichtel
2013-07-03 15:00                                                     ` [PATCH net-next 2/3] ipip: add x-netns support Nicolas Dichtel
2013-07-03 15:00                                                     ` [PATCH net-next 3/3] ip6tnl: " Nicolas Dichtel
2013-07-04 21:56                                                     ` [PATCH net-next 0/3] ipip/ip6tnl: allow to switch netns during encap/decap David Miller
2013-08-13 15:51                                                       ` [PATCH net-next v2 0/4] " Nicolas Dichtel
2013-08-13 15:51                                                         ` [PATCH net-next v2 1/4] dev: move skb_scrub_packet() after eth_type_trans() Nicolas Dichtel
2013-08-13 15:51                                                         ` [PATCH net-next v2 2/4] ipv4 tunnels: use net_eq() helper to check netns Nicolas Dichtel
2013-08-13 15:51                                                         ` [PATCH net-next v2 3/4] ipip: add x-netns support Nicolas Dichtel
2013-08-13 15:51                                                         ` [PATCH net-next v2 4/4] ip6tnl: " Nicolas Dichtel
2013-08-15  8:01                                                         ` [PATCH net-next v2 0/4] ipip/ip6tnl: allow to switch netns during encap/decap David Miller
2013-06-26 13:49                                     ` [PATCH v2 net-next 2/2] sit: add support of x-netns Nicolas Dichtel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k3u4il2l.fsf@xmission.com \
    --to=ebiederm@xmission$(echo .)com \
    --cc=bcrl@kvack$(echo .)org \
    --cc=netdev@vger$(echo .)kernel.org \
    --cc=ravi.mlists@gmail$(echo .)com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox