public inbox for netdev@vger.kernel.org 
 help / color / mirror / Atom feed
From: Stephen Hemminger <shemminger@vyatta•com>
To: ebiederm@xmission•com (Eric W. Biederman)
Cc: Francesco Ruggeri <fruggeri@aristanetworks•com>,
	netdev@vger•kernel.org, Eric Dumazet <eric.dumazet@gmail•com>,
	David Miller <davem@davemloft•ne>
Subject: Re: Race condition in ipv6 code
Date: Thu, 12 Jan 2012 17:57:58 -0800	[thread overview]
Message-ID: <20120112175758.773c8e85@nehalam.linuxnetplumber.net> (raw)
In-Reply-To: <m1ty40tn83.fsf@fess.ebiederm.org>

On Thu, 12 Jan 2012 17:17:32 -0800
ebiederm@xmission•com (Eric W. Biederman) wrote:

> Francesco Ruggeri <fruggeri@aristanetworks•com> writes:
> 
> > We have hit a race condition in ipv6 code when setting
> > /proc/sys/net/ipv6/conf/*/forwarding. This happens when the syscall
> > has to be restarted.
> >
> > I wonder if anyone else has run into the same issue.
> >
> > The current sequence in addrconf_sysctl_forward() and
> > addrconf_fixup_forwarding()  is as follows:
> > - change the parameter in idev->cnf.forwarding (using proc_dointvec())
> > - try to get the rtnl lock
> > - if cannot get the lock then restore the original value in
> > idev->cnf.forwarding and restart the syscall.
> >
> > While this is going on, the ipv6 code may access idev->cnf.forwarding
> > and get an incorrect value.
> > In our case we were in addrconf_ifdown (holding the rtnl lock)  and
> > calling __ipv6_ifa_notify(RTM_DELADDR, ifa) on the idev->addr_list
> > entries.
> > __ipv6_ifa_notify() only invokes addrconf_leave_anycast() if
> > idev->cnf.forwarding is set. Because a process trying to set
> > forwarding to 0 was stuck in the restart_syscall sequence above
> > flipping the flag on and off, we erroneously read the flag as 0, with
> > the result that addrconf_leave_anycast() was not invoked, some
> > idev->ac_list entries were never released, idev was never freed and
> > kept a reference to its net_device, and the net_device was never freed
> > and caused the "unregister_netdevice: waiting for xxx to become free"
> > message forever. In our case this was a vlan interfaces that was being
> > deleted, so we ended up getting stuck in vlan_ioctl_handler() holding
> > vlan_ioctl_mutex with further bad consequences.
> > The following diffs (for 2.6.38, but the same logic seems to be used
> > in 3.2) address the issue by modifying idev->cnf.forwarding only after
> > the rtnl lock is acquired. There is a similar situation for
> > disable_ipv6.
> > Any comments are appreciated.
> 
> Interesting.  So ultimately the problem is not the syscall restart
> although that exacerbates it, the problem is that we expect 
> idev->cnf.forwarding to be protected by the rtnl_lock and it is not.
> 
> At first read through your patch looks good.  I am a bit worried that
> we have some versions of the value: aka
> net->ipv6.devconf_dflt->forwarding not protected by the rtnl_lock
> and other version of the value protected by the rtnl_lock.
> 
> That just seems confusing.
> 
> We can't hold the rtnl_lock around proc_dointvec because that can sleep
> indefinitely in copy_from_user.  So it looks like your change to create
> a temporary ctl_table and call proc_dointvec seems very reasonable,
> and necessary however we do this.
> 
> I don't know if there are other places that need the rtnl_lock that
> but your patch below looks like it makes things better for all of
> the right reasons.  So on that score.
> 
> Acked-by: "Eric W. Biederman" <ebiederm@xmission•com>
> 
> Unless someone wants to volunteer to sort out the impedance mismatch
> between these tunables and the sysctl infrastructure.  I suggest
> you resend this patch to David with [PATCH] in the subject line.
> 
> I would also suggest a little clearer description why
> idev->cnf.forwarding and idev->cnf.disable_ipv6 need rntl_lock
> protection.
> 
> But overall this looks like a pretty obvious bug fix, to the
> problem that we need the rtnl_lock to protect idev->cnf.forwarding,
> and we currently allow updates to idev->cnf.forwarding without
> holding the rtnl_lock. 
> 
> Eric
> 

Looks like a better function (proc_doint_rtnl?) needs to be built
that has the locking in the right place. I.e:
  get value from user
  get lock (with restart)
  do changes
  unlock

  reply	other threads:[~2012-01-13  1:58 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-12  2:13 Race condition in ipv6 code Francesco Ruggeri
2012-01-12  6:31 ` Eric Dumazet
2012-01-12  6:44   ` David Miller
2012-01-12 20:48     ` Francesco Ruggeri
2012-01-13  0:11   ` Eric W. Biederman
2012-01-13  6:02     ` Eric Dumazet
2012-01-13  7:40       ` Eric W. Biederman
2012-01-13 17:04         ` Ben Greear
2012-01-14  5:46           ` Eric W. Biederman
2012-01-14 18:31             ` Ben Greear
2012-01-20  2:54               ` Eric W. Biederman
2012-01-13  1:17 ` Eric W. Biederman
2012-01-13  1:57   ` Stephen Hemminger [this message]
2012-01-13 22:02   ` Francesco Ruggeri

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120112175758.773c8e85@nehalam.linuxnetplumber.net \
    --to=shemminger@vyatta$(echo .)com \
    --cc=davem@davemloft$(echo .)ne \
    --cc=ebiederm@xmission$(echo .)com \
    --cc=eric.dumazet@gmail$(echo .)com \
    --cc=fruggeri@aristanetworks$(echo .)com \
    --cc=netdev@vger$(echo .)kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox