public inbox for netdev@vger.kernel.org 
* [PATCH net-next] inet: remove rcu protection on tw_net
@ 2011-12-14 14:58 Eric Dumazet
  2011-12-14 18:35 ` David Miller
  2011-12-14 20:17 ` Eric W. Biederman
  0 siblings, 2 replies; 4+ messages in thread
From: Eric Dumazet @ 2011-12-14 14:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eric W. Biederman

commit b099ce2602d806 (net: Batch inet_twsk_purge) added rcu protection
on tw_net for no obvious reason.

struct net instances are refcounted anyway, since timewait sockets escape
from RCU-protected sections. tw_net stays valid for the whole timewait
lifetime.

This also removes a lot of sparse errors.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail•com>
CC: Eric W. Biederman <ebiederm@xmission•com>
---
 include/net/inet_timewait_sock.h |   12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index e8c25b9..ba52c83 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -218,20 +218,12 @@ extern void inet_twsk_purge(struct inet_hashinfo *hashinfo,
 static inline
 struct net *twsk_net(const struct inet_timewait_sock *twsk)
 {
-#ifdef CONFIG_NET_NS
-	return rcu_dereference_raw(twsk->tw_net); /* protected by locking, */
-						  /* reference counting, */
-						  /* initialization, or RCU. */
-#else
-	return &init_net;
-#endif
+	return read_pnet(&twsk->tw_net);
 }
 
 static inline
 void twsk_net_set(struct inet_timewait_sock *twsk, struct net *net)
 {
-#ifdef CONFIG_NET_NS
-	rcu_assign_pointer(twsk->tw_net, net);
-#endif
+	write_pnet(&twsk->tw_net, net);
 }
 #endif	/* _INET_TIMEWAIT_SOCK_ */
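
For readers without the tree at hand, here is a minimal userspace sketch of what the patch leaves behind. It assumes read_pnet() and write_pnet() are a plain load and a plain store when namespaces are enabled, and that they collapse to &init_net and a no-op otherwise. SKETCH_NET_NS and struct tw_sketch are hypothetical stand-ins for CONFIG_NET_NS and struct inet_timewait_sock; this is not the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct net; only pointer identity matters here. */
struct net { int id; };

/* Hypothetical switch standing in for CONFIG_NET_NS (assumption of this sketch). */
#define SKETCH_NET_NS 1

struct net init_net = { 0 };

#if SKETCH_NET_NS
/* Assumed shape of the helpers: a plain store and a plain load, no barriers. */
static inline void write_pnet(struct net **pnet, struct net *net)
{
	*pnet = net;
}

static inline struct net *read_pnet(struct net *const *pnet)
{
	return *pnet;
}
#else
/* Without namespaces there is only one struct net, so nothing is stored. */
#define write_pnet(pnet, net)	do { (void)(net); } while (0)
#define read_pnet(pnet)		(&init_net)
#endif

/* Hypothetical, reduced stand-in for struct inet_timewait_sock. */
struct tw_sketch {
	struct net *tw_net;
};

static inline struct net *twsk_net(const struct tw_sketch *twsk)
{
	return read_pnet(&twsk->tw_net);
}

static inline void twsk_net_set(struct tw_sketch *twsk, struct net *net)
{
	write_pnet(&twsk->tw_net, net);
}
```

The point of the sketch is that the #ifdef moves out of the two accessors and into the generic helpers, so twsk_net() and twsk_net_set() become one-liners.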


* Re: [PATCH net-next] inet: remove rcu protection on tw_net
  2011-12-14 14:58 [PATCH net-next] inet: remove rcu protection on tw_net Eric Dumazet
@ 2011-12-14 18:35 ` David Miller
  2011-12-14 20:17 ` Eric W. Biederman
  1 sibling, 0 replies; 4+ messages in thread
From: David Miller @ 2011-12-14 18:35 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, ebiederm

From: Eric Dumazet <eric.dumazet@gmail•com>
Date: Wed, 14 Dec 2011 15:58:33 +0100

> commit b099ce2602d806 (net: Batch inet_twsk_purge) added rcu protection
> on tw_net for no obvious reason.
> 
> struct net instances are refcounted anyway, since timewait sockets escape
> from RCU-protected sections. tw_net stays valid for the whole timewait
> lifetime.
> 
> This also removes a lot of sparse errors.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail•com>
> CC: Eric W. Biederman <ebiederm@xmission•com>

Applied.


* Re: [PATCH net-next] inet: remove rcu protection on tw_net
  2011-12-14 14:58 [PATCH net-next] inet: remove rcu protection on tw_net Eric Dumazet
  2011-12-14 18:35 ` David Miller
@ 2011-12-14 20:17 ` Eric W. Biederman
  2011-12-14 22:28   ` Eric Dumazet
  1 sibling, 1 reply; 4+ messages in thread
From: Eric W. Biederman @ 2011-12-14 20:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev

Eric Dumazet <eric.dumazet@gmail•com> writes:

> commit b099ce2602d806 (net: Batch inet_twsk_purge) added rcu protection
> on tw_net for no obvious reason.

From that commit I see:

                sk_nulls_for_each_rcu(sk, node, &head->twchain) {
                        tw = inet_twsk(sk);
-                       if (!net_eq(twsk_net(tw), net) ||
-                           tw->tw_family != family)
+                       if ((tw->tw_family != family) ||
+                               atomic_read(&twsk_net(tw)->count))

That atomic_read is a new dereference of twsk_net in a section that is
protected only by RCU.  That seems like an obvious reason to me.

> struct net instances are refcounted anyway, since timewait sockets escape
> from RCU-protected sections. tw_net stays valid for the whole timewait
> lifetime.

What? twsk_net_set does not bump the struct net ref count.

There is that stupid hold_net/release_net over-designed debugging
thinko that makes it look like we have a refcount.  We should probably
just kill that thing.  But a timewait socket, unlike a normal socket,
does not keep a network namespace alive, which is why we have to purge
pending timewait sockets when a network namespace exits.
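
To make the lifetime rule above concrete, here is a minimal userspace model, not the kernel's inet_twsk_purge(). It assumes a singly linked list instead of the nulls hash chains, and a C11 atomic_int standing in for the count field of struct net. struct tw_sketch and tw_purge_dead() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Models only the user count of struct net. */
struct net {
	atomic_int count;
};

/* Hypothetical stand-in for a timewait socket. */
struct tw_sketch {
	struct net *tw_net;		/* borrowed pointer: no reference is taken */
	struct tw_sketch *next;
};

/*
 * Unlink every entry whose namespace has no users left and return how
 * many were unlinked. Entries whose namespace is still in use are kept,
 * mirroring the "skip if count is non-zero" test quoted from the commit.
 */
static int tw_purge_dead(struct tw_sketch **head)
{
	int purged = 0;
	struct tw_sketch **pp = head;

	while (*pp) {
		struct tw_sketch *tw = *pp;

		if (atomic_load(&tw->tw_net->count) == 0) {
			*pp = tw->next;		/* unlink; pp stays on the same slot */
			tw->next = NULL;
			purged++;
		} else {
			pp = &tw->next;
		}
	}
	return purged;
}
```

Because the entries only borrow the namespace pointer, the purge has to complete before the namespace memory is released; that ordering is what the rest of the thread argues about.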

> This also removes a lot of sparse errors.

What is sparse saying that we are doing wrong?

There may be constraints that are strong enough that we can get away
with this, but I am at least a little dubious.

Eric


> Signed-off-by: Eric Dumazet <eric.dumazet@gmail•com>
> CC: Eric W. Biederman <ebiederm@xmission•com>
> ---
>  include/net/inet_timewait_sock.h |   12 ++----------
>  1 file changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
> index e8c25b9..ba52c83 100644
> --- a/include/net/inet_timewait_sock.h
> +++ b/include/net/inet_timewait_sock.h
> @@ -218,20 +218,12 @@ extern void inet_twsk_purge(struct inet_hashinfo *hashinfo,
>  static inline
>  struct net *twsk_net(const struct inet_timewait_sock *twsk)
>  {
> -#ifdef CONFIG_NET_NS
> -	return rcu_dereference_raw(twsk->tw_net); /* protected by locking, */
> -						  /* reference counting, */
> -						  /* initialization, or RCU. */
> -#else
> -	return &init_net;
> -#endif
> +	return read_pnet(&twsk->tw_net);
>  }
>  
>  static inline
>  void twsk_net_set(struct inet_timewait_sock *twsk, struct net *net)
>  {
> -#ifdef CONFIG_NET_NS
> -	rcu_assign_pointer(twsk->tw_net, net);
> -#endif
> +	write_pnet(&twsk->tw_net, net);
>  }
>  #endif	/* _INET_TIMEWAIT_SOCK_ */


* Re: [PATCH net-next] inet: remove rcu protection on tw_net
  2011-12-14 20:17 ` Eric W. Biederman
@ 2011-12-14 22:28   ` Eric Dumazet
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2011-12-14 22:28 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: David Miller, netdev

On Wednesday, 14 December 2011 at 12:17 -0800, Eric W. Biederman wrote:
> Eric Dumazet <eric.dumazet@gmail•com> writes:
> 
> > commit b099ce2602d806 (net: Batch inet_twsk_purge) added rcu protection
> > on tw_net for no obvious reason.
> 
> From that commit I see:
> 
>                 sk_nulls_for_each_rcu(sk, node, &head->twchain) {
>                         tw = inet_twsk(sk);
> -                       if (!net_eq(twsk_net(tw), net) ||
> -                           tw->tw_family != family)
> +                       if ((tw->tw_family != family) ||
> +                               atomic_read(&twsk_net(tw)->count))
> 
> That atomic_read is a new dereference of twsk_net in a section that is
> protected only by RCU.  That seems like an obvious reason to me.
> 

Absolutely not.

As I said, twsk_net() cannot change during a tw's lifetime.

(If it could, this code would be bogus anyway; you would have to load the
pointer once and test it before dereferencing:

net = twsk_net(tw);
if (tw->tw_family != family || !net || atomic_read(&net->count))
	...
)
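
A hedged userspace rendering of that load-once pattern follows. It deliberately assumes the opposite of the argument being made, namely that tw_net can change or be cleared while a reader is looking at it. struct tw_sketch and tw_should_skip() are hypothetical names, and a C11 acquire load stands in for rcu_dereference(); it is stronger than what the kernel primitive requires.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Models only the user count of struct net. */
struct net {
	atomic_int count;
};

/* Hypothetical stand-in for a timewait socket with a mutable tw_net. */
struct tw_sketch {
	int tw_family;
	_Atomic(struct net *) tw_net;
};

/*
 * Returns true when the purge loop should skip this entry. The pointer
 * is loaded exactly once into a local, so the NULL test and the later
 * dereference are guaranteed to look at the same value.
 */
static bool tw_should_skip(struct tw_sketch *tw, int family)
{
	struct net *net = atomic_load_explicit(&tw->tw_net,
					       memory_order_acquire);

	return tw->tw_family != family || !net ||
	       atomic_load(&net->count) != 0;
}
```

If tw_net really is stable for the whole lifetime of the entry, as the patch asserts, none of this is needed and a plain load is enough.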

> > struct net instances are refcounted anyway, since timewait sockets escape
> > from RCU-protected sections. tw_net stays valid for the whole timewait
> > lifetime.
> 
> What? twsk_net_set does not bump the struct net ref count.
> 

The commit I mentioned did not change anything in this respect.

> There is that stupid hold_net/release_net over-designed debugging
> thinko that makes it look like we have a refcount.  We should probably
> just kill that thing.  But a timewait socket, unlike a normal socket,
> does not keep a network namespace alive, which is why we have to purge
> pending timewait sockets when a network namespace exits.
> 

Since you do a cleanup _before_ removing "struct net", you have an absolute
guarantee that tw_net is stable; you don't need rcu_dereference() (and the
implied smp_rmb()) at all.

Why pay the price twice, once in the super-heavy inet_twsk_purge()
function and once in every twsk_net() call?

[ By the way, the dubious use of rcu_dereference_raw() is another sign that
we don't really know which RCU invariant is respected when calling
twsk_net(tw) ]

> > This also removes a lot of sparse errors.
> 
> What is sparse saying that we are doing wrong?
> 

CONFIG_SPARSE_RCU_POINTER=y

make C=2 net/ipv4/inet_timewait_sock.o

  CHECK   net/ipv4/inet_timewait_sock.c
include/net/inet_timewait_sock.h:222:16: error: incompatible types in comparison expression (different address spaces)

[ repeat NN times ]

include/net/inet_timewait_sock.h:222:16: error: incompatible types in comparison expression (different address spaces)
include/net/inet_timewait_sock.h:222:16: error: incompatible types in comparison expression (different address spaces)
include/net/inet_timewait_sock.h:222:16: error: incompatible types in comparison expression (different address spaces)
include/net/inet_timewait_sock.h:222:16: error: too many errors



> There may be constraints that are strong enough that we can get away
> with this, but I am at least a little dubious.
> 

Thanks for the review and comments!
