public inbox for netdev@vger.kernel.org 
From: Eric Dumazet <dada1@cosmosbay.com>
To: Stephen Hemminger <shemminger@vyatta.com>
Cc: David Miller <davem@davemloft.net>, netdev@vger.kernel.org
Subject: Re: [PATCH 5/5] netfilter: convert x_tables to use RCU
Date: Fri, 30 Jan 2009 07:53:09 +0100
Message-ID: <4982A3D5.3030701@cosmosbay.com>
In-Reply-To: <20090129151624.37dce05e@extreme>

Stephen Hemminger wrote:
> On Fri, 30 Jan 2009 00:04:16 +0100
> Eric Dumazet <dada1@cosmosbay.com> wrote:
> 
>> Stephen Hemminger wrote:
>>> Replace existing reader/writer lock with Read-Copy-Update to
>>> eliminate the overhead of a read lock on each incoming packet.
>>> This should reduce the overhead of iptables especially on SMP
>>> systems.
>>>
>>> The previous code used a reader-writer lock for two purposes.
>>> The first was to ensure that the xt_table_info reference was not in the
>>> process of being changed. Since xt_table_info is only freed via one
>>> routine, it was a direct conversion to RCU.
>>>
>>> The other use of the reader-writer lock was to block changes
>>> to counters while they were being read. This synchronization was
>>> fixed by the previous patch, but we still need to make sure the
>>> table info isn't going away.
>>>
>>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>>>
>>>
>>> ---
>>>  include/linux/netfilter/x_tables.h |   10 ++++++-
>>>  net/ipv4/netfilter/arp_tables.c    |   12 ++++-----
>>>  net/ipv4/netfilter/ip_tables.c     |   12 ++++-----
>>>  net/ipv6/netfilter/ip6_tables.c    |   12 ++++-----
>>>  net/netfilter/x_tables.c           |   48 ++++++++++++++++++++++++++-----------
>>>  5 files changed, 60 insertions(+), 34 deletions(-)
>>>
>>> --- a/include/linux/netfilter/x_tables.h	2009-01-28 22:04:39.316517913 -0800
>>> +++ b/include/linux/netfilter/x_tables.h	2009-01-28 22:14:54.648490491 -0800
>>> @@ -352,8 +352,8 @@ struct xt_table
>>>  	/* What hooks you will enter on */
>>>  	unsigned int valid_hooks;
>>>  
>>> -	/* Lock for the curtain */
>>> -	rwlock_t lock;
>>> +	/* Lock for curtain */
>>> +	spinlock_t lock;
>>>  
>>>  	/* Man behind the curtain... */
>>>  	struct xt_table_info *private;
>>> @@ -386,6 +386,12 @@ struct xt_table_info
>>>  	/* Secret compartment */
>>>  	seqcount_t *seq;
>>>  
>>> +	/* For the dustman... */
>>> +	union {
>>> +		struct rcu_head rcu;
>>> +		struct work_struct work;
>>> +	};
>>> +
>>>  	/* ipt_entry tables: one per CPU */
>>>  	/* Note : this field MUST be the last one, see XT_TABLE_INFO_SZ */
>>>  	char *entries[1];
>>> --- a/net/ipv4/netfilter/arp_tables.c	2009-01-28 22:13:16.423490077 -0800
>>> +++ b/net/ipv4/netfilter/arp_tables.c	2009-01-28 22:14:54.648490491 -0800
>>> @@ -238,8 +238,8 @@ unsigned int arpt_do_table(struct sk_buf
>>>  	indev = in ? in->name : nulldevname;
>>>  	outdev = out ? out->name : nulldevname;
>>>  
>>> -	read_lock_bh(&table->lock);
>>> -	private = table->private;
>>> +	rcu_read_lock_bh();
>>> +	private = rcu_dereference(table->private);
>>>  	table_base = (void *)private->entries[smp_processor_id()];
>>>  	seq = per_cpu_ptr(private->seq, smp_processor_id());
>>>  	e = get_entry(table_base, private->hook_entry[hook]);
>>> @@ -315,7 +315,7 @@ unsigned int arpt_do_table(struct sk_buf
>>>  			e = (void *)e + e->next_offset;
>>>  		}
>>>  	} while (!hotdrop);
>>> -	read_unlock_bh(&table->lock);
>>> +	rcu_read_unlock_bh();
>>>  
>>>  	if (hotdrop)
>>>  		return NF_DROP;
>>> @@ -1163,8 +1163,8 @@ static int do_add_counters(struct net *n
>>>  		goto free;
>>>  	}
>>>  
>>> -	write_lock_bh(&t->lock);
>>> -	private = t->private;
>>> +	rcu_read_lock_bh();
>>> +	private = rcu_dereference(t->private);
>>>  	if (private->number != num_counters) {
>>>  		ret = -EINVAL;
>>>  		goto unlock_up_free;
>>> @@ -1179,7 +1179,7 @@ static int do_add_counters(struct net *n
>>>  			   paddc,
>>>  			   &i);
>>>   unlock_up_free:
>>> -	write_unlock_bh(&t->lock);
>>> +	rcu_read_unlock_bh();
>>>  	xt_table_unlock(t);
>>>  	module_put(t->me);
>>>   free:
>>> --- a/net/ipv4/netfilter/ip_tables.c	2009-01-28 22:06:10.596739805 -0800
>>> +++ b/net/ipv4/netfilter/ip_tables.c	2009-01-28 22:14:54.648490491 -0800
>>> @@ -348,9 +348,9 @@ ipt_do_table(struct sk_buff *skb,
>>>  	mtpar.family  = tgpar.family = NFPROTO_IPV4;
>>>  	tgpar.hooknum = hook;
>>>  
>>> -	read_lock_bh(&table->lock);
>>> +	rcu_read_lock_bh();
>>>  	IP_NF_ASSERT(table->valid_hooks & (1 << hook));
>>> -	private = table->private;
>>> +	private = rcu_dereference(table->private);
>>>  	table_base = (void *)private->entries[smp_processor_id()];
>>>  	seq = per_cpu_ptr(private->seq, smp_processor_id());
>>>  	e = get_entry(table_base, private->hook_entry[hook]);
>>> @@ -449,7 +449,7 @@ ipt_do_table(struct sk_buff *skb,
>>>  		}
>>>  	} while (!hotdrop);
>>>  
>>> -	read_unlock_bh(&table->lock);
>>> +	rcu_read_unlock_bh();
>>>  
>>>  #ifdef DEBUG_ALLOW_ALL
>>>  	return NF_ACCEPT;
>>> @@ -1408,8 +1408,8 @@ do_add_counters(struct net *net, void __
>>>  		goto free;
>>>  	}
>>>  
>>> -	write_lock_bh(&t->lock);
>>> -	private = t->private;
>>> +	rcu_read_lock_bh();
>>> +	private = rcu_dereference(t->private);
>> I feel a little bit nervous seeing a write_lock_bh() changed to an rcu_read_lock()
> 
> In fact, it is only updating entries on the current CPU

Yes, as is done in ipt_do_table() ;)

The fact is, we need to tell other threads running on other CPUs that an
update of our entries is in progress.
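
The requirement above can be illustrated with a rough userspace sketch of a
sequence-counter protocol built on C11 atomics. This is not kernel code and
not what the patch does: struct demo_counters, demo_add_counters() and
demo_read_counters() are hypothetical names invented for the example, and the
kernel would presumably use seqcount_t with write_seqcount_begin() and
read_seqcount_retry() rather than open-coded atomics. The sketch assumes a
single writer per counter pair (the owning CPU) and any number of readers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one CPU's packet/byte counter pair. */
struct demo_counters {
	atomic_uint seq;		/* even: stable, odd: update running */
	_Atomic uint64_t packets;
	_Atomic uint64_t bytes;
};

/* Writer: assumed to be called only by the owning CPU (single writer). */
static void demo_add_counters(struct demo_counters *c,
			      uint64_t packets, uint64_t bytes)
{
	unsigned int s = atomic_load_explicit(&c->seq, memory_order_relaxed);

	assert((s & 1u) == 0);	/* no update may already be in progress */

	/* An odd value tells readers on other CPUs an update is running. */
	atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&c->packets,
		atomic_load_explicit(&c->packets, memory_order_relaxed) + packets,
		memory_order_relaxed);
	atomic_store_explicit(&c->bytes,
		atomic_load_explicit(&c->bytes, memory_order_relaxed) + bytes,
		memory_order_relaxed);

	/* Back to an even value: the pair is consistent again. */
	atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader: may run on any CPU; retries until it gets a stable snapshot. */
static void demo_read_counters(struct demo_counters *c,
			       uint64_t *packets, uint64_t *bytes)
{
	unsigned int before, after;

	do {
		before = atomic_load_explicit(&c->seq, memory_order_acquire);
		*packets = atomic_load_explicit(&c->packets, memory_order_relaxed);
		*bytes = atomic_load_explicit(&c->bytes, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&c->seq, memory_order_relaxed);
	} while ((before & 1u) || before != after);
}
```

Under these assumptions, a reader such as the iptables -L path would call
demo_read_counters() once per CPU and sum the results; it never blocks the
writer, it only retries when it overlaps an update.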

Let me check if your v4 and xt_counters abstraction already solved this problem.

> 
>> Also, add_counter_to_entry() is not using seqcount protection, so another thread
>> doing an iptables -L in parallel with this thread will possibly get corrupted counters.
> add_counter_to_entry is local to current CPU.
> 
> 
>> (With write_lock_bh(), this corruption could not occur)
>>
>>
> --


