From: Eric Dumazet <dada1@cosmosbay•com>
To: David Miller <davem@davemloft•net>
Cc: kchang@athenacr•com, netdev@vger•kernel.org,
cl@linux-foundation•org, bmb@athenacr•com
Subject: Re: Multicast packet loss
Date: Mon, 09 Mar 2009 07:36:57 +0100
Message-ID: <49B4B909.7050002@cosmosbay.com>
In-Reply-To: <20090308.194922.267426196.davem@davemloft.net>
David Miller wrote:
> From: Eric Dumazet <dada1@cosmosbay•com>
> Date: Sun, 08 Mar 2009 17:46:13 +0100
>
>> + if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) {
>> + if (in_softirq()) {
>> + if (!softirq_del(&sk->sk_del, sock_readable_defer))
>> + goto unlock;
>> + return;
>> + }
>
> This is interesting.
>
> I think you should make softirq_del() more flexible. Make it the
> socket's job to make sure it doesn't try to defer different
> functions, and put the onus on locking there as well.
>
> The cmpxchg() and all of this checking is just wasted work.
>
> I'd really like to get rid of that callback lock too, then we'd
> really be in business. :-)
First, thanks for your review, David.
I chose cmpxchg() because I needed some form of exclusion here.
I first added a spinlock inside "struct softirq_del", then realized
I could use cmpxchg() instead and keep the structure small. As the
synchronization is only needed at queueing time, we could instead pass
the address of a spinlock XXX to the softirq_del() call.
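To make the queueing-time exclusion concrete, here is a minimal userspace sketch, not the patch itself. It assumes C11 <stdatomic.h> in place of the kernel's cmpxchg(), a single global list in place of a per-CPU one, and hypothetical names (struct deferred, defer_once(), run_deferred(), count_wakeup()) that do not come from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct deferred;
typedef void (*deferred_fn)(struct deferred *);

/* Hypothetical stand-in for "struct softirq_del": two pointers, no lock. */
struct deferred {
	struct deferred *next;      /* link in the pending list */
	_Atomic(deferred_fn) func;  /* NULL while idle, set while queued */
};

/* Stand-in for a per-CPU pending list; this sketch is single-threaded. */
static struct deferred *pending_head;

/*
 * Queue d so that func runs later. Returns 1 if this call queued it,
 * 0 if it was already queued. The compare-and-swap on d->func is the
 * only exclusion: one caller at most can move it from NULL to non-NULL.
 */
static int defer_once(struct deferred *d, deferred_fn func)
{
	deferred_fn expected = NULL;

	if (!atomic_compare_exchange_strong(&d->func, &expected, func))
		return 0;
	d->next = pending_head;
	pending_head = d;
	return 1;
}

/* Run and clear every pending entry; returns how many callbacks ran. */
static int run_deferred(void)
{
	int ran = 0;

	while (pending_head) {
		struct deferred *d = pending_head;
		deferred_fn idle = NULL;
		deferred_fn func = atomic_exchange(&d->func, idle);

		pending_head = d->next;
		assert(func != NULL);   /* a queued entry always carries a callback */
		func(d);
		ran++;
	}
	return ran;
}

/* Example callback: counts how many deferred wakeups were delivered. */
static int wakeups;

static void count_wakeup(struct deferred *d)
{
	(void)d;
	wakeups++;
}
```

Calling defer_once() twice on the same entry before run_deferred() queues it only once, which is the property the cmpxchg() is there to provide; a spinlock passed by the caller would give the same guarantee at the cost of an extra argument.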
Also, once an event was queued for later invocation, I needed to keep
a reference on "struct socket" to make sure it does not disappear before
the invocation. Not all sockets are RCU guarded (we added RCU only for
some protocols: TCP, UDP ...). So I found that keeping a read_lock
on the callback lock was the easiest thing to do. I now realize we might
overflow preempt_count, so special care is needed.
About your first point, maybe we should make softirq_del() (poor name, I confess)
take only one argument (a pointer to struct softirq_del), and initialize
the function pointer at socket init time. That would ensure "struct softirq_del"
is associated with one callback only. The cmpxchg() test would then have to be
done on the "next" field (or use the spinlock XXX).
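A rough userspace sketch of that one-argument variant follows, again with C11 atomics standing in for cmpxchg() and with hypothetical names throughout (struct defer_node, defer_node_init(), defer_node_queue(), defer_run_all(), note_call()). One assumption is added that the text above does not state: the list ends in a non-NULL sentinel, so that a NULL "next" can mean "not queued" even for the last entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct defer_node;
typedef void (*defer_fn)(struct defer_node *);

/* Hypothetical one-callback variant of "struct softirq_del". */
struct defer_node {
	_Atomic(struct defer_node *) next; /* NULL = idle, non-NULL = queued */
	defer_fn func;                     /* set once at init time */
};

/* Non-NULL end-of-list marker (an assumption of this sketch). */
static struct defer_node list_end;
static struct defer_node *pending = &list_end;

static void defer_node_init(struct defer_node *n, defer_fn func)
{
	atomic_init(&n->next, NULL);
	n->func = func;
}

/* Returns 1 if this call queued n, 0 if n was already pending. */
static int defer_node_queue(struct defer_node *n)
{
	struct defer_node *expected = NULL;

	/* Claim the node: only one caller can move next away from NULL. */
	if (!atomic_compare_exchange_strong(&n->next, &expected, pending))
		return 0;
	pending = n;
	return 1;
}

/* Run every pending callback; returns how many ran. */
static int defer_run_all(void)
{
	int ran = 0;

	while (pending != &list_end) {
		struct defer_node *n = pending;
		struct defer_node *idle = NULL;

		pending = atomic_load(&n->next);
		atomic_store(&n->next, idle);   /* idle again before the callback */
		assert(n->func != NULL);
		n->func(n);
		ran++;
	}
	return ran;
}

/* Example callback: counts invocations. */
static int calls;

static void note_call(struct defer_node *n)
{
	(void)n;
	calls++;
}
```

With the callback fixed at init time, the queueing call needs no function argument and cannot be asked to defer two different functions through the same structure.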
I am not sure the output path needs such tricks, since threads rarely
block on output: we don't trigger 400,000 wakeups per second there, do we?
Another point: I did a tbench test and got 2517 MB/s with the patch,
instead of 2538 MB/s (using Linus' 2.6 git tree); that's a ~0.8% regression
for this workload.