* [PATCH v2] [net] af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
@ 2015-09-20 9:18 Aaron Conole
2015-09-20 16:46 ` Eric Dumazet
0 siblings, 1 reply; 5+ messages in thread
From: Aaron Conole @ 2015-09-20 9:18 UTC (permalink / raw)
To: netdev; +Cc: Aaron Conole
From: Aaron Conole <aaron@bytheb•org>
AF_UNIX sockets now return multiple skbs from recv() when MSG_PEEK flag
is set.
This is referenced in kernel bugzilla #12323 @
https://bugzilla.kernel.org/show_bug.cgi?id=12323
As described both in the BZ and lkml thread @
http://lkml.org/lkml/2008/1/8/444 calling recv() with MSG_PEEK on an
AF_UNIX socket only reads a single skb, where the desired effect is
to return as much skb data has been queued, until hitting the recv
buffer size (whichever comes first).
The modified MSG_PEEK path will now move to the next skb in the tree
and jump to the again: label, rather than following the natural loop
structure. This requires duplicating some of the loop head actions.
This was tested using the python socketpair python code attached to
the bugzilla issue.
Signed-off-by: Aaron Conole <aaron@bytheb•org>
---
net/unix/af_unix.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 03ee4d3..988fbbd4 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2179,9 +2179,24 @@ unlock:
if (UNIXCB(skb).fp)
scm.fp = scm_fp_dup(UNIXCB(skb).fp);
- sk_peek_offset_fwd(sk, chunk);
+ if (skip) {
+ sk_peek_offset_fwd(sk, chunk);
+ skip -= chunk;
+ }
- break;
+ if (UNIXCB(skb).fp)
+ break;
+
+ /* XXX - this is ugly; a better approach would be
+ * rewriting this function
+ */
+ last = skb;
+ last_len = skb->len;
+ unix_state_lock(&sk);
+ skb = skb_peek_next(skb, &sk->sk_receive_queue);
+ if (skb)
+ goto again;
+ goto unlock;
}
} while (size);
--
1.9.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2] [net] af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
2015-09-20 9:18 [PATCH v2] [net] af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag Aaron Conole
@ 2015-09-20 16:46 ` Eric Dumazet
2015-09-20 19:07 ` Aaron Conole
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2015-09-20 16:46 UTC (permalink / raw)
To: Aaron Conole; +Cc: netdev, Aaron Conole
On Sun, 2015-09-20 at 05:18 -0400, Aaron Conole wrote:
> From: Aaron Conole <aaron@bytheb•org>
>
> AF_UNIX sockets now return multiple skbs from recv() when MSG_PEEK flag
> is set.
>
> This is referenced in kernel bugzilla #12323 @
> https://bugzilla.kernel.org/show_bug.cgi?id=12323
>
> As described both in the BZ and lkml thread @
> http://lkml.org/lkml/2008/1/8/444 calling recv() with MSG_PEEK on an
> AF_UNIX socket only reads a single skb, where the desired effect is
> to return as much skb data has been queued, until hitting the recv
> buffer size (whichever comes first).
>
> The modified MSG_PEEK path will now move to the next skb in the tree
> and jump to the again: label, rather than following the natural loop
> structure. This requires duplicating some of the loop head actions.
>
> This was tested using the python socketpair python code attached to
> the bugzilla issue.
>
> Signed-off-by: Aaron Conole <aaron@bytheb•org>
> ---
> net/unix/af_unix.c | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 03ee4d3..988fbbd4 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -2179,9 +2179,24 @@ unlock:
> if (UNIXCB(skb).fp)
> scm.fp = scm_fp_dup(UNIXCB(skb).fp);
>
> - sk_peek_offset_fwd(sk, chunk);
> + if (skip) {
> + sk_peek_offset_fwd(sk, chunk);
> + skip -= chunk;
> + }
>
> - break;
> + if (UNIXCB(skb).fp)
> + break;
> +
> + /* XXX - this is ugly; a better approach would be
> + * rewriting this function
> + */
> + last = skb;
> + last_len = skb->len;
> + unix_state_lock(&sk);
I am wondering what this is expected to do, and how this code would
possibly not trigger a crash.
Are you 100% sure you tested this patch and code path ?
Before resending v3, please make sure to compile and test with
CONFIG_LOCKDEP=y. Add a temporary (in your tree, not final patch)
pr_err_once("went there at least one time\n");
(to make sure this code path was tested)
It might be time to get rid of unix_sk macro for a proper function to
avoid these kind of errors.
diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index 4a167b30a12f..cb1b9bbda332 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -63,7 +63,11 @@ struct unix_sock {
#define UNIX_GC_MAYBE_CYCLE 1
struct socket_wq peer_wq;
};
-#define unix_sk(__sk) ((struct unix_sock *)__sk)
+
+static inline struct unix_sock *unix_sk(struct sock *sk)
+{
+ return (struct unix_sock *)sk;
+}
#define peer_wait peer_wq.wait
Thanks.
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2] [net] af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
2015-09-20 16:46 ` Eric Dumazet
@ 2015-09-20 19:07 ` Aaron Conole
2015-09-20 19:21 ` Eric Dumazet
0 siblings, 1 reply; 5+ messages in thread
From: Aaron Conole @ 2015-09-20 19:07 UTC (permalink / raw)
To: Eric Dumazet, netdev
Eric Dumazet <eric.dumazet@gmail•com> writes:
> On Sun, 2015-09-20 at 05:18 -0400, Aaron Conole wrote:
>> From: Aaron Conole <aaron@bytheb•org>
>>
>
> I am wondering what this is expected to do, and how this code would
> possibly not trigger a crash.
Are you suspecting it should crash from a possible double-lock case?
On line 2125, there is an unconditional unlock, which should be
guaranteeing that there is no longer a condition to 'double lock' the
socket.
With my patch, I re-do a lock just before entering skb_peek_next, and
then loop to again: label (line 2078); I admit that there is a check
at the top of the loop which I do not include (the check for SOCK_DEAD).
Do you think this check is needed (and the cause for your concern on
the suspected crash)?
I will re-do the testing as you outline later, and report the results.
> Are you 100% sure you tested this patch and code path ?
Yes, 100%; I used the python code attached to the bug before hacking on
this function whatsoever to ensure that the bug still exists in current
kernel (it does). Then after my patch, I reran the same test. There
were no oops, bugs, panics, or other errors reported.
> Before resending v3, please make sure to compile and test with
> CONFIG_LOCKDEP=y. Add a temporary (in your tree, not final patch)
>
> pr_err_once("went there at least one time\n");
>
> (to make sure this code path was tested)
I will do this testing as requested; my current config does include
LOCKDEP_SUPPORT=y.
> It might be time to get rid of unix_sk macro for a proper function to
> avoid these kind of errors.
>
> diff --git a/include/net/af_unix.h b/include/net/af_unix.h
> index 4a167b30a12f..cb1b9bbda332 100644
> --- a/include/net/af_unix.h
> +++ b/include/net/af_unix.h
> @@ -63,7 +63,11 @@ struct unix_sock {
> #define UNIX_GC_MAYBE_CYCLE 1
> struct socket_wq peer_wq;
> };
> -#define unix_sk(__sk) ((struct unix_sock *)__sk)
> +
> +static inline struct unix_sock *unix_sk(struct sock *sk)
> +{
> + return (struct unix_sock *)sk;
> +}
>
> #define peer_wait peer_wq.wait
If you'd like, I'll add this to a V3 version of this patch, re-do
testing with your requested config above, and report the results.
> Thanks.
Thank you for the feedback, it is very good.
-Aaron
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] [net] af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
2015-09-20 19:07 ` Aaron Conole
@ 2015-09-20 19:21 ` Eric Dumazet
[not found] ` <87h9mo7ui9.fsf@bytheb.org>
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2015-09-20 19:21 UTC (permalink / raw)
To: Aaron Conole; +Cc: netdev
On Sun, 2015-09-20 at 15:07 -0400, Aaron Conole wrote:
> Eric Dumazet <eric.dumazet@gmail•com> writes:
>
> > On Sun, 2015-09-20 at 05:18 -0400, Aaron Conole wrote:
> >> From: Aaron Conole <aaron@bytheb•org>
> >>
> >
> > I am wondering what this is expected to do, and how this code would
> > possibly not trigger a crash.
> Are you suspecting it should crash from a possible double-lock case?
> On line 2125, there is an unconditional unlock, which should be
> guaranteeing that there is no longer a condition to 'double lock' the
> socket.
Not at all.
I am suggesting there is a big difference between
unix_state_lock(&sk);
and
unix_state_lock(sk);
Can you see it ?
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] [net] af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
[not found] ` <87h9mo7ui9.fsf@bytheb.org>
@ 2015-09-20 19:38 ` Aaron Conole
0 siblings, 0 replies; 5+ messages in thread
From: Aaron Conole @ 2015-09-20 19:38 UTC (permalink / raw)
To: Eric Dumazet, netdev
Resending, I accidentally dropped the list.
> Eric Dumazet <eric.dumazet@gmail•com> writes:
>
>> On Sun, 2015-09-20 at 15:07 -0400, Aaron Conole wrote:
>>> Eric Dumazet <eric.dumazet@gmail•com> writes:
>>>
>>> > On Sun, 2015-09-20 at 05:18 -0400, Aaron Conole wrote:
>>> >> From: Aaron Conole <aaron@bytheb•org>
>>> >>
>>> >
>>> > I am wondering what this is expected to do, and how this code would
>>> > possibly not trigger a crash.
>>> Are you suspecting it should crash from a possible double-lock case?
>>> On line 2125, there is an unconditional unlock, which should be
>>> guaranteeing that there is no longer a condition to 'double lock' the
>>> socket.
>>
>> Not at all.
>>
>> I am suggesting there is a big difference between
>>
>> unix_state_lock(&sk);
>>
>> and
>>
>> unix_state_lock(sk);
>>
>> Can you see it ?
Wow!
That's an excellent catch, thank you! I did test the originally
submitted patch, and got no oops, bug, panic, etc (I usually have
panic_on_oops set to 1 when first testing new code).
I guess I got very lucky, somehow. I'll change this, and make sure
to retest.
I will also try to enhance the python case attached to the bug to
include a filepointer as well, and will repost a v3 when I have done
this.
Thanks,
-Aaron
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-09-20 19:38 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-20 9:18 [PATCH v2] [net] af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag Aaron Conole
2015-09-20 16:46 ` Eric Dumazet
2015-09-20 19:07 ` Aaron Conole
2015-09-20 19:21 ` Eric Dumazet
[not found] ` <87h9mo7ui9.fsf@bytheb.org>
2015-09-20 19:38 ` Aaron Conole
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox