From: Pavel Emelyanov <xemul@parallels•com>
To: Eric Dumazet <eric.dumazet@gmail•com>
Cc: Linux Netdev List <netdev@vger•kernel.org>,
David Miller <davem@davemloft•net>
Subject: Re: [PATCH 4/6] tcp: Repair socket queues
Date: Thu, 03 May 2012 12:59:16 +0400
Message-ID: <4FA248E4.7060501@parallels.com>
In-Reply-To: <1335957064.22133.428.camel@edumazet-glaptop>
On 05/02/2012 03:11 PM, Eric Dumazet wrote:
> On Thu, 2012-04-19 at 17:41 +0400, Pavel Emelyanov wrote:
>> Reading queues under repair mode is done with the recvmsg call.
>> The queue-under-repair set by the TCP_REPAIR_QUEUE option is used
>> to determine which queue should be read. Thus both the send and
>> the receive queue can be read this way.
>>
>> Caller must pass the MSG_PEEK flag.
>>
>> Writing to queues is done with sendmsg call and yet again --
>> the repair-queue option can be used to push data into the
>> receive queue.
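
A minimal user-space sketch of the read and write paths described above, assuming the socket has already been switched into repair mode by a caller holding the required capability. The helper names (sketch_set_repair, sketch_peek_queue, sketch_fill_queue) and the SKETCH_* queue ids are hypothetical; the numeric fallbacks for TCP_REPAIR and TCP_REPAIR_QUEUE and the queue ids are assumed to match the kernel's linux/tcp.h.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Fallbacks for older libc headers; values assumed to match linux/tcp.h. */
#ifndef TCP_REPAIR
#define TCP_REPAIR 19
#endif
#ifndef TCP_REPAIR_QUEUE
#define TCP_REPAIR_QUEUE 20
#endif

/* Hypothetical names; values assumed to mirror TCP_RECV_QUEUE/TCP_SEND_QUEUE. */
enum sketch_queue { SKETCH_RECV_QUEUE = 1, SKETCH_SEND_QUEUE = 2 };

/* Turn repair mode on (1) or off (0). Returns 0 on success, -1 on error. */
int sketch_set_repair(int fd, int on)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_REPAIR, &on, sizeof(on));
}

/* Select the queue under repair, then read it without consuming the data. */
ssize_t sketch_peek_queue(int fd, int queue, void *buf, size_t len)
{
	if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR_QUEUE, &queue, sizeof(queue)) < 0)
		return -1;
	/* The patch description requires MSG_PEEK for reads in repair mode. */
	return recv(fd, buf, len, MSG_PEEK);
}

/* Select the queue under repair, then push data into it with a plain send. */
ssize_t sketch_fill_queue(int fd, int queue, const void *buf, size_t len)
{
	if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR_QUEUE, &queue, sizeof(queue)) < 0)
		return -1;
	return send(fd, buf, len, 0);
}
```

Every helper returns -1 with errno set when the underlying call fails, for example on an invalid descriptor or when the caller lacks the privilege to use repair mode.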
>>
>> When putting an skb into the receive queue, a zeroed tcp header
>> is prepended to it to satisfy the tcp_hdr(skb)->syn and ->fin
>> checks done by tcp_recvmsg after repair. Both flags are set to
>> zero, for the reasons below.
>>
>> The fin cannot be met in the queue while reading the source
>> socket, since the repair only works for closed/established
>> sockets and queueing fin packet always changes its state.
>>
>> The syn in the queue denotes that the respective skb's seq
>> is "off-by-one" as compared to the actual payload length. Thus,
>> at the rcv queue refill we can just drop this flag and set the
>> skb's sequences to precise values.
>>
>> When the repair mode is turned off, the write queue seqs are
>> updated so that the whole queue is considered to be 'already sent,
>> waiting for ACKs' (snd_una <= snd_nxt = write_seq). From the
>> protocol POV the send queue looks like it was sent, but the data
>> between the snd_una and snd_nxt is lost in the network.
>>
>> This avoids the need for another sockoption to set the snd_nxt
>> sequence. Leaving the whole queue in a 'not yet sent' state (as
>> it will be after sendmsg-s) would make it impossible to receive
>> any acks from the peer, since the ack_seq would be after the
>> snd_nxt. Thus even the ack for the window probe would be dropped
>> and the connection would be 'locked' with the zero peer window.
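
The difference between the two queue states can be illustrated with a small stand-alone model. This is only a sketch under simplifying assumptions: the struct, the helper names and the acceptance rule (an ACK is taken only if it lies between snd_una and snd_nxt, inclusive) are illustrative and do not reproduce the kernel's actual ACK processing. The wrap-safe comparison follows the usual signed-difference idiom.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe 32-bit sequence comparison using a signed difference. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static bool seq_after(uint32_t a, uint32_t b)
{
	return seq_before(b, a);
}

/* Hypothetical, simplified view of the sender-side sequence state. */
struct sketch_seqs {
	uint32_t snd_una;   /* oldest unacknowledged byte */
	uint32_t snd_nxt;   /* next byte to be sent */
	uint32_t write_seq; /* end of data queued by the application */
};

/* Simplified rule: accept an ACK only inside [snd_una, snd_nxt]. */
bool sketch_ack_acceptable(const struct sketch_seqs *s, uint32_t ack)
{
	return !seq_after(ack, s->snd_nxt) && !seq_before(ack, s->snd_una);
}
```

With the queue left as 'not yet sent' (snd_nxt equal to snd_una while write_seq is ahead), an ACK for any restored byte falls after snd_nxt and is rejected by this model; once snd_nxt is moved up to write_seq, the same ACK is accepted.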
>>
>> Signed-off-by: Pavel Emelyanov <xemul@parallels•com>
>> ---
>> net/ipv4/tcp.c | 89 +++++++++++++++++++++++++++++++++++++++++++++++--
>> net/ipv4/tcp_output.c | 1 +
>> 2 files changed, 87 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>> index e38d6f2..47e2f49 100644
>> --- a/net/ipv4/tcp.c
>> +++ b/net/ipv4/tcp.c
>> @@ -912,6 +912,39 @@ static inline int select_size(const struct sock *sk, bool sg)
>>  	return tmp;
>>  }
>>
>> +static int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
>> +{
>> +	struct sk_buff *skb;
>> +	struct tcp_skb_cb *cb;
>> +	struct tcphdr *th;
>> +
>> +	skb = alloc_skb(size + sizeof(*th), sk->sk_allocation);
>
> I am not sure any check is performed on 'size' ?
No, no checks here.
> A caller might trigger OOM or wrap bug.
Well, yes, but this ability is given to CAP_NET_ADMIN users only.
Do you think it's nonetheless worth accounting this allocation into
the socket's rmem?
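
For illustration, the kind of check being discussed could look roughly like the stand-alone sketch below. It is not taken from the patch: the function name, the SKETCH_MAX_REPAIR_CHUNK cap and its value are assumptions chosen only to show a bounded, wrap-free computation of size plus header length before an allocation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary illustrative cap on a single repair write; not a kernel value. */
#define SKETCH_MAX_REPAIR_CHUNK ((size_t)64 * 1024)

/*
 * Compute payload + hdr for an allocation request.
 * Returns false (leaving *out untouched) if the payload exceeds the cap
 * or if the sum would wrap around SIZE_MAX.
 */
bool sketch_rcvq_alloc_size(size_t payload, size_t hdr, size_t *out)
{
	if (payload > SKETCH_MAX_REPAIR_CHUNK)
		return false;
	if (hdr > SIZE_MAX - payload) /* addition would wrap */
		return false;
	*out = payload + hdr;
	return true;
}
```

The cap addresses the OOM concern and the second test addresses the wrap concern; accounting the allocation against the socket's receive memory would be a separate, kernel-side decision that this sketch does not model.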
Thanks,
Pavel
Thread overview: 13+ messages
2012-04-19 13:38 [PATCH net-next 0/6] TCP connection repair (v4) Pavel Emelyanov
2012-04-19 13:39 ` [PATCH 1/6] sock: Introduce named constants for sk_reuse Pavel Emelyanov
2012-04-19 13:40 ` [PATCH 2/6] tcp: Move code around Pavel Emelyanov
2012-04-19 13:40 ` [PATCH 3/6] tcp: Initial repair mode Pavel Emelyanov
2012-04-19 13:41 ` [PATCH 4/6] tcp: Repair socket queues Pavel Emelyanov
2012-05-02 11:11 ` Eric Dumazet
2012-05-03 8:59 ` Pavel Emelyanov [this message]
2012-05-03 9:08 ` Eric Dumazet
2012-05-03 9:15 ` Pavel Emelyanov
2012-05-03 9:31 ` David Miller
2012-04-19 13:41 ` [PATCH 5/6] tcp: Report mss_clamp with TCP_MAXSEG option in repair mode Pavel Emelyanov
2012-04-19 13:41 ` [PATCH 6/6] tcp: Repair connection-time negotiated parameters Pavel Emelyanov
2012-04-21 19:53 ` [PATCH net-next 0/6] TCP connection repair (v4) David Miller