public inbox for netdev@vger.kernel.org 
From: Bill Fink <billfink@mindspring•com>
To: Jeremy Jackson <jerj@coplanar•net>
Cc: ilpo.jarvinen@helsinki•fi, Evgeniy Polyakov <zbr@ioremap•net>,
	bert hubert <bert.hubert@netherlabs•nl>,
	"H. Willstrand" <h.willstrand@gmail•com>,
	Netdev <netdev@vger•kernel.org>
Subject: Re: sendfile()? Re: SO_LINGER dead: I get an immediate RST on 2.6.24?
Date: Fri, 20 Feb 2009 13:10:46 -0500
Message-ID: <20090220131046.46e3af16.billfink@mindspring.com>
In-Reply-To: <1234544555.28913.451.camel@ragnarok>

On Fri, 13 Feb 2009, Jeremy Jackson wrote:

> On Tue, 2009-01-13 at 00:31 -0500, Bill Fink wrote:
> > On Mon, 12 Jan 2009, Ilpo Järvinen wrote:
> > 
> > > On Sun, 11 Jan 2009, Bill Fink wrote:
> > > 
> > > > On Mon, 12 Jan 2009, Evgeniy Polyakov wrote:
> > > > 
> > > > > On Mon, Jan 12, 2009 at 12:08:24AM +0100, bert hubert (bert.hubert@netherlabs•nl) wrote:
> > > > > > I fully understand. Sometimes I have to talk to stupid devices though. What
> 
> An excellent article on this subject:
> 
> http://ds9a.nl/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable.txt
> 
> "Luckily, it turns out that Linux keeps track of the amount of
> unacknowledged data, which can be queried using the SIOCOUTQ ioctl().
> Once we see this number hit 0, we can be reasonably sure our data
> reached at least the remote operating system."
> 
> is this the same as the TCP_INFO getsockopt() ?

If you mean the tcpi_unacked field of struct tcp_info, then no, it is
not the same as the SIOCOUTQ info.

> if you follow the progression from write(socket_fd, ) ... the data sits
> in the socket buffer, and SIOCOUTQ is initially zero.  If the connection
> started with a zero window, it could sit like that for a while
> (sometimes called a "tarpit"?).  But, you should still see the data in
> your socket buffer, yes?
> 
> So, I think you want to make sure your socket write buffer is empty
> (converted to unacked data), *then* make sure your unacked data is 0.
> 
> 	write(sock, buffer, 1000000);             // returns 1000000
> 	shutdown(sock, SHUT_WR);
> 	now wait for SIOCOUTQ to hit 0.
> 
> if window is 0, shutdown() would wait until slow device sets window > 0
> again, or forever on a tarpitted connection.  Either way, if/when
> it finishes, you know all data was transmitted, now wait for all of it
> to be ACKed with SIOCOUTQ.

While the "shutdown(sock, SHUT_WR)" might be useful, it isn't actually
necessary, since the SIOCOUTQ info includes both unACKed data (which is
what the tcpi_unacked field reports, as a packet count) and never-sent
data (written by the app but outside of the receiver's allowed window).

						-Bill



> > > > > > I do find is the TCP_INFO ioctl, which offers this field in struct tcp_info:
> > > > > > 
> > > > > >         __u32   tcpi_unacked;
> > > > > > 
> > > > > > Which comes from:
> > > > > > 
> > > > > > struct tcp_sock {
> > > > > > ...
> > > > > >         u32     packets_out;    /* Packets which are "in flight"        */
> > > > > > ...
> > > > > > }
> > > > > > 
> > > > > > If this becomes 0, perhaps this might tell me everything I sent was acked?
> > > > > 
> > > > > 0 means that there are no in-flight packets, which is effectively number
> > > > > of unacked packets. So if your application waits for this field to
> > > > > become zero, it will wait for all sent packets to be acked.
> > > > 
> > > > I use this type of strategy in nuttcp, and it seems to work fine.
> > > > I have a loop with a small delay and a check of tcpi_unacked, and
> > > > break out of the loop if tcpi_unacked becomes 0 or a defined timeout
> > > > period has passed.
> > > 
> > > Checking tcpi_unacked alone won't be reliable. The peer might be slow 
> > > enough to advertize zero window for a short period of time and during 
> > > that period you would have packets_out zero...
> > 
> > I'll keep this in mind for the future, although it doesn't seem to
> > be a significant issue in practice.  I use this scheme to try and
> > account for the tcpi_total_retrans for the data stream, so if this
> > corner case was hit, it would mean an under reporting of the total
> > TCP retransmissions for the nuttcp test.
> > 
> > If I understand you correctly, to hit this corner case, just after
> > the final TCP write, there would have to be no packets in flight
> > together with a zero TCP window.  To make it more bullet-proof, I
> > guess after seeing a zero tcpi_unacked, an additional small delay
> > should be performed, and then rechecking for a zero tcpi_unacked.
> > I don't see anything else obvious (to me anyway) in the tcp_info
> > that would be particularly helpful in handling this.
> 
> -- 
> Jeremy Jackson
> Coplanar Networks
> (519)489-4903
> http://www.coplanar.net
> jerj@coplanar•net
