Alex Sidorenko wrote:
> Here are the values from the live kernel (obtained with 'crash') when the
> host was in the SWS state:
>
>   full_space=708 full_space/2=354
>   free_space=393
>   window=76
>
> In this case the test from my original fix, (window < full_space/2),
> succeeds. But John's test
>
>   free_space > window + full_space/2
>      393            430
>
> does not. So I suspect that the new fix will not always work. From tcpdump
> traces we can see that both hosts exchange 76-byte packets for a long
> time. From the customer's application log we see that it continues to read
> 76-byte chunks per read() call, even though more than that is available
> in the receive buffer. Technically it's OK for read() to return even after
> reading one byte, so if sk->receive_queue contains multiple 76-byte skbuffs
> we may return after processing just one skbuff (but we don't understand
> the details of why this happens on the customer's system).
>
> Are there any particular reasons why you want to postpone the window update
> until free_space becomes > window + full_space/2 and not as soon as
> free_space > full_space/2? As the only real-life occurrence of SWS shows
> free_space oscillating slightly above full_space/2, I created the fix
> specifically to match this phenomenon as seen on the customer's host. We
> reach the modified section only when (free_space > full_space/2), so it
> should be OK to update the window at this point if mss==full_space.
>
> So yes, we can test John's fix on the customer's host, but I doubt it will
> work for the reasons mentioned above. In brief:
>
> 'window = free_space' instead of 'window=full_space/2' is OK,
> but the test 'free_space > window + full_space/2' is not, for the specific
> pattern the customer sees on his hosts.

Sorry for the long delay in response; I've been on vacation.

I'm okay with your patch, and I can't think of any real problem with it,
except that the behavior is non-standard. Then again, Linux acking in
general is non-standard, which is what created the bug in the first place. :)

The only case I can think of where it might still ack too often is if
free_space frequently drops just below full_space/2 for a bit and then rises
back above full_space/2.

I've also attached a corrected version of my earlier patch that I think
solves the problem you noted.

Thanks,

-John
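
For reference, the two tests compared in the quoted message can be checked against the reported values with a small standalone sketch. This is only an illustration, not kernel code: the function names original_test and earlier_test and the enum constants are hypothetical, and only the two comparisons and the three numbers come from the message above.

```c
#include <stdbool.h>

/* Values reported from the live kernel while the host was in the SWS state. */
enum { FULL_SPACE = 708, FREE_SPACE = 393, WINDOW = 76 };

/*
 * Test from the original fix (hypothetical helper name):
 * true when the advertised window is below half of full_space.
 */
bool original_test(int window, int full_space)
{
    return window < full_space / 2;
}

/*
 * Test from the earlier patch (hypothetical helper name):
 * true only when free_space exceeds the current window by more than
 * half of full_space.
 */
bool earlier_test(int free_space, int window, int full_space)
{
    return free_space > window + full_space / 2;
}
```

With the reported values, original_test(WINDOW, FULL_SPACE) compares 76 < 354 and is true, while earlier_test(FREE_SPACE, WINDOW, FULL_SPACE) compares 393 > 430 and is false, which matches the observation in the quoted text.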