* r8169 on 3.8.13, 3.9.2, 3.10-rc1, was Re: [ 00/73] 3.8.13-stable review
From: Ken Moffat @ 2013-05-15 0:07 UTC
To: Holger Hoffstaette; +Cc: linux-kernel, stable, netdev

On Fri, May 10, 2013 at 12:54:27PM +0200, Holger Hoffstaette wrote:

 Cc'ing to netdev because I don't think this has had a response, and
I care because I *might* be seeing the same problem on both 3.9.2 and
3.10-rc1, but my take on the problem is slightly different [ details
after Holger's posting ]

> On Thu, 09 May 2013 15:31:23 -0700, Greg Kroah-Hartman wrote:
>
> > This is the start of the stable review cycle for the 3.8.13 release. There
>
> This patchset broke my internet, with all sorts of weird effects like
> Samba clients having problems to talk to the server and only partially
> working DNS resolution (CDNs broken, Amazon unreachable).
>
> After two reboots to/from .12/.13 (to rule out temporary internet
> brokenness) the problem has been identified as:
>
> > Stefan Bader <stefan.bader@canonical•com>
> >     r8169: fix 8168evl frame padding.
>
> After reverting only this patch (turning r8169 back to 3.8.12) things
> again behave as expected with the rest of .13. So far no other regressions
> detected.
>
> This patch should probably be removed from 3.9.2-rc as well.
>
> -h
>

 For me, I'm seeing what might be a similar problem in about 2/5 of
my boots of 3.9.2 and 3.10-rc1 : in my case, r8169 is a module [ most
things are built in ], I use dhclient to get an ip address, and I
have separate nfs shares for /sources [ in /etc/fstab ] and my user's
~/notes [ mounted from ~/.bashrc ].

 On a good boot, everything mounts. On a bad boot, /sources is NOT
mounted because eth0 is not up, but by the time anyone logs in it _is_
up so mounting ~/notes [and manually mounting /sources as root] works.

 What seems to be happening is that eth0 is coming up slightly later
on some occasions (about 11 seconds from booting, instead of 10.5
seconds) and somehow the dhclient script seems to have ended *before*
that.

 For me, this isn't bisectable (so far, 10 boots of 3.9.2 and
3.10-rc1 on this box, and 4 were problematic). On this box, 3.9.0
itself was perfect.

 Holger : apologies if I've hijacked this thread with what turns out
to be a different problem.

ken
-- 
das eine Mal als Tragödie, das andere Mal als Farce

* Re: r8169 on 3.8.13, 3.9.2, 3.10-rc1, was Re: [ 00/73] 3.8.13-stable review
From: Francois Romieu @ 2013-05-15 6:14 UTC
To: Ken Moffat; +Cc: Holger Hoffstaette, linux-kernel, stable, netdev

Ken Moffat <zarniwhoop@ntlworld•com> :
[...]
> Cc'ing to netdev because I don't think this has had a response, and

A patch has been sent to netdev a few hours ago. It needs more work,
especially testing (hint, hint) as I don't have a proven test case yet.

Please note:
- if you don't use a 8168evl (check your dmesg for the XID line emitted
  by the r8169 driver), you are not experiencing the same bug.
- if you don't enable Tx checksum offload (distro/vendor dependent though
  disabled by default in the vanilla driver, see ethtool -k eth0,
  ethtool -K eth0 tx on sg on), you are not experiencing the same
  bug.
- if you are experiencing the same bug, 3.10-rc1 should work again
  after reverting e5195c1f31f399289347e043d6abf3ffa80f0005

If someone comes with a failing network capture and a working one, it will
save time. A 64 bytes (max) packet is not correctly transmitted.

-- 
Ueimor
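
The two preconditions listed in the message above can be checked with a few lines of shell. What follows is a minimal sketch and not part of the original message: the function names `is_8168evl` and `tx_csum_enabled` are made up for illustration, both read text on stdin so they can be tried on saved output, and the patterns assume the r8169 probe line format quoted elsewhere in this thread and the `tx-checksumming:` line that `ethtool -k` normally prints.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from the thread.

# Succeeds if the dmesg text on stdin contains an r8169 probe line for an
# RTL8168evl chip, i.e. the line that also carries the XID.
is_8168evl() {
    grep -q 'r8169 .*RTL8168evl.*XID'
}

# Succeeds if the `ethtool -k` text on stdin reports Tx checksumming as on.
tx_csum_enabled() {
    grep -Eq '^[[:space:]]*tx-checksumming:[[:space:]]*on'
}

# Typical use on a live system (eth0 is an assumed interface name):
#   dmesg | is_8168evl && ethtool -k eth0 | tx_csum_enabled \
#       && echo "both preconditions for this bug are met"
```

If either check fails, the symptoms most likely have another cause, which is the conclusion reached later in the thread for the intermittent boot problem.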

* Re: r8169 on 3.8.13, 3.9.2, 3.10-rc1, was Re: [ 00/73] 3.8.13-stable review
From: Ken Moffat @ 2013-05-15 17:09 UTC
To: Francois Romieu; +Cc: Holger Hoffstaette, linux-kernel, stable, netdev

On Wed, May 15, 2013 at 08:14:01AM +0200, Francois Romieu wrote:
> Ken Moffat <zarniwhoop@ntlworld•com> :
> [...]
> > Cc'ing to netdev because I don't think this has had a response, and
>
> A patch has been sent to netdev a few hours ago. It needs more work,
> especially testing (hint, hint) as I don't have a proven test case yet.
>
> Please note:
> - if you don't use a 8168evl (check your dmesg for the XID line emitted
>   by the r8169 driver), you are not experiencing the same bug.

[ 3.174180] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at 0xffffc90000008000, c8:60:00:97:07:35, XID 0c900800 IRQ 41

> - if you don't enable Tx checksum offload (distro/vendor dependent though
>   disabled by default in the vanilla driver, see ethtool -k eth0,
>   ethtool -K eth0 tx on sg on), you are not experiencing the same
>   bug.

 If it is disabled by default then my problem is different (I don't
have an ethtool program)

> - if you are experiencing the same bug, 3.10-rc1 should work again
>   after reverting e5195c1f31f399289347e043d6abf3ffa80f0005
>
> If someone comes with a failing network capture and a working one, it will
> save time. A 64 bytes (max) packet is not correctly transmitted.
>
> --
> Ueimor

 Thanks for the detailed comments, looks as if my intermittent
problem is something else.

ken
-- 
das eine Mal als Tragödie, das andere Mal als Farce

* Re: r8169 on 3.8.13, 3.9.2, 3.10-rc1, was Re: [ 00/73] 3.8.13-stable review
From: David Miller @ 2013-05-15 20:39 UTC
To: romieu; +Cc: zarniwhoop, holger.hoffstaette, linux-kernel, stable, netdev

From: Francois Romieu <romieu@fr•zoreil.com>
Date: Wed, 15 May 2013 08:14:01 +0200

> Ken Moffat <zarniwhoop@ntlworld•com> :
> [...]
>> Cc'ing to netdev because I don't think this has had a response, and
>
> A patch has been sent to netdev a few hours ago. It needs more work,
> especially testing (hint, hint) as I don't have a proven test case yet.
>
> Please note:
> - if you don't use a 8168evl (check your dmesg for the XID line emitted
>   by the r8169 driver), you are not experiencing the same bug.
> - if you don't enable Tx checksum offload (distro/vendor dependent though
>   disabled by default in the vanilla driver, see ethtool -k eth0,
>   ethtool -K eth0 tx on sg on), you are not experiencing the same
>   bug.
> - if you are experiencing the same bug, 3.10-rc1 should work again
>   after reverting e5195c1f31f399289347e043d6abf3ffa80f0005
>
> If someone comes with a failing network capture and a working one, it will
> save time. A 64 bytes (max) packet is not correctly transmitted.

FWIW, I was about to submit the regression causing commit to -stable
but now I'm going to hold off until we get the fix for it to Linus.

* Re: r8169 on 3.8.13, 3.9.2, 3.10-rc1, was Re: [ 00/73] 3.8.13-stable review
From: David Miller @ 2013-05-15 23:15 UTC
To: romieu; +Cc: zarniwhoop, holger.hoffstaette, linux-kernel, stable, netdev

From: David Miller <davem@davemloft•net>
Date: Wed, 15 May 2013 13:39:13 -0700 (PDT)

> From: Francois Romieu <romieu@fr•zoreil.com>
> Date: Wed, 15 May 2013 08:14:01 +0200
>
>> Ken Moffat <zarniwhoop@ntlworld•com> :
>> [...]
>>> Cc'ing to netdev because I don't think this has had a response, and
>>
>> A patch has been sent to netdev a few hours ago. It needs more work,
>> especially testing (hint, hint) as I don't have a proven test case yet.
>>
>> Please note:
>> - if you don't use a 8168evl (check your dmesg for the XID line emitted
>>   by the r8169 driver), you are not experiencing the same bug.
>> - if you don't enable Tx checksum offload (distro/vendor dependent though
>>   disabled by default in the vanilla driver, see ethtool -k eth0,
>>   ethtool -K eth0 tx on sg on), you are not experiencing the same
>>   bug.
>> - if you are experiencing the same bug, 3.10-rc1 should work again
>>   after reverting e5195c1f31f399289347e043d6abf3ffa80f0005
>>
>> If someone comes with a failing network capture and a working one, it will
>> save time. A 64 bytes (max) packet is not correctly transmitted.
>
> FWIW, I was about to submit the regression causing commit to -stable
> but now I'm going to hold off until we get the fix for it to Linus.

Oh, I see, someone merged it behind my back.

Well, guys, now do you see why I let patches cook in Linus's tree for
a week or two before I submit them to -stable?

It's so that bugs like this are less likely to propagate, and we find
such problems and fix them before a change hits -stable and therefore
has an effect on an even larger number of users.

Please, use the networking -stable submission process. If you want
a patch merged, tell me, and I'll queue it up in patchwork.

* Re: r8169 on 3.8.13, 3.9.2, 3.10-rc1, was Re: [ 00/73] 3.8.13-stable review
From: Holger Hoffstaette @ 2013-05-15 8:31 UTC
To: stable; +Cc: linux-kernel, netdev

On Wed, 15 May 2013 01:07:15 +0100, Ken Moffat wrote:

> Cc'ing to netdev because I don't think this has had a response, and
> I care because I *might* be seeing the same problem on both 3.9.2 and
> 3.10-rc1, but my take on the problem is slightly different [ details after
> Holger's posting ]

I later isolated the regression to tx offloading, which allows one to
reproduce the problem on the fly (enable: b0rken internets, disable: all
fine). So unless you have tx offloading enabled then whatever you are
seeing has likely nothing to do with *this* problem.

Francois posted a patch to -netdev which I am running as we speak, and
the problem seems fixed for me; tx offloading (in addition to rx/gro)
works again, with no side effects. Please help test!

-h
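
The on-the-fly reproduction described in the message above could be driven by something like the sketch below. It is only an illustration under stated assumptions: `set_tx_offload` is a made-up wrapper, `eth0` is an assumed interface name, and the only real command involved is `ethtool -K <iface> tx on|off`, which needs root.

```shell
#!/bin/sh
# Hypothetical wrapper: switch Tx checksum offload on or off for an interface.
# Usage: set_tx_offload IFACE on|off
set_tx_offload() {
    iface=$1
    state=$2
    case "$state" in
        on|off) ;;
        *) echo "usage: set_tx_offload IFACE on|off" >&2; return 2 ;;
    esac
    # ethtool -K changes offload settings; "tx" selects Tx checksumming.
    ethtool -K "$iface" tx "$state"
}

# Reproduce, then recover (run as root; eth0 is an assumption):
#   set_tx_offload eth0 on    # on an affected 8168evl, traffic misbehaves
#   set_tx_offload eth0 off   # traffic should behave again
```

Keeping the toggle in one function makes it easy to take one capture with offload on and one with it off, which is the pair of captures asked for earlier in the thread.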

Thread overview: 6+ messages
     [not found] <20130509222757.917088509@linuxfoundation.org>
     [not found] ` <pan.2013.05.10.10.54.26.943195@googlemail.com>
2013-05-15  0:07   ` r8169 on 3.8.13, 3.9.2, 3.10-rc1, was Re: [ 00/73] 3.8.13-stable review Ken Moffat
2013-05-15  6:14     ` Francois Romieu
2013-05-15 17:09       ` Ken Moffat
2013-05-15 20:39       ` David Miller
2013-05-15 23:15         ` David Miller
2013-05-15  8:31     ` Holger Hoffstaette