public inbox for netdev@vger.kernel.org 
* 2.6.29 forcedeth hang W/O NAPI enabled
@ 2009-03-24 15:28 Mr. Berkley Shands
  0 siblings, 0 replies; 11+ messages in thread
From: Mr. Berkley Shands @ 2009-03-24 15:28 UTC (permalink / raw)
  To: netdev; +Cc: Mark A. Bober, Lloyd, Dave

Another new kernel, another interesting lockup. CentOS 5.2, x86_64, on an
Opteron system with 8 GB and 4 cores (2 x Opteron 275).
If CONFIG_FORCEDETH_NAPI is not enabled, then within 60 seconds of the
console login prompt appearing, the network becomes unresponsive. Packets
are seen to appear according to ifconfig eth0 and ethtool -S eth0, but
they go nowhere. NFS stops, ping stops, logins stop, LDAP stops.
My network is class B, netmask 255.255.0.0, and the department router is
directly connected under this netmask. If I recompile forcedeth.ko with
NAPI enabled, reinstall it, and run depmod -aq, then

service network stop; rmmod forcedeth; modprobe forcedeth; service network start

brings everything back online eventually. This was not an issue with
2.6.28-8 or before.

Berkley
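
The recovery sequence in the message above can be wrapped in a small script. This is only a minimal sketch and not part of the original report: the function name `reload_forcedeth` and the `RUN` switch are assumptions added for illustration. By default it only prints the four commands, so they can be checked before anything is run as root.

```shell
#!/bin/sh
# Sketch of the reload sequence quoted in the report.
# RUN is a hypothetical switch: when unset or empty (the default) the
# commands are only printed; set RUN=1 to execute them (requires root).
reload_forcedeth() {
    for cmd in \
        "service network stop" \
        "rmmod forcedeth" \
        "modprobe forcedeth" \
        "service network start"
    do
        echo "+ $cmd"
        if [ -n "${RUN:-}" ]; then
            # Stop at the first failing step so the network scripts are
            # not started on top of a driver that failed to load.
            $cmd || return 1
        fi
    done
}
```

Calling `reload_forcedeth` prints the plan; `RUN=1 reload_forcedeth` would carry it out.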




* Re: 2.6.29 forcedeth hang W/O NAPI enabled
@ 2009-03-25 23:24 Adam Richter
  2009-03-26  0:05 ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: Adam Richter @ 2009-03-25 23:24 UTC (permalink / raw)
  To: netdev; +Cc: berkley


	I am experiencing what is probably the same forcedeth ethernet
hang with FORCEDETH_NAPI disabled as reported by Berkley Shands.  I
want to add the following additional data (items 2-7 basically just
confirm what one would expect):

	1) I can narrow where the problem was introduced.  The problem
	   does not occur for me in 2.6.29-rc8-git6, the last git snapshot
	   before 2.6.29.  There are no changes to forcedeth.c between
	   these versions.

	2) The amount of time it takes to reproduce the problem seems
	   to depend on networking utilization.  I can reproduce the
	   problem in about 30 seconds by doing "ping -f" to a
	   computer on my local ethernet for about one minute.
	   Otherwise, my computer, which normally does not do much
	   network communication, takes about an hour to exhibit the
	   problem.

	3) I can recover by doing "rmmod forcedeth ; modprobe forcedeth"
	   even without recompiling with NAPI enabled, but the
	   problem seems to recur more quickly, until reloading the
	   forcedeth module no longer seems to work.  (I infer from
	   Berkley Shands' message that reloading the module
	   recompiled with NAPI enabled will cause the problem not
	   to recur.)

	4) Given that this looks like a NAPI problem, it should come
	   as no surprise that ethernet transmit still works when the
	   problem is occurring.  I know this because I can run ping
	   from the affected machine to a target machine running
	   tcpdump, and the target machine sees the ping packets.

	5) When the problem occurs, "ifconfig eth0" reports a gradually
	   increasing count of "RX packets" (I assume from random
	   broadcast packets originating elsewhere on the local
	   ethernet), and no obvious signs of trouble:
 
          RX packets:2092 errors:0 dropped:0 overruns:0 frame:0
          TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:177338 (173.1 KB)  TX bytes:6732 (6.5 KB)

	6) No complaints on the kernel console appear when
	   ethernet receive stops working.

	7) When the problem occurs, the other functions of the
	   computer apparently continue to work fine.  In particular,
	   I can reboot the computer from a user program without
	   incident.
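
The symptoms listed above (RX counters still climbing while nothing gets through) could be checked with a script along these lines. This is a sketch under stated assumptions, not something from the thread: the function names are hypothetical, the "hung" heuristic is a guess at a useful test, and the parsing only matches the older ifconfig format quoted in item 5.

```shell
#!/bin/sh
# Extract the "RX packets" count from ifconfig output in the format quoted
# above (e.g. "RX packets:2092 errors:0 ...").  Reads standard input.
rx_packets() {
    sed -n 's/.*RX packets:\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Heuristic (an assumption, not from the thread): print "hung" when the RX
# counter advanced between two samples but a ping to a local host failed.
# $1 = earlier RX count, $2 = later RX count, $3 = ping exit status
classify() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo unknown        # counters could not be parsed
    elif [ "$2" -gt "$1" ] && [ "$3" -ne 0 ]; then
        echo hung
    else
        echo ok
    fi
}

# Take two samples around a short ping and print the verdict.
# $1 = interface (e.g. eth0), $2 = a host on the local network
check_hang() {
    before=$(ifconfig "$1" | rx_packets)
    ping -c 3 -w 5 "$2" >/dev/null 2>&1
    status=$?
    after=$(ifconfig "$1" | rx_packets)
    classify "$before" "$after" "$status"
}
```

On a quiet network the counter may not move during the three-second sample, so an "ok" result does not rule the hang out.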

	When I can find some time, I plan to try to narrow the problem
with git bisect, but that may not be today.

Adam Richter
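
The bisect mentioned in the message above might be started roughly as follows. This is a sketch, not the procedure actually used in the thread. The function names are hypothetical, and because the 2.6.29-rc8-git6 snapshot has no tag in the mainline tree, the example uses `v2.6.29-rc8` as the good revision, which assumes that rc8 itself is also unaffected.

```shell
#!/bin/sh
# Rough number of kernels to build and boot when bisecting $1 commits:
# each step about halves the candidate range, so roughly ceil(log2(N)).
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Begin a bisect inside a kernel git tree.
# $1 = known-bad revision, $2 = known-good revision
start_bisect() {
    git bisect start &&
    git bisect bad "$1" &&
    git bisect good "$2"
}

# Example, assuming v2.6.29-rc8 is good like the -git6 snapshot:
#   bisect_steps "$(git rev-list --count v2.6.29-rc8..v2.6.29)"
#   start_bisect v2.6.29 v2.6.29-rc8
# Then, for each revision git checks out: build, boot, run the "ping -f"
# test, and record the result with "git bisect good" or "git bisect bad".
```

Since each step needs a reboot into the freshly built kernel, the loop has to be driven by hand rather than with `git bisect run`.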



      

* 2.6.29 forcedeth hang W/O NAPI enabled
@ 2009-03-26  0:06 Adam Richter
  2009-03-26  0:08 ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: Adam Richter @ 2009-03-26  0:06 UTC (permalink / raw)
  To: netdev; +Cc: berkley


In addition to seeing the problem with CONFIG_FORCEDETH_NAPI disabled,
I have now reproduced the problem with that configuration option enabled.
So, NAPI might not be the problem at all.

Adam Richter
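
Because the reports differ on whether NAPI matters, it may help to confirm which setting a given kernel was built with. The following is a minimal sketch, not taken from the thread; the function name `napi_state` is hypothetical, and the example config path is an assumption about where a distribution installs the file.

```shell
#!/bin/sh
# Report whether CONFIG_FORCEDETH_NAPI is set in a kernel config file.
# $1 = path to the config, for example /boot/config-$(uname -r) on
#      distributions that install one (that location is an assumption).
napi_state() {
    if grep -q '^CONFIG_FORCEDETH_NAPI=y$' "$1"; then
        echo enabled
    elif grep -q '^# CONFIG_FORCEDETH_NAPI is not set$' "$1"; then
        echo disabled
    else
        echo unknown    # option absent from this config file
    fi
}
```

For a kernel built from source, `napi_state .config` in the build tree gives the same answer.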



      



Thread overview: 11+ messages
2009-03-24 15:28 2.6.29 forcedeth hang W/O NAPI enabled Mr. Berkley Shands
  -- strict thread matches above, loose matches on Subject: below --
2009-03-25 23:24 Adam Richter
2009-03-26  0:05 ` David Miller
2009-03-26  1:20   ` Adam Richter
2009-03-26  3:14     ` David Miller
2009-03-26  3:36     ` Herbert Xu
2009-03-26  5:24       ` Adam Richter
2009-03-26  6:58         ` Herbert Xu
2009-03-26 23:29           ` Adam Richter
2009-03-26  0:06 Adam Richter
2009-03-26  0:08 ` David Miller
