public inbox for netdev@vger.kernel.org 
* Re: Strange delays / what usually happens every 10 min?
       [not found] <fhcc39$51b$1@ger.gmane.org>
@ 2007-11-13 16:23 ` Eric Dumazet
  2007-11-13 17:54   ` Florian Boelstler
  0 siblings, 1 reply; 2+ messages in thread
From: Eric Dumazet @ 2007-11-13 16:23 UTC
  To: Florian Boelstler; +Cc: linux-kernel, netdev

Florian Boelstler wrote:
> Hi,
>
> this issue has already been discussed on the kernelnewbies mailing
> list [1],[2], where it was suggested to continue the discussion here.
>
> I am currently working on an MPC8540-based custom board, which runs Linux
> 2.6.15 (arch/ppc). The original Linux sources have been modified to 
> support that custom board. (Additional patches to support LTT are 
> applied as well, though disabled in the running kernel)
>
> I set up a periodically running kernel thread, which is delayed for a
> single jiffy using schedule_timeout() in an infinite loop. It is used to
> measure delays between invocations of that thread. For measuring the
> distance in time the PPC's time base lower half register is used
> (obtained using get_cycles() defined in asm/timex.h).
>
> The thread calculates the delay to the previous run and only outputs the
> result if a new maximum value has been determined (with respect to all
> previous cycles). In addition, the thread outputs a warning if a very
> "high" delay is measured, i.e. a delay greater than 5 ms.
>
> While running that test driver, a delay of about 10 ms occurs _exactly_
> every 10 minutes.
>
> The kernel is configured using CONFIG_HZ=1000 and CONFIG_PREEMPT.
> The CCB is at 333 MHz, whereas the TBR update rate is 333 MHz / 8, i.e.
> 41.625 MHz.
> Kernel configuration as a whole is found here: 
> http://nopaste.info/5e4d0283bb.html
>
> And now the funny part starts.
> I got a response from Bruce Rowen on kernelnewbies, telling me that he
> came across the same problem. He upgraded his AMD-Geode-based
> platform to 1 GB of RAM (256 MB before) and also hit the
> 10-minute issue a few months ago (using Linux 2.6.13).
> Going back to 256 MB cured the problem. I did the same thing by
> instructing the boot loader to use only 256 MB of RAM
> (instead of 512 MB) and yes, the 10-minute issue was gone as well.
>
> Apart from some kernel threads, almost all user processes were killed
> during the test. Only SSH and a bash were running (a test with
> network interfaces completely disabled and operated only from a serial
> console produced the same results).
> The kernel comes with compiled-in CIFS support and some kernel debugging
> features such as soft-lockup detection and preemption debugging.
> Accordingly, ps lists the kernel threads ksoftirqd, watchdog, events,
> khelper, kthread, kblockd, pdflush, aio, cifsoplockd and cifsdnotifyd.
>
> An equivalent userspace test tool based on nanosleep() produced the
> same results as the kernel thread:
>
> root@mpc0:/# /tmp/wait.rt
> looping 1 milli seconds nanosleep ...
> 15:26:16: #1 FRAME MAX 1996 us (at 4139773004 ticks)
> 15:26:16: #2 FRAME MAX 2002 us (at 4139856360 ticks)
> 15:26:16: #155 FRAME MAX 2102 us (at 4152597854 ticks)
> 15:41:37: #460398 FRAME MAX 8941 us (at 3813406605 ticks)
> 15:41:37: #460398 FRAME HIGH 8941 us (at 3813406605 ticks)
> 15:51:37: #760394 FRAME MAX 9936 us (at 3018602602 ticks)
> 15:51:37: #760394 FRAME HIGH 9936 us (at 3018602602 ticks)
> 16:01:37: #1060390 FRAME HIGH 9935 us (at 2223798809 ticks)
> 16:11:37: #1360386 FRAME HIGH 9934 us (at 1428994989 ticks)
> 16:21:37: #1660382 FRAME HIGH 9935 us (at 634191241 ticks)
> [...]
>
> Thanks for any help!
>
> Cheers,
>
>   Florian
>
> [1] http://thread.gmane.org/gmane.linux.kernel.kernelnewbies/23419
> [2] http://thread.gmane.org/gmane.linux.kernel.kernelnewbies/23426
>
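
The measurement loop described in the quoted message can be sketched in userspace roughly as follows. This is only an illustration, not the original /tmp/wait.rt tool: the names track_frames and sample_frames are hypothetical, time.sleep() stands in for nanosleep(), and only the 5 ms warning threshold and the MAX/HIGH reporting rule are taken from the description above.

```python
import time

HIGH_US = 5000  # warning threshold from the description: delays above 5 ms


def track_frames(frames_us, high_us=HIGH_US):
    """Yield (frame_no, kind, delay_us) for each new maximum and each high delay."""
    max_us = 0
    for n, delay in enumerate(frames_us, start=1):
        if delay > max_us:
            # Only report when the delay exceeds every previous one.
            max_us = delay
            yield (n, "MAX", delay)
        if delay > high_us:
            yield (n, "HIGH", delay)


def sample_frames(count, period_s=0.001):
    """Sleep period_s per iteration and yield the time between wake-ups in us."""
    prev = time.monotonic_ns()
    for _ in range(count):
        time.sleep(period_s)
        now = time.monotonic_ns()
        yield (now - prev) // 1000
        prev = now


if __name__ == "__main__":
    # Hypothetical short run; the original test ran for hours.
    for n, kind, delay in track_frames(sample_frames(1000)):
        print(f"#{n} FRAME {kind} {delay} us")
```

Keeping the reporting rule separate from the sampling loop makes it possible to replay recorded delays through track_frames without waiting for a real 10-minute interval.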

Hi Florian

I think you hit the periodic flush of the IP route cache, which fires
every 600 seconds by default.

(Check /proc/sys/net/ipv4/route/secret_interval )
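
A minimal sketch for checking that value from a script, assuming the file holds the interval as a plain integer number of seconds (as the 600-second default above suggests). The helper name read_secret_interval is hypothetical, and it returns None when the file is absent, which may be the case on kernels that do not expose this sysctl.

```python
from pathlib import Path

# Path taken from the message above; the helper itself is an illustration.
SECRET_INTERVAL = Path("/proc/sys/net/ipv4/route/secret_interval")


def read_secret_interval(path=SECRET_INTERVAL):
    """Return the flush interval in seconds, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not a plain integer.
        return None


if __name__ == "__main__":
    print(read_secret_interval())
```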

For a 1GB machine, this hash table is so big that a full scan might take 
more than 10 ms, even if empty.

Instead of using less RAM, you could just boot with rhash_entries=1024 
to lower the size of this table.

Or just change secret_interval to 2000000 for example (not much more,
because the value multiplied by HZ could overflow)
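
The overflow remark can be checked with a little arithmetic. The sketch below assumes the interval is stored in jiffies (seconds * HZ) in a signed 32-bit integer; that storage type is an assumption made for illustration, and the function names are hypothetical.

```python
INT_MAX = 2**31 - 1  # assumed: interval kept as a signed 32-bit jiffies value


def fits_in_int(seconds, hz):
    """True if seconds * hz still fits into a signed 32-bit integer."""
    return seconds * hz <= INT_MAX


def max_safe_seconds(hz):
    """Largest interval in seconds that does not overflow for the given HZ."""
    return INT_MAX // hz


if __name__ == "__main__":
    hz = 1000  # CONFIG_HZ=1000, as in the original report
    print(fits_in_int(2_000_000, hz))  # True: 2_000_000_000 <= 2_147_483_647
    print(max_safe_seconds(hz))        # 2147483
```

Under that assumption and HZ=1000, the suggested 2000000 seconds (about 23 days) is close to the largest usable value.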

Eric


* Re: Strange delays / what usually happens every 10 min?
  2007-11-13 16:23 ` Strange delays / what usually happens every 10 min? Eric Dumazet
@ 2007-11-13 17:54   ` Florian Boelstler
  0 siblings, 0 replies; 2+ messages in thread
From: Florian Boelstler @ 2007-11-13 17:54 UTC
  To: linux-kernel; +Cc: netdev

Hi Eric,

Eric Dumazet wrote:
> Instead of using less RAM, you could just boot with rhash_entries=1024 
> to lower the size of this table.

I just tried that and it seems to reduce the scan time. This is the 
result for the first 40 minutes of runtime:

root@mpc0:/# /tmp/wait.rt
looping 1 milli seconds nanosleep ...
17:10:11/425384 #1 MAX 1996/83117/-268599896 us/tick/usec (at 2107848557)
17:10:11/427385 #2 MAX 2001/83327/2001 us/tick/usec (at 2107931884)
17:10:11/433534 #5 MAX 2149/89477/2150 us/tick/usec (at 2108187839)
17:27:02/5897 #505291 MAX 2512/104576/2513 us/tick/usec (at 1223589469)
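
The us and tick columns in this output are consistent with the 41.625 MHz time base (333 MHz / 8) mentioned in the original report. The sketch below shows the conversion, plus a wraparound-safe difference for the 32-bit lower half of the time base. The function names are hypothetical, and truncating to whole microseconds is an assumption that happens to match the values logged above.

```python
TBR_HZ = 333_000_000 // 8  # 41_625_000 ticks per second
TBL_BITS = 32              # lower half of the PPC time base register


def ticks_to_us(ticks, tbr_hz=TBR_HZ):
    """Convert time base ticks to whole microseconds (truncating)."""
    return ticks * 1_000_000 // tbr_hz


def tick_delta(prev, cur, bits=TBL_BITS):
    """Ticks elapsed between two counter reads, tolerating one wraparound."""
    return (cur - prev) % (1 << bits)


if __name__ == "__main__":
    # The "(at ...)" values of frames #1 and #2 above.
    delta = tick_delta(2107848557, 2107931884)
    print(delta, ticks_to_us(delta))  # 83327 2001
```

At this rate the 32-bit counter wraps roughly every 103 seconds, so a difference computed this way is only unambiguous for intervals shorter than that.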

The first ~10 ms delay usually occurred after ~15 minutes. So one could
argue that the MAX value reported at 17:27:02 (GMT) corresponds to the
first flush of the IP route cache, and that none of the later flushes
took longer than 2.5 ms.

Thanks to all of you, especially Eric. It seems I now have a tool for
lowering the system response time.

Cheers,

   Florian

PS: Unfortunately I had to remove some CC:-entries since the local 
firewall seems to not allow anything but NNTP (for gmane) and HTTP.
