public inbox for linux-next@vger.kernel.org 
* [powerpc][next-20200701] Hung task timeouts during regression test runs
@ 2020-07-02 11:23 Sachin Sant
  2020-07-02 11:52 ` Ming Lei
  0 siblings, 1 reply; 3+ messages in thread
From: Sachin Sant @ 2020-07-02 11:23 UTC
  To: linuxppc-dev, linux-block; +Cc: Linux Next Mailing List, ming.lei, axboe

Starting with the linux-next 20200701 release, I am observing automated regression
tests taking longer to complete. A test which took 10 minutes with next-20200630
took more than 60 minutes against next-20200701.

The following hung task timeout messages were seen during these runs:

[ 1718.848351]       Not tainted 5.8.0-rc3-next-20200701-autotest #1
[ 1718.848356] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1718.848362] NetworkManager  D    0  2626      1 0x00040080
[ 1718.848367] Call Trace:
[ 1718.848374] [c0000008b0f6b8f0] [c000000000c6d558] schedule+0x78/0x130 (unreliable)
[ 1718.848382] [c0000008b0f6bad0] [c00000000001b070] __switch_to+0x2e0/0x480
[ 1718.848388] [c0000008b0f6bb30] [c000000000c6ce9c] __schedule+0x2cc/0x910
[ 1718.848394] [c0000008b0f6bc10] [c000000000c6d558] schedule+0x78/0x130
[ 1718.848401] [c0000008b0f6bc40] [c0000000005d5a64] jbd2_log_wait_commit+0xd4/0x1a0
[ 1718.848408] [c0000008b0f6bcc0] [c00000000055fb6c] ext4_sync_file+0x1cc/0x480
[ 1718.848415] [c0000008b0f6bd20] [c000000000493530] vfs_fsync_range+0x70/0xf0
[ 1718.848421] [c0000008b0f6bd60] [c000000000493638] do_fsync+0x58/0xd0
[ 1718.848427] [c0000008b0f6bda0] [c0000000004936d8] sys_fsync+0x28/0x40
[ 1718.848433] [c0000008b0f6bdc0] [c000000000035e28] system_call_exception+0xf8/0x1c0
[ 1718.848440] [c0000008b0f6be20] [c00000000000ca70] system_call_common+0xf0/0x278

Comparing next-20200630 with next-20200701, one possible candidate seems to
be the following commit:

commit 37f4a24c2469a10a4c16c641671bd766e276cf9f
    blk-mq: centralise related handling into blk_mq_get_driver_tag

Reverting this commit allows the test to complete in 10 minutes.

Thanks
-Sachin

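The revert check described in the report above can be sketched as follows. This is a minimal illustration, not the reporter's actual procedure. It assumes the current directory is a linux-next checkout that has the tag next-20200701; the branch name revert-test and the run helper are hypothetical. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: confirm a suspect commit by reverting it on top of the bad tag.
# Assumptions (not from the thread): a linux-next checkout in the current
# directory, and a scratch branch called "revert-test".
SUSPECT=37f4a24c2469a10a4c16c641671bd766e276cf9f
BAD_TAG=next-20200701

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

revert_suspect() {
    # Start from the first tag that shows the slowdown ...
    run git checkout -b revert-test "$BAD_TAG"
    # ... and back out only the suspect commit.
    run git revert --no-edit "$SUSPECT"
}

revert_suspect
```

Running it with DRY_RUN=0 performs the checkout and revert; the kernel then has to be rebuilt, booted, and the regression test rerun to compare run times.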


* Re: [powerpc][next-20200701] Hung task timeouts during regression test runs
  2020-07-02 11:23 [powerpc][next-20200701] Hung task timeouts during regression test runs Sachin Sant
@ 2020-07-02 11:52 ` Ming Lei
  2020-07-02 13:16   ` Sachin Sant
  0 siblings, 1 reply; 3+ messages in thread
From: Ming Lei @ 2020-07-02 11:52 UTC
  To: Sachin Sant; +Cc: linuxppc-dev, linux-block, Linux Next Mailing List, axboe

On Thu, Jul 02, 2020 at 04:53:04PM +0530, Sachin Sant wrote:
> Starting with linux-next 20200701 release I am observing automated regressions
> tests taking longer time to complete. A test which took 10 minutes with next-20200630
> took more than 60 minutes against next-20200701. 
> 
> Following hung task timeout messages were seen during these runs
> 
> [ 1718.848351]       Not tainted 5.8.0-rc3-next-20200701-autotest #1
> [ 1718.848356] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1718.848362] NetworkManager  D    0  2626      1 0x00040080
> [ 1718.848367] Call Trace:
> [ 1718.848374] [c0000008b0f6b8f0] [c000000000c6d558] schedule+0x78/0x130 (unreliable)
> [ 1718.848382] [c0000008b0f6bad0] [c00000000001b070] __switch_to+0x2e0/0x480
> [ 1718.848388] [c0000008b0f6bb30] [c000000000c6ce9c] __schedule+0x2cc/0x910
> [ 1718.848394] [c0000008b0f6bc10] [c000000000c6d558] schedule+0x78/0x130
> [ 1718.848401] [c0000008b0f6bc40] [c0000000005d5a64] jbd2_log_wait_commit+0xd4/0x1a0
> [ 1718.848408] [c0000008b0f6bcc0] [c00000000055fb6c] ext4_sync_file+0x1cc/0x480
> [ 1718.848415] [c0000008b0f6bd20] [c000000000493530] vfs_fsync_range+0x70/0xf0
> [ 1718.848421] [c0000008b0f6bd60] [c000000000493638] do_fsync+0x58/0xd0
> [ 1718.848427] [c0000008b0f6bda0] [c0000000004936d8] sys_fsync+0x28/0x40
> [ 1718.848433] [c0000008b0f6bdc0] [c000000000035e28] system_call_exception+0xf8/0x1c0
> [ 1718.848440] [c0000008b0f6be20] [c00000000000ca70] system_call_common+0xf0/0x278
> 
> Comparing next-20200630 with next-20200701 one possible candidate seems to
> be following commit:
> 
> commit 37f4a24c2469a10a4c16c641671bd766e276cf9f
>     blk-mq: centralise related handling into blk_mq_get_driver_tag
> 
> Reverting this commit allows the test to complete in 10 minutes.

Hello,

Thanks for the report.

Please try the following fix:

https://lore.kernel.org/linux-block/20200702062041.GC2452799@T590/raw


Thanks,
Ming

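One way to try the fix linked above is to fetch the raw message and apply it with git am. This is a hedged sketch: it assumes the raw message carries a patch that git am can apply cleanly to the tree under test, which the thread does not spell out. The file name fix.mbox and the run helper are hypothetical. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: download the proposed fix from lore and apply it to the tree.
# Assumptions (not from the thread): the raw message applies with
# "git am", and "fix.mbox" is just a scratch file name.
FIX_URL='https://lore.kernel.org/linux-block/20200702062041.GC2452799@T590/raw'

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_fix() {
    # Save the raw email, failing on HTTP errors instead of saving an error page.
    run curl -fsSL -o fix.mbox "$FIX_URL"
    # Apply the patch contained in the saved message.
    run git am fix.mbox
}

apply_fix
```

If git am stops with a conflict, `git am --abort` restores the tree to its previous state.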


* Re: [powerpc][next-20200701] Hung task timeouts during regression test runs
  2020-07-02 11:52 ` Ming Lei
@ 2020-07-02 13:16   ` Sachin Sant
  0 siblings, 0 replies; 3+ messages in thread
From: Sachin Sant @ 2020-07-02 13:16 UTC
  To: Ming Lei; +Cc: linuxppc-dev, linux-block, Linux Next Mailing List, axboe



> On 02-Jul-2020, at 5:22 PM, Ming Lei <ming.lei@redhat.com> wrote:
> 
> On Thu, Jul 02, 2020 at 04:53:04PM +0530, Sachin Sant wrote:
>> Starting with linux-next 20200701 release I am observing automated regressions
>> tests taking longer time to complete. A test which took 10 minutes with next-20200630
>> took more than 60 minutes against next-20200701. 
>> 
>> Following hung task timeout messages were seen during these runs
>> 
>> [ 1718.848351]       Not tainted 5.8.0-rc3-next-20200701-autotest #1
>> [ 1718.848356] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 1718.848362] NetworkManager  D    0  2626      1 0x00040080
>> [ 1718.848367] Call Trace:
>> [ 1718.848374] [c0000008b0f6b8f0] [c000000000c6d558] schedule+0x78/0x130 (unreliable)
>> [ 1718.848382] [c0000008b0f6bad0] [c00000000001b070] __switch_to+0x2e0/0x480
>> [ 1718.848388] [c0000008b0f6bb30] [c000000000c6ce9c] __schedule+0x2cc/0x910
>> [ 1718.848394] [c0000008b0f6bc10] [c000000000c6d558] schedule+0x78/0x130
>> [ 1718.848401] [c0000008b0f6bc40] [c0000000005d5a64] jbd2_log_wait_commit+0xd4/0x1a0
>> [ 1718.848408] [c0000008b0f6bcc0] [c00000000055fb6c] ext4_sync_file+0x1cc/0x480
>> [ 1718.848415] [c0000008b0f6bd20] [c000000000493530] vfs_fsync_range+0x70/0xf0
>> [ 1718.848421] [c0000008b0f6bd60] [c000000000493638] do_fsync+0x58/0xd0
>> [ 1718.848427] [c0000008b0f6bda0] [c0000000004936d8] sys_fsync+0x28/0x40
>> [ 1718.848433] [c0000008b0f6bdc0] [c000000000035e28] system_call_exception+0xf8/0x1c0
>> [ 1718.848440] [c0000008b0f6be20] [c00000000000ca70] system_call_common+0xf0/0x278
>> 
>> Comparing next-20200630 with next-20200701 one possible candidate seems to
>> be following commit:
>> 
>> commit 37f4a24c2469a10a4c16c641671bd766e276cf9f
>>    blk-mq: centralise related handling into blk_mq_get_driver_tag
>> 
>> Reverting this commit allows the test to complete in 10 minutes.
> 
> Hello,
> 
> Thanks for the report.
> 
> Please try the following fix:
> 
> https://lore.kernel.org/linux-block/20200702062041.GC2452799@T590/raw

The fix works for me.

Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>

Thanks
-Sachin

> 
> 
> Thanks,
> Ming

