* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
[not found] ` <20250813232835.43458-3-inochiama@gmail.com>
@ 2025-08-26 19:45 ` Anders Roxell
2025-08-26 22:09 ` Nathan Chancellor
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Anders Roxell @ 2025-08-26 19:45 UTC (permalink / raw)
To: Inochi Amaoto, regressions, linux-next
Cc: Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
benjamin.copeland
On 2025-08-14 07:28, Inochi Amaoto wrote:
> As the RISC-V PLIC cannot apply an affinity setting without calling
> irq_enable(), it will make the interrupt unavailable when used as
> an underlying IRQ chip for an MSI controller.
>
> Implement .irq_startup() and .irq_shutdown() for the PCI MSI and
> MSI-X templates. For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT,
> these startup and shutdown the parent as well, which allows the
> irq on the parent chip to be enabled if the irq is not enabled
> when allocating. This is necessary for the MSI controllers which
> use PLIC as underlying IRQ chip.
>
> Suggested-by: Thomas Gleixner <tglx@linutronix•de>
> Signed-off-by: Inochi Amaoto <inochiama@gmail•com>
Regressions were found while booting Linux next-20250826 on qemu-arm64
and qemu-armv7; the relevant kernel log is shown below.
Bisection identified this commit as the cause of the regression.
Regression Analysis:
- New regression? Yes
- Reproducible? Yes
First seen on next-20250826
Good: next-20250825
Bad: next-20250826
Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
qemu-armv7.
Expected behavior: System should boot normally and virtio block devices
should be detected and initialized immediately.
Actual behavior: System hangs for ~30 seconds during virtio block device
initialization before showing scheduler deadline replenish errors and
failing to complete boot.
Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>
[...]
<6>[ 1.369038] virtio-pci 0000:00:01.0: enabling device (0000 -> 0003)
<6>[ 1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
<6>[ 1.450858] msm_serial: driver initialized
<6>[ 1.454489] SuperH (H)SCI(F) driver initialized
<6>[ 1.456056] STM32 USART driver initialized
<6>[ 1.513325] loop: module loaded
<6>[ 1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
<5>[ 1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical blocks (2.76 GB/2.57 GiB)
<4>[ 29.761219] sched: DL replenish lagged too much
[here it hangs]
Reverting this commit restores normal boot behavior.
qemu-arm64
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/log
qemu-armv7
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663615/suite/boot/test/gcc-13-lkftconfig/log
## Source
* Git tree:
* https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
* Git sha: d0630b758e593506126e8eda6c3d56097d1847c5
* Git describe: next-20250826
* Project details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826
* Architectures: arm64
* Toolchains: gcc-13
* Kconfigs: gcc-13-lkftconfig
## Build
* Test history: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/history/
* Test link: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/31oo1cMOi0uSNKYApi80iQahbLi
* Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/
* Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/config
--
Linaro LKFT
https://lkft.linaro.org
* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
2025-08-26 19:45 ` [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains Anders Roxell
@ 2025-08-26 22:09 ` Nathan Chancellor
2025-08-27 10:33 ` Mark Brown
2025-08-26 22:33 ` Inochi Amaoto
2025-08-27 9:44 ` Inochi Amaoto
2 siblings, 1 reply; 9+ messages in thread
From: Nathan Chancellor @ 2025-08-26 22:09 UTC (permalink / raw)
To: Anders Roxell
Cc: Inochi Amaoto, regressions, linux-next, Thomas Gleixner,
Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi, Shradha Gupta,
Haiyang Zhang, Jonathan Cameron, Juergen Gross, Nicolin Chen,
Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci, Yixun Lan,
Longbin Li, arnd, dan.carpenter, naresh.kamboju,
benjamin.copeland
On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:
> Regressions found while booting the Linux next-20250826 on the
> qemu-arm64, qemu-armv7 due to following kernel log.
>
> Bisection identified this commit as the cause of the regression.
>
> Regression Analysis:
> - New regression? Yes
> - Reproducible? Yes
>
> First seen on the next-20250826
> Good: next-20250825
> Bad: next-20250826
>
> Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
> qemu-armv7.
>
> Expected behavior: System should boot normally and virtio block devices
> should be detected and initialized immediately.
>
> Actual behavior: System hangs for ~30 seconds during virtio block device
> initialization before showing scheduler deadline replenish errors and
> failing to complete boot.
>
> Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>
>
> [...]
> <6>[ 1.369038] virtio-pci 0000:00:01.0: enabling device (0000 -> 0003)
> <6>[ 1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> <6>[ 1.450858] msm_serial: driver initialized
> <6>[ 1.454489] SuperH (H)SCI(F) driver initialized
> <6>[ 1.456056] STM32 USART driver initialized
> <6>[ 1.513325] loop: module loaded
> <6>[ 1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
> <5>[ 1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical blocks (2.76 GB/2.57 GiB)
> <4>[ 29.761219] sched: DL replenish lagged too much
> [here it hangs]
FWIW, I am also seeing this on real arm64 hardware (an LX2160A board and
an Ampere Altra one) but with my NVMe drives failing to be recognized.
In somewhat ironic fashion, I am seeing the message from the cover
letter repeating.
nvme nvme0: I/O tag 8 (1008) QID 0 timeout, completion polled
[ 125.810062] dracut-initqueue[640]: Timed out while waiting for udev queue to empty.
nvme nvme0: I/O tag 9 (1009) QID 0 timeout, completion polled
I am happy to test patches or provide information.
Cheers,
Nathan
# bad: [d0630b758e593506126e8eda6c3d56097d1847c5] Add linux-next specific files for 20250826
# good: [b6add54ba61890450fa54fd9327d10fdfd653439] Merge tag 'pinctrl-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
git bisect start 'd0630b758e593506126e8eda6c3d56097d1847c5' 'b6add54ba61890450fa54fd9327d10fdfd653439'
# good: [968d16786392f6e047329f5eff66acc131636019] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
git bisect good 968d16786392f6e047329f5eff66acc131636019
# good: [042e9f528d5362c499b5d8e2716cf6f64ca53add] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394.git
git bisect good 042e9f528d5362c499b5d8e2716cf6f64ca53add
# bad: [beebb75399dc36e7c244db0a08426053b4581ecc] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git
git bisect bad beebb75399dc36e7c244db0a08426053b4581ecc
# good: [62df8fb299358a45a915381de09025cf5e6a4a8f] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git
git bisect good 62df8fb299358a45a915381de09025cf5e6a4a8f
# bad: [1e6d2dcb13c8d94b56de1eff60235ca90587046b] Merge branch 'master' of https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
git bisect bad 1e6d2dcb13c8d94b56de1eff60235ca90587046b
# bad: [a0daa9e939dbcd7767090151771d94ade75a4fd5] Merge branch into tip/master: 'x86/build'
git bisect bad a0daa9e939dbcd7767090151771d94ade75a4fd5
# bad: [d147a3db0dfa15c8e460f007128bd0fe2e1b877f] Merge branch into tip/master: 'perf/core'
git bisect bad d147a3db0dfa15c8e460f007128bd0fe2e1b877f
# good: [be5697d7136525a91e7f30fdca2e7de737d9a8ed] Merge branch into tip/master: 'irq/core'
git bisect good be5697d7136525a91e7f30fdca2e7de737d9a8ed
# good: [5d299897f1e36025400ca84fd36c15925a383b03] perf: Split out the RB allocation
git bisect good 5d299897f1e36025400ca84fd36c15925a383b03
# bad: [7fb83eb664e9b3a0438dd28859e9f0fd49d4c165] irqchip/loongson-eiointc: Route interrupt parsed from bios table
git bisect bad 7fb83eb664e9b3a0438dd28859e9f0fd49d4c165
# bad: [7ee4a5a2ec3748facfb4ca96e4cce6cabbdecab2] irqchip/sg2042-msi: Set MSI_FLAG_MULTI_PCI_MSI flags for SG2044
git bisect bad 7ee4a5a2ec3748facfb4ca96e4cce6cabbdecab2
# bad: [9d8c41816bac518b4824f83b346ae30a1be83f68] irqchip/sg2042-msi: Fix broken affinity setting
git bisect bad 9d8c41816bac518b4824f83b346ae30a1be83f68
# bad: [54f45a30c0d0153d2be091ba2d683ab6db6d1d5b] PCI/MSI: Add startup/shutdown for per device domains
git bisect bad 54f45a30c0d0153d2be091ba2d683ab6db6d1d5b
# first bad commit: [54f45a30c0d0153d2be091ba2d683ab6db6d1d5b] PCI/MSI: Add startup/shutdown for per device domains
* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
2025-08-26 19:45 ` [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains Anders Roxell
2025-08-26 22:09 ` Nathan Chancellor
@ 2025-08-26 22:33 ` Inochi Amaoto
2025-08-26 23:28 ` Inochi Amaoto
2025-08-27 9:44 ` Inochi Amaoto
2 siblings, 1 reply; 9+ messages in thread
From: Inochi Amaoto @ 2025-08-26 22:33 UTC (permalink / raw)
To: Anders Roxell, Inochi Amaoto, regressions, linux-next
Cc: Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
benjamin.copeland
On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:
> On 2025-08-14 07:28, Inochi Amaoto wrote:
> > As the RISC-V PLIC cannot apply an affinity setting without calling
> > irq_enable(), it will make the interrupt unavailable when used as
> > an underlying IRQ chip for an MSI controller.
> >
> > Implement .irq_startup() and .irq_shutdown() for the PCI MSI and
> > MSI-X templates. For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT,
> > these startup and shutdown the parent as well, which allows the
> > irq on the parent chip to be enabled if the irq is not enabled
> > when allocating. This is necessary for the MSI controllers which
> > use PLIC as underlying IRQ chip.
> >
> > Suggested-by: Thomas Gleixner <tglx@linutronix•de>
> > Signed-off-by: Inochi Amaoto <inochiama@gmail•com>
>
> Regressions found while booting the Linux next-20250826 on the
> qemu-arm64, qemu-armv7 due to following kernel log.
>
> Bisection identified this commit as the cause of the regression.
>
> Regression Analysis:
> - New regression? Yes
> - Reproducible? Yes
>
> First seen on the next-20250826
> Good: next-20250825
> Bad: next-20250826
>
> Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
> qemu-armv7.
>
> Expected behavior: System should boot normally and virtio block devices
> should be detected and initialized immediately.
>
> Actual behavior: System hangs for ~30 seconds during virtio block device
> initialization before showing scheduler deadline replenish errors and
> failing to complete boot.
>
> Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>
>
> [...]
> <6>[ 1.369038] virtio-pci 0000:00:01.0: enabling device (0000 -> 0003)
> <6>[ 1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> <6>[ 1.450858] msm_serial: driver initialized
> <6>[ 1.454489] SuperH (H)SCI(F) driver initialized
> <6>[ 1.456056] STM32 USART driver initialized
> <6>[ 1.513325] loop: module loaded
> <6>[ 1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
> <5>[ 1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical blocks (2.76 GB/2.57 GiB)
> <4>[ 29.761219] sched: DL replenish lagged too much
> [here it hangs]
>
>
> Reverting this commit restores normal boot behavior.
>
>
> qemu-arm64
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/log
>
> qemu-armv7
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663615/suite/boot/test/gcc-13-lkftconfig/log
>
> ## Source
> * Git tree:
> * https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
> * Git sha: d0630b758e593506126e8eda6c3d56097d1847c5
> * Git describe: next-20250826
> * Project details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826
> * Architectures: arm64
> * Toolchains: gcc-13
> * Kconfigs: gcc-13-lkftconfig
>
>
> ## Build
> * Test history: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/history/
> * Test link: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/31oo1cMOi0uSNKYApi80iQahbLi
> * Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/
> * Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/config
>
Is there a link for me to get the command line args for qemu? So I can
reproduce it locally.
Regards,
Inochi
* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
2025-08-26 22:33 ` Inochi Amaoto
@ 2025-08-26 23:28 ` Inochi Amaoto
2025-08-27 0:47 ` Nathan Chancellor
0 siblings, 1 reply; 9+ messages in thread
From: Inochi Amaoto @ 2025-08-26 23:28 UTC (permalink / raw)
To: Anders Roxell, Nathan Chancellor, Inochi Amaoto, regressions,
linux-next
Cc: Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
benjamin.copeland
On Wed, Aug 27, 2025 at 06:33:44AM +0800, Inochi Amaoto wrote:
> On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:
> > On 2025-08-14 07:28, Inochi Amaoto wrote:
> > > As the RISC-V PLIC cannot apply an affinity setting without calling
> > > irq_enable(), it will make the interrupt unavailable when used as
> > > an underlying IRQ chip for an MSI controller.
> > >
> > > Implement .irq_startup() and .irq_shutdown() for the PCI MSI and
> > > MSI-X templates. For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT,
> > > these startup and shutdown the parent as well, which allows the
> > > irq on the parent chip to be enabled if the irq is not enabled
> > > when allocating. This is necessary for the MSI controllers which
> > > use PLIC as underlying IRQ chip.
> > >
> > > Suggested-by: Thomas Gleixner <tglx@linutronix•de>
> > > Signed-off-by: Inochi Amaoto <inochiama@gmail•com>
> >
> > Regressions found while booting the Linux next-20250826 on the
> > qemu-arm64, qemu-armv7 due to following kernel log.
> >
> > Bisection identified this commit as the cause of the regression.
> >
> > Regression Analysis:
> > - New regression? Yes
> > - Reproducible? Yes
> >
> > First seen on the next-20250826
> > Good: next-20250825
> > Bad: next-20250826
> >
> > Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
> > qemu-armv7.
> >
> > Expected behavior: System should boot normally and virtio block devices
> > should be detected and initialized immediately.
> >
> > Actual behavior: System hangs for ~30 seconds during virtio block device
> > initialization before showing scheduler deadline replenish errors and
> > failing to complete boot.
> >
> > Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>
> >
> > [...]
> > <6>[ 1.369038] virtio-pci 0000:00:01.0: enabling device (0000 -> 0003)
> > <6>[ 1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> > <6>[ 1.450858] msm_serial: driver initialized
> > <6>[ 1.454489] SuperH (H)SCI(F) driver initialized
> > <6>[ 1.456056] STM32 USART driver initialized
> > <6>[ 1.513325] loop: module loaded
> > <6>[ 1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
> > <5>[ 1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical blocks (2.76 GB/2.57 GiB)
> > <4>[ 29.761219] sched: DL replenish lagged too much
> > [here it hangs]
> >
> >
> > Reverting this commit restores normal boot behavior.
> >
> >
> > qemu-arm64
> > - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/log
> >
> > qemu-armv7
> > - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663615/suite/boot/test/gcc-13-lkftconfig/log
> >
> > ## Source
> > * Git tree:
> > * https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
> > * Git sha: d0630b758e593506126e8eda6c3d56097d1847c5
> > * Git describe: next-20250826
> > * Project details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826
> > * Architectures: arm64
> > * Toolchains: gcc-13
> > * Kconfigs: gcc-13-lkftconfig
> >
> >
> > ## Build
> > * Test history: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/history/
> > * Test link: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/31oo1cMOi0uSNKYApi80iQahbLi
> > * Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/
> > * Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/config
> >
>
> Is there a link for me to get the command line args for qemu? So I can
> reproduce it locally.
>
OK, I guess I know why: I missed one condition for startup.
Could you test the following patch? If it works, I will send it as
a fix.
---
diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
index e0a800f918e8..b11b7f63f0d6 100644
--- a/drivers/pci/msi/irqdomain.c
+++ b/drivers/pci/msi/irqdomain.c
@@ -154,6 +154,8 @@ static void cond_shutdown_parent(struct irq_data *data)
 
 	if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
 		irq_chip_shutdown_parent(data);
+	else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
+		irq_chip_mask_parent(data);
 }
 
 static unsigned int cond_startup_parent(struct irq_data *data)
@@ -162,6 +164,9 @@ static unsigned int cond_startup_parent(struct irq_data *data)
 
 	if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
 		return irq_chip_startup_parent(data);
+	else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
+		irq_chip_unmask_parent(data);
+
 	return 0;
 }
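
For readers following along, the effect of the extra branch in the diff
above can be modelled outside the kernel. The sketch below is a
user-space illustration only, not kernel code: the MODEL_* bit values,
the parent_action enum, and both function names are made up for this
example. It assumes, as the diff suggests, that a domain setting only
MSI_FLAG_PCI_MSI_MASK_PARENT had nothing done to its parent chip on the
startup path before the change.

```c
/* Hypothetical bit positions; the real flags live in include/linux/msi.h. */
#define MODEL_STARTUP_PARENT (1u << 0) /* stands in for MSI_FLAG_PCI_MSI_STARTUP_PARENT */
#define MODEL_MASK_PARENT    (1u << 1) /* stands in for MSI_FLAG_PCI_MSI_MASK_PARENT */

/* What the startup path asks the parent irqchip to do in this model. */
enum parent_action {
	PARENT_UNTOUCHED,
	PARENT_STARTED,
	PARENT_UNMASKED,
};

/* Models cond_startup_parent() without the diff above applied. */
static enum parent_action startup_without_patch(unsigned int flags)
{
	if (flags & MODEL_STARTUP_PARENT)
		return PARENT_STARTED;

	return PARENT_UNTOUCHED;
}

/* Models cond_startup_parent() with the diff above applied. */
static enum parent_action startup_with_patch(unsigned int flags)
{
	if (flags & MODEL_STARTUP_PARENT)
		return PARENT_STARTED;
	else if (flags & MODEL_MASK_PARENT)
		return PARENT_UNMASKED;

	return PARENT_UNTOUCHED;
}
```

In this model, startup_without_patch(MODEL_MASK_PARENT) leaves the
parent untouched, while startup_with_patch(MODEL_MASK_PARENT) unmasks
it. That would be consistent with the interrupt timeouts reported in
this thread, although the model itself proves nothing about the real
hardware paths.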
* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
2025-08-26 23:28 ` Inochi Amaoto
@ 2025-08-27 0:47 ` Nathan Chancellor
2025-08-27 8:17 ` Naresh Kamboju
0 siblings, 1 reply; 9+ messages in thread
From: Nathan Chancellor @ 2025-08-27 0:47 UTC (permalink / raw)
To: Inochi Amaoto
Cc: Anders Roxell, regressions, linux-next, Thomas Gleixner,
Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi, Shradha Gupta,
Haiyang Zhang, Jonathan Cameron, Juergen Gross, Nicolin Chen,
Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci, Yixun Lan,
Longbin Li, arnd, dan.carpenter, naresh.kamboju,
benjamin.copeland
On Wed, Aug 27, 2025 at 07:28:46AM +0800, Inochi Amaoto wrote:
> OK, I guess I know why: I have missed one condition for startup.
>
> Could you test the following patch? If worked, I will send it as
> a fix.
Yes, that appears to resolve the issue on one system. I cannot test the
other at the moment since it is under load.
Tested-by: Nathan Chancellor <nathan@kernel•org>
> ---
> diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
> index e0a800f918e8..b11b7f63f0d6 100644
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -154,6 +154,8 @@ static void cond_shutdown_parent(struct irq_data *data)
>
>  	if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
>  		irq_chip_shutdown_parent(data);
> +	else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
> +		irq_chip_mask_parent(data);
>  }
>
>  static unsigned int cond_startup_parent(struct irq_data *data)
> @@ -162,6 +164,9 @@ static unsigned int cond_startup_parent(struct irq_data *data)
>
>  	if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
>  		return irq_chip_startup_parent(data);
> +	else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
> +		irq_chip_unmask_parent(data);
> +
>  	return 0;
>  }
>
* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
2025-08-27 0:47 ` Nathan Chancellor
@ 2025-08-27 8:17 ` Naresh Kamboju
2025-08-27 9:45 ` Inochi Amaoto
0 siblings, 1 reply; 9+ messages in thread
From: Naresh Kamboju @ 2025-08-27 8:17 UTC (permalink / raw)
To: Nathan Chancellor
Cc: Inochi Amaoto, Anders Roxell, regressions, linux-next,
Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
Yixun Lan, Longbin Li, arnd, dan.carpenter, benjamin.copeland
On Wed, 27 Aug 2025 at 06:17, Nathan Chancellor <nathan@kernel•org> wrote:
>
> On Wed, Aug 27, 2025 at 07:28:46AM +0800, Inochi Amaoto wrote:
> > OK, I guess I know why: I have missed one condition for startup.
> >
> > Could you test the following patch? If worked, I will send it as
> > a fix.
>
> Yes, that appears to resolve the issue on one system. I cannot test the
> other at the moment since it is under load.
I built the patch on top of the Linux next-20250826 tag; the qemu-arm64
boot test and the LTP smoke test both pass.
>
> Tested-by: Nathan Chancellor <nathan@kernel•org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro•org>
>
> > ---
> > diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
> > index e0a800f918e8..b11b7f63f0d6 100644
> > --- a/drivers/pci/msi/irqdomain.c
> > +++ b/drivers/pci/msi/irqdomain.c
> > @@ -154,6 +154,8 @@ static void cond_shutdown_parent(struct irq_data *data)
> >
> > if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
> > irq_chip_shutdown_parent(data);
> > + else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
> > + irq_chip_mask_parent(data);
> > }
> >
> > static unsigned int cond_startup_parent(struct irq_data *data)
> > @@ -162,6 +164,9 @@ static unsigned int cond_startup_parent(struct irq_data *data)
> >
> > if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
> > return irq_chip_startup_parent(data);
> > + else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
> > + irq_chip_unmask_parent(data);
> > +
> > return 0;
> > }
> >
* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
2025-08-26 19:45 ` [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains Anders Roxell
2025-08-26 22:09 ` Nathan Chancellor
2025-08-26 22:33 ` Inochi Amaoto
@ 2025-08-27 9:44 ` Inochi Amaoto
2 siblings, 0 replies; 9+ messages in thread
From: Inochi Amaoto @ 2025-08-27 9:44 UTC (permalink / raw)
To: Anders Roxell, Inochi Amaoto, regressions, linux-next
Cc: Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
benjamin.copeland
On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:
> On 2025-08-14 07:28, Inochi Amaoto wrote:
> > As the RISC-V PLIC cannot apply an affinity setting without calling
> > irq_enable(), it will make the interrupt unavailable when used as
> > an underlying IRQ chip for an MSI controller.
> >
> > Implement .irq_startup() and .irq_shutdown() for the PCI MSI and
> > MSI-X templates. For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT,
> > these startup and shutdown the parent as well, which allows the
> > irq on the parent chip to be enabled if the irq is not enabled
> > when allocating. This is necessary for the MSI controllers which
> > use PLIC as underlying IRQ chip.
> >
> > Suggested-by: Thomas Gleixner <tglx@linutronix•de>
> > Signed-off-by: Inochi Amaoto <inochiama@gmail•com>
>
> Regressions found while booting the Linux next-20250826 on the
> qemu-arm64, qemu-armv7 due to following kernel log.
>
> Bisection identified this commit as the cause of the regression.
>
> Regression Analysis:
> - New regression? Yes
> - Reproducible? Yes
>
> First seen on the next-20250826
> Good: next-20250825
> Bad: next-20250826
>
> Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
> qemu-armv7.
>
> Expected behavior: System should boot normally and virtio block devices
> should be detected and initialized immediately.
>
> Actual behavior: System hangs for ~30 seconds during virtio block device
> initialization before showing scheduler deadline replenish errors and
> failing to complete boot.
>
> Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>
>
> [...]
> <6>[ 1.369038] virtio-pci 0000:00:01.0: enabling device (0000 -> 0003)
> <6>[ 1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> <6>[ 1.450858] msm_serial: driver initialized
> <6>[ 1.454489] SuperH (H)SCI(F) driver initialized
> <6>[ 1.456056] STM32 USART driver initialized
> <6>[ 1.513325] loop: module loaded
> <6>[ 1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
> <5>[ 1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical blocks (2.76 GB/2.57 GiB)
> <4>[ 29.761219] sched: DL replenish lagged too much
> [here it hangs]
>
>
> Reverting this commit restores normal boot behavior.
>
>
> qemu-arm64
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/log
>
> qemu-armv7
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663615/suite/boot/test/gcc-13-lkftconfig/log
>
> ## Source
> * Git tree:
> * https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
> * Git sha: d0630b758e593506126e8eda6c3d56097d1847c5
> * Git describe: next-20250826
> * Project details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826
> * Architectures: arm64
> * Toolchains: gcc-13
> * Kconfigs: gcc-13-lkftconfig
>
>
> ## Build
> * Test history: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/history/
> * Test link: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/31oo1cMOi0uSNKYApi80iQahbLi
> * Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/
> * Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/config
>
> --
> Linaro LKFT
> https://lkft.linaro.org
Fix patch is here:
https://lore.kernel.org/all/20250827062911.203106-1-inochiama@gmail.com/
Regards,
Inochi
* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
2025-08-27 8:17 ` Naresh Kamboju
@ 2025-08-27 9:45 ` Inochi Amaoto
0 siblings, 0 replies; 9+ messages in thread
From: Inochi Amaoto @ 2025-08-27 9:45 UTC (permalink / raw)
To: Naresh Kamboju, Nathan Chancellor
Cc: Inochi Amaoto, Anders Roxell, regressions, linux-next,
Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
Yixun Lan, Longbin Li, arnd, dan.carpenter, benjamin.copeland
On Wed, Aug 27, 2025 at 01:47:14PM +0530, Naresh Kamboju wrote:
> On Wed, 27 Aug 2025 at 06:17, Nathan Chancellor <nathan@kernel•org> wrote:
> >
> > On Wed, Aug 27, 2025 at 07:28:46AM +0800, Inochi Amaoto wrote:
> > > OK, I guess I know why: I have missed one condition for startup.
> > >
> > > Could you test the following patch? If worked, I will send it as
> > > a fix.
> >
> > Yes, that appears to resolve the issue on one system. I cannot test the
> > other at the moment since it is under load.
>
> I have built on top of Linux next-20250826 tag and the qemu-arm64 boot test
> pass and LTP smoke test also pass.
>
> >
> > Tested-by: Nathan Chancellor <nathan@kernel•org>
>
> Tested-by: Linux Kernel Functional Testing <lkft@linaro•org>
>
Thanks for your tag. Could you resend it in reply to the patch at the
following URL? I have sent a fix patch there. Thanks.
https://lore.kernel.org/all/20250827062911.203106-1-inochiama@gmail.com/
Regards,
Inochi
* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
2025-08-26 22:09 ` Nathan Chancellor
@ 2025-08-27 10:33 ` Mark Brown
0 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2025-08-27 10:33 UTC (permalink / raw)
To: Nathan Chancellor
Cc: Anders Roxell, Inochi Amaoto, regressions, linux-next,
Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
benjamin.copeland
On Tue, Aug 26, 2025 at 03:09:59PM -0700, Nathan Chancellor wrote:
> On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:
> > <5>[ 1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical blocks (2.76 GB/2.57 GiB)
> > <4>[ 29.761219] sched: DL replenish lagged too much
> > [here it hangs]
> FWIW, I am also seeing this on real arm64 hardware (an LX2160A board and
> an Ampere Altra one) but with my NVMe drives failing to be recognized.
> In somewhat ironic fashion, I am seeing the message from cover letter
> repeating.
> nvme nvme0: I/O tag 8 (1008) QID 0 timeout, completion polled
> [ 125.810062] dracut-initqueue[640]: Timed out while waiting for udev queue to empty.
> nvme nvme0: I/O tag 9 (1009) QID 0 timeout, completion polled
> I am happy to test patches or provide information.
Same here, it's breaking at least Orion O6.