* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
       [not found] ` <20250813232835.43458-3-inochiama@gmail.com>
@ 2025-08-26 19:45   ` Anders Roxell
  2025-08-26 22:09     ` Nathan Chancellor
                       ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Anders Roxell @ 2025-08-26 19:45 UTC (permalink / raw)
  To: Inochi Amaoto, regressions, linux-next
  Cc: Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
	Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
	Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
	Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
	benjamin.copeland

On 2025-08-14 07:28, Inochi Amaoto wrote:
> As the RISC-V PLIC can not apply affinity setting without calling
> irq_enable(), it will make the interrupt unavailable when used as
> an underlying IRQ chip for MSI controller.
> 
> Implement .irq_startup() and .irq_shutdown() for the PCI MSI and
> MSI-X templates. For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT,
> these startup and shutdown the parent as well, which allows the
> irq on the parent chip to be enabled if the irq is not enabled
> when allocating. This is necessary for the MSI controllers which
> use PLIC as underlying IRQ chip.
> 
> Suggested-by: Thomas Gleixner <tglx@linutronix•de>
> Signed-off-by: Inochi Amaoto <inochiama@gmail•com>

Regressions were found while booting Linux next-20250826 on qemu-arm64
and qemu-armv7; the relevant kernel log is shown below.

Bisection identified this commit as the cause of the regression.

Regression Analysis:
- New regression? Yes
- Reproducible? Yes

First seen on the next-20250826
Good: next-20250825
Bad: next-20250826

Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
qemu-armv7.

Expected behavior: System should boot normally and virtio block devices
should be detected and initialized immediately.

Actual behavior: System hangs for ~30 seconds during virtio block device
initialization before showing scheduler deadline replenish errors and
failing to complete boot.

Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>

[...]
<6>[    1.369038] virtio-pci 0000:00:01.0: enabling device (0000 -> 0003)
<6>[    1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
<6>[    1.450858] msm_serial: driver initialized
<6>[    1.454489] SuperH (H)SCI(F) driver initialized
<6>[    1.456056] STM32 USART driver initialized
<6>[    1.513325] loop: module loaded
<6>[    1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
<5>[    1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical blocks (2.76 GB/2.57 GiB)
<4>[   29.761219] sched: DL replenish lagged too much
[here it hangs]


Reverting this commit restores normal boot behavior.


qemu-arm64
 - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/log

qemu-armv7
 - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663615/suite/boot/test/gcc-13-lkftconfig/log

## Source
* Git tree: https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
* Git sha: d0630b758e593506126e8eda6c3d56097d1847c5
* Git describe: next-20250826
* Project details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826
* Architectures: arm64
* Toolchains: gcc-13
* Kconfigs: gcc-13-lkftconfig


## Build
* Test history: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/history/
* Test link: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/31oo1cMOi0uSNKYApi80iQahbLi
* Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/
* Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/config

--
Linaro LKFT
https://lkft.linaro.org


* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
  2025-08-26 19:45   ` [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains Anders Roxell
@ 2025-08-26 22:09     ` Nathan Chancellor
  2025-08-27 10:33       ` Mark Brown
  2025-08-26 22:33     ` Inochi Amaoto
  2025-08-27  9:44     ` Inochi Amaoto
  2 siblings, 1 reply; 9+ messages in thread
From: Nathan Chancellor @ 2025-08-26 22:09 UTC (permalink / raw)
  To: Anders Roxell
  Cc: Inochi Amaoto, regressions, linux-next, Thomas Gleixner,
	Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi, Shradha Gupta,
	Haiyang Zhang, Jonathan Cameron, Juergen Gross, Nicolin Chen,
	Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci, Yixun Lan,
	Longbin Li, arnd, dan.carpenter, naresh.kamboju,
	benjamin.copeland

On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:
> Regressions found while booting the Linux next-20250826 on the
> qemu-arm64, qemu-armv7 due to following kernel log.
> 
> Bisection identified this commit as the cause of the regression.
> 
> Regression Analysis:
> - New regression? Yes
> - Reproducible? Yes
> 
> First seen on the next-20250826
> Good: next-20250825
> Bad: next-20250826
> 
> Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
> qemu-armv7.
> 
> Expected behavior: System should boot normally and virtio block devices
> should be detected and initialized immediately.
> 
> Actual behavior: System hangs for ~30 seconds during virtio block device
> initialization before showing scheduler deadline replenish errors and
> failing to complete boot.
> 
> Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>
> 
> [...]
> <6>[    1.369038] virtio-pci 0000:00:01.0: enabling device (0000 ->
> 0003)
> <6>[    1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing
> enabled
> <6>[    1.450858] msm_serial: driver initialized
> <6>[    1.454489] SuperH (H)SCI(F) driver initialized
> <6>[    1.456056] STM32 USART driver initialized
> <6>[    1.513325] loop: module loaded
> <6>[    1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
> <5>[    1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical
> blocks (2.76 GB/2.57 GiB)
> <4>[   29.761219] sched: DL replenish lagged too much
> [here it hangs]

FWIW, I am also seeing this on real arm64 hardware (an LX2160A board and
an Ampere Altra one) but with my NVMe drives failing to be recognized.
In somewhat ironic fashion, I am seeing the message from the cover
letter repeating.

  nvme nvme0: I/O tag 8 (1008) QID 0 timeout, completion polled
  [  125.810062] dracut-initqueue[640]: Timed out while waiting for udev queue to empty.
  nvme nvme0: I/O tag 9 (1009) QID 0 timeout, completion polled

I am happy to test patches or provide information.

Cheers,
Nathan

# bad: [d0630b758e593506126e8eda6c3d56097d1847c5] Add linux-next specific files for 20250826
# good: [b6add54ba61890450fa54fd9327d10fdfd653439] Merge tag 'pinctrl-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
git bisect start 'd0630b758e593506126e8eda6c3d56097d1847c5' 'b6add54ba61890450fa54fd9327d10fdfd653439'
# good: [968d16786392f6e047329f5eff66acc131636019] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
git bisect good 968d16786392f6e047329f5eff66acc131636019
# good: [042e9f528d5362c499b5d8e2716cf6f64ca53add] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394.git
git bisect good 042e9f528d5362c499b5d8e2716cf6f64ca53add
# bad: [beebb75399dc36e7c244db0a08426053b4581ecc] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git
git bisect bad beebb75399dc36e7c244db0a08426053b4581ecc
# good: [62df8fb299358a45a915381de09025cf5e6a4a8f] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git
git bisect good 62df8fb299358a45a915381de09025cf5e6a4a8f
# bad: [1e6d2dcb13c8d94b56de1eff60235ca90587046b] Merge branch 'master' of https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
git bisect bad 1e6d2dcb13c8d94b56de1eff60235ca90587046b
# bad: [a0daa9e939dbcd7767090151771d94ade75a4fd5] Merge branch into tip/master: 'x86/build'
git bisect bad a0daa9e939dbcd7767090151771d94ade75a4fd5
# bad: [d147a3db0dfa15c8e460f007128bd0fe2e1b877f] Merge branch into tip/master: 'perf/core'
git bisect bad d147a3db0dfa15c8e460f007128bd0fe2e1b877f
# good: [be5697d7136525a91e7f30fdca2e7de737d9a8ed] Merge branch into tip/master: 'irq/core'
git bisect good be5697d7136525a91e7f30fdca2e7de737d9a8ed
# good: [5d299897f1e36025400ca84fd36c15925a383b03] perf: Split out the RB allocation
git bisect good 5d299897f1e36025400ca84fd36c15925a383b03
# bad: [7fb83eb664e9b3a0438dd28859e9f0fd49d4c165] irqchip/loongson-eiointc: Route interrupt parsed from bios table
git bisect bad 7fb83eb664e9b3a0438dd28859e9f0fd49d4c165
# bad: [7ee4a5a2ec3748facfb4ca96e4cce6cabbdecab2] irqchip/sg2042-msi: Set MSI_FLAG_MULTI_PCI_MSI flags for SG2044
git bisect bad 7ee4a5a2ec3748facfb4ca96e4cce6cabbdecab2
# bad: [9d8c41816bac518b4824f83b346ae30a1be83f68] irqchip/sg2042-msi: Fix broken affinity setting
git bisect bad 9d8c41816bac518b4824f83b346ae30a1be83f68
# bad: [54f45a30c0d0153d2be091ba2d683ab6db6d1d5b] PCI/MSI: Add startup/shutdown for per device domains
git bisect bad 54f45a30c0d0153d2be091ba2d683ab6db6d1d5b
# first bad commit: [54f45a30c0d0153d2be091ba2d683ab6db6d1d5b] PCI/MSI: Add startup/shutdown for per device domains


* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
  2025-08-26 19:45   ` [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains Anders Roxell
  2025-08-26 22:09     ` Nathan Chancellor
@ 2025-08-26 22:33     ` Inochi Amaoto
  2025-08-26 23:28       ` Inochi Amaoto
  2025-08-27  9:44     ` Inochi Amaoto
  2 siblings, 1 reply; 9+ messages in thread
From: Inochi Amaoto @ 2025-08-26 22:33 UTC (permalink / raw)
  To: Anders Roxell, Inochi Amaoto, regressions, linux-next
  Cc: Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
	Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
	Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
	Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
	benjamin.copeland

On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:
> On 2025-08-14 07:28, Inochi Amaoto wrote:
> > As the RISC-V PLIC can not apply affinity setting without calling
> > irq_enable(), it will make the interrupt unavailable when used as
> > an underlying IRQ chip for MSI controller.
> > 
> > Implement .irq_startup() and .irq_shutdown() for the PCI MSI and
> > MSI-X templates. For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT,
> > these startup and shutdown the parent as well, which allows the
> > irq on the parent chip to be enabled if the irq is not enabled
> > when allocating. This is necessary for the MSI controllers which
> > use PLIC as underlying IRQ chip.
> > 
> > Suggested-by: Thomas Gleixner <tglx@linutronix•de>
> > Signed-off-by: Inochi Amaoto <inochiama@gmail•com>
> 
> Regressions found while booting the Linux next-20250826 on the
> qemu-arm64, qemu-armv7 due to following kernel log.
> 
> Bisection identified this commit as the cause of the regression.
> 
> Regression Analysis:
> - New regression? Yes
> - Reproducible? Yes
> 
> First seen on the next-20250826
> Good: next-20250825
> Bad: next-20250826
> 
> Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
> qemu-armv7.
> 
> Expected behavior: System should boot normally and virtio block devices
> should be detected and initialized immediately.
> 
> Actual behavior: System hangs for ~30 seconds during virtio block device
> initialization before showing scheduler deadline replenish errors and
> failing to complete boot.
> 
> Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>
> 
> [...]
> <6>[    1.369038] virtio-pci 0000:00:01.0: enabling device (0000 ->
> 0003)
> <6>[    1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing
> enabled
> <6>[    1.450858] msm_serial: driver initialized
> <6>[    1.454489] SuperH (H)SCI(F) driver initialized
> <6>[    1.456056] STM32 USART driver initialized
> <6>[    1.513325] loop: module loaded
> <6>[    1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
> <5>[    1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical
> blocks (2.76 GB/2.57 GiB)
> <4>[   29.761219] sched: DL replenish lagged too much
> [here it hangs]
> 
> 
> Reverting this commit restores normal boot behavior.
> 
> 
> qemu-arm64
>  - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/log
> 
> qemu-armv7
>  - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663615/suite/boot/test/gcc-13-lkftconfig/log
> 
> ## Source
> * Git tree:
> * https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
> * Git sha: d0630b758e593506126e8eda6c3d56097d1847c5
> * Git describe: next-20250826
> * Project details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826
> * Architectures: arm64
> * Toolchains: gcc-13
> * Kconfigs: gcc-13-lkftconfig
> 
> 
> ## Build
> * Test history: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/history/
> * Test link: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/31oo1cMOi0uSNKYApi80iQahbLi
> * Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/
> * Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/config
> 

Is there a link where I can get the QEMU command-line arguments, so
that I can reproduce it locally?

Regards,
Inochi


* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
  2025-08-26 22:33     ` Inochi Amaoto
@ 2025-08-26 23:28       ` Inochi Amaoto
  2025-08-27  0:47         ` Nathan Chancellor
  0 siblings, 1 reply; 9+ messages in thread
From: Inochi Amaoto @ 2025-08-26 23:28 UTC (permalink / raw)
  To: Anders Roxell, Nathan Chancellor, Inochi Amaoto, regressions,
	linux-next
  Cc: Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
	Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
	Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
	Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
	benjamin.copeland

On Wed, Aug 27, 2025 at 06:33:44AM +0800, Inochi Amaoto wrote:
> On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:
> > On 2025-08-14 07:28, Inochi Amaoto wrote:
> > > As the RISC-V PLIC can not apply affinity setting without calling
> > > irq_enable(), it will make the interrupt unavailable when used as
> > > an underlying IRQ chip for MSI controller.
> > > 
> > > Implement .irq_startup() and .irq_shutdown() for the PCI MSI and
> > > MSI-X templates. For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT,
> > > these startup and shutdown the parent as well, which allows the
> > > irq on the parent chip to be enabled if the irq is not enabled
> > > when allocating. This is necessary for the MSI controllers which
> > > use PLIC as underlying IRQ chip.
> > > 
> > > Suggested-by: Thomas Gleixner <tglx@linutronix•de>
> > > Signed-off-by: Inochi Amaoto <inochiama@gmail•com>
> > 
> > Regressions found while booting the Linux next-20250826 on the
> > qemu-arm64, qemu-armv7 due to following kernel log.
> > 
> > Bisection identified this commit as the cause of the regression.
> > 
> > Regression Analysis:
> > - New regression? Yes
> > - Reproducible? Yes
> > 
> > First seen on the next-20250826
> > Good: next-20250825
> > Bad: next-20250826
> > 
> > Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
> > qemu-armv7.
> > 
> > Expected behavior: System should boot normally and virtio block devices
> > should be detected and initialized immediately.
> > 
> > Actual behavior: System hangs for ~30 seconds during virtio block device
> > initialization before showing scheduler deadline replenish errors and
> > failing to complete boot.
> > 
> > Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>
> > 
> > [...]
> > <6>[    1.369038] virtio-pci 0000:00:01.0: enabling device (0000 ->
> > 0003)
> > <6>[    1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing
> > enabled
> > <6>[    1.450858] msm_serial: driver initialized
> > <6>[    1.454489] SuperH (H)SCI(F) driver initialized
> > <6>[    1.456056] STM32 USART driver initialized
> > <6>[    1.513325] loop: module loaded
> > <6>[    1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
> > <5>[    1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical
> > blocks (2.76 GB/2.57 GiB)
> > <4>[   29.761219] sched: DL replenish lagged too much
> > [here it hangs]
> > 
> > 
> > Reverting this commit restores normal boot behavior.
> > 
> > 
> > qemu-arm64
> >  - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/log
> > 
> > qemu-armv7
> >  - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663615/suite/boot/test/gcc-13-lkftconfig/log
> > 
> > ## Source
> > * Git tree:
> > * https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
> > * Git sha: d0630b758e593506126e8eda6c3d56097d1847c5
> > * Git describe: next-20250826
> > * Project details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826
> > * Architectures: arm64
> > * Toolchains: gcc-13
> > * Kconfigs: gcc-13-lkftconfig
> > 
> > 
> > ## Build
> > * Test history: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/history/
> > * Test link: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/31oo1cMOi0uSNKYApi80iQahbLi
> > * Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/
> > * Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/config
> > 
> 
> Is there a link for me to get the command line args for qemu? So I can
> reproduce it locally.
> 

OK, I guess I know why: I have missed one condition for startup.

Could you test the following patch? If it works, I will send it as
a fix.

---
diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
index e0a800f918e8..b11b7f63f0d6 100644
--- a/drivers/pci/msi/irqdomain.c
+++ b/drivers/pci/msi/irqdomain.c
@@ -154,6 +154,8 @@ static void cond_shutdown_parent(struct irq_data *data)
 
 	if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
 		irq_chip_shutdown_parent(data);
+	else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
+		irq_chip_mask_parent(data);
 }
 
 static unsigned int cond_startup_parent(struct irq_data *data)
@@ -162,6 +164,9 @@ static unsigned int cond_startup_parent(struct irq_data *data)
 
 	if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
 		return irq_chip_startup_parent(data);
+	else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
+		irq_chip_unmask_parent(data);
+
 	return 0;
 }
 

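To make the intent of the diff above easier to follow, here is a minimal userspace sketch of the dispatch it introduces. It is not kernel code: the `MODEL_FLAG_*` values, the `model_*` structs and the helper names are all hypothetical stand-ins for `MSI_FLAG_PCI_MSI_STARTUP_PARENT`, `MSI_FLAG_PCI_MSI_MASK_PARENT` and the parent chip, and the assumption that starting the parent also unmasks it belongs to this model only. As I read the patch, the point is that a domain which sets only the mask-parent flag previously left its parent untouched at startup and shutdown.

```c
#include <stdbool.h>

/* Hypothetical flag values for this model; the kernel defines the real ones. */
#define MODEL_FLAG_STARTUP_PARENT 0x1u
#define MODEL_FLAG_MASK_PARENT    0x2u

/* Stand-in for the state of the parent interrupt chip. */
struct model_parent {
	bool started;
	bool masked;
};

struct model_irq {
	unsigned int flags;
	struct model_parent parent;
};

/* Create an interrupt whose parent begins masked and not started. */
static struct model_irq model_irq_init(unsigned int flags)
{
	struct model_irq irq = {
		.flags = flags,
		.parent = { .started = false, .masked = true },
	};
	return irq;
}

/* Same branch structure as cond_startup_parent() with the patch applied. */
static unsigned int model_cond_startup_parent(struct model_irq *irq)
{
	if (irq->flags & MODEL_FLAG_STARTUP_PARENT) {
		/* Model assumption: starting the parent also unmasks it. */
		irq->parent.started = true;
		irq->parent.masked = false;
		return 0;
	} else if (irq->flags & MODEL_FLAG_MASK_PARENT) {
		/* The new branch: unmask the parent without starting it. */
		irq->parent.masked = false;
	}

	return 0;
}

/* Same branch structure as cond_shutdown_parent() with the patch applied. */
static void model_cond_shutdown_parent(struct model_irq *irq)
{
	if (irq->flags & MODEL_FLAG_STARTUP_PARENT) {
		irq->parent.started = false;
		irq->parent.masked = true;
	} else if (irq->flags & MODEL_FLAG_MASK_PARENT) {
		/* The new branch: mask the parent again on shutdown. */
		irq->parent.masked = true;
	}
}
```

In this model, an interrupt created with only `MODEL_FLAG_MASK_PARENT` has its parent unmasked after `model_cond_startup_parent()` and masked again after `model_cond_shutdown_parent()`. With no flags set, the parent state is left alone.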

* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
  2025-08-26 23:28       ` Inochi Amaoto
@ 2025-08-27  0:47         ` Nathan Chancellor
  2025-08-27  8:17           ` Naresh Kamboju
  0 siblings, 1 reply; 9+ messages in thread
From: Nathan Chancellor @ 2025-08-27  0:47 UTC (permalink / raw)
  To: Inochi Amaoto
  Cc: Anders Roxell, regressions, linux-next, Thomas Gleixner,
	Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi, Shradha Gupta,
	Haiyang Zhang, Jonathan Cameron, Juergen Gross, Nicolin Chen,
	Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci, Yixun Lan,
	Longbin Li, arnd, dan.carpenter, naresh.kamboju,
	benjamin.copeland

On Wed, Aug 27, 2025 at 07:28:46AM +0800, Inochi Amaoto wrote:
> OK, I guess I know why: I have missed one condition for startup.
> 
> Could you test the following patch? If worked, I will send it as
> a fix.

Yes, that appears to resolve the issue on one system. I cannot test the
other at the moment since it is under load.

Tested-by: Nathan Chancellor <nathan@kernel•org>

> ---
> diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
> index e0a800f918e8..b11b7f63f0d6 100644
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -154,6 +154,8 @@ static void cond_shutdown_parent(struct irq_data *data)
>  
>  	if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
>  		irq_chip_shutdown_parent(data);
> +	else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
> +		irq_chip_mask_parent(data);
>  }
>  
>  static unsigned int cond_startup_parent(struct irq_data *data)
> @@ -162,6 +164,9 @@ static unsigned int cond_startup_parent(struct irq_data *data)
>  
>  	if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
>  		return irq_chip_startup_parent(data);
> +	else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
> +		irq_chip_unmask_parent(data);
> +
>  	return 0;
>  }
>  


* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
  2025-08-27  0:47         ` Nathan Chancellor
@ 2025-08-27  8:17           ` Naresh Kamboju
  2025-08-27  9:45             ` Inochi Amaoto
  0 siblings, 1 reply; 9+ messages in thread
From: Naresh Kamboju @ 2025-08-27  8:17 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Inochi Amaoto, Anders Roxell, regressions, linux-next,
	Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
	Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
	Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
	Yixun Lan, Longbin Li, arnd, dan.carpenter, benjamin.copeland

On Wed, 27 Aug 2025 at 06:17, Nathan Chancellor <nathan@kernel•org> wrote:
>
> On Wed, Aug 27, 2025 at 07:28:46AM +0800, Inochi Amaoto wrote:
> > OK, I guess I know why: I have missed one condition for startup.
> >
> > Could you test the following patch? If worked, I will send it as
> > a fix.
>
> Yes, that appears to resolve the issue on one system. I cannot test the
> other at the moment since it is under load.

I have built this on top of the Linux next-20250826 tag; the qemu-arm64
boot test passes and the LTP smoke test also passes.

>
> Tested-by: Nathan Chancellor <nathan@kernel•org>

Tested-by: Linux Kernel Functional Testing <lkft@linaro•org>

>
> > ---
> > diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
> > index e0a800f918e8..b11b7f63f0d6 100644
> > --- a/drivers/pci/msi/irqdomain.c
> > +++ b/drivers/pci/msi/irqdomain.c
> > @@ -154,6 +154,8 @@ static void cond_shutdown_parent(struct irq_data *data)
> >
> >       if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
> >               irq_chip_shutdown_parent(data);
> > +     else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
> > +             irq_chip_mask_parent(data);
> >  }
> >
> >  static unsigned int cond_startup_parent(struct irq_data *data)
> > @@ -162,6 +164,9 @@ static unsigned int cond_startup_parent(struct irq_data *data)
> >
> >       if (unlikely(info->flags & MSI_FLAG_PCI_MSI_STARTUP_PARENT))
> >               return irq_chip_startup_parent(data);
> > +     else if (unlikely(info->flags & MSI_FLAG_PCI_MSI_MASK_PARENT))
> > +             irq_chip_unmask_parent(data);
> > +
> >       return 0;
> >  }
> >


* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
  2025-08-26 19:45   ` [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains Anders Roxell
  2025-08-26 22:09     ` Nathan Chancellor
  2025-08-26 22:33     ` Inochi Amaoto
@ 2025-08-27  9:44     ` Inochi Amaoto
  2 siblings, 0 replies; 9+ messages in thread
From: Inochi Amaoto @ 2025-08-27  9:44 UTC (permalink / raw)
  To: Anders Roxell, Inochi Amaoto, regressions, linux-next
  Cc: Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
	Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
	Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
	Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
	benjamin.copeland

On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:
> On 2025-08-14 07:28, Inochi Amaoto wrote:
> > As the RISC-V PLIC can not apply affinity setting without calling
> > irq_enable(), it will make the interrupt unavailable when used as
> > an underlying IRQ chip for MSI controller.
> > 
> > Implement .irq_startup() and .irq_shutdown() for the PCI MSI and
> > MSI-X templates. For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT,
> > these startup and shutdown the parent as well, which allows the
> > irq on the parent chip to be enabled if the irq is not enabled
> > when allocating. This is necessary for the MSI controllers which
> > use PLIC as underlying IRQ chip.
> > 
> > Suggested-by: Thomas Gleixner <tglx@linutronix•de>
> > Signed-off-by: Inochi Amaoto <inochiama@gmail•com>
> 
> Regressions found while booting the Linux next-20250826 on the
> qemu-arm64, qemu-armv7 due to following kernel log.
> 
> Bisection identified this commit as the cause of the regression.
> 
> Regression Analysis:
> - New regression? Yes
> - Reproducible? Yes
> 
> First seen on the next-20250826
> Good: next-20250825
> Bad: next-20250826
> 
> Test regression: next-20250826 gcc-13 boot failed on qemu-arm64 and
> qemu-armv7.
> 
> Expected behavior: System should boot normally and virtio block devices
> should be detected and initialized immediately.
> 
> Actual behavior: System hangs for ~30 seconds during virtio block device
> initialization before showing scheduler deadline replenish errors and
> failing to complete boot.
> 
> Reported-by: Linux Kernel Functional Testing <lkft@linaro•org>
> 
> [...]
> <6>[    1.369038] virtio-pci 0000:00:01.0: enabling device (0000 ->
> 0003)
> <6>[    1.420097] Serial: 8250/16550 driver, 4 ports, IRQ sharing
> enabled
> <6>[    1.450858] msm_serial: driver initialized
> <6>[    1.454489] SuperH (H)SCI(F) driver initialized
> <6>[    1.456056] STM32 USART driver initialized
> <6>[    1.513325] loop: module loaded
> <6>[    1.515744] virtio_blk virtio0: 2/0/0 default/read/poll queues
> <5>[    1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical
> blocks (2.76 GB/2.57 GiB)
> <4>[   29.761219] sched: DL replenish lagged too much
> [here it hangs]
> 
> 
> Reverting this commit restores normal boot behavior.
> 
> 
> qemu-arm64
>  - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/log
> 
> qemu-armv7
>  - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663615/suite/boot/test/gcc-13-lkftconfig/log
> 
> ## Source
> * Git tree:
> * https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
> * Git sha: d0630b758e593506126e8eda6c3d56097d1847c5
> * Git describe: next-20250826
> * Project details: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826
> * Architectures: arm64
> * Toolchains: gcc-13
> * Kconfigs: gcc-13-lkftconfig
> 
> 
> ## Build
> * Test history: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250826/testrun/29663822/suite/boot/test/gcc-13-lkftconfig/history/
> * Test link: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/31oo1cMOi0uSNKYApi80iQahbLi
> * Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/
> * Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/31onzS5UmJVvvZucEhtB1veoJA1/config
> 
> --
> Linaro LKFT
> https://lkft.linaro.org

Fix patch is here:

https://lore.kernel.org/all/20250827062911.203106-1-inochiama@gmail.com/

Regards,
Inochi


* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
  2025-08-27  8:17           ` Naresh Kamboju
@ 2025-08-27  9:45             ` Inochi Amaoto
  0 siblings, 0 replies; 9+ messages in thread
From: Inochi Amaoto @ 2025-08-27  9:45 UTC (permalink / raw)
  To: Naresh Kamboju, Nathan Chancellor
  Cc: Inochi Amaoto, Anders Roxell, regressions, linux-next,
	Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
	Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
	Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
	Yixun Lan, Longbin Li, arnd, dan.carpenter, benjamin.copeland

On Wed, Aug 27, 2025 at 01:47:14PM +0530, Naresh Kamboju wrote:
> On Wed, 27 Aug 2025 at 06:17, Nathan Chancellor <nathan@kernel•org> wrote:
> >
> > On Wed, Aug 27, 2025 at 07:28:46AM +0800, Inochi Amaoto wrote:
> > > OK, I guess I know why: I have missed one condition for startup.
> > >
> > > Could you test the following patch? If worked, I will send it as
> > > a fix.
> >
> > Yes, that appears to resolve the issue on one system. I cannot test the
> > other at the moment since it is under load.
> 
> I have built on top of Linux next-20250826 tag and the qemu-arm64 boot test
> pass and LTP smoke test also pass.
> 
> >
> > Tested-by: Nathan Chancellor <nathan@kernel•org>
> 
> Tested-by: Linux Kernel Functional Testing <lkft@linaro•org>
> 

Thanks for your tag. Could you resend your tag in reply to the patch at
the following URL? I have sent a fix patch there. Thanks.

https://lore.kernel.org/all/20250827062911.203106-1-inochiama@gmail.com/

Regards,
Inochi


* Re: [PATCH v2 2/4] PCI/MSI: Add startup/shutdown for per device domains
  2025-08-26 22:09     ` Nathan Chancellor
@ 2025-08-27 10:33       ` Mark Brown
  0 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2025-08-27 10:33 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Anders Roxell, Inochi Amaoto, regressions, linux-next,
	Thomas Gleixner, Bjorn Helgaas, Marc Zyngier, Lorenzo Pieralisi,
	Shradha Gupta, Haiyang Zhang, Jonathan Cameron, Juergen Gross,
	Nicolin Chen, Jason Gunthorpe, Chen Wang, linux-kernel, linux-pci,
	Yixun Lan, Longbin Li, arnd, dan.carpenter, naresh.kamboju,
	benjamin.copeland


On Tue, Aug 26, 2025 at 03:09:59PM -0700, Nathan Chancellor wrote:
> On Tue, Aug 26, 2025 at 09:45:48PM +0200, Anders Roxell wrote:

> > <5>[    1.527859] virtio_blk virtio0: [vda] 5397504 512-byte logical
> > blocks (2.76 GB/2.57 GiB)
> > <4>[   29.761219] sched: DL replenish lagged too much
> > [here it hangs]

> FWIW, I am also seeing this on real arm64 hardware (an LX2160A board and
> an Ampere Altra one) but with my NVMe drives failing to be recognized.
> In somewhat ironic fashion, I am seeing the message from cover letter
> repeating.

>   nvme nvme0: I/O tag 8 (1008) QID 0 timeout, completion polled
>   [  125.810062] dracut-initqueue[640]: Timed out while waiting for udev queue to empty.
>   nvme nvme0: I/O tag 9 (1009) QID 0 timeout, completion polled

> I am happy to test patches or provide information.

Same here, it's breaking at least Orion O6.


