* Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
[not found] ` <20200826112333.992429909@linutronix.de>
@ 2020-09-25 13:54 ` Qian Cai
2020-09-26 12:38 ` Vasily Gorbik
0 siblings, 1 reply; 7+ messages in thread
From: Qian Cai @ 2020-09-25 13:54 UTC (permalink / raw)
To: Thomas Gleixner, LKML, Heiko Carstens, Vasily Gorbik,
Christian Borntraeger, linux-s390, Stephen Rothwell, linux-next
Cc: x86, Joerg Roedel, iommu, linux-hyperv, Haiyang Zhang,
Jon Derrick, Lu Baolu, Wei Liu, K. Y. Srinivasan,
Stephen Hemminger, Steve Wahl, Dimitri Sivanich, Russ Anderson,
linux-pci, Bjorn Helgaas, Lorenzo Pieralisi,
Konrad Rzeszutek Wilk, xen-devel, Juergen Gross, Boris Ostrovsky,
Stefano Stabellini, Marc Zyngier, Greg Kroah-Hartman,
Rafael J. Wysocki, Megha Dey, Jason Gunthorpe, Dave Jiang,
Alex Williamson, Jacob Pan, Baolu Lu, Kevin Tian, Dan Williams
On Wed, 2020-08-26 at 13:17 +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix•de>
>
> The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture
> requires them or not. Architectures which are fully utilizing hierarchical
> irq domains should never call into that code.
>
> It's not only architectures which depend on that by implementing one or
> more of the weak functions, there is also a bunch of drivers which relies
> on the weak functions which invoke msi_controller::setup_irq[s] and
> msi_controller::teardown_irq.
>
> Make the architectures and drivers which rely on them select them in Kconfig
> and if not selected replace them by stub functions which emit a warning and
> fail the PCI/MSI interrupt allocation.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix•de>
Today's linux-next will have some warnings on s390x:
.config: https://gitlab.com/cailca/linux-mm/-/blob/master/s390.config
WARNING: unmet direct dependencies detected for PCI_MSI_ARCH_FALLBACKS
Depends on [n]: PCI [=n]
Selected by [y]:
- S390 [=y]
WARNING: unmet direct dependencies detected for PCI_MSI_ARCH_FALLBACKS
Depends on [n]: PCI [=n]
Selected by [y]:
- S390 [=y]
> ---
> V2: Make the architectures (and drivers) which need the fallbacks select them
> and not the other way round (Bjorn).
> ---
> arch/ia64/Kconfig | 1 +
> arch/mips/Kconfig | 1 +
> arch/powerpc/Kconfig | 1 +
> arch/s390/Kconfig | 1 +
> arch/sparc/Kconfig | 1 +
> arch/x86/Kconfig | 1 +
> drivers/pci/Kconfig | 3 +++
> drivers/pci/controller/Kconfig | 3 +++
> drivers/pci/msi.c | 3 ++-
> include/linux/msi.h | 31 ++++++++++++++++++++++++++-----
> 10 files changed, 40 insertions(+), 6 deletions(-)
>
> --- a/arch/ia64/Kconfig
> +++ b/arch/ia64/Kconfig
> @@ -56,6 +56,7 @@ config IA64
> select NEED_DMA_MAP_STATE
> select NEED_SG_DMA_LENGTH
> select NUMA if !FLATMEM
> + select PCI_MSI_ARCH_FALLBACKS
> default y
> help
> The Itanium Processor Family is Intel's 64-bit successor to
> --- a/arch/mips/Kconfig
> +++ b/arch/mips/Kconfig
> @@ -86,6 +86,7 @@ config MIPS
> select MODULES_USE_ELF_REL if MODULES
> select MODULES_USE_ELF_RELA if MODULES && 64BIT
> select PERF_USE_VMALLOC
> + select PCI_MSI_ARCH_FALLBACKS
> select RTC_LIB
> select SYSCTL_EXCEPTION_TRACE
> select VIRT_TO_BUS
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -246,6 +246,7 @@ config PPC
> select OLD_SIGACTION if PPC32
> select OLD_SIGSUSPEND
> select PCI_DOMAINS if PCI
> + select PCI_MSI_ARCH_FALLBACKS
> select PCI_SYSCALL if PCI
> select PPC_DAWR if PPC64
> select RTC_LIB
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -185,6 +185,7 @@ config S390
> select OLD_SIGSUSPEND3
> select PCI_DOMAINS if PCI
> select PCI_MSI if PCI
> + select PCI_MSI_ARCH_FALLBACKS
> select SPARSE_IRQ
> select SYSCTL_EXCEPTION_TRACE
> select THREAD_INFO_IN_TASK
> --- a/arch/sparc/Kconfig
> +++ b/arch/sparc/Kconfig
> @@ -43,6 +43,7 @@ config SPARC
> select GENERIC_STRNLEN_USER
> select MODULES_USE_ELF_RELA
> select PCI_SYSCALL if PCI
> + select PCI_MSI_ARCH_FALLBACKS
> select ODD_RT_SIGACTION
> select OLD_SIGSUSPEND
> select CPU_NO_EFFICIENT_FFS
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -225,6 +225,7 @@ config X86
> select NEED_SG_DMA_LENGTH
> select PCI_DOMAINS if PCI
> select PCI_LOCKLESS_CONFIG if PCI
> + select PCI_MSI_ARCH_FALLBACKS
> select PERF_EVENTS
> select RTC_LIB
> select RTC_MC146818_LIB
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -56,6 +56,9 @@ config PCI_MSI_IRQ_DOMAIN
> depends on PCI_MSI
> select GENERIC_MSI_IRQ_DOMAIN
>
> +config PCI_MSI_ARCH_FALLBACKS
> + bool
> +
> config PCI_QUIRKS
> default y
> bool "Enable PCI quirk workarounds" if EXPERT
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -41,6 +41,7 @@ config PCI_TEGRA
> bool "NVIDIA Tegra PCIe controller"
> depends on ARCH_TEGRA || COMPILE_TEST
> depends on PCI_MSI_IRQ_DOMAIN
> + select PCI_MSI_ARCH_FALLBACKS
> help
> Say Y here if you want support for the PCIe host controller found
> on NVIDIA Tegra SoCs.
> @@ -67,6 +68,7 @@ config PCIE_RCAR_HOST
> bool "Renesas R-Car PCIe host controller"
> depends on ARCH_RENESAS || COMPILE_TEST
> depends on PCI_MSI_IRQ_DOMAIN
> + select PCI_MSI_ARCH_FALLBACKS
> help
> Say Y here if you want PCIe controller support on R-Car SoCs in host
> mode.
> @@ -103,6 +105,7 @@ config PCIE_XILINX_CPM
> bool "Xilinx Versal CPM host bridge support"
> depends on ARCH_ZYNQMP || COMPILE_TEST
> select PCI_HOST_COMMON
> + select PCI_MSI_ARCH_FALLBACKS
> help
> Say 'Y' here if you want kernel support for the
> Xilinx Versal CPM host bridge.
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -58,8 +58,8 @@ static void pci_msi_teardown_msi_irqs(st
> #define pci_msi_teardown_msi_irqs arch_teardown_msi_irqs
> #endif
>
> +#ifdef CONFIG_PCI_MSI_ARCH_FALLBACKS
> /* Arch hooks */
> -
> int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
> {
> struct msi_controller *chip = dev->bus->msi;
> @@ -132,6 +132,7 @@ void __weak arch_teardown_msi_irqs(struc
> {
> return default_teardown_msi_irqs(dev);
> }
> +#endif /* CONFIG_PCI_MSI_ARCH_FALLBACKS */
>
> static void default_restore_msi_irq(struct pci_dev *dev, int irq)
> {
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -193,17 +193,38 @@ void pci_msi_mask_irq(struct irq_data *d
> void pci_msi_unmask_irq(struct irq_data *data);
>
> /*
> - * The arch hooks to setup up msi irqs. Those functions are
> - * implemented as weak symbols so that they /can/ be overriden by
> - * architecture specific code if needed.
> + * The arch hooks to setup up msi irqs. Default functions are implemented
> + * as weak symbols so that they /can/ be overriden by architecture specific
> + * code if needed. These hooks must be enabled by the architecture or by
> + * drivers which depend on them via msi_controller based MSI handling.
> + *
> + * If CONFIG_PCI_MSI_ARCH_FALLBACKS is not selected they are replaced by
> + * stubs with warnings.
> */
> +#ifdef CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS
> int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc);
> void arch_teardown_msi_irq(unsigned int irq);
> int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
> void arch_teardown_msi_irqs(struct pci_dev *dev);
> -void arch_restore_msi_irqs(struct pci_dev *dev);
> -
> void default_teardown_msi_irqs(struct pci_dev *dev);
> +#else
> +static inline int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> +{
> + WARN_ON_ONCE(1);
> + return -ENODEV;
> +}
> +
> +static inline void arch_teardown_msi_irqs(struct pci_dev *dev)
> +{
> + WARN_ON_ONCE(1);
> +}
> +#endif
> +
> +/*
> + * The restore hooks are still available as they are useful even
> + * for fully irq domain based setups. Courtesy to XEN/X86.
> + */
> +void arch_restore_msi_irqs(struct pci_dev *dev);
> void default_restore_msi_irqs(struct pci_dev *dev);
>
> struct msi_controller {
>
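For readers skimming the quoted patch: the header change boils down to two build-time variants of the same hook. What follows is a minimal userspace sketch of that pattern, not kernel code. `struct fake_dev`, `fake_setup_msi_irqs()` and the `FAKE_ARCH_FALLBACKS` macro are hypothetical stand-ins for `struct pci_dev`, `arch_setup_msi_irqs()` and the Kconfig symbol, and the `fprintf()` call only approximates `WARN_ON_ONCE(1)`. The weak attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel type. */
struct fake_dev {
	const char *name;
};

/* Define FAKE_ARCH_FALLBACKS to model "select PCI_MSI_ARCH_FALLBACKS". */
#ifdef FAKE_ARCH_FALLBACKS
/* Weak default: another object file may supply a strong definition. */
__attribute__((weak)) int fake_setup_msi_irqs(struct fake_dev *dev, int nvec)
{
	assert(dev != NULL);
	return nvec > 0 ? 0 : -EINVAL;
}
#else
/* Stub: warn once and fail the allocation, roughly WARN_ON_ONCE + -ENODEV. */
static inline int fake_setup_msi_irqs(struct fake_dev *dev, int nvec)
{
	static int warned;

	assert(dev != NULL);
	(void)nvec;
	if (!warned) {
		warned = 1;
		fprintf(stderr, "WARNING: MSI arch fallback called for %s\n",
			dev->name);
	}
	return -ENODEV;
}
#endif
```

Built without `-DFAKE_ARCH_FALLBACKS`, every call fails with `-ENODEV` and warns once; built with it, the weak default is used unless a strong definition is linked in.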
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
2020-09-25 13:54 ` [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable Qian Cai
@ 2020-09-26 12:38 ` Vasily Gorbik
2020-09-28 10:11 ` Thomas Gleixner
0 siblings, 1 reply; 7+ messages in thread
From: Vasily Gorbik @ 2020-09-26 12:38 UTC (permalink / raw)
To: Thomas Gleixner, Qian Cai
Cc: LKML, Heiko Carstens, Christian Borntraeger, linux-s390,
Stephen Rothwell, linux-next, x86, Joerg Roedel, iommu,
linux-hyperv, Haiyang Zhang, Jon Derrick, Lu Baolu, Wei Liu,
K. Y. Srinivasan, Stephen Hemminger, Steve Wahl, Dimitri Sivanich,
Russ Anderson, linux-pci, Bjorn Helgaas, Lorenzo Pieralisi,
Konrad Rzeszutek Wilk, xen-devel, Juergen Gross, Boris Ostrovsky,
Stefano Stabellini, Marc Zyngier, Greg Kroah-Hartman,
Rafael J. Wysocki, Megha Dey, Jason Gunthorpe, Dave Jiang,
Alex Williamson, Jacob Pan, Baolu Lu, Kevin Tian, Dan Williams
On Fri, Sep 25, 2020 at 09:54:52AM -0400, Qian Cai wrote:
> On Wed, 2020-08-26 at 13:17 +0200, Thomas Gleixner wrote:
> > From: Thomas Gleixner <tglx@linutronix•de>
> >
> > The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture
> > requires them or not. Architectures which are fully utilizing hierarchical
> > irq domains should never call into that code.
> >
> > It's not only architectures which depend on that by implementing one or
> > more of the weak functions, there is also a bunch of drivers which relies
> > on the weak functions which invoke msi_controller::setup_irq[s] and
> > msi_controller::teardown_irq.
> >
> > Make the architectures and drivers which rely on them select them in Kconfig
> > and if not selected replace them by stub functions which emit a warning and
> > fail the PCI/MSI interrupt allocation.
> >
> > Signed-off-by: Thomas Gleixner <tglx@linutronix•de>
>
> Today's linux-next will have some warnings on s390x:
>
> .config: https://gitlab.com/cailca/linux-mm/-/blob/master/s390.config
>
> WARNING: unmet direct dependencies detected for PCI_MSI_ARCH_FALLBACKS
> Depends on [n]: PCI [=n]
> Selected by [y]:
> - S390 [=y]
>
> WARNING: unmet direct dependencies detected for PCI_MSI_ARCH_FALLBACKS
> Depends on [n]: PCI [=n]
> Selected by [y]:
> - S390 [=y]
>
Yes, as well as on mips and sparc which also don't FORCE_PCI.
This seems to work for s390:
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index b0b7acf07eb8..41136fbe909b 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -192,3 +192,3 @@ config S390
select PCI_MSI if PCI
- select PCI_MSI_ARCH_FALLBACKS
+ select PCI_MSI_ARCH_FALLBACKS if PCI
select SET_FS
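Why the `if PCI` qualifier silences the warning can be shown with a toy model. This is an illustrative sketch, not Kconfig's implementation: `enum tristate`, `tri_and()` and `unmet_dependency()` are hypothetical names. It assumes only the behaviour visible in the warning above, namely that a `select` forces the symbol on regardless of its `depends on` expression, and that the tool complains when the forced value exceeds what the dependencies allow.

```c
#include <assert.h>

/* Kconfig-style values: n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig-style AND is the minimum of both operands. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/*
 * selector:    value of the selecting symbol (e.g. S390)
 * select_cond: value of the "if ..." condition on the select,
 *              TRI_Y when the select is unconditional
 * depends:     value of the selected symbol's dependencies (e.g. PCI)
 *
 * Returns 1 when the select forces a value the dependencies do not allow.
 */
static int unmet_dependency(enum tristate selector,
			    enum tristate select_cond,
			    enum tristate depends)
{
	assert(selector >= TRI_N && selector <= TRI_Y);
	assert(select_cond >= TRI_N && select_cond <= TRI_Y);
	assert(depends >= TRI_N && depends <= TRI_Y);

	return tri_and(selector, select_cond) > depends;
}
```

With S390=y and PCI=n, the unconditional select corresponds to `unmet_dependency(TRI_Y, TRI_Y, TRI_N)`, which reports a problem. With `if PCI` the condition tracks PCI itself, so `unmet_dependency(TRI_Y, TRI_N, TRI_N)` reports none.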
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
2020-09-26 12:38 ` Vasily Gorbik
@ 2020-09-28 10:11 ` Thomas Gleixner
0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2020-09-28 10:11 UTC (permalink / raw)
To: Vasily Gorbik, Qian Cai
Cc: LKML, Heiko Carstens, Christian Borntraeger, linux-s390,
Stephen Rothwell, linux-next, x86, Joerg Roedel, iommu,
linux-hyperv, Haiyang Zhang, Jon Derrick, Lu Baolu, Wei Liu,
K. Y. Srinivasan, Stephen Hemminger, Steve Wahl, Dimitri Sivanich,
Russ Anderson, linux-pci, Bjorn Helgaas, Lorenzo Pieralisi,
Konrad Rzeszutek Wilk, xen-devel, Juergen Gross, Boris Ostrovsky,
Stefano Stabellini, Marc Zyngier, Greg Kroah-Hartman,
Rafael J. Wysocki, Megha Dey, Jason Gunthorpe, Dave Jiang,
Alex Williamson, Jacob Pan, Baolu Lu, Kevin Tian, Dan Williams
On Sat, Sep 26 2020 at 14:38, Vasily Gorbik wrote:
> On Fri, Sep 25, 2020 at 09:54:52AM -0400, Qian Cai wrote:
> Yes, as well as on mips and sparc which also don't FORCE_PCI.
> This seems to work for s390:
>
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index b0b7acf07eb8..41136fbe909b 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -192,3 +192,3 @@ config S390
> select PCI_MSI if PCI
> - select PCI_MSI_ARCH_FALLBACKS
> + select PCI_MSI_ARCH_FALLBACKS if PCI
> select SET_FS
lemme fix that for all of them ...
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI
[not found] <20200826111628.794979401@linutronix.de>
[not found] ` <20200826112333.992429909@linutronix.de>
@ 2020-09-25 15:29 ` Qian Cai
2020-09-25 15:49 ` Peter Zijlstra
1 sibling, 1 reply; 7+ messages in thread
From: Qian Cai @ 2020-09-25 15:29 UTC (permalink / raw)
To: Thomas Gleixner, LKML, Stephen Rothwell, linux-next
Cc: x86, Joerg Roedel, iommu, linux-hyperv, Haiyang Zhang,
Jon Derrick, Lu Baolu, Wei Liu, K. Y. Srinivasan,
Stephen Hemminger, Steve Wahl, Dimitri Sivanich, Russ Anderson,
linux-pci, Bjorn Helgaas, Lorenzo Pieralisi,
Konrad Rzeszutek Wilk, xen-devel, Juergen Gross, Boris Ostrovsky,
Stefano Stabellini, Marc Zyngier, Greg Kroah-Hartman,
Rafael J. Wysocki, Megha Dey, Jason Gunthorpe, Dave Jiang,
Alex Williamson, Jacob Pan, Baolu Lu, Kevin Tian, Dan Williams
On Wed, 2020-08-26 at 13:16 +0200, Thomas Gleixner wrote:
> This is the second version of providing a base to support device MSI (non
> PCI based) and on top of that support for IMS (Interrupt Message Storm)
> based devices in a halfways architecture independent way.
>
> The first version can be found here:
>
> https://lore.kernel.org/r/20200821002424.119492231@linutronix.de
>
> It's still a mixed bag of bug fixes, cleanups and general improvements
> which are worthwhile independent of device MSI.
Reverting part of this patchset on top of today's linux-next fixed a
boot issue on an HPE ProLiant DL560 Gen10, i.e.,
$ git revert --no-edit 13b90cadfc29..bc95fd0d7c42
.config: https://gitlab.com/cailca/linux-mm/-/blob/master/x86.config
It looks like the crashes happen in the interrupt remapping code, where only
partial call traces could be captured.
[ 1.912386][ T0] ACPI: X2APIC_NMI (uid[0xf5] high level 9983][ T0] ... MAX_LOCK_DEPTH: 48
[ 7.914876][ T0] ... MAX_LOCKDEP_KEYS: 8192
[ 7.919942][ T0] ... CLASSHASH_SIZE: 4096
[ 7.925009][ T0] ... MAX_LOCKDEP_ENTRIES: 32768
[ 7.930163][ T0] ... MAX_LOCKDEP_CHAINS: 65536
[ 7.935318][ T0] ... CHAINHASH_SIZE: 32768
[ 7.940473][ T0] memory used by lock dependency info: 6301 kB
[ 7.946586][ T0] memory used for stack traces: 4224 kB
[ 7.952088][ T0] per task-struct memory footprint: 1920 bytes
[ 7.968312][ T0] mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl
[ 7.980281][ T0] ACPI: Core revision 20200717
[ 7.993343][ T0] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635855245 ns
[ 8.003270][ T0] APIC: Switch to symmetric I/O mode setup
[ 8.008951][ T0] DMAR: Host address width 46
[ 8.013512][ T0] DMAR: DRHD base: 0x000000e5ffc000 flags: 0x0
[ 8.019680][ T0] DMAR: dmar0: reg_base_addr e5ffc000 ver 1:0 cap 8d2078c106f0466 [ T0] DMAR-IR: IOAPIC id 15 under DRHD base 0xe5ffc000 IOMMU 0
[ 8.420990][ T0] DMAR-IR: IOAPIC id 8 under DRHD base 0xddffc000 IOMMU 15
[ 8.428166][ T0] DMAR-IR: IOAPIC id 9 under DRHD base 0xddffc000 IOMMU 15
[ 8.435341][ T0] DMAR-IR: HPET id 0 under DRHD base 0xddffc000
[ 8.441456][ T0] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 8.457911][ T0] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 8.466614][ T0] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 8.474295][ T0] #PF: supervisor instruction fetch in kernel mode
[ 8.480669][ T0] #PF: error_code(0x0010) - not-present page
[ 8.486518][ T0] PGD 0 P4D 0
[ 8.489757][ T0] Oops: 0010 [#1] SMP KASAN PTI
[ 8.494476][ T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I 5.9.0-rc6-next-20200925 #2
[ 8.503987][ T0] Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560 Gen10, BIOS U34 11/13/2019
[ 8.513238][ T0] RIP: 0010:0x0
[ 8.516562][ T0] Code: Bad RIP v
or
[ 2.906744][ T0] ACPI: X2API32, address 0xfec68000, GSI 128-135
[ 2.907063][ T0] IOAPIC[15]: apic_id 29, version 32, address 0xfec70000, GSI 136-143
[ 2.907071][ T0] IOAPIC[16]: apic_id 30, version 32, address 0xfec78000, GSI 144-151
[ 2.907079][ T0] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 2.907084][ T0] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 2.907100][ T0] Using ACPI (MADT) for SMP configuration information
[ 2.907105][ T0] ACPI: HPET id: 0x8086a701 base: 0xfed00000
[ 2.907116][ T0] ACPI: SPCR: console: uart,mmio,0x0,115200
[ 2.907121][ T0] TSC deadline timer available
[ 2.907126][ T0] smpboot: Allowing 144 CPUs, 0 hotplug CPUs
[ 2.907163][ T0] [mem 0xd0000000-0xfdffffff] available for PCI devices
[ 2.907175][ T0] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[ 2.914541][ T0] setup_percpu: NR_CPUS:256 nr_cpumask_bits:144 nr_cpu_ids:144 nr_node_ids:4
[ 2.926109][ 466 ecap f020df
[ 9.134709][ T0] DMAR: DRHD base: 0x000000f5ffc000 flags: 0x0
[ 9.140867][ T0] DMAR: dmar8: reg_base_addr f5ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 9.149610][ T0] DMAR: DRHD base: 0x000000f7ffc000 flags: 0x0
[ 9.155762][ T0] DMAR: dmar9: reg_base_addr f7ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 9.164491][ T0] DMAR: DRHD base: 0x000000f9ffc000 flags: 0x0
[ 9.170645][ T0] DMAR: dmar10: reg_base_addr f9ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 9.179476][ T0] DMAR: DRHD base: 0x000000fbffc000 flags: 0x0
[ 9.185626][ T0] DMAR: dmar11: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 9.194442][ T0] DMAR: DRHD base: 0x000000dfffc000 flags: 0x0
[ 9.200587][ T0] DMAR: dmar12: reg_base_addr dfffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 9.209418][ T0] DMAR: DRHD base: 0x000000e1ffc000 flags: 0x0
[ 9.215551][ T0] DMAR: dmar13: reg_base_addr e1ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 9.224367][ T0] DMAR: DRHD base: 0x000000e3ffc83][ T0] msi_domain_alloc+0x8e/0x280
[ 9.615015][ T0] __irq_domain_a8992cd
[ 9.711906][ T0] R10: ffffffff85407d78 R11: fffffbfff18992cc R12: ffffffff8546ffc0
[ 9.719761][ T0] R13: 0000000000000098 R14: ffff888106e63a40 R15: 0000000000000001
[ 9.727617][ T0] FS: 0000000000000000(0000) GS:ffff8887df800000(0000) knlGS:0000000000000000
[ 9.736431][ T0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.742892][ T0] CR2: ffffffffffffffd6 CR3: 0000001ba7814001 CR4: 00000000000606b0
[ 9.750747][ T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9.758601][ T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9.766456][ T0] Kernel panic - not syncing: Fatal exception
[ 9.772547][ T0] ---[ end Kernel panic - not syncing: Fatal exception ]---
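A note on reading the first oops: an instruction fetch fault at address 0 with `RIP: 0010:0x0` is the usual signature of a call through a NULL function pointer. The truncated traces do not establish which pointer was NULL, so the following is only a generic sketch of that failure class and of the guard that avoids it. `struct fake_domain_ops`, `fake_default_init()` and `fake_domain_alloc()` are hypothetical names, not the kernel's MSI domain API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table with an optional callback that may be left NULL. */
struct fake_domain_ops {
	int (*init)(unsigned int virq);
};

/* Hypothetical default that a setup step could install for unset callbacks. */
static int fake_default_init(unsigned int virq)
{
	(void)virq;
	return 0;
}

static int fake_domain_alloc(const struct fake_domain_ops *ops,
			     unsigned int virq)
{
	assert(ops != NULL);

	/*
	 * Calling ops->init() unconditionally would jump to address 0
	 * when the callback was never filled in.
	 */
	if (!ops->init)
		return -EINVAL;

	return ops->init(virq);
}
```

Either checking the callback at the call site or installing defaults before first use prevents the jump to address 0. Which of the two applies to the code in this series is not something the logs above can tell.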
The working boot (without those patches) looks like this:
[ 1.913963][ T0] ACPI: X2APIC_NMI (uid[0xf4] high level lint[0x1])
[ 1.913967][ T0] ACPI: X2APIC_NMI (uid[0xf5] high level lint[0x1])
[ 1.913970][ T0] ACPI: X2APIC_NMI (uid[0xf6] high level lint[0x1])
[ 1.913974][ T0] ACPI: X2APIC_NMI (uid[0xf7] high level lint[0x1])
[ 1.914017][ T0] IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
[ 1.914032][ T0] IOAPIC[1]: apic_id 9, version 32, address 0xfec01000, GSI 24-31
[ 1.914039][ T0] IOAPIC[2]: apic_id 10, version 32, address 0xfec08000, GSI 32-39
[ 1.914047][ T0] IOAPIC[3]: apic_id 11, version 32, address 0xfec10000, GSI 40-47
[ 1.914054][ T0] IOAPIC[4]: apic_id 12, version 32, address 0xfec18000, GSI 48-55
[ 1.914062][ T0] IOAPIC[5]: apic_id 15, version 32, address 0xfec20000, GSI 56-63
[ 1.[ 7.994567][ T0] mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl
[ 8.006541][ T0] ACPI: Core revision 20200717
[ 8.019713][ T0] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635855245 ns
[ 8.029672][ T0] APIC: Switch to symmetric I/O mode setup
[ 8.035354][ T0] DMAR: Host address width 46
[ 8.039915][ T0] DMAR: DRHD base: 0x000000e5ffc000 flags: 0x0
[ 8.046095][ T0] DMAR: dmar0: reg_base_addr e5ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 8.054840][ T0] DMAR: DRHD base: 0x000000e7ffc000 flags: 0x0
[ 8.060997][ T0] DMAR: dmar1: reg_base_addr e7ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 8.069740][ T0] DMAR: DRHD base: 0x000000e9ffc000 flags: 0x0
[ 8.075872][ T0] DMAR: dmar2: reg_base_addr e9ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 8.084615][ T0] DMAR: DRHD base: 0x000000ebffc000 flags: 0x0
[ 8.090761][ T0] DMAR: dmar3: reg_base_addr ebffc000 ver 1:0 cap 8d2078c106f0466 ecap fMAR-IR: Enabled IRQ remapping in x2apic mode
[ 8.513491][ T0] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 8.568289][ T0] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x2b3e459bf4c, max_idle_ns: 440795289890 ns
[ 8.579576][ T0] Calibrating delay loop (skipped), value calculated using timer frequency.. 6000.00 BogoMIPS (lpj=30000000)
[ 8.589574][ T0] pid_max: default: 147456 minimum: 1152
[ 8.714025][ T0] efi: memattr: Entry attributes invalid: RO and XP bits both cleared
[ 8.719577][ T0] efi: memattr: ! 0x0000a057a000-0x0000a05b4fff [Runtime Code |RUN| | | | | | | | | | | | ]
[ 8.775355][ T0] Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes, vmalloc)
[ 8.798868][ T0] Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes, vmalloc)
[ 8.811550][ T0] Mount-cache hash table entries: 131072 (order: 8, 1048576 bytes, vmalloc)
[ 8.820076][ T0] Mountpoint-cache hash table entries: 131072 (order: 8, 1048576 bytes, vmalloc)
[ 8.879327][ T0] mce: CPU0: Thermal mo[ 8.996916][ T1] Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
[ 8.999591][ T1] ... version: 4
[ 9.004310][ T1] ... bit width: 48
[ 9.009118][ T1] ... generic registers: 4
[ 9.009574][ T1] ... value mask: 0000ffffffffffff
[ 9.015601][ T1] ... max period: 00007fffffffffff
[ 9.019574][ T1] ... fixed-purpose events: 3
[ 9.024294][ T1] ... event mask: 000000070000000f
[ 9.034357][ T1] rcu: Hierarchical SRCU implementation.
[ 9.062516][ T5] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
>
> There are quite a bunch of issues to solve:
>
> - X86 does not use the device::msi_domain pointer for historical reasons
> and due to XEN, which makes it impossible to create an architecture
> agnostic device MSI infrastructure.
>
> - X86 has its own msi_alloc_info data type which is pointlessly
> different from the generic version and does not allow to share code.
>
> - The logic of composing MSI messages in an hierarchy is busted at the
> core level and of course some (x86) drivers depend on that.
>
> - A few minor shortcomings as usual
>
> This series addresses that in several steps:
>
> 1) Accidental bug fixes
>
> iommu/amd: Prevent NULL pointer dereference
>
> 2) Janitoring
>
> x86/init: Remove unused init ops
> PCI: vmd: Dont abuse vector irqomain as parent
> x86/msi: Remove pointless vcpu_affinity callback
>
> 3) Sanitizing the composition of MSI messages in a hierarchy
>
> genirq/chip: Use the first chip in irq_chip_compose_msi_msg()
> x86/msi: Move compose message callback where it belongs
>
> 4) Simplification of the x86 specific interrupt allocation mechanism
>
> x86/irq: Rename X86_IRQ_ALLOC_TYPE_MSI* to reflect PCI dependency
> x86/irq: Add allocation type for parent domain retrieval
> iommu/vt-d: Consolidate irq domain getter
> iommu/amd: Consolidate irq domain getter
> iommu/irq_remapping: Consolidate irq domain lookup
>
> 5) Consolidation of the X86 specific interrupt allocation mechanism to be as
> close as possible to the generic MSI allocation mechanism which allows to
> get rid of quite a bunch of x86'isms which are pointless
>
> x86/irq: Prepare consolidation of irq_alloc_info
> x86/msi: Consolidate HPET allocation
> x86/ioapic: Consolidate IOAPIC allocation
> x86/irq: Consolidate DMAR irq allocation
> x86/irq: Consolidate UV domain allocation
> PCI/MSI: Rework pci_msi_domain_calc_hwirq()
> x86/msi: Consolidate MSI allocation
> x86/msi: Use generic MSI domain ops
>
> 6) x86 specific cleanups to remove the dependency on arch_*_msi_irqs()
>
> x86/irq: Move apic_post_init() invocation to one place
> x86/pci: Reducde #ifdeffery in PCI init code
> x86/irq: Initialize PCI/MSI domain at PCI init time
> irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
> PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI
> PCI/MSI: Provide pci_dev_has_special_msi_domain() helper
> x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
> x86/xen: Rework MSI teardown
> x86/xen: Consolidate XEN-MSI init
> irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
> x86/xen: Wrap XEN MSI management into irqdomain
> iommm/vt-d: Store irq domain in struct device
> iommm/amd: Store irq domain in struct device
> x86/pci: Set default irq domain in pcibios_add_device()
> PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
> x86/irq: Cleanup the arch_*_msi_irqs() leftovers
> x86/irq: Make most MSI ops XEN private
> iommu/vt-d: Remove domain search for PCI/MSI[X]
> iommu/amd: Remove domain search for PCI/MSI
>
> 7) X86 specific preparation for device MSI
>
> x86/irq: Add DEV_MSI allocation type
> x86/msi: Rename and rework pci_msi_prepare() to cover non-PCI MSI
>
> 8) Generic device MSI infrastructure
> platform-msi: Provide default irq_chip:: Ack
> genirq/proc: Take buslock on affinity write
> genirq/msi: Provide and use msi_domain_set_default_info_flags()
> platform-msi: Add device MSI infrastructure
> irqdomain/msi: Provide msi_alloc/free_store() callbacks
>
> 9) POC of IMS (Interrupt Message Storm) irq domain and irqchip
> implementations for both device array and queue storage.
>
> irqchip: Add IMS (Interrupt Message Storm) driver - NOT FOR MERGING
>
> Changes vs. V1:
>
> - Addressed various review comments and addressed the 0day fallout.
> - Corrected the XEN logic (Jürgen)
> - Make the arch fallback in PCI/MSI opt-in not opt-out (Bjorn)
>
> - Fixed the compose MSI message inconsistency
>
> - Ensure that the necessary flags are set for device MSI
>
> - Make the irq bus logic work for affinity setting to prepare
> support for IMS storage in queue memory. It turned out to be
> less scary than I feared.
>
> - Remove leftovers in iommu/intel|amd
>
> - Reworked the IMS POC driver to cover queue storage so Jason can have a
> look whether that fits the needs of MLX devices.
>
> The whole lot is also available from git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git device-msi
>
> This has been tested on Intel/AMD/KVM but lacks testing on:
>
> - HYPERV (-ENODEV)
> - VMD enabled systems (-ENODEV)
> - XEN (-ENOCLUE)
> - IMS (-ENODEV)
>
> - Any non-X86 code which might depend on the broken compose MSI message
> logic. Marc expects not much fallout, but agrees that we need to fix
> it anyway.
>
> #1 - #3 should be applied unconditionally for obvious reasons
> #4 - #6 are worthwhile cleanups which should be done independent of device MSI
>
> #7 - #8 look promising to cleanup the platform MSI implementation
> independent of #8, but I neither had cycles nor the stomach to
> tackle that.
>
> #9 is obviously just for the folks interested in IMS
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 7+ messages in thread