From: Bert Karwatzki <spasswolf@web•de>
To: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel•org>,
	"Christian König" <christian.koenig@amd•com>,
	linux-kernel@vger•kernel.org
Cc: linux-next@vger•kernel.org, regressions@lists•linux.dev,
	 linux-pci@vger•kernel.org, linux-acpi@vger•kernel.org,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel•com>,
	spasswolf@web•de
Subject: Re: [REGRESSION 00/04] Crash during resume of pcie bridge
Date: Fri, 07 Nov 2025 18:09:53 +0100
Message-ID: <ab51bd58919a31107caf8f8753804cb2dbfa791d.camel@web.de>
In-Reply-To: <0cb75fae3a9cdb8dd82ca82348f4df919d34844d.camel@web.de>

On Friday, 07.11.2025 at 14:09 +0100, Bert Karwatzki wrote:
> 
> Testing:
> v6.12			booted 13:00, 7.11.2025 no crash after 1h, 890 GPP0 events, 287 resumes
> 
> 
> Bert Karwatzki

v6.12 crashed after 2h, 946 GPP0 events and 499 resumes, so there is no
known-good base for a bisection.

But the crash from v6.14.11 gave this error in netconsole:

2025-11-06T19:17:34.967439+01:00 T370;[drm] PCIE GART of 512M enabled (table at 0x00000081FEB00000).
2025-11-06T19:17:34.967439+01:00 T370;amdgpu 0000:03:00.0: amdgpu: PSP is resuming...#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:34.967588+01:00 T12;pci_bus 0000:03: Allocating resources#012 SUBSYSTEM=pci_bus#012 DEVICE=+pci_bus:0000:03
2025-11-06T19:17:35.143353+01:00 T370;amdgpu 0000:03:00.0: amdgpu: reserve 0xa00000 from 0x81fd000000 for PSP TMR#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.226021+01:00 T370;amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.237386+01:00 T370;amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.237386+01:00 T370;amdgpu 0000:03:00.0: amdgpu: SMU is resuming...#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.237386+01:00 T370;amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0, version = 0x003b3100 (59.49.0)#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.237386+01:00 T370;amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.509600+01:00 T370;amdgpu 0000:03:00.0: amdgpu: SMU: response:0xFFFFFFFF for index:6 param:0x00000000 message:EnableAllSmuFeatures?#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.509600+01:00 T370;amdgpu 0000:03:00.0: amdgpu: Failed to enable requested dpm features!#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.509600+01:00 T370;amdgpu 0000:03:00.0: amdgpu: Failed to setup smc hw!#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.509600+01:00 T370;amdgpu 0000:03:00.0: amdgpu: resume of IP block <smu> failed -121#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:35.509600+01:00 T370;amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-121).#012 SUBSYSTEM=pci#012 DEVICE=+pci:0000:03:00.0
2025-11-06T19:17:36.114889+01:00 C8;INFO: NMI handler (perf_event_nmi_handler) took too long to run: 35.314 msecs
2025-11-06T19:17:36.114889+01:00 C8;perf: interrupt took too long (275880 > 2500), lowering kernel.perf_event_max_sample_rate to 1000
2025-11-06T19:17:37.930799+01:00 C4;INFO: NMI handler (perf_event_nmi_handler) took too long to run: 152.914 msecs
2025-11-06T19:17:37.930799+01:00 C4;perf: interrupt took too long (1194640 > 344850), lowering kernel.perf_event_max_sample_rate to 1000
2025-11-06T19:17:38.939845+01:00 C14;INFO: NMI handler (perf_event_nmi_handler) took too long to run: 197.312 msecs
2025-11-06T19:17:38.939845+01:00 C14;perf: interrupt took too long (1541521 > 1493300), lowering kernel.perf_event_max_sample_rate to 1000

These 4 lines have not been recorded previously, so perhaps I have to look
for a NULL pointer dereference in an error path:

2025-11-06T19:17:42.571252+01:00 T1896;ACPI Error: AE_TIME, Returned by Handler for [EmbeddedControl] (20240827/evregion-301)
2025-11-06T19:17:42.571252+01:00 T1896;ACPI Error: Timeout from EC hardware or EC device driver (20240827/evregion-311)
2025-11-06T19:17:42.571252+01:00 T1896;ACPI Error: Aborting method \x5c_SB.PCI0.SBRG.EC.BAT1.UPBS due to previous error (AE_TIME) (20240827/psparse-529)
2025-11-06T19:17:42.571252+01:00 T1896;ACPI Error: Aborting method \x5c_SB.PCI0.SBRG.EC.BAT1._BST due to previous error (AE_TIME) (20240827/psparse-529) 
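
To make excerpts like the ones above easier to filter, here is a minimal
sketch of a parser for these lines. It is not part of any kernel tooling and
rests on assumptions read off the excerpt: each line is
"<timestamp> <T|C><number>;<message>", where the T/C field looks like a printk
caller id (task or CPU), and "#012" is the log daemon's escape for a newline
that precedes the SUBSYSTEM= and DEVICE= properties. The helper names
(parse_line, error_code, errno_name) are hypothetical. On Linux, errno 121 is
EREMOTEIO, which is what the "failed -121" lines report.

```python
import errno
import re

# Assumed layout, taken from the excerpt: "<timestamp> <T|C><number>;<message>"
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<ctx>[TC]\d+);(?P<rest>.*)$")
# Matches "failed -121" as well as "failed (-121)"
CODE_RE = re.compile(r"failed \(?(-\d+)\)?")


def parse_line(line):
    """Split one log line into timestamp, context id, message and properties."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    # "#012" is an escaped newline: the first chunk is the message itself,
    # later chunks are KEY=VALUE properties such as SUBSYSTEM and DEVICE.
    chunks = m.group("rest").split("#012")
    props = {}
    for chunk in chunks[1:]:
        key, sep, value = chunk.strip().partition("=")
        if sep:
            props[key] = value
    return {
        "ts": m.group("ts"),
        "ctx": m.group("ctx"),
        "msg": chunks[0].strip(),
        "props": props,
    }


def error_code(msg):
    """Return the negative error code from a 'failed ...' message, or None."""
    m = CODE_RE.search(msg)
    return int(m.group(1)) if m else None


def errno_name(code):
    """Map a kernel-style negative error code to its symbolic errno name."""
    return errno.errorcode.get(abs(code), "unknown")


if __name__ == "__main__":
    sample = (
        "2025-11-06T19:17:35.509600+01:00 T370;amdgpu 0000:03:00.0: amdgpu: "
        "resume of IP block <smu> failed -121#012 SUBSYSTEM=pci#012 "
        "DEVICE=+pci:0000:03:00.0"
    )
    record = parse_line(sample)
    code = error_code(record["msg"])
    print(record["ctx"], record["props"].get("DEVICE"), code, errno_name(code))
```

The errno lookup uses the table of the machine running the script, so the
symbolic name is only meaningful when that machine is also a Linux system.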


Bert Karwatzki
