From: Ganesh <ganeshgr@linux•ibm.com>
To: Nicholas Piggin <npiggin@gmail•com>, linuxppc-dev@lists•ozlabs.org
Cc: Mahesh Salgaonkar <mahesh@linux•vnet.ibm.com>
Subject: Re: [PATCH v1] powerpc/64s: Fix unrecoverable MCE crash
Date: Thu, 23 Sep 2021 23:52:16 +0530
Message-ID: <de062f8e-e99b-04ec-5d9d-0c31d3cd4c2a@linux.ibm.com>
In-Reply-To: <20210922020247.209409-1-npiggin@gmail.com>
On 9/22/21 7:32 AM, Nicholas Piggin wrote:
> The machine check handler is not considered NMI on 64s. The early
> handler is the true NMI handler, and then it schedules the
> machine_check_exception handler to run when interrupts are enabled.
>
> This works fine except the case of an unrecoverable MCE, where the true
> NMI is taken when MSR[RI] is clear, it can not recover to schedule the
> next handler, so it calls machine_check_exception directly so something
> might be done about it.
>
> Calling an async handler from NMI context can result in irq state and
> other things getting corrupted. This can also trigger the BUG at
> arch/powerpc/include/asm/interrupt.h:168.
>
> Fix this by just making the 64s machine_check_exception handler an NMI
> like it is on other subarchs.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail•com>
> ---
Hi Nick,
If I inject a control memory access error in an LPAR on top of this patch:
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210906084303.183921-1-ganeshgr@linux.ibm.com/
I see the following warning trace:
WARNING: CPU: 130 PID: 7122 at arch/powerpc/include/asm/interrupt.h:319 machine_check_exception+0x310/0x340
Modules linked in:
CPU: 130 PID: 7122 Comm: inj_access_err Kdump: loaded Tainted: G M 5.15.0-rc2-cma-00054-g4a0d59fbaf71-dirty #22
NIP: c00000000002f980 LR: c00000000002f7e8 CTR: c000000000a31860
REGS: c0000039fe51bb20 TRAP: 0700 Tainted: G M (5.15.0-rc2-cma-00054-g4a0d59fbaf71-dirty)
MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 88000222 XER: 20040000
CFAR: c00000000002f844 IRQMASK: 0
GPR00: c00000000002f798 c0000039fe51bdc0 c0000000020d0000 0000000000000001
GPR04: 0000000000000000 4000000000000002 4000000000000000 00000000000019af
GPR08: 00000077e5ad0000 0000000000000000 c0000077ee16c700 0000000000000080
GPR12: 0000000088000222 c0000077ee16c700 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 c0000000020fecd8 0000000000000000
GPR28: 0000000000000000 0000000000000001 0000000000000001 c0000039fe51be80
NIP [c00000000002f980] machine_check_exception+0x310/0x340
LR [c00000000002f7e8] machine_check_exception+0x178/0x340
Call Trace:
[c0000039fe51bdc0] [c00000000002f798] machine_check_exception+0x128/0x340 (unreliable)
[c0000039fe51be10] [c0000000000086ec] machine_check_common+0x1ac/0x1b0
--- interrupt: 200 at 0x10000968
NIP: 0000000010000968 LR: 0000000010000958 CTR: 0000000000000000
REGS: c0000039fe51be80 TRAP: 0200 Tainted: G M (5.15.0-rc2-cma-00054-g4a0d59fbaf71-dirty)
MSR: 8000000002a0f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 22000824 XER: 00000000
CFAR: 000000000000021c DAR: 00007fffb00c0000 DSISR: 02000008 IRQMASK: 0
GPR00: 0000000022000824 00007fffc9647770 0000000010027f00 00007fffb00c0000
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 00007fffb00c0000 0000000000000001 0000000000000000
GPR12: 0000000000000000 00007fffb015a330 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 0000000000000000 000000001000085c
GPR28: 00007fffc9647d18 0000000000000001 00000000100009b0 00007fffc9647770
NIP [0000000010000968] 0x10000968
LR [0000000010000958] 0x10000958
--- interrupt: 200