public inbox for linuxppc-dev@ozlabs.org 
From: Michael Ellerman <mpe@ellerman•id.au>
To: Balbir Singh <bsingharora@gmail•com>, linuxppc-dev@lists•ozlabs.org
Cc: npiggin@gmail•com, Balbir Singh <bsingharora@gmail•com>
Subject: Re: powerpc/powernv/mce: Don't silently restart the machine
Date: Wed, 28 Feb 2018 20:49:32 +1100
Message-ID: <87a7vt4f6b.fsf@concordia.ellerman.id.au>
In-Reply-To: <20180228010636.22772-1-bsingharora@gmail.com>

Balbir Singh <bsingharora@gmail•com> writes:

> On MCE the current code will restart the machine with
> ppc_md.restart(). This case was extremely unlikely since
> prior to that a skiboot call is made and that resulted in
> a checkstop for analysis.
>
> With newer skiboots, on P9 we don't checkstop the box by
> default, instead we return back to the kernel to extract
> useful information at the time of the MCE. While we still
> get this information, this patch converts the restart to
> a panic(), so that if configured a dump can be taken and
> we can track and probably debug the potential issue causing
> the MCE.
>
> Signed-off-by: Balbir Singh <bsingharora@gmail•com>
> Reviewed-by: Nicholas Piggin <npiggin@gmail•com>
> ---
>  arch/powerpc/platforms/powernv/opal.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
> index 69b5263fc9e3..b510a6f41b00 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -500,9 +500,12 @@ void pnv_platform_error_reboot(struct pt_regs *regs, const char *msg)
                                                                            ^^^^^^^^^^^^^^^
Why don't we use the msg ..

>  	 *    opal to trigger checkstop explicitly for error analysis.
>  	 *    The FSP PRD component would have already got notified
>  	 *    about this error through other channels.
> +	 * 4. We are running on a newer skiboot that by default does
> +	 *    not cause a checkstop, drops us back to the kernel to
> +	 *    extract context and state at the time of the error.
>  	 */
>  
> -	ppc_md.restart(NULL);
> +	panic("PowerNV Unrecovered Machine Check");
              ^
              Here.

Because we can also get here from an HMI, in which case printing
"Machine Check" is confusing, and we already have the msg.

So just:

> +	panic(msg);
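
As a rough user-space sketch of the idea (not the kernel code):
mock_panic() and mock_platform_error_reboot() are hypothetical
stand-ins, and the message strings passed in are only illustrative.
It shows the caller's msg ending up in the panic text, so an HMI is
not reported as a machine check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's panic(): it records the
 * text in a buffer instead of halting the machine. */
char last_panic[128];

void mock_panic(const char *msg)
{
	assert(msg != NULL);
	/* Passing msg through "%s" keeps any '%' in it from being
	 * interpreted as a format directive. */
	snprintf(last_panic, sizeof(last_panic),
		 "Kernel panic - not syncing: %s", msg);
}

/* Simplified shape of pnv_platform_error_reboot(): forward the
 * caller's msg rather than a hard-coded "Machine Check" string. */
void mock_platform_error_reboot(const char *msg)
{
	mock_panic(msg);
}
```

Whether to call panic(msg) directly or panic("%s", msg) is a separate
choice; the sketch uses the "%s" form only because it is the safer
habit for a string that is not a literal.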

cheers
