Re: [rfc] powernv/kdump: Fix cases where the kdump kernel can get HMI's


From: Nicholas Piggin <npiggin@gmail•com>
To: Balbir Singh <bsingharora@gmail•com>
Cc: "open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)"
	<linuxppc-dev@lists•ozlabs.org>,
	Michael Ellerman <mpe@ellerman•id.au>
Subject: Re: [rfc] powernv/kdump: Fix cases where the kdump kernel can get HMI's
Date: Mon, 4 Dec 2017 13:10:33 +1000
Message-ID: <20171204131033.2df6ea84@roar.ozlabs.ibm.com>
In-Reply-To: <CAKTCnz=w5KEECS2RwWjrfDVgMVZWYJqQb4C8EOUbfZh2Bb=z4A@mail.gmail.com>

On Mon, 4 Dec 2017 11:37:01 +1100
Balbir Singh <bsingharora@gmail•com> wrote:

> On Sun, Dec 3, 2017 at 1:36 PM, Nicholas Piggin <npiggin@gmail•com> wrote:
> > Seems like a reasonable approach. Why do we only do this for
> > powernv? It seems like a good idea in general to pull all
> > offlined CPUs out and into the same state for all platforms
> > and for all shutdown/restart/crash paths.
> >  
> 
> The reason is largely wake-up related: do we expect offline CPUs to wake
> up in the kdump kernel? The infrastructure allows us to selectively
> decide which platforms need this support. I did not want to break the world
> by enabling it across platforms (pseries for example) without good reason.

What happens if an offlined pseries CPU gets an exception for some reason,
though? It seems like it would return into pseries_mach_cpu_die of the
old kernel, which will go wrong.

Maybe the platform has stronger guarantees that it won't wake up there,
like requiring a specific hcall or something?

I was just thinking that moving all platforms to the same scheme would
be preferable, unless there is a good reason not to, just for the sake
of sharing code and behaviour.

> 
> > Also I wonder if there is anything we should do on the other
> > side of the equation for the kdump kernel to pull CPUs into a
> > known state rather than rely on the crash kernel to do it for
> > us. We might have a better ability to do that with system
> > reset IPIs now.
> >  
> 
> Yes, but do we need to do that or quickly dump the vmcore to a file
> and exit?

Well, if the previous kernel did not shut them down properly, we need
to do that, don't we? My point is that the previous kernel crashed somehow,
so we should be trying to fix everything up rather than hoping it crashed
"nicely" for us.

Yes, we should disturb things as little as possible, but we've booted
an entire new kernel in its own reserved memory, so I'm not sure
fixing up wayward CPUs is such a concern.
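
As a rough illustration of the idea only, here is a userspace toy model
of the kdump kernel walking the CPUs itself instead of trusting the
crashed kernel to have parked them. This is not kernel code: the state
names, toy_system_reset() and kdump_fixup_cpus() are all hypothetical,
and the "system reset" here just flips a value in an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU states for this toy model; not kernel definitions. */
enum cpu_state {
	CPU_PARKED,      /* held in a known state, safe for the kdump kernel */
	CPU_OFFLINE_OLD, /* offlined, still idling in the old kernel's code */
	CPU_UNKNOWN      /* never responded during the crash path */
};

/* Stand-in for a system reset IPI: in this model it simply parks the CPU. */
static void toy_system_reset(enum cpu_state *cpu)
{
	*cpu = CPU_PARKED;
}

/*
 * Walk every CPU except the one running the kdump kernel and pull any
 * CPU that is not already parked into the known state.
 * Returns the number of CPUs that had to be fixed up.
 */
static size_t kdump_fixup_cpus(enum cpu_state *cpus, size_t ncpus,
			       size_t boot_cpu)
{
	size_t fixed = 0;

	assert(boot_cpu < ncpus);

	for (size_t i = 0; i < ncpus; i++) {
		if (i == boot_cpu || cpus[i] == CPU_PARKED)
			continue;
		toy_system_reset(&cpus[i]);
		fixed++;
	}
	return fixed;
}
```

The point of the sketch is that the walk does not depend on what the old
kernel managed to do before it died: a second call finds nothing left to
fix, whatever state the CPUs started in.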

Thanks,
Nick


Thread overview: 7+ messages
2017-12-02 13:48 [rfc] powernv/kdump: Fix cases where the kdump kernel can get HMI's Balbir Singh
2017-12-03  2:36 ` Nicholas Piggin
2017-12-04  0:37   ` Balbir Singh
2017-12-04  3:10     ` Nicholas Piggin [this message]
2017-12-06  4:29       ` Balbir Singh
2017-12-06  5:07         ` Haren Myneni
2017-12-06  6:13           ` Balbir Singh
