* [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. [not found] <0184EA26B2509940AA629AE1405DD7F2016959EE@DGGEMA503-MBX.china.huawei.com> @ 2017-10-16 3:17 ` gengdongjiu 2017-10-16 17:02 ` James Morse 2017-10-16 17:09 ` James Morse 1 sibling, 1 reply; 6+ messages in thread From: gengdongjiu @ 2017-10-16 3:17 UTC (permalink / raw) To: linux-arm-kernel > In fact I have below method for that, what do you think about that? > > 1. If there is no RAS, old method, directly inject virtual SError, not need to specify ESR, as shown in the [1] > 2. If there is RAS, KVM set "the kvm_run" guest exit type value to let user space handle the SError abort > A. If ESR_EL2 is IMPLEMENTATION or uncategorized, return " ESR_ELx_ISV " to let user space specify an implementation-defined value, as shown [2] > B. If ESR_EL2 is categorized and error not propagated, the error come from guest user space, return " (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR " to let user space specify a recoverable ESR. > Here one side calling memory failure, another side let user pace inject SError. Because usually SEI notification does not deliver SIGBUS signal to user space, so here inject virtual SEI to ensure that. As shown [3] > C. If ESR_EL2 is categorized and error not propagated, the error come from guest kernel, return "-1" to terminate guest. As shown [4] > D. Otherwise, Panic host OS. As shown [5] > For the IMPLEMENTATION ESR, for the safety purposes, I have below suggestion 1. When there is no guest, host OS receives IMPLEMENTATION ESR, I hope host can be panic, because we do not know its implementation meaning. 2. when guest received MPLEMENTATION ESR, and the Error is isolated to guest by "ESB" instruction, not propagate to host. I hope guest will exit, but host not panic. 3. when guest received MPLEMENTATION ESR, and the Error is propagate to host, I hope host can be panic. ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. 2017-10-16 3:17 ` [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 gengdongjiu @ 2017-10-16 17:02 ` James Morse 0 siblings, 0 replies; 6+ messages in thread From: James Morse @ 2017-10-16 17:02 UTC (permalink / raw) To: linux-arm-kernel Hi gengdongjiu, On 16/10/17 04:17, gengdongjiu wrote: >> In fact I have below method for that, what do you think about that? >> >> 1. If there is no RAS, old method, directly inject virtual SError, not need to specify ESR, as shown in the [1] >> 2. If there is RAS, KVM set "the kvm_run" guest exit type value to let user space handle the SError abort >> A. If ESR_EL2 is IMPLEMENTATION or uncategorized, return " ESR_ELx_ISV " to let user space specify an implementation-defined value, as shown [2] >> B. If ESR_EL2 is categorized and error not propagated, the error come from guest user space, return " (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR " to let user space specify a recoverable ESR. >> Here one side calling memory failure, another side let user pace inject SError. Because usually SEI notification does not deliver SIGBUS signal to user space, so here inject virtual SEI to ensure that. As shown [3] >> C. If ESR_EL2 is categorized and error not propagated, the error come from guest kernel, return "-1" to terminate guest. As shown [4] >> D. Otherwise, Panic host OS. As shown [5] >> > > For the IMPLEMENTATION ESR, for the safety purposes, I have below suggestion > > 1. When there is no guest, host OS receives IMPLEMENTATION ESR, I > hope host can be panic, because we do not know its implementation > meaning. > 2. when guest received MPLEMENTATION ESR, and the Error is isolated > to guest by "ESB" instruction, not propagate to host. I hope guest > will exit, but host not panic. How do we know if impdef SError values are contained by ESB? '2.4.3 ESB and other physical errors' of [0]: > It is IMPLEMENTATION DEFINED whether IMPLEMENTATION DEFINED and uncategorized > SError interrupts are containable or Uncontainable, and whether they can be > synchronized by an Error Synchronization Barrier. > 3. when guest received MPLEMENTATION ESR, and the Error is propagate > to host, I hope host can be panic. I tried to keep this behaviour 'the same' because I don't think there is a 'right thing' to do here: If the SError is due to some catastrophic failure, we should panic the host. If the SError is due to the guest poking a device we don't want to panic the host. We can't tell these two cases apart. We can spot RAS errors and handle those. Thanks, James [0] https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. [not found] <0184EA26B2509940AA629AE1405DD7F2016959EE@DGGEMA503-MBX.china.huawei.com> 2017-10-16 3:17 ` [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 gengdongjiu @ 2017-10-16 17:09 ` James Morse 1 sibling, 0 replies; 6+ messages in thread From: James Morse @ 2017-10-16 17:09 UTC (permalink / raw) To: linux-arm-kernel Hi gengdongjiu, On 15/10/17 17:09, gengdongjiu wrote: >> On 13/10/17 10:25, gengdongjiu wrote: >>> In my first version patch [2], It sets the virtual ESR in the KVM, but >>> Marc and other people disagree that[3][4],and propose to set its value >>> and injection by userspace(when RAS is enabled). >> >> Not quite: for RAS errors. >> When we want to hand a RAS error to a guest, Qemu should be driving that. >> >> What about impdef SError? Qemu should be able to drive that with the same API. >> >> What about this nasty corner where KVM already injects an impdef SError directly? This patch keeps that working. >> >> >> I'd love to get rid of KVMs internal use of kvm_inject_vabt(). But what do we replace it with? It needs to be a guest exit type that existing >> software can't ignore... >> >> (The best I can suggest is: Once we have a mechanism to inject SError into a guest from Qemu, KVM could make an impdef SError pending, >> then give Qemu the opportunity to kill the guest, or set a different ESR. Existing software can ignore the exit, and take the existing >> behaviour.) > In fact I have below method for that, what do you think about that? > > 1. If there is no RAS, old method, directly inject virtual SError, not need to specify > ESR, as shown in the [1] > 2. If there is RAS, KVM set "the kvm_run" guest exit type value to let user space handle > the SError abort Nope. There should not be a RAS-specific kvm exit type. What information do you need to convey that doesn't apply to another user-space process? Qemu/kvmtool will always need to generate a new RAS error from the user-space signals they get as a result of the host handling the error. This way KVMs user-space isn't a special case, and the user-space code is portable across architectures. This does mean the host kernel has to be able to handle any and all RAS errors before they could possibly be presented to a guest. Today we only handle memory errors. Any other error types will need adding in a way that works on all architectures. Again, you're passing RAS SErrors out to user-space. The SError may have nothing to do with the guest. The host has to handle any and all RAS errors. The only case where the guest is involved is if if the guest treads in some poisoned memory. (You know this:) The host has to do its RAS work first, to check this wasn't an error in the hosts page tables and that the host really can keep running. memory_failure() will cause the faulty memory to be unmapped from stage 2 and signal Qemu/kvmtool. If the guest touches the faulty memory (even from a different CPU/vcpu), Qemu/kvmtool will get a signal. Qemu/kvmtool should take these signals and generate any applicable RAS error from these. We have to use these signals so that KVM's user-space is the same as all other user-space. > A. If ESR_EL2 is IMPLEMENTATION or uncategorized, return " ESR_ELx_ISV " to let user > space specify an implementation-defined value, as shown [2] I agree impdef SError are a different case. It doesn't look like you are passing the impdef ESR to user-space, which I think is correct. Passing uncategorized errors like this isn't correct, these are still RAS errors. '2.4.3 ESB and other physical errors' in [0] says its implementation-defined whether these Uncategorized RAS errors are uncontainable, we should assume they weren't contained, and panic(). (If the CPU wants us to do a better job, it needs to give us some information). I don't think there is any point generating a fake ESR. I need to think about using KVM_EXIT_EXCEPTION, but the exit code should mean 'run this again and I'll give it an SError'. This lets Qemu/kvmtool set its own impdef SError or emulate a machine that doesn't generate SError. I think this needs more discussion, do we want to let Qemu/kvmtool reset an affected vcpu so that its effectively running in firmware and can be brought back online again? > If ESR_EL2 is categorized and error not propagated, the error come from guest user > space, return " (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR " to let user space > specify a recoverable ESR. No, these are RAS errors, they should be handled by the kernel, not passed to user-space, and not both. If user-space needs to know, it will be informed via the usual way we tell user-space about this stuff. > Here one side calling memory failure, another side let user pace inject SError. > Because usually SEI notification does not deliver SIGBUS signal to user space, so > here inject virtual SEI to ensure that. As shown [3] For a RAS error we process the CPER records, and if they describe a memory error we send signals to user-space. Is there another type of RAS error we need to support for the guest? We must support this on the host first. > C. If ESR_EL2 is categorized and error not propagated, the error come from guest kernel, > return "-1" to terminate guest. As shown [4] > D. Otherwise, Panic host OS. As shown [5] I don't think KVM should go calling panic(), this needs to be wrapped up the arch's RAS code. > static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run) > { > unsigned int esr = kvm_vcpu_get_hsr(vcpu); > bool impdef_syndrome = esr & ESR_ELx_ISV; /* aka IDS */ > unsigned int aet = esr & ESR_ELx_AET; > > if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) [1] > kvm_inject_vabt(vcpu); > return 1; > } > > kvm_run->exit_reason = KVM_EXIT_EXCEPTION; > kvm_run->ex.exception = KVM_EXCEPTION_SERROR; > > If (impdef_syndrome || ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR) { > kvm_run->ex.error_code = ESR_ELx_ISV; > } > > switch (aet) { > case ESR_ELx_AET_CE: /* corrected error */ [2] > case ESR_ELx_AET_UEO: /* restartable error, not yet consumed */ > kvm_run->ex.error_code = (ESR_ELx_AET_UC | ESR_ELx_FSC_SERROR ); > break; /* continue processing the guest exit */ > > > case ESR_ELx_AET_UEU: /* The error has not been propagated */ [3] > case ESR_ELx_AET_UER: > /* > * Only handle the guest user mode SEI if the error has not been propagated > */ > if ((!vcpu_mode_priv(vcpu)) && !handle_guest_sei(kvm_vcpu_get_hsr(vcpu))) > kvm_run->ex.error_code = (ESR_ELx_AET_UCU | ESR_ELx_FSC_SERROR ); Why should KVM care whether it was guest EL0 or guest EL1? Something is designed wrong if you need to behave differently here. (as far as I can see KVMs existing use of this vcpu_mode_priv() helper is to emulate bits of the architecture that behave/undef differently at different exception levels) > break; > else > return -1; [4] > /* If SError handling is failed, continue run */ > default: [5] > /* > * Until now, the CPU supports RAS and SEI is fatal, or user space > * does not support to handle the SError. > */ > panic("This Asynchronous SError interrupt is dangerous, panic"); > } > > return 0; > } > >> >>> So I think we no need to submit another patch, it will be duplicated, >>> and waste our review time. thank you very much. I will combine that. >> >> I agree we're posting competing series, there was some off-list co-ordination on this with Xie XiuQi and Xiongfeng Wang in ~may, it looks >> like you weren't involved at that point. > > Thanks very much for your agreement, I will add you to the off-list. There are two sets of off-list co-ordination? That explains something... >> In your last series touching all this: >> https://lkml.org/lkml/2017/8/31/698 >> >> You had Xie XiuQi's RAS-cpufeature patch in isolation, without the SError rework underneath it. Applied like this SError is still always masked >> in the kernel, so any system without firmware-first will silently consume and discard an uncontained-RAS-error using the esb() in >> __switch_to(). We can't do this, hence the first half of this series. > > > Yes, seems I lost your SError rework series patches. When my patch update and modification > almost done, hope we can combine to one series. thanks I'd like to try and get this series merged as it is, then we can work on the KVM and APEI bits as separate series on top. (otherwise we still depend on getting this merged so we can have code that depends on ARM64_HAS_RAS_EXTENSIONS). For KVM its: * Allowing guests to inject SError. * Allow Qemu/kvmtool to do something if the guest trips a non-RAS SError. * Expose enough information to allow user-space to promote SIGBUS_MCEERR_AR to external-abort. * (Handle SError during world-switch) For APEI its: * Allow multiple sources of NMI * Abstract the estatus cache to use for SEI/SDEI * Increase the priority of memory_failure_queue()s work, * Provide a way to block ret_to-user to memory_failure work is done. (These last two are the problem Xie XiuQi found). Thanks, James [0] https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v3 00/20] SError rework + RAS&IESB for firmware first support
@ 2017-10-05 19:17 James Morse
2017-10-05 19:18 ` [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 James Morse
0 siblings, 1 reply; 6+ messages in thread
From: James Morse @ 2017-10-05 19:17 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
The aim of this series is to enable IESB and add ESB-instructions to let us
kick any pending RAS errors into firmware to be handled by firmware-first.
Not all systems will have this firmware, so these RAS errors will become
pending SErrors. We should take these as quickly as possible and avoid
panic()ing for errors where we could have continued.
This first part of this series reworks the DAIF masking so that SError is
unmasked unless we are handling a debug exception.
The last part provides the same minimal handling for SError that interrupt
KVM. KVM is currently unable to handle SErrors during world-switch, unless
they occur during a magic single-instruction window, it hyp-panics. I suspect
this will be easier to fix once the VHE world-switch is further optimised.
KVMs kvm_inject_vabt() needs updating for v8.2 as now we can specify an ESR,
and all-zeros has a RAS meaning.
KVM's existing 'impdef SError to the guest' behaviour probably needs revisiting.
These are errors where we don't know what they mean, they may not be
synchronised by ESB. Today we blame the guest.
My half-baked suggestion would be to make a virtual SError pending, but then
exit to user-space to give Qemu the change to quit (for virtual machines that
don't generate SError), pend an SError with a new Qemu-specific ESR, or blindly
continue and take KVMs default all-zeros impdef ESR.
Known issues:
* Synchronous external abort SET severity is not yet considered, all
synchronous-external-aborts are still considered fatal.
* KVM-Migration: VDISR_EL2 is exposed to userspace as DISR_EL1, but how should
HCR_EL2.VSE or VSESR_EL2 be migrated when the guest has an SError pending but
hasn't taken it yet...?
* No HCR_EL2.{TEA/TERR} setting ... Dongjiu Geng had a patch that was almost
finished, I haven't seen the new version.
* KVM unmasks SError and IRQ before calling handle_exit, we may be rescheduled
while holding an uncontained ESR... (this is currently an improvement on
assuming its an impdef error we can blame on the guest)
* We need to fix this for APEI's SEI or kernel-first RAS, the guest-exit
SError handling will need to move to before kvm_arm_vhe_guest_exit().
Changes from v2 ... (where do I start?)
* All the KVM patches rewritten.
* VSESR_EL2 setting/save/restore is new, as is
* save/restoring VDISR_EL2 and exposing it to user space as DISR_EL1.
* The new ARM-ARM (DDI0487B.b) has an SCTLR_EL2.IESB even for !VHE, we turn
that on.
* 'survivable' SError are now described as 'blocking' because the CPU can't
make progress, this makes all the commit messages clearer.
* My IESB!=ESB confusion got fixed, so the crazy eret with SError unmasked
is gone, never to return.
* The cost of masking SError on return to user-space has been wrapped up with
the ret-to-user loop. (This was only visible with microbenchmarks like
getpid)
* entry.S changes got cleaner, commit messages got better,
This series can be retrieved from:
git://linux-arm.org/linux-jm.git -b serror_rework/v3
Comments and contradictions welcome,
James Morse (18):
arm64: explicitly mask all exceptions
arm64: introduce an order for exceptions
arm64: Move the async/fiq helpers to explicitly set process context
flags
arm64: Mask all exceptions during kernel_exit
arm64: entry.S: Remove disable_dbg
arm64: entry.S: convert el1_sync
arm64: entry.S convert el0_sync
arm64: entry.S: convert elX_irq
KVM: arm/arm64: mask/unmask daif around VHE guests
arm64: kernel: Survive corrected RAS errors notified by SError
arm64: cpufeature: Enable IESB on exception entry/return for
firmware-first
arm64: kernel: Prepare for a DISR user
KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
KVM: arm64: Save/Restore guest DISR_EL1
KVM: arm64: Save ESR_EL2 on guest SError
KVM: arm64: Handle RAS SErrors from EL1 on guest exit
KVM: arm64: Handle RAS SErrors from EL2 on guest exit
KVM: arm64: Take any host SError before entering the guest
Xie XiuQi (2):
arm64: entry.S: move SError handling into a C function for future
expansion
arm64: cpufeature: Detect CPU RAS Extentions
arch/arm64/Kconfig | 33 +++++++++++++-
arch/arm64/include/asm/assembler.h | 50 ++++++++++++++-------
arch/arm64/include/asm/barrier.h | 1 +
arch/arm64/include/asm/cpucaps.h | 4 +-
arch/arm64/include/asm/daifflags.h | 61 +++++++++++++++++++++++++
arch/arm64/include/asm/esr.h | 17 +++++++
arch/arm64/include/asm/exception.h | 14 ++++++
arch/arm64/include/asm/irqflags.h | 40 ++++++-----------
arch/arm64/include/asm/kvm_emulate.h | 10 +++++
arch/arm64/include/asm/kvm_host.h | 16 +++++++
arch/arm64/include/asm/processor.h | 2 +
arch/arm64/include/asm/sysreg.h | 6 +++
arch/arm64/include/asm/traps.h | 36 +++++++++++++++
arch/arm64/kernel/asm-offsets.c | 1 +
arch/arm64/kernel/cpufeature.c | 43 ++++++++++++++++++
arch/arm64/kernel/debug-monitors.c | 5 ++-
arch/arm64/kernel/entry.S | 86 +++++++++++++++++++++---------------
arch/arm64/kernel/hibernate.c | 5 ++-
arch/arm64/kernel/machine_kexec.c | 4 +-
arch/arm64/kernel/process.c | 3 ++
arch/arm64/kernel/setup.c | 8 ++--
arch/arm64/kernel/signal.c | 8 +++-
arch/arm64/kernel/smp.c | 12 ++---
arch/arm64/kernel/suspend.c | 7 +--
arch/arm64/kernel/traps.c | 64 ++++++++++++++++++++++++++-
arch/arm64/kvm/handle_exit.c | 19 +++++++-
arch/arm64/kvm/hyp-init.S | 3 ++
arch/arm64/kvm/hyp/entry.S | 13 ++++++
arch/arm64/kvm/hyp/switch.c | 19 ++++++--
arch/arm64/kvm/hyp/sysreg-sr.c | 6 +++
arch/arm64/kvm/inject_fault.c | 13 +++++-
arch/arm64/kvm/sys_regs.c | 1 +
arch/arm64/mm/proc.S | 14 +++---
virt/kvm/arm/arm.c | 4 ++
34 files changed, 513 insertions(+), 115 deletions(-)
create mode 100644 arch/arm64/include/asm/daifflags.h
--
2.13.3
^ permalink raw reply [flat|nested] 6+ messages in thread* [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. 2017-10-05 19:17 [PATCH v3 00/20] SError rework + RAS&IESB for firmware first support James Morse @ 2017-10-05 19:18 ` James Morse 2017-10-13 9:25 ` gengdongjiu 0 siblings, 1 reply; 6+ messages in thread From: James Morse @ 2017-10-05 19:18 UTC (permalink / raw) To: linux-arm-kernel Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature generated an SError with an implementation defined ESR_EL1.ISS, because we had no mechanism to specify the ESR value. On Juno this generates an all-zero ESR, the most significant bit 'ISV' is clear indicating the remainder of the ISS field is invalid. With the RAS Extensions we have a mechanism to specify this value, and the most significant bit has a new meaning: 'IDS - Implementation Defined Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized' instead of 'no valid ISS'. Add KVM support for the VSESR_EL2 register to specify an ESR value when HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to specify an implementation-defined value. We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set, KVM save/restores this bit during __deactivate_traps() and hardware clears the bit once the guest has consumed the virtual-SError. Future patches may add an API (or KVM CAP) to pend a virtual SError with a specified ESR. Cc: Dongjiu Geng <gengdongjiu@huawei•com> Signed-off-by: James Morse <james.morse@arm•com> --- arch/arm64/include/asm/kvm_emulate.h | 5 +++++ arch/arm64/include/asm/kvm_host.h | 3 +++ arch/arm64/include/asm/sysreg.h | 1 + arch/arm64/kvm/hyp/switch.c | 4 ++++ arch/arm64/kvm/inject_fault.c | 13 ++++++++++++- 5 files changed, 25 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index e5df3fce0008..8a7a838eb17a 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -61,6 +61,11 @@ static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr) vcpu->arch.hcr_el2 = hcr; } +static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr) +{ + vcpu->arch.vsesr_el2 = vsesr; +} + static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) { return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc; diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index d3eb79a9ed6b..0af35e71fedb 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -277,6 +277,9 @@ struct kvm_vcpu_arch { /* Detect first run of a vcpu */ bool has_run_once; + + /* Virtual SError ESR to restore when HCR_EL2.VSE is set */ + u64 vsesr_el2; }; #define vcpu_gp_regs(v) (&(v)->arch.ctxt.gp_regs) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 427c36bc5dd6..a493e93de296 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -253,6 +253,7 @@ #define SYS_DACR32_EL2 sys_reg(3, 4, 3, 0, 0) #define SYS_IFSR32_EL2 sys_reg(3, 4, 5, 0, 1) +#define SYS_VSESR_EL2 sys_reg(3, 4, 5, 2, 3) #define SYS_FPEXC32_EL2 sys_reg(3, 4, 5, 3, 0) #define __SYS__AP0Rx_EL2(x) sys_reg(3, 4, 12, 8, x) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 945e79c641c4..af37658223a0 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -86,6 +86,10 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) isb(); } write_sysreg(val, hcr_el2); + + if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (val & HCR_VSE)) + write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2); + /* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */ write_sysreg(1 << 15, hstr_el2); /* diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c index da6a8cfa54a0..52f7f66f1356 100644 --- a/arch/arm64/kvm/inject_fault.c +++ b/arch/arm64/kvm/inject_fault.c @@ -232,14 +232,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu) inject_undef64(vcpu); } +static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr) +{ + vcpu_set_vsesr(vcpu, esr); + vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE); +} + /** * kvm_inject_vabt - inject an async abort / SError into the guest * @vcpu: The VCPU to receive the exception * * It is assumed that this code is called from the VCPU thread and that the * VCPU therefore is not currently executing guest code. + * + * Systems with the RAS Extensions specify an imp-def ESR (ISV/IDS = 1) with + * the remaining ISS all-zeros so that this error is not interpreted as an + * uncatagorized RAS error. Without the RAS Extensions we can't specify an ESR + * value, so the CPU generates an imp-def value. */ void kvm_inject_vabt(struct kvm_vcpu *vcpu) { - vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE); + pend_guest_serror(vcpu, ESR_ELx_ISV); } -- 2.13.3 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. 2017-10-05 19:18 ` [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 James Morse @ 2017-10-13 9:25 ` gengdongjiu 2017-10-13 16:53 ` James Morse 0 siblings, 1 reply; 6+ messages in thread From: gengdongjiu @ 2017-10-13 9:25 UTC (permalink / raw) To: linux-arm-kernel Hi James, After checking this patch, I think my patch[1] already include this logic(only a little difference). In my first version patch [2], It sets the virtual ESR in the KVM, but Marc and other people disagree that[3][4],and propose to set its value and injection by userspace(when RAS is enabled). If you think we also need to support injection by KVM, I can extend my patch to support that(but I think we should not support according previous review comments). So I think we no need to submit another patch, it will be duplicated, and waste our review time. thank you very much. I will combine that. [1] https://lkml.org/lkml/2017/8/28/497 [2] https://patchwork.kernel.org/patch/9633105/ [3]https://lkml.org/lkml/2017/3/20/441 [4]https://lkml.org/lkml/2017/3/20/516 On 2017/10/6 3:18, James Morse wrote: > Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature > generated an SError with an implementation defined ESR_EL1.ISS, because we > had no mechanism to specify the ESR value. > > On Juno this generates an all-zero ESR, the most significant bit 'ISV' > is clear indicating the remainder of the ISS field is invalid. > > With the RAS Extensions we have a mechanism to specify this value, and the > most significant bit has a new meaning: 'IDS - Implementation Defined > Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized' > instead of 'no valid ISS'. > > Add KVM support for the VSESR_EL2 register to specify an ESR value when > HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to > specify an implementation-defined value. > > We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set, KVM > save/restores this bit during __deactivate_traps() and hardware clears the > bit once the guest has consumed the virtual-SError. > > Future patches may add an API (or KVM CAP) to pend a virtual SError with > a specified ESR. > > Cc: Dongjiu Geng <gengdongjiu@huawei•com> > Signed-off-by: James Morse <james.morse@arm•com> > --- > arch/arm64/include/asm/kvm_emulate.h | 5 +++++ > arch/arm64/include/asm/kvm_host.h | 3 +++ > arch/arm64/include/asm/sysreg.h | 1 + > arch/arm64/kvm/hyp/switch.c | 4 ++++ > arch/arm64/kvm/inject_fault.c | 13 ++++++++++++- > 5 files changed, 25 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h > index e5df3fce0008..8a7a838eb17a 100644 > --- a/arch/arm64/include/asm/kvm_emulate.h > +++ b/arch/arm64/include/asm/kvm_emulate.h > @@ -61,6 +61,11 @@ static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr) > vcpu->arch.hcr_el2 = hcr; > } > > +static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr) > +{ > + vcpu->arch.vsesr_el2 = vsesr; > +} > + > static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) > { > return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc; > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > index d3eb79a9ed6b..0af35e71fedb 100644 > --- a/arch/arm64/include/asm/kvm_host.h > +++ b/arch/arm64/include/asm/kvm_host.h > @@ -277,6 +277,9 @@ struct kvm_vcpu_arch { > > /* Detect first run of a vcpu */ > bool has_run_once; > + > + /* Virtual SError ESR to restore when HCR_EL2.VSE is set */ > + u64 vsesr_el2; > }; > > #define vcpu_gp_regs(v) (&(v)->arch.ctxt.gp_regs) > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h > index 427c36bc5dd6..a493e93de296 100644 > --- a/arch/arm64/include/asm/sysreg.h > +++ b/arch/arm64/include/asm/sysreg.h > @@ -253,6 +253,7 @@ > > #define SYS_DACR32_EL2 sys_reg(3, 4, 3, 0, 0) > #define SYS_IFSR32_EL2 sys_reg(3, 4, 5, 0, 1) > +#define SYS_VSESR_EL2 sys_reg(3, 4, 5, 2, 3) > #define SYS_FPEXC32_EL2 sys_reg(3, 4, 5, 3, 0) > > #define __SYS__AP0Rx_EL2(x) sys_reg(3, 4, 12, 8, x) > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c > index 945e79c641c4..af37658223a0 100644 > --- a/arch/arm64/kvm/hyp/switch.c > +++ b/arch/arm64/kvm/hyp/switch.c > @@ -86,6 +86,10 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) > isb(); > } > write_sysreg(val, hcr_el2); > + > + if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (val & HCR_VSE)) > + write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2); > + > /* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */ > write_sysreg(1 << 15, hstr_el2); > /* > diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c > index da6a8cfa54a0..52f7f66f1356 100644 > --- a/arch/arm64/kvm/inject_fault.c > +++ b/arch/arm64/kvm/inject_fault.c > @@ -232,14 +232,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu) > inject_undef64(vcpu); > } > > +static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr) > +{ > + vcpu_set_vsesr(vcpu, esr); > + vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE); > +} > + > /** > * kvm_inject_vabt - inject an async abort / SError into the guest > * @vcpu: The VCPU to receive the exception > * > * It is assumed that this code is called from the VCPU thread and that the > * VCPU therefore is not currently executing guest code. > + * > + * Systems with the RAS Extensions specify an imp-def ESR (ISV/IDS = 1) with > + * the remaining ISS all-zeros so that this error is not interpreted as an > + * uncatagorized RAS error. Without the RAS Extensions we can't specify an ESR > + * value, so the CPU generates an imp-def value. > */ > void kvm_inject_vabt(struct kvm_vcpu *vcpu) > { > - vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE); > + pend_guest_serror(vcpu, ESR_ELx_ISV); > } > ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. 2017-10-13 9:25 ` gengdongjiu @ 2017-10-13 16:53 ` James Morse 0 siblings, 0 replies; 6+ messages in thread From: James Morse @ 2017-10-13 16:53 UTC (permalink / raw) To: linux-arm-kernel Hi gengdongjiu, On 13/10/17 10:25, gengdongjiu wrote: > After checking this patch, I think my patch[1] already include this logic(only a little > difference). Your kvm_handle_guest_sei() is similar to where this series ends up, but the purpose of this patch is to keep KVMs existing behaviour. KVM already injects SError into the guest all by itself, now with the RAS extensions it can specify and ESR, and because of the new ESR encoding it can't use the reset value of all-zeroes. > In my first version patch [2], It sets the virtual ESR in the KVM, but Marc and > other people disagree that[3][4],and propose to set its value and injection by userspace(when > RAS is enabled). Not quite: for RAS errors. When we want to hand a RAS error to a guest, Qemu should be driving that. What about impdef SError? Qemu should be able to drive that with the same API. What about this nasty corner where KVM already injects an impdef SError directly? This patch keeps that working. I'd love to get rid of KVMs internal use of kvm_inject_vabt(). But what do we replace it with? It needs to be a guest exit type that existing software can't ignore... (The best I can suggest is: Once we have a mechanism to inject SError into a guest from Qemu, KVM could make an impdef SError pending, then give Qemu the opportunity to kill the guest, or set a different ESR. Existing software can ignore the exit, and take the existing behaviour.) > So I think we no need to submit another patch, it will be duplicated, and waste our review > time. thank you very much. I will combine that. I agree we're posting competing series, there was some off-list co-ordination on this with Xie XiuQi and Xiongfeng Wang in ~may, it looks like you weren't involved at that point. In your last series touching all this: https://lkml.org/lkml/2017/8/31/698 You had Xie XiuQi's RAS-cpufeature patch in isolation, without the SError rework underneath it. Applied like this SError is still always masked in the kernel, so any system without firmware-first will silently consume and discard an uncontained-RAS-error using the esb() in __switch_to(). We can't do this, hence the first half of this series. James ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-10-16 17:09 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <0184EA26B2509940AA629AE1405DD7F2016959EE@DGGEMA503-MBX.china.huawei.com>
2017-10-16 3:17 ` [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 gengdongjiu
2017-10-16 17:02 ` James Morse
2017-10-16 17:09 ` James Morse
2017-10-05 19:17 [PATCH v3 00/20] SError rework + RAS&IESB for firmware first support James Morse
2017-10-05 19:18 ` [PATCH v3 15/20] KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2 James Morse
2017-10-13 9:25 ` gengdongjiu
2017-10-13 16:53 ` James Morse
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox