From: Oliver Upton <oupton@kernel•org>
To: Marc Zyngier <maz@kernel•org>
Cc: Hyunwoo Kim <imv4bel@gmail•com>,
joey.gouly@arm•com, seiden@linux•ibm.com, suzuki.poulose@arm•com,
yuzenghui@huawei•com, catalin.marinas@arm•com, will@kernel•org,
Sascha.Bischoff@arm•com, jic23@kernel•org, timothy.hayes@arm•com,
eric.auger@linaro•org, christoffer.dall@linaro•org,
andre.przywara@arm•com, linux-arm-kernel@lists•infradead.org,
kvmarm@lists•linux.dev
Subject: Re: [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
Date: Fri, 5 Jun 2026 01:43:32 -0700 [thread overview]
Message-ID: <aiKMNIUMv9GQiIbD@kernel.org> (raw)
In-Reply-To: <87ecila0w3.wl-maz@kernel.org>
On Fri, Jun 05, 2026 at 08:42:52AM +0100, Marc Zyngier wrote:
> On Fri, 05 Jun 2026 07:00:37 +0100,
> Oliver Upton <oupton@kernel•org> wrote:
> >
> > On Fri, Jun 05, 2026 at 05:59:15AM +0900, Hyunwoo Kim wrote:
> > > vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
> > > an interrupt to another vCPU. After reacquiring the locks it only checks
> > > that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
> > > before moving the interrupt, which assumes that an interrupt whose affinity
> > > is preserved is still queued on this vCPU's ap_list.
> > >
> > > That assumption no longer holds if the interrupt is taken off the ap_list
> > > while the locks are dropped. vgic_flush_pending_lpis() removes the
> > > interrupt from the list and sets irq->vcpu to NULL, but leaves
> > > enabled/pending/target_vcpu untouched. As the interrupt is still enabled
> > > and pending, vgic_target_oracle() returns the same target_vcpu, so the
> > > affinity check passes and list_del() is run a second time on an entry that
> > > has already been removed.
> > >
> > > Also check that the interrupt is still assigned to this vCPU
> > > (irq->vcpu == vcpu) before moving it.
> > >
> > > Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail•com>
> >
> > Looking at this and the other VGIC patch you sent (which should've been
> > a combined series), are you trying to deal with a vCPU writing to
> > another vCPU's redistributor? I.e. vCPU B setting GICR_CTLR.EnableLPIs=0
> > behind the back of vCPU A?
> >
> > That is extremely relevant information as the off-the-cuff reaction is
> > that no race exists. But since the GIC architecture is awesome and
> > allows for this sort of insanity, it obviously does....
> >
> > Anyway, for LPIs resident on a particular RD, there's zero expectation
> > that the pending state is preserved when EnableLPIs=0. So I'd rather
> > vgic_flush_pending_lpis() just invalidate the pending state.
>
> Just clearing the pending state introduces a potential problem as we
> now have an interrupt that is neither active nor pending on the AP
> list. It is not impossible to solve (we now have similar behaviours
> with SPI deactivation from another vcpu), but that requires posting a
> KVM_REQ_VGIC_PROCESS_UPDATE to the target vcpu.
Right, I was suggesting that in addition to deleting the LPI from the AP
list we actually invalidate the pending state so that someone sitting on
a pointer to a to-be-freed LPI sees vgic_target_oracle() returning
NULL
> > Beyond that, I see two other fixes for lifetime issues around the
> > vgic_irq in the middle of migration. I'd like to see explicit RCU
> > protection around the release && reacquire of the ap_list_lock rather
> > than depending on the precondition that IRQs are disabled.
>
> I'm not sure I follow. Are you suggesting turning the AP list into an
> RCU protected list?
No, sorry, I should expand a little.
We store a reference on the vgic_irq struct in the AP list, which is
stable so long as the ap_list_lock is held. It should be possible for
the refcount to drop to 0 between releasing the ap_list_lock and
reacquiring it.
So either vgic_prune_ap_list() takes an additional reference on the
vgic_irq before dropping the ap_list_lock or rely on RCU to protect
vgic_irq structs observed with a non-zero refcount.
Thanks,
Oliver
prev parent reply other threads:[~2026-06-05 8:43 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-04 20:59 [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it Hyunwoo Kim
2026-06-05 6:00 ` Oliver Upton
2026-06-05 7:42 ` Marc Zyngier
2026-06-05 8:43 ` Oliver Upton [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aiKMNIUMv9GQiIbD@kernel.org \
--to=oupton@kernel$(echo .)org \
--cc=Sascha.Bischoff@arm$(echo .)com \
--cc=andre.przywara@arm$(echo .)com \
--cc=catalin.marinas@arm$(echo .)com \
--cc=christoffer.dall@linaro$(echo .)org \
--cc=eric.auger@linaro$(echo .)org \
--cc=imv4bel@gmail$(echo .)com \
--cc=jic23@kernel$(echo .)org \
--cc=joey.gouly@arm$(echo .)com \
--cc=kvmarm@lists$(echo .)linux.dev \
--cc=linux-arm-kernel@lists$(echo .)infradead.org \
--cc=maz@kernel$(echo .)org \
--cc=seiden@linux$(echo .)ibm.com \
--cc=suzuki.poulose@arm$(echo .)com \
--cc=timothy.hayes@arm$(echo .)com \
--cc=will@kernel$(echo .)org \
--cc=yuzenghui@huawei$(echo .)com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox