A problem about SPI Interrupt Configuration

public inbox for linux-arm-kernel@lists.infradead.org 
 help / color / mirror / Atom feed

From: marc.zyngier@arm•com (Marc Zyngier)
To: linux-arm-kernel@lists•infradead.org
Subject: A problem about SPI Interrupt Configuration
Date: Thu, 28 Jan 2016 08:50:40 +0000	[thread overview]
Message-ID: <56A9D660.2060905@arm.com> (raw)
In-Reply-To: <56A97DB8.2000402@huawei.com>

On 28/01/16 02:32, Yang Yingliang wrote:
> 
> 
> On 2016/1/22 15:57, Marc Zyngier wrote:
>> On Wed, 20 Jan 2016 10:38:13 +0800
>> Yang Yingliang <yangyingliang@huawei•com> wrote:
>>
>> Hi Yang,
>>
>>> Hi, Marc
>>>
>>> I got some error messages "RWP timeout, gone fishing".
>>>
>>> The case is :
>>>
>>> 			CPU0												CPU1
>>> 													acquire desc->lock in __setup_irq()
>>> 															enable irq in __setup_irq()
>>> read iar in gic_handle_irq()
>>> waiting for desc->lock in handle_fasteoi_irq()
>>> 												call gic_set_affinity() from setup_affinity()
>>> 												waiting for the irq deactive in gic_do_wait_for_rwp()
>>>
>>>
>>> The hardware will not clear GICD_CTLR.RWP until the interrupt is not
>>> active. The interrupt is keeping active while it's waiting for
>>> desc->lock on cpu0. But the lock is hold by cpu1 while it's waiting for
>>> the interrupt is not active. It causes a deadlock here in 1s.
>>>
>>>
>>> And the GICv3 SPEC says:
>>>
>>> 4.5.5 SPI Interrupt Configuration
>>> To configure an SPI interrupt, to ensure that interrupts are never
>>> distributed using partially updated configuration
>>> information, software must:
>>> o Ensure the interrupt is not active
>>> o Ensure that the interrupt is disabled
>>> o This might be done either by writing to GICD_CTLR to clear the enables
>>> for a group, or
>>> o By writing to GICD_ICENABLERn to clear the Enable bit of the interrupt
>>> (see section 5.3.11).
>>> o In both cases, software must poll GICD_CTLR.RWP to ensure the effects
>>> are visible (see section
>>> 5.3.20).
>>> o Program the routing (if appropriate), priority and group
>>> o Enable the interrupt (if required)
>>>
>>> Because it says "Ensure the interrupt is not active", so I can not tell
>>> it is a hardware or software problem.
>>>
>>> Can you please give some advice?
>>
>> Thanks for the accurate description of the problem. This looks to be a
>> core issue, or at least a problem between core code and the way the GIC
>> behaves, unfortunately. The architecture expects the interrupt to be
>> fully configured before it is enabled and made active, while the core
>> code does this the other way around.
>>
>> Can you please have a go at the patch below and let me know if it
>> improve things? This is just a test, and definitely not the complete
>> solution, but I'd like to find out if I'm on the right track.
>>
>> Thanks,
>>
>> 	M.
>>
>> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
>> index 0eebaee..e5802fb 100644
>> --- a/kernel/irq/manage.c
>> +++ b/kernel/irq/manage.c
>> @@ -1303,12 +1303,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
>>   		if (new->flags & IRQF_ONESHOT)
>>   			desc->istate |= IRQS_ONESHOT;
>>
>> -		if (irq_settings_can_autoenable(desc))
>> -			irq_startup(desc, true);
>> -		else
>> -			/* Undo nested disables: */
>> -			desc->depth = 1;
>> -
>>   		/* Exclude IRQ from balancing if requested */
>>   		if (new->flags & IRQF_NOBALANCING) {
>>   			irq_settings_set_no_balancing(desc);
>> @@ -1318,6 +1312,12 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
>>   		/* Set default affinity mask once everything is setup */
>>   		setup_affinity(desc, mask);
>>
>> +		if (irq_settings_can_autoenable(desc))
>> +			irq_startup(desc, true);
>> +		else
>> +			/* Undo nested disables: */
>> +			desc->depth = 1;
>> +
>>   	} else if (new->flags & IRQF_TRIGGER_MASK) {
>>   		unsigned int nmsk = new->flags & IRQF_TRIGGER_MASK;
>>   		unsigned int omsk = irq_settings_get_trigger_mask(desc);
>>
> 
> This patch can avoid the case I said in this mail.
> But in other case like set affinity by the /proc/irq/x/smp_affinity,
> it's will have the same problem _if_ the hardware is waiting for deactive.

Indeed, and the more I think of it, the more I'm convinced that what you
are looking at is a hardware bug.

If RWP is gated by an interrupt being active, or the disable is gated by
the interrupt being active, then you will end-up in all kind of horrible
problems that software *cannot* solve. The text you quoted earlier is
not something that can be enforced from a software perspective (it
contains a race condition that cannot be avoided).

Furthermore, take the example of a VM that has an active interrupt
linked to a physical interrupt (HW bit set in the List Register) while
you are changing the affinity on the host. Nothing guarantees that the
deactivate will *ever* happen (the guest could decide it doesn't need to
do it, or takes a very long time to do so). In that case, you will
deadlock the same way, and there is nothing you can do from a SW PoV to
solve it.

I suggest you get back to your HW folks, and explain that what they have
here is not a viable design.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

     prev parent reply	other threads:[~2016-01-28  8:50 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-20  2:38 A problem about SPI Interrupt Configuration Yang Yingliang
2016-01-22  7:57 ` Marc Zyngier
2016-01-28  2:32   ` Yang Yingliang
2016-01-28  8:50     ` Marc Zyngier [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56A9D660.2060905@arm.com \
    --to=marc.zyngier@arm$(echo .)com \
    --cc=linux-arm-kernel@lists$(echo .)infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox