public inbox for linux-arm-kernel@lists.infradead.org 
 help / color / mirror / Atom feed
From: javi.merino@arm•com (Javi Merino)
To: linux-arm-kernel@lists•infradead.org
Subject: [PATCH] ARM: pl330: Fix a race condition
Date: Tue, 20 Sep 2011 14:36:24 +0100	[thread overview]
Message-ID: <4E7896D8.5090606@arm.com> (raw)
In-Reply-To: <CABb+yY0BGHSMA08uDMNkdSehFkQ_9afzzbkdpPkD0cz3owVWBQ@mail.gmail.com>

On 19/09/11 19:07, Jassi Brar wrote:
> On Mon, Sep 19, 2011 at 10:41 PM, Javi Merino <javi.merino@arm•com> wrote:
>> If two requests have been submitted and one of them is running, if you
>> call pl330_chan_ctrl(ch_id, PL330_OP_START), there's a window of time
>> between the spin_lock_irqsave() and the _state() check in which the
>> running transaction may finish.  In that case, we don't receive the
>> interrupt (because they are disabled), but _start() sees that the DMA
>> is stopped, so it starts it.  The problem is that it sends the
>> transaction that has just finished again, because pl330_update()
>> hasn't mark it as done yet.
>>
>> This patch moves the _state() check out of the critical section, which
>> removes the race condition.  It also treats PL330_STATE_COMPLETING as
>> still executing, because that introduces another race condition now
>> that we call _state() with interrupts enabled.  Namely, if we read the
>> state as "completing" and the DMA sends the interrupt before we
>> disable interrupts, pl330_update() starts the next transaction and
>> returns.  Then the _start() in pl330_chan_ctrl() will patiently wait
>> until the just issued transaction finishes (because the state we read
>> was PL330_STATE_COMPLETING) and when it does, it _trigger()s the same
>> transaction again.
>>
>> Signed-off-by: Javi Merino <javi.merino@arm•com>
>> Cc: Jassi Brar <jassi.brar@samsung•com>
>> ---
>>  arch/arm/common/pl330.c |   12 +++++++-----
>>  1 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm/common/pl330.c b/arch/arm/common/pl330.c
>> index 97912fa..26b5615 100644
>> --- a/arch/arm/common/pl330.c
>> +++ b/arch/arm/common/pl330.c
>> @@ -936,9 +936,9 @@ static bool _trigger(struct pl330_thread *thrd)
>>        return true;
>>  }
>>
>> -static bool _start(struct pl330_thread *thrd)
>> +static bool _start(struct pl330_thread *thrd, u32 state)
>>  {
>> -       switch (_state(thrd)) {
>> +       switch (state) {
>>        case PL330_STATE_FAULT_COMPLETING:
>>                UNTIL(thrd, PL330_STATE_FAULTING | PL330_STATE_KILLING);
>>
>> @@ -949,7 +949,6 @@ static bool _start(struct pl330_thread *thrd)
>>                _stop(thrd);
>>
>>        case PL330_STATE_KILLING:
>> -       case PL330_STATE_COMPLETING:
>>                UNTIL(thrd, PL330_STATE_STOPPED)
>>
>>        case PL330_STATE_STOPPED:
>> @@ -961,6 +960,7 @@ static bool _start(struct pl330_thread *thrd)
>>        case PL330_STATE_UPDTPC:
>>        case PL330_STATE_CACHEMISS:
>>        case PL330_STATE_EXECUTING:
>> +       case PL330_STATE_COMPLETING:
>>                return true;
>>
>>        case PL330_STATE_WFE: /* For RESUME, nothing yet */
>> @@ -1471,7 +1471,7 @@ int pl330_update(const struct pl330_info *pi)
>>                        MARK_FREE(rqdone);
>>
>>                        /* Get going again ASAP */
>> -                       _start(thrd);
>> +                       _start(thrd, _state(thrd));
>>
>>                        /* For now, just make a list of callbacks to be done */
>>                        list_add_tail(&rqdone->rqd, &pl330->req_done);
>> @@ -1510,12 +1510,14 @@ int pl330_chan_ctrl(void *ch_id, enum pl330_chan_op op)
>>        struct pl330_dmac *pl330;
>>        unsigned long flags;
>>        int ret = 0, active;
>> +       u32 dma_state;
>>
>>        if (!thrd || thrd->free || thrd->dmac->state == DYING)
>>                return -EINVAL;
>>
>>        pl330 = thrd->dmac;
>>
>> +       dma_state = _state(thrd);
>>        spin_lock_irqsave(&pl330->lock, flags);
>>
>>        switch (op) {
>> @@ -1546,7 +1548,7 @@ int pl330_chan_ctrl(void *ch_id, enum pl330_chan_op op)
>>
>>                /* Start the next */
>>        case PL330_OP_START:
>> -               if (!_start(thrd))
>> +               if (!_start(thrd, dma_state))
>>                        ret = -EIO;
>>                break;
>>
>> --
> 
> IIUIC your race scenario should be taken care of simply by doing...
> 
> diff --git a/arch/arm/common/pl330.c b/arch/arm/common/pl330.c
> index 97912fa..7129cfb 100644
> --- a/arch/arm/common/pl330.c
> +++ b/arch/arm/common/pl330.c
> @@ -1546,7 +1546,7 @@ int pl330_chan_ctrl(void *ch_id, enum pl330_chan_op op)
> 
>  		/* Start the next */
>  	case PL330_OP_START:
> -		if (!_start(thrd))
> +		if (!_thrd_active(thrd) && !_start(thrd))
>  			ret = -EIO;
>  		break;
> 

My first reaction was this was just moving the race condition, but
thinking about it carefully, it looks like a better (simpler) solution.

> Could you please test if it fixes the issue?

Yes, I'll kick off runs later today and see how it goes.  I don't have a
way to reproduce the bug consistently, I just do lots of DMA
transactions and eventually, I end up hitting it.  I'll run it overnight
and see if it fixes it or not.

Cheers,
Javi

  reply	other threads:[~2011-09-20 13:36 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-19 17:11 [PATCH] ARM: pl330: Fix a race condition Javi Merino
2011-09-19 18:07 ` Jassi Brar
2011-09-20 13:36   ` Javi Merino [this message]
2011-10-05 12:57     ` Javi Merino
2011-10-06  9:10       ` [PATCH v2] " Javi Merino
2011-11-05 19:05         ` Thomas Abraham
2011-11-07 10:48           ` Javi Merino
2011-11-07 11:03             ` Thomas Abraham
2011-11-28  8:23             ` Boojin Kim
2011-11-28 16:36               ` Javi Merino
2011-11-29  3:41                 ` Boojin Kim
2011-11-29  9:53                   ` Javi Merino
2011-11-29 10:37                     ` Jassi Brar
2011-12-07  7:52                       ` Kukjin Kim
2011-12-07 10:01                         ` Javi Merino
2011-12-07 20:54                           ` Javi Merino
2011-12-09 11:58                             ` Javi Merino
2011-12-09 13:04                               ` Jassi Brar
2011-12-09 13:41                                 ` Javi Merino
2011-12-09 14:15                                   ` Jassi Brar
2011-12-09 14:52                                     ` Javi Merino
2011-12-09 16:50                                       ` Jassi Brar
2011-12-09 19:50                                         ` Javi Merino
2011-12-11 10:51                                           ` Jassi Brar
2011-12-11 15:09                                             ` Javi Merino
2011-12-11 17:10                                               ` Jassi Brar
2011-12-11 17:42                                                 ` Javi Merino
2011-12-11 19:27                                                   ` [PATCH] ARM: PL330: Fix driver freeze Javi Merino
2011-12-15 17:48                                                     ` Javi Merino
2011-12-16  9:01                                                       ` Tushar Behera
2011-12-16  6:27                                                     ` Jassi Brar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4E7896D8.5090606@arm.com \
    --to=javi.merino@arm$(echo .)com \
    --cc=linux-arm-kernel@lists$(echo .)infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox