[PATCH] arm: spin one more cycle in timer-based delays

public inbox for linux-arm-kernel@lists.infradead.org 
 help / color / mirror / Atom feed

From: slash.tmp@free•fr (Mason)
To: linux-arm-kernel@lists•infradead.org
Subject: [PATCH] arm: spin one more cycle in timer-based delays
Date: Fri, 18 Nov 2016 18:51:19 +0100	[thread overview]
Message-ID: <582F3F97.2010701@free.fr> (raw)
In-Reply-To: <582F0DD2.3030805@free.fr>

On 18/11/2016 15:18, Mason wrote:
> On 18/11/2016 13:54, Russell King - ARM Linux wrote:
>> On Fri, Nov 18, 2016 at 12:06:30PM +0000, Will Deacon wrote:
>>> On Tue, Nov 15, 2016 at 02:36:33PM +0100, Mason wrote:
>>>> When polling a tick counter in a busy loop, one might sample the
>>>> counter just *before* it is updated, and then again just *after*
>>>> it is updated. In that case, while mathematically v2-v1 equals 1,
>>>> only epsilon has really passed.
>>>>
>>>> So, if a caller requests an N-cycle delay, we spin until v2-v1
>>>> is strictly greater than N to avoid these random corner cases.
>>>>
>>>> Signed-off-by: Mason <slash.tmp@free•fr>
>>>> ---
>>>>  arch/arm/lib/delay.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/arm/lib/delay.c b/arch/arm/lib/delay.c
>>>> index 69aad80a3af4..3f1cd15ca102 100644
>>>> --- a/arch/arm/lib/delay.c
>>>> +++ b/arch/arm/lib/delay.c
>>>> @@ -60,7 +60,7 @@ static void __timer_delay(unsigned long cycles)
>>>>  {
>>>>  	cycles_t start = get_cycles();
>>>>  
>>>> -	while ((get_cycles() - start) < cycles)
>>>> +	while ((get_cycles() - start) <= cycles)
>>>>  		cpu_relax();
>>>>  }
>>>
>>> I thought a bit about this last night. It is well known that the delay
>>> routines are not guaranteed to be accurate, and I don't *think* that's
>>> limited to over-spinning, so arguably this isn't a bug. However, taking
>>> the approach that "drivers should figure it out" is also unhelpful,
>>> because the frequency of the underlying counter isn't generally known
>>> and so drivers can't simply adjust the delay parameter.
>>
>> I don't think this change makes any sense whatsoever.  udelay() is
>> inaccurate, period.  It _will_ give delays shorter than requested
>> for many reasons, many of which can't be fixed.
> 
> If I understand correctly, udelay is, in fact, a wrapper around
> several possible implementations. (Run-time decision on arch/arm)
> 
> 1) The "historical" software loop-based method, which is calibrated
> during init.
> 
> 2) A hardware solution, based on an actual timer, ticking away
> at a constant frequency.
> 
> 3) others?
> 
> Solution 1 (which probably dates back to Linux 0.1) suffers from
> the inaccuracies you point out, because of interrupt latency
> during calibration, DVFS, etc.
> 
> On the other hand, it is simple enough to implement solution 2
> in a way that guarantees that spin_for_n_cycles(n) spins
> /at least/ for n cycles.
> 
> On platforms using timer-based delays, there can indeed exist
> a function guaranteeing a minimal delay.
> 
> 
>> Having a super-accurate version just encourages people to write broken
>> drivers which assume (eg) that udelay(10) will give _at least_ a 10us
>> delay.  However, there is no such guarantee.
> 
> You say that calling udelay(10) expecting at least 10 ?s is broken.
> This fails the principle of least astonishment in a pretty big way.
> (If it returns after 9.9 ?s, I suppose it could be OK. But for
> shorter delays, the relative error grows.)
> 
> A lot of drivers implement a spec that says
> "do this, wait 1 ?s, do that".
> 
> Driver writers would typically write
> do_this(); udelay(1); do_that();
> 
> Do you suggest they should write udelay(2); ?
> But then, it's not so easy to follow the spec because everything
> has been translated to a different number.
> Hide everything behind a macro?
> 
> Note that people have been using ndelay too.
> (I count 182 in drivers, 288 overall.)
> 
> drivers/cpufreq/s3c2440-cpufreq.c:      ndelay(20);
> 
> I don't think the author expects this to return immediately.
> 
> 
>> So, having udelay() for timers return slightly short is actually a good
>> thing - it causes people not to make the mistake to be soo accurate
>> with their delay specifications.
> 
> I disagree that having an implementation which introduces
> hard-to-track-down random heisenbugs is a good thing.
> 
> 
>> So, NAK on this change.  udelay is not super-accurate.
> 
> usleep_range() fixed this issue recently.
> 6c5e9059692567740a4ee51530dffe51a4b9584d
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=timers/core&id=6c5e9059692567740a4ee51530dffe51a4b9584d
> 
> Do you suggest driver writers should use usleep_range()
> instead of udelay?
> (When does usleep_range start making sense? Around 50/100 ?s
> and above?)
> 
>> Reference (and this is the most important one):
>>
>>   http://lists.openwall.net/linux-kernel/2011/01/12/372

I forgot to explain how I came to work in this area of the code.

When my NAND controller's driver probes, it scans the chips
for bad blocks, which generates 65000 calls to ndelay(100);
(for 1 GB of NAND). For larger sizes of NAND, the number can
climb to 500k calls.

Currently, on arch/arm, ndelay is rounded *up* to the
nearest microsecond. So this might generate a total
delay of up to 500 ms, when 50 ms would be sufficient.

So I locally defined ndelay as
#define NDELAY_MULT	((UL(2199)    * HZ) >> 11)
#define ndelay(n)	__const_udelay((n) * NDELAY_MULT)

I use a 27 MHz tick counter (37 ns period).
ndelay() was resolving to __timer_delay(2)
which might delay only a single tick, so 37 ns.

So the NAND framework occasionally (falsely) flags a block
as bad (random).

There are two sources of error in the timer-based code:
1) rounding down in __timer_const_udelay => loses 1 cycle
2) spinning one cycle too few => loses 1 cycle

With these two errors corrected, I am able to init the
NAND driver faster, with no blocks incorrectly marked
as bad.

Regards.

next prev parent reply	other threads:[~2016-11-18 17:51 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-15 13:36 [PATCH] arm: spin one more cycle in timer-based delays Mason
2016-11-18 12:06 ` Will Deacon
2016-11-18 12:24   ` Mason
2016-11-18 12:54   ` Russell King - ARM Linux
2016-11-18 14:18     ` Mason
2016-11-18 14:42       ` Peter Zijlstra
2016-11-18 17:51       ` Mason [this message]
2016-11-19  7:17       ` Afzal Mohammed
2016-11-19 11:03         ` Russell King - ARM Linux
2016-11-19 18:29           ` Mason
2016-11-20 19:18             ` Doug Anderson
2016-11-20 19:44               ` Russell King - ARM Linux
2016-11-20 20:00                 ` Mason
2016-11-20  6:15           ` Afzal Mohammed
2016-11-20 19:15           ` Doug Anderson
2016-11-18 17:13     ` Doug Anderson
2016-11-18 17:32       ` Russell King - ARM Linux

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=582F3F97.2010701@free.fr \
    --to=slash.tmp@free$(echo .)fr \
    --cc=linux-arm-kernel@lists$(echo .)infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox