From: catalin.marinas@arm•com (Catalin Marinas)
To: linux-arm-kernel@lists•infradead.org
Subject: [PATCH 1/2] arm64: atomics: fix use of acquire + release for full barrier semantics
Date: Wed, 19 Feb 2014 17:05:15 +0000 [thread overview]
Message-ID: <20140219170515.GE22252@arm.com> (raw)
In-Reply-To: <1391516953-14541-1-git-send-email-will.deacon@arm.com>
On Tue, Feb 04, 2014 at 12:29:12PM +0000, Will Deacon wrote:
> Linux requires a number of atomic operations to provide full barrier
> semantics, that is no memory accesses after the operation can be
> observed before any accesses up to and including the operation in
> program order.
>
> On arm64, these operations have been incorrectly implemented as follows:
>
> // A, B, C are independent memory locations
>
> <Access [A]>
>
> // atomic_op (B)
> 1: ldaxr x0, [B] // Exclusive load with acquire
> <op(B)>
> stlxr w1, x0, [B] // Exclusive store with release
> cbnz w1, 1b
>
> <Access [C]>
>
> The assumption here being that two half barriers are equivalent to a
> full barrier, so the only permitted ordering would be A -> B -> C
> (where B is the atomic operation involving both a load and a store).
>
> Unfortunately, this is not the case by the letter of the architecture
> and, in fact, the accesses to A and C are permitted to pass their
> nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs
> or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the
> store-release on B). This is a clear violation of the full barrier
> requirement.
>
> The simple way to fix this is to implement the same algorithm as ARMv7
> using explicit barriers:
>
> <Access [A]>
>
> // atomic_op (B)
> dmb ish // Full barrier
> 1: ldxr x0, [B] // Exclusive load
> <op(B)>
> stxr w1, x0, [B] // Exclusive store
> cbnz w1, 1b
> dmb ish // Full barrier
>
> <Access [C]>
>
> but this has the undesirable effect of introducing *two* full barrier
> instructions. A better approach is actually the following, non-intuitive
> sequence:
>
> <Access [A]>
>
> // atomic_op (B)
> 1: ldxr x0, [B] // Exclusive load
> <op(B)>
> stlxr w1, x0, [B] // Exclusive store with release
> cbnz w1, 1b
> dmb ish // Full barrier
>
> <Access [C]>
>
> The simple observations here are:
>
> - The dmb ensures that no subsequent accesses (e.g. the access to C)
> can enter or pass the atomic sequence.
>
> - The dmb also ensures that no prior accesses (e.g. the access to A)
> can pass the atomic sequence.
>
> - Therefore, no prior access can pass a subsequent access, or
> vice-versa (i.e. A is strictly ordered before C).
>
> - The stlxr ensures that no prior access can pass the store component
> of the atomic operation.
>
> The only tricky part remaining is the ordering between the ldxr and the
> access to A, since the absence of the first dmb means that we're now
> permitting re-ordering between the ldxr and any prior accesses.
>
> From an (arbitrary) observer's point of view, there are two scenarios:
>
> 1. We have observed the ldxr. This means that if we perform a store to
> [B], the ldxr will still return older data. If we can observe the
> ldxr, then we can potentially observe the permitted re-ordering
> with the access to A, which is clearly an issue when compared to
> the dmb variant of the code. Thankfully, the exclusive monitor will
> save us here since it will be cleared as a result of the store and
> the ldxr will retry. Notice that any use of a later memory
> observation to imply observation of the ldxr will also imply
> observation of the access to A, since the stlxr/dmb ensure strict
> ordering.
>
> 2. We have not observed the ldxr. This means we can perform a store
> and influence the later ldxr. However, that doesn't actually tell
> us anything about the access to [A], so we've not lost anything
> here either when compared to the dmb variant.
>
> This patch implements this solution for our barriered atomic operations,
> ensuring that we satisfy the full barrier requirements where they are
> needed.
>
> Cc: <stable@vger•kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm•com>
> Cc: Peter Zijlstra <peterz@infradead•org>
> Signed-off-by: Will Deacon <will.deacon@arm•com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm•com>
prev parent reply other threads:[~2014-02-19 17:05 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-04 12:29 [PATCH 1/2] arm64: atomics: fix use of acquire + release for full barrier semantics Will Deacon
2014-02-04 12:29 ` [PATCH 2/2] arm64: asm: remove redundant "cc" clobbers Will Deacon
2014-02-19 17:05 ` Catalin Marinas
2014-02-04 16:43 ` [PATCH 1/2] arm64: atomics: fix use of acquire + release for full barrier semantics Peter Zijlstra
2014-02-04 17:16 ` Will Deacon
2014-02-19 17:05 ` Catalin Marinas [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140219170515.GE22252@arm.com \
--to=catalin.marinas@arm$(echo .)com \
--cc=linux-arm-kernel@lists$(echo .)infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox