From: will.deacon@arm•com (Will Deacon)
To: linux-arm-kernel@lists•infradead.org
Subject: Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches
Date: Fri, 6 Jul 2018 13:26:11 +0100 [thread overview]
Message-ID: <20180706122610.GA6993@arm.com> (raw)
In-Reply-To: <20180702174552.GD23687@arm.com>
Hi Lucas,
On Mon, Jul 02, 2018 at 06:45:52PM +0100, Will Deacon wrote:
> On Mon, Jul 02, 2018 at 03:49:26PM +0200, Lucas Stach wrote:
> > Also the section "External caches", "Before AMBA4 ACE" seems to suggest
> > that the barriers aren't fully propagated to the PL310 write-caches and
> > master ports. I'm unsure if this is just an artifact of the mentioned
> > MMIO access, so handling of normal vs. device memory transactions in
> > the PL310.
>
> I'll check with the hardware folks about the PL310, but I don't see how
> you could propagate barriers over AXI3 anyway because I don't think this
> stuff existed back then.
Damn. As I feared, you can't propagate the barrier to the PL310, so the
A9 is unable to enforce ordering beyond the responses it gets back. The
PL310 will then re-order normal memory accesses (including non-cacheable)
in its store buffer, so we do actually need a bloody CACHE SYNC here.
Patch below. Does it help?
Will
--->8
>From 5a2769b45e94b212d21bc44f5940ed5f3bde5cc2 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm•com>
Date: Wed, 4 Jul 2018 10:56:22 +0100
Subject: [PATCH] ARM: barrier: Kick outer-cache and SoC fabric in dma_wmb()
Prior to AMBA4-ACE, the AXI bus protocol had no support for barrier
transactions or shareability domains and was therefore unable to enforce
ordering between memory accesses either side of a barrier instruction,
requiring the CPU to handle all of the ordering itself.
Unfortunately, IP such as the venerable PL310 will re-order stores to
Normal memory inside its store-buffer, meaning that accesses to a coherent
DMA buffer (i.e. one allocated by dma_alloc_coherent()) can be observed
out-of-order by a non-coherent DMA engine unless the PL310 is kicked with
a CACHE_SYNC command between buffering the accesses. Whilst this is already
the case for mb() and wmb(), dma_wmb() just expands to a DMB OSHST, leading
to sequences such as:
STR Rdata, [Rbuf, #DATA_OFFSET]
DMB OSHST
STR Rvalid, [Rbuf, #VALID_OFFSET]
being reordered such that the DMA device sees the write of valid but not
the write of data.
This patch ensures that the SoC fabric and the outer-cache are kicked
by dma_wmb() if they are able to buffer transactions outside of the CPU
but before the Point of Coherency.
Reported-by: Lucas Stach <l.stach@pengutronix•de>
Signed-off-by: Will Deacon <will.deacon@arm•com>
---
arch/arm/include/asm/barrier.h | 14 ++++++++++----
arch/arm/include/asm/outercache.h | 6 ++++++
arch/arm/mm/flush.c | 24 +++++++++++++++++++++++-
3 files changed, 39 insertions(+), 5 deletions(-)
diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 69772e742a0a..e31833f72b69 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -53,17 +53,23 @@
#ifdef CONFIG_ARM_HEAVY_MB
extern void (*soc_mb)(void);
extern void arm_heavy_mb(void);
-#define __arm_heavy_mb(x...) do { dsb(x); arm_heavy_mb(); } while (0)
+extern void arm_heavy_wmb(void);
+extern void arm_heavy_dma_wmb(void);
+#define __arm_heavy_mb() arm_heavy_mb()
+#define __arm_heavy_wmb() arm_heavy_wmb()
+#define __arm_heavy_dma_wmb() arm_heavy_dma_wmb()
#else
-#define __arm_heavy_mb(x...) dsb(x)
+#define __arm_heavy_mb() dsb()
+#define __arm_heavy_wmb() dsb(st)
+#define __arm_heavy_dma_wmb() dmb(oshst)
#endif
#if defined(CONFIG_ARM_DMA_MEM_BUFFERABLE) || defined(CONFIG_SMP)
#define mb() __arm_heavy_mb()
#define rmb() dsb()
-#define wmb() __arm_heavy_mb(st)
+#define wmb() __arm_heavy_wmb()
#define dma_rmb() dmb(osh)
-#define dma_wmb() dmb(oshst)
+#define dma_wmb() __arm_heavy_dma_wmb()
#else
#define mb() barrier()
#define rmb() barrier()
diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index c2bf24f40177..e744efe6e2a7 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -43,6 +43,12 @@ struct outer_cache_fns {
extern struct outer_cache_fns outer_cache;
+#ifdef CONFIG_OUTER_CACHE_SYNC
+static inline bool outer_cache_has_sync(void) { return outer_cache.sync; }
+#else
+static inline bool outer_cache_has_sync(void) { return 0; }
+#endif
+
#ifdef CONFIG_OUTER_CACHE
/**
* outer_inv_range - invalidate range of outer cache lines
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 58469623b015..1252689362d8 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -24,7 +24,7 @@
#ifdef CONFIG_ARM_HEAVY_MB
void (*soc_mb)(void);
-void arm_heavy_mb(void)
+static void __arm_heavy_sync(void)
{
#ifdef CONFIG_OUTER_CACHE_SYNC
if (outer_cache.sync)
@@ -33,7 +33,29 @@ void arm_heavy_mb(void)
if (soc_mb)
soc_mb();
}
+
+void arm_heavy_mb(void)
+{
+ dsb();
+ __arm_heavy_sync();
+}
EXPORT_SYMBOL(arm_heavy_mb);
+
+void arm_heavy_wmb(void)
+{
+ dsb(st);
+ __arm_heavy_sync();
+}
+EXPORT_SYMBOL(arm_heavy_wmb);
+
+void arm_heavy_dma_wmb(void)
+{
+ if (outer_cache_has_sync() || soc_mb)
+ arm_heavy_wmb();
+ else
+ dmb(oshst);
+}
+EXPORT_SYMBOL(arm_heavy_dma_wmb);
#endif
#ifdef CONFIG_CPU_CACHE_VIPT
--
2.1.4
next prev parent reply other threads:[~2018-07-06 12:26 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-29 12:28 Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches Lucas Stach
2018-06-29 14:25 ` Russell King - ARM Linux
2018-06-29 16:19 ` Lucas Stach
2018-06-29 16:22 ` Will Deacon
2018-06-29 16:48 ` Russell King - ARM Linux
2018-06-29 17:43 ` Will Deacon
2018-06-29 18:01 ` Russell King - ARM Linux
2018-07-02 13:49 ` Lucas Stach
2018-07-02 17:45 ` Will Deacon
2018-07-06 12:26 ` Will Deacon [this message]
2018-07-09 6:20 ` Oleksij Rempel
2018-09-13 13:17 ` Will Deacon
2018-09-13 14:09 ` Oleksij Rempel
2018-07-09 9:45 ` Lucas Stach
2018-06-29 17:14 ` Lucas Stach
2018-06-29 17:46 ` Will Deacon
2018-07-02 9:58 ` Lucas Stach
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180706122610.GA6993@arm.com \
--to=will.deacon@arm$(echo .)com \
--cc=linux-arm-kernel@lists$(echo .)infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox