public inbox for linux-arm-kernel@lists.infradead.org 
 help / color / mirror / Atom feed
From: will.deacon@arm•com (Will Deacon)
To: linux-arm-kernel@lists•infradead.org
Subject: Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches
Date: Fri, 6 Jul 2018 13:26:11 +0100	[thread overview]
Message-ID: <20180706122610.GA6993@arm.com> (raw)
In-Reply-To: <20180702174552.GD23687@arm.com>

Hi Lucas,

On Mon, Jul 02, 2018 at 06:45:52PM +0100, Will Deacon wrote:
> On Mon, Jul 02, 2018 at 03:49:26PM +0200, Lucas Stach wrote:
> > Also the section "External caches", "Before AMBA4 ACE" seems to suggest
> > that the barriers aren't fully propagated to the PL310 write-caches and
> > master ports. I'm unsure if this is just an artifact of the mentioned
> > MMIO access, so handling of normal vs. device memory transactions in
> > the PL310.
> 
> I'll check with the hardware folks about the PL310, but I don't see how
> you could propagate barriers over AXI3 anyway because I don't think this
> stuff existed back then.

Damn. As I feared, you can't propagate the barrier to the PL310, so the
A9 is unable to enforce ordering beyond the responses it gets back. The
PL310 will then re-order normal memory accesses (including non-cacheable)
in its store buffer, so we do actually need a bloody CACHE SYNC here.

Patch below. Does it help?

Will

--->8

>From 5a2769b45e94b212d21bc44f5940ed5f3bde5cc2 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm•com>
Date: Wed, 4 Jul 2018 10:56:22 +0100
Subject: [PATCH] ARM: barrier: Kick outer-cache and SoC fabric in dma_wmb()

Prior to AMBA4-ACE, the AXI bus protocol had no support for barrier
transactions or shareability domains and was therefore unable to enforce
ordering between memory accesses either side of a barrier instruction,
requiring the CPU to handle all of the ordering itself.

Unfortunately, IP such as the venerable PL310 will re-order stores to
Normal memory inside its store-buffer, meaning that accesses to a coherent
DMA buffer (i.e. one allocated by dma_alloc_coherent()) can be observed
out-of-order by a non-coherent DMA engine unless the PL310 is kicked with
a CACHE_SYNC command between buffering the accesses. Whilst this is already
the case for mb() and wmb(), dma_wmb() just expands to a DMB OSHST, leading
to sequences such as:

	STR	Rdata, [Rbuf, #DATA_OFFSET]
	DMB	OSHST
	STR	Rvalid, [Rbuf, #VALID_OFFSET]

being reordered such that the DMA device sees the write of valid but not
the write of data.

This patch ensures that the SoC fabric and the outer-cache are kicked
by dma_wmb() if they are able to buffer transactions outside of the CPU
but before the Point of Coherency.

Reported-by: Lucas Stach <l.stach@pengutronix•de>
Signed-off-by: Will Deacon <will.deacon@arm•com>
---
 arch/arm/include/asm/barrier.h    | 14 ++++++++++----
 arch/arm/include/asm/outercache.h |  6 ++++++
 arch/arm/mm/flush.c               | 24 +++++++++++++++++++++++-
 3 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 69772e742a0a..e31833f72b69 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -53,17 +53,23 @@
 #ifdef CONFIG_ARM_HEAVY_MB
 extern void (*soc_mb)(void);
 extern void arm_heavy_mb(void);
-#define __arm_heavy_mb(x...) do { dsb(x); arm_heavy_mb(); } while (0)
+extern void arm_heavy_wmb(void);
+extern void arm_heavy_dma_wmb(void);
+#define __arm_heavy_mb()	arm_heavy_mb()
+#define __arm_heavy_wmb()	arm_heavy_wmb()
+#define __arm_heavy_dma_wmb()	arm_heavy_dma_wmb()
 #else
-#define __arm_heavy_mb(x...) dsb(x)
+#define __arm_heavy_mb()	dsb()
+#define __arm_heavy_wmb()	dsb(st)
+#define __arm_heavy_dma_wmb()	dmb(oshst)
 #endif
 
 #if defined(CONFIG_ARM_DMA_MEM_BUFFERABLE) || defined(CONFIG_SMP)
 #define mb()		__arm_heavy_mb()
 #define rmb()		dsb()
-#define wmb()		__arm_heavy_mb(st)
+#define wmb()		__arm_heavy_wmb()
 #define dma_rmb()	dmb(osh)
-#define dma_wmb()	dmb(oshst)
+#define dma_wmb()	__arm_heavy_dma_wmb()
 #else
 #define mb()		barrier()
 #define rmb()		barrier()
diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index c2bf24f40177..e744efe6e2a7 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -43,6 +43,12 @@ struct outer_cache_fns {
 
 extern struct outer_cache_fns outer_cache;
 
+#ifdef CONFIG_OUTER_CACHE_SYNC
+static inline bool outer_cache_has_sync(void) { return outer_cache.sync; }
+#else
+static inline bool outer_cache_has_sync(void) { return 0; }
+#endif
+
 #ifdef CONFIG_OUTER_CACHE
 /**
  * outer_inv_range - invalidate range of outer cache lines
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 58469623b015..1252689362d8 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -24,7 +24,7 @@
 #ifdef CONFIG_ARM_HEAVY_MB
 void (*soc_mb)(void);
 
-void arm_heavy_mb(void)
+static void __arm_heavy_sync(void)
 {
 #ifdef CONFIG_OUTER_CACHE_SYNC
 	if (outer_cache.sync)
@@ -33,7 +33,29 @@ void arm_heavy_mb(void)
 	if (soc_mb)
 		soc_mb();
 }
+
+void arm_heavy_mb(void)
+{
+	dsb();
+	__arm_heavy_sync();
+}
 EXPORT_SYMBOL(arm_heavy_mb);
+
+void arm_heavy_wmb(void)
+{
+	dsb(st);
+	__arm_heavy_sync();
+}
+EXPORT_SYMBOL(arm_heavy_wmb);
+
+void arm_heavy_dma_wmb(void)
+{
+	if (outer_cache_has_sync() || soc_mb)
+		arm_heavy_wmb();
+	else
+		dmb(oshst);
+}
+EXPORT_SYMBOL(arm_heavy_dma_wmb);
 #endif
 
 #ifdef CONFIG_CPU_CACHE_VIPT
-- 
2.1.4

  reply	other threads:[~2018-07-06 12:26 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-29 12:28 Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches Lucas Stach
2018-06-29 14:25 ` Russell King - ARM Linux
2018-06-29 16:19   ` Lucas Stach
2018-06-29 16:22   ` Will Deacon
2018-06-29 16:48     ` Russell King - ARM Linux
2018-06-29 17:43       ` Will Deacon
2018-06-29 18:01         ` Russell King - ARM Linux
2018-07-02 13:49         ` Lucas Stach
2018-07-02 17:45           ` Will Deacon
2018-07-06 12:26             ` Will Deacon [this message]
2018-07-09  6:20               ` Oleksij Rempel
2018-09-13 13:17                 ` Will Deacon
2018-09-13 14:09                   ` Oleksij Rempel
2018-07-09  9:45               ` Lucas Stach
2018-06-29 17:14     ` Lucas Stach
2018-06-29 17:46       ` Will Deacon
2018-07-02  9:58         ` Lucas Stach

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180706122610.GA6993@arm.com \
    --to=will.deacon@arm$(echo .)com \
    --cc=linux-arm-kernel@lists$(echo .)infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox