public inbox for linux-arm-kernel@lists.infradead.org 
 help / color / mirror / Atom feed
From: gregory.clement@free-electrons•com (Gregory CLEMENT)
To: linux-arm-kernel@lists•infradead.org
Subject: [PATCH 3/4] ARM: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead
Date: Wed, 15 May 2013 15:18:53 +0200	[thread overview]
Message-ID: <51938B3D.2070508@free-electrons.com> (raw)
In-Reply-To: <1364235581-17900-4-git-send-email-will.deacon@arm.com>

Hi Will,

On 03/25/2013 07:19 PM, Will Deacon wrote:
> Many ARMv7 cores have hardware page table walkers that can read the L1
> cache. This is discoverable from the ID_MMFR3 register, although this
> can be expensive to access from the low-level set_pte functions and is a
> pain to cache, particularly with multi-cluster systems.
> 
> A useful observation is that the multi-processing extensions for ARMv7
> require coherent table walks, meaning that we can make use of ALT_SMP
> patching in proc-v7-* to patch away the cache flush safely for these
> cores.

I encountered a regression with 3.10-rc1 on the Armada 370 based boards.
With the 3.10-rc1 they hang during auto testy of the xor engine which are
mainly DMA transfers. If I revert this patch, it no more hang. I found this
by using bisect, it was not obvious at all for me that this patch may have
cause this regression.
The issue appear in SMP and in UP. However I think that  the PJ4B-v7 used in
 the Armada 370 are not MP capable.

I made some investigation. And in UP if I remove the line:
	ALT_UP(W(nop))

at the begining of the cpu_v7_dcache_clean_are macro located in
arch/arm/mm/proc-v7.S

Then the kernel boot again. It is not surprising because in this case
we found the same generated code that before this patch was applied.

now I don't really understand why a W(nop) will cause this issue.

In SMP mode even with this line removed the kernel hang, but in this case
I am not sure of what happen exactly and how the .alt.smp.init section is
used.

I don't know if it is relevant but I tested with these 2 version of gcc:
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
and
gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-1ubuntu1)

I hope you will find some explanation and solution to this bug, because currently
the only solution I have is to revert this patch.

Thanks,
Gregory
> 
> Reported-by: Albin Tonnerre <Albin.Tonnerre@arm•com>
> Signed-off-by: Will Deacon <will.deacon@arm•com>
> ---
>  arch/arm/include/asm/tlbflush.h | 2 +-
>  arch/arm/mm/proc-v6.S           | 2 --
>  arch/arm/mm/proc-v7-2level.S    | 3 ++-
>  arch/arm/mm/proc-v7-3level.S    | 3 ++-
>  arch/arm/mm/proc-v7.S           | 4 ++--
>  5 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
> index c7cdb59..42d155e 100644
> --- a/arch/arm/include/asm/tlbflush.h
> +++ b/arch/arm/include/asm/tlbflush.h
> @@ -169,7 +169,7 @@
>  # define v6wbi_always_flags	(-1UL)
>  #endif
>  
> -#define v7wbi_tlb_flags_smp	(TLB_WB | TLB_DCLEAN | TLB_BARRIER | \
> +#define v7wbi_tlb_flags_smp	(TLB_WB | TLB_BARRIER | \
>  				 TLB_V6_U_FULL | TLB_V6_U_PAGE | \
>  				 TLB_V6_U_ASID | TLB_V6_BP | \
>  				 TLB_V7_UIS_FULL | TLB_V7_UIS_PAGE | \
> diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S
> index bcaaa8d..a286d47 100644
> --- a/arch/arm/mm/proc-v6.S
> +++ b/arch/arm/mm/proc-v6.S
> @@ -80,12 +80,10 @@ ENTRY(cpu_v6_do_idle)
>  	mov	pc, lr
>  
>  ENTRY(cpu_v6_dcache_clean_area)
> -#ifndef TLB_CAN_READ_FROM_L1_CACHE
>  1:	mcr	p15, 0, r0, c7, c10, 1		@ clean D entry
>  	add	r0, r0, #D_CACHE_LINE_SIZE
>  	subs	r1, r1, #D_CACHE_LINE_SIZE
>  	bhi	1b
> -#endif
>  	mov	pc, lr
>  
>  /*
> diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
> index 78f520b..9704097 100644
> --- a/arch/arm/mm/proc-v7-2level.S
> +++ b/arch/arm/mm/proc-v7-2level.S
> @@ -110,7 +110,8 @@ ENTRY(cpu_v7_set_pte_ext)
>   ARM(	str	r3, [r0, #2048]! )
>   THUMB(	add	r0, r0, #2048 )
>   THUMB(	str	r3, [r0] )
> -	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
> +	ALT_SMP(mov	pc,lr)
> +	ALT_UP (mcr	p15, 0, r0, c7, c10, 1)		@ flush_pte
>  #endif
>  	mov	pc, lr
>  ENDPROC(cpu_v7_set_pte_ext)
> diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
> index 6ffd78c..363027e 100644
> --- a/arch/arm/mm/proc-v7-3level.S
> +++ b/arch/arm/mm/proc-v7-3level.S
> @@ -73,7 +73,8 @@ ENTRY(cpu_v7_set_pte_ext)
>  	tst	r3, #1 << (55 - 32)		@ L_PTE_DIRTY
>  	orreq	r2, #L_PTE_RDONLY
>  1:	strd	r2, r3, [r0]
> -	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
> +	ALT_SMP(mov	pc, lr)
> +	ALT_UP (mcr	p15, 0, r0, c7, c10, 1)		@ flush_pte
>  #endif
>  	mov	pc, lr
>  ENDPROC(cpu_v7_set_pte_ext)
> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 3a3c015..37716b0 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -75,14 +75,14 @@ ENTRY(cpu_v7_do_idle)
>  ENDPROC(cpu_v7_do_idle)
>  
>  ENTRY(cpu_v7_dcache_clean_area)
> -#ifndef TLB_CAN_READ_FROM_L1_CACHE
> +	ALT_SMP(mov	pc, lr)			@ MP extensions imply L1 PTW
> +	ALT_UP(W(nop))
>  	dcache_line_size r2, r3
>  1:	mcr	p15, 0, r0, c7, c10, 1		@ clean D entry
>  	add	r0, r0, r2
>  	subs	r1, r1, r2
>  	bhi	1b
>  	dsb
> -#endif
>  	mov	pc, lr
>  ENDPROC(cpu_v7_dcache_clean_area)
>  
> 


-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

  parent reply	other threads:[~2013-05-15 13:18 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-25 18:19 [PATCH 0/4] TLB and mm-related optimisations Will Deacon
2013-03-25 18:19 ` [PATCH 1/4] ARM: tlb: don't perform inner-shareable invalidation for local TLB ops Will Deacon
2013-03-27 10:34   ` Catalin Marinas
2013-03-27 12:07     ` Will Deacon
2013-03-27 12:30       ` Catalin Marinas
2013-03-27 12:56         ` Will Deacon
2013-03-27 13:40           ` Catalin Marinas
2013-03-27 13:54             ` Will Deacon
2013-03-25 18:19 ` [PATCH 2/4] ARM: tlb: don't perform inner-shareable invalidation for local BP ops Will Deacon
2013-03-27 10:36   ` Catalin Marinas
2013-03-25 18:19 ` [PATCH 3/4] ARM: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead Will Deacon
2013-03-27 10:53   ` Catalin Marinas
2013-03-27 12:20     ` Will Deacon
2013-05-15 13:18   ` Gregory CLEMENT [this message]
2013-05-15 13:41     ` Will Deacon
2013-05-15 13:54       ` Gregory CLEMENT
2013-05-15 14:06         ` Will Deacon
2013-05-15 14:46           ` Gregory CLEMENT
2013-05-15 15:04             ` Will Deacon
2013-05-15 15:36               ` Gregory CLEMENT
2013-05-15 15:41                 ` Will Deacon
2013-05-15 16:29                   ` Gregory CLEMENT
2013-05-15 16:48                     ` Will Deacon
2013-05-15 17:16                       ` Russell King - ARM Linux
2013-03-25 18:19 ` [PATCH 4/4] ARM: atomics: don't use exclusives for atomic64 read/set with LPAE Will Deacon
2013-03-27 10:57   ` Catalin Marinas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51938B3D.2070508@free-electrons.com \
    --to=gregory.clement@free-electrons$(echo .)com \
    --cc=linux-arm-kernel@lists$(echo .)infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox