public inbox for linuxppc-dev@ozlabs.org 
From: Scott Wood <scottwood@freescale•com>
To: Benjamin Herrenschmidt <benh@kernel•crashing.org>
Cc: linuxppc-dev@lists•ozlabs.org
Subject: Re: [PATCH v2 1/3] powerpc/booke64: add sync after writing PTE
Date: Thu, 10 Oct 2013 18:25:32 -0500
Message-ID: <1381447532.7979.488.camel@snotra.buserror.net>
In-Reply-To: <1381444273.7979.473.camel@snotra.buserror.net>

On Thu, 2013-10-10 at 17:31 -0500, Scott Wood wrote:
> On Mon, 2013-09-16 at 19:06 -0500, Scott Wood wrote:
> > On Mon, 2013-09-16 at 07:38 +1000, Benjamin Herrenschmidt wrote:
> > > On Fri, 2013-09-13 at 22:50 -0500, Scott Wood wrote:
> > > > The ISA says that a sync is needed to order a PTE write with a
> > > > subsequent hardware tablewalk lookup.  On e6500, without this sync
> > > > we've been observed to die with a DSI due to a PTE write not being seen
> > > > by a subsequent access, even when everything happens on the same
> > > > CPU.
> > > 
> > > This is gross, I didn't realize we had that bogosity in the
> > > architecture...
> > > 
> > > Did you measure the performance impact ?
> > 
> > I didn't see a noticeable impact on the tests I ran, but those were
> > aimed at measuring TLB miss overhead.  I'll need to try it with a
> > benchmark that's more oriented around lots of page table updates.
> 
> Lmbench's fork test runs about 2% slower with the sync.  I've been told
> that nothing relevant has changed since we saw the failure during
> emulation; it's probably luck and/or timing, or maybe a sync got added
> somewhere else since then?  I think it's only really a problem for
> kernel page tables, since user page tables will retry if do_page_fault()
> sees a valid PTE.  So maybe we should put an mb() in map_kernel_page()
> instead.

Looking at some of the code in mm/, I suspect that the normal callers of
set_pte_at() already have an unlock (and thus a sync), so we may
not even be relying on those retries.  Certainly some of them do; it
would take some effort to verify all of them.

Also, without such a sync in map_kernel_page(), even with software
tablewalk, couldn't we theoretically have a situation where a store to
pointer X that exposes a new mapping gets reordered before the PTE store
as seen by another CPU?  The other CPU could see non-NULL X and
dereference it, but get the stale PTE.  Callers of ioremap() generally
don't do a barrier of their own prior to exposing the result.
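
To make the ordering concern above concrete, here is a rough user-space
analogy in C11, not kernel code.  All names in it (fake_pte,
published_x, mapped_data, publish_and_observe) are hypothetical
stand-ins, and the acquire load on the reader side only approximates
what the consuming CPU does, so treat it as a sketch of the
publish-after-barrier pattern rather than of map_kernel_page() itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins: fake_pte models the PTE, published_x models
 * "pointer X", mapped_data models the memory behind the new mapping. */
static int fake_pte;                /* 0 = not present, 1 = valid */
static _Atomic(int *) published_x;  /* NULL until the mapping is exposed */
static int mapped_data = 42;

static void *writer(void *arg)
{
    (void)arg;
    fake_pte = 1;                   /* the "PTE store" */
    /* Plays the role of the sync/mb(): the PTE store must be visible
     * before the pointer store that exposes the mapping. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&published_x, &mapped_data, memory_order_relaxed);
    return NULL;
}

static void *reader(void *arg)
{
    int *seen_pte = arg;
    int *x;

    /* Spin until X is non-NULL, as the "other CPU" would. */
    while ((x = atomic_load_explicit(&published_x,
                                     memory_order_acquire)) == NULL)
        ;
    assert(x == &mapped_data);
    *seen_pte = fake_pte;           /* with the fence, this cannot be stale */
    return NULL;
}

/* Runs one writer/reader pair and returns the PTE value the reader saw. */
int publish_and_observe(void)
{
    pthread_t w, r;
    int seen_pte = 0;

    fake_pte = 0;
    atomic_store(&published_x, NULL);
    pthread_create(&r, NULL, reader, &seen_pte);
    pthread_create(&w, NULL, writer, NULL);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen_pte;
}
```

Without the release fence in writer(), nothing in the C11 model stops
the pointer store from becoming visible first, which is the stale-PTE
case described above.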

-Scott

Thread overview: 10+ messages
2013-09-14  3:50 [PATCH v2 1/3] powerpc/booke64: add sync after writing PTE Scott Wood
2013-09-14  3:50 ` [PATCH v2 2/3] powerpc/e6500: TLB miss handler with hardware tablewalk support Scott Wood
2013-09-14  3:50 ` [PATCH v2 3/3] powerpc/fsl-book3e-64: Use paca for hugetlb TLB1 entry selection Scott Wood
2013-09-15 21:38 ` [PATCH v2 1/3] powerpc/booke64: add sync after writing PTE Benjamin Herrenschmidt
2013-09-17  0:06   ` Scott Wood
2013-10-10 22:31     ` Scott Wood
2013-10-10 23:25       ` Scott Wood [this message]
2013-10-10 23:51         ` Benjamin Herrenschmidt
2013-10-11 22:07           ` Scott Wood
2013-10-11 22:34             ` Benjamin Herrenschmidt
