From: Jeremy Fitzhardinge <jeremy@goop•org>
To: Benjamin Herrenschmidt <benh@kernel•crashing.org>
Cc: linuxppc-dev@lists•ozlabs.org,
	Andrew Morton <akpm@linux-foundation•org>,
	Hugh Dickins <hughd@google•com>,
	Peter Zijlstra <a.p.zijlstra@chello•nl>
Subject: Re: mmotm threatens ppc preemption again
Date: Tue, 22 Mar 2011 13:34:24 +0000
Message-ID: <4D88A560.8080405@goop.org>
In-Reply-To: <1300747942.2402.262.camel@pasglop>

On 03/21/2011 10:52 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2011-03-21 at 11:24 +0000, Jeremy Fitzhardinge wrote:
>> I'm very sorry about that, I didn't realize power was also using that
>> interface.  Unfortunately, the "no preemption" definition was an error,
>> and had to be changed to match the pre-existing locking rules.
>>
>> Could you implement a similar "flush batched pte updates on context
>> switch" as x86? 
> Well, we already do that for -rt & co.
>
> However, we have another issue which is the reason we used those
> lazy_mmu hooks to do our flushing.
>
> Our PTEs eventually get faulted into a hash table which is what the real
> MMU uses. We must never (ever) allow that hash table to contain a
> duplicate entry for a given virtual address.
>
> When we do a batch, we remove things from the linux PTE, and keep a
> reference in our batch structure, and only update the hash table at the
> end of the batch.

Wouldn't implicitly ending a batch on context switch get the same effect?
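
As a rough illustration of the idea, here is a user-space model of a batch whose pending updates are flushed when the task is switched out. It is a sketch under stated assumptions, not kernel code: `struct mmu_batch`, `batch_set_pte()`, `batch_context_switch()` and the other names are hypothetical, and the real lazy_mmu hooks and the powerpc hash-table flushing are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_MAX 16

/* Hypothetical model of one deferred PTE update. */
struct pte_update {
    unsigned long *ptep;   /* PTE slot to write */
    unsigned long val;     /* value to store at flush time */
};

/* Hypothetical per-CPU batch; real kernel structures differ. */
struct mmu_batch {
    bool active;
    size_t count;
    struct pte_update pending[BATCH_MAX];
};

/* Apply every queued update and empty the queue. */
static void batch_flush(struct mmu_batch *b)
{
    for (size_t i = 0; i < b->count; i++)
        *b->pending[i].ptep = b->pending[i].val;
    b->count = 0;
}

/* Start batching; nesting is not modelled. */
static void batch_enter(struct mmu_batch *b)
{
    assert(!b->active);
    b->active = true;
}

/* Explicit end of the batch: flush, then stop batching. */
static void batch_leave(struct mmu_batch *b)
{
    batch_flush(b);
    b->active = false;
}

/* Queue the update while batching, otherwise write straight through. */
static void batch_set_pte(struct mmu_batch *b, unsigned long *ptep,
                          unsigned long val)
{
    if (!b->active) {
        *ptep = val;
        return;
    }
    if (b->count == BATCH_MAX)
        batch_flush(b);            /* queue full: flush early */
    b->pending[b->count++] = (struct pte_update){ ptep, val };
}

/*
 * Context-switch hook in this model: nothing queued by the outgoing
 * task survives the switch. Batch mode itself stays on so the task
 * can keep queueing once it runs again (an assumption of the sketch).
 */
static void batch_context_switch(struct mmu_batch *b)
{
    if (b->active)
        batch_flush(b);
}
```

In this model a write made through `batch_set_pte()` inside a batch only becomes visible at `batch_leave()` or at `batch_context_switch()`, whichever comes first, so no other task ever observes a half-applied batch.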

> That means that we must not allow a hash fault to populate the hash with
> a "new" PTE value prior to the old one having been flushed out (which is
> possible if they differ in protection attributes, for example). For
> that to happen, we must basically not allow a page fault to re-populate
> a PTE invalidated by a batch before that batch has completed.

Kernel ptes are not generally populated on fault though, unless there's
something in power?  On x86 it can happen when syncing a process's
kernel pmd with the init_mm one, but that shouldn't happen in the middle
of an update since you'd deadlock anyway.  If a particular kernel
subsystem has its own locks to manage the ptes for a kernel mapping,
then that should prevent any nested updates within a batch, shouldn't it?

> That translates to batches must only happen within a PTE lock section.

Well, in that case, I guess your best bet is to disable batching for
kernel pagetable updates.  These apply_to_page_range() changes are the
first time any attempt to batch kernel pagetable updates has been made
(otherwise you would have seen this problem earlier), so not batching
them will not be a regression for you.
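
A minimal sketch of "don't batch kernel pagetable updates", again as a user-space model rather than the actual apply_to_page_range() code. The names `struct mm_model`, `struct lazy_state`, `apply_to_ptes()` and `set_pte_value()` are hypothetical, and the `is_kernel` flag stands in for whatever test an architecture would use to recognise the kernel's page tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an mm; is_kernel models "this is init_mm". */
struct mm_model {
    bool is_kernel;
};

/* Hypothetical lazy-mode bookkeeping: whether batching is on, and how often it was entered. */
struct lazy_state {
    bool active;
    unsigned int enters;
};

typedef void (*pte_fn)(unsigned long *ptep, void *data);

static void lazy_enter(struct lazy_state *s)
{
    s->active = true;
    s->enters++;
}

static void lazy_leave(struct lazy_state *s)
{
    s->active = false;
}

/* Example callback: store the value pointed to by data into the PTE. */
static void set_pte_value(unsigned long *ptep, void *data)
{
    *ptep = *(unsigned long *)data;
}

/*
 * Walk n PTEs and call fn on each. Lazy (batched) mode is entered only
 * for user mms; kernel mappings are updated unbatched, because no pte
 * lock is held around them.
 */
static void apply_to_ptes(struct mm_model *mm, struct lazy_state *lazy,
                          unsigned long *ptes, size_t n,
                          pte_fn fn, void *data)
{
    bool batch = !mm->is_kernel;

    assert(fn != NULL);

    if (batch)
        lazy_enter(lazy);
    for (size_t i = 0; i < n; i++)
        fn(&ptes[i], data);
    if (batch)
        lazy_leave(lazy);
}
```

The only design point the sketch makes is that the decision is taken once per walk, based on which mm is being modified, so the per-PTE callback does not need to know whether batching is in effect.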

I'm not sure what the proper fix to get batching in your case will be,
but the assumption that there's a pte lock for kernel ptes is not
valid.

    J
