From: Gabriel Paubert <paubert@iram•es>
To: Benjamin Herrenschmidt <benh@kernel•crashing.org>
Cc: Tejun Heo <htejun@gmail•com>,
	linuxppc-dev@lists•ozlabs.org, Christoph Lameter <cl@gentwo•org>
Subject: Re: power and percpu: Could we move the paca into the percpu area?
Date: Wed, 11 Jun 2014 23:03:51 +0200
Message-ID: <20140611210351.GA7155@visitor2.iram.es>
In-Reply-To: <1402518131.14780.60.camel@pasglop>

On Thu, Jun 12, 2014 at 06:22:11AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2014-06-11 at 14:37 -0500, Christoph Lameter wrote:
> > Looking at arch/powerpc/include/asm/percpu.h I see that the per cpu offset
> > comes from a local_paca field and local_paca is in r13. That means that
> > for all percpu operations we first have to determine the address through a
> > memory access.
> > 
> > Would it be possible to put the paca at the beginning of the percpu data
> > area and then have r13 point to the percpu area?
> > 
> > power has these nice instructions that fetch from an offset relative to a
> > base register which could be used throughout for percpu operations in the
> > kernel (similar to x86 segment registers).
> > 
> > With that we may also be able to use the atomic ops for fast percpu access
> > so that we can avoid the irq enable/disable sequence that is now required
> > for percpu atomics. Would result in fast and reliable percpu
> > counters for powerpc.
> 
> So.... this is complicated :) And it's something I did want to tackle
> for a while but haven't had a chance.
> 
> The issues off the top of my head are:
> 
>  - The PACA must be accessible in real mode, which means that when
> running under a hypervisor, it must be allocated in the "RMA" which is
> the low part of memory up to a limit that depends on the hypervisor, but
> can be as low as 128M on some older machines.
> 
>  - However, we use percpu more than paca in normal kernel C code, the
> PACA is mostly used during exception entry/exit, KVM, and for interrupt
> soft-enable/disable. So it might make sense to change things so that r13
> contains the per-cpu offset instead. However, doing that change and
> updating the asm to cope isn't a trivial undertaking.
> 
>  - Direct offset from r13 in asm ... works as long as the offset is
> within the signed 32k range. Otherwise we need at least one more addis
> instruction. Anton mentioned the linker may have some smarts however for
> removing that addis if the high part of the offset happens to be 0.
> 
>  - For atomics, the jury is still out as to whether it would be faster
> or not. The atomic ops (lwarx/stwcx.) are expensive. They flush the
> value out of the L1 (to L2) among others. On the other hand we have
> interrupts soft-disable so masking interrupts isn't very expensive.
> Unmasking, while cheap, is currently out of line however. I have been
> wondering if we could move some of the soft-irq state instead to a CR
> field and mark that -ffixed with gcc so we can make irq
> soft-disable/enable even faster and more in-line.

Actually, from gcc/config/rs6000/rs6000.h:

/* 1 for registers that have pervasive standard uses
   and are not available for the register allocator.

   On RS/6000, r1 is used for the stack.  On Darwin, r2 is available
   as a local register; for all other OS's r2 is the TOC pointer.

   cr5 is not supposed to be used.

   On System V implementations, r13 is fixed and not available for use.  */

#define FIXED_REGISTERS  \
  {0, 1, FIXED_R2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, FIXED_R13, 0, 0, \
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
   0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1,          \
   /* AltiVec registers.  */                       \
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
   1, 1                                            \
   , 1, 1, 1, 1, 1, 1                              \
}

So cr5, which is register number 73 in this table, is already marked
fixed and is never allocated by gcc.
Disassembling a few kernels seems to confirm this.
This gives you 4 booleans (the four bits of the cr5 field) to play with...
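
To make the counting above easy to check, here is a minimal sketch in plain C.
It copies the first 77 entries of the table, looks up the cr5 slot, and models
the four cr5 bits as flags inside a 32-bit image of the condition register.
Several things in it are assumptions for illustration and are not taken from
GCC or the kernel: FIXED_R2 and FIXED_R13 are both set to 1 (a System V style
target), cr0 is assumed to sit at entry 68 (which is what puts cr5 at 73), and
the helper names are hypothetical. It is portable C that only models the bit
layout; it does not touch the real CR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: in GCC these depend on the target ABI. */
#define FIXED_R2  1
#define FIXED_R13 1

/* Assumption: entries 0-31 are GPRs, 32-63 are FPRs, 64-67 are four
   special registers, and cr0..cr7 occupy entries 68-75. */
enum { CR0_INDEX = 68 };

/* First 77 entries of the FIXED_REGISTERS table quoted above. */
static const unsigned char fixed_regs[77] = {
    0, 1, FIXED_R2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, FIXED_R13, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1
};

/* Table index of condition register field crN (hypothetical helper). */
static int cr_field_index(int field)
{
    assert(field >= 0 && field < 8);
    return CR0_INDEX + field;
}

/* True if crN is marked fixed, i.e. withheld from the register allocator. */
static bool cr_field_is_fixed(int field)
{
    return fixed_regs[cr_field_index(field)] != 0;
}

/* Mask of one bit of crN in a 32-bit CR image.  CR bits are numbered
   0..31 from the most significant end, so crN bit b is CR bit 4*N + b. */
static uint32_t cr_bit_mask(int field, int bit)
{
    assert(field >= 0 && field < 8 && bit >= 0 && bit < 4);
    return UINT32_C(1) << (31 - (4 * field + bit));
}

/* Set, clear and test one flag in the CR image. */
static uint32_t cr_flag_set(uint32_t cr, int field, int bit)
{
    return cr | cr_bit_mask(field, bit);
}

static uint32_t cr_flag_clear(uint32_t cr, int field, int bit)
{
    return cr & ~cr_bit_mask(field, bit);
}

static bool cr_flag_test(uint32_t cr, int field, int bit)
{
    return (cr & cr_bit_mask(field, bit)) != 0;
}
```

With these helpers, cr_field_index(5) is 73 and cr_field_is_fixed(5) is true,
while cr0 and cr7 are not fixed. The four cr5 flags are CR bits 20 to 23, so
together they cover mask 0x00000f00 of the image.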

	Gabriel
