public inbox for linuxppc-dev@ozlabs.org 
From: Gabriel Paubert <paubert@iram•es>
To: David Laight <David.Laight@ACULAB•COM>
Cc: James Yang <James.Yang@freescale•com>,
	Chris Proctor <cproctor@csc•com.au>,
	Stephen N Chivers <schivers@csc•com.au>,
	"linuxppc-dev@lists•ozlabs.org" <linuxppc-dev@lists•ozlabs.org>
Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
Date: Mon, 10 Feb 2014 14:00:59 +0100
Message-ID: <20140210130059.GA24697@visitor2.iram.es>
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D0F6BBDA6@AcuExch.aculab.com>

On Mon, Feb 10, 2014 at 12:32:18PM +0000, David Laight wrote:
> > I disagree, perhaps mostly because the compiler is not clever enough, but right
> > now the code for solution 1 is (actually I have rewritten the code
> > and it reads:
> > 
> > 	mask = (FM & 1)
> > 			| ((FM << 3) & 0x10)
> > 			| ((FM << 6) & 0x100)
> > 			| ((FM << 9) & 0x1000)
> > 			| ((FM << 12) & 0x10000)
> > 			| ((FM << 15) & 0x100000)
> > 			| ((FM << 18) & 0x1000000)
> > 			| ((FM << 21) & 0x10000000);
> > to avoid sequence point in case it hampers the compiler)
> > 
> > and the output is:
> > 
> >         rlwinm 10,3,3,27,27      # D.11621, FM,,
> >         rlwinm 9,3,6,23,23       # D.11621, FM,,
> >         or 9,10,9        #, D.11621, D.11621, D.11621
> >         rlwinm 10,3,0,31,31      # D.11621, FM,
> >         or 9,9,10        #, D.11621, D.11621, D.11621
> >         rlwinm 10,3,9,19,19      # D.11621, FM,,
> >         or 9,9,10        #, D.11621, D.11621, D.11621
> >         rlwinm 10,3,12,15,15     # D.11621, FM,,
> >         or 9,9,10        #, D.11621, D.11621, D.11621
> >         rlwinm 10,3,15,11,11     # D.11621, FM,,
> >         or 9,9,10        #, D.11621, D.11621, D.11621
> >         rlwinm 10,3,18,7,7       # D.11621, FM,,
> >         or 9,9,10        #, D.11621, D.11621, D.11621
> >         rlwinm 3,3,21,3,3        # D.11621, FM,,
> >         or 9,9,3         #, mask, D.11621, D.11621
> >         mulli 9,9,15     # mask, mask,
> > 
> > see that r9 is used 7 times as both input and output operand, plus
> > once for rlwinm. This gives a dependency length of 8 at least.
> > 
> > In the other case (I've deleted the code) the dependency length
> > was significantly shorter. In any case that one is fewer instructions,
> > which is good for occasional use.
> 
> Hmmm... I hand-counted a dependency length of 8 for the other version.
> Maybe there are some ppc instructions that reduce it.

Either I misread the generated code or my count was somewhat lower.

What helps method 1 is the rotate-and-mask instructions of PPC: each
left-shift-and-mask pair becomes a single rlwinm.
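
As a rough illustration of why each shift-and-mask pair maps to one
instruction, here is a small C model of rlwinm. The helper names (rotl32,
ppc_mask, rlwinm, check_shift_mask_matches_rlwinm) are made up for this
sketch; the SH/MB/ME operands are the ones visible in the compiler output
quoted above, and MB/ME use IBM bit numbering (bit 0 is the most
significant bit).

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (n taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
	n &= 31;
	return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Mask with IBM-numbered bits mb..me set; wraps around when mb > me. */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
	uint32_t from_mb = 0xFFFFFFFFu >> mb;        /* bits mb..31 */
	uint32_t upto_me = 0xFFFFFFFFu << (31 - me); /* bits 0..me  */

	return mb <= me ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* Model of "rlwinm ra,rs,sh,mb,me": rotate left, then AND with the mask. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
	return rotl32(rs, sh) & ppc_mask(mb, me);
}

/*
 * For every 8-bit FM, term i of the C expression,
 * (FM << 3*i) & (1 << 4*i), must equal rlwinm(FM, 3*i, 31-4*i, 31-4*i).
 */
static void check_shift_mask_matches_rlwinm(void)
{
	for (uint32_t fm = 0; fm < 256; fm++) {
		for (unsigned i = 0; i < 8; i++) {
			uint32_t c_term = (fm << (3 * i)) & (1u << (4 * i));
			unsigned bit = 31 - 4 * i;

			assert(c_term == rlwinm(fm, 3 * i, bit, bit));
		}
	}
}
```

For i = 1 this gives rlwinm with operands 3,27,27, and for i = 7 it gives
21,3,3, matching the listing.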
> 
> Stupid compiler :-)

Indeed. I've been trying to coerce it into generating rlwimi instructions
(in which case the whole building of the mask reduces to 8 assembly
instructions) and failed. It seems that the compiler lacks some patterns
that would directly map to rlwimi.
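
To make the rlwimi idea concrete, here is a C model of one plausible
sequence: a single rlwinm for bit 0 followed by seven rlwimi, each inserting
one more bit into the same register. That this is the 8-instruction
sequence meant above is an assumption, and the helper names (rotl32,
ppc_mask, rlwimi, spread_bits_rlwimi, spread_bits_c,
check_rlwimi_sequence) are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (n taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
	n &= 31;
	return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Mask with IBM-numbered bits mb..me set (bit 0 is the MSB). */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
	uint32_t from_mb = 0xFFFFFFFFu >> mb;
	uint32_t upto_me = 0xFFFFFFFFu << (31 - me);

	return mb <= me ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* Model of "rlwimi ra,rs,sh,mb,me": insert rotated bits under the mask. */
static uint32_t rlwimi(uint32_t ra, uint32_t rs,
		       unsigned sh, unsigned mb, unsigned me)
{
	uint32_t m = ppc_mask(mb, me);

	return (rotl32(rs, sh) & m) | (ra & ~m);
}

/* Hypothetical hand-written sequence: 1 rlwinm + 7 rlwimi. */
static uint32_t spread_bits_rlwimi(uint32_t fm)
{
	uint32_t r = rotl32(fm, 0) & ppc_mask(31, 31); /* rlwinm r,fm,0,31,31 */

	for (unsigned i = 1; i < 8; i++) {
		unsigned bit = 31 - 4 * i; /* IBM number of LSB-numbered bit 4*i */

		r = rlwimi(r, fm, 3 * i, bit, bit);
	}
	return r;
}

/* The C expression from the quoted message, before the multiply by 15. */
static uint32_t spread_bits_c(uint32_t FM)
{
	return (FM & 1)
		| ((FM << 3) & 0x10)
		| ((FM << 6) & 0x100)
		| ((FM << 9) & 0x1000)
		| ((FM << 12) & 0x10000)
		| ((FM << 15) & 0x100000)
		| ((FM << 18) & 0x1000000)
		| ((FM << 21) & 0x10000000);
}

/* Both versions must agree for every 8-bit FM. */
static void check_rlwimi_sequence(void)
{
	for (uint32_t fm = 0; fm < 256; fm++)
		assert(spread_bits_rlwimi(fm) == spread_bits_c(fm));
}
```

Note that this still forms a serial chain through one register, so it would
save instructions rather than shorten the dependency length.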

> Trouble is, I bet that even if you code it as:
>  	mask1 = (FM & 1) | ((FM << 3) & 0x10);
> 	mask2 = ((FM << 6) & 0x100) | ((FM << 9) & 0x1000);
> 	mask3 = ((FM << 12) & 0x10000) | ((FM << 15) & 0x100000);
> 	mask4 = ((FM << 18) & 0x1000000) | ((FM << 21) & 0x10000000);
> 	mask1 |= mask2;
> 	mask3 |= mask4;
> 	mask = mask1 | mask3;
> the compiler will 'optimise' it to the above before code generation.

Indeed, that's what it does :-(
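
For reference, a small self-check showing that the chained and the
tree-shaped groupings compute the same value, which is why the compiler is
free to reassociate one into the other. It assumes the goal is to turn each
of the 8 FM bits into a 4-bit field of the final mask (consistent with the
mulli by 15 in the listing); the function names (expand_ref, expand_chain,
expand_tree, check_expansions) are only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: nibble i of the result is 0xF iff bit i of fm is set. */
static uint32_t expand_ref(uint32_t fm)
{
	uint32_t mask = 0;

	for (unsigned i = 0; i < 8; i++)
		if (fm & (1u << i))
			mask |= 0xFu << (4 * i);
	return mask;
}

/* Single chained expression, then multiply by 15 (the mulli). */
static uint32_t expand_chain(uint32_t FM)
{
	uint32_t mask = (FM & 1)
		| ((FM << 3) & 0x10)
		| ((FM << 6) & 0x100)
		| ((FM << 9) & 0x1000)
		| ((FM << 12) & 0x10000)
		| ((FM << 15) & 0x100000)
		| ((FM << 18) & 0x1000000)
		| ((FM << 21) & 0x10000000);

	return mask * 15;
}

/* Tree-shaped grouping as proposed in the quoted text. */
static uint32_t expand_tree(uint32_t FM)
{
	uint32_t mask1 = (FM & 1) | ((FM << 3) & 0x10);
	uint32_t mask2 = ((FM << 6) & 0x100) | ((FM << 9) & 0x1000);
	uint32_t mask3 = ((FM << 12) & 0x10000) | ((FM << 15) & 0x100000);
	uint32_t mask4 = ((FM << 18) & 0x1000000) | ((FM << 21) & 0x10000000);

	mask1 |= mask2;
	mask3 |= mask4;
	return (mask1 | mask3) * 15;
}

/* All three must agree for every 8-bit FM. */
static void check_expansions(void)
{
	for (uint32_t fm = 0; fm < 256; fm++) {
		assert(expand_chain(fm) == expand_ref(fm));
		assert(expand_tree(fm) == expand_ref(fm));
	}
}
```

Since OR is associative and commutative, both groupings are equivalent at
the source level; only the instruction scheduling can differ.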

I believe that the current suggestion is good enough.

	Gabriel

