From: Gabriel Paubert <paubert@iram•es>
To: David Laight <David.Laight@ACULAB•COM>
Cc: James Yang <James.Yang@freescale•com>,
Chris Proctor <cproctor@csc•com.au>,
Stephen N Chivers <schivers@csc•com.au>,
"linuxppc-dev@lists•ozlabs.org" <linuxppc-dev@lists•ozlabs.org>
Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
Date: Mon, 10 Feb 2014 14:00:59 +0100
Message-ID: <20140210130059.GA24697@visitor2.iram.es>
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D0F6BBDA6@AcuExch.aculab.com>
On Mon, Feb 10, 2014 at 12:32:18PM +0000, David Laight wrote:
> > I disagree, perhaps mostly because the compiler is not clever enough, but right
> > now the code for solution 1 is (actually I have rewritten the code
> > and it reads:
> >
> > mask = (FM & 1)
> > | ((FM << 3) & 0x10)
> > | ((FM << 6) & 0x100)
> > | ((FM << 9) & 0x1000)
> > | ((FM << 12) & 0x10000)
> > | ((FM << 15) & 0x100000)
> > | ((FM << 18) & 0x1000000)
> > | ((FM << 21) & 0x10000000);
> > to avoid sequence point in case it hampers the compiler)
> >
> > and the output is:
> >
> > rlwinm 10,3,3,27,27 # D.11621, FM,,
> > rlwinm 9,3,6,23,23 # D.11621, FM,,
> > or 9,10,9 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,0,31,31 # D.11621, FM,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,9,19,19 # D.11621, FM,,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,12,15,15 # D.11621, FM,,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,15,11,11 # D.11621, FM,,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,18,7,7 # D.11621, FM,,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 3,3,21,3,3 # D.11621, FM,,
> > or 9,9,3 #, mask, D.11621, D.11621
> > mulli 9,9,15 # mask, mask,
> >
> > see that r9 is used 7 times as both input and output operand, plus
> > once for rlwinm. This gives a dependency length of 8 at least.
> >
> > In the other case (I've deleted the code) the dependency length
> > was significantly shorter. In any case that one is fewer instructions,
> > which is good for occasional use.
>
> Hmmm... I hand-counted a dependency length of 8 for the other version.
> Maybe there are some ppc instructions that reduce it.
Either I misread the generated code or I got somewhat less.
What helps for method1 is the rotate-and-mask instructions of PPC: each
left shift plus mask pair becomes a single rlwinm.
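
As an illustration of why each shift-and-mask pair fits in one rlwinm, here is a small C model of the instruction. It is a sketch only: the helper names (rotl32, ppc_mask, rlwinm_model) are made up for this example, and the bit numbering follows the PowerPC convention where bit 0 is the most significant bit.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by sh bits (sh is taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned sh)
{
    sh &= 31;
    return sh ? (x << sh) | (x >> (32 - sh)) : x;
}

/*
 * Build the rlwinm-style mask with ones from bit mb through bit me,
 * using PowerPC numbering (bit 0 = MSB, bit 31 = LSB).
 * When mb > me the mask wraps around the word.
 */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;        /* ones in bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31 - me); /* ones in bits 0..me  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Hypothetical model of "rlwinm ra,rs,sh,mb,me": rotate, then AND with mask. */
static uint32_t rlwinm_model(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & ppc_mask(mb, me);
}
```

With this model, rlwinm_model(FM, 3, 27, 27) gives the same value as (FM << 3) & 0x10, which matches the first rlwinm in the quoted listing. The rotate behaves like a shift here because the bits that wrap around land below the masked position.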
>
> Stupid compiler :-)
Indeed. I've been trying to coerce it into generating rlwimi instructions
(in which case the whole building of the mask reduces to 8 assembly
instructions) and failed. It seems that the compiler lacks some patterns
that would directly map to rlwimi.
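
To make the rlwimi idea concrete, the following C sketch models the instruction (rotate, then insert the masked bits into the destination) and builds the spread mask with one rlwinm-like step followed by seven inserts. This is one possible reading of the "8 instructions" remark, not compiler output; all function names are hypothetical.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by sh bits (sh is taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned sh)
{
    sh &= 31;
    return sh ? (x << sh) | (x >> (32 - sh)) : x;
}

/* Mask with ones in bits mb..me, PowerPC numbering (bit 0 = MSB). */
static uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;
    uint32_t to_me   = 0xFFFFFFFFu << (31 - me);
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/*
 * Hypothetical model of "rlwimi ra,rs,sh,mb,me": the rotated source bits
 * under the mask replace the corresponding bits of ra; other bits are kept.
 */
static uint32_t rlwimi_model(uint32_t ra, uint32_t rs,
                             unsigned sh, unsigned mb, unsigned me)
{
    uint32_t m = ppc_mask(mb, me);
    return (rotl32(rs, sh) & m) | (ra & ~m);
}

/* Spread bit i of fm to bit 4*i using one mask step and seven inserts. */
static uint32_t spread_with_inserts(uint32_t fm)
{
    uint32_t mask = fm & 1;                    /* like rlwinm mask,fm,0,31,31 */
    mask = rlwimi_model(mask, fm,  3, 27, 27); /* fm bit 1 -> bit 4  */
    mask = rlwimi_model(mask, fm,  6, 23, 23); /* fm bit 2 -> bit 8  */
    mask = rlwimi_model(mask, fm,  9, 19, 19); /* fm bit 3 -> bit 12 */
    mask = rlwimi_model(mask, fm, 12, 15, 15); /* fm bit 4 -> bit 16 */
    mask = rlwimi_model(mask, fm, 15, 11, 11); /* fm bit 5 -> bit 20 */
    mask = rlwimi_model(mask, fm, 18,  7,  7); /* fm bit 6 -> bit 24 */
    mask = rlwimi_model(mask, fm, 21,  3,  3); /* fm bit 7 -> bit 28 */
    return mask;
}
```

Each insert reads and writes the same destination, so this form saves instructions rather than shortening the dependency chain.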
> Trouble is, I bet that even if you code it as:
> mask1 = (FM & 1) | ((FM << 3) & 0x10);
> mask2 = ((FM << 6) & 0x100) | ((FM << 9) & 0x1000);
> mask3 = ((FM << 12) & 0x10000) | ((FM << 15) & 0x100000);
> mask4 = ((FM << 18) & 0x1000000) | ((FM << 21) & 0x10000000);
> mask1 |= mask2;
> mask3 |= mask4;
> mask = mask1 | mask3;
> the compiler will 'optimise' it to the above before code generation.
Indeed, that's what it does :-(
I believe that the current suggestion is good enough.
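
For reference, the C expression quoted at the top of this message, together with the final multiply by 15 seen in the listing (mulli), can be wrapped in a function and checked against a plain loop. This is only a sketch: the function names are invented here, and it assumes FM is the 8-bit field held in an unsigned 32-bit value.

```c
#include <stdint.h>

/* Spread bit i of the 8-bit FM value to bit 4*i (expression from the thread). */
static uint32_t spread_fm(uint32_t FM)
{
    return (FM & 1)
         | ((FM << 3)  & 0x10)
         | ((FM << 6)  & 0x100)
         | ((FM << 9)  & 0x1000)
         | ((FM << 12) & 0x10000)
         | ((FM << 15) & 0x100000)
         | ((FM << 18) & 0x1000000)
         | ((FM << 21) & 0x10000000);
}

/* Multiplying by 15 (0xF) widens every selected bit into a full 4-bit group. */
static uint32_t fm_to_nibble_mask(uint32_t FM)
{
    return spread_fm(FM) * 15;
}

/* Straightforward loop version, used only as a cross-check. */
static uint32_t fm_to_nibble_mask_ref(uint32_t FM)
{
    uint32_t mask = 0;
    for (unsigned i = 0; i < 8; i++)
        if (FM & (1u << i))
            mask |= 0xFu << (4 * i);
    return mask;
}
```

The multiply cannot overflow for 8-bit inputs, since the largest spread value is 0x11111111 and 0x11111111 * 15 is 0xFFFFFFFF.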
Gabriel