From: Alan Modra <amodra@bigpond•net.au>
To: Linas Vepstas <linas@austin•ibm.com>
Cc: linuxppc-dev@ozlabs•org
Subject: Re: Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually
Date: Sat, 23 Dec 2006 16:58:31 +1030
Message-ID: <20061223062831.GA26406@bubble.grove.modra.org>
In-Reply-To: <20061220211931.GB16860@austin.ibm.com>

On Wed, Dec 20, 2006 at 03:19:31PM -0600, Linas Vepstas wrote:
> On Tue, Dec 19, 2006 at 07:46:50PM -0600, Peter Bergner wrote:
> > On Tue, 2006-12-19 at 18:46 -0600, Linas Vepstas wrote:
> > > Per xchat, here's the update. I'm guessing I'm using a broken
> > > compiler, as per chain of evidence below ...
> > [snip]
> > > However, I also note that the following scrolled by:
> > > init/main.c:81:2: warning: #warning gcc-4.1.0 is known to miscompile the
> > > kernel. A different compiler version is recommended.
> > 
> > It may be due to this GCC bug which Olaf ran into a while back:
> > 
> >   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24644
> > 
> > You can verify whether you have a broken compiler by compiling
> > the minimal test case I posted in comment #15.  If you see r13
> > being copied into another register and then used, then you have
> > a broken compiler.
> 
> No, that's not it. I'd be surprised, as I was using the SuSE
> SLES10 gcc-4.1.0-28.4.ppc.rpm compiler, which would have that fix.

Hmm, this looks like a problem Paul forwarded on to me, originally
reported by Hugh Dickins <hugh@veritas•com>.  In his email, Hugh said:

> I spent too long looking in the wrong direction (head_64.S and entry_64.S),
> then noticed this in generic_file_aio_read from "objdump -rd mm/filemap.o":
>     3b54:	7d a5 6b 78 	mr      r5,r13
>     3b58:	38 c0 00 00 	li      r6,0
>     3b5c:	7c 09 03 a6 	mtctr   r0
>     3b60:	38 e0 00 00 	li      r7,0
>     3b64:	39 00 00 00 	li      r8,0
>     3b68:	eb a3 00 20 	ld      r29,32(r3)
>     3b6c:	48 00 00 48 	b       3bb4 <.generic_file_aio_read+0xa4>
>     3b70:	e9 49 00 08 	ld      r10,8(r9)
>     3b74:	7c e7 52 14 	add     r7,r7,r10
>     3b78:	7c e9 53 79 	or.     r9,r7,r10
>     3b7c:	41 c0 01 88 	blt-    3d04 <.generic_file_aio_read+0x1f4>
>     3b80:	e9 25 01 a0 	ld      r9,416(r5)
> 
> So, if the task is preempted and rescheduled on a different cpu in between
> the first and the last line, r5 will be looking at a different paca_struct
> from the one we're now on, and pick up the wrong __current.  (Well, there's
> a branch in the middle there, which then branches back: so the flow isn't
> quite as I've shown, but the effect is the same.)
> 
> That's compiled on SuSE 10.1, gcc 4.1.0-25 (with CONFIG_CC_OPTIMIZE_FOR_SIZE,
> but I've since checked that the same kind of thing happens without).  In most
> places it does use the expected 416(r13) for current, but occasionally via an
> intermediate register as here: why it should choose to do it that way I don't
> know, but assume it's some subtle and legitimate optimization.  It looks as
> if YDL 4.1's older gcc 3.4.4-2 does not do it that way.

I don't know whether SuSE's 4.1.0-25 has the PR24644 fix, or whether
that fix cures the mm/filemap.c problem.  I do know that a 4.1.2
20061121 compiler I happened to have lying around made copies of r13
when compiling 2.6.17 mm/filemap.c, even with local_paca made
volatile.  The following workaround allowed me to compile a kernel
without any silly r13 copies.

#define get_paca()	({__asm__ __volatile__ ("#paca %0" : "=r" (local_paca)); local_paca;})

The asm tells gcc that local_paca is changed in some unspecified way
just before each access.  Explicitly making r13 volatile like this
should avoid the fuzzy gcc semantics of volatile global register
variables.
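
To make the failure mode concrete, here is a rough user-space sketch
(not kernel code).  Everything in it is an assumption for illustration:
the cut-down struct layouts, the task and migrate_to() names, and the
plain global that stands in for r13.  It also uses "+r" with an empty
asm template rather than "=r", since an ordinary variable, unlike one
pinned to r13, would be left with an undefined value by an output-only
operand.  It only shows why a cached copy of the paca pointer picks up
the wrong __current after a migration, and why re-reading through
get_paca() at each access does not; it doesn't reproduce the gcc
behaviour itself.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-ins for the kernel's task and paca. */
struct task { int pid; };
struct paca_struct { int cpu; struct task *__current; };

static struct task task_a = { 1 }, task_b = { 2 };
static struct paca_struct pacas[2] = { { 0, &task_a }, { 1, &task_b } };

/*
 * Stand-in for r13.  In the kernel this would be a global register
 * variable pinned to r13; a plain global is used here (an assumption)
 * so the sketch builds with GCC on any target.
 */
static struct paca_struct *local_paca = &pacas[0];

/*
 * Portable analogue of the workaround: the empty asm claims to modify
 * local_paca, so the compiler must re-read it at every use.
 */
#define get_paca() \
	({ __asm__ __volatile__("" : "+r" (local_paca)); local_paca; })

/* Simulates the task being preempted and resumed on another cpu. */
void migrate_to(int cpu)
{
	local_paca = &pacas[cpu];
}

/* The bad pattern: copy the pointer once, use the copy later. */
int pid_via_stale_copy(void)
{
	struct paca_struct *copy = local_paca;	/* like "mr r5,r13" */

	migrate_to(1);				/* rescheduled on cpu 1 */
	return copy->__current->pid;		/* like "ld r9,416(r5)" */
}

/* The intended pattern: go through get_paca() at each access. */
int pid_via_get_paca(void)
{
	(void)get_paca();			/* an earlier access on cpu 0 */
	migrate_to(1);				/* rescheduled on cpu 1 */
	return get_paca()->__current->pid;	/* re-read, so cpu 1's paca */
}

/* Small self-check tying the two paths together. */
void demo(void)
{
	migrate_to(0);
	assert(pid_via_stale_copy() == 1);	/* cpu 0's task, though on cpu 1 */
	migrate_to(0);
	assert(pid_via_get_paca() == 2);	/* cpu 1's task, as expected */
}
```

Calling demo() from a main() runs both paths: the stale copy reports
the task that was current on cpu 0, while the get_paca() path reports
the one on cpu 1.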

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre
