From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from empbedexfe.empirix.com (empbedexfe.empirix.com [12.38.203.54]) by ozlabs.org (Postfix) with ESMTP id E8CBFDDDA9 for ; Tue, 5 May 2009 08:36:43 +1000 (EST)
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C9CD07.40B7D272"
Subject: MSR_SPE - being turned off...
Date: Mon, 4 May 2009 18:25:38 -0400
Message-ID:
From: "Morrison, Tom"
To:
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,

This is a multi-part message in MIME format.

------_=_NextPart_001_01C9CD07.40B7D272
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

I have both a MPC8548 SBC and MPC8572 system that are running different flavors of the
same Linux - 2.6.23.

I explicitly am turning it on very early on. Later, I have an application that is compiled
with SPE instructions (e.g.: evstdd), and there is where the problems happen. If I explicitly
make sure there are NO SPE instructions in the application, nothing bad happens!

I am polling the MSR - and it seems the SPE is turned OFF?

What have I done wrong and/or has there been fixes in later kernels that I should be aware of that might help this issue?

Tom Morrison
Principal Software Engineer
EMPIRIX
20 Crosby Drive - Bedford, MA 01730
p: 781.266.3567 f: 781.266.3670
email: tmorrison@empirix.com
www.empirix.com

------_=_NextPart_001_01C9CD07.40B7D272
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01C9CD07.40B7D272--

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from bilbo.ozlabs.org (bilbo.ozlabs.org [203.10.76.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "bilbo.ozlabs.org", Issuer "CAcert Class 3 Root" (verified OK)) by ozlabs.org (Postfix) with ESMTPS id 9A817DDDA5 for ; Tue, 5 May 2009 14:17:43 +1000 (EST)
From: Michael Neuling
To: "Morrison, Tom"
Subject: Re: MSR_SPE - being turned off...
In-reply-to:
References:
Date: Tue, 05 May 2009 14:17:42 +1000
Message-ID: <23583.1241497062@neuling.org>
Cc: linuxppc-dev@ozlabs.org

> I have both a MPC8548 SBC and MPC8572 system that are running different
> flavors of the same Linux - 2.6.23.
>
> I explicitly am turning it on very early on.

Where are you turning this on? In the application?

> Later, I have an application that is compiled with SPE instructions
> (e.g.: evstdd) , and there is where the problems happen.

What is the problem?

> If I explicitly make sure there are NO SPE instructions in the
> application, nothing bad happens!
>
> I am polling the MSR - and it seems the SPE is turned OFF?

You are polling MSR in your application?

Anyway, this is expected. We do lazy restore of these registers after a
context switch. So the SPE bit in the MSR may not be set if no SPE
instructions have run since the last context switch.

> What have I done wrong and/or has there been fixes in later kernels that
> I should be aware of that might help this issue?

I'm not clear what the problem is. You've just said "problems happen"
when you include SPE instructions. Are you getting a SIGILL or some
other signal? Is your program terminating? Can you get a GDB back-trace
or dump of which instruction is causing the "problem"?
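The lazy restore Michael describes can be sketched as a toy model. This is not kernel code: the `Task` and `Cpu` classes, their fields, and the single-core ownership scheme are hypothetical, and the bit position used for `MSR_SPE` is only illustrative. The point is that a task which polls its MSR right after being switched in can see the SPE bit clear, and that the bit comes back on the task's next SPE access.

```python
MSR_SPE = 1 << 25  # SPE-enable bit in the toy MSR; the position is illustrative


class Task:
    """Hypothetical user task: its saved MSR and its saved SPE state."""

    def __init__(self, name):
        self.name = name
        self.msr = 0           # MSR the task runs with; MSR_SPE starts clear
        self.saved_spe = None  # SPE state parked here while another task owns the unit


class Cpu:
    """Toy single-core model of lazy SPE switching (not kernel code)."""

    def __init__(self):
        self.current = None
        self.spe_owner = None  # task whose SPE state is live in the registers
        self.spe_regs = None   # stand-in for the live SPE register contents
        self.traps = 0         # number of "SPE unavailable" exceptions taken

    def context_switch(self, task):
        # Lazy scheme: the switch itself moves no SPE state.
        self.current = task

    def _spe_unavailable(self):
        # Trap path: park the previous owner's state, then load ours.
        self.traps += 1
        old = self.spe_owner
        if old is not None:
            old.saved_spe = self.spe_regs
            old.msr &= ~MSR_SPE
        self.spe_regs = self.current.saved_spe
        self.spe_owner = self.current
        self.current.msr |= MSR_SPE

    def _ensure_spe(self):
        # An SPE instruction traps when the current task has MSR_SPE clear.
        if not self.current.msr & MSR_SPE:
            self._spe_unavailable()

    def spe_write(self, value):
        self._ensure_spe()
        self.spe_regs = value

    def spe_read(self):
        self._ensure_spe()
        return self.spe_regs
```

In this model, `task.msr & MSR_SPE` is zero for any task that does not currently own the SPE state, even if that task used SPE earlier. Its saved state is reloaded only when `spe_read` or `spe_write` takes the trap.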
Mikey

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 89A27DDD1B for ; Tue, 5 May 2009 21:07:52 +1000 (EST)
Message-Id:
From: Kumar Gala
To: "Morrison, Tom"
In-Reply-To:
Content-Type: text/plain; charset=WINDOWS-1252; format=flowed; delsp=yes
Mime-Version: 1.0 (Apple Message framework v930.3)
Subject: Re: MSR_SPE - being turned off...
Date: Tue, 5 May 2009 06:07:40 -0500
References:
Cc: linuxppc-dev@ozlabs.org

On May 4, 2009, at 5:25 PM, Morrison, Tom wrote:

> I have both a MPC8548 SBC and MPC8572 system that are running
> different flavors of the
> same Linux – 2.6.23.
>
> I explicitly am turning it on very early on. Later, I have an
> application that is compiled
> with SPE instructions (e.g.: evstdd) , and there is where the
> problems happen. If I explicitly
> make sure there are NO SPE instructions in the application, nothing
> bad happens!
>
> I am polling the MSR – and it seems the SPE is turned OFF?
>
> What have I done wrong and/or has there been fixes in later kernels
> that I should be aware of that might help this issue?

Can you explain what you mean by "explicitly am turning it on very
early on"?

I can't think of anything that has changed w/regards to SPE handling.

- k

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from empbedexfe.empirix.com (empbedexfe.empirix.com [12.38.203.54]) by ozlabs.org (Postfix) with ESMTP id 01379DDDA2 for ; Tue, 5 May 2009 22:55:35 +1000 (EST)
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Subject: RE: MSR_SPE - being turned off...
Date: Tue, 5 May 2009 08:56:33 -0400
Message-ID:
In-Reply-To:
References:
From: "Morrison, Tom"
To: "Kumar Gala"
Cc: linuxppc-dev@ozlabs.org, Michael Neuling

Hi Kumar/Michael...

Sorry, I really didn't explain myself very well...

The Problem (answer to Michael):
================================
We started using a new compiler that upon -O2 optimization - added
heavy SPE related instructions into our applications (where the older
compiler might not use as many). Once this was done, we started
experiencing problems with data being 'shifted' and/or corrupted
throughout the applications which didn't immediately cause problems,
but either scribbled on someone else's memory and/or bad results...
We knew where one of the offending scribbles started (by the shifting
by 1 byte of a structure) and found by comparing binaries with 'older'
compiler vs. this one that the only major difference was the 'density'
of the SPE instructions...

As to your question, Kumar:
===========================
Naively, I explicitly enabled the SPE in a BSP 'early_init' program
(as well as enabling Machine Checks) - which is what I meant by
Enabling SPE...

Michael explained that it is 'normal' if we asynchronously polled
the MSR (in an application and/or in the kernel) that it might be
disabled at the moment, but that you do a 'lazy switch' that
enables it...and gets turned on when an SPE exception comes in...

...ok...I can live with that...

-------where I was really going---------

This is where I was trying to go. A developer at our company (who no
longer works for us) - did some research/development on the SPE
functionality, in the hopes that we could create an optimized library.
The results were successful, but because of some of the restrictions
(including 8 byte alignment for some instructions) - we decided not
to incorporate this library into our application(s)

But, this developer in his results, indicated that he believed our
kernels were NOT properly saving/restoring the upper 32bits of the
GPR (which can/will be used in the SPE instructions)... Thus, if the
upper 32bits were not saved (and restored when the application got
the SPE to operate on)...then, he thought there would be problems.
He unfortunately, was unable to finish his work and fix these 'bugs'
before he left our company...

Again, I am only going on his results, and not my own investigations
(I am not sure where to start to find this problem to begin with)...

So, I was REALLY asking - has anybody else run into this type of
problem, and/or the Linux community has recognized this problem and
has fixed this?

------

I hope I am a little clearer in the history / and outline of the
problem I am trying to solve this time?

Thanks in advance!

Tom Morrison

>> -----Original Message-----
>> From: Kumar Gala [mailto:galak@kernel.crashing.org]
>> Sent: Tuesday, May 05, 2009 7:08 AM
>> To: Morrison, Tom
>> Cc: linuxppc-dev@ozlabs.org
>> Subject: Re: MSR_SPE - being turned off...
>>
>>
>> On May 4, 2009, at 5:25 PM, Morrison, Tom wrote:
>>
>> > I have both a MPC8548 SBC and MPC8572 system that are running
>> > different flavors of the
>> > same Linux - 2.6.23.
>> >
>> > I explicitly am turning it on very early on. Later, I have an
>> > application that is compiled
>> > with SPE instructions (e.g.: evstdd) , and there is where the
>> > problems happen. If I explicitly
>> > make sure there are NO SPE instructions in the application, nothing
>> > bad happens!
>> >
>> > I am polling the MSR - and it seems the SPE is turned OFF?
>> >
>> > What have I done wrong and/or has there been fixes in later kernels
>> > that I should be aware of that might help this issue?
>>=20 >> Can you explain what you mean by explicitly am turning it on very >> early on. >>=20 >> I can't think of anything that has changed w/regards to SPE handling. >>=20 >> - k From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 9C02CDDD1C for ; Wed, 6 May 2009 07:18:13 +1000 (EST) Message-Id: <756DA1CE-4951-4087-9F1B-FE83A53BB253@kernel.crashing.org> From: Kumar Gala To: "Morrison, Tom" In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Mime-Version: 1.0 (Apple Message framework v930.3) Subject: Re: MSR_SPE - being turned off... Date: Tue, 5 May 2009 16:18:00 -0500 References: Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On May 5, 2009, at 7:56 AM, Morrison, Tom wrote: > Hi Kumar/Michael... > > Sorry, I really didn't explain myself very well... > > The Problem (answer to Michael): > ================================ > We started using a new compiler that upon -O2 optimization - added > heavy SPE related instructions into our applications (where the older > compiler might not use as many). Once this was done, we started > experiencing problems with data being 'shifted' and/or corrupted > throughout the applications which didn't immediately cause problems, > but either scribbled on someone else's memory and/or bad results... > We knew where one of the offending scribbles started (by the shifting > by 1 byte of a structure) and found by comparing binaries with 'older' > compiler vs. this one that the only major difference was the 'density' > of the SPE instructions... 
> > As to your question, Kumar: > =========================== > Naively, I explicitly enabled the SPE in a BSP 'early_init' program > (as well as enabling Machine Checks) - which is what I meant by > Enabling SPE... Are you setting MSR_SPE in your own board code? If so stop doing so. There isn't any need or reason to be doing that. MSR_SPE will get set when an application starts using SPE code and the kernel will manage it properly. - k From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from bilbo.ozlabs.org (bilbo.ozlabs.org [203.10.76.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "bilbo.ozlabs.org", Issuer "CAcert Class 3 Root" (verified OK)) by ozlabs.org (Postfix) with ESMTPS id B497CDDDA1 for ; Wed, 6 May 2009 10:01:39 +1000 (EST) From: Michael Neuling To: "Morrison, Tom" Subject: Re: MSR_SPE - being turned off... In-reply-to: References: Date: Wed, 06 May 2009 10:01:39 +1000 Message-ID: <13221.1241568099@neuling.org> Cc: linuxppc-dev@ozlabs.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , > Hi Kumar/Michael... > > Sorry, I really didn't explain myself very well... > > The Problem (answer to Michael): > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= > =3D=3D=3D=3D=3D=3D=3D > We started using a new compiler that upon -O2 optimization - added > heavy SPE related instructions into our applications (where the older > compiler might not use as many). Once this was done, we started=20 > experiencing problems with data being 'shifted' and/or corrupted=20 > throughout the applications which didn't immediately cause problems, > but either scribbled on someone else's memory and/or bad results... > We knew where one of the offending scribbles started (by the shifting=20 > by 1 byte of a structure) and found by comparing binaries with 'older' > compiler vs. 
this one that the only major difference was the 'density'=20 > of the SPE instructions... > > As to your question, Kumar:=20 > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= > =3D=3D > Naively, I explicitly enabled the SPE in a BSP 'early_init' program=20 > (as well as enabling Machine Checks) - which is what I meant by > Enabling SPE... Yeah, you don't want to do this. It'll potentially break your application. I'm not that familiar with the CPU you are using but I'm guessing that you can't write the MSR from user space anyway. > Michael explained that it is 'normal' if we asynchronously polled > the MSR (in an application and/or in the kernel) that it might be > disabled at the moment, but that you do a 'lazy switch' that=20 > enables it...and gets turned on when an SPE exception comes in... > > ...ok...I can live with that... > > -------where I was really going--------- > > This is where I was trying to go. A developer at our company (who no > longer works for us) - did some research/development on the SPE=20 > functionality, in the hopes that we could create an optimized library. > The results were successful, but because of some of the restrictions=20 > (including 8 byte alignment for some instructions) - we decided not > to incorporate this library into our application(s) > > But, this developer in his results, indicated that he believed our > kernels were NOT properly saving/restoring the upper 32bits of the > GPR (which can/will be used in the SPE instructions)... Thus, if the > upper 32bits were not saved (and restored when the application got > the SPE to operate on)...then, he thought there would be problems. > He unfortunately, was unable to finish his work and fix these 'bugs' > before he left our company... > > Again, I am only going on his results, and not my own investigations > (I am not sure where to start to find this problem to begin with)... 
> > So, I was REALLY asking - has anybody else run into this type of > problem, and/or the Linux community has recognized this problem and > has fixed this? If GPRs where getting corrupted in userspace, that would be a serious bug and would be noticed by someone pretty quickly. We'd really need a test case to get anywhere with this report. Mikey From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from empbedexfe.empirix.com (empbedexfe.empirix.com [12.38.203.54]) by ozlabs.org (Postfix) with ESMTP id 493EEDDDA1 for ; Wed, 6 May 2009 10:06:24 +1000 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: MSR_SPE - being turned off... Date: Tue, 5 May 2009 20:07:20 -0400 Message-ID: In-Reply-To: <756DA1CE-4951-4087-9F1B-FE83A53BB253@kernel.crashing.org> References: <756DA1CE-4951-4087-9F1B-FE83A53BB253@kernel.crashing.org> From: "Morrison, Tom" To: "Kumar Gala" Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Ok...taken out... >> -----Original Message----- >> From: Kumar Gala [mailto:galak@kernel.crashing.org] >> Sent: Tuesday, May 05, 2009 5:18 PM >> To: Morrison, Tom >> Cc: linuxppc-dev@ozlabs.org; Michael Neuling >> Subject: Re: MSR_SPE - being turned off... >>=20 >>=20 >> On May 5, 2009, at 7:56 AM, Morrison, Tom wrote: >>=20 >> > Hi Kumar/Michael... >> > >> > Sorry, I really didn't explain myself very well... >> > >> > The Problem (answer to Michael): >> > = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D >> > We started using a new compiler that upon -O2 optimization - added >> > heavy SPE related instructions into our applications (where the older >> > compiler might not use as many). 
Once this was done, we started >> > experiencing problems with data being 'shifted' and/or corrupted >> > throughout the applications which didn't immediately cause problems, >> > but either scribbled on someone else's memory and/or bad results... >> > We knew where one of the offending scribbles started (by the shifting >> > by 1 byte of a structure) and found by comparing binaries with 'older' >> > compiler vs. this one that the only major difference was the 'density' >> > of the SPE instructions... >> > >> > As to your question, Kumar: >> > = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D >> > Naively, I explicitly enabled the SPE in a BSP 'early_init' program >> > (as well as enabling Machine Checks) - which is what I meant by >> > Enabling SPE... >>=20 >> Are you setting MSR_SPE in your own board code? If so stop doing so. >> There isn't any need or reason to be doing that. MSR_SPE will get set >> when an application starts using SPE code and the kernel will manage >> it properly. >>=20 >> - k >>=20 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from empbedexfe.empirix.com (empbedexfe.empirix.com [12.38.203.54]) by ozlabs.org (Postfix) with ESMTP id 86106DDDF9 for ; Wed, 6 May 2009 10:41:21 +1000 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: MSR_SPE - being turned off... Date: Tue, 5 May 2009 20:42:15 -0400 Message-ID: In-Reply-To: <13221.1241568099@neuling.org> References: <13221.1241568099@neuling.org> From: "Morrison, Tom" To: "Michael Neuling" Cc: linuxppc-dev@ozlabs.org List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , The test case we found is under 'extreme' duress=20 (intense loading on an MPC8572)...with many applications.... using A LOT of SPE instructions... 
---- If you look at the context switch code (in latest code entry_32.S),=20 I believe the context switch performs a SAVE_NVGPR() - which in our=20 interpretation (in ppc_asm.h) - only saves the lower 32 bits of=20 the GPR (stw/lwz)... This is only a guess of where the problem lies - based upon the single SPE instruction that seemingly got misinterpreted, and shifts the data By '1 byte' (and this code gets executed successfully MANY more times=20 at lower bandwidths - than failures seen at higher bandwidths)... ---- I am not sure how to proceed...we know how to recreate with our=20 application, but we would love to know how to change (safely)=20 the pt_regs to "long long" for the GPRs and then safely move all 64bits of each GPR into these doubles... We could then re-test and see if this helps? Tom >> -----Original Message----- >> From: Michael Neuling [mailto:mikey@neuling.org] >> Sent: Tuesday, May 05, 2009 8:02 PM >> To: Morrison, Tom >> Cc: Kumar Gala; linuxppc-dev@ozlabs.org >> Subject: Re: MSR_SPE - being turned off... >>=20 >> > Hi Kumar/Michael... >> > >> > Sorry, I really didn't explain myself very well... >> > >> > The Problem (answer to Michael): >> > >> =3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D= 3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D =3D >> 3D=3D >> > =3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D >> > We started using a new compiler that upon -O2 optimization - added >> > heavy SPE related instructions into our applications (where the older >> > compiler might not use as many). Once this was done, we = started=3D20 >> > experiencing problems with data being 'shifted' and/or = corrupted=3D20 >> > throughout the applications which didn't immediately cause problems, >> > but either scribbled on someone else's memory and/or bad results... >> > We knew where one of the offending scribbles started (by the >> shifting=3D20 >> > by 1 byte of a structure) and found by comparing binaries with 'older' >> > compiler vs. 
this one that the only major difference was the >> 'density'=3D20 >> > of the SPE instructions... >> > >> > As to your question, Kumar:=3D20 >> > >> =3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D= 3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D=3D3D =3D >> 3D=3D >> > =3D3D=3D3D >> > Naively, I explicitly enabled the SPE in a BSP 'early_init' program=3D20 >> > (as well as enabling Machine Checks) - which is what I meant by >> > Enabling SPE... >>=20 >> Yeah, you don't want to do this. It'll potentially break your >> application. >>=20 >> I'm not that familiar with the CPU you are using but I'm guessing that >> you can't write the MSR from user space anyway. >>=20 >> > Michael explained that it is 'normal' if we asynchronously polled >> > the MSR (in an application and/or in the kernel) that it might be >> > disabled at the moment, but that you do a 'lazy switch' that=3D20 >> > enables it...and gets turned on when an SPE exception comes in... >> > >> > ...ok...I can live with that... >> > >> > -------where I was really going--------- >> > >> > This is where I was trying to go. A developer at our company (who no >> > longer works for us) - did some research/development on the = SPE=3D20 >> > functionality, in the hopes that we could create an optimized library. >> > The results were successful, but because of some of the restrictions=3D20 >> > (including 8 byte alignment for some instructions) - we decided not >> > to incorporate this library into our application(s) >> > >> > But, this developer in his results, indicated that he believed our >> > kernels were NOT properly saving/restoring the upper 32bits of the >> > GPR (which can/will be used in the SPE instructions)... Thus, if the >> > upper 32bits were not saved (and restored when the application got >> > the SPE to operate on)...then, he thought there would be problems. >> > He unfortunately, was unable to finish his work and fix these 'bugs' >> > before he left our company... 
>> > >> > Again, I am only going on his results, and not my own investigations >> > (I am not sure where to start to find this problem to begin with)... >> > >> > So, I was REALLY asking - has anybody else run into this type of >> > problem, and/or the Linux community has recognized this problem and >> > has fixed this? >>=20 >> If GPRs where getting corrupted in userspace, that would be a serious >> bug and would be noticed by someone pretty quickly. >>=20 >> We'd really need a test case to get anywhere with this report. >>=20 >> Mikey From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 22CFEDDDF9 for ; Wed, 6 May 2009 14:24:04 +1000 (EST) Message-Id: From: Kumar Gala To: "Morrison, Tom" In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Mime-Version: 1.0 (Apple Message framework v930.3) Subject: Re: MSR_SPE - being turned off... Date: Tue, 5 May 2009 23:23:46 -0500 References: <13221.1241568099@neuling.org> Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On May 5, 2009, at 7:42 PM, Morrison, Tom wrote: > The test case we found is under 'extreme' duress > (intense loading on an MPC8572)...with many applications.... > using A LOT of SPE instructions... > > ---- > > If you look at the context switch code (in latest code entry_32.S), > I believe the context switch performs a SAVE_NVGPR() - which in our > interpretation (in ppc_asm.h) - only saves the lower 32 bits of > the GPR (stw/lwz)... 
> > This is only a guess of where the problem lies - based upon the single > SPE instruction that seemingly got misinterpreted, and shifts the data > By '1 byte' (and this code gets executed successfully MANY more times > at lower bandwidths - than failures seen at higher bandwidths)... > > ---- > > I am not sure how to proceed...we know how to recreate with our > application, but we would love to know how to change (safely) > the pt_regs to "long long" for the GPRs and then safely move > all 64bits of each GPR into these doubles... > > We could then re-test and see if this helps? > > Tom If you use SPE in an application the full 64-bits are saved and restored it just split into two locations (one for the lower 32-bits and one for the upper 32-bits). Look at load_up_spe and giveup_spe in arch/powerpc/kernel/ head_fsl_booke.S On the 8572 are you running w/SMP? What kernel version are you using if so? Do you see the same issue on the MPC8548? - k From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from empbedexfe.empirix.com (empbedexfe.empirix.com [12.38.203.54]) by ozlabs.org (Postfix) with ESMTP id 3AD0FDDD1C for ; Wed, 6 May 2009 18:31:05 +1000 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: RE: MSR_SPE - being turned off... Date: Wed, 6 May 2009 04:31:58 -0400 Message-ID: References: <13221.1241568099@neuling.org> From: "Morrison, Tom" To: "Kumar Gala" Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Kumar, =20 What about the case of a context switch (i.e.: when things are setup in registers for the SPE, but then a context switch happens before the SPE is executed)?=20 =20 As to load_up_spe & give_up_spe, it was pointed out to me tonight by a = co-worker to look at how things are saved in those routines, I definitely will = look at this again,=20 and see how it is done... 
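Kumar's statement that the full 64 bits are saved and restored, but split across two locations, can be illustrated with a short sketch. It is a toy model under stated assumptions, not the code in `load_up_spe` or `giveup_spe`: the helper names `split_gpr64` and `join_gpr64` are hypothetical, and plain integers stand in for the two save areas.

```python
MASK32 = 0xFFFFFFFF


def split_gpr64(value):
    """Split a 64-bit register value into (upper, lower) 32-bit halves."""
    return (value >> 32) & MASK32, value & MASK32


def join_gpr64(upper, lower):
    """Rebuild the 64-bit register value from the two saved halves."""
    return ((upper & MASK32) << 32) | (lower & MASK32)


# Hypothetical register contents, as an SPE instruction might leave them.
reg = 0x1122334455667788
upper, lower = split_gpr64(reg)

# Restoring both halves gives back the original value.
restored = join_gpr64(upper, lower)

# Restoring only the lower half (the concern raised in the thread)
# would leave the upper 32 bits zeroed in this model.
lower_only = join_gpr64(0, lower)
```

Here `restored` equals `reg`, while `lower_only` keeps only the low word. A save path that stores just the lower half is therefore safe only if the upper half is saved somewhere else, which is what Kumar says the SPE save and restore routines do.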
=20 This is happening for us on an 8572 SMP. We are trying to get it to = happen=20 on 8548 (and single core 8572), but we haven't been able to push this = part=20 of the application as hard as it is being pushed on 8572...but we will = keep trying.... =20 thank you for your patience and suggestions on this...and I will keep = working it =20 Tom=20 ________________________________ From: Kumar Gala [mailto:galak@kernel.crashing.org] Sent: Wed 5/6/2009 12:23 AM To: Morrison, Tom Cc: Michael Neuling; linuxppc-dev@ozlabs.org Subject: Re: MSR_SPE - being turned off...=20 On May 5, 2009, at 7:42 PM, Morrison, Tom wrote: > The test case we found is under 'extreme' duress > (intense loading on an MPC8572)...with many applications.... > using A LOT of SPE instructions... > > ---- > > If you look at the context switch code (in latest code entry_32.S), > I believe the context switch performs a SAVE_NVGPR() - which in our > interpretation (in ppc_asm.h) - only saves the lower 32 bits of > the GPR (stw/lwz)... > > This is only a guess of where the problem lies - based upon the single > SPE instruction that seemingly got misinterpreted, and shifts the data > By '1 byte' (and this code gets executed successfully MANY more times > at lower bandwidths - than failures seen at higher bandwidths)... > > ---- > > I am not sure how to proceed...we know how to recreate with our > application, but we would love to know how to change (safely) > the pt_regs to "long long" for the GPRs and then safely move > all 64bits of each GPR into these doubles... > > We could then re-test and see if this helps? > > Tom If you use SPE in an application the full 64-bits are saved and=20 restored it just split into two locations (one for the lower 32-bits=20 and one for the upper 32-bits). Look at load_up_spe and giveup_spe in arch/powerpc/kernel/ head_fsl_booke.S On the 8572 are you running w/SMP? What kernel version are you using=20 if so? Do you see the same issue on the MPC8548? 
- k From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id AABA5DDDB6 for ; Wed, 6 May 2009 22:32:04 +1000 (EST) Message-Id: <343F1A10-459E-4024-B0DD-ADB1D6DCDB9D@kernel.crashing.org> From: Kumar Gala To: "Morrison, Tom" In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Mime-Version: 1.0 (Apple Message framework v930.3) Subject: Re: MSR_SPE - being turned off... Date: Wed, 6 May 2009 07:31:47 -0500 References: <13221.1241568099@neuling.org> Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On May 6, 2009, at 3:31 AM, Morrison, Tom wrote: > Kumar, > > What about the case of a context switch (i.e.: when things are setup > in registers for the SPE, but then a context switch happens before > the SPE is executed)? context switches will be fine. What we normally do is keep track of which user app used SPE last and when some other app needs it we clear MSR_SPE for the old app, save its registers. Than we load up the registers for the new app and set MSR_SPE. When the old app context switches in it will get an SPE unavail exception at the point it executes its next SPE insn and we will repeat the process. > As to load_up_spe & give_up_spe, it was pointed out to me tonight by > a co-worker > to look at how things are saved in those routines, I definitely will > look at this again, > and see how it is done... > > This is happening for us on an 8572 SMP. We are trying to get it to > happen > on 8548 (and single core 8572), but we haven't been able to push > this part > of the application as hard as it is being pushed on 8572...but we > will keep trying.... Again, what kernel version for 8572? 
Its possible old SMP kernels are broken on 8572. - k > ________________________________ > > From: Kumar Gala [mailto:galak@kernel.crashing.org] > Sent: Wed 5/6/2009 12:23 AM > To: Morrison, Tom > Cc: Michael Neuling; linuxppc-dev@ozlabs.org > Subject: Re: MSR_SPE - being turned off... > > > > > On May 5, 2009, at 7:42 PM, Morrison, Tom wrote: > >> The test case we found is under 'extreme' duress >> (intense loading on an MPC8572)...with many applications.... >> using A LOT of SPE instructions... >> >> ---- >> >> If you look at the context switch code (in latest code entry_32.S), >> I believe the context switch performs a SAVE_NVGPR() - which in our >> interpretation (in ppc_asm.h) - only saves the lower 32 bits of >> the GPR (stw/lwz)... >> >> This is only a guess of where the problem lies - based upon the >> single >> SPE instruction that seemingly got misinterpreted, and shifts the >> data >> By '1 byte' (and this code gets executed successfully MANY more times >> at lower bandwidths - than failures seen at higher bandwidths)... >> >> ---- >> >> I am not sure how to proceed...we know how to recreate with our >> application, but we would love to know how to change (safely) >> the pt_regs to "long long" for the GPRs and then safely move >> all 64bits of each GPR into these doubles... >> >> We could then re-test and see if this helps? >> >> Tom > > If you use SPE in an application the full 64-bits are saved and > restored it just split into two locations (one for the lower 32-bits > and one for the upper 32-bits). > > Look at load_up_spe and giveup_spe in arch/powerpc/kernel/ > head_fsl_booke.S > > On the 8572 are you running w/SMP? What kernel version are you using > if so? Do you see the same issue on the MPC8548? 
> > - k > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from empbedexfe.empirix.com (empbedexfe.empirix.com [12.38.203.54]) by ozlabs.org (Postfix) with ESMTP id 38E9FDDDF3 for ; Wed, 6 May 2009 22:42:16 +1000 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: MSR_SPE - being turned off... Date: Wed, 6 May 2009 08:42:42 -0400 Message-ID: In-Reply-To: <343F1A10-459E-4024-B0DD-ADB1D6DCDB9D@kernel.crashing.org> References: <13221.1241568099@neuling.org> <343F1A10-459E-4024-B0DD-ADB1D6DCDB9D@kernel.crashing.org> From: "Morrison, Tom" To: "Kumar Gala" Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , I'm sorry I forgot to put that, this issue was found with our currently running kernel 2.6.23.final (what comes with the Freescale LTIB BSP package dated 05/23/2009). I am sorry if I don't understand your statement that the SMP might be broken on this kernel, because I tried to analyze the kernel that came with the latest BSP LTIB package from Freescale (dated 12/18/2009 (where we got the 4.2.171 compiler from)), and the associated 'switch context' code is exactly the same. Unfortunately, I have not started the process of porting my current platform's BSP to this new kernel - otherwise, I would have done the test on that platform (this also requires a new version of u-boot in order to test correctly)).. I may have mis-interpreted something and/or I am sure I don't understand everything about the SMP resource management (and associated SPE management), so thank you for any insight you may have on this front... Tom >> -----Original Message----- >> From: Kumar Gala [mailto:galak@kernel.crashing.org] >> Sent: Wednesday, May 06, 2009 8:32 AM >> To: Morrison, Tom >> Cc: Michael Neuling; linuxppc-dev@ozlabs.org >> Subject: Re: MSR_SPE - being turned off... 
>> >> >> On May 6, 2009, at 3:31 AM, Morrison, Tom wrote: >> >> > Kumar, >> > >> > What about the case of a context switch (i.e.: when things are setup >> > in registers for the SPE, but then a context switch happens before >> > the SPE is executed)? >> >> context switches will be fine. What we normally do is keep track of >> which user app used SPE last and when some other app needs it we clear >> MSR_SPE for the old app, save its registers. Then we load up the >> registers for the new app and set MSR_SPE. When the old app context >> switches in it will get an SPE unavail exception at the point it >> executes its next SPE insn and we will repeat the process. >> >> > As to load_up_spe & give_up_spe, it was pointed out to me tonight by >> > a co-worker >> > to look at how things are saved in those routines, I definitely will >> > look at this again, >> > and see how it is done... >> > >> > This is happening for us on an 8572 SMP. We are trying to get it to >> > happen >> > on 8548 (and single core 8572), but we haven't been able to push >> > this part >> > of the application as hard as it is being pushed on 8572...but we >> > will keep trying.... >> >> Again, what kernel version for 8572? It's possible old SMP kernels are >> broken on 8572. >> >> - k >> >> > ________________________________ >> > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 568A1DDDEE for ; Wed, 6 May 2009 22:44:18 +1000 (EST) Message-Id: <62217899-08C0-473A-8784-DEEC64A145DD@kernel.crashing.org> From: Kumar Gala To: "Morrison, Tom" In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Mime-Version: 1.0 (Apple Message framework v930.3) Subject: Re: MSR_SPE - being turned off... 
Date: Wed, 6 May 2009 07:44:01 -0500 References: <13221.1241568099@neuling.org> <343F1A10-459E-4024-B0DD-ADB1D6DCDB9D@kernel.crashing.org> Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Can you describe the # of processes you are running in your test? Is it possible for you to try the tests w/2.6.29 from kernel.org? - k On May 6, 2009, at 7:42 AM, Morrison, Tom wrote: > I'm sorry I forgot to put that, this issue was found with our > currently running kernel 2.6.23.final (what comes with the > Freescale LTIB BSP package dated 05/23/2009). > > I am sorry if I don't understand your statement that the SMP might > be broken on this kernel, because I tried to analyze the kernel that > came with the latest BSP LTIB package from Freescale (dated 12/18/2009 > (where we got the 4.2.171 compiler from)), and the associated 'switch > context' code is exactly the same. Unfortunately, I have not started > the process of porting my current platform's BSP to this new kernel - > otherwise, I would have done the test on that platform (this also > requires a new version of u-boot in order to test correctly)).. > > I may have mis-interpreted something and/or I am sure I don't > understand everything about the SMP resource management (and > associated SPE management), so thank you for any insight you > may have on this front... > > Tom > >>> -----Original Message----- >>> From: Kumar Gala [mailto:galak@kernel.crashing.org] >>> Sent: Wednesday, May 06, 2009 8:32 AM >>> To: Morrison, Tom >>> Cc: Michael Neuling; linuxppc-dev@ozlabs.org >>> Subject: Re: MSR_SPE - being turned off... >>> >>> >>> On May 6, 2009, at 3:31 AM, Morrison, Tom wrote: >>> >>>> Kumar, >>>> >>>> What about the case of a context switch (i.e.: when things are > setup >>>> in registers for the SPE, but then a context switch happens before >>>> the SPE is executed)? >>> >>> context switches will be fine. 
What we normally do is keep track of >>> which user app used SPE last and when some other app needs it we > clear >>> MSR_SPE for the old app, save its registers. Then we load up the >>> registers for the new app and set MSR_SPE. When the old app context >>> switches in it will get an SPE unavail exception at the point it >>> executes its next SPE insn and we will repeat the process. >>> >>>> As to load_up_spe & give_up_spe, it was pointed out to me tonight > by >>>> a co-worker >>>> to look at how things are saved in those routines, I definitely > will >>>> look at this again, >>>> and see how it is done... >>>> >>>> This is happening for us on an 8572 SMP. We are trying to get it to >>>> happen >>>> on 8548 (and single core 8572), but we haven't been able to push >>>> this part >>>> of the application as hard as it is being pushed on 8572...but we >>>> will keep trying.... >>> >>> Again, what kernel version for 8572? It's possible old SMP kernels > are >>> broken on 8572. >>> >>> - k >>> >>>> ________________________________ >>>> > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from empbedexfe.empirix.com (empbedexfe.empirix.com [12.38.203.54]) by ozlabs.org (Postfix) with ESMTP id C1129DDDFA for ; Thu, 7 May 2009 05:32:48 +1000 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: MSR_SPE - being turned off... Date: Wed, 6 May 2009 15:33:53 -0400 Message-ID: In-Reply-To: <62217899-08C0-473A-8784-DEEC64A145DD@kernel.crashing.org> References: <13221.1241568099@neuling.org> <343F1A10-459E-4024-B0DD-ADB1D6DCDB9D@kernel.crashing.org> <62217899-08C0-473A-8784-DEEC64A145DD@kernel.crashing.org> From: "Morrison, Tom" To: "Kumar Gala" Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , After sitting with the developer of the application for a while, we may have two separate issues... 
a) Alignment (aka: alignment exceptions) - Looking at how it handles it And attempts to b) For aligned data - we still contend that if you have enough tasks working >> -----Original Message----- >> From: Kumar Gala [mailto:galak@kernel.crashing.org] >> Sent: Wednesday, May 06, 2009 8:44 AM >> To: Morrison, Tom >> Cc: Michael Neuling; linuxppc-dev@ozlabs.org >> Subject: Re: MSR_SPE - being turned off... >> >> Can you describe the # of processes you are running in your test? Is >> it possible for you to try the tests w/2.6.29 from kernel.org? >> >> - k >> >> On May 6, 2009, at 7:42 AM, Morrison, Tom wrote: >> >> > I'm sorry I forgot to put that, this issue was found with our >> > currently running kernel 2.6.23.final (what comes with the >> > Freescale LTIB BSP package dated 05/23/2009). >> > >> > I am sorry if I don't understand your statement that the SMP might >> > be broken on this kernel, because I tried to analyze the kernel that >> > came with the latest BSP LTIB package from Freescale (dated 12/18/2009 >> > (where we got the 4.2.171 compiler from)), and the associated 'switch >> > context' code is exactly the same. Unfortunately, I have not started >> > the process of porting my current platform's BSP to this new kernel - >> > otherwise, I would have done the test on that platform (this also >> > requires a new version of u-boot in order to test correctly)).. >> > >> > I may have mis-interpreted something and/or I am sure I don't >> > understand everything about the SMP resource management (and >> > associated SPE management), so thank you for any insight you >> > may have on this front... 
>> > >> > Tom >> > >> >>> -----Original Message----- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from empbedexfe.empirix.com (empbedexfe.empirix.com [12.38.203.54]) by ozlabs.org (Postfix) with ESMTP id CAFD4DDE06 for ; Thu, 7 May 2009 06:13:57 +1000 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: MSR_SPE - being turned off... Date: Wed, 6 May 2009 16:15:06 -0400 Message-ID: References: <13221.1241568099@neuling.org> <343F1A10-459E-4024-B0DD-ADB1D6DCDB9D@kernel.crashing.org> <62217899-08C0-473A-8784-DEEC64A145DD@kernel.crashing.org> From: "Morrison, Tom" To: "Kumar Gala" Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sorry, let me try again... >> -----Original Message----- After sitting with the developer of the application for a while, a) Alignment (aka: alignment exceptions) - Looking at how it handles the instruction - it interprets these SPE as common instructions & then resets the 'upper' 32bits. I was just made aware that on 9/14/2007 - Kumar submitted a patch that handles these instructions correctly (we don't have that version - I am in the process of trying to port it to my current version of the kernel (to see if it is part of the problem)). In general, this is a VERY disturbing thing. We 'turn on SPE' in the compiler (-mspe=yes)(a). We are NOT explicitly using SPE instructions in our application(b), BUT(c), the 4.2.171 compiler (having origins from Code Sourcery (via Freescale)) upon optimizations put SPE instructions in without any regard for alignment (which instead of making the code faster - might actually make the code slower)? It's a little disturbing to me. Stay tuned for more details about my port - and seeing if some of my problems go away.. 
b) We still contend if you have multiple tasks using a (VERY) high density of SPE instructions - and the system is taxed heavily (with lots of context switches) - there is the possibility that a task will get unlucky and the registers setup will NOT be there after the context switches back (if some other task does something else with the entire 64bits). Tom >> >> >> -----Original Message----- >> >> From: Kumar Gala [mailto:galak@kernel.crashing.org] >> >> Sent: Wednesday, May 06, 2009 8:44 AM >> >> To: Morrison, Tom >> >> Cc: Michael Neuling; linuxppc-dev@ozlabs.org >> >> Subject: Re: MSR_SPE - being turned off... >> >> >> >> Can you describe the # of processes you are running in your test? Is >> >> it possible for you to try the tests w/2.6.29 from kernel.org? >> >> >> >> - k >> >> >> >> On May 6, 2009, at 7:42 AM, Morrison, Tom wrote: >> >> >> >> > I'm sorry I forgot to put that, this issue was found with our >> >> > currently running kernel 2.6.23.final (what comes with the >> >> > Freescale LTIB BSP package dated 05/23/2009). >> >> > >> >> > I am sorry if I don't understand your statement that the SMP might >> >> > be broken on this kernel, because I tried to analyze the kernel that >> >> > came with the latest BSP LTIB package from Freescale (dated >> 12/18/2009 >> >> > (where we got the 4.2.171 compiler from)), and the associated >> 'switch >> >> > context' code is exactly the same. Unfortunately, I have not started >> >> > the process of porting my current platform's BSP to this new kernel >> - >> >> > otherwise, I would have done the test on that platform (this also >> >> > requires a new version of u-boot in order to test correctly)).. >> >> > >> >> > I may have mis-interpreted something and/or I am sure I don't >> >> > understand everything about the SMP resource management (and >> >> > associated SPE management), so thank you for any insight you >> >> > may have on this front... 
>> >> > >> >> > Tom >> >> > >> >> >>> -----Original Message----- >> From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 2F1BADDF7E for ; Thu, 7 May 2009 07:23:31 +1000 (EST) Message-Id: <3FAB7217-C73A-4DEF-9653-87F96C2F645B@kernel.crashing.org> From: Kumar Gala To: "Morrison, Tom" In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Mime-Version: 1.0 (Apple Message framework v930.3) Subject: Re: MSR_SPE - being turned off... Date: Wed, 6 May 2009 16:23:04 -0500 References: <13221.1241568099@neuling.org> <343F1A10-459E-4024-B0DD-ADB1D6DCDB9D@kernel.crashing.org> <62217899-08C0-473A-8784-DEEC64A145DD@kernel.crashing.org> Cc: linuxppc-dev@ozlabs.org, Michael Neuling List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Since you are mentioning alignment, it looks like my patch for SPE alignment went in after 2.6.23: commit 26caeb2ee1924d564e8d8190aa783a569532f81a Author: Kumar Gala Date: Fri Aug 24 16:42:53 2007 -0500 [POWERPC] Handle alignment faults on SPE load/store instructions This adds code to handle alignment traps generated by the following SPE (signal processing engine) load/store instructions, by emulating the instruction in the kernel (as is done for other instructions that generate alignment traps): You may want to try and see if you can apply that patch to your kernel tree and see what happens. - k On May 6, 2009, at 3:15 PM, Morrison, Tom wrote: > Sorry, let me try again... > >>> -----Original Message----- > After sitting with the developer of the application for a while, > > a) Alignment (aka: alignment exceptions) - Looking at how it > handles the instruction - it interprets these SPE as common > instructions & then resets the 'upper' 32bits. 
> > I was just made aware that on 9/14/2007 - Kumar submitted a > patch that handles these instructions correctly (we don't > have that version - I am in the process of trying to port it > to my current version of the kernel (to see if it is part of > the problem)). > > In general, this is a VERY disturbing thing. We 'turn on > SPE' in the compiler (-mspe=yes)(a). We are NOT explicitly > using SPE instructions in our application(b), BUT(c), the > 4.2.171 > compiler (having origins from Code Sourcery (via Freescale)) > upon > optimizations put SPE instructions in without any regard for > alignment (which instead of making the code faster - might > actually > make the code slower)? It's a little disturbing to me. > > Stay tuned for more details about my port - and seeing if some > of my problems go away.. > > b) We still contend if you have multiple tasks using a (VERY) high > density of SPE instructions - and the system is taxed heavily > (with lots of context switches) - there is the possibility that > a task will get unlucky and the registers setup will NOT be there > after the context switches back (if some other task does something > else with the entire 64bits). > > > > Tom > >>> >>>>> -----Original Message----- >>>>> From: Kumar Gala [mailto:galak@kernel.crashing.org] >>>>> Sent: Wednesday, May 06, 2009 8:44 AM >>>>> To: Morrison, Tom >>>>> Cc: Michael Neuling; linuxppc-dev@ozlabs.org >>>>> Subject: Re: MSR_SPE - being turned off... >>>>> >>>>> Can you describe the # of processes you are running in your test? > Is >>>>> it possible for you to try the tests w/2.6.29 from kernel.org? >>>>> >>>>> - k >>>>> >>>>> On May 6, 2009, at 7:42 AM, Morrison, Tom wrote: >>>>> >>>>>> I'm sorry I forgot to put that, this issue was found with our >>>>>> currently running kernel 2.6.23.final (what comes with the >>>>>> Freescale LTIB BSP package dated 05/23/2009). 
>>>>>> >>>>>> I am sorry if I don't understand your statement that the SMP > might >>>>>> be broken on this kernel, because I tried to analyze the kernel > that >>>>>> came with the latest BSP LTIB package from Freescale (dated >>> 12/18/2009 >>>>>> (where we got the 4.2.171 compiler from)), and the associated >>> 'switch >>>>>> context' code is exactly the same. Unfortunately, I have not > started >>>>>> the process of porting my current platform's BSP to this new > kernel >>> - >>>>>> otherwise, I would have done the test on that platform (this > also >>>>>> requires a new version of u-boot in order to test correctly)).. >>>>>> >>>>>> I may have mis-interpreted something and/or I am sure I don't >>>>>> understand everything about the SMP resource management (and >>>>>> associated SPE management), so thank you for any insight you >>>>>> may have on this front... >>>>>> >>>>>> Tom >>>>>> >>>>>>>> -----Original Message----- >>>
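
Note: the lazy SPE hand-off Kumar describes earlier in the thread (remember which task used SPE last; when another task takes an SPE unavailable exception, save the old task's registers and clear its MSR_SPE, then load the new task's registers and set its MSR_SPE) can be sketched as a small toy model. This is only an illustration in Python, not the kernel's load_up_spe/giveup_spe code. The `Task` and `Cpu` classes, their fields, and the 32-entry `upper` array are hypothetical names chosen here, and the model assumes a single CPU, so it says nothing about the SMP case being debugged.

```python
class Task:
    """A user task; only the SPE-related state is modelled (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.msr_spe = False          # stands in for MSR_SPE in the task's saved MSR
        self.saved_upper = [0] * 32   # saved upper 32-bit halves of the 64-bit GPRs


class Cpu:
    """Single-CPU toy model of lazy SPE register switching."""

    def __init__(self):
        self.upper = [0] * 32         # live upper halves of the GPRs
        self.current = None
        self.last_spe_task = None     # the task whose SPE state is live
        self.unavail_traps = 0

    def switch_to(self, task):
        # A plain context switch does not touch the upper halves at all.
        self.current = task

    def _spe_unavailable(self):
        # Models the "SPE unavailable" exception path.
        task, owner = self.current, self.last_spe_task
        self.unavail_traps += 1
        if owner is not task:
            if owner is not None:
                owner.saved_upper = list(self.upper)   # save the old owner's state
                owner.msr_spe = False                  # its next SPE insn will trap
            self.upper = list(task.saved_upper)        # load the new owner's state
        task.msr_spe = True
        self.last_spe_task = task

    def spe_write_upper(self, reg, value):
        if not self.current.msr_spe:
            self._spe_unavailable()
        self.upper[reg] = value & 0xFFFFFFFF

    def spe_read_upper(self, reg):
        if not self.current.msr_spe:
            self._spe_unavailable()
        return self.upper[reg]


cpu = Cpu()
a, b = Task("A"), Task("B")
cpu.switch_to(a)
cpu.spe_write_upper(14, 0x11111111)   # A traps once, then owns the SPE state
cpu.switch_to(b)
cpu.spe_write_upper(14, 0x22222222)   # B traps; A's upper halves are saved
cpu.switch_to(a)
value_seen_by_a = cpu.spe_read_upper(14)   # A traps again and gets its value back
print(hex(value_seen_by_a), cpu.unavail_traps)
```

In this model task A reads back its own upper half after B has overwritten the live register, at the cost of one trap per change of ownership. A failure like the one reported would correspond to a path that modifies the live upper halves without going through the save/restore step, which the model cannot confirm or rule out for the real kernel.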