public inbox for linuxppc-dev@ozlabs.org 
From: Anthony Foiani <tkil@scrye•com>
To: Xie Shaohui-B21989 <B21989@freescale•com>
Cc: Wood Scott-B07421 <B07421@freescale•com>,
	"linuxppc-dev@lists•ozlabs.org" <linuxppc-dev@lists•ozlabs.org>
Subject: Re: SATA hang on 8315E triggered by heavy flash write?
Date: Wed, 22 May 2013 23:52:23 -0600
Message-ID: <g8v362qg8.fsf@dworkin.scrye.com>
In-Reply-To: <ED492CCEAF882048BC2237DE806547C90B1DFA93@039-SN2MPN1-013.039d.mgd.msft.net> (Xie Shaohui-B's message of "Wed, 22 May 2013 06:15:14 +0000")


Shaohui --

Thanks for the quick reply!  Please find my investigation and results
below.

Xie Shaohui-B21989 <B21989@freescale•com> writes:

> 1. only update NOR for a long enough time, for ex. tens of seconds,
>    see if error happens;

It seems that I can do this without any errors:

  / # flash_erase /dev/mtd1 0 0
  Erasing 64 Kibyte @ 7f0000 -- 100 % complete 
  / # dd if=/dev/zero of=/dev/mtd1 
  dd: writing '/dev/mtd1': No space left on device
  16385+0 records in
  16384+0 records out
  8388608 bytes (8.0MB) copied, 62.399439 seconds, 131.3KB/s

> 2. only r/w SSD without NOR operation, see if error happens;

Again, no problem:

  /ssd # ls -al biggie.bin
  -rw-r--r--    1 root     root     2330607084 May 22 19:34 biggie.bin
  /ssd # ls -alh biggie.bin
  -rw-r--r--    1 root     root        2.2G May 22 19:34 biggie.bin
  /ssd # time cp biggie.bin biggie2.bin
  real    3m 27.55s
  user    0m 2.60s
  sys     2m 16.13s
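
As a cross-check on the two baseline runs above, the effective rates
can be recomputed from the byte counts and elapsed times in the
transcripts.  This is only a sketch: the `rate` helper is a
hypothetical name, and it assumes that `real` from `time` is
wall-clock time (3m 27.55s = 207.55 s) and that busybox reports KB
and MB in multiples of 1024.

```shell
# rate BYTES SECONDS DIVISOR: print BYTES / SECONDS / DIVISOR to one decimal.
rate() {
    awk -v b="$1" -v s="$2" -v d="$3" 'BEGIN { printf "%.1f\n", b / s / d }'
}

rate 8388608 62.399439 1024        # NOR write alone, in KiB/s
rate 2330607084 207.55 1048576     # SSD copy alone, in MiB/s
```

The first figure should match the 131.3KB/s that dd printed; the
second puts the SSD copy at roughly 10.7 MiB/s of file data.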

> 3. r/w SSD first and keep it run, then start to read NOR, if no
>    error for a long time, then start to write NOR, see how long the
>    error will happen.

Doing a NOR read during heavy SATA r/w seems to succeed, with no
errors on the console:

  [window 1]
  /ssd # time cp biggie.bin biggie2.bin

  [window 2]
  / # dd if=/dev/mtd1 of=/dev/null
  16384+0 records in
  16384+0 records out
  8388608 bytes (8.0MB) copied, 6.380613 seconds, 1.3MB/s

Doing a NOR write fails almost instantly (within a second):

  [window 1]
  /ssd # time cp biggie.bin biggie2.bin

  [window 2]
  / # dd if=/dev/zero of=/dev/mtd1 

  [console]
  [ 5160.269106] ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x6 frozen
  [ 5160.276387] ata2.00: failed command: READ DMA
  [ 5160.280905] ata2.00: cmd c8/00:00:60:f3:01/00:00:00:00:00/e0 tag 0 dma 131072 in
  [ 5160.280928]          res 50/00:00:f0:c0:48/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
  [ 5160.296386] ata2.00: status: { DRDY }
  [ 5160.300195] ata2: hard resetting link
  [ 5160.347858] ata2: setting speed (in hard reset)
  [ 5170.439981] ata2: No Signature Update
  [ 5170.611901] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
  [ 5170.618204] ata2.00: link online but device misclassified
  [ 5175.623918] ata2.00: qc timeout (cmd 0xec)
  [ 5175.628147] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
  [ 5175.634347] ata2.00: revalidation failed (errno=-5)
  [ 5175.639373] ata2: hard resetting link
  [ 5176.143847] ata2: Hardreset failed, not off-lined 0
  [ 5176.155867] ata2: setting speed (in hard reset)
  [ 5185.743871] ata2: No Signature Update
  [ 5185.915900] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
  [ 5185.922203] ata2.00: link online but device misclassified
  [ 5195.927910] ata2.00: qc timeout (cmd 0xec)
  [ 5195.932140] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
  [ 5195.938342] ata2.00: revalidation failed (errno=-5)
  [ 5195.943430] ata2: hard resetting link
  [ 5196.443885] ata2: Hardreset failed, not off-lined 0
  ...

At this point, a hard reset / full power cycle is needed to recover.
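
For reading the "SStatus 113 SControl 310" values in the log above,
here is a small decoding sketch.  It assumes the standard SATA
register layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8)
and that the kernel prints both registers in hex; the function name
is made up for illustration.

```shell
# decode_sata_reg HEX: split an SStatus/SControl value into its three fields.
decode_sata_reg() {
    val=$((0x$1))    # the log prints the register in hex, without a 0x prefix
    echo "DET=$(( $val & 0xf )) SPD=$(( ($val >> 4) & 0xf )) IPM=$(( ($val >> 8) & 0xf ))"
}

decode_sata_reg 113    # SStatus
decode_sata_reg 310    # SControl
```

Under that layout, SStatus 113 reads as DET=3 (device present, PHY
communication established), SPD=1 (Gen1, 1.5 Gbps) and IPM=1
(active).  SControl 310 reads as DET=0 (no reset requested), SPD=1
(speed limited to Gen1) and IPM=3 (partial and slumber transitions
disabled).  So the link itself comes back up after the reset even
though IDENTIFY keeps timing out.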

The board is an MPC8315ERDB derivative, and I'm running a patched
3.4.36 kernel.
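
For repeated runs, the failing sequence from step 3 could be wrapped
in a single script along these lines.  This is a sketch, not
something that was run for this report: the function name and the
SRC, DST, MTD, DELAY and DD_COUNT variables are hypothetical, with
defaults taken from the transcripts above.  Note that it overwrites
whatever MTD points at.

```shell
# Start a large SATA copy in the background, then write to NOR while it runs.
stress_sata_then_write_nor() {
    src=${SRC:-/ssd/biggie.bin}
    dst=${DST:-/ssd/biggie2.bin}
    mtd=${MTD:-/dev/mtd1}

    cp "$src" "$dst" &               # "window 1": heavy SATA read/write
    cp_pid=$!
    sleep "${DELAY:-5}"              # give the copy time to get going

    # "window 2": NOR write.  With DD_COUNT unset, dd runs until the
    # device is full, as in the transcript.
    dd if=/dev/zero of="$mtd" ${DD_COUNT:+count=$DD_COUNT}

    wait "$cp_pid"                   # returns the exit status of the copy
}
```

Calling `stress_sata_then_write_nor` with no variables set repeats
the commands shown for windows 1 and 2; the console log still has to
be watched separately for the ata2 errors.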

I've uploaded some (possibly) relevant files to:

  http://foiani.home.dyndns.org/~tony/linux/ppc-sata-issues-201305/

There is a diff from 3.4.36, a devtree, and a kernel config.

Please let me know if there is any more information that I can
contribute.

Best regards,
Anthony Foiani
