[ale] Which large capacity drives are you having the best luck with?

Greg Freemyer greg.freemyer at gmail.com
Wed Jan 5 23:31:30 EST 2011


On Wed, Jan 5, 2011 at 8:42 PM, Pat Regan <thehead at patshead.com> wrote:
> On Wed, 5 Jan 2011 19:23:10 -0500
> Greg Freemyer <greg.freemyer at gmail.com> wrote:
>
>> The first thing I look at is POH (Power on Hours).  In this case
>> 27,871.  In my experience this field has reliably been exactly what
>> it says.  So my drive is not exactly new.
>
> That's a pretty good specimen, 3 years powered on :)
>
>> Then look at Reallocated_Sector_Ct.  Mine is zero.  That's cool.
>
> I'm probably not telling you anything you don't already know, but
> when I see this number start to move away from 0 I start ordering a
> replacement drive :)
>
> Interestingly, the X25-M in my laptop is showing 1 reallocated sector.
>
>> But Hardware_ECC_Recovered is 140,010,573.  That may sound large, but
>> remember, the reads succeeded because of the ECC data, so there is no
>> data loss.  I tend to agree with you that as magnetism fades for a
>> sector, checksum failures increase and ECC recovery is needed.
>> Spinrite used as you describe may keep that value lower.
>
> It may sound large, but what does it really mean?  Is it literally the
> number of ECC errors?  How large is a single ECC block?  How many ECC
> blocks are involved in the read of a single sector?  Could every bad
> ECC block in a sector result in the count going up?  5000 per hour
> sounds like a lot of recovered reads to me, assuming it means sectors.

Pat,

A disk drive physical sector is the equivalent of a network packet.

It has a header, a payload, and a footer.

And it has exactly one CRC / ECC value, as I understand it.

The new "Advanced Format" drives from WD have 4KB physical sectors.
The benefit is less space wasted on overhead, i.e. only one
header/footer per 4KB instead of one every 512 bytes.  The ECC field
is bigger, but not 8x bigger.
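
fyi: if you want to check what a given drive reports, both smartctl
and hdparm will show the logical vs. physical sector sizes.  A quick
sketch (assuming a reasonably recent smartmontools/hdparm and a drive
at /dev/sda):

  # logical/physical sector sizes as the drive reports them
  smartctl -i /dev/sda | grep -i 'sector size'
  hdparm -I /dev/sda | grep -i 'sector size'

An Advanced Format drive should report 512 bytes logical / 4096 bytes
physical.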

>> But I don't think spinrite tries to detect sectors that have been ECC
>> recovered.  So it doesn't really know the details.
>
> I would agree.  The drives in my laptop don't report ECC errors via
> smart.
>
>> A smart long self test has the ability to know that an ECC recovery
>> is needed for a sector.  What it does with the knowledge, I don't
>> know.  But it certainly has more knowledge to work with than spinrite.
>
> How thorough do you think a long smart test is?  I've had drives die
> within a day or three of passing a long smart test.  They also don't
> take all that long, and they sure don't cause the drive to make any
> noise :)

I think / believe the long test is primarily a surface test.  Total
disk drive failure is usually not caused by platter surface errors, I
would guess, so I see little reason to expect a correlation.  i.e. I'm
not surprised by a controller failure a few hours after a surface test
showed the platters good.
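
For anyone following along, kicking off and checking a long test is
simple (assuming smartmontools and a drive at /dev/sda):

  smartctl -t long /dev/sda      # start the extended self-test
  smartctl -l selftest /dev/sda  # read the result log when it's done

The test runs inside the drive's firmware, so the host stays usable
while it grinds away.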

>> fyi: hdparm has a long read capability that allows a full physical
>> sector to be read with no error correction!  So spinrite could in
>> theory read all of the sectors with CRC verification disabled and
>> check the CRC itself.  The trouble is that the drive manufacturers
>> implement proprietary CRC / ECC solutions, so spinrite has no way to
>> actually delve into the details of the sector's data accuracy.
>
> I doubt that spinrite does this.

So do I.
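
For reference, the low-level sector access in current hdparm looks
roughly like this (a sketch from memory; check your man page, and note
the LBA below is just a placeholder):

  # dump one raw sector
  hdparm --read-sector 123456 /dev/sda

  # DANGER: overwrites the sector with zeros, which can force a
  # pending sector to be rewritten or reallocated (see below)
  hdparm --write-sector 123456 --yes-i-know-what-i-am-doing /dev/sda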

>> fyi: hdparm has a way to force a write to a pending sector and put
>> new good data on it.  Thus spinrite could do this if it wanted to as
>> well.  I certainly hope it is not doing so.
>
> My understanding is that spinrite attempts to read every sector and
> (eventually) write them back to the disk.  If it fails to read
> correctly it will start reading similarly to dd_rescue.
>
>> > It also doesn't mean that the sector has been reallocated.
>>
>> You imply a sector can be moved without it being reallocated.  I think
>> that is wrong.  The only way to move the sector is to allocate a spare
>> and use it instead of the original.
>
> I think he's implying that the data in the sector can be moved by
> software.  That can't be done without Spinrite understanding the file
> system and making the appropriate changes.  It definitely doesn't do
> this.
>
>> > This forces the drive's firmware to evaluate the performance at
>> > that point, and forces the surface to absorb both a 1 and 0 in turn
>> > at that point.  Also, I believe that the magnetic fields
>> > deteriorate over time.  I could probably corroborate that with some
>> > extensive research.
>>
>> Agreed, but I often store hard drives offline for extended periods.
>> We rarely see read failures for drives we put back online.  So the
>> deterioration is very slow and not likely to be an issue.
>
> How do you know?  Sun published a lot of numbers related to silent data
> corruption.  ECC is pretty fallible.  Especially when it has to be used
> to correct data 150 million times.

We MD5 hash our major data files before we take the drives offline.

When we bring them back we do an MD5 hash verify.

Very few issues seem to crop up due to offline storage, even if it's
been a year or two.
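
The workflow is nothing fancy; roughly this (a sketch, with made-up
paths):

  # before shelving the drive: record hashes of the major files
  find /mnt/evidence -type f -exec md5sum {} + > evidence.md5

  # after bringing it back online: list anything that no longer matches
  md5sum -c evidence.md5 | grep -v ': OK$'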

We also keep tape backups when we do this, and I can barely recall us
ever having to restore from tape.

>> fyi: The DOD uses thermite in the drive platter area to heat the media
>> to several hundred degrees.  When this happens the magnetism is
>> released and the data is gone.
>
> The magic smoke gets out!
>
>> Especially with laptop drives, you get physical damage as the flying
>> head hits the platters from time to time.  To protect the platters,
>> they are often actually coated with a fine layer of diamond dust.
>> That's one reason laptop drives cost more.
>
> Laptop drives are rated for higher g forces than desktop drives.
> Taking both apart, I wouldn't have guessed that; it must have
> something to do with the lower inertia of lighter parts.

Desktop drives are not designed to have the head hit the platter.
i.e. The head is meant to always fly, so a small bump gouges the
platter.

Laptop drives are designed for occasional contact with no damage, so
they can take more G's.  (Thus the diamond dust coating.)

>> > The read invert write read invert write cycle, if nothing else,
>> > will ensure that all the magnetic bits are good and strong since
>> > they are all ultimately rewritten.
>>
>> True, but I think normal degradation is much slower than you imply.
>
> I agree.
>
>> For a drive you've treated with spinrite, what's your ECC_Recovered /
>> POH ratio?
>>
>> ie. Mine is 5000 recoveries per power on hour.  And I don't do
>> anything to "maintain" it.  This is just my desktop machine.
>
> I wish we had old smart numbers for your drive.  I wonder if that ratio
> has been increasing over time and if so, by how much.

Sorry, no history available.
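
If anyone wants that history going forward, a one-line cron job would
collect it (a sketch; attribute names vary by vendor, and the log path
is made up):

  # append a timestamped snapshot of the two attributes once a day
  ( date '+%F %T'; smartctl -A /dev/sda | \
    grep -E 'Power_On_Hours|Hardware_ECC_Recovered' ) \
    >> /var/log/sda-smart.log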

>> I believe a smart long self test will read all of the sectors and
>> identify those that are not ECC Recoverable.  I don't think it will
>> actually reallocate them.
>
> I don't believe a long smart self test touches every sector.  A long
> test runs much too fast for that, at least on the drives I've paid
> attention to.

Do you have a rough GB / min rate for the long test?

With modern drives, I can dd if=/dev/zero of=/dev/sda bs=4k at about
6GB / min.  I can't say I've done a lot of the long tests, but they
seem to run at about the same speed.

My 250GB drive says 92 minutes to run the long test.  That works out
to under 3GB/min, half the drive's sequential rate, so there is plenty
of time for a full surface scan.
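
The drive's own estimate is in the SMART capabilities page, e.g.:

  smartctl -c /dev/sda | grep -A1 'Extended self-test'

which is where the 92 minute figure above comes from.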

>> What spinrite likely does is read the sector in various ways.  e.g.
>> many data recovery tools can read the sectors in reverse order.  This
>> causes the drive to align the head slightly differently, I believe.
>> Due to that slight change, some bad sectors can be read.  So I
>> actually do think spinrite could have some logic for this that the
>> normal read logic would not have.
>
> BACKUPS, BACKUPS, BACKUPS.  And then more backups.  Data recovery is
> for people who don't have backups :).

I tend to work a lot with other people's drives, so I'm the one making
the backup!  (And I never see spinrite recommended by people in my
industry - computer forensics.)

For files I control, I tend to keep two copies of important files on a
minimum of 2 media, often 3 or more.  But I've had 2 drives fail
within hours of each other!  (Both young drives, during the height of
Seagate's problems.  Fortunately, replacing the controller card on one
of the drives brought it back to life.)

>> > Again, this may or
>> > may not trigger sector reallocation.
>>
>> I surely hope that writing to a sector that previously had read
>> failures not handleable via ECC recovery triggers a reallocation.
>
> If it doesn't then you're out of spare sectors and the drive is ready
> for the scrap heap.  This is also one of the reasons why you want to
> image a bad drive onto a good drive.  Once you start writing you can
> really screw things up even more.
>
>> >  Spinrite will report these data
>> > areas as recovered or unrecovered as appropriate.  The drive itself
>> > may still be fully usable, if, for example, the data error was
>> > caused by a power failure, but the drive was not damaged.  If
>> > sectors start getting reallocated, I would agree that it's time to
>> > consider changing the drive out, as I did with one of mine last
>> > night.
>>
>> I'm not so sure I agree.  A lot of reallocates are just physical
>> platter issues.  It used to be that drives shipped new with lots of
>> reallocated sectors.
>>
>> Admittedly, new ones tend to ship with zero these days.
>
> Drives are cheap.  RMAing a drive is cheap.  If a drive starts acting
> up and doesn't want to stay in one of my RAIDs it is time to replace
> it.  I saw a 5900 rpm seagate 2 TB drive on sale for $100 shipped
> today.
>
> I could probably also argue that I don't completely trust a drive fresh
> out of the box, either :)

I actually like the idea of putting 20 or 30 or more operational hours
on a drive before putting real data on it.  We write a single pass of
zeros to new drives before putting them in service, but a couple of
years ago with Seagate, that was not enough of a burn-in.  Maybe 3 or
4 passes would have done it.
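
badblocks can do exactly that multi-pass burn-in; something like this
(a sketch, and destructive, so only run it on a new or empty drive):

  # four write+verify passes (0xaa, 0x55, 0xff, 0x00) over the
  # whole drive, with progress output
  badblocks -wsv /dev/sdX

followed by a SMART long test and a look at the reallocated/pending
counts before trusting the drive with real data.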

> Pat
Greg



More information about the Ale mailing list