[ale] Seagate 1.5TB drives, bad blocks, md raid, lvm, and hard lock-ups

Greg Freemyer greg.freemyer at gmail.com
Wed Jan 6 16:38:03 EST 2010


Brian,

If you're running raid5 with those drives, you have basically zero fault
tolerance.

The issue is that if one drive totally fails, you are almost guaranteed
to have some bad sectors on the remaining drives.  Those bad sectors
will prevent mdraid from rebuilding the array fully.  (It may rebuild
the stripes that don't have any bad sectors, but the stripes that do
have bad sectors are definitely not rebuildable.)

So at a minimum you need to be running mdraid raid6.  And even then
you will only achieve raid5 reliability (i.e., with drives that size,
mdraid raid6 will likely NOT survive a double disk failure).
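
To put a rough number on the rebuild risk described above, here is a
back-of-the-envelope sketch.  It assumes a single 8-drive raid5 (so 7
surviving 1.5 TB drives must be read in full) and the commonly quoted
consumer-drive spec of one unrecoverable read error per 10^14 bits.
Neither figure comes from this thread, and the function name is made up
for illustration.  Drives that are already growing bad sectors will do
worse than the spec.

```shell
#!/bin/sh
# Estimate the chance that a full read of the surviving drives hits at
# least one unreadable sector.
# ASSUMPTIONS (not from the thread): errors are independent, and the
# drive meets its quoted "1 error per N bits read" spec.
#
# usage: rebuild_failure_probability SURVIVING_DRIVES DRIVE_BYTES BITS_PER_ERROR
rebuild_failure_probability() {
    awk -v n="$1" -v bytes="$2" -v per="$3" 'BEGIN {
        bits = n * bytes * 8          # total bits that must be read
        p = 1 - exp(-bits / per)      # Poisson approximation
        printf "%.2f\n", p
    }'
}
```

With those assumed inputs,
`rebuild_failure_probability 7 1500000000000 100000000000000` comes out
at roughly 0.57, i.e. better than even odds of at least one bad stripe
even with healthy drives.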

And then you need to be running background scans on a routine basis.
I've forgotten the exact command, but you can tell mdraid to scan the
entire raid volume and verify that the parity info is right.  In theory
doing that will handle the bad sectors as they pop up: a sector that
fails to read is rebuilt from the other drives and written back, which
gives the drive a chance to remap it.
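
For reference, the scan described above is normally started by writing
`check` to the array's `sync_action` file in sysfs.  A minimal sketch,
assuming the array is md0 and that you run it as root.  The function
names and the SYSFS_ROOT variable are made up for illustration.

```shell
#!/bin/sh
# Helpers around the md sysfs scrub interface.
# SYSFS_ROOT is a hypothetical override so the functions can be tried
# against a scratch directory; on a real system it stays at /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# start_check md0: ask md to read every stripe and verify the parity.
start_check() {
    echo check > "$SYSFS_ROOT/block/$1/md/sync_action"
}

# scrub_state md0: prints the current action, e.g. "idle" or "check".
scrub_state() {
    cat "$SYSFS_ROOT/block/$1/md/sync_action"
}

# mismatch_count md0: inconsistencies counted by the most recent check.
mismatch_count() {
    cat "$SYSFS_ROOT/block/$1/md/mismatch_cnt"
}
```

Typical use would be `start_check md0`, then polling `scrub_state md0`
until it reports `idle` again (progress also shows up in /proc/mdstat),
and finally reading `mismatch_count md0`.  Running it from cron weekly
or monthly is a common arrangement.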

Unfortunately it sounds like your drives are creating bad sectors
faster than you can force them to be remapped, even with routine
background scans.
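
One way to see how fast the drives are degrading is to log the raw
Reallocated_Sector_Ct value for each drive on a schedule and compare
the numbers over time.  A rough sketch, assuming smartctl prints its
usual ten-column attribute table with the raw value in the last column.
The function names are hypothetical.

```shell
#!/bin/sh
# reallocated_count: read `smartctl -A` output on stdin and print the
# raw Reallocated_Sector_Ct value (column 10 of the attribute table).
reallocated_count() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}

# report_reallocated /dev/sda /dev/sdb ...
# Print one "date device count" line per drive; append the output to a
# log file to track growth between runs.
report_reallocated() {
    for dev in "$@"; do
        printf '%s %s %s\n' "$(date +%F)" "$dev" \
            "$(smartctl -A "$dev" | reallocated_count)"
    done
}
```

For example, `report_reallocated /dev/sd[a-h] >> /var/log/realloc.log`
from a daily cron job (the log path is only an example) gives a simple
history to judge whether the counts are still climbing.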

Greg

On Wed, Jan 6, 2010 at 3:09 PM, Brian W. Neu <ale at advancedopen.com> wrote:
> I have a graphic design client with a 2U server running Fedora 11 and now 12
> which is at a colo handling their backups.  The server has 8 drives with
> Linux md raids & LVM on top of them.  The primary filesystems are ext4 and
> there is/was an LVM swap space.
>
> I've had an absolutely awful experience with these Seagate 1.5 TB drives,
> returning 10 out of the original 14 due to the ever increasing SMART
> "Reallocated_Sector_Ct" due to bad blocks.  The server that the client has
> at their office has a 3ware 9650 (I think) that has done a great job of
> handling the bad blocks from this same batch of drives and sending email
> notifications of one of the drives that grew more and more bad blocks.  This
> 2U though is obviously pure software raid, and it has started locking up.
>
> As a stabilizing measure, I've disabled the swap space, hoping the lockups
> were caused by failure to read/write from swap.  I have yet to let the
> server run over time and assess if this was successful.
>
> However, I'm doing a lot of reading today on how md & LVM handle bad blocks
> and I'm really shocked.  I found this article (which may be outdated) which
> claimed that md relies heavily on the firmware of the disk to handle these
> problems and when rebuilding an array there are no "common sense" integrity
> checks to assure that the right data is reincorporated back into the healthy
> array.  Then I've read more and more articles about drives that were
> silently corrupting data.  It's turned my stomach.  Btrfs isn't ready for
> this, even though RAID5 was very recently incorporated.  And I don't see
> btrfs becoming a production stable file system until 2011 at the earliest.
>
> Am I totally wrong to suspect bad blocks of causing the lock-ups?
> (syslog records nothing)
> Can md RAID be trusted with flaky drives?
> If it's the drives, then other than installing OpenSolaris and ZFS, how do I
> make this server reliable?
> Any experiences with defeating mysterious lock-ups?
>
> Thanks!
>
> ------------------------------SMART Data-----------------------------
> [root@victory3 ~]# for letter in a b c d e f g h ; do echo /dev/sd$letter; smartctl --all /dev/sd$letter |grep Reallocated_Sector_Ct; done
> /dev/sda
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
> /dev/sdb
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
> /dev/sdc
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
> /dev/sdd
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
> /dev/sde
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
> /dev/sdf
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
> /dev/sdg
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
> /dev/sdh
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
>



-- 
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
Preservation and Forensic processing of Exchange Repositories White Paper -
<http://www.norcrossgroup.com/forms/whitepapers/tng_whitepaper_fpe.html>

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com


