[ale] HD Failures: bad luck or technical error??

aaron aaron at pd.org
Sat Feb 2 13:28:09 EST 2002


Thanks to the folks who offered input on the HD failures; my replies are 
below for anyone interested.

The only "definitive" indication seems to be that some drive models or makes 
in combination with certain controller hardware can be inexplicably 
problematic. None of the responses point to a known file system problem 
or any special partitioning issues that could be corrected.

Guess I just have to hope for better karma with the 2nd Western Digital 
drive while backing up my system on a regular basis in case the ghost in 
the machine gets cranky again.

Or maybe using ext3 will exorcise the demon.
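
For the regular backups mentioned above, here is a minimal sketch of a 
mirror script. It assumes rsync is installed and that the second drive is 
mounted somewhere; the function names and the paths in the usage comment 
are hypothetical, not something taken from my actual setup.

```python
import subprocess


def build_mirror_command(source, target, dry_run=False):
    """Return an rsync argument list that mirrors source into target."""
    # -a preserves permissions, ownership and timestamps;
    # --delete removes files from the target that no longer exist in the source.
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        # Report what would be copied or deleted without changing anything.
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than nesting the directory itself inside the target.
    cmd += [source.rstrip("/") + "/", target.rstrip("/") + "/"]
    return cmd


def mirror(source, target, dry_run=False):
    """Run the mirror; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_mirror_command(source, target, dry_run), check=True)


# Example (hypothetical paths): preview a mirror of /home to a backup drive.
# mirror("/home", "/mnt/backup/home", dry_run=True)
```

Doing a dry run first seems wise, since --delete will happily remove 
files from the backup if the source path is wrong.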

Hi Ho.
aaron

------

msmith at mikeandmel.com offered:
> I am surprised you could get Redhat to install on your Maxtor drive at 
> all.  I had to download a tool from Maxtor to turn on Write Verify to
> get the drives to work with Redhat 7.2....after 2 weeks of trying all 
> kinds of things....

I did have to work with the Maxtor Tech support on the first drive that 
failed before they would replace it. I had to download and build a DRDOS 
boot floppy with their utilities to attempt repairs and confirm the 
failure details before they would issue an RMA.  However, the actual RH 
6.1 install to that drive didn't raise any special problems or difficulty 
that I can recall.

> What were the exact errors you were getting?
> Were the errors CRC errors?

At this late date, I don't remember the exact errors from the first two 
failures. When the 2nd Maxtor went south like the first, I just moved on 
to another brand.

---

"Brian J. Dowd" <bdowd at DentFirst.com> had this to say:
 
> Is the power to your machine regulated by a UPS or power strip?
> Are there any electric motors or space heaters on the same circuit?
> (Just questioning power surges and spikes as a possible culprit.)

Power to my office systems is tech ground and pretty clean. Equipment is 
on surge protector power strips, though I don't have anything on a true 
UPS. There are no competing high-amp appliances on the circuits and the 
computer was off during the most recent power outages. I used to run the 
system 24/7 when usage was justifying it, but I only run it "on demand" 
now that I'm not working from home as much.

> I've had universally bad luck with Maxtor but pretty good success with 
> WD drives.

The Western Digital I dropped in outlasted either of the Maxtors I tried 
(like 6 months is something impressive ;-). However, given that a lot of 
people use Maxtor HD's without problems, it could easily be some odd 
incompatibility quirk between my IDE controller chipset / firmware and 
certain drive models.

> -Brian

---

Courtney Thomas <ccthomas at flash.net> continued:

> I too have recently started getting "short read" errors on a WD 60Gig 
> drive but only on the logical partition, none on the primary. I have 
> been running several Maxtor drives on this system for years without a 
> problem.

The 800 meg 5.25" full-height Maxtor SCSI drive that I bought for my Amigas 
back in 1989 (for several hundred dollars) ran like a top for over a 
decade at 50 or more hours a week. That's why I bought a Maxtor for my 
Linux system, though I now question whether Maxtor is still the same 
company with the same product quality these days.

> I am running this drive under Debian-2.4.6 [Firewire] and am having no 
> problems at all with the Maxtor drives [non-Firewire] nor the primary 
> partitions on the Firewire [IDE] but have several non-logical
> partitions on the same Firewire drive that are not getting short
> reads, whatever short reads means.

> I find that when I reboot that the problem evaporates temporarily. Does 
> mounting the Firewire drive r/o on error have anything to do with 
> getting short reads ?

All of my experiences were with IDE drives on IDE connections. "Short 
Reads" was the lone error message I got just before the drive 
died.

> Perplexedly,
> Courtney

---

Original Post:

> The good news is I just updated my main system from RH 7.1 to RH 7.2.
> I like it. Actually, I like it a lot. The latest updates to Gnome have
> made it screamingly fast!!
>
> The BAD news is that my latest OS install is the 3rd one in about 18
> months and ALL of my updates have been motivated by physical hard drive
> failures. (If hardware failures require you to install and config a
> system, may as well install and configure the latest and greatest.)
>
> My hardware is a decent roll your own box... a tower case with ample
> air space, 250w power supply, reputable Tyan mother board w/ dual IDE
> on board, PIII 600 CPU, generous amounts of memory and a quality Matrox
> 400 AGP graphics card. All the distros I've experimented with have
> recognized my hardware and RH always seems to install just fine. My
> practice when installing on a new drive is to use the Disk Druid tools
> to manually set up separate /, /boot, /home, /tmp, /var, /usr and swap
> partitions, including a full format of all partitions with bad block
> checks.
>
> About 4 months after my initial RH 6.1 install, I started getting some
> odd errors and program crashes, ending a couple weeks later with an
> unreadable hard drive. Maxtor replaced the 20 gig drive under warranty
> with an identical unit and I used the HD swap as an excuse to update to
> RH 6.2. The 6.2 system was fine for another 5 months or so, then crash
> problems started cropping up again. I now had reason to suspect the
> hard drive, so I just went out and bought a 15 gig Western Digital at
> RadioShack.com and upgraded to RH 7.1 in the process. (I also got the
> dealer to exchange the Maxtor for a 15 gig Fuji and made that one a
> mirror / backup drive for paranoia's sake.)
>
> Things had been going well for about 8 months, then a couple weeks ago
> my largest partition started coming up with "short read" errors which
> required manual fsck passes to restore before the system would boot.
> Subsequently, the drive started making some loud, ominous clicking
> noises at boot time, and eventually the disk showed up as unbootable
> (thankfully AFTER I had time to run an up-to-date mirror to my "backup"
> drive).
>
> So... I found a 40 gig Western Digital 7200 for $90 at a recent computer
> fair, have dropped it in and installed RH 7.2 from a commercial CD set
> I purchased.
>
> So I'm looking for opinions on what might be causing my problems.
> Am I just having a run of bad luck with these drives?? Are these low
> priced IDE units just too cheap?? Could my MB IDE controller or some
> obscure BIOS setting error be initiating these failures?? Am I missing
> some special trick for formatting and partitioning?? Will the new ext3
> file system magically solve everything??
>
> I'd like to get to a point where I do upgrade migrations on MY
> schedule, not the hard drive's!  Any helpful suggestions for doing that
> will be appreciated.
>
> peace
> (after justice)
> aaron

---
This message has been sent through the ALE general discussion list.
See http://www.ale.org/mailing-lists.shtml for more info. Problems should be
sent to listmaster at ale dot org.





