[ale] Possibly bad hard drive

JD jdp at algoloma.com
Thu Feb 21 09:38:49 EST 2013


A few weeks ago, a 12.04 server here refused to boot after a kernel update.
That machine has a default LVM config, which created a 220MB ext2 /boot
partition.  What happened was that the new kernel didn't fit while it was
being installed; /boot was 100% full.  I booted off the previous kernel and
had to use apt-get purge to remove about 10 old kernels.

Anyway, that might be the issue on your system too.
Check both inode usage and actual disk space:

$ df -i
$ df -k

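After purging the old kernels, an apt-get -f install made everything work
again with the new kernel fully installed.  apt-get clean only empties the
downloaded package cache under /var/cache/apt/archives, so to my knowledge it
doesn't free anything in /boot.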
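
A rough sketch of the /boot check and kernel cleanup described above, assuming
a Debian/Ubuntu system with dpkg.  The helper names fs_usage and
list_old_kernels are made up for illustration; they are not part of any
package.

```shell
#!/bin/sh
# Print block and inode usage for a mount point (defaults to /boot).
# df -P forces single-line POSIX output so awk can pick column 5 reliably.
fs_usage() {
    mp=${1:-/boot}
    blocks=$(df -P "$mp" | awk 'NR==2 {print $5}')
    inodes=$(df -Pi "$mp" 2>/dev/null | awk 'NR==2 {print $5}')
    echo "$mp: blocks $blocks, inodes $inodes"
}

# List installed kernel image packages, skipping the running kernel.
# Nothing is removed here; the output is only a list of purge candidates.
list_old_kernels() {
    running=$(uname -r)
    dpkg -l 'linux-image-[0-9]*' 2>/dev/null |
        awk '/^ii/ {print $2}' |
        grep -v -F -- "$running"
}
```

Run fs_usage /boot first, then list_old_kernels, and purge the candidates one
at a time with sudo apt-get purge PACKAGE-NAME, keeping at least one older
kernel that is known to boot.  Finish with sudo apt-get -f install.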

If you really want to exercise the entire HDD, which gives the hardware and
firmware a chance to relocate failing sectors AND refresh any _lazy bits_, then
use ddrescue to mirror the entire HDD device to a second, identical HDD. Or
you might buy SpinRite or find some GNU/free program that does what SpinRite
does.  I think this can also help MS-Windows machines that BSOD when booting;
I believe lazy bits cause that more often than people realize.
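
A minimal sketch of that mirror, assuming GNU ddrescue (the gddrescue package
on Ubuntu) and run from a live CD with neither disk mounted.  mirror_disk is a
hypothetical wrapper name, and the device names in the usage note are
placeholders you must replace after checking with lsblk or fdisk -l.

```shell
#!/bin/sh
# Mirror one whole disk onto another with GNU ddrescue.
# Usage: mirror_disk SOURCE_DEVICE TARGET_DEVICE [MAPFILE]
mirror_disk() {
    src=$1
    dst=$2
    map=${3:-rescue.map}

    # Refuse the obvious mistake of copying a disk onto itself.
    if [ "$src" = "$dst" ]; then
        echo "source and target must be different devices" >&2
        return 1
    fi
    # Only operate on whole block devices, never on directories or files.
    if [ ! -b "$src" ] || [ ! -b "$dst" ]; then
        echo "both arguments must be block devices" >&2
        return 1
    fi

    # Pass 1: copy everything that reads cleanly, skipping the slow
    # scraping phase (-n).  -f is required to overwrite a device.
    ddrescue -f -n "$src" "$dst" "$map" &&
    # Pass 2: go back to the bad areas with direct access (-d) and up to
    # three retries (-r3).  The map file lets ddrescue resume its work.
    ddrescue -f -d -r3 "$src" "$dst" "$map"
}
```

Called as root, for example mirror_disk /dev/sdX /dev/sdY, this overwrites
everything on the target, so triple-check which device is which.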

The best and only real protection from hardware issues is having enough
versioned backups that, if any files turn out to be corrupted, you can go
back far enough to restore good copies.  Is 30 days enough? 60? 90?  I don't
know.

IME, SMART data has never been very useful. By the time SMART warns about
anything, it is too late.

On 02/21/2013 08:31 AM, Jonathan Meek wrote:
> Hey guys,
> 
> I believe my hard drive is about to give up the ghost hard. Saturday the system
> booted without issue then Monday, I booted the system and tried to start
> Firefox. The taskbar freaked out and I had to do a hard shutdown. After multiple
> restarts I was able to get the system back up but at all the restarts, I got a
> error message that stated that it couldn't find a particular directory and to
> press Ctrl+Alt+Del to restart the system.
> 
> Finally I got the system to give the prompt for entering in my harddrive
> password (I have an encrypted hard disk that I setup when I did a fresh install
> of lubuntu last time). It checked for errors and found some, it tried to repair
> them and hung at mounting /tmp.
> 
> I restarted the system and this time it rebooted without issue I got all the way
> to the home screen and logged in. Launched Firefox without issue and goofed
> around for a few minutes while I let my backup system backup for the final time
> (in fear of never getting it back).
> 
> Shutdown the system and restarted it with a Ubuntu 12.04 live CD in order to do
> check the hard drive. Went into Disk Utility and the system recognized I had a
> hard drive but when I tried any of the benchmarks it balked at me saying it
> couldn't read as well as the SMART Status said "not applicable". This might be
> from the encryption but I don't know.
> 
> Exited out of the live CD, boot the system again, and it booted without
> incident. Tried to do software update, it griped at me saying that there was not
> enough room in /boot to do an update and to use sudo apt-get clean. Run sudo
> apt-get clean and tried the update again. Same error message. Repeated this step
> 5 times before giving up.
> 
> I am not sure what to do at this stage with it because I can't seem to wipe the
> drive probably due to the encryption because I tried to install Ubuntu 12.04
> since I had a backup of all my data.
> 
> All that backstory was to ask this one question: Is there anything else I can do
> to give some level of assurance the actual status of the hard drive? I think it
> is busted but I am 

