[ale] RAID mirror boot nightmare

Bob Toxen transam at VerySecureLinux.com
Wed Jul 11 02:27:41 EDT 2012


All,

PROBLEM SOLVED!  Phil's suggestion that the initrd was wrong was
correct.  I was starting to suspect this myself.  See below for
details on how I got into this mess and how I got out.

Phil: please send me private email with your name as it should appear on
your $50 check and your mailing address.  I was deadly serious about
the reward: I was desperate, as this system is for a client, and I
want to give real thanks!


The command to rebuild the initrd under CentOS is mkinitrd.


In the md superblock there is a field called "Preferred Minor", i.e.,
the preferred minor device number for the md device that is created.
There seems to be no command to simply update this field in place;
apparently one must use mdadm with --create or --assemble to rewrite
the md superblock on the underlying real disk devices.
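
For example (a sketch: mdadm -E prints the field for old 0.90-format
superblocks, and --update=super-minor is the long form of the
-U super-minor option used further below):

  mdadm -E /dev/sda6 | grep 'Preferred Minor'
  mdadm -A /dev/md4 --update=super-minor /dev/sd[ab]6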

Due to the "brilliance" of whoever wrote that md code, on the first
write after any md device is activated, that md device's minor number
is written into the superblock stored on each underlying device, e.g.,
/dev/sda6 and /dev/sdb6.  When I used the Rescue CD, it generated md
devices of the form /dev/md123, 4, 5, 6, 7.  Probably, when I then ran
"fsck -f" (or merely read a file, which causes the file's access time
to be updated) under the Rescue CD's Linux, it changed the preferred
minor device in each underlying disk device's superblock, precipitating
this nightmare.

Unfortunately, on boot the kernel fails to give useful info on what
device it was trying to mount or why it failed -- very UN-Linux-like.

I booted from a different non-RAID partition, mounted the md partitions
now called /dev/md126 and /dev/md0 as /mnt2 and /mnt2/boot and issued
the command

  mkinitrd --fstab=/mnt2/etc/fstab /boot/initrd_126 `uname -r`

and then edited my /boot/grub/grub.conf to change the initrd field to
initrd_126.
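
For reference, the edited stanza would look something like this (a
sketch patterned on the grub.conf entry quoted at the bottom of this
message; the title is made up and your kernel version may differ):

  title CentOS-md126
          root (hd0,0)
          kernel /vmlinuz-2.6.18-308.4.1.el5 ro root=/dev/md126
          initrd /initrd_126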

I then edited /mnt2/etc/fstab to specify /dev/md126 as my / device
and /dev/md0 as /boot.  The kernel doesn't really seem to care about
/etc/mdadm.conf as the newer Linux RAID stores the info in the
"md superblock" (not to be confused with the ext[234] *nix superblock).

The kernel also seemed to ignore the following on the grub kernel line:

  md=4,/dev/sda6,/dev/sdb6

I then booted successfully and copied the new grub.conf and
initrd_126 to my new RAID / and /boot partitions for redundancy.

The above got my md devices running under the new names, but with only
sda, as I had not yet installed my replacement second disk.


To recover with my new empty disk, I installed it as sdb and did:
  sfdisk -d /dev/sda | sfdisk /dev/sdb
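
If the arrays had not started rebuilding onto the new disk on their
own (mine did; see step 4 below), the new partitions could have been
added back by hand, e.g. (a sketch):

  mdadm /dev/md126 --add /dev/sdb6   # root mirror
  mdadm /dev/md0   --add /dev/sdb2   # /boot mirror
  cat /proc/mdstat                   # watch the resync progress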


NOTE 1: Save your partition tables to a file thusly (and then to another
        system):

          sfdisk -d /dev/sda > partition_table_sda
          sfdisk -d /dev/sdb > partition_table_sdb

        To later recover (DANGEROUS):

          sfdisk /dev/sda < partition_table_sda


NOTE 2: For those who do full backups with tar, rsync, etc., which
        do NOT save inode numbers, it is very important to back up
        the inode numbers as well.  Thus, when you eventually suffer
        disk corruption (possible under ext3 occasionally with an
        unclean shutdown), when fsck asks about inode 235255 you can
        grep for it in your backup and know which file may be
        corrupted and in need of a restore.  Also, if a directory
        file gets trashed you will know where to restore the orphan
        files that ended up in /lost+found.

        One way to capture inodes (prior to backup) is the following,
        except prune it to skip /proc and other fake file systems
        (see the sketch below):

          find / -ls > /root/inodes.list
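
        A pruned form might look like this (a sketch; -prune skips
        the named pseudo file systems, and everything else falls
        through to the -ls action):

          find / \( -path /proc -o -path /sys \) -prune -o -ls \
              > /root/inodes.list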


To change the md devices back to my original preferred names, I then did:

  1. Booted to my primary non-raid sda5.

  2. Ensured that no md devices were mounted with the following
     (better than "mount" because it doesn't babble about /dev, /proc,
     etc.):

       df -h

  3. Deactivated the "wrong" md device:

       mdadm -S /dev/md126

  4. Created the "right" md device (this worked because, after
     installing the replacement for my failed sdb disk, I allowed
     RAID to sync automatically over several hours):

       mdadm -A /dev/md4 -v -U super-minor /dev/sd[ab]6

     Verified that md4 was created successfully:
       cat /proc/mdstat
       mdadm -D /dev/md4
       mdadm -E /dev/sd[ab]6

     Alternatively (if there had been an out-of-date file system on
     /dev/sdb6, such as if the replacement disk had been used before),
     I would first have had to scribble over both the md superblock
     and the ext3 superblock with:

       mdadm --zero-superblock /dev/sdb6            # Dangerous
       dd bs=512 count=1 if=/dev/zero of=/dev/sdb6  # Dangerous

Then update /boot/grub/grub.conf, /etc/fstab, and /etc/mdadm.conf.
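
For the mdadm.conf part, Phil's advice below (ARRAY lines with only
the md node and the uuid) can be followed with something like this
sketch; the UUID values shown are placeholders, not real ones:

  # Append the current arrays, then trim each ARRAY line by hand
  # down to just the device and UUID:
  mdadm --detail --scan >> /etc/mdadm.conf

  # After trimming, the ARRAY lines look like:
  #   ARRAY /dev/md4 UUID=<uuid reported by mdadm -D /dev/md4>
  #   ARRAY /dev/md1 UUID=<uuid reported by mdadm -D /dev/md1>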


Btw, to show a disk partition's UUID (suitable for use inside
backquotes in a command):
  blkid -o value /dev/md0  | head -1
  blkid -o value /dev/sda6 | head -1

Btw, to set a partition's UUID (maybe after "dd if=/dev/sda5
of=/dev/sda6"), do the following (except some distros provide uuid
instead of uuidgen):

  tune2fs -U `uuidgen` /dev/sda6


THANKS also to LinuxGnome (first to respond), Scott McBrien, and Erik
Mathis for their help.

ALE comes through again for Linux!

Best regards,
Bob Toxen
bob at VerySecureLinux.com
transam at VerySecureLinux.com [ALE subscriber]

On Tue, Jul 10, 2012 at 08:40:48AM -0400, Phil Turmel wrote:
> Good morning Bob,

> Might be useful to show us the output of "mdadm -E /dev/sda[26]".

> You might just need to run "update-initrd" or whatever the equivalent is
> for CentOS 5.8.  You always need to do this when you rearrange your boot
> devices.



> Phil.

> On 07/10/2012 01:33 AM, Bob Toxen wrote:
> > Additional details on this miserable problem:

> > On Boot the kernel complains of:

> >   Creating root device
> >   Mounting root filesystem
> >   Mount: Could not find filesystem '/dev/root'

> > after talking about md0 apparently being created successfully and lastly
> > panics.

> This suggests that something in your initrd doesn't match your system
> any more.  It's assembling md0 when your mdadm.conf below specifies md1
> and md4.

> > /boot/grub/grub.conf entry  being booted:
> > title CentOS-single-md4
> > 	root (hd0,0)
> > 	kernel /vmlinuz-2.6.18-308.4.1.el5 ro root=/dev/md4 md=4,/dev/sda6,/dev/sdb6 md=1,/dev/sda2,/dev/sdb2 md-mod.start_dirty_degraded=1 rhgb single noresume
> > 	initrd /initrd-2.6.18-308.4.1.el5.img

> You shouldn't need the md=n,/dev/... items in this list if your
> mdadm.conf is correct in the initrd.

> > /etc/mdadm.conf (heavily edited by me including switching from uuid to
> > devices; I don't presently list swap as that is not critical and it
> > fails before even thinking about swap):
> > # mdadm.conf written out by anaconda
> > DEVICE /dev/sda[26] /dev/sdb[26]
> > MAILADDR root
> > ARRAY /dev/md4 level=raid1 num-devices=2 devices=/dev/sda6,/dev/sdb6 auto=yes
> > ARRAY /dev/md1 level=raid1 num-devices=2 devices=/dev/sda2,/dev/sdb2 auto=yes

> I've had the best success when the ARRAY lines have only the md node and
> the uuid.

> > fdisk output:
> > Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
> > 255 heads, 63 sectors/track, 121601 cylinders
> > Units = cylinders of 16065 * 512 = 8225280 bytes

> >    Device Boot      Start         End      Blocks   Id  System
> > /dev/sda1   *           1          13      104391   83  Linux
> > /dev/sda2   *          14          26      104422+  fd  Linux raid autodetect
> > /dev/sda3              27        4200    33527655   82  Linux swap / Solaris
> > /dev/sda4            4201      121601   943023532+   f  W95 Ext'd (LBA)
> > /dev/sda5            4201       62900   471507718+  83  Linux
> > /dev/sda6           62901      121600   471507718+  fd  Linux raid autodetect

> > /etc/fstab:
> > /dev/md4        /                       ext3    defaults        1 2
> > /dev/md1        /boot                   ext3    defaults        1 2

> > #normal /dev/md3        /                       ext3    defaults        1 1
> > #normal /dev/md0        /boot                   ext3    defaults        1 2
> > #normal /dev/md4        /root2                  ext3    defaults        1 2
> > #normal /dev/md1        /boot2                  ext3    defaults        1 2
> > tmpfs                   /dev/shm                tmpfs   defaults        0 0
> > devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
> > sysfs                   /sys                    sysfs   defaults        0 0
> > proc                    /proc                   proc    defaults        0 0
> > /dev/md2                swap                    swap    defaults        0 0


> > What magic am I missing?  Please help!!!