Re: [SLUG] Is RAID worth doing?

From: Ian C. Blenke (ian@blenke.com)
Date: Thu Jul 13 2006 - 13:14:35 EDT


steve szmidt wrote:

>If you are running hardware raid then the O/S does not even know there are
>multiple drives. Under software raid it's managing it. You cannot boot on a
>s/w raid as it's not "up" until after boot has started. So then you need to
>keep a separate non-raided boot partition.
>
>If you go that way, the problem with the second drive is that it would not
>have the boot partition. You could create one on it too, and copy the kernel
>onto it. Don't forget the boot sector.
>
>
The BIOS boots drives in the order it finds them. The approach we use is
to install a boot manager to all drives in the system. If /dev/hda
goes out, we simply re-strap /dev/hdc to be /dev/hda and the system
boots happily. Moreover, if /dev/hda goes out and there is no /dev/hdb
in the system, the BIOS will probe /dev/hdc as BIOS device 0x80, which
allows us to tell the system to boot from the secondary master and the
system comes up with no restrapping at all (in an emergency, that is -
running with only one drive is never good long term).

There is an easy solution to that without running lilo against each
drive (this is what I do; a rough command sketch follows the list):
1. Lilo is raid aware, in that you can install a bootsector to /dev/md0.
2. You use a utility like the "mbr" package to install an MBR to the
boot sectors of both drives.
3. The partitions that make up /dev/md0 need to be marked as "active"
(so the "mbr" utility boots to them by default).
4. The partitions that make up your root volume's ("/") /dev/md0 device
should be marked as type "fd" (Linux raid autodetect).

Now when you boot:
1. Your BIOS loads the bootsector of the first device it finds (BIOS
device 0x80, or the first IDE drive it finds by walking through bus 0
master -> slave and then bus 1 master -> slave).
2. That bootsector starts up the installed "mbr", which then looks for
the first active partition it can find. "mbr" loads the bootsector
installed on that active partition and starts that (lilo).
3. lilo boots, reading the hard-mapped kernel and initrd files for
root=/dev/md0 from the BIOS 0x80 device (the first drive). lilo
hard-maps files when you install a bootsector, unlike grub, which needs
to grok the filesystem and find files dynamically on boot.

The neat bit about this is that you just install lilo to /dev/md0, and
the software raid replicates that bootsector to the other drives. You
only have one lilo.conf file to maintain, and there is no need to
manually install bootsectors to different drives.
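A minimal lilo.conf for that layout might look something like this
(a sketch; the kernel and initrd paths and the label are placeholders):

  boot=/dev/md0
  root=/dev/md0
  # if your lilo supports it, this keeps lilo from also touching the
  # component drives' MBRs, which the "mbr" package already owns
  raid-extra-boot=none

  image=/boot/vmlinuz
          label=linux
          initrd=/boot/initrd.img
          read-only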

I did try grub. I usually prefer grub on my personal single-drive
systems (like laptops). Unfortunately, grub isn't so software raid
aware. The best you can hope for there is to install grub on both
drives, changing the BIOS device to 0x80 when you install to each so
that either drive can boot as primary master and the system will
continue booting.
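For reference, that grub dance is done from the grub shell, something
like this (a sketch; the partition numbers are only examples):

  # install to the first drive as (hd0)
  grub> device (hd0) /dev/hda
  grub> root (hd0,0)
  grub> setup (hd0)

  # then tell grub the second drive is (hd0) too, so the bootsector it
  # writes expects to be BIOS device 0x80 if /dev/hda ever disappears
  grub> device (hd0) /dev/hdc
  grub> root (hd0,0)
  grub> setup (hd0)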

I could go on at length if anyone is interested.

>Linux knows where on the controller chain it is. If you install it f.ex. as
>the primary drive (hda) on the first controller then it cannot boot in any
>other position. Any other drive you want to boot from must obviously be
>bootable. So the boot info must be placed manually on it.
>
>

Also, the boot info is installed to be BIOS device ID aware. If you
install a bootsector to /dev/hdc, the bootstrap will _insist_ that the
hard-mapped info also come from BIOS device 0x81. If you reboot without
a /dev/hda, or restrap /dev/hdc as /dev/hda, you'll find you can't boot,
as the bootsector is trying to access BIOS device 0x81, which isn't
there anymore.
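If you do have to install to /dev/hdc directly, lilo can at least be
told which BIOS device number to bake into the bootsector; something
like this in lilo.conf (again, just a sketch):

  # tell lilo that /dev/hdc will be seen by the BIOS as device 0x80,
  # so the bootsector it writes doesn't go hunting for 0x81
  disk=/dev/hdc
      bios=0x80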

It's a real pain in the ass if you don't know what's going on.

>For this reason I prefer h/w raid. Now raid is loaded long before the o/s is
>and it can go its merry way booting off any drive.
>
>

We are quite happy with software raid. With a bit of careful
partitioning and software raid, you get the benefit of one boot manager
config that maintains itself, without resorting to hardware raid.
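For completeness, the mirror underneath all of this is ordinary md
RAID-1; with mdadm (rather than the older raidtools) creating it looks
something like this sketch:

  # build a two-disk RAID-1 set from the partitions marked type "fd"
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hda1 /dev/hdc1

  # watch the mirror sync and check its state
  cat /proc/mdstat
  mdadm --detail /dev/md0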

>Newer drives have higher data density, but they also have newer and better
>technologies which is a lot better than before. I see quality control issues
>as the biggest problem. All manufacturers mess up every now and then.
>
>

That's very true. Problem drives from a vendor tend to come in batches,
and it's really impossible to predict that ahead of time.

 - Ian C. Blenke <ian@blenke.com> http://ian.blenke.com/

-----------------------------------------------------------------------
This list is provided as an unmoderated internet service by Networked
Knowledge Systems (NKS). Views and opinions expressed in messages
posted are those of the author and do not necessarily reflect the
official policy or position of NKS or any of its employees.


