RAID (Redundant Array of Inexpensive Disks) is a scheme in which data is stored
on hard disks in one of several ways to allow for more speed, more protection,
or robust use of data. RAID systems offer enough advantages that they are an
important part of server hardware, but their application in Linux systems
presents a few problems.
Intro to RAID
RAID levels vary, but the most common are
0, 1, and 5. RAID level 0 offers no protection of data, but does
increase performance as well as provide an increase in device size. In
RAID 0 you can stripe data across two or more disks. With two disks,
this yields a virtual device available to the OS that is the sum
of the two actual devices being used (provided
they are the same size; otherwise the yield is twice the size of
the smaller of the two devices). For example, using RAID 0 on two
2G drives will yield an available file system of 4G with that file system
being striped across both disks. That means that the data
alternates between disks in chunks, so reads and writes
are spread across the disks regularly, which provides
a noticeable performance boost. The most
time-consuming parts of a read operation are seek time and rotational
latency. When you stripe data between two or more disks, you can
be reading data from one disk while the other is seeking or spinning.
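
As a rough illustration of the striping described above, here is a minimal Python sketch. The function names are made up for this example, and the simple round-robin layout is an assumption; a real RAID 0 implementation may place chunks differently.

```python
def locate_chunk(chunk_index, num_disks):
    """Map a logical chunk number to (disk number, chunk position on that disk).

    Assumes plain round-robin striping: chunk 0 goes to disk 0,
    chunk 1 to disk 1, and so on, wrapping around.
    """
    if num_disks < 2:
        raise ValueError("striping needs at least two disks")
    return chunk_index % num_disks, chunk_index // num_disks


def raid0_capacity(disk_sizes):
    """Usable size by the rule in the text: number of disks times the smallest disk."""
    return len(disk_sizes) * min(disk_sizes)


# Six consecutive chunks on a two-disk stripe alternate between disk 0 and disk 1.
print([locate_chunk(i, 2) for i in range(6)])
# Two 2G drives give a 4G device.
print(raid0_capacity([2, 2]))
```

Because consecutive chunks land on different disks, a long sequential read keeps both disks busy at once.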
RAID level 1 is the first level that offers protection. This level
allows you to use two or more disks to "mirror" each other and thus
data is reproduced in whole across more than one disk. If one disk
fails, then the RAID system being used will simply ignore the failed disk
and use only operable disks. The usable space of a two-disk RAID 1
array is the size of one of the disks (if they are the same
size; otherwise it is the size of the
smaller of the two disks). RAID 1 performance for reads should be
good because the RAID system can balance reading between all disks
in the array. Write performance is poor, however, since all writes
must occur on all disks. That's fine for many systems (especially
file server type systems) since they are heavily read oriented (often
as high as 95% of the time, in fact).
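
The mirroring behaviour can be sketched with a toy model in Python. The `Mirror` class and its method names are hypothetical, each "disk" is just a dictionary, and the round-robin read policy is an assumption used to illustrate balancing, not a description of any particular RAID implementation.

```python
class Mirror:
    """Toy RAID 1: every write goes to all working disks, reads are rotated."""

    def __init__(self, num_disks):
        if num_disks < 2:
            raise ValueError("mirroring needs at least two disks")
        self.disks = [{} for _ in range(num_disks)]  # block number -> data
        self.failed = set()
        self._reads = 0

    def _working(self):
        return [i for i in range(len(self.disks)) if i not in self.failed]

    def write(self, block, data):
        # The cost of mirroring: one logical write becomes a write per disk.
        for i in self._working():
            self.disks[i][block] = data

    def read(self, block):
        working = self._working()
        if not working:
            raise IOError("all mirrors have failed")
        # Rotate through the working disks to balance the read load.
        disk = working[self._reads % len(working)]
        self._reads += 1
        return disk, self.disks[disk][block]

    def fail(self, disk):
        self.failed.add(disk)


m = Mirror(2)
m.write(0, b"payroll")
print(m.read(0), m.read(0))  # served by disk 0, then disk 1
m.fail(0)
print(m.read(0))             # the failed disk is ignored; disk 1 answers
```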
RAID level 5 is the best compromise in terms of both data availability
RAID 5 allows you to create an array of at least three disks. The
usable space is usually the size of one disk multiplied by the number
of disks minus one. For example, the amount of usable space when
using four 9G disks would be three times 9G, or 27G. There may be
a little extra overhead as well, but this is close. With RAID 5,
data is stored in a striped format across all disks. What is a little
different is that parity information is also striped along with the
data. So, if any disk has lost its contents, it can be re-created using
the data that is still available plus the parity. Since striping
is used, performance is generally good. Read performance is good
since reads are balanced across disks. Write performance is
slightly slower than normal due to the parity generation and
the extra data to write. Performance does drop considerably when the array
loses a disk, though, as some data will have to be re-created by reading
several disks and generating the data based on the data and parity
information. The payoff is that all the data is still available,
even though an entire disk may have died.
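
To make the reconstruction idea concrete, here is a small Python sketch. It assumes parity is the bytewise XOR of the data blocks in a stripe, which is the usual choice for RAID 5 although the text above does not spell it out. The helper name `xor_blocks` and the sample data are invented for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a four-disk array: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Suppose the disk holding data[1] dies. XORing the surviving data blocks
# with the parity block gives back the lost block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Rebuilding one block means reading the matching block from every surviving disk, which is why a degraded array is so much slower.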
RAID is an important and hot topic among those who run servers that
are critical for some reason. One of the more common reasons that
servers go down is hard disk failure. This can be one of
the worst failures to try to fix since hardware and data must be
replaced. At best this is going to involve hours of down time while
backups are restored. Even then, data is lost that was written between
the last backup and the disk failure. The best way to avoid this
scenario is to not lose the data at all. Enter RAID. RAID allows
you to use multiple disks in such a way that if one dies the rest
can keep working without loss of data. An administrator is notified
of the failure and, if hot swappable drives are used, the administrator
can repair the machine without ever having to take it off line. Even
if hot swappable devices are not used, the down time is minimal since
only the hardware has to be replaced and the system can be brought
back on line immediately. RAID will then rebuild the data
on the new drive in the background while the server keeps working.
There are three ways to do RAID on Intel PC hardware. The most common
is the PCI SCSI RAID controller. The problem with these under Linux
is that many are high end and require an NDA to get the programming
information. These NDAs are prohibitive to free software because the
source code cannot be released.
The next most common way to do RAID under Linux is through the use of
a SCSI to SCSI RAID controller. This requires a supported SCSI controller
(of which there are many). On that controller's bus sits the RAID controller,
which simply looks like one or more drives (depending on how you set up
the array in the controller). The RAID controller then has its own
SCSI bus(es) connected to the physical drives that
make up the array(s).
The least used but most promising way to do RAID
is in software via the Linux kernel. You can use any supported
block device (IDE disks, supported SCSI, etc.) to set up a RAID array.
All RAID operations are handled by kernel threads. This will be
available in full form in the 2.2 Linux kernel.
PCI SCSI RAID Controllers
There are two PCI SCSI RAID controllers that
are now available and supported under Linux, the DPT and the
ICP-Vortex. Linux developers are working on support for the
Mylex cards as well, but it hasn't been completed yet.
The DPT has been supported for a few years now, but
the problem with it is that it requires DOS or SCO to run its
configuration utility. There have been promises to port it to
Linux, but it has yet to happen. DPT does support Linux by
providing the information necessary, but the actual work is
being done mostly by Michael Neuffer. Unfortunately he hasn't
yet had time to port the StorageManager to Linux, but he does
seem to think he will have time to do it in the not-so-distant
future. The DPT does have some nice features including multiple channel controllers and caching.
There is an audible alarm on the card as well as driver
notification available to an external process. There is no
monitoring software, but a simple program to monitor the RAID
array and email an administrator would be easy to write.
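
A minimal Python sketch of such a monitor might look like the following. How the disk states would be read from the DPT driver is not covered here, so the `disk_states` dictionary, the state string "ok", and the addresses are all hypothetical placeholders. The sketch only builds the alert message; actually sending it (for example with `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_alert(array_name, disk_states, admin="root@localhost"):
    """Return an alert email if any disk is not 'ok', otherwise None.

    disk_states is a hypothetical mapping of disk name -> state string,
    filled in by whatever mechanism the driver provides.
    """
    bad = sorted(name for name, state in disk_states.items() if state != "ok")
    if not bad:
        return None
    msg = EmailMessage()
    msg["To"] = admin
    msg["From"] = "raid-monitor@localhost"  # placeholder sender address
    msg["Subject"] = f"RAID array {array_name}: {len(bad)} disk(s) need attention"
    msg.set_content("Disks reporting problems: " + ", ".join(bad) + "\n")
    return msg


alert = build_alert("array0", {"disk0": "ok", "disk1": "failed", "disk2": "ok"})
if alert is not None:
    print(alert["Subject"])
```

A real monitor would call something like this from a loop or a cron job and hand any non-empty result to the local mail system.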
The ICP-Vortex is fully
supported by Linux and doesn't require any other OS for its
configuration. All ICP-Vortex configuration happens at the BIOS
level of the card. There is a Linux daemon that will alert
the system administrator if there are problems with any of the disks
in the array. ICP has several models available ranging from cards
that can do only RAID 0 and 1 to multi-channel RAID 5 cards (and
even Fibre Channel). The RAID 0/1 card can be upgraded via software
to full RAID 5 capability if you later find the need for
RAID 5. All cards have the ability to add cache via a 72-pin
SIMM socket (both EDO and standard 50ns SIMMs work fine).
The ICP-Vortex has only been supported since kernel 2.0.33, but the company
appears very dedicated to supporting Linux. They wrote and maintain
their own Linux driver and are becoming very active in the Linux
community (they will be exhibiting at LinuxExpo this year).
SCSI to SCSI Controllers
The SCSI to SCSI solutions vary wildly. There are solutions from
many companies including Mylex, CMD, and others. These solutions
are usually external or require a full height slot in the case
for the controller. Administration is done via
an LCD panel with buttons on the front, via a terminal emulator
and a serial port, or both, depending on the model.
You set up your array on the controller itself and then the
controller presents the array to the OS as one logical drive (or
more if that's how you set it up). The biggest drawback to these
types of controllers is that they are usually on the expensive side.
The CMD controllers are the only ones I know of that offer Linux
software to do the administration. They have several models
available including one that is a dual redundant hot swappable
controller. If one of your controllers fails you can replace it
without bringing your server down at all! CMD controllers can
also be upgraded to multiple channels with expansion cards.
Many companies produce SCSI to SCSI solutions. My own personal
experience is only with the Mylex. The Food for the Hungry International
server runs Linux on a Mylex DAC-960 SUI
RAID controller and has performed extremely well (their server was
built by Linux Hardware Solutions). They
have been very happy with the performance and ease of use of their
Software RAID
The final solution is kernel software RAID. RAID 0 and 1 were introduced
in the kernel quite some time ago, and patches are now available for
the 2.0.x kernel that allow RAID 0-5. This will be a standard option
in the 2.2 kernel.
Software RAID has been proven to have the advantage
of speed over all of the hardware options that have been tested. It
also has the advantage of allowing the use of any supported block
device. That means that you can mix things like IDE and SCSI in one
array, which is completely impossible with existing hardware solutions
(I doubt you would want to, but it may at least be helpful in situations
where you have lost a drive and the only available replacement is an
IDE one). The main disadvantage appears to be that software RAID spends
CPU cycles calculating parity, and a server may need those cycles for
other things.
The biggest advantage of software RAID is price. Hardware RAID
controllers seem to range in price from about $500 to $5,000. For
many systems it will likely be sufficient to use a single $200
supported SCSI controller (non RAID) and simply use the kernel's
software RAID across multiple disks on that controller. Folks
on a really tight budget might even want to go with RAID 5 across
four similar IDE disks using the IDE controller built into today's
motherboards. With 9G UDMA IDE drive prices at about $400, this is
becoming a very interesting option. You can get about 27G of RAID
5 protected space for a grand total of $1600 if you already have
a machine available with two free IDE controllers.
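
The arithmetic behind that figure can be checked with a few lines of Python. The function name is invented for this sketch, and the prices are simply the ones quoted above.

```python
def raid5_budget(num_disks, disk_size_gb, price_per_disk):
    """Return (usable GB, total disk cost, cost per usable GB) for RAID 5.

    Usable space is the size of one disk times the number of disks minus one.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    usable = (num_disks - 1) * disk_size_gb
    total = num_disks * price_per_disk
    return usable, total, total / usable


usable, total, per_gb = raid5_budget(4, 9, 400)
print(usable, total, round(per_gb, 2))  # 27 1600 59.26
```

The cost per usable gigabyte counts the disks only. A hardware RAID controller would be added on top of that figure.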
Software RAID does have one current problem, though, and that is
the inability to do full RAID for the root filesystem. This can be
overcome by using a boot floppy or a small non-RAID boot partition
to boot your system from. I don't
consider this a real obstacle, though. Any server should have a
floppy drive available anyway, and given that they are cheap and
easy to replace should one of *those* die, they make perfect boot
media for this. You can even keep extra copies of your boot floppy
around if you are worried about them. Once the 2.2 kernel is
available (and stable...), we hope to have
RAID support at install time that
allows you to set up a system using a boot floppy,
small boot partition, or perhaps some other form of removable
media (Zip drive, CD-ROM, etc).
To summarize, RAID is very much alive and well on Intel PC hardware.
There are supported PCI RAID controllers that work well and more to
come (Mylex is working with Linux developers to get their card
supported). All SCSI to SCSI solutions should work very well, because
they are OS independent. Software RAID is coming along very nicely and
could be the best choice down the road. The performance is excellent
and the low cost is a huge bonus.
I do have benchmark reports from various sources on many of these
controller options, but I am not going to print them. No single
source was able to compare all three of the major types of RAID
controllers (PCI, SCSI to SCSI, software). That means that we have
no real comparison of all types on similar hardware. It would be
unfair to print numbers given that, so I won't.
I will discuss performance, though. Given my experience, I doubt
you would see too much difference between a single channel PCI controller
and a SCSI to SCSI solution. If you have performance problems on
either one, you need to move to software RAID with multiple controllers
or a multi-channel PCI controller. The best news might be that
software RAID seems to be performing much better than any of the
hardware solutions available. It is still not completely mature,
but there are several folks using it in some mission critical
places and it is working very well. Given the price and the fact
that the performance is awesome, I expect many sites to move
from hardware RAID to software RAID when the 2.2 kernel is released.
If you're looking for a vendor to supply Linux RAID solutions, look
at http://www.redhat.com/redhat/hardware-list.phtml. Many of the
vendors listed there can supply RAID ready Linux machines, including
Linux Hardware Solutions and VA Research. Thanks to both of those
companies for contributing data for this article.
If you know of other RAID solutions that work under Linux, please
feel free to mail me and hopefully I'll be able to re-visit this topic in a future article.
© 1998 OS News