RAID

From MythTV Official Wiki
Revision as of 22:25, 29 March 2006 by Jdastrup (talk | contribs) (Setup)

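RAID (Redundant Array of Independent Disks) is a mechanism for combining multiple disk drives into a single logical volume to provide redundant file storage, better performance, or both.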

Quick Overview

Performance Expectations

There are many opinions on which RAID level is best for performance, but here are the facts:

  • RAID 5 is the slowest*
  • RAID 1 is in the middle (speed is equivalent to not using RAID)
  • RAID 0 (and RAID 0+1, 10, 1+0, etc) is the fastest

And here's why:

RAID 5

RAID 5 is the slowest because your computer has to calculate parity for every write and then write that parity to disk. This can be considerable overhead with software RAID or budget RAID controllers. Parity is redundant data, distributed across all the drives, that allows you to lose any one hard drive without losing data. Read speeds are good.

*You can significantly increase RAID 5 performance, sometimes even beyond RAID 1 performance, if you do the following:

  • Use a hardware RAID solution
  • Use a Battery-Backed Write-back Cache, or BBWBC (only available on high-end RAID controllers)
  • Add more disks

In general, unless you have server-class hardware and SCSI disks, expect RAID 5 to be slower than RAID 1 or 0.
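The parity idea can be illustrated with plain XOR arithmetic. This is only a toy sketch and not how mdadm or any controller is implemented: the helper name xor_blocks and the sample values are made up for illustration, and a single number stands in for a whole disk block.

```shell
#!/bin/sh
# Toy model of RAID 5 parity on a three-disk array (two data blocks + one parity block).
# xor_blocks is a hypothetical helper; real arrays do this per stripe, on whole chunks.

# XOR all arguments together and print the result.
xor_blocks() {
    result=0
    for block in "$@"; do
        result=$(( result ^ block ))
    done
    echo "$result"
}

d1=165   # data block on disk 1 (sample value)
d2=60    # data block on disk 2 (sample value)

# On every write the parity block has to be recomputed and written as well.
parity=$(xor_blocks "$d1" "$d2")
echo "parity block: $parity"

# If disk 1 is lost, its block can be rebuilt from the surviving blocks.
rebuilt=$(xor_blocks "$parity" "$d2")
echo "rebuilt disk 1 block: $rebuilt"
```

The rebuilt value equals the original d1, which is why any single drive can be lost. The extra parity calculation and write on every update is the overhead that makes RAID 5 writes slow.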

RAID 1

RAID 1 (mirroring) is typically neither faster nor slower than a single disk, for reads or writes. With software RAID 1, performance can be slightly degraded, depending on your CPU and controller configuration.

RAID 0

RAID 0 (striping) is the fastest for both reading and writing because your computer reads and writes different data to two disks at the same time, theoretically doubling the performance of RAID 1. (Note: RAID 0 is not the same as spanning, or extending, a volume across disks.) Software RAID 0 can be nearly as fast as hardware RAID 0. You can increase the speed of RAID 0 by:

  • Adding more disks

RAID 0+1 (or 1+0 or 10)

RAID 0+1 (mirroring a striped set) is about as fast as RAID 0, and RAID 10 (striping a set of mirrors) performs similarly. These RAID levels exist only to provide redundancy to RAID 0.

Capacity

Assuming all disks are of equal size (if this isn't the case, use the size of your smallest disk):

  • RAID 0 = N (Number of disks, or 100% efficient)
  • RAID 1 = 1 (One disk, or 50% efficient with two disks)
  • RAID 5 = N-1 (Total capacity minus one disk, efficiency varies, but no less than 66% and increases as drives are added)
  • RAID 0+1 = N/2 (Half the total capacity of your disks, or 50% efficient)
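
The capacity rules above can be turned into a small calculator. This is a minimal sketch: the function name raid_usable and its argument order are made up for illustration, and it assumes every disk has the same size (use the size of your smallest disk otherwise).

```shell
#!/bin/sh
# raid_usable LEVEL DISKS SIZE
# Prints the usable capacity, in the same unit as SIZE.
# Hypothetical helper; assumes all disks are the same size.
raid_usable() {
    level=$1
    n=$2
    size=$3
    case "$level" in
        0)      echo $(( n * size )) ;;         # every disk holds data
        1)      echo "$size" ;;                 # all disks hold the same data
        5)      echo $(( (n - 1) * size )) ;;   # one disk's worth goes to parity
        10|0+1) echo $(( n / 2 * size )) ;;     # half the disks are mirrors
        *)      echo "unknown RAID level: $level" >&2; return 1 ;;
    esac
}

# Example: three 300 GB disks in RAID 5.
raid_usable 5 3 300
```

For instance, four 300 GB disks give 900 GB in RAID 5 but only 600 GB in RAID 10, which is the capacity-versus-speed trade-off discussed in this article.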

Redundancy

  • RAID 0 = No redundancy. Losing any disk results in total data loss
  • RAID 1 = Lose up to 1 disk without any data loss
  • RAID 5 = Lose up to 1 disk without any data loss
  • RAID 0+1 = Lose half your disks (if they are on the same stripe set) without any data loss. Or, lose 1 disk from one stripe set without any data loss.

SCSI vs IDE

In general, SCSI outperforms IDE in RAID arrays because it is much better at handling multiple simultaneous reads and writes. If you must use IDE, use the fastest controllers available (SATA) and the fastest disks available. Also, put each disk on its own controller; avoid placing disks on both channels of the same IDE controller.

Which one do I choose?

If you don't care about your data, go with RAID 0 for speed. If you don't want to lose your data and have 2 disks, go with RAID 1. If you have 3 or more disks, it's really a toss-up between RAID 5 and RAID 10 (minimum 4 disks). It comes down to speed and $. If you don't need speed and don't have $, go with RAID 5. If you need speed, get the $ and go with RAID 10. With regard to Myth, several users have tried RAID 5 and the results are mixed. If you only have 1 or 2 SD tuners, RAID 5 should be fine. Once you get multiple HD tuners or multiple frontends, RAID 5 often can't keep up. Your results may vary.

RAID For Recordings Drive

A few options exist for using RAID for the recordings drive, depending on your goal: speed, redundancy, or both. RAID 0 gives you the most speed from your drives; RAID 0+1 (also written RAID 01) gives you speed and 1:1 redundancy. RAID 5 gives you the most capacity for your dollar, but write speeds can be poor.

RAID For Archives Drive

Keeping a separate drive array for archiving shows you wish to keep lets you set up one RAID for speed (the recordings drive) and another for redundancy (the archive drive). Once a show has been recorded, commercial flagged, and possibly transcoded to another format or cut for permanent commercial removal, it can be moved to the archive. In such a case, RAID 5 and RAID 10 make the most sense. If you plan on heavy access to the archive, RAID 10 may make more sense, as it will most easily keep up with the required transfer rate while still providing redundancy, at the cost of the extra drives required. RAID 5 will have a slight speed advantage over simply combining numerous drives (JBOD, Just a Bunch Of Disks, in hardware RAID; linear in mdadm), and gives the most archival bang for your buck while still maintaining parity in case a drive is lost.

Setup (Software RAID)

For setting up hardware RAID, see your RAID controller's documentation. For software RAID, creating a RAID array with mdadm is quite easy. The Software-RAID HOWTO Performance section will help here, as different RAID types have different best values for chunk and block sizes. Since we will be dealing only with large files (recorded MPEGs, music files, etc.), it is recommended to choose the largest chunk and block sizes that together give the highest performance.

Partitioning

Before a RAID array can be created on a disk, the disk must be partitioned; a tool such as cfdisk can be used for this. The easiest way is to create a single partition that spans the whole drive.

Note: You must also set the partition type to "fd" (Linux raid autodetect)!

RAID 5

The following line will create a RAID array with the following characteristics:

  • RAID 5 on /dev/md0
  • 3 drives, /dev/sda1, /dev/sdb1, and /dev/sdc1
  • chunk size = 32K
  • no spare
  • verbose level of output

# mdadm -v --create /dev/md0 --force --chunk=32 --level=raid5 \
      --spare-devices=0 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1

RAID 10

RAID 10 is really built from two or more arrays. First, create as many RAID 1 (mirrored) arrays as you wish to have,

# mdadm -v --create /dev/md0 --chunk=32 --level=raid1 --raid-devices=2 /dev/sda1 /dev/sdb1
# mdadm -v --create /dev/md1 --chunk=32 --level=raid1 --raid-devices=2 /dev/sdc1 /dev/sdd1

and so on until you have the number of mirrors you wish to stripe together into a RAID 0.

Once these have completed building (see below), you can create the RAID 0, striped, array,

# mdadm -v --create /dev/md2 --chunk=32 --level=raid0 --raid-devices=2 /dev/md0 /dev/md1
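
For more than two mirrors the pattern simply repeats. The sketch below prints the commands for any even number of partitions rather than running them, so the output can be reviewed first. The function name raid10_commands is made up for illustration, and it assumes /dev/md0 upward are unused; the mdadm options are the same ones used in the commands above.

```shell
#!/bin/sh
# raid10_commands PART1 PART2 PART3 PART4 [...]
# Prints (does not run) the mdadm commands for a stripe of mirrored pairs.
# Hypothetical helper; assumes /dev/md0, /dev/md1, ... are free to use.
raid10_commands() {
    if [ $# -lt 4 ] || [ $(( $# % 2 )) -ne 0 ]; then
        echo "need an even number of partitions, at least 4" >&2
        return 1
    fi
    pairs=$(( $# / 2 ))
    i=0
    mirrors=""
    # One RAID 1 array per pair of partitions.
    while [ $# -gt 0 ]; do
        echo "mdadm -v --create /dev/md$i --chunk=32 --level=raid1 --raid-devices=2 $1 $2"
        mirrors="$mirrors /dev/md$i"
        shift 2
        i=$(( i + 1 ))
    done
    # One RAID 0 array striped across all the mirrors.
    echo "mdadm -v --create /dev/md$i --chunk=32 --level=raid0 --raid-devices=$pairs$mirrors"
}

raid10_commands /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
```

With four partitions this prints the same three commands shown above. Remember that the mirrors should finish building before the final RAID 0 command is run.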

RAID Creation Confirmation

You will be prompted with the RAID parameters, and asked to continue,

mdadm: layout defaults to left-symmetric
mdadm: /dev/sdc1 appears to contain a reiserfs file system
    size = -192K
mdadm: size set to 293049600K
Continue creating array?

Status of RAID Creation

Upon confirmation you will only see

mdadm: array /dev/md0 started.

After running the command to create the RAID array, you can watch the build progress with,

# cat /proc/mdstat

and you will see something along the lines of,

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [raid6] [raid10]
md0 : active raid5 sdc1[3] sdb1[1] sda1[0]
      586099200 blocks level 5, 32k chunk, algorithm 2 [3/2] [UU_]
      [=>...................]  recovery =  5.8% (17266560/293049600) finish=69.8min speed=65760K/sec

unused devices: <none>
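
If you only want the percentage, for example in a script that waits for the build to finish, the progress line can be picked out with awk. This is a minimal sketch that assumes the /proc/mdstat layout shown above; the function name mdstat_progress is made up for illustration.

```shell
#!/bin/sh
# mdstat_progress [FILE]
# Prints "<array> <percent>" for each array that is rebuilding or resyncing.
# Reads /proc/mdstat unless another file is given.
# Hypothetical helper; assumes the mdstat layout shown in this article.
mdstat_progress() {
    awk '
        /^md[0-9]+ :/ { array = $1 }          # remember which array we are in
        /(recovery|resync) *=/ {              # the progress line for that array
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print array, $i
        }
    ' "${1:-/proc/mdstat}"
}

# Usage: mdstat_progress
```

No output means that no array is currently being built or rebuilt.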

Generate Config File

Now we need to set up '/etc/mdadm.conf'. This can be done by copying the output of

# mdadm --detail --scan

to '/etc/mdadm.conf', which should end up looking similar to,

ARRAY /dev/md0 level=raid5 num-devices=3 UUID=2d918524:a32c7867:11db7af5:0053440d
   devices=/dev/sda1,/dev/sdb1,/dev/sdc1
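
Rather than copying by hand, the scan output can be appended with a small helper that skips lines already in the file, so it is safe to run more than once. This is a sketch under assumptions: the function name append_arrays is made up for illustration, and some distributions keep the file at /etc/mdadm/mdadm.conf instead.

```shell
#!/bin/sh
# append_arrays CONF_FILE
# Reads ARRAY lines on stdin and appends those not already present in CONF_FILE.
# Hypothetical helper; only exact, whole-line duplicates are skipped.
append_arrays() {
    conf=$1
    touch "$conf"
    while IFS= read -r line; do
        case "$line" in
            ARRAY*)
                # -x: match whole line, -F: fixed string, -q: quiet
                grep -qxF -- "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
                ;;
        esac
    done
}

# Usage (as root):
#   mdadm --detail --scan | append_arrays /etc/mdadm.conf
```

Check the resulting file by eye afterwards, since an array recreated with a new UUID will show up as an additional ARRAY line rather than replacing the old one.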

RAID Filesystem Creation

Once your RAID array is created, you can place a filesystem on it. JFS and XFS are the two recommended filesystems for large-file arrays, especially for the recordings drive in Myth. Again, we will follow the Software-RAID HOWTO Performance section and go with a 4K (4096-byte) block size.

For XFS (replace md0 with your final RAID array if using a mixed mode array),

# mkfs.xfs -b size=4096 -L Recordings /dev/md0 -f

or for JFS,

# mkfs.jfs -c -L Recordings /dev/md0 -f

(It is not recommended to use a JFS partition for your boot drive when using GRUB)
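
Since chunk size matters for performance, XFS can also be told the stripe geometry through the su (stripe unit) and sw (stripe width, in data disks) suboptions of mkfs.xfs -d. Recent versions of mkfs.xfs usually detect this from md arrays on their own, so treat the following as an optional sketch. The function name xfs_stripe_opts is made up for illustration, and it only prints the option string.

```shell
#!/bin/sh
# xfs_stripe_opts LEVEL DISKS CHUNK_KIB
# Prints the "-d su=...,sw=..." option for mkfs.xfs; it does not run mkfs.
# Hypothetical helper; su is the chunk size, sw the number of data-bearing disks.
xfs_stripe_opts() {
    level=$1
    disks=$2
    chunk_kib=$3
    case "$level" in
        0)  data_disks=$disks ;;              # every disk holds data
        5)  data_disks=$(( disks - 1 )) ;;    # one disk's worth is parity
        10) data_disks=$(( disks / 2 )) ;;    # half the disks are mirrors
        *)  echo "no striping for RAID level: $level" >&2; return 1 ;;
    esac
    echo "-d su=${chunk_kib}k,sw=${data_disks}"
}

# The three-disk RAID 5 with 32K chunks created earlier:
xfs_stripe_opts 5 3 32
```

The printed option would be added to the mkfs.xfs command line above, for example after the -b size=4096 option.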

Mounting

That's it, now you're ready to mount the filesystem! You can add a line to your /etc/fstab similar to,

/dev/md0       /MythTV/tv             xfs     defaults        0       0

or

/dev/md0       /MythTV/tv             jfs     defaults        0       0

which will mount the filesystem at boot and lets "mount -a" pick it up, so go ahead and mount the filesystem,

# mount -a

Monitoring

Most distributions ship an init.d daemon that monitors your mdadm arrays and notifies you when anything of note occurs.
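
If your distribution does not provide such a daemon, the most important check is easy to sketch yourself: an underscore in the [UU_] status field of /proc/mdstat marks a missing or failed member. The function name degraded_arrays is made up for illustration, and the sketch assumes the mdstat layout shown earlier on this page. It is not a replacement for mdadm's own monitor mode.

```shell
#!/bin/sh
# degraded_arrays [FILE]
# Prints the name of every array whose status field contains "_".
# Reads /proc/mdstat unless another file is given.
# Hypothetical helper; assumes the status field is the last field on its line.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { array = $1 }                  # remember the current array
        $NF ~ /^\[[U_]*_[U_]*\]$/ { print array }     # e.g. [UU_] or [_U]
    ' "${1:-/proc/mdstat}"
}

# Usage, for example from cron:
#   bad=$(degraded_arrays); [ -z "$bad" ] || echo "degraded: $bad"
```

Note that an array that is still being built also shows an underscore until the build completes, so expect one report right after creating a RAID 5.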

Links

Great page with information on the different hardware and software RAID chipsets and their current Linux support

Wikipedia Entry for RAID

mdadm MAN page (via man-wiki)

Software-RAID HOWTO