Difference between revisions of "File storage"

From MythTV Official Wiki
(RAID)
(Example setup: Removed outdated section)
 
(93 intermediate revisions by 36 users not shown)
As everyone knows, storing copious amounts of video requires bucketloads of hard drive space. MythTV is exceptionally good at producing the requisite video, so we're going to need somewhere to store it - the question is how much, and for how long?
+
'''File storage''' refers to the broad topic of hardware, software and the methodology behind keeping MythTV recordings on a computer (and computer network).
  
I don't watch a huge amount of TV, so my immediate storage requirements aren't that colossal, and all of my MythTV space was cobbled together from whatever IDE drives I had lying around. However, I am a complete hoarder, and don't like deleting things unless I have to, or they're rubbish. As such, my storage requirements have started out basic and will eventually turn into multiple petabyte boxes over a fibre-channel SCSI SAN array. Well, maybe ;)
+
Pretty much any reasonably modern hard drive will be more than adequate both in space and speed for MythTV. With the introduction of [[Storage Groups]] in 0.21, you can use as many drives as you like without the hassle of LVM or RAID.
  
My chosen hardware (I'll use my PVR-250 as an example, since it's probably the most common TV card in use that I have access to) records in MPEG2 format, and takes up about 1GB an hour. This can of course be increased or decreased by altering the quality, but we're still talking about a lot of space over time.
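As a rough illustration of that arithmetic, the following Python sketch estimates how many hours of recordings fit on a drive. The 1GB-per-hour figure is the one quoted above; the function name and the sample drive sizes are made up for illustration, and real usage will vary with the recording quality you choose.

```python
def recording_hours(drive_gb, gb_per_hour=1.0):
    """Estimate how many hours of video fit on a drive of drive_gb gigabytes."""
    if gb_per_hour <= 0:
        raise ValueError("gb_per_hour must be positive")
    return drive_gb / gb_per_hour


# Sample drive sizes are illustrative only.
for size_gb in (80, 200, 400):
    hours = recording_hours(size_gb)
    print(f"{size_gb}GB drive: about {hours:.0f} hours ({hours / 24:.1f} days) of recordings")
```

Doubling the recording quality (and hence the GB-per-hour figure) halves the result, which is why the space adds up quickly.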
+
==HDD Manufacturers==
 +
{{Note box|This section will be full of hearsay, personal bias and much that is probably apocryphal, so take it with a pinch of salt, and always ask around. If you are adding anecdotal datapoints about a specific manufacturer, please indent one level and ''sign'' your comments.}}
  
Pretty much any reasonably modern hard drive will be more than adequate both in space and speed for MythTV, but my advice to you is to buy the biggest hard drive(s) you can afford. Currently in the UK, the "sweet spot" is around the 200GB barrier; these drives offer the best GB/£ ratio.
+
As of 2014, the major manufacturers of consumer hard drives are Seagate, Western Digital, HGST, Toshiba and Samsung, all of which make drives of up to 6TB. These are available with Serial ATA (SATA) interfaces or as SCSI, though SCSI drives have not caught up on the size front.
  
== Manufacturers ==
+
Here are some user thoughts on hard drives.  Please do not extrapolate these results.  For meaningful results check this [https://www.backblaze.com/blog/hard-drive-reliability-update-september-2014/ reliability study].
  
Please note that this section will be full of hearsay, personal bias and much that is probably apocryphal, so take it with a pinch of salt, and always ask around.  If you're adding anecdotal datapoints about a specific manufacturer, please indent one level and *sign* your comments.
+
*Seagate Barracudas are generally renowned to be quiet and reliable, having recently returned to offering a standard 5yr warranty
 +
*Western Digital drives are slightly better performers than the Seagates, at the expense of being noisier
 +
*Hitachi are recovering from a somewhat tarnished reputation from their "Deathstar" line of hard drives, and are producing some very good SATA drives, although I've never used any myself
 +
*Samsung Spinpoint drives are getting rave reviews (fast, quiet ''and'' reliable) from a lot of users here in the UK, although again I've never used them myself.
  
The current [Sep-04] major manufacturers of consumer hard drives are Seagate, Western Digital, Maxtor, IBM/Hitachi and Samsung who all make IDE devices ranging up to 400GB, available in parallel (PATA) or the newer serial (SATA) interfaces, or as SCSI, though those haven't caught up on the size front.
+
===Western Digital===
* Seagate Barracudas are generally renowned to be quiet and reliable, having recently returned to offering a standard 5yr warranty
+
WD have split their hard drives into separate lines for each market segment (RAID / NAS, surveillance, desktop etc). How much the choice of line actually matters is not clear.
* Western Digital drives are slightly better performers than the Seagates, at the expense of being noisier
+
* WD Green drives are an excellent choice for the small, quiet HTPC. They spin at about 5400 RPM and are available cheaply and in high capacities. They are not recommended for use as a system drive due to their slow rotational speed.
* All of the Maxtor drives I have used have been ''very'' noisy, and not particularly reliable
+
* WD Red drives are aimed at NAS / RAID configurations.
** Note: All of the Maxtor drives ''I'' have used have been extremely reliable, and not very noisy at all. I have had only 2 fail in the six (?) years I have been using them in my computers and computers I build for customers. One was due to overheating (in a tiny amount of space in a hot area). The other -- I think was just a bad manufacture. Both times, Maxtor replaced the drives (and upgraded them, for free) for me with no hassle, and quite quickly, I might add. Just another personal opinion =) --[[Tyler Drake]]
+
* WD Black drives spin at 7200 RPM and suit use as a traditional system drive if required. Note that Solid State Drives (SSDs) are better suited to system drive operations.
** My Maxtor [[Diamond Max]] 10 250GB is as quiet as a church mouse. --DavidC
 
* IBM/Hitachi are recovering from a somewhat tarnished reputation from their "Deathstar" line of hard drives, and are producing some very good SATA drives, although I've never used any myself
 
* Samsung Spinpoint drives are getting rave reviews (fast, quiet ''and'' reliable) from a lot of users here in the UK, although again I've never used them myself.
 
  
Based on my current bias, I can recommend the Seagate Barracuda drives for situations where you want quiet drives, although I prefer Western Digital for situations where noise isn't much of a problem.
+
Note that, due to a "head parking" feature, some models of WD Green drives can report excessive head load cycle counts in a SMART test. It is possible to increase the timeout for head parking, and this may have been fixed in later HDDs.
 +
 
 +
When using a WD Green drive for media storage, it is possible to have it spin down when idle. However, due to firmware differences, standard Linux tools are not able to put the drive into standby on a timer. Instead use [http://hd-idle.sourceforge.net/ hd-idle]. A systemd script is provided at the bottom of the page to allow for automatic standby when this drive is '''NOT''' used as the system drive.
 +
 
 +
==SSD Manufacturers==
 +
 
 +
Since 2013, SSDs (Solid State Drives) have reached maturity: they are available cheaply and provide an order-of-magnitude increase in performance over conventional HDDs. The most noticeable improvement is during the boot process, which may matter little in an always-on HTPC. Common manufacturers as of 2014 are Samsung, Corsair, Crucial, Mach, Intel, Kingston and OCZ.
 +
 
 +
SSDs are completely silent (no moving parts) and are available in small form factors (mSATA). As of 2014, capacities are only large enough for system drive (OS) installations.
  
 
==Interfaces==
 
 +
Three types of interface are most commonly used these days:
 +
*SATA
 +
*SCSI
 +
*SAS
  
Most new hard drives and motherboards come in either PATA or the newer SATA interface. Although SATA is a superior standard (it supports a lot of the SCSI subset, and features much smaller, thinner cables than PATA, amongst other improvements), it has been somewhat plagued in Linux by closed source SATA controller drivers. This has resulted in many Linux-based systems being unable to use SATA adequately due to poorly functioning controllers. A quick gander through my kernel config shows me that the following controllers are supported under Linux 2.6.8:
+
Bear in mind that a drive's performance as a video source depends on the 'sustained transfer rate' of the drive which has nothing to do with the interface type or speed. The sustained transfer rate is limited by the 'media transfer rate'. The media transfer rate is the rate of data transfer between the head and the disc surfaces. It is a ''physical'' limitation that is shared by all hard drives regardless of the interface. The only things that affect it are: the data density on the disc, the physical size (read/write area) of the heads, and the rotational speed of the drive. All drives with the same rotational speed and data density will have approximately the same media transfer rate. A high end 7200 RPM drive can achieve a max media transfer rate of approximately 80 MB/sec ''regardless'' of the interface being SATA or SCSI.
* Intel ICH5
 
* nVidia SATA (nForce chipsets)
 
* Promise SX4, TX2 and TX4
 
* Silicon Image SATA controllers (very common - presumably both the 3112 and 3114 are supported now)
 
* SiS 964/180
 
* VIA SATA (VIA chipsets)
 
* Vitesse VSC7174
 
Please bear in mind that the inbuilt software RAID functions on SATA chips will usually not work in Linux without extensive fooling around with the kernel (if at all), and that performance using the open source drivers may be lower than with the closed source proprietary drivers. If you wish for better SATA support under Linux, write to the device manufacturer and ask them to provide some open specs for the kernel hacking team!
 
  
As a much older standard, PATA is universally supported over all x86 hardware that I know about.
+
Drive manufacturers don't want you to know this and divert attention from it by emphasizing the interface's speed in their ads. The interface's speed is only an advantage during data bursts. Most manufacturers went so far as to stop listing the media transfer rates in their specification tables.
 +
 
 +
With drives of a given rotational speed and data density, the only way to improve overall system performance is to use a form of RAID that uses striping. This effectively uses two drives simultaneously, so that the total media transfer rate is doubled.
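To put the media transfer rate in perspective, here is a small Python sketch that compares it with the data rate of a recording stream. It assumes the approximate 80 MB/sec figure quoted above, a stream of 1GB per hour (an assumed MPEG2 rate), and 1GB taken as 1000MB. It ignores seek time entirely, so treat the result as a theoretical ceiling rather than a prediction; the function names are only for illustration.

```python
MEDIA_TRANSFER_MB_S = 80.0  # approximate 7200 RPM media transfer rate quoted above
STREAM_GB_PER_HOUR = 1.0    # assumed recording rate for one stream


def stream_rate_mb_s(gb_per_hour=STREAM_GB_PER_HOUR):
    """Convert a recording rate in GB/hour to MB/sec (taking 1GB as 1000MB)."""
    return gb_per_hour * 1000.0 / 3600.0


def theoretical_streams(striped_drives=1, gb_per_hour=STREAM_GB_PER_HOUR):
    """Upper bound on concurrent streams; seeks are ignored, so real figures are lower."""
    if striped_drives < 1:
        raise ValueError("need at least one drive")
    total_rate = MEDIA_TRANSFER_MB_S * striped_drives  # striping adds the drives' rates
    return total_rate / stream_rate_mb_s(gb_per_hour)


for drives in (1, 2):
    print(f"{drives} drive(s): up to about {theoretical_streams(drives):.0f} streams in theory")
```

Even with generous allowance for seeking between files, a single drive has far more sequential bandwidth than a handful of recordings need, which is why the interface speed matters so little here.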
 +
 
 +
===SATA===
 +
Most new hard drives and motherboards come with support for the newer Serial ATA (SATA) interface. Although SATA is a superior standard (it supports a lot of the SCSI subset, and features much smaller, thinner cables than PATA, amongst other improvements), some SATA controllers have closed-source or no Linux drivers. This has resulted in some Linux-based systems being unable to use SATA adequately due to poorly functioning controllers. This situation is no longer as serious as it once was, but you should check your hardware driver support to be sure.
 +
 
 +
For the current status of SATA support under Linux you can check [http://linuxmafia.com/faq/Hardware/sata.html Serial ATA (SATA) on Linux].
 +
 
 +
Please bear in mind that the built-in software RAID functions on SATA chips will usually not work in Linux without extensive fooling around with the kernel (if at all). Because Linux provides its own software RAID features, this isn't a big loss for a dedicated Linux box (such as a MythTV system), but if you dual-boot, you may not be able to use the controller's software RAID.
 +
 
 +
===SAS===
 +
{{Note box| SAS is for industrial-scale applications and is overkill for the home MythTV installation; this section is for information only.}}
 +
SAS, or Serial Attached SCSI, is a new technology that takes the best of SCSI and SATA, and in many ways it also compares to Fiber Channel (e.g. SAN technology) and USB. Most modern servers already ship with SAS instead of SCSI, and it is eventually expected to be the desktop standard. As of today it has a bus bandwidth of 3 Gbps, and is on target to increase to 12 Gbps by 2011. Individual SAS drives today have a transfer rate of 300 MB/sec, just under the SCSI rate of 320 MB/sec, but each drive gets the full 300 MB/sec to the host, instead of sharing it as with SCSI, SATA and PATA. Current benchmarks show comparable performance to the best 15K Ultra320 SCSI drives, and in some areas SAS far surpasses SCSI performance. Some other cool features are:
 +
 
 +
*The SAS interface is backwards compatible with all SATA drives.
 +
*It can support over 16,000 devices on a single bus, compared to 16 with SCSI and 1 with SATA
 +
*SAS Expanders provide the ability to hook up drives the same way we network computers using a switch, although over shorter distances (several meters).
 +
*2.5" and 3.5" drives are available
 +
 
 +
- Seagate Barracuda ES2 Serial Attached SCSI one terabyte drives can be found for around $270 - maybe even $250. They spin at 7200 rpm. Not a bad choice for MythTV systems that are going to be always on. [[User:RedmondTux|RedmondTux]]
 +
 
 +
===External Links===
 +
*[[wikipedia:Serial ATA]]
 +
*[[wikipedia:Serial Attached SCSI]]
  
 
==Partitions==
 
 +
The Unix model of filesystems is much more flexible than that under Windows, and Linux is no exception, allowing you to seamlessly integrate hard drives and partitions and different formats here, there and everywhere. Personally, I'm a big fan of multiple partition setups because, amongst other things, it allows you to tailor the filesystem (see below) to the files that are going to live on it. I would advocate at least three partitions:
 +
*<code><nowiki>/boot</nowiki></code> is where the kernel and bootloader things live. I usually format this ext2 since it is very rarely written to.
 +
*<code><nowiki>/</nowiki></code> is the root of your filesystem, pretty much the equivalent of "C:\" under Windows; all of your programs and files will live on the root filesystem somewhere (such as <code><nowiki>/usr</nowiki></code> and <code><nowiki>/var/log</nowiki></code>), unless you specify a particular directory tree to live under a separate filesystem (like the <code><nowiki>/boot</nowiki></code> shown above).
 +
*The third partition would be where you store all of your MythTV files (as well as music and external video, if that is where you want it). If you want to see my partitioning setup for one of my backends, you can look at the "Advanced storage: example setup" section below.
 +
 +
Note that, particularly if you are prone to monkey with CVS Myth or advanced beta and alpha test drivers, you will be ''much'' happier if you put /var/log on its own partition.
  
The UNIX model of filesystems is much more flexible than that under windows, and Linux is no exception, allowing you to seamlessly integrate hard drives and partitions and different formats here, there and everywhere. Personally, I'm a big fan of multiple partition setups because, amongst other things, it allows you to tailor the filesystem (see below) to the files that are going to live on it. At the very least I would advocate at least three partitions;
+
Most partitions, in the sense just described, can exist instead as LVM volumes (see the section on LVM below).
* <code><nowiki>/boot</nowiki></code> is where the kernel and bootloader things live. I usually format this ext2 since it is very rarely written to.
 
* <code><nowiki>/</nowiki></code> is the root of your filesystem, pretty much the equivalent of "C:\" under windows; all of your programs and files will live on the root filesystem somewhere (such as <code><nowiki>/usr</nowiki></code> and <code><nowiki>/var/log</nowiki></code>), unless you specify a particular directory tree to live under a separate filesystem (like the /boot shown above).
 
* The third partition would be where you store all of your MythTV files (as well as music and external video, if that's where you want it). If you want to see my partitioning setup for one of my backends, you can look at the "Advanced storage: example setup" section below.
 
  
Note that, particularly if you're prone to monkey with CVS Myth or advanced beta and alpha test drivers, you will be *much* happier if you put /var/log on its own partition.
+
When partitioning a disk, you must first decide on a partitioning scheme. For x86 and x86-64 systems, the Master Boot Record (MBR) partitioning system has long been the standard. The MBR system, however, uses data structures that top out at 2TB. If you use a hardware RAID configuration, your virtual disks may exceed this size. Even single disks exceeding 2TB are likely to be available by the end of 2009. Therefore, you may need to use the newer GUID Partition Table (GPT) system if you plan to use lots of storage. GPT is already the standard on Intel-based Macintoshes. Using GPT requires partitioning with GPT-aware utilities, such as GNU Parted rather than fdisk. You may also need to track down a patched version of the GRUB boot loader. Check that your distribution supports installation to GPT disks if you intend to use this system. In some cases it may be simpler to install Linux on a (relatively) small MBR-partitioned disk and reserve the GPT system for the disk or RAID array that holds your recordings. If your individual disks or RAID arrays are smaller than 2TB, chances are the older MBR system will work fine.
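The 2TB ceiling follows from MBR storing partition start and length as 32-bit sector numbers. The Python sketch below works through that arithmetic, assuming the traditional 512-byte sector size (drives with larger logical sectors would shift the limit); the helper names are made up for illustration.

```python
SECTOR_BYTES = 512   # assumed traditional sector size
MBR_LBA_BITS = 32    # MBR stores sector addresses as 32-bit values


def mbr_max_bytes(sector_bytes=SECTOR_BYTES):
    """Largest size addressable through an MBR partition table."""
    return (2 ** MBR_LBA_BITS) * sector_bytes


def needs_gpt(disk_bytes, sector_bytes=SECTOR_BYTES):
    """True if a disk or RAID array of disk_bytes exceeds what MBR can address."""
    return disk_bytes > mbr_max_bytes(sector_bytes)


limit = mbr_max_bytes()
print(f"MBR limit: {limit} bytes ({limit // 2**40} TiB)")
print("3TB array needs GPT:", needs_gpt(3 * 10**12))
print("1TB disk needs GPT:", needs_gpt(1 * 10**12))
```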
  
==File Systems==
+
==File systems==
 +
As you probably know, Linux has a bewildering array of file systems available, most of which excel at a particular task. You are of course free to format your drives with whatever file system you choose, but here is some general info about the most popular file systems:
 +
*'''ext4''' is the standard filesystem in Fedora, amongst other distributions. It includes features enabling support for larger files and filesystems, as well as better performance with large files. It is stable and well suited for use on a MythTV box.
 +
*'''[[JFS]]''' was originally developed by IBM for their AIX operating system, and was later donated to Linux. JFS is incredibly good at dealing with the huge files that MythTV generates, and can delete pretty much any file in under a second (ext3 can take as long as 15 seconds to delete really big files). JFS is a very good file system to use for storing your videos on, and it is very conservative with CPU usage.
 +
*'''[[XFS]]''' is another "foreign" filesystem, developed by SGI for their IRIX operating system, and once again donated to Linux. Like JFS, it is exceptionally good at dealing with large files, and has the highest throughput of any Linux filesystem, albeit at a higher CPU loading. XFS also makes an excellent choice as storage for your movie files. (Note that XFS filesystems can be ''grown'', but not shrunk, at the present time; this can occasionally be problematic. Note also that file system cleanings are forced using xfs_repair, not fsck; if you are going to use XFS, and [[User:Baylink|Bay Link]] recommends that you do, ''read'' about it first.)
 +
*'''[[Btrfs]]''' (pronounced "butter-eff-ess") is the up-and-coming Linux filesystem. It's Linux's answer to ZFS, which is popular on Solaris. Although Btrfs has many advanced features, such as copy-on-write operation, online defragmentation, and snapshots, it's still very new and has not been extensively tested by the MythTV community, as of October 2009.
  
As you probably know, Linux has a bewildering array of filesystems available, most of which excel at a particular task. You are of course free to format your drives with whatever filesystem you choose, but here is some general info about the most popular filesystems:
+
To use any of these file systems, you'll need support for them compiled into the kernel along with the relevant userland utilities. The file system driver(s) of your partitions must either be compiled directly into the kernel (not as modules) ''or'' compiled as modules and included in an initial RAM disk (initrd). The former approach is usually easier to set up; initrd configuration adds steps to the kernel compilation process and can sometimes go wrong. If you build your filesystem drivers as modules and don't build an initrd, the kernel won't be able to read the filesystems on which the filesystem drivers are stored! If you use your distribution's standard precompiled kernel, you don't need to worry about this.
* '''ext2''' is the "old standard" file system. It is fairly speedy, but does not come with journalling to protect your data from corruption, and can take an age to run through a file system check (fsck), although it can be seamlessly upgraded to ext3.
 
* '''ext3''' is an extension to the ext2 filesystem which introduced journalling as well as other improvements. It's a bit of a jack-of-all-trades of a filesystem, and doesn't excel at anything in particular, apart from very thorough testing!
 
* '''ReiserFS''' is a high performance filesystem that is especially good at dealing with directories with lots of small files, which makes it a good choice for your system partitions, although it doesn't perform as well with large files
 
* '''JFS''' was originally developed by IBM for their AIX operating system, and was later donated to Linux. JFS is incredibly good at dealing with the huge files that MythTV generates, and can delete pretty much any file in under a second (ext3 can take as long as 15 seconds to delete really big files). JFS is a very good filesystem to use for storing your videos on, and it is very conservative with CPU usage.
 
* '''XFS''' is another "foreign" filesystem, developed by SGI for their IRIX operating system, and once again donated to Linux. Like JFS, it is exceptionally good at dealing with large files, and has the highest throughput of any Linux filesystem, albeit at a higher CPU loading. XFS also makes an excellent choice as storage for your movie files.  (Note that XFS filesystems can be *grown*, but not shrunk, at the present time; this can occasionally be problematic.  Note also that filesystem cleanings are forced using xfs_repair, not fsck; if you're going to use XFS, and [[Bay Link]] recommends that you do, *read* about it first.)
 
  
To use any of these filesystems, you'll need support for them compiled into the kernel along with the relevant userland utilities. The filesystem driver(s) of your partitions should always be compiled statically into the kernel (which makes things much easier!) or into an initrd, and not as a module, otherwise your newly booted kernel won't be able to load the modules you need to understand the filesystem to load the module you need (made my head spin too, but if you re-read it enough, it makes sense).
+
Some distributions come with a choice of only one or two filesystems, although if you rebuild your kernel it is possible to enable all of them (including support for Windows FAT32 and NTFS if you need it!). New, exotic and improved filesystems are cropping up all the time; hot on the horizon is Reiser4, which promises to be a very high performing and flexible system, although it is far from stable yet.
  
Some distributions come with a choice of only one or two filesystems, although if you rebuild your kernel it's usually possible to enable all of them (including support for windows FAT32 and NTFS if you need it!). New, exotic and improved filesystems are cropping up all the time; hot on the horizon is Reiser4, which promises to be a very high performing and flexible system, although it's far from stable yet.
+
Many filesystems allow tweaking of the block size at format time - selecting a large block size will make more efficient use of your hard drive space when dealing with large files, whereas a small block size is better suited for your system partitions. If in doubt, read the manual thoroughly or just go with the defaults, since you cannot change the block size without reformatting the drive.
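The trade-off can be seen with a little arithmetic. The Python sketch below counts how many blocks a file occupies and how much space is left unused in its last block, for a small and a large block size. The block sizes and file sizes are arbitrary examples, not recommendations, and real filesystems add metadata overhead that is ignored here.

```python
def blocks_needed(file_bytes, block_bytes):
    """Number of whole blocks a file of file_bytes occupies (integer ceiling)."""
    return (file_bytes + block_bytes - 1) // block_bytes


def slack_bytes(file_bytes, block_bytes):
    """Space allocated to the file but left unused in its final block."""
    return blocks_needed(file_bytes, block_bytes) * block_bytes - file_bytes


# Illustrative sizes: a 2000-byte config file and a 2GB recording.
for name, size in (("small config file", 2000), ("2GB recording", 2 * 10**9)):
    for block in (4096, 65536):
        print(f"{name}, {block}-byte blocks: "
              f"{blocks_needed(size, block)} blocks, {slack_bytes(size, block)} bytes unused")
```

The small file wastes most of a large block, while the large recording needs far fewer blocks to track when the block size is big, which is the reasoning behind matching block size to the kind of files a partition will hold.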
  
Many filesystems allow tweaking of the block size at format time - selecting a large block size will make more efficient use of your hard drive space when dealing with large files, whereas a small block size is better suited for your system partitions. If in doubt, read the manual thoroughly or just go with the defaults, since you can't change the block size without reformatting the drive.
+
Filesystem mount options can sometimes affect performance. For instance, when using XFS, the allocsize option can be used to set the size of the blocks that the filesystem uses when allocating new disk space. Setting this to a large value (as in allocsize=512m) can reduce fragmentation and therefore improve performance when large files are stored on the filesystem.
  
In short, a good choice is ext3/Reiser for your system partitions and JFS or XFS for your MythTV storage. Note that the XFS implementations on SuSE 9.0 and 9.1 were both a bit flaky, this can make installations and upgrades difficult if you don't know the magic. (I'll put the magic here when I relocate it. --[[Bay Link]])
+
In short, a good choice is ext3 or ReiserFS for your system partitions and JFS or XFS for your MythTV storage. If you have a separate /boot partition, ext2 is a good option, since ext3's journal provides little benefit for a partition of this size but consumes a lot of disk space. Note that the XFS implementations on SuSE 9.0 and 9.1 were both a bit flaky; this can make installations and upgrades difficult if you do not know the magic. (I will put the magic here when I relocate it. --[[User:Baylink|Bay Link]])
  
== Advanced storage ==
+
==Advanced storage==
=== LVM ===
 
  
LVM stands for the Logical Volume Manager, and you can use it to make two or more separate hard drives (or partitions on those drives) appear as one huge hard drive to the operating system.  LVM can stripe the partitions together, and you then make one or more big filesystems on the entire thing.
 
  
You'll need to have LVM enabled in your kernel to do this, as well as having the userland LVM utilities installed. Users of 2.6 will be able to utilise the much improved LVM2.
+
===Storage Groups===
  
The terminology of LVM is perhaps a little advanced to go into here, so if you want a good explanation of it you can read the  [http://www.tldp.org/HOWTO/LVM-HOWTO/ LVM HOWTO]. In short, you can dedicate either individual partitions or entire hard drives (the 'physical volumes') for use by the LVM, which allows you to map them into one or more 'volume groups', from which you then carve out 'logical volumes' to install filesystems upon. Probably the best thing about LVM is that, as long as the filesystem you pick is capable of being resized, you can extend the volume group(s) and logical volume(s) over bigger and more hard drives without losing or having to copy any data, which makes it a great choice if you want to keep your expandability options open.
+
[[Storage Groups]] is a feature, introduced in version 0.21, allowing the use of multiple hard drives for the storage of recordings and other media. It provides an easier, cheaper and safer alternative to LVM. It may also replace RAID in certain setups.
  
=== RAID ===
+
===LVM===
 +
LVM stands for the Logical Volume Manager. It provides two basic advantages over conventional partitions:
  
Originally, RAID stood for Redundant Array of Inexpensive Discs, although the word Inexpensive has since been replaced by Independent (probably because most RAID setups use *very* expensive SCSI discs ;). What this basically means is that data is spread across multiple hard drives in such a way that if one of the hard drives explodes or is eaten by the cat, you will be able to reconstruct the lost data from the other hard drives. One of the lesser functions of RAID is to produce higher performance filesystems by spreading read/write load across multiple discs as well. For a very clear and concise RAID tutorial, you can read these pages http://www.acnc.com/04_00.html, but in the meantime here's a brief rundown of the most common RAID levels along with examples of storage capacity:
+
* You can use it to make two or more separate hard drives (or partitions on those drives) appear as one huge hard drive to the operating system. LVM can optionally stripe the partitions together, meaning that accesses to the two disks are interleaved. This can improve performance in a manner similar to some RAID configurations.
  
* '''RAID0''', also known as striping, is not true RAID, in that it offers no redundancy. If one of the discs in the array fails, all of the data in the array is lost. RAID0 scales linearly with every drive added; two 80GB drives will produce a single 160GB filesystem. Please note that RAID0 is distinctly different from LVM!
+
* Filesystems are stored in logical volumes within the partitions used by LVM. These logical volumes may be resized, added, and deleted without regard for their locations or precisely where the data you allocate will be stored. (The logical volumes act much like files in a filesystem.) This feature makes it easy to add storage space to the filesystems that need it. You can, for instance, add a new disk to an existing system and then grow your MythTV recordings filesystem without having to copy data or otherwise disrupt your existing recordings.
  
* '''RAID1''', also known as mirroring, involves copying data to two identical hard drives rather than just one. If one drive dies, the other will remain fully functional with all of your data intact. Two 80GB drives will produce a single 80GB filesystem.
+
LVM has certain drawbacks, of course:
  
* '''RAID0+1''' is a combination of mirroring and striping utilising 4 drives offering very fast read/write speeds. Four 80GB drives would combine to form a 160GB RAID0+1 filesystem.
+
* It adds complexity. In addition to creating partitions in a conventional way, you must use several utilities to build up the LVM data structures before you can begin using your disks.
Most of you will have seen these RAID levels advertised as being built into almost all modern motherboards; unfortunately this kind of RAID is achieved under proprietary drivers (all RAID calculations being done in software), typically available only for Windows. Fear not however, because the Linux kernel contains its own software RAID drivers, which (if the rumours I hear are true) perform even better than the proprietary software RAID drivers. Again, you can enable these in your kernel.
 
  
You will also note more exotic RAID levels such as RAID5 and RAID10. These are quite tricky to do in software (although it is possible), and are usually left to high-end dedicated RAID controllers. Previously, these were only available in high-end SCSI RAID setups, although recently 3ware have released an excellent series of cards that allow high end RAID features on much cheaper and larger capacity SATA discs. These cards are fully supported under Linux and offer excellent performance, and I use two of them at home. If you're looking for a relatively cheap and hassle-free IDE RAID setup under Linux, 3ware's are a very good choice. (P.S. thanks for the cheque, 3ware!)
+
* Not all distributions provide easy support for LVM. Some versions of Ubuntu lack LVM support "out of the box," for instance. (You can work around this problem, but doing so requires additional expertise.)
  
* '''RAID5''' is a good compromise on RAID10, and spans data and parity data over three or more drives, giving redundancy and good read/write performance.
+
* If you use LVM to span multiple physical disks, your data becomes more prone to damage should one disk fail -- the breakdown of one physical disk may make data stored on the good disk inaccessible.
  
* '''RAID10''' is very like the high performing RAID0+1, but with better redundancy (it can survive up to two simultaneous drive failures, whereas 0+1 can sustain only one hard drive failure).
+
* Emergency recovery becomes more complex. Your recovery tools must support LVM (most modern recovery CDs/DVDs do, fortunately), but you may need to execute extra commands to access your data.
  
In the end, if you're not that worried about losing your data (or if you keep good backups), any kind of RAID is overkill. A good compromise can be reached if you place all your system directories on a RAID of some sort (which will protect all of your time consuming configuration - my workstation is in the process of being switched over to RAID1 on two Western Digital Raptors) whilst placing the TV storage on a single disc. But if you have enough money and inclination, you can RAID your whole setup - I'm particularly paranoid, and plan to upgrade my backend to using a 3ware and four 250GB drives in RAID10 to (hopefully) put an end to my currently non-existent storage problems.
+
* Booting Linux can become more complex, because you must either have LVM support on an initial RAM disk (initrd) or you must provide the basic LVM drivers and tools on a non-LVM partition. (Note that you can install your basic Linux system on a non-LVM disk and reserve LVM for your MythTV recordings and database filesystems alone, if you like. This configuration will minimize this drawback of LVM.)
  
=== SCSI ===
+
Despite these drawbacks, LVM's advantages make LVM appealing for many users. MythTV 0.21's storage groups are another option for increasing storage flexibility. You will need to have LVM enabled in your kernel to use LVM, as well as having the userland LVM utilities installed. The 2.6 kernel series implements the much improved LVM2.
  
SCSI stands for Small Computer Systems Interface, and is/was a competing hard drive interface to IDE/ATA. However, back in the mists of time, SCSI was designated to the "high end hard drive" side of things, and is now much more expensive than IDE technology.  Though, if you look at things like raw drive MTBF hours, you'll see that cheaper IDE drives are only now barely catching up to the SCSI drive specs.
+
The setup details of LVM are a little advanced to go into here, so if you want a good explanation of it you can read the [http://www.tldp.org/HOWTO/LVM-HOWTO/ LVM HOWTO]. In short, you can dedicate either individual partitions or entire hard drives (the "physical volumes") for use by the LVM, which allows you to map them into one or more "volume groups", from which you then carve out "logical volumes" to install filesystems upon.
  
None but the highest end server and workstation motherboards come with inbuilt SCSI controllers, so these usually have to be added by means of a PCI card, which in themselves aren't cheap. The cost of the hard drives themselves are very high indeed, and they offer much reduced storage capacity compared to a modern PATA or SATA drive. However, SCSI discs are incredibly fast and very reliable - but as we can see, it comes at a huge price. To be honest, there is very little chance of even an extensive MythTV setup requiring a SCSI system - SCSI excels in massively multi-user environments like databases and web/mail servers, but the advantages under a single user setup are hard to distinguish. With the recent addition of Western Digital's enterprise-class "Raptor" SATA drives, you can approach SCSI speeds without shelling out a kings ransom, although their size is limited to 74GB at the time of writing.
+
===RAID===
 +
Originally, [[RAID]] stood for Redundant Array of Inexpensive Discs, although the word Inexpensive has since been replaced by Independent (probably because most RAID setups use ''very'' expensive SCSI discs ;). What this basically means is that data is spread across multiple hard drives in such a way that if one of the hard drives explodes or is eaten by the cat, you will be able to reconstruct the lost data from the other hard drives. A lesser function of RAID is to produce higher performance filesystems by spreading read/write load across multiple discs. For a very clear and concise RAID tutorial, you can read these pages http://www.acnc.com/04_00.html, but in the meantime here is a brief rundown of the most common RAID levels along with examples of storage capacity:
  
One thing of note is that SCSI drives are very *very* loud due to their very high rotation speed (10,000 or 15,000rpm) and so are going to be relegated to the backend under the stairs pretty quickly. Raptor drives are quieter, but still far louder than your average IDE drive.
+
*'''RAID0''', also known as striping, is not true RAID, in that it offers no redundancy. If one of the discs in the array fails, all of the data in the array is lost. '''RAID0''' scales linearly with every drive added; two 80GB drives will produce a single 160GB filesystem. Please note that '''RAID0''' is distinctly different from LVM!
  
[ Editorial comment: SCSI's not *that* bad a choice, particularly if you can get used drives cheaply on eBay, and you're building an [[Under The Stairs]] backend box -- instead of the 2 or 4 drives you can put on most IDE controllers, you can put 15 on a SCSI controller -- and multiple channel controllers are available.  So it's a matter of scale and buying savvy as much as anything else. -- [[Bay Link]] [[[[Date Time]](2004-10-01T18:06:44Z)]] ]
+
*'''RAID1''', also known as mirroring, involves copying data to two identical hard drives rather than just one. If one drive dies, the other will remain fully functional with all of your data intact. Two 80GB drives will produce a single 80GB filesystem.
  
=== Network filesystems ===
+
*'''RAID0+1''' and '''RAID10''' are two basic forms of nested arrays.  '''0+1''' is a mirror of stripes, while '''10''' is a stripe of mirrors.  While both methods are equally simple to execute, '''0+1''' is more commonly found on inexpensive software RAID included with consumer motherboards.  Conversely, '''10''' is the more reliable mode, requiring only one functional drive of each mirror set, while '''0+1''' requires one fully functional stripe.
  
=== Supermount ===
+
*'''RAID5''' and '''RAID6''' are more complex forms of redundancy, and as such are typically only found on higher end cards.  Similar to '''RAID0''', each stripe includes one redundant block of parity (two in the case of '''RAID6'''), used to calculate the missing data in the event of a failed drive.  Traditionally, this is very intensive, with high end cards having custom ASICs to handle the calculations, however modern CPUs, and particularly those with multiple cores, have no problem performing this function in software.  Due to the use of parity that must be calculated across the entire stripe, this form of RAID suffers from poor write performance when executing multiple writes smaller than one stripe size.  Read performance is nearly as high as '''RAID0'''.
  
=== Example setup ===
+
In the end, if you are not that worried about losing your data (or if you keep good backups), any kind of RAID is overkill. A good compromise can be reached if you place all your system directories on a RAID of some sort (which will protect all of your time consuming configuration) whilst placing the TV storage on a single disc. But if you have enough money and inclination, you can RAID your whole setup.
  
Below I've detailed the partitioning setup for my own main backend, which has just been rejigged.
+
===Network filesystems===
My main rig consists of 80GB (hda) and 120GB (hdb) Seagate Barracudas:
 
* '''hda1''' is used for the '''/boot''' partition, and is 25MB in size, ext2 formatted
 
* '''hda2''' is used for the '''/''' partition, and is 580MB in size, ReiserFS formatted
 
* '''hda3''' is the swap partition
 
* '''hda5''' is used for the '''/usr''' partition, is 6GB in size and ReiserFS formatted
 
* '''hda6''' is '''/var''' and is 2.5GB in size (waaaaaay too big!), ReiserFS formatted
 
* '''hda7''' and '''hdb''' are joined into a JFS formatted LVM under '''/dev/mythvg/lvm0''' of approximately 174GB, sitting on '''/home''' (TV data is stored in '''/home/mythtv/tvstore''')
 
* '''/home/mythtv/music''' and '''/home/mythtv/movies''' are read-only NFS mounts from my file server (which has everything stored on RAID1 on a 3ware 8506) which keep all my movies and music available to [[MythMusic]] and [[MythVideo]]
 
* '''/usr/portage''' is also NFS mounted (read-write) to the file server, and used to distribute the portage cache and downloaded tarballs to my other Gentoo boxes, allowing for faster compiles and less rsyncing off the Gentoo servers, which will also allow me to use a smaller '''/usr''' partition.
 
  
It's probably a lot more complicated than is really necessary, but I felt like having a good experiment with things, and this is what I came up with.
+
As the name implies, these are mechanisms for locally accessing a remote file system (and therefore files) across a [[network]]. MythTV will internally stream content from backends to remote frontends, so for most purposes these will be unnecessary. This capability is limited to content defined on the backend using [[Storage_Groups|Storage Directories]], which currently limits it to the recording and video libraries. Music and artwork have not yet been migrated to this new design, and require filesystem access on each frontend. Backends can only record to locally mounted file systems, and will not stream a new recording to a remote backend for storage.
  
There's also a walkthrough setting up an xfs on LVM on RAID5 system here: [[LVM on RAID]]
+
If one does require a networked file system, the two common options are [[NFS]] and [[CIFS]]. NFS is the native protocol used by Linux and other POSIX compliant operating systems. CIFS is more commonly known as Windows File Sharing. CIFS offers much more configurability in terms of security and access restrictions, while NFS will be lightweight and faster. More importantly however, NFS is designed around the same filesystem properties as other Linux filesystems, while CIFS has a very foreign design, and incurs some complications in places where there is no direct translation from one parameter to another. NFS should be preferred over CIFS unless there are specific requirements that demand the use of CIFS.
  
 
[[Category:Hardware]]
 
[[Category:Hardware]]

Latest revision as of 10:30, 12 December 2014

File storage refers to the broad topic of hardware, software and the methodology behind keeping MythTV recordings on a computer (and computer network).

Pretty much any reasonably modern hard drive will be more than adequate both in space and speed for MythTV. With the introduction of Storage Groups in 0.21 you can use as many drives as you like without the hassle of LVM or RAID.

HDD Manufacturers

Note: This section will be full of hearsay, personal bias and much that is probably apocryphal, so take it with a pinch of salt, and always ask around. If you are adding anecdotal datapoints about a specific manufacturer, please indent one level and sign your comments.

As of 2014, the major manufacturers of consumer hard drives are Seagate, Western Digital, HGST, Toshiba and Samsung, all of which make drives ranging up to 6TB. Drives are available with serial (SATA) interfaces or as SCSI, though SCSI drives have not caught up on the size front.

Here are some user thoughts on hard drives. Please do not extrapolate these results. For meaningful results check this reliability study.

  • Seagate Barracudas are generally renowned to be quiet and reliable, having recently returned to offering a standard 5yr warranty
  • Western Digital drives are slightly better performers than the Seagates, at the expense of being noisier
  • Hitachi are recovering from a somewhat tarnished reputation from their "Deathstar" line of hard drives, and are producing some very good SATA drives, although I've never used any myself
  • Samsung Spinpoint drives are getting rave reviews (fast, quiet and reliable) from a lot of users here in the UK, although again I've never used them myself.

Western Digital

WD have split their hard drive lines by market segment (RAID / NAS, surveillance, desktop etc). How much the choice of line actually matters for a MythTV system is not clear.

  • WD Green drives are an excellent choice for the small, quiet HTPC. They spin at roughly 5400 RPM and are available cheaply and in high capacities. They are not recommended for a system drive due to their slow rotational speed.
  • WD Red drives are aimed at NAS / RAID configurations.
  • WD Black drives spin at 7200 RPM, which suits them to use as traditional system drives if required. Note that Solid State Drives (SSDs) are better suited to system drive operations.

Note that, due to an aggressive "head parking" feature, some models of WD Green drives can accumulate excessive head load/unload cycle counts, which show up in SMART data. It is possible to increase the timeout for head parking, and this may have been fixed in later HDDs.

When using a WD Green drive for media storage, it is possible to have it spin down when idle. However, due to firmware differences, standard Linux tools are not able to put the drive into standby on a timer; use hd-idle instead. A systemd script is provided at the bottom of the page to allow for automatic standby when this drive is NOT used as the system drive.

SSD Manufacturers

Since 2013, SSDs (Solid State Drives) have reached maturity: they are available cheaply and provide an order-of-magnitude increase in performance over conventional HDDs. The most noticeable improvement is during the boot process, which may not matter much in an always-on HTPC. Common manufacturers as of 2014 are Samsung, Corsair, Crucial, Mach, Intel, Kingston and OCZ.

SSDs are completely silent (no moving parts) and are available in small form factors (mSATA). Capacities as of 2014 are only suitable for system drive (OS) installations.

Interfaces

The three interfaces most commonly used these days are:

  • SATA
  • SCSI
  • SAS

Bear in mind that a drive's performance as a video source depends on the 'sustained transfer rate' of the drive which has nothing to do with the interface type or speed. The sustained transfer rate is limited by the 'media transfer rate'. The media transfer rate is the rate of data transfer between the head and the disc surfaces. It is a physical limitation that is shared by all hard drives regardless of the interface. The only things that affect it are: the data density on the disc, the physical size (read/write area) of the heads, and the rotational speed of the drive. All drives with the same rotational speed and data density will have approximately the same media transfer rate. A high end 7200 RPM drive can achieve a max media transfer rate of approximately 80 MB/sec regardless of the interface being SATA or SCSI.

Drive manufacturers don't want you to know this and divert attention from it by emphasizing the interface's speed in their ads. The interface's speed is only an advantage during data bursts. Most manufacturers went so far as to stop listing the media transfer rates in their specification tables.

For drives of a given rotational speed and data density, the only way to improve overall throughput is to use a form of RAID that uses striping. With two drives, this effectively uses both simultaneously so that the total media transfer rate is doubled.
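
To put the media transfer rate into perspective, here is a rough Python sketch that estimates how many simultaneous video streams a drive could carry. The 8 Mbit/s stream bitrate is purely an illustrative assumption, as is the idea that striping scales perfectly; the function name is made up for this example. Real drives will manage fewer streams, because seeking between files is ignored here.

```python
def max_streams(media_rate_mb_s, stream_mbit_s, striped_drives=1):
    """Upper bound on simultaneous streams, ignoring seek overhead.

    media_rate_mb_s: sustained media transfer rate of one drive in MB/sec
    stream_mbit_s:   bitrate of one recording or playback stream in Mbit/sec
    striped_drives:  number of drives striped together (assumed ideal scaling)
    """
    stream_mb_s = stream_mbit_s / 8.0          # convert Mbit/sec to MB/sec
    total_rate = media_rate_mb_s * striped_drives
    return int(total_rate // stream_mb_s)


# The 80 MB/sec figure quoted above, with an assumed 8 Mbit/sec stream.
print(max_streams(80, 8))                      # 80
print(max_streams(80, 8, striped_drives=2))    # 160
```

Even allowing generously for seek overhead, a single modern drive has far more throughput than a typical MythTV backend needs.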

SATA

Most new hard drives and motherboards come with support for the newer Serial ATA (SATA) interface. Although SATA is a superior standard (it supports a lot of the SCSI subset, and features much smaller, thinner cables than PATA, amongst other improvements), some SATA controllers have closed-source or no Linux drivers. This has resulted in some Linux-based systems being unable to use SATA adequately due to poorly functioning controllers. This situation is no longer as serious as it once was, but you should check your hardware driver support to be sure.

For the current status of SATA support under Linux you can check Serial ATA (SATA) on Linux.

Please bear in mind that the built-in software RAID functions on SATA chips will usually not work in Linux without extensive fooling around with the kernel (if at all). Because Linux provides its own software RAID features, this isn't a big loss for a dedicated Linux box (such as a MythTV system), but if you dual-boot, you may not be able to use the controller's software RAID.

SAS

Note: SAS is intended for industrial-scale applications and is overkill for a home MythTV installation; this section is for information only.

SAS, or Serial Attached SCSI, is a newer technology that takes the best of SCSI and SATA, and in many ways it also compares to Fibre Channel (e.g. SAN technology) and USB. Most modern servers already ship with SAS instead of SCSI, and it is eventually expected to become the desktop standard. As of this writing it has a bus bandwidth of 3 Gbps, and is on target to increase to 12 Gbps by 2011. Individual SAS drives today have a transfer rate of 300 MB/sec, just under the SCSI rate of 320 MB/sec, but each drive gets the full 300 MB/sec to the host, instead of sharing a bus as with parallel SCSI and PATA. Current benchmarks show comparable performance to the best 15K Ultra320 SCSI drives, and in some areas SAS far surpasses SCSI performance. Some other cool features are:

  • The SAS interface is backwards compatible with all SATA drives.
  • It can support over 16,000 devices on a single bus, compared to 16 with SCSI and 1 with SATA
  • SAS Expanders provide the ability to hook up drives the same way we network computers using a switch, although over shorter distances (several meters).
  • 2.5" and 3.5" drives are available
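
The 3 Gbps and 300 MB/sec figures above are the same number seen from two angles. The sketch below works through the arithmetic, assuming the 8b/10b line encoding used by this generation of SAS (10 bits on the wire per byte of data). The function names and the four-drive example are illustrative only.

```python
def payload_mb_per_s(line_rate_gbps):
    """Payload rate in MB/sec for a serial link using 8b/10b encoding.

    With 8b/10b, every data byte costs 10 bits on the wire.
    """
    line_rate_mbit = line_rate_gbps * 1000
    return line_rate_mbit / 10


def shared_bus_share(bus_mb_s, active_drives):
    """Per-drive share of a shared parallel bus when all drives are busy."""
    return bus_mb_s / active_drives


print(payload_mb_per_s(3))         # 300.0 MB/sec per SAS link
print(shared_bus_share(320, 4))    # 80.0 MB/sec each on a shared Ultra320 bus
```

The second figure shows why the dedicated link matters: four busy drives on one Ultra320 bus get a quarter of the bandwidth each, whereas each SAS drive keeps its full link.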

- Seagate Barracuda ES2 Serial Attached SCSI one terabyte drives can be found for around $270 - maybe even $250. They spin at 7200 rpm. Not a bad choice for MythTV systems that are going to be always on. RedmondTux

External Links

Partitions

The Unix model of filesystems is much more flexible than that under Windows, and Linux is no exception, allowing you to seamlessly integrate hard drives and partitions and different formats here, there and everywhere. Personally, I'm a big fan of multiple partition setups because, amongst other things, it allows you to tailor the filesystem (see below) to the files that are going to live on it. I would advocate at least three partitions:

  • /boot is where the kernel and bootloader things live. I usually format this ext2 since it is very rarely written to.
  • / is the root of your filesystem, pretty much the equivalent of "C:\" under Windows; all of your programs and files will live on the root filesystem somewhere (such as /usr and /var/log), unless you specify a particular directory tree to live under a separate filesystem (like the /boot shown above).
  • The third partition would be where you store all of your MythTV files (as well as music and external video, if that is where you want it). If you want to see my partitioning setup for one of my backends, you can look at the "Advanced storage: example setup" section below.

Note that, particularly if you are prone to monkey with CVS Myth or advanced beta and alpha test drivers, you will be much happier if you put /var/log on its own partition.

Most partitions, in the sense just described, can exist instead as LVM volumes (see the section on LVM below).

When partitioning a disk, you must first decide on a partitioning scheme. For x86 and x86-64 systems, the Master Boot Record (MBR) partitioning system has long been the standard. The MBR system, however, uses data structures that top out at 2TB (with the usual 512-byte sectors). If you use a hardware RAID configuration, your virtual disks may exceed this size, and single disks exceeding 2TB are now readily available. Therefore, you may need to use the newer GUID Partition Table (GPT) system if you plan to use lots of storage. GPT is already the standard on Intel-based Macintoshes. Using GPT requires partitioning with GPT-aware utilities, such as GNU Parted rather than fdisk. If your distribution still uses the legacy GRUB boot loader, you may also need to track down a patched version; GRUB 2 supports GPT natively. Check that your distribution supports installation to GPT disks if you intend to use this system. In some cases it may be simpler to install Linux on a (relatively) small MBR-partitioned disk and reserve the GPT system for the disk or RAID array that holds your recordings. If your individual disks or RAID arrays are smaller than 2TB, chances are the older MBR system will work fine.
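
The 2TB ceiling follows from MBR storing sector addresses as 32-bit numbers. As a minimal sketch, assuming the conventional 512-byte sector size (the function names are made up for this example), the limit and a simple "do I need GPT?" check can be worked out like this:

```python
MBR_MAX_SECTORS = 2 ** 32   # MBR stores sector counts in 32-bit fields


def mbr_limit_bytes(sector_size=512):
    """Largest disk size, in bytes, that MBR can address."""
    return MBR_MAX_SECTORS * sector_size


def needs_gpt(disk_bytes, sector_size=512):
    """True if the disk is too large for MBR to address fully."""
    return disk_bytes > mbr_limit_bytes(sector_size)


print(mbr_limit_bytes())        # 2199023255552 bytes, i.e. 2 TiB
print(needs_gpt(2 * 10 ** 12))  # False: a "2TB" drive still fits
print(needs_gpt(3 * 10 ** 12))  # True: a "3TB" drive needs GPT
```

Note that drive manufacturers quote capacities in powers of ten, so a drive sold as "2TB" sits just under the limit.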

File systems

As you probably know, Linux has a bewildering array of file systems available, most of which excel at a particular task. You are of course free to format your drives with whatever file system you choose, but here is some general info about the most popular file systems:

  • ext4 is the standard filesystem in Fedora, amongst other distributions. It includes features enabling support for larger files and filesystems, as well as better performance with large files. It is stable and well suited to use on a MythTV box.
  • JFS was originally developed by IBM for their AIX operating system, and was later donated to Linux. JFS is incredibly good at dealing with the huge files that MythTV generates, and can delete pretty much any file in under a second (ext3 can take as long as 15 seconds to delete really big files). JFS is a very good file system to use for storing your videos on, and it is very conservative with CPU usage.
  • XFS is another "foreign" filesystem, developed by SGI for their IRIX operating system, and once again donated to Linux. Like JFS, it is exceptionally good at dealing with large files, and has the highest throughput of any Linux filesystem, albeit at a higher CPU loading. XFS also makes an excellent choice as storage for your movie files. (Note that XFS filesystems can be grown, but not shrunk, at the present time; this can occasionally be problematic. Note also that file system cleanings are forced using xfs_repair, not fsck; if you are going to use XFS, and Bay Link recommends that you do, read about it first.)
  • Btrfs (pronounced "butter-eff-ess") is the up-and-coming Linux filesystem. It's Linux's answer to ZFS, which is popular on Solaris. Although Btrfs has many advanced features, such as copy-on-write operation, online defragmentation, and snapshots, it's still very new and has not been extensively tested by the MythTV community, as of October 2009.

To use any of these file systems, you'll need support for them compiled into the kernel along with the relevant userland utilities. The file system driver(s) of your partitions must either be compiled directly into the kernel (not as modules) or compiled as modules and included in an initial RAM disk (initrd). The former approach is usually easier to set up; initrd configuration adds steps to the kernel compilation process and can sometimes go wrong. If you build your filesystem drivers as modules and don't build an initrd, the kernel won't be able to read the filesystems on which the filesystem drivers are stored! If you use your distribution's standard precompiled kernel, you don't need to worry about this.

Some distributions come with a choice of only one or two filesystems, although if you rebuild your kernel it is possible to enable all of them (including support for Windows FAT32 and NTFS if you need it!). New, exotic and improved filesystems are cropping up all the time; hot on the horizon is Reiser4, which promises to be a very high performing and flexible system, although it is far from stable yet.

Many filesystems allow tweaking of the block size at format time - selecting a large block size will make more efficient use of your hard drive space when dealing with large files, whereas a small block size is better suited for your system partitions. If in doubt, read the manual thoroughly or just go with the defaults, since you cannot change the block size without reformatting the drive.

Filesystem mount options can sometimes affect performance. For instance, when using XFS, the allocsize option can be used to set the size of the blocks that the filesystem uses when allocating new disk space. Setting this to a large value (as in allocsize=512m) can reduce fragmentation and therefore improve performance when large files are stored on the filesystem.
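
Mount options such as allocsize normally live in /etc/fstab. The small Python helper below just assembles an fstab line so the field order is clear. The device path and mount point are hypothetical placeholders, not MythTV defaults; substitute your own.

```python
def fstab_line(device, mount_point, fs_type, options, dump=0, fsck_pass=0):
    """Build one /etc/fstab entry from its six fields."""
    return "{} {} {} {} {} {}".format(
        device, mount_point, fs_type, ",".join(options), dump, fsck_pass
    )


# Hypothetical device and mount point for a recordings filesystem on XFS.
line = fstab_line(
    "/dev/mythvg/recordings",
    "/var/lib/mythtv/recordings",
    "xfs",
    ["defaults", "allocsize=512m"],
)
print(line)
# /dev/mythvg/recordings /var/lib/mythtv/recordings xfs defaults,allocsize=512m 0 0
```

After editing /etc/fstab, the new option takes effect the next time the filesystem is mounted.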

In short, a good choice is ext3 or ReiserFS for your system partitions and JFS or XFS for your MythTV storage. If you have a separate /boot partition, ext2 is a good option, since ext3's journal provides little benefit for a partition of this size but consumes a lot of disk space. Note that the XFS implementations on SuSE 9.0 and 9.1 were both a bit flaky, which can make installations and upgrades difficult if you do not know the magic. (I will put the magic here when I relocate it. --Bay Link)

Advanced storage

Storage Groups

Storage Groups is a feature, introduced in version 0.21, allowing the use of multiple hard drives for the storage of recordings and other media. It provides an easier, cheaper and safer alternative to LVM. It may also replace RAID in certain setups.

LVM

LVM stands for the Logical Volume Manager. It provides two basic advantages over conventional partitions:

  • You can use it to make two or more separate hard drives (or partitions on those drives) appear as one huge hard drive to the operating system. LVM can optionally stripe the partitions together, meaning that accesses to the two disks are interleaved. This can improve performance in a manner similar to some RAID configurations.
  • Filesystems are stored in logical volumes within the partitions used by LVM. These logical volumes may be resized, added, and deleted without regard for their locations or precisely where the data you allocate will be stored. (The logical volumes act much like files in a filesystem.) This feature makes it easy to add storage space to the filesystems that need it. You can, for instance, add a new disk to an existing system and then grow your MythTV recordings filesystem without having to copy data or otherwise disrupt your existing recordings.

LVM has certain drawbacks, of course:

  • It adds complexity. In addition to creating partitions in a conventional way, you must use several utilities to build up the LVM data structures before you can begin using your disks.
  • Not all distributions provide easy support for LVM. Some versions of Ubuntu lack LVM support "out of the box," for instance. (You can work around this problem, but doing so requires additional expertise.)
  • If you use LVM to span multiple physical disks, your data becomes more prone to damage should one disk fail -- the breakdown of one physical disk may make data stored on the good disk inaccessible.
  • Emergency recovery becomes more complex. Your recovery tools must support LVM (most modern recovery CDs/DVDs do, fortunately), but you may need to execute extra commands to access your data.
  • Booting Linux can become more complex, because you must either have LVM support on an initial RAM disk (initrd) or you must provide the basic LVM drivers and tools on a non-LVM partition. (Note that you can install your basic Linux system on a non-LVM disk and reserve LVM for your MythTV recordings and database filesystems alone, if you like. This configuration will minimize this drawback of LVM.)

Despite these drawbacks, LVM's advantages make LVM appealing for many users. MythTV 0.21's storage groups are another option for increasing storage flexibility. You will need to have LVM enabled in your kernel to use LVM, as well as having the userland LVM utilities installed. The 2.6 kernel series implements the much improved LVM2.

The setup details of LVM are a little too advanced to go into here, so if you want a good explanation of it you can read the LVM HOWTO. In short, you can dedicate either individual partitions or entire hard drives (the "physical volumes") for use by the LVM, which allows you to map them into one or more "volume groups", from which you then carve out "logical volumes" to install filesystems upon.
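
To make the physical volume / volume group / logical volume layering concrete, here is a hedged Python sketch that only builds and prints the usual sequence of LVM2 commands; it does not run them. The device names, the volume group name "mythvg" and the logical volume name "recordings" are placeholders chosen for illustration. Read the LVM HOWTO before running anything like this as root, since pvcreate and mkfs destroy existing data.

```python
import shlex


def lvm_setup_commands(devices, vg_name, lv_name, fs_type="xfs"):
    """Return the commands to pool devices into one volume and format it."""
    # 1. Mark each partition or disk as an LVM physical volume.
    commands = [["pvcreate", dev] for dev in devices]
    # 2. Pool the physical volumes into one volume group.
    commands.append(["vgcreate", vg_name] + list(devices))
    # 3. Carve a logical volume out of all free space in the group.
    commands.append(["lvcreate", "-l", "100%FREE", "-n", lv_name, vg_name])
    # 4. Put a filesystem on the logical volume.
    commands.append(["mkfs." + fs_type, "/dev/{}/{}".format(vg_name, lv_name)])
    return commands


commands = lvm_setup_commands(["/dev/sdb1", "/dev/sdc1"], "mythvg", "recordings")
for cmd in commands:
    print(shlex.join(cmd))   # shlex.join needs Python 3.8 or later
```

Printing rather than executing keeps the sketch safe to experiment with; the output can be reviewed and then typed in by hand.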

RAID

Originally, RAID stood for Redundant Array of Inexpensive Discs, although the word Inexpensive has since been replaced by Independent (probably because most RAID setups use very expensive SCSI discs ;). What this basically means is that data is spread across multiple hard drives in such a way that if one of the hard drives explodes or is eaten by the cat, you will be able to reconstruct the lost data from the other hard drives. A lesser function of RAID is to produce higher performance filesystems by spreading read/write load across multiple discs. For a very clear and concise RAID tutorial, you can read these pages http://www.acnc.com/04_00.html, but in the meantime here is a brief rundown of the most common RAID levels along with examples of storage capacity:

  • RAID0, also known as striping, is not true RAID, in that it offers no redundancy. If one of the discs in the array fails, all of the data in the array is lost. RAID0 scales linearly with every drive added; two 80GB drives will produce a single 160GB filesystem. Please note that RAID0 is distinctly different from LVM!
  • RAID1, also known as mirroring, involves copying data to two identical hard drives rather than just one. If one drive dies, the other will remain fully functional with all of your data intact. Two 80GB drives will produce a single 80GB filesystem.
  • RAID0+1 and RAID10 are two basic forms of nested arrays. 0+1 is a mirror of stripes, while 10 is a stripe of mirrors. While both methods are equally simple to execute, 0+1 is more commonly found on inexpensive software RAID included with consumer motherboards. Conversely, 10 is the more reliable mode, requiring only one functional drive of each mirror set, while 0+1 requires one fully functional stripe.
  • RAID5 and RAID6 are more complex forms of redundancy, and as such are typically only found on higher end cards. Similar to RAID0, each stripe includes one redundant block of parity (two in the case of RAID6), used to calculate the missing data in the event of a failed drive. Traditionally, this is very intensive, with high end cards having custom ASICs to handle the calculations, however modern CPUs, and particularly those with multiple cores, have no problem performing this function in software. Due to the use of parity that must be calculated across the entire stripe, this form of RAID suffers from poor write performance when executing multiple writes smaller than one stripe size. Read performance is nearly as high as RAID0.
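
The capacity figures and the RAID10 versus RAID0+1 reliability claim above can be checked with a short Python sketch. It assumes identical drives, a two-drive RAID1, and a four-drive nested array split into two groups of two; the function names are invented for this example.

```python
from itertools import combinations


def usable_capacity(level, drives, size_gb):
    """Usable space in GB for an array of identical drives."""
    if level == "0":
        return drives * size_gb            # pure striping, no redundancy
    if level == "1":
        return size_gb                     # every drive holds the same data
    if level == "5":
        return (drives - 1) * size_gb      # one drive's worth of parity
    if level == "6":
        return (drives - 2) * size_gb      # two drives' worth of parity
    if level in ("10", "0+1"):
        return (drives // 2) * size_gb     # half the drives are mirrors
    raise ValueError("unknown RAID level: " + level)


def survives_raid10(failed, mirrors):
    """Stripe of mirrors: each mirror needs at least one working drive."""
    return all(any(d not in failed for d in pair) for pair in mirrors)


def survives_raid01(failed, stripes):
    """Mirror of stripes: at least one stripe must be fully intact."""
    return any(all(d not in failed for d in stripe) for stripe in stripes)


print(usable_capacity("0", 2, 80))   # 160
print(usable_capacity("1", 2, 80))   # 80
print(usable_capacity("5", 4, 80))   # 240

groups = [(0, 1), (2, 3)]                      # four drives in two groups
two_failures = list(combinations(range(4), 2))  # all 6 double failures
raid10_ok = sum(survives_raid10(set(f), groups) for f in two_failures)
raid01_ok = sum(survives_raid01(set(f), groups) for f in two_failures)
print(raid10_ok, raid01_ok)          # 4 2
```

Of the six possible double failures in a four-drive array, RAID10 survives four and RAID0+1 only two, which is the sense in which RAID10 is the more reliable mode.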

In the end, if you are not that worried about losing your data (or if you keep good backups), any kind of RAID is overkill. A good compromise can be reached if you place all your system directories on a RAID of some sort (which will protect all of your time consuming configuration) whilst placing the TV storage on a single disc. But if you have enough money and inclination, you can RAID your whole setup.

Network filesystems

As the name implies, these are mechanisms for locally accessing a remote file system (and therefore files) across a network. MythTV will internally stream content from backends to remote frontends, so for most purposes these will be unnecessary. This capability is limited to content defined on the backend using Storage Directories, which currently limits it to the recording and video libraries. Music and artwork have not yet been migrated to this new design, and require filesystem access on each frontend. Backends can only record to locally mounted file systems, and will not stream a new recording to a remote backend for storage.

If one does require a networked file system, the two common options are NFS and CIFS. NFS is the native protocol used by Linux and other POSIX compliant operating systems. CIFS is more commonly known as Windows File Sharing. CIFS offers much more configurability in terms of security and access restrictions, while NFS will be lightweight and faster. More importantly however, NFS is designed around the same filesystem properties as other Linux filesystems, while CIFS has a very foreign design, and incurs some complications in places where there is no direct translation from one parameter to another. NFS should be preferred over CIFS unless there are specific requirements that demand the use of CIFS.