XFS Filesystem

From MythTV Official Wiki
Revision as of 02:21, 4 January 2009 by TrinitronX (talk | contribs) (XFS Is It Right for You?: Added links to sections in Optimizing Performance page)

XFS Is It Right for You?

XFS is one of the file systems you can use for your /myth partition under Linux/UNIX. If you're familiar with Windows, think of the choice between FAT (aka FAT16), FAT32, and NTFS; on Linux, ext3 is a popular default and XFS is another option. Consider using XFS for any partition that stores large data files, such as your MythTV capture files and archived video files.

Major advantages of XFS
  1. Handles large files much better and more efficiently than ext3.
  2. Minimizes both storage access time (HD read/write) and processing power (very low CPU usage).
  3. Allows users to defrag files to further minimize read/write times.

XFS really excels at handling large files like those you'll generate with MythTV (your captures and video files). Both SD and HDTV recordings can be quite large, on the order of several gigabytes per hour. Deleting a file on an XFS partition is VERY fast; under the default ext3 file system, by contrast, your frontend can be left waiting while the disk drive grinds away deleting the file. XFS also allows you to defrag your partition, which can speed up access to large files.

Potential disadvantages of XFS
  1. fsck is not run at boot time (is this really a disadvantage?)
  2. XFS partitions cannot be shrunk once they are made.
  3. Metadata-heavy operations may be slower - not a problem for your video filesystem.

There are potential downsides to an XFS partition. For one, fsck is not run by default at boot time as it is for ext3. XFS is a journaling filesystem, after all, and should not need boot-time repair. This isn't really an issue, since a supplied program will check and repair XFS file systems; the potential rub is that you have to run it manually (explained below) if you encounter problems that you think may be corruption-related. Another potential negative is that data can be lost if the box isn't shut down cleanly (e.g. if your power suddenly goes out while the box is writing data to the drive). Corruption is possible in that case, not certain. I mention it because XFS might be more susceptible to this than ext3 (the jury is still out on this one). Keep in mind that data corruption can occur on ANY filesystem that isn't cleanly shut down if the application does not properly sync out important data. That is true of any file system (ext3, XFS, NTFS, FAT32, etc.) on any OS (Linux, UNIX, Windows, etc.).

That said, I have experienced several "dirty shutdowns" (due to thunderstorms and power outages) since switching to XFS. Upon restarting the box, I ran the disk check program (explained below) and didn't experience any detectable corruption. By contrast, I did lose data a few years ago on a power failure when I was using ext3.

Make Your /myth XFS

Note: The instructions in this section were written for and tested on the Debian-based Knoppmyth (KM) distro. Adapt them to your system if you know how.

Warning: Completing these steps on an existing box will destroy all your files! ONLY do this if you are installing to a new HD, or if you don't care about losing the data on the current HD!

For New KM Installs

If you're doing a fresh install of Knoppmyth, making your /myth partition XFS is trivial: at the end of the install when you get the "Reboot?" message, don't reboot yet. Instead, press Alt-F2 to get to a terminal, and do this:

# nano /mnt/hdinstall/etc/fstab

On the line for the /myth partition, set the filesystem type to "xfs", then save and quit nano. Now issue the following command:

# mkfs.xfs -f /dev/hda3
(If you have a SATA or SCSI disk, change the last part to read sda3 instead of hda3.)

Press Alt-F1 to return to the installer, select OK, and reboot.
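
The fstab edit above can also be previewed without opening an editor. The following is only a sketch: set_fstype is a made-up helper name, and it assumes the usual whitespace-separated fstab layout in which the second field is the mount point and the third is the filesystem type.

```shell
# set_fstype MOUNTPOINT TYPE: read fstab text on stdin and print it with the
# third field (filesystem type) changed on the line whose second field is
# MOUNTPOINT. Comment lines and all other entries pass through unchanged.
# Note that awk rejoins an edited line with single spaces.
set_fstype() {
    awk -v mp="$1" -v fstype="$2" '
        $1 !~ /^#/ && $2 == mp { $3 = fstype }
        { print }
    '
}

# Preview the change before touching the real file (path as used above):
#   set_fstype /myth xfs < /mnt/hdinstall/etc/fstab
```

Because the function only prints the result, nothing is modified until you decide to copy the output over the real fstab.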

For Existing KM Installs (auto upgrade)

If you have an existing /myth partition and you want to convert it to XFS, see the filesystem switch page on the knoppmythwiki.

Alternatively, if you are physically installing a new hard drive and want to migrate your old system to the new hardware, you can follow tjc's advice and copy your old hard drive's /myth partition to the new one. See his instructions in this KM Forum Post, in which you hook up both drives to the box and rsync the entire /myth from the old drive to the new one.

XFS and Fragmentation

Measuring Fragmentation

Issue the following command to check for fragmentation:

$ xfs_db -c frag -r /dev/hda3

After I copied over about 300 gigs of video content from one drive to another, mine returned:

actual 5342, ideal 2568, fragmentation factor 51.93%

That's quite a bit! When properly configured and once you are creating new files, XFS rarely gets fragmented.
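
The numbers quoted above are consistent with the fragmentation factor being the share of extents beyond the ideal count, that is (actual - ideal) / actual. As a worked check, here is a minimal sketch that recomputes it from a report line; frag_factor is a hypothetical helper name, and the line format is assumed to match the output shown above.

```shell
# frag_factor: read an "actual N, ideal M, ..." line (as printed by
# `xfs_db -c frag -r DEVICE`) on stdin and print (actual - ideal) / actual
# as a percentage. The function name is made up for this example.
frag_factor() {
    awk '/^actual/ {
        gsub(",", "")                     # "5342," -> "5342"
        actual = $2; ideal = $4
        if (actual > 0)
            printf "%.2f\n", (actual - ideal) * 100 / actual
    }'
}

# Example with the report line quoted above; prints 51.93
echo "actual 5342, ideal 2568, fragmentation factor 51.93%" | frag_factor
```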

You can also check an individual file for fragmentation. Here is an example of a highly fragmented file. It's about 1 gig and, as you can see, it consists of 53 fragments (called extents), numbered 0 through 52:

# xfs_bmap -v 18-Jun-2007.mkv
   0: [0..9503]:          67181048..67190551  0 (67181048..67190551)  9504
   1: [9504..25887]:      63597504..63613887  0 (63597504..63613887) 16384
   2: [25888..58655]:     58657216..58689983  0 (58657216..58689983) 32768
   3: [58656..99615]:     46611680..46652639  0 (46611680..46652639) 40960
   4: [99616..140575]:    10315872..10356831  0 (10315872..10356831) 40960
   5: [140576..181535]:   10274912..10315871  0 (10274912..10315871) 40960
   6: [181536..222495]:   10233952..10274911  0 (10233952..10274911) 40960
   7: [222496..263455]:   10192992..10233951  0 (10192992..10233951) 40960
   8: [263456..304415]:   10152032..10192991  0 (10152032..10192991) 40960
   9: [304416..345359]:   10111088..10152031  0 (10111088..10152031) 40944
  10: [345360..385919]:   10070520..10111079  0 (10070520..10111079) 40560
  11: [385920..425879]:   10030560..10070519  0 (10030560..10070519) 39960
  12: [425880..466199]:   9990240..10030559   0 (9990240..10030559)  40320
  13: [466200..506543]:   9949896..9990239    0 (9949896..9990239)   40344
  14: [506544..547343]:   9909096..9949895    0 (9909096..9949895)   40800
  15: [547344..588303]:   9868136..9909095    0 (9868136..9909095)   40960
  16: [588304..629223]:   9827216..9868135    0 (9827216..9868135)   40920
  17: [629224..670135]:   9786304..9827215    0 (9786304..9827215)   40912
  18: [670136..711055]:   9745384..9786303    0 (9745384..9786303)   40920
  19: [711056..752015]:   9704424..9745383    0 (9704424..9745383)   40960
  20: [752016..792895]:   9663544..9704423    0 (9663544..9704423)   40880
  21: [792896..833807]:   9622632..9663543    0 (9622632..9663543)   40912
  22: [833808..874767]:   9581672..9622631    0 (9581672..9622631)   40960
  23: [874768..915687]:   9540752..9581671    0 (9540752..9581671)   40920
  24: [915688..956607]:   9499832..9540751    0 (9499832..9540751)   40920
  25: [956608..997487]:   9458952..9499831    0 (9458952..9499831)   40880
  26: [997488..1038447]:  9417992..9458951    0 (9417992..9458951)   40960
  27: [1038448..1079343]: 9377096..9417991    0 (9377096..9417991)   40896
  28: [1079344..1120271]: 9336168..9377095    0 (9336168..9377095)   40928
  29: [1120272..1161231]: 9295208..9336167    0 (9295208..9336167)   40960
  30: [1161232..1202191]: 9254248..9295207    0 (9254248..9295207)   40960
  31: [1202192..1243151]: 9213288..9254247    0 (9213288..9254247)   40960
  32: [1243152..1284111]: 9172328..9213287    0 (9172328..9213287)   40960
  33: [1284112..1325071]: 9131368..9172327    0 (9131368..9172327)   40960
  34: [1325072..1374223]: 9082216..9131367    0 (9082216..9131367)   49152
  35: [1374224..1423375]: 9033064..9082215    0 (9033064..9082215)   49152
  36: [1423376..1472527]: 8983912..9033063    0 (8983912..9033063)   49152
  37: [1472528..1521679]: 8934760..8983911    0 (8934760..8983911)   49152
  38: [1521680..1570815]: 8885624..8934759    0 (8885624..8934759)   49136
  39: [1570816..1619967]: 8836472..8885623    0 (8836472..8885623)   49152
  40: [1619968..1669095]: 8787344..8836471    0 (8787344..8836471)   49128
  41: [1669096..1718247]: 8738192..8787343    0 (8738192..8787343)   49152
  42: [1718248..1767391]: 8689048..8738191    0 (8689048..8738191)   49144
  43: [1767392..1816543]: 8639896..8689047    0 (8639896..8689047)   49152
  44: [1816544..1865695]: 8590744..8639895    0 (8590744..8639895)   49152
  45: [1865696..1914847]: 8541592..8590743    0 (8541592..8590743)   49152
  46: [1914848..1963991]: 8492448..8541591    0 (8492448..8541591)   49144
  47: [1963992..2013143]: 8443296..8492447    0 (8443296..8492447)   49152
  48: [2013144..2062295]: 8394144..8443295    0 (8394144..8443295)   49152
  49: [2062296..2111447]: 8344992..8394143    0 (8344992..8394143)   49152
  50: [2111448..2160599]: 8295840..8344991    0 (8295840..8344991)   49152
  51: [2160600..2209743]: 8246696..8295839    0 (8246696..8295839)   49144
  52: [2209744..2251767]: 8204672..8246695    0 (8204672..8246695)   42024
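
Counting rows by eye gets tedious for long listings like this one. A small sketch that does the counting is shown here; count_extents is a hypothetical name, and the code assumes the column layout above, where the last column is the extent length in 512-byte blocks.

```shell
# count_extents: summarize `xfs_bmap -v FILE` output read from stdin.
# Counts rows that start with an extent index ("12: [..."), skips rows
# marked as holes, and sums the last column (length in 512-byte blocks).
count_extents() {
    awk '$1 ~ /^[0-9]+:$/ && $0 !~ /hole/ {
        n++
        blocks += $NF
    }
    END {
        printf "%d extents, %.2f GiB\n", n, blocks * 512 / (1024 * 1024 * 1024)
    }'
}

# Usage against a real file (the file name is only an example):
#   xfs_bmap -v 18-Jun-2007.mkv | count_extents
```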

Defragmenting XFS Partitions

Use xfs_fsr to defrag your partition. When started without arguments, it reorganizes all regular files in all mounted XFS filesystems. It works differently from Windows-based defrag utilities: xfs_fsr runs in many cycles, making a single pass over each XFS filesystem per cycle. On each pass it selects the most fragmented files and attempts to defrag the top 10% of them.

You can specify how long you want it to run (default is 2 hours) and it will remember where it left off if more defragmenting is required. This information is stored in /var/tmp/.fsrlast_xfs.

Unlike some other disk utils, you actually leave the filesystem mounted when you use this one! xfs_fsr may be used while the filesystem is in use; if a file is written to during defragmentation, the defrag process for that file will be aborted and tried again at a later time. Note that defragmentation generates a lot of IO, so if it is happening while you're trying to watch live HDTV, you may overly stress the IO your system can do.

Now start the defrag process like so:

# xfs_fsr -v -t 600

The -t 600 means run for 600 sec (10 min) and then stop. How long you need it to run depends on the size and number of fragments on the partition. You can change that 600 to whatever you want and then re-check the % fragments and run it again if needed. Although you told it 600 sec, it might run longer if it's in the middle of a file when the time is up.

You may also defragment a single file:

# xfs_fsr -v /path/to/file

Or, defragment all files on the filesystem:

# xfs_fsr -v /mount/point

The -v switch (verbose mode) isn't strictly needed; omit it if you wish. I like it because it lets me know the program is still running. It prints a lot of information about what xfs_fsr is doing, for example:

extents before:2 after:1 DONE ino=1989755620
extents before:2 after:1 DONE ino=1989755622

Here it is telling us that inode (file) number 1989755620 was defragmented from 2 fragments to 1.
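
If you save the verbose output to a log, a short filter can total it up. This is only a sketch under the assumption that every completed file produces a line in exactly the format shown above; fsr_summary is a made-up name.

```shell
# fsr_summary: read `xfs_fsr -v` output on stdin and report how many files
# were processed and how many extents were eliminated in total.
fsr_summary() {
    awk '/^extents before:/ && /DONE/ {
        split($2, b, ":")                 # "before:2" -> b[2] = 2
        split($3, a, ":")                 # "after:1"  -> a[2] = 1
        files++
        saved += b[2] - a[2]
    }
    END { printf "%d files, %d extents removed\n", files, saved }'
}

# Usage (device-wide run limited to 10 minutes, as in the example above):
#   xfs_fsr -v -t 600 | fsr_summary
```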

In my example above (the one with a high degree of fragmentation), I actually ran the defrag util for the default of 2 h (omitting the -t switch altogether) to clean up all 300 gigs. After that time it was completely defragged:

$ xfs_db -c frag -r /dev/hda3
actual 2552, ideal 2546, fragmentation factor 0.24%

Note, though, that you don't need to obsess over a low fragmentation factor. If you follow the other advice on this page, fragmentation should not be a problem for you. If you suspect it is, investigate individual files with xfs_bmap and remedy any fragmentation with xfs_fsr.
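
One way to act on that advice is to defragment only when the reported factor is above a limit you pick yourself. The sketch below assumes exactly that; above_threshold is a hypothetical helper, and the 20% limit in the usage comment is an arbitrary illustration, not a recommendation from this page.

```shell
# above_threshold FACTOR LIMIT: succeed if FACTOR (a percentage such as
# 51.93) is greater than LIMIT. Both names are made up for this sketch.
above_threshold() {
    awk -v f="$1" -v t="$2" 'BEGIN { exit !(f + 0 > t + 0) }'
}

# Hypothetical usage; the device and the 20% limit are assumptions:
#   factor=$(xfs_db -c frag -r /dev/hda3 | sed -n 's/.*factor \(.*\)%.*/\1/p')
#   if above_threshold "$factor" 20; then xfs_fsr -v -t 600; fi
```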

Avoiding Future Fragmentation

See the Optimizing_Performance page, which suggests adding the allocsize mount option to your /etc/fstab to help minimize future fragmentation. With allocsize=512m, the system preallocates disk space in 512 MB chunks for files that are being written, such as captures in progress. This is very helpful in avoiding fragmentation.

Do it by editing your fstab:

# nano /etc/fstab

Now find the line that defines your /myth partition and add allocsize=512m to its options field. Here is that line from my /etc/fstab:

/dev/hda3  /myth  xfs  defaults,allocsize=512m  0 0

You can either reboot or umount, then remount your /myth partition for this to take effect.
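
If you would rather script the fstab change than type it, something like the following could work. It is a sketch only: add_mount_option is a made-up helper, and it assumes the standard fstab layout where the fourth whitespace-separated field holds the comma-separated mount options.

```shell
# add_mount_option MOUNTPOINT OPTION: print fstab text from stdin with OPTION
# appended to the fourth field (mount options) of MOUNTPOINT's entry, unless
# an option with the same name is already present.
add_mount_option() {
    awk -v mp="$1" -v opt="$2" '
        $1 !~ /^#/ && $2 == mp {
            name = opt
            sub(/=.*/, "", name)          # "allocsize=512m" -> "allocsize"
            if (("," $4 ",") !~ ("," name "[=,]"))
                $4 = $4 "," opt
        }
        { print }
    '
}

# Preview the result first; nothing is written to /etc/fstab:
#   add_mount_option /myth allocsize=512m < /etc/fstab
```

Running it a second time on already-edited text leaves the entry unchanged, so the option is never added twice.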

Keeping XFS Healthy (Disk Checking)

As mentioned previously, XFS does not run fsck at boot time as ext3 does, so if you want your XFS partition(s) checked for errors you have to do it manually. xfs_repair is the provided program for this. Unless you are experiencing strange filesystem-related errors or crashes, you may never need to run xfs_repair.

Warning: Do NOT run xfs_repair on a mounted filesystem! Serious data corruption can occur if you dare try it!

In my system, only my /myth is XFS (it is /dev/hda3). In order to unmount your partition, you must exit myth-frontend since it'll be using it. Also make sure that you have no scheduled recordings or else you will obviously miss them while the partition is unmounted.

After you have exited the frontend and/or mythwelcome, drop into single-user mode and umount your partition:

# telinit 1
# umount /dev/hda3

Obviously, you'll substitute the location of your /myth. For example, if you're using a SATA or SCSI drive, it would be /dev/sda3, etc.
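
Since running the repair tool on a mounted filesystem is dangerous, it can help to check the mount table before going further. A minimal sketch follows; is_mounted and safe_repair are hypothetical names, and the check assumes the device is listed in /proc/mounts under the same name you pass in (some systems list it under a different alias).

```shell
# is_mounted DEVICE [MOUNTS_FILE]: succeed if DEVICE appears as the first
# field of a line in MOUNTS_FILE (default /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# safe_repair DEVICE: hypothetical wrapper that refuses to start xfs_repair
# while DEVICE is still mounted.
safe_repair() {
    if is_mounted "$1"; then
        echo "$1 is still mounted; unmount it first" >&2
        return 1
    fi
    xfs_repair "$1"
}

# Usage (substitute your own device):
#   safe_repair /dev/hda3
```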

Now that /myth is unmounted, issue the command to check/repair:

# xfs_repair /dev/hda3
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

When the check/repair is finished, go ahead and remount your /myth and return to multiuser mode:

# mount /dev/hda3
# telinit 5

If you receive an error from xfs_repair, see this webpage for a list of common errors and how to handle them.

Error Unmounting

# umount /myth
umount: /myth: device is busy

This error most likely means that users or programs are accessing files on the partition you're trying to unmount. You can find the processes responsible with this command:

$ fuser -m /dev/hda3
/dev/hda3: 3560

Mine returned one process (3560) using it. You can see what it is by issuing the following:

$ ps auxw|grep 3560
mythtv    3560  4.5 11.1 164128 53660 ?        SLl  14:17   0:03 mythfrontend
mythtv    3594  0.0  0.1   1780   596 ttyp0    S+   14:19   0:00 grep 3560

It's mythfrontend (I hadn't exited it, to illustrate this point). I killed it using the kill command, followed by a umount:

# kill 3560
# umount /dev/hda3

Now I'm able to run xfs_repair.
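
When several processes hold the partition open, the lookup above can be done in one go. The following is a sketch, not a tested recipe for every system: pids_from_fuser is a made-up helper that assumes fuser-style lines like the one shown earlier.

```shell
# pids_from_fuser: read fuser-style lines ("NAME: PID PID ...") on stdin and
# print one numeric PID per line. Access-type letters that fuser may append
# to a PID (for example "3560c") are stripped.
pids_from_fuser() {
    sed 's/^[^:]*://' | tr -s ' \t' '\n\n' | sed -n 's/^\([0-9][0-9]*\)[A-Za-z]*$/\1/p'
}

# Usage: show each process that is keeping the partition busy.
#   fuser -m /dev/hda3 2>/dev/null | pids_from_fuser |
#       while read -r pid; do ps -p "$pid" -o pid,user,args; done
```

Using ps -p with an explicit PID avoids the stray "grep 3560" line that shows up when grepping the full process list.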