XFS Filesystem

From MythTV Official Wiki
Revision as of 19:48, 1 August 2008 by Graysky (talk | contribs) (Make your /myth XFS)


Description

XFS is just another file system. If you're familiar with Windows, you could choose from FAT (aka FAT16) from the Win95 days, FAT32, and NTFS. XFS is simply another option for your /myth partition. The default file system under Knoppmyth is ext3.

Major advantages of XFS
  1. Handles large files much better and more efficiently than ext3.
  2. Minimizes both storage access time (HD read/write) and processing power (very low CPU usage).
  3. Allows users to defrag files to further minimize read/write times.

XFS really excels at handling large files like those you'll generate with MythTV (your captures and video files). Both SD and HDTV recordings can be quite large, on the order of multiple gigs per hour. Deleting files stored on an XFS partition happens VERY rapidly; your frontend doesn't sit waiting while the disk drive grinds away deleting the file, as it does under the default ext3 file system. XFS also allows you to defrag your partition, which can speed up access to large files.

Major disadvantages of XFS
  1. It isn't checked automatically by fsck; you have to run a check manually.
  2. If you have a sudden power failure you *may* lose some data that is being written.
  3. XFS partitions cannot be shrunk once they are made.

There are potential downsides to an XFS partition. For one, fsck won't run on it (the format isn't supported by the program) and as such, your /myth will not get auto-checked every 30 boots (the Knoppmyth default). This really isn't an issue, since there is a supplied program that will check and repair XFS file systems. The potential rub is that you have to run it manually (explained below). Another negative of switching is that the nature of the filesystem can allow for data loss if the box isn't shut down cleanly (i.e. if your power suddenly goes out while the box is writing data to the drive). There is the potential for data corruption to occur, but that doesn't mean it is certain to occur. I mention it because the possibility is there and because XFS might be more susceptible to this than ext3 (the jury is still out on this one). You should know that data corruption can happen on ANY filesystem that isn't cleanly shut down. This is true of any file system (ext3, XFS, NTFS, FAT32, etc.) on any O/S (Linux, UNIX, Windows, etc.).

That said, I have experienced several "dirty shutdowns" (due to thunderstorms and power outages) since switching to XFS. Upon restarting the box, I ran the disk check program (explained below) and didn't experience any detectable corruption. By contrast, I did lose data a few years ago on a power failure when I was using ext3.

For more on XFS and performance, see this page. This link provides a high-level overview beyond what was discussed above.

Make your /myth XFS

Note: The rest of the instructions in this section were written and tested on the Debian-based Knoppmyth distro. Extend them to your system if you know how.

Warning: Completing these steps on an existing box will destroy all your files! ONLY do this if you are installing to a new HD, or if you don't care about losing the data on the current HD!


For New KM Installs

If you're doing a fresh install of Knoppmyth, making your /myth partition XFS is trivial: at the end of the install when you get the "Reboot?" message, don't reboot yet. Instead, press Alt-F2 to get to a terminal, and do this:

# nano /mnt/hdinstall/etc/fstab

On the line for the /myth partition, set the filesystem type to "xfs", then save and quit nano. Now issue the following command:

# mkfs.xfs -f /dev/hda3
(If you have a SATA or SCSI disk, change the last part to read sda3 instead of hda3)

Press Alt-F1 to return to the installer, select OK, and reboot.
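
If you'd rather not edit the file by hand, the fstab change can be scripted. This is only a rough sketch, assuming your fstab has one uncommented line whose second field is /myth. The function name set_myth_fstype is made up for this example and is not part of Knoppmyth. It prints the changed file instead of overwriting anything, so you can look it over first. Note that awk rewrites the changed line with single spaces between fields.

```shell
#!/bin/sh
# Hypothetical helper (not part of Knoppmyth): print the given fstab with the
# filesystem-type field (3rd column) of the /myth entry changed to "xfs".
# Comment lines and all other entries are passed through untouched.
set_myth_fstype() {
    awk '
        # Uncommented line whose mount point (2nd field) is /myth
        $0 !~ /^[ \t]*#/ && $2 == "/myth" { $3 = "xfs" }
        { print }
    ' "$1"
}

# Example use during the install (review the output before replacing the file):
#   set_myth_fstype /mnt/hdinstall/etc/fstab > /tmp/fstab.new
```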

For Existing KM Boxes

If you have an existing /myth partition and you want to convert it to XFS, see the filesystem switch page on the knoppmythwiki.

Alternatively, if you are physically installing a new hard drive and want to migrate your old system to the new hardware, you can follow tjc's advice and copy your old hard drive's /myth partition to the new one. See his instructions in this KM Forum Post, in which you hook up both drives to the box and rsync the entire /myth from the old drive to the new one.

XFS and Fragmentation

Measuring fragmentation

Issue the following command to check for fragmentation:

$ xfs_db -c frag -r /dev/hda3

Mine returned:

actual 5342, ideal 2568, fragmentation factor 51.93%

That's quite a bit! It happened when I followed the rsync method discussed above to sync my old /myth partition (about 300 gigs) to my new hard drive. When properly configured, and once you are creating new files, XFS rarely gets fragmented.
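
The three numbers in that output hang together. As far as I can tell, the factor is the share of extents beyond the ideal count, i.e. (actual - ideal) / actual. That is my reading of the numbers rather than something the tool's output spells out. A small sketch that redoes the arithmetic (frag_factor is just a name I picked):

```shell
#!/bin/sh
# Fragmentation factor as a percentage: (actual - ideal) / actual * 100.
# "actual" and "ideal" are the extent counts printed by "xfs_db -c frag".
frag_factor() {
    awk -v actual="$1" -v ideal="$2" 'BEGIN {
        if (actual <= 0) { print "0.00"; exit }   # avoid dividing by zero
        printf "%.2f\n", (actual - ideal) * 100 / actual
    }'
}

frag_factor 5342 2568   # the figures reported above
```

With the figures above this prints 51.93, matching what xfs_db reported.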

You can also check an individual file for fragmentation. Here is an example of a highly fragmented file. It's about 1 gig and, as you can see, it is split into 53 pieces (called extents, numbered 0 through 52):

# xfs_bmap -v 18-Jun-2007.mkv
18-Jun-2007.mkv:
 EXT: FILE-OFFSET         BLOCK-RANGE        AG AG-OFFSET            TOTAL
   0: 0..9503:          67181048..67190551  0 (67181048..67190551)  9504
   1: 9504..25887:      63597504..63613887  0 (63597504..63613887) 16384
   2: 25888..58655:     58657216..58689983  0 (58657216..58689983) 32768
   3: 58656..99615:     46611680..46652639  0 (46611680..46652639) 40960
   4: 99616..140575:    10315872..10356831  0 (10315872..10356831) 40960
   5: 140576..181535:   10274912..10315871  0 (10274912..10315871) 40960
   6: 181536..222495:   10233952..10274911  0 (10233952..10274911) 40960
   7: 222496..263455:   10192992..10233951  0 (10192992..10233951) 40960
   8: 263456..304415:   10152032..10192991  0 (10152032..10192991) 40960
   9: 304416..345359:   10111088..10152031  0 (10111088..10152031) 40944
  10: 345360..385919:   10070520..10111079  0 (10070520..10111079) 40560
  11: 385920..425879:   10030560..10070519  0 (10030560..10070519) 39960
  12: 425880..466199:   9990240..10030559   0 (9990240..10030559)  40320
  13: 466200..506543:   9949896..9990239    0 (9949896..9990239)   40344
  14: 506544..547343:   9909096..9949895    0 (9909096..9949895)   40800
  15: 547344..588303:   9868136..9909095    0 (9868136..9909095)   40960
  16: 588304..629223:   9827216..9868135    0 (9827216..9868135)   40920
  17: 629224..670135:   9786304..9827215    0 (9786304..9827215)   40912
  18: 670136..711055:   9745384..9786303    0 (9745384..9786303)   40920
  19: 711056..752015:   9704424..9745383    0 (9704424..9745383)   40960
  20: 752016..792895:   9663544..9704423    0 (9663544..9704423)   40880
  21: 792896..833807:   9622632..9663543    0 (9622632..9663543)   40912
  22: 833808..874767:   9581672..9622631    0 (9581672..9622631)   40960
  23: 874768..915687:   9540752..9581671    0 (9540752..9581671)   40920
  24: 915688..956607:   9499832..9540751    0 (9499832..9540751)   40920
  25: 956608..997487:   9458952..9499831    0 (9458952..9499831)   40880
  26: 997488..1038447:  9417992..9458951    0 (9417992..9458951)   40960
  27: 1038448..1079343: 9377096..9417991    0 (9377096..9417991)   40896
  28: 1079344..1120271: 9336168..9377095    0 (9336168..9377095)   40928
  29: 1120272..1161231: 9295208..9336167    0 (9295208..9336167)   40960
  30: 1161232..1202191: 9254248..9295207    0 (9254248..9295207)   40960
  31: 1202192..1243151: 9213288..9254247    0 (9213288..9254247)   40960
  32: 1243152..1284111: 9172328..9213287    0 (9172328..9213287)   40960
  33: 1284112..1325071: 9131368..9172327    0 (9131368..9172327)   40960
  34: 1325072..1374223: 9082216..9131367    0 (9082216..9131367)   49152
  35: 1374224..1423375: 9033064..9082215    0 (9033064..9082215)   49152
  36: 1423376..1472527: 8983912..9033063    0 (8983912..9033063)   49152
  37: 1472528..1521679: 8934760..8983911    0 (8934760..8983911)   49152
  38: 1521680..1570815: 8885624..8934759    0 (8885624..8934759)   49136
  39: 1570816..1619967: 8836472..8885623    0 (8836472..8885623)   49152
  40: 1619968..1669095: 8787344..8836471    0 (8787344..8836471)   49128
  41: 1669096..1718247: 8738192..8787343    0 (8738192..8787343)   49152
  42: 1718248..1767391: 8689048..8738191    0 (8689048..8738191)   49144
  43: 1767392..1816543: 8639896..8689047    0 (8639896..8689047)   49152
  44: 1816544..1865695: 8590744..8639895    0 (8590744..8639895)   49152
  45: 1865696..1914847: 8541592..8590743    0 (8541592..8590743)   49152
  46: 1914848..1963991: 8492448..8541591    0 (8492448..8541591)   49144
  47: 1963992..2013143: 8443296..8492447    0 (8443296..8492447)   49152
  48: 2013144..2062295: 8394144..8443295    0 (8394144..8443295)   49152
  49: 2062296..2111447: 8344992..8394143    0 (8344992..8394143)   49152
  50: 2111448..2160599: 8295840..8344991    0 (8295840..8344991)   49152
  51: 2160600..2209743: 8246696..8295839    0 (8246696..8295839)   49144
  52: 2209744..2251767: 8204672..8246695    0 (8204672..8246695)   42024
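
For a long listing like this you don't have to count rows by eye. A minimal sketch, assuming the output layout shown above (an index followed by a colon at the start of each extent row); count_extents is a made-up name, and the check for "hole" rows is my assumption about how sparse files are listed:

```shell
#!/bin/sh
# Count extent rows in "xfs_bmap -v" output read from stdin.
# An extent row starts with a numeric index and a colon, e.g. "  12: ...".
# The file-name line and the "EXT:" header do not match and are skipped.
# Rows whose block range is "hole" (assumed format for sparse regions) are
# not counted, since they occupy no disk space.
count_extents() {
    awk 'NF > 1 && $1 ~ /^[0-9]+:$/ && $3 != "hole" { n++ } END { print n + 0 }'
}

# Usage: xfs_bmap -v 18-Jun-2007.mkv | count_extents
```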

Defragmenting the FS

Use xfs_fsr to defrag your partition. When started with no arguments, it reorganizes all regular files in all mounted XFS filesystems. Operationally, it works differently from Windows-based defrag utils: xfs_fsr runs in many cycles, making a single pass over each XFS filesystem per cycle. It's smart and will select the most fragmented files, attempting to defrag the top 10% of them on each pass.

You can specify how long you want it to run (default is 2 hours) and it will remember where it left off if more defragmenting is required. This information is stored in /var/tmp/.fsrlast_xfs.

Unlike some other disk utils, you actually leave the filesystem mounted when you use this one! Finally, it isn't required, but it is recommended that you enter single user mode before running xfs_fsr so programs/users don't interfere with the optimization process by reading from or writing to the partition. You can do this via the telinit command:

# telinit 1

Now start the defrag process like so:

# xfs_fsr -v -t 600

The -t 600 means run for 600 sec (10 min) and then stop. How long you need it to run depends on the size and number of fragments on the partition. You can change the time to whatever you want, then re-check the fragmentation factor and run it again if needed. Although you told it 600 sec, it might run longer if it's in the middle of a file when the time is up.

The -v switch isn't really needed. I like it because it lets me know it's still running. Omit it if you wish. It means verbose mode and just prints out a lot of cryptic info about what it's doing, for example:

extents before:2 after:1 DONE ino=1989755620
ino=1989755622
extents before:2 after:1 DONE ino=1989755622

In my example above, I ran the defrag util for the default of 2 h to clean up all 300 gigs. After that time it was complete:

$ xfs_db -c frag -r /dev/hda3
actual 2552, ideal 2546, fragmentation factor 0.24%

Remember to switch back to multiuser mode if you dropped down to single user mode:

# telinit 5
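
If you want to repeat short passes until the number looks good, the check-and-rerun cycle can be sketched as below. Treat it as an illustration only: frag_percent and above_threshold are names I made up, and DEV, LIMIT and the cap of 12 passes are placeholders to adjust. The loop itself is left as a comment because it needs root and a real XFS device.

```shell
#!/bin/sh
# Extract the percentage from a line such as
#   "actual 5342, ideal 2568, fragmentation factor 51.93%"
frag_percent() {
    sed -n 's/.*fragmentation factor \([0-9.]*\)%.*/\1/p'
}

# Succeed (exit 0) if the percentage in $1 is above the threshold in $2.
# An empty $1 is treated as 0, so the test fails and a loop would stop.
above_threshold() {
    awk -v p="$1" -v t="$2" 'BEGIN { exit !(p + 0 > t + 0) }'
}

# Sketch of a loop (run as root; DEV and LIMIT are placeholders). The pass
# counter stops it if the factor never drops below LIMIT.
#   DEV=/dev/hda3; LIMIT=5; pass=0
#   while [ "$pass" -lt 12 ] &&
#         above_threshold "$(xfs_db -c frag -r "$DEV" | frag_percent)" "$LIMIT"
#   do
#       xfs_fsr -v -t 600 "$DEV"
#       pass=$((pass + 1))
#   done
```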

Avoiding Future Fragmentation

See the Optimizing_Performance page, which suggests adding the allocsize option to your /etc/fstab to help minimize future fragmentation. It makes the filesystem pre-allot space in 1/2 gig chunks to files it is creating when you capture or otherwise write to the FS. This is very helpful in avoiding fragmentation.

Do it by editing your fstab:

# nano /etc/fstab

Now find the line that defines your /myth partition and add allocsize=512m to its mount options (the fourth field). Here is that line from my /etc/fstab:

/dev/hda3  /myth  xfs  defaults,allocsize=512m  0 0
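
The same edit can be scripted if you prefer. Again, only a sketch under assumptions: it expects a single uncommented fstab line whose mount point is /myth, the helper name add_allocsize is invented for this example, and it prints the result instead of touching the file. Lines that already carry an allocsize option are left exactly as they are.

```shell
#!/bin/sh
# Hypothetical helper: print the given fstab with "allocsize=512m" appended to
# the mount options (4th column) of the /myth entry, unless already present.
add_allocsize() {
    awk '
        $0 !~ /^[ \t]*#/ && $2 == "/myth" && $4 !~ /allocsize=/ {
            $4 = $4 ",allocsize=512m"   # awk rejoins fields with single spaces
        }
        { print }
    ' "$1"
}

# Example use (review the output before replacing /etc/fstab):
#   add_allocsize /etc/fstab > /tmp/fstab.new
```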

Keeping XFS Healthy (Disk Checking)

As mentioned previously, fsck doesn't check XFS. You must periodically check your XFS partition(s) for errors manually, since fsck will NOT run automatically as it does with your ext3 partition(s). xfs_repair is the provided program for this.

You MUST unmount your XFS partition to use xfs_repair. Do NOT run it on a mounted filesystem! On my system, only my /myth is XFS (it is /dev/hda3). In order to unmount your partition, you must exit myth-frontend since it'll be using it. Also make sure that you have no scheduled recordings, or else you will also have to stop myth-backend prior to checking.

After you have exited the frontend and/or mythwelcome, drop into single user mode and unmount your partition:

# telinit 1
# umount /dev/hda3

Obviously, you'll substitute the location of your /myth. For example, if you're using a SATA or SCSI drive, it would be /dev/sda3, etc.
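
Since running the repair on a mounted filesystem is the one thing to avoid, a quick guard can save you from a slip. A minimal sketch, assuming the kernel lists the partition in /proc/mounts under the same device name you pass in (it may show up under another name, such as a by-uuid path, in which case this check would miss it). The function name is_mounted is made up for this example.

```shell
#!/bin/sh
# Succeed (exit 0) if device $1 appears as a mounted source in the mounts
# table $2 (defaults to /proc/mounts, where the first field is the device).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example guard before a repair (DEV is a placeholder for your /myth device):
#   DEV=/dev/hda3
#   if is_mounted "$DEV"; then
#       echo "$DEV is still mounted; unmount it first" >&2
#   else
#       xfs_repair "$DEV"
#   fi
```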

Now that /myth is unmounted, issue the command to check/repair:

# xfs_repair /dev/hda3
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

When the check/repair is finished, go ahead and remount your /myth and return to multiuser mode:

# mount /dev/hda3
# telinit 5

If you receive an error from xfs_repair, see this webpage for a list of common errors and how to handle them.