Difference between revisions of "Optimizing Performance"
Revision as of 17:05, 6 January 2009
This HOWTO aims to collect the multitude of tips regarding optimizing performance of your system for use with MythTV.
- 1 File Systems
- 1.1 Local File Systems
- 1.2 Network File Systems
- 2 Devices
- 3 Operating System
- 4 Other Software
Local File Systems
General Tips For Any File System
Fragmentation happens when the data placement of files is not contiguous on disk, causing time-consuming head seeks when reading or writing a file.
MythTV recordings on disk can become quite fragmented for several reasons: MythTV writes large files over a very long period of time, recording files may differ drastically in size, and many MythTV systems have multiple capture cards, allowing several shows to be recorded at once. Note also that any time MythTV is recording multiple shows to a single filesystem (even if in different directories and/or in different Storage Groups), the recordings will necessarily be fragmented.
Configuring multiple local filesystems within MythTV's Storage Groups allows MythTV to write simultaneous recordings to separate filesystems, thereby minimizing fragmentation. The best approach to combat fragmentation is therefore to ensure each computer running mythbackend has at least as many local (and available) filesystems as capture cards. If using a combination of local and network-mounted filesystems, you may need to adjust the Storage Groups Weighting to cause MythTV to write to network-mounted filesystems (though doing so may hurt performance, so a sufficient number of local filesystems, or only network-mounted filesystems, is preferred). Whether a filesystem counts as available depends partly on its free space: having 2 filesystems for 2 capture cards with one filesystem completely full and the other only half full will not help prevent fragmentation, though if both are full, autoexpiration should allow either to be used.
Fragmentation can be measured with the "filefrag" command on almost any filesystem.
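As a rough sketch of how this might be used, the helper below ranks files by the number of extents that filefrag reports for them. It assumes filefrag's usual one-line-per-file output ("NAME: N extents found"); the function name rank_by_extents and the recordings path in the usage comment are hypothetical.

```shell
# Rank files by extent count, most fragmented first.
# Reads filefrag output on stdin, one line per file, e.g.
#   /video/1001_20090106.mpg: 1500 extents found
rank_by_extents() {
    awk '{
        if (match($0, /: [0-9]+ extents? found/)) {
            name = substr($0, 1, RSTART - 1)   # everything before ": N extents"
            n = substr($0, RSTART + 2)         # "N extents found"
            sub(/ .*/, "", n)                  # keep just N
            print n "\t" name
        }
    }' | sort -rn
}

# Usage (the directory is an assumption; adjust to your Storage Group):
# filefrag /var/lib/mythtv/recordings/*.mpg | rank_by_extents | head
```

Files that show only a handful of extents are effectively contiguous; the ones at the top of the list are the candidates for defragmenting.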
Disabling File Access Time Logging
Most filesystems log the access times of files. This metadata is generally unnecessary, and skipping it avoids a metadata write each time a file is read. However, if for some reason you experience problems, don't apply this tweak.
To disable the logging of file access times, add the "noatime" and "nodiratime" options to your /etc/fstab:
# 1.5 TB RAID 5 array. Large file optimization: 512m of prealloc
# NO logging of access times: improves performance
# NO block devices or suid progs allowed: improves security
/dev/md0 /terabyte xfs defaults,noatime,nodiratime,nosuid,nodev,allocsize=512m 0 0
If you get something like the following, the mount option is not supported for your filesystem:
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so

# dmesg | tail would return something like:
YOUR_FILESYSTEM_TYPE: unknown mount option [noatime].
Using "relatime" Mount Option
You may also wish to look into the "relatime" mount option to improve performance, but still have file atime updated. For more information on this (and related discussion), see: Linux: Replacing atime With relatime
Under XFS, an additional command can be used to measure filesystem fragmentation: "xfs_bmap".
The xfs filesystem has a mount option which can help combat this fragmentation: allocsize
allocsize=size
    Sets the buffered I/O end-of-file preallocation size when doing delayed allocation writeout (default size is 64KiB). Valid values for this option are page size (typically 4KiB) through to 1GiB, inclusive, in power-of-2 increments.
This can be added to /etc/fstab, for example:
/dev/hdd1 /video xfs defaults,allocsize=512m 0 0
This essentially causes xfs to speculatively preallocate 512m of space at a time for a file when writing, and can greatly reduce fragmentation. For example, on my box with HD streams that typically take about 3 GB for an hour of video, I used to get thousands of extents. This is largely due to the fsync loop in the file writer, which undoes any benefit that xfs's delayed allocation would otherwise provide. With the allocsize mount option as above, I now get at most 30 or so extents, because the periodic fsync now flushes to preallocated blocks.
For files which are already heavily fragmented, the xfs_fsr command (from the xfsdump package) can be used to defragment individual files, or an entire filesystem.
Run the following command to determine how fragmented your filesystem is:
xfs_db -c frag -r /dev/hdd1
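The command prints a summary line of the form "actual N, ideal M, fragmentation factor P%". A minimal sketch for turning that into a yes/no answer follows; the helper names and the 20% threshold in the usage comment are assumptions, not recommendations from the XFS tools.

```shell
# Extract the fragmentation factor (a percentage, without the % sign)
# from `xfs_db -c frag` output on stdin.
frag_factor() {
    sed -n 's/.*fragmentation factor \([0-9.]*\)%.*/\1/p'
}

# Exit 0 when the factor read from stdin is above the limit given as $1,
# exit 1 otherwise (including when no factor line was found).
needs_defrag() {
    frag_factor | awk -v limit="$1" 'NR == 1 { found = ($1 > limit) } END { exit !found }'
}

# Usage (device and threshold are assumptions):
# xfs_db -c frag -r /dev/hdd1 | needs_defrag 20 && echo "consider running xfs_fsr"
```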
xfs_fsr with no parameters will run for two hours. The -t parameter specifies how long it runs, in seconds. It keeps track of where it got up to and can be run repeatedly, so it can be added to your crontab to periodically defragment your disk. Add the following to /etc/crontab:
30 1 * * * root /usr/sbin/xfs_fsr -t 21600 >/dev/null 2>&1
to run it every night at 1:30 for 6 hours.
Don't forget to see the complete XFS_Filesystem wiki page that includes general info about XFS, defragmenting, disk checking and maintenance, etc.
Changing Number of Log Buffers
An interesting tweak mentioned in Filesystem Performance Tweaking with XFS is to change the number of log buffers used by XFS. This tweak can improve both sequential and random file creation and deletion times. The default depends on your filesystem's blocksize, and each log buffer takes up 32K of RAM. SGI (the company that created XFS) advises against using 8 on a system with 128M of RAM or less. Since most systems have more than 128M of RAM nowadays, it should be safe to increase the number of log buffers for XFS. Note that the maximum number of log buffers is 8.
By default, XFS adjusts this depending on your filesystem's blocksize:
logbufs=value
    Set the number of in-memory log buffers. Valid numbers range from 2-8 inclusive. The default value is 8 buffers for filesystems with a blocksize of 64KiB, 4 buffers for filesystems with a blocksize of 32KiB, 3 buffers for filesystems with a blocksize of 16KiB and 2 buffers for all other configurations. Increasing the number of buffers may increase performance on some workloads at the cost of the memory used for the additional log buffers and their associated control structures.
To check if you really need this tweak, check your XFS filesystem's blocksize using:
# replace /dev/md0 with your device's name
xfs_info /dev/md0
Look for the bsize=X listed in the output. This value is reported in bytes. Note: I did not need to tweak this value, as my blocksize was 65536 bytes (64 KiB) on an XFS RAID 5 array.
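The mapping from blocksize to default log buffer count quoted from the man page above can be sketched as a small lookup. The helper names are hypothetical, and xfs_bsize assumes the block size of interest is the bsize= value on the line of xfs_info output that starts with "data".

```shell
# Default number of log buffers for a block size in bytes,
# following the man page text quoted above.
default_logbufs() {
    case "$1" in
        65536) echo 8 ;;
        32768) echo 4 ;;
        16384) echo 3 ;;
        *)     echo 2 ;;
    esac
}

# Pull the data-section block size out of xfs_info output on stdin.
xfs_bsize() {
    sed -n 's/^data .*bsize=\([0-9]*\).*/\1/p'
}

# Usage (device name is an assumption):
# default_logbufs "$(xfs_info /dev/md0 | xfs_bsize)"
```

If the result is already 8, the logbufs option has nothing left to raise.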
To perform this tweak, edit your /etc/fstab and add the "logbufs=X" option, where X is 2 to 8 inclusive. More logbuffers will allow more system RAM to be used for filesystem log caching. This should theoretically increase performance, and according to benchmarks in Filesystem Performance Tweaking with XFS, it did for that user.
# Change X to what you want.
/dev/md0 /terabyte xfs defaults,noatime,nodiratime,nosuid,nodev,allocsize=512m,logbufs=X 0 0
Please refer to Filesystem Performance Tweaking with XFS for other useful tweaks to improve the performance of your XFS filesystem.
Network File Systems
Disable NFS file attribute caching
If you are using SMB (not CIFS), you can try the ttl option using "-o ttl=100", which should set your timeout lower than the default. The default is supposed to be 1000ms (1 second), but one user has reported that setting ttl=100 corrected the issue for him, so SMB users can give it a try.
Ensure that your NFS server is running in 'async' mode (configured in /etc/exports). The default for many NFS servers is 'async', but recent versions of Debian default to 'sync', which can result in very low throughput and the dreaded "TFW, Error: Write() -- IOBOUND" errors. Example of setting async in /etc/exports:
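A minimal sketch of such an entry, together with a quick check for exports that lack the option. The export path reuses the one from the fstab example further down; the client subnet, the extra options and the helper name exports_without_async are assumptions for illustration.

```shell
# Hypothetical /etc/exports line: "async" goes in the option list in parentheses.
example_export='/mythtv/recordings 192.168.1.0/24(rw,async,no_subtree_check)'

# Print the exported path of every entry (exports format on stdin)
# whose option list does not contain "async".
# Comment and blank lines are skipped; continuation lines are not handled.
exports_without_async() {
    awk '!/^[ \t]*(#|$)/ && !/\(([^)]*,)?async(,[^)]*)?\)/ { print $1 }'
}

# Usage on the NFS server:
# exports_without_async < /etc/exports
# After editing /etc/exports, re-export with: exportfs -ra
```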
There are a few other NFS mount options that can help, such as "intr", "rsize", "wsize", "nfsvers=3", "actimeo=0", "noatime" and "tcp". You can read the man pages for a more detailed description, but suggestions are below. (Please note that "soft", which was previously recommended here, is prone to cause file corruption.)
rsize,wsize - 8192 to 32768 (8k to 32k) suggested, but this can depend on your network. Try one, test it, try another, test it. 8192 is a reasonable default.
nfsvers=3 - This tells the client to use NFS version 3, which is better. Of course, the server must also support it.
actimeo=0 - Disables file attribute caching so that the frontend sees updates from the backend more quickly. A problem has been seen where LiveTV fails to transition from one program to another because cached file attributes prevent the frontend from opening the new file promptly. Note that disabling the cache puts more load on the server, if that is an issue.
tcp - This tells NFS to use TCP instead of UDP. This seems to be *very* important for high speed networks (i.e. 1000 Mbit) and mixed networks, and probably isn't a bad idea in any case.
intr - Makes I/O to an NFS-mounted filesystem interruptible if the server is down. If not given, the I/O becomes an uninterruptible sleep, which makes the process impossible to kill until the server comes up again.
soft - If the NFS server becomes unavailable, the NFS client will generate "soft" errors instead of hanging. Some software handles this well, others much less well. In the latter case file corruption will result. For a frontend node that only reads, it might still be a reasonable setting.
Example /etc/fstab entry:
server:/mythtv/recordings /mythtv/recordings nfs intr,rsize=8192,wsize=8192,async,nfsvers=3,bg,actimeo=0,tcp
Ethernet Full-Duplex Mode
Make sure that your ethernet adapters are running in full duplex mode. Check your current configuration with this command:
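Given just an interface name, ethtool prints the current link settings, including the Speed, Duplex and Auto-negotiation lines. The sketch below assumes the interface is eth0, as in the forced-mode example further down; link_summary is a hypothetical helper that trims the output to those three lines.

```shell
# Show the current link settings (may need root on some systems):
# ethtool eth0

# Reduce ethtool output on stdin to "Speed=", "Duplex=" and
# "Auto-negotiation=" lines.
link_summary() {
    awk -F': ' '/^[ \t]*(Speed|Duplex|Auto-negotiation):/ {
        sub(/^[ \t]+/, "", $1)   # strip the leading indentation
        print $1 "=" $2
    }'
}

# Usage:
# ethtool eth0 | link_summary
```

A line reading Duplex=Half on a switched network is the case to look into.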
Typically both sides will be configured for autonegotiation by default and you will get the best possible connection automatically but there are conditions--typically involving old or buggy hardware--when this may not happen. The following can be used to disable autonegotiation and force a 100base-T network adapter into full duplex mode, when autonegotiation is failing.
ethtool -s eth0 speed 100 duplex full autoneg off
This problem can show up as "IOBOUND" errors in your logs.
Note: To use full-duplex mode, your network card must be connected to a switch (not a hub) and the switch must be configured to allow full-duplex operation (almost always the default) on the ports that are being used. By definition, a network switch supports full duplex operation and a network hub (sometimes referred to as a repeater) does not. If you are connecting to a hub, full-duplex operation will not be possible. Most switches support using 100base-T (Fast Ethernet) as well as 10base-T, while most hubs will only use 10base-T, and while a few 100base-T hubs (and 10base-T switches) do exist, they are quite rare. Gigabit switches can reliably be expected to handle both fast ethernet and normal ethernet connections in addition to the gigabit ethernet speeds.
Problems will arise if only one side of a connection supports full duplex, or if one side only supports autonegotiation and cannot be manually configured. Most cheap switches and home routers do not support manual port configuration, so they will autonegotiate to a half duplex connection if the computer is forced to full duplex as shown above. A forced connection cannot advertise its settings, so the autonegotiating side must assume half duplex; if the link was already running at full duplex, forcing it will actually create a problem. Nearly all of the time, using autonegotiation on all of the equipment will give you the best possible results. If you encounter problems with autonegotiation, you can opt to manually configure settings for that device, but it is highly recommended that you manually configure every piece of equipment on that segment as well.
MythTV is very demanding of disks and needs a sizeable amount of disk throughput to operate properly, although it may not be obvious at first quite how much. When watching LiveTV on a PVR card, for instance, with MythTV all on one machine, the backend is writing to the ringbuffer while the frontend is simultaneously reading from it, so just watching TV uses twice the disk resources that one might expect. Filesystem caching can usually reduce the impact of this, but not always. If you are getting repeated messages from MythTV complaining that the ringbuffer file is not available, it's likely that DMA access has gone wrong in your configuration and caused your disks to become very slow.
First, check to see if DMA access has been enabled for the drive you are using. Running `hdparm /dev/hdx` for each drive will tell you (the using_dma setting) whether or not DMA has been enabled. Under normal conditions, the kernel will always enable DMA support for any and all drives and controllers that support the feature (which is basically everything that shouldn't be in a museum by now). If DMA has not been enabled, usually the kernel will have said something in the syslog/dmesg as to why it refused to enable DMA support. Solve whatever problem it's referring to before continuing. Remember that you really do need an 80-conductor IDE cable for DMA transfers to work reliably. 40-conductor cables are fine for optical drives, but not for magnetic disks. If you're still using a 40-conductor cable, replace it even if it seemed to work just fine--high speed transfers are not reliable without the 80-conductor cable.
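To check several drives at once, the using_dma line can be tested in a loop. This is only a sketch: dma_enabled is a hypothetical helper, it assumes hdparm's usual "using_dma = 1 (on)" output line, and the /dev/hd? glob assumes IDE device naming.

```shell
# Exit 0 when hdparm output on stdin reports using_dma = 1, exit 1 otherwise.
dma_enabled() {
    awk '$1 == "using_dma" { on = ($3 == 1) } END { exit !on }'
}

# Usage (hdparm normally needs root):
# for d in /dev/hd?; do
#     hdparm "$d" | dma_enabled || echo "$d: DMA is off"
# done
```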
There is a "Generic PCI bus-master DMA support" option in the kernel that will enable DMA support, but this by itself results in rather slow (<6MB/s) throughput. Run `hdparm -t /dev/hdx` while the machine is relatively non-busy and you should see a number around 16MB/s, 33MB/s or sometimes even higher (SATA and SCSI drives and RAID arrays will often show even higher numbers). If you see a very low number, then you need to enable support for the specific chipset of your disk controller in the kernel. If you are booting from this controller, you must compile the chipset support for it directly into the kernel and not as a module.
If DMA access is still not available, you can try to force it on with the command `hdparm -d1 /dev/hdx`, optionally adding -X with an explicit transfer mode (for example `-X udma2`; -X requires a mode argument, see the hdparm man page). Use this only as a last resort, as enabling DMA access when the system isn't capable of properly supporting it can easily result in massive data corruption.
RAID (Redundant Array of Inexpensive Disks) is a method of utilizing multiple drives in parallel for enhanced reliability and/or performance. For the purpose of performance optimization, you can look at one of the RAID 0 (striped) configurations, to split the data between two or more drives (without redundancy) to increase throughput. More information can be found in two locations: the file storage page, and the RAID page.
For backend machines, or machines that are a combination frontend/backend, the type of capture card used will impact performance. With a typical analog capture card, such as the popular bttv cards, the CPU must encode the raw video to MPEG-4 or RTjpeg on the fly. When watching live TV on a combination frontend/backend machine, the machine has to both encode AND decode the video stream simultaneously.
Luckily there are two options:
- Hardware MPEG-2 Capture Cards, such as the popular Hauppauge PVR-150, PVR-250, PVR-350, and PVR-500.
- Digital tuners, such as the pcHDTV HD-5500, which work with both OTA 8VSB signals and QAM for digital cable systems.
With cards of this type, the machine's CPU doesn't have to encode the incoming video. Instead, it simply receives the MPEG-2 stream from the card and dumps it to disk. This makes the recording process a simple operation, with relatively low resource usage.
Video Cards & Hardware Accelerated Video
Several options are available for accelerating video output:
XvMC can be used for GPU decoding of MPEG-2 on most chipsets, with MPEG-4 being supported on some VIA Unichrome chipsets.
NVIDIA AGP FastWrite & Side Band Addressing
See AGP FastWrites or Side Band Addressing for more information. (Note that this link is no longer active due to a database loss at the Gentoo Wiki - do we have an alternate source?)
VDPAU is currently NVIDIA-only, but provides GPU-accelerated decoding of MPEG-1, MPEG-2, H.264, and VC-1 bitstreams, as well as post-processing of decoded video including temporal and spatial deinterlacing, inverse telecine, and noise reduction.
Hardware MPEG-2 Decoders
CPU / Processor
Clock Speed Throttling
There are several conditions in which your computer's CPU may be scaled down from its maximum clock speed:
- A laptop or notebook has scaled down the CPU automatically due to being unplugged from an AC power source and running on the battery
- The system has detected an unsafe thermal condition, and has scaled back the clock speed to avoid damage
- The CPU speed has been configured incorrectly in the BIOS
- The CPU speed has been manually configured to a lower speed at runtime
You can check your CPU's current operating frequency by running the command:
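On x86 systems the kernel usually reports the current clock of each core on the "cpu MHz" lines of /proc/cpuinfo. The sketch below assumes that format; min_cpu_mhz is a hypothetical helper that prints the slowest core, which makes throttling easier to spot.

```shell
# Show the current clock of each core:
# grep 'cpu MHz' /proc/cpuinfo

# Print the lowest "cpu MHz" value found in /proc/cpuinfo-style text on stdin.
min_cpu_mhz() {
    awk -F': ' '/^cpu MHz/ {
        if (min == "" || $2 + 0 < min) min = $2 + 0
    } END {
        if (min != "") print min
    }'
}

# Usage:
# min_cpu_mhz < /proc/cpuinfo
```

Compare the result with the rated speed shown on the "model name" line; a much lower value while the machine is busy suggests one of the conditions listed above.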
If your system is slowing down because it is at its thermal limits, the only real option is to beef up your cooling capacity. This could be in the form of a larger heatsink, a larger fan, or even liquid cooling. A CPU that is incorrectly configured in the BIOS should be easy to check and easy to fix, but take care that you don't unintentionally overclock it in the process. Changing a manual control or overriding an automatic speed control will likely be distribution-dependent, or subject to your choice of adjustment tools.
If you're compiling your own kernel, you might want to try out the following options:
Ensure that the "Processor Family" (in "Processor Type and Features") is configured correctly.
Ensure that the correct IDE controller is set (in "Device Drivers->ATA/ATAPI support->PCI IDE chipset support").
Kernel preemption allows high priority threads to interrupt even kernel operations -- this ensures the lowest possible latency when responding to important events. (Note: apparently some IVTV drivers show stability problems with a preemptible kernel.)
Increasing the scheduler's timer frequency to 1000Hz can reduce latency between multiple threads of execution (at a small cost to overall performance), e.g. when recording/playing multiple video streams.
On some machines you may hear an annoying high-pitched "whistle": reduce the frequency to 250Hz or lower to avoid this.
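To see how a running kernel was built with respect to the two settings above, its configuration can be inspected. A minimal sketch, assuming your distribution installs the kernel config under /boot (some kernels expose it as /proc/config.gz instead); latency_opts is a hypothetical helper name.

```shell
# Print the preemption model and timer frequency from a kernel .config on stdin.
latency_opts() {
    grep -E '^CONFIG_(PREEMPT(_NONE|_VOLUNTARY)?|HZ)='
}

# Usage:
# latency_opts < "/boot/config-$(uname -r)"
# or, where available:
# zcat /proc/config.gz | latency_opts
```

A fully preemptible kernel with a 1000Hz timer would show CONFIG_PREEMPT=y and CONFIG_HZ=1000.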
The mythfrontend & mythtv threads can be configured to run with "realtime" priorities, provided the frontend is configured this way and the user running mythfrontend has sufficient privileges.
The HOWTO has an excellent section on how to set your system up to enable this (look for "Enabling real-time scheduling of the display thread.") You will also need to select "Enable Realtime Priority Threads" in the General Playback frontend setup dialogue.
Realtime threads can help smooth out video and audio, because the system scheduler gives very high priority to mythtv. For more information on how this works, see the Real-Time chapter in Robert Love's great Linux Kernel Development book.
Incorrect or less-than-optimal PCI Latency settings can cause performance-related problems. See the PCI Latency page.
RTC Maximum Frequency
Linux Distribution Selection
At a more fundamental level, your choice of a Linux distribution can have a large impact on the overall performance of your Myth machine. Most "modern" distributions (Fedora, Ubuntu, etc) come with default installations intended to give the best initial user experience by providing support for scores of devices & programs, with automation wherever possible. The downside is that these default installations have large kernels and large numbers of background processes running to support this usage.
While any distribution can be whittled down to meet a more focused need, it takes an effort to do so. An alternative approach is to select a distribution such as Gentoo that provides you with a blank slate by default. This allows you to add only the components you need, ensuring a clean system with minimal effort.
The choice of an appropriate playback profile can make a huge difference in the perceived performance of your MythTV frontend. The playback profile decides which video decoder will be used, how the on-screen display is rendered, and which video filters (deinterlacing, etc) are used. The playback profile also dictates how hardware acceleration is used, which is especially important on low-end PCs or machines processing HD content.
MySQL Database Tweaks
Taken from this thread in mythtv-users.
Add the following to the [mysqld] section of /etc/my.cnf to see improvements in database speed for MythTV as well as MythWeb.
key_buffer = 48M
max_allowed_packet = 8M
table_cache = 128
sort_buffer_size = 48M
net_buffer_length = 8M
thread_cache_size = 4
query_cache_type = 1
query_cache_size = 4M
XORG CPU Hogging
Under some circumstances, X can use huge amounts of CPU. This can be fixed in some cases by raising its priority, that is, lowering its nice value below the base value of 0 (to a negative value). E.g. renice -10 [pid for X]
A second way of lowering Xorg CPU usage with NVIDIA cards (especially when watching HD/H.264) is to add
Option "UseEvents" "True"
to the Device section of your xorg.conf. (Warning: although this works well for watching HD content, it's considered unstable for 3D software such as games.)
Lightweight Window Managers
While KDE & Gnome provide for a nice user experience, they also bring along a lot of baggage which is unnecessary for a dedicated Myth machine. Switching to a lightweight window manager such as WindowMaker, Fluxbox, or Ratpoison will reduce startup times and give you more available system resources at runtime.