Transcode.py

From MythTV Official Wiki

Author Lucas Jacobs
Description transcode MythTV / WTV programs to H.264/AAC MPEG-4, with commercial cutting, subtitles, chapters and metadata included
Supports MythTV 0.24


Description

This program uses a MythTV database or WTV recording to transcode recorded TV programs to MPEG-4 or Matroska video using the H.264/AAC or VP8/Vorbis/FLAC codecs, cutting the video at certain commercial points detected by either MythTV or by Comskip. The resulting video file comes complete with SRT subtitles extracted from the embedded VBI closed-caption data, and iTunes-compatible metadata about the episode, if it can be found.

In short, this program calls a bunch of programs to convert a file like 1041_20100523000000.mpg or Program Name_ABC_2010_05_23_00_00_00.wtv into a file like /srv/video/Program Name/Program Name - Episode Name.mp4, including captions, file tags, chapters, and with commercials removed.

This program has been tested on Windows, Linux and FreeBSD, and can optionally transcode from remote sources - e.g., mythbackend / MySQL running on a different computer, or WTV source files from a remote SMB share / HomeGroup.

Download

transcode.py

Requirements

For MPEG-4 / H.264:

  • MP4Box
  • neroAacEnc (optional)
  • AtomicParsley (optional)
  • FFmpeg must be compiled with --enable-libx264
  • if using faac for audio, FFmpeg must be compiled with --enable-libfaac

For Matroska / VP8:

  • MKVToolNix
  • FFmpeg must be compiled with --enable-libvpx and --enable-libvorbis

Most of these packages can usually be found in various Linux software repositories or as pre-compiled Windows binaries. Most precompiled FFmpeg binaries include support for everything listed above except libfaac.

Setup

  • Make sure the above dependencies are installed on your system. Add each binary location to your PATH variable if necessary.
  • Edit the settings in transcode.conf to your preference. At a minimum, specify paths to Project-X and remuxTool, specify a directory to output transcoded video, and enter your MythTV host IP (if using MythTV).
  • If on Linux, optionally install transcode.py to /usr/local/bin and copy transcode.conf to ~/.transcode - otherwise, run the program from within its directory.
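As a rough illustration of the configuration step, the sketch below parses a small transcode.conf written as `key = value` lines with `#` comments, which is the layout the `_read_options` function in the source expects. The key names are taken from the script's defaults; the paths, the host address, and the helper name `parse_conf` are placeholders assumed for this example and are not part of transcode.py.

```python
import re

# Placeholder settings; the key names match the defaults in transcode.py,
# but every value here is made up for illustration.
SAMPLE_CONF = '''
# transcode.conf
final_path = /srv/video            # where encoded files are stored
projectx   = /opt/project-x/ProjectX.jar
remuxtool  = /opt/remuxTool.jar
host       = 192.168.1.10          # MythTV backend / MySQL host
two_pass   = no
'''

def parse_conf(text):
    """Parse 'key = value' lines, dropping '#' comments and blank lines."""
    settings = {}
    for line in text.splitlines():
        # Remove the comment part, then surrounding whitespace.
        line = re.sub(r'#.*', '', line).strip()
        # Split on the first '=' only, so values may contain '='.
        match = re.match(r'([^=]+?)\s*=\s*(.*)$', line)
        if match:
            settings[match.group(1).lower()] = match.group(2).strip()
    return settings

settings = parse_conf(SAMPLE_CONF)
print(settings['final_path'])
print(settings['host'])
```

All values come back as strings here; the script itself goes on to convert booleans and numbers, which this sketch leaves out.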

Usage

transcode.py %CHANID% %STARTTIME%
transcode.py /path/to/file.wtv
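The two call forms differ only in the number of positional arguments: a channel ID plus a 14-digit start time for a MythTV recording, or a single path for a WTV file. The sketch below shows one way such a dispatch could look. `classify_args` is a hypothetical helper written for this example, not the script's actual code; the 14-digit `YYYYMMDDhhmmss` timestamp format follows `_convert_time` in the source.

```python
import datetime

def classify_args(args):
    """Return ('mythtv', chanid, start) or ('wtv', path) for the two call forms."""
    if len(args) == 2:
        chanid, stamp = args
        # MythTV start times are written as YYYYMMDDhhmmss.
        if len(stamp) != 14 or not stamp.isdigit():
            raise ValueError('Invalid timestamp - must be 14 digits long.')
        start = datetime.datetime.strptime(stamp, '%Y%m%d%H%M%S')
        return ('mythtv', int(chanid), start)
    if len(args) == 1:
        return ('wtv', args[0])
    raise ValueError('expected "chanid time" or a single WTV file')

print(classify_args(['1041', '20100523000000']))
print(classify_args(['/path/to/file.wtv']))
```

When the script is run as a MythTV user job, `%CHANID%` and `%STARTTIME%` are filled in by MythTV before the command is executed, so the script only ever sees the concrete values.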

Notes on format string

Each of the tags below is replaced with the corresponding metadata obtained from MythTV, the WTV file, or Tvdb, if it can be found. Path separators (slashes) indicate that new directories are to be created if necessary.

Tag Description
 %T title (name of the show)
 %S subtitle (name of the episode)
 %R description (short plot synopsis)
 %C category (genre of the show)
 %n episode production code (or %s%e if unavailable)
 %s season number
 %E episode number (as reported by Tvdb)
 %e episode number, padded with zeroes if necessary
 %r content rating (e.g., TV-G)
 %oy year of original air date (two digits)
 %oY year of original air date (full four digits)
 %om month of original air date (two digits)
 %od day of original air date (two digits)
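To make the tag substitution concrete, here is a minimal sketch of how a format string such as the default `%T/%T - %S` could be expanded. `expand_format` and the `meta` dictionary are assumptions made for this example, not the script's own implementation; in particular, expanding a missing tag to an empty string is a choice made here, and the script may behave differently.

```python
import re

def expand_format(fmt, meta):
    """Replace each %-tag in fmt with the matching value from meta.

    The two-character tags (%oy, %oY, %om, %od) are listed first in the
    pattern so that they are tried before the single-character tags.
    Tags with no metadata available expand to an empty string.
    """
    pattern = re.compile(r'%(oy|oY|om|od|[TSRCnsEer])')
    return pattern.sub(lambda m: str(meta.get(m.group(1), '')), fmt)

# Hypothetical metadata, keyed by tag letter.
meta = {'T': 'Program Name', 'S': 'Episode Name', 's': 2, 'e': '05'}
print(expand_format('%T/%T - %S', meta))
print(expand_format('%T/Season %s/%T - %sx%e - %S', meta))
```

With the sample metadata, the default format yields `Program Name/Program Name - Episode Name`, where the slash marks a directory that would be created if necessary.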

Special thanks

  • #MythTV Freenode
  • wagnerrp, the maintainer of the Python MythTV bindings
  • Project-X team

Usage
=====
  transcode.py [options] chanid time
  transcode.py [options] wtv-file

Options
=======
--version             show program's version number and exit
--help, -h            show this help message and exit

File options
------------
--out=PATH, -o PATH   directory to store encoded video file
                      [default: /srv/video]
--tmp=PATH, -t PATH   temporary directory to be used while transcoding
                      [default: c:\users\lucas\appdata\local\temp]
--format=FMT          format string for the encoded video filename
                      [default: %T/%T - %S]
--replace=CHAR        character to substitute for invalid filename characters
                      [default: ""]
--lang=LANG, -l LANG  two-letter language code [default: en]

Video format options
--------------------
--mp4                 use the MPEG-4 (MP4 / M4V) container format [default]
--mkv                 use the Matroska (MKV) container format
--h264                use H.264/MPEG-4 AVC and AAC [default]
--vp8                 use On2's VP8 codec and Ogg Vorbis
--ipod                use iPod Touch compatibility settings
--no-ipod             do not use iPod compatibility settings [default]
--webm                encode WebM compliant video
--no-webm             do not encode WebM compliant video [default]

Video encoding options
----------------------
--one-pass, -1        one-pass encoding [default]
--two-pass, -2        two-pass encoding
--video-br=BR         two-pass target video bitrate (in KB/s) [default: 1500]
--video-crf=CR        one-pass target compression ratio (~15-25 is ideal)
                      [default: 22]
--h264-preset=PRE     ffmpeg x264 preset to use [default: none]
--vp8-preset=PRE      ffmpeg libvpx preset to use [default: 720p]
--res=RES, -r RES     target video resolution or aspect ratio [default: none]
--ipod-preset=PRE     ffmpeg x264 preset to use for iPod Touch compatibility
                      [default: ipod640]
--ipod-res=RES        target video resolution for iPod Touch compatibility
                      [default: 480p]
--webm-preset=PRE     ffmpeg libvpx preset to use for WebM video [default:
                      720p]
--webm-res=RES        target video resolution for WebM video [default: 720p]

Audio encoding options
----------------------
--nero                use NeroAacEnc (must be installed) [default]
--faac                use libfaac (must be linked into ffmpeg)
--vorbis              use Ogg Vorbis audio (MKV only) [default]
--flac                use FLAC audio (MKV only)
--audio-br=BR         faac / vorbis audio bitrate (in KB/s) [default: 128]
--audio-q=Q           neroAacEnc audio quality ratio [default: 0.3]

Metadata options
----------------
--rating              include Tvdb episode rating (1 to 10) as voted by users
                      [default]
--no-rating           do not include Tvdb episode rating
--tvdb-description    use episode descriptions from Tvdb when available

MythTV options
--------------
--host=IP             MythTV database host [default: 127.0.0.1]
--database=DB         MySQL database for MythTV [default: mythconverg]
--user=USER           MySQL username for MythTV [default: mythtv]
--password=PWD        MySQL password for MythTV [default: mythtv]
--pin=PIN             MythTV security PIN [default: 0000]

Miscellaneous options
---------------------
--quiet, -q           avoid printing to stdout
--verbose, -v         print command output to stdout
--thresh=TH           ignore clip segments TH seconds from the beginning or
                      end [default: 5]
--project-x=PATH      path to the Project-X JAR file
                      (used for noise cleaning / cutting)
--remuxtool=PATH      path to remuxTool.jar
                      (used for extracting MPEG-2 data from WTV files)


transcode.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

'''
This program uses a MythTV database or WTV recording to transcode
recorded TV programs to MPEG-4 or Matroska video using the H.264/AAC
or VP8/Vorbis/FLAC codecs, cutting the video at certain commercial points
detected by either MythTV or by Comskip. The resulting video file comes
complete with SRT subtitles extracted from the embedded VBI closed-caption
data, and iTunes-compatible metadata about the episode, if it can be found.

In short, this program calls a bunch of programs to convert a file like
1041_20100523000000.mpg or Program Name_ABC_2010_05_23_00_00_00.wtv into a
file like /srv/video/Program Name/Program Name - Episode Name.mp4,
including captions, file tags, chapters, and with commercials removed.

This program has been tested on Windows, Linux and FreeBSD, and can optionally
transcode from remote sources - e.g., mythbackend / MySQL running on a
different computer, or WTV source files from a remote SMB share / HomeGroup.

Requirements:
- FFmpeg (http://ffmpeg.org)
- Java (http://www.java.com)
- Project-X (http://project-x.sourceforge.net)
- remuxTool.jar (http://babgvant.com/downloads/dburckh/remuxTool.jar)
- ccextractor (http://ccextractor.sourceforge.net) (optional)
- Python 2.7 (http://www.python.org)
  - lxml (http://lxml.de)
  - mysql-python (http://sourceforge.net/projects/mysql-python)
  - the MythTV Python bindings (if MythTV is not installed already)

For MPEG-4 / H.264:
- MP4Box (http://gpac.sourceforge.net)
- neroAacEnc (http://www.nero.com/enu/technologies-aac-codec.html) (optional)
- AtomicParsley (https://bitbucket.org/wez/atomicparsley) (optional)
- FFmpeg must be compiled with --enable-libx264
- if using faac for audio, FFmpeg must be compiled with --enable-libfaac

For Matroska / VP8:
- MKVToolNix (http://www.bunkus.org/videotools/mkvtoolnix/)
- FFmpeg must be compiled with --enable-libvpx and --enable-libvorbis

Most of these packages can usually be found in various Linux software
repositories or as pre-compiled Windows binaries. Most precompiled FFmpeg
binaries include support for everything listed above except libfaac.

Setup:
- Make sure the above dependencies are installed on your system. Add each
  binary location to your PATH variable if necessary.
- Edit the settings in transcode.conf to your preference. At a minimum,
  specify paths to Project-X and remuxTool, specify a directory to output
  transcoded video, and enter your MythTV host IP (if using MythTV).
- If on Linux, optionally install transcode.py to /usr/local/bin and
  copy transcode.conf to ~/.transcode - otherwise, run the program from
  within its directory.

Usage:
  transcode.py %CHANID% %STARTTIME%
  transcode.py /path/to/file.wtv
See transcode.py --help for more details

Notes on format string:
Each of the tags below is replaced with the respective metadata obtained
from MythTV, the WTV file, or Tvdb, if it can be found. Path separators
(slashes) indicate that new directories are to be created if necessary.
  %T - title (name of the show)
  %S - subtitle (name of the episode)
  %R - description (short plot synopsis)
  %C - category (genre of the show)
  %n - episode production code (or %s%e if unavailable)
  %s - season number
  %E - episode number (as reported by Tvdb)
  %e - episode number, padded with zeroes if necessary
  %r - content rating (e.g., TV-G)
  %oy - year of original air date (two digits)
  %oY - year of original air date (full four digits)
  %om - month of original air date (two digits)
  %od - day of original air date (two digits)
  
Special thanks to:
- #MythTV Freenode
- wagnerrp, the maintainer of the Python MythTV bindings
- Project-X team
'''

# Changelog:
# 1.3 - support for Matroska / VP8, fixed subtitle sync problems
# 1.2 - better accuracy for commercial clipping, easier configuration
# 1.1 - support for multiple audio streams, Comskip
# 1.0 - initial release

# TODO:
# - better error handling
# - check for adequate disk space on temporary directory and destination
# - generate thumbnail if necessary
# - better genre interpretation for WTV files
# - fetch metadata for movies as well as TV episodes
# - allow users to manually enter in metadata if none can be found

# Long-term goals:
# - add entries to MythTV's database when encoding is finished
# - support for boxed-set TV seasons on DVD
# - metadata / chapter support for as many different players as possible,
#   especially Windows Media Player
# - easier installation: all required programs should be bundled
# - Python 3.x compatibility
# - apply video filters, such as deinterlace, crop detection, etc.
# - a separate program for viewing / editing Comskip data

# Known issues:
# - subtitle font is sometimes too large on QuickTime / iTunes / iPods
# - many Matroska players seem to have trouble displaying metadata properly
# - video_br and video_crf aren't recognized options in ffmpeg 0.7+

import re, os, sys, math, datetime, subprocess, urllib, tempfile, glob
import shutil, codecs, StringIO, time, optparse, unicodedata, logging
import xml.dom.minidom
import MythTV, MythTV.ttvdb.tvdb_api, MythTV.ttvdb.tvdb_exceptions

def _clean(filename):
    'Removes the file if it exists.'
    try:
        os.remove(filename)
    except OSError:
        pass

def _convert_time(time):
    '''Converts a timestamp string into a datetime object.
    For example, '20100523140000' -> datetime(2010, 5, 23, 14, 0, 0) '''
    time = str(time)
    ret = None
    if len(time) != 14:
        raise ValueError('Invalid timestamp - must be 14 digits long.')
    try:
        ret = datetime.datetime.strptime(time, '%Y%m%d%H%M%S')
    except ValueError:
        raise ValueError('Invalid timestamp.')
    return ret

def _convert_timestamp(ts):
    '''Translates the values from a regex match for two timestamps of the
    form 00:12:34,567 into seconds.'''
    start = int(ts.group(1)) * 3600 + int(ts.group(2)) * 60
    start += int(ts.group(3))
    start += float(ts.group(4)) / 10 ** len(ts.group(4))
    end = int(ts.group(5)) * 3600 + int(ts.group(6)) * 60
    end += int(ts.group(7))
    end += float(ts.group(8)) / 10 ** len(ts.group(8))
    return start, end

def _seconds_to_time(sec):
    '''Returns a string representation of the length of time provided.
    For example, 3675.14 -> '01:01:15' '''
    hours = int(sec / 3600)
    sec -= hours * 3600
    minutes = int(sec / 60)
    sec -= minutes * 60
    return '%02d:%02d:%02d' % (hours, minutes, sec)

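def _seconds_to_time_frac(sec, comma = False):
    '''Returns a string representation of the length of time provided,
    including partial seconds.
    For example, 3675.14 -> '01:01:15.1400', or '01:01:15,140' with
    comma = True (the SRT timestamp format).'''
    if comma:
        # Round to whole milliseconds first so that a fraction such as
        # .9996 carries into the seconds instead of printing ',1000'.
        msec = int(round(sec * 1000))
        hours, msec = divmod(msec, 3600000)
        minutes, msec = divmod(msec, 60000)
        seconds, msec = divmod(msec, 1000)
        return '%02d:%02d:%02d,%03d' % (hours, minutes, seconds, msec)
    hours = int(sec / 3600)
    sec -= hours * 3600
    minutes = int(sec / 60)
    sec -= minutes * 60
    return '%02d:%02d:%07.4f' % (hours, minutes, sec)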

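def _last_name_first(name):
    '''Reverses the order of full names of people, so their last name
    appears first. For example, 'John Q. Public' -> 'Public John Q.'
    An empty name yields an empty string.'''
    names = name.split()
    if len(names) == 0:
        # Return a string here too, so callers always get the same type.
        return u''
    else:
        return u' '.join((names[-1], ' '.join(names[:-1]))).strip()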

def _sanitize(filename, repl = None):
    '''Replaces any invalid characters in the filename with repl. Invalid
    characters include the control characters 0x00 - 0x1F, and some other
    characters on Windows. Unicode characters are translated loosely to
    ASCII on systems which have no Unicode support. Unix systems are assumed
    to have Unicode support. Note that directory separators (slashes) are
    allowed as they specify that a new directory is to be created.'''
    if repl is None:
        repl = ''
    not_allowed = ''
    if os.name == 'nt':
        not_allowed += '<>:|?*'
    out = ''
    for ch in filename:
        if ch not in not_allowed and ord(ch) >= 32:
            if os.name == 'posix' or os.path.supports_unicode_filenames \
                    or ord(ch) < 127:
                out += ch
            else:
                ch = unicodedata.normalize('NFKD', ch)
                ch = ch.encode('ascii', 'ignore')
                out += ch
        else:
            out += repl
    if os.name == 'nt':
        out = out.replace('"', "'")
        reserved = ['con', 'prn', 'aux', 'nul']
        reserved += ['com%d' % d for d in range(0, 10)]
        reserved += ['lpt%d' % d for d in range(0, 10)]
        # Reserved device names are only a problem as a whole path component
        # (with or without an extension), so match them only there; a plain
        # substring match would also mangle titles such as "Control".
        names = '|'.join(reserved)
        regex = re.compile(r'(^|[\\/])(?:%s)(?=$|[\\/.])' % names,
                           re.IGNORECASE)
        out = regex.sub(lambda m: m.group(1) + repl, out)
    return out

def _filter_xml(data):
    'Filters unnecessary whitespace from the raw XML data provided.'
    regex = r'<([^/>]+)>\s*(.+)\s*</([^/>]+)>'
    sub = r'<\g<1>>\g<2></\g<3>>'
    return re.sub(regex, sub, data)
    
def _cmd(args, cwd = None, expected = 0):
    '''Executes an external command with the given working directory, sending
    its output to the debug log. Raises a RuntimeError exception if the
    return code of the subprocess is neither 0 nor the expected value.'''
    args = [unicode(arg).encode('utf_8') for arg in args]
    # args are now UTF-8 byte strings, so join them with a byte string;
    # joining with u' ' would implicitly decode them as ASCII and fail on
    # non-ASCII arguments.
    cmdline = ' '.join(args)
    logging.debug('$ %s' % cmdline)
    proc = subprocess.Popen(args, stdout = subprocess.PIPE,
                            stderr = subprocess.STDOUT, cwd = cwd)
    for line in proc.stdout:
        logging.debug(line.rstrip('\r\n'))
    ret = proc.wait()
    if ret != 0 and ret != expected:
        raise RuntimeError('Unexpected return code', cmdline, ret)
    return ret

def _ver(args, regex, use_stderr = True):
    '''Executes an external command and searches through the standard output
    (and optionally stderr) for the provided regular expression. Returns the
    first matched group, or None if no matches were found.'''
    ret = None
    with open(os.devnull, 'w') as devnull:
        try:
            stderr = subprocess.STDOUT
            if not use_stderr:
                stderr = devnull
            proc = subprocess.Popen(args, stdout = subprocess.PIPE,
                                    stderr = stderr)
            for line in proc.stdout:
                match = re.search(regex, line)
                if match:
                    ret = match.group(1).strip()
                    break
            for line in proc.stdout:
                pass
            proc.wait()
        except OSError:
            pass
    return ret

def _nero_ver():
    '''Determines whether neroAacEnc is present, and if so, returns the
    version string for later.'''
    regex = 'Package version:\s*([0-9.]+)'
    ver = _ver(['neroAacEnc', '-help'], regex)
    if not ver:
        regex = 'Package build date:\s*([A-Za-z]+\s+[0-9]+\s+[0-9]+)'
        ver = _ver(['neroAacEnc', '-help'], regex)
    if ver:
        ver = 'Nero AAC Encoder %s' % ver
    return ver

def _ffmpeg_ver():
    '''Determines whether ffmpeg is present, and if so, returns the
    version string for later.'''
    return _ver(['ffmpeg', '-version'], '([Ff]+mpeg.*)$',
                use_stderr = False)

def _x264_ver():
    '''Determines whether x264 is present, and if so, returns the
    version string for later. Does not check whether ffmpeg has libx264
    support linked in.'''
    return _ver(['x264', '--version'], '(x264.*)$')

def _faac_ver():
    '''Determines whether faac is present, and if so, returns the
    version string for later. Does not check whether ffmpeg has libfaac
    support linked in.'''
    return _ver(['faac', '--help'], '(FAAC.*)$')

def _projectx_ver(opts):
    '''Determines whether Project-X is present, and if so, returns the
    version string for later.'''
    return _ver(['java', '-jar', opts.projectx, '-?'],
                '(ProjectX [0-9/.]*)')

def _version(opts):
    'Compiles a string representing the versions of the encoding tools.'
    nero_ver = _nero_ver()
    faac_ver = _faac_ver()
    vers = [_ffmpeg_ver(), _x264_ver(), _projectx_ver(opts)]
    if opts.nero and nero_ver:
        vers.append(nero_ver)
    elif faac_ver:
        vers.append(faac_ver)
    # Each lookup returns None when its tool cannot be found; skip those
    # entries rather than failing on None or printing it.
    ver = ', '.join(v for v in vers if v)
    logging.debug('Version string: %s' % ver)
    return ver

def _iso_639_2(lang):
    '''Translates from a two-letter ISO language code (ISO 639-1) to a
    three-letter ISO language code (ISO 639-2).'''
    languages = {'cs' : 'ces', 'da' : 'dan', 'de' : 'deu', 'es' : 'spa',
                 'el' : 'ell', 'en' : 'eng', 'fi' : 'fin', 'fr' : 'fra',
                 'he' : 'heb', 'hr' : 'hrv', 'hu' : 'hun', 'it' : 'ita',
                 'ja' : 'jpn', 'ko' : 'kor', 'nl' : 'nld', 'no' : 'nor',
                 'pl' : 'pol', 'pt' : 'por', 'ru' : 'rus', 'sl' : 'slv',
                 'sv' : 'swe', 'tr' : 'tur', 'zh' : 'zho'}
    if lang in languages:
        return languages[lang]
    else:
        raise ValueError('Invalid language code %s.' % lang)

def _find_conf_file():
    'Obtains the location of the config file if one exists.'
    local_dotfile = os.path.expanduser('~/.transcode')
    if os.path.exists(local_dotfile):
        return local_dotfile
    conf_file = os.path.split(os.path.realpath(__file__))[0]
    conf_file = os.path.join(conf_file, 'transcode.conf')
    if os.path.exists(conf_file):
        return conf_file
    return None

def _get_defaults():
    'Returns configuration defaults for this program.'
    opts = {'final_path' : '/srv/video', 'tmp' : None, 'matroska' : False,
            'format' : '%T/%T - %S', 'replace_char' : '', 'language' : 'en',
            'vp8' : False, 'two_pass' : False, 'h264_preset' : None,
            'vp8_preset' : '720p', 'video_br' : 1500, 'video_crf' : 22,
            'resolution' : None, 'ipod' : False, 'webm' : False,
            'ipod_resolution' : '480p', 'ipod_preset' : 'ipod640',
            'webm_resolution' : '720p', 'webm_preset' : '720p', 'nero' : True,
            'flac' : False, 'audio_br' : 128, 'audio_q' : 0.3,
            'use_tvdb_rating' : True, 'use_tvdb_descriptions' : False,
            'host' : '127.0.0.1', 'database' : 'mythconverg',
            'user' : 'mythtv', 'password' : 'mythtv', 'pin' : 0,
            'quiet' : False, 'verbose' : False, 'thresh' : 5,
            'projectx' : 'project-x/ProjectX.jar',
            'remuxtool' : 'remuxTool.jar'}
    return opts

def _add_option(opts, key, val):
    'Inserts the given configuration setting into the dictionary.'
    key = key.lower()
    if key in ['tmp', 'h264_preset', 'vp8_preset', 'resolution',
               'ipod_preset', 'ipod_resolution', 'webm_preset',
               'webm_resolution']:
        if val == '' or not val:
            val = None
    if key in ['matroska', 'vp8', 'two_pass', 'webm', 'ipod', 'nero', 'flac',
               'use_tvdb_rating', 'use_tvdb_descriptions', 'quiet', 'verbose']:
        val = val.lower()
        if val in ['1', 't', 'y', 'true', 'yes', 'on']:
            val = True
        elif val in ['0', 'f', 'n', 'false', 'no', 'off']:
            val = False
        else:
            raise ValueError('Invalid boolean value for %s: %s' % (key, val))
    if key in ['video_br', 'video_crf', 'audio_br', 'audio_q',
               'pin', 'thresh']:
        try:
            if key == 'audio_q':
                val = float(val)
            else:
                val = int(val)
        except ValueError:
            raise ValueError('Invalid numerical value for %s: %s' % (key, val))
    if key in ['projectx', 'remuxtool']:
        if not os.path.exists(val):
            pass # raise IOError('File not found: %s' % val)
    opts[key] = val

def _read_options():
    'Reads configuration settings from a file.'
    opts = _get_defaults()
    conf_name = _find_conf_file()
    if conf_name is not None:
        with open(conf_name, 'r') as conf_file:
            # Split on the first '=' so that values may contain '='.
            regex = re.compile(r'^\s*([^=]+?)\s*=\s*(.*?)\s*$')
            ignore = re.compile('#.*')
            for line in conf_file:
                line = re.sub(ignore, '', line).strip()
                match = re.search(regex, line)
                if match:
                    _add_option(opts, match.group(1).strip(),
                                match.group(2).strip())
    return opts

def _def_str(test, val):
    'Returns " [default]" if test is equal to val.'
    if test == val:
        return ' [default]'
    else:
        return ''

def _get_options():
    'Uses optparse to obtain command-line options.'
    opts = _read_options()
    usage = 'usage: %prog [options] chanid time\n' + \
        '  %prog [options] wtv-file'
    version = '%prog 1.3'
    parser = optparse.OptionParser(usage = usage, version = version,
                                   formatter = optparse.TitledHelpFormatter())
    flopts = optparse.OptionGroup(parser, 'File options')
    flopts.add_option('-o', '--out', dest = 'final_path', metavar = 'PATH',
                      default = opts['final_path'], help = 'directory to ' +
                      'store encoded video file                          ' +
                      '[default: %default]')
    if opts['tmp']:
        flopts.add_option('-t', '--tmp', dest = 'tmp', metavar = 'PATH',
                          default = opts['tmp'], help = 'temporary ' +
                          'directory to be used while transcoding ' +
                          '[default: %default]')
    else:
        flopts.add_option('-t', '--tmp', dest = 'tmp', metavar = 'PATH',
                          help = 'temporary directory to be used while ' +
                          'transcoding [default: %s]' % tempfile.gettempdir())
    flopts.add_option('--format', dest = 'format',
                      default = opts['format'], metavar = 'FMT',
                      help = 'format string for the encoded video filename ' +
                      '          [default: %default]')
    flopts.add_option('--replace', dest = 'replace_char',
                      default = opts['replace_char'],  metavar = 'CHAR',
                      help = 'character to substitute for invalid filename ' +
                      'characters [default: "%default"]')
    flopts.add_option('-l', '--lang', dest = 'language',
                      default = opts['language'], metavar = 'LANG',
                      help = 'two-letter language code [default: %default]')
    parser.add_option_group(flopts)
    vfopts = optparse.OptionGroup(parser, 'Video format options')
    vfopts.add_option('--mp4', dest = 'matroska', action = 'store_false',
                      help = 'use the MPEG-4 (MP4 / M4V) container format' +
                      _def_str(opts['matroska'], False))
    vfopts.add_option('--mkv', dest = 'matroska', action = 'store_true',
                      default = opts['matroska'], help = 'use the Matroska ' +
                      '(MKV) container format' +
                      _def_str(opts['matroska'], True))
    vfopts.add_option('--h264', dest = 'vp8', action = 'store_false',
                      default = opts['vp8'], help = 'use H.264/MPEG-4 AVC ' +
                      'and AAC' + _def_str(opts['vp8'], False))
    vfopts.add_option('--vp8', dest = 'vp8', action = 'store_true',
                      default = opts['vp8'], help = 'use On2\'s VP8 codec ' +
                      'and Ogg Vorbis' + _def_str(opts['vp8'], True))
    vfopts.add_option('--ipod', dest = 'ipod', action = 'store_true',
                      default = opts['ipod'], help = 'use iPod Touch ' +
                      'compatibility settings' + _def_str(opts['ipod'], True))
    vfopts.add_option('--no-ipod', dest = 'ipod', action = 'store_false',
                      help = 'do not use iPod compatibility settings' +
                      _def_str(opts['ipod'], False))
    vfopts.add_option('--webm', dest = 'webm', action = 'store_true',
                      default = opts['webm'], help = 'encode WebM ' +
                      'compliant video' + _def_str(opts['webm'], True))
    vfopts.add_option('--no-webm', dest = 'webm', action = 'store_false',
                      help = 'do not encode WebM compliant video' +
                      _def_str(opts['webm'], False))
    parser.add_option_group(vfopts)
    viopts = optparse.OptionGroup(parser, 'Video encoding options')
    viopts.add_option('-1', '--one-pass', dest = 'two_pass',
                      action = 'store_false', default = opts['two_pass'],
                      help = 'one-pass encoding' +
                      _def_str(opts['two_pass'], False))
    viopts.add_option('-2', '--two-pass', dest = 'two_pass',
                      action = 'store_true', help = 'two-pass encoding' +
                      _def_str(opts['two_pass'], True))
    viopts.add_option('--video-br', dest = 'video_br', metavar = 'BR',
                      type = 'int', default = opts['video_br'],
                      help = 'two-pass target video bitrate (in KB/s) ' +
                      '[default: %default]')
    viopts.add_option('--video-crf', dest = 'video_crf', metavar = 'CR',
                      type = 'int', default = opts['video_crf'],
                      help = 'one-pass target compression ratio (~15-25 is ' +
                      'ideal) [default: %default]')
    viopts.add_option('--h264-preset', dest = 'h264_preset', metavar = 'PRE',
                      default = opts['h264_preset'], help = 'ffmpeg x264 ' +
                      'preset to use [default: %default]')
    viopts.add_option('--vp8-preset', dest = 'vp8_preset', metavar = 'PRE',
                      default = opts['vp8_preset'], help = 'ffmpeg libvpx ' +
                      'preset to use [default: %default]')
    viopts.add_option('-r', '--res', dest = 'resolution', metavar = 'RES',
                      default = opts['resolution'], help = 'target video ' +
                      'resolution or aspect ratio [default: %default]')
    viopts.add_option('--ipod-preset', dest = 'ipod_preset', metavar = 'PRE',
                      default = opts['ipod_preset'], help = 'ffmpeg x264 ' +
                      'preset to use for iPod Touch compatibility ' +
                      '[default: %default]')
    viopts.add_option('--ipod-res', dest = 'ipod_resolution', metavar = 'RES',
                      default = opts['ipod_resolution'], help = 'target ' +
                      'video resolution for iPod Touch compatibility ' +
                      '[default: %default]')
    viopts.add_option('--webm-preset', dest = 'webm_preset', metavar = 'PRE',
                      default = opts['webm_preset'], help = 'ffmpeg libvpx ' +
                      'preset to use for WebM video ' +
                      '[default: %default]')
    viopts.add_option('--webm-res', dest = 'webm_resolution', metavar = 'RES',
                      default = opts['webm_resolution'], help = 'target ' +
                      'video resolution for WebM video ' +
                      '[default: %default]')
    parser.add_option_group(viopts)
    auopts = optparse.OptionGroup(parser, 'Audio encoding options')
    auopts.add_option('--nero', dest = 'nero', action = 'store_true',
                      default = opts['nero'], help = 'use NeroAacEnc (must ' +
                      'be installed)' + _def_str(opts['nero'], True))
    auopts.add_option('--faac', dest = 'nero', action = 'store_false',
                      help = 'use libfaac (must be linked into ffmpeg)' +
                      _def_str(opts['nero'], False))
    auopts.add_option('--vorbis', dest = 'flac', action = 'store_false',
                      default = opts['flac'], help = 'use Ogg Vorbis audio ' +
                      '(MKV only)' + _def_str(opts['flac'], False))
    auopts.add_option('--flac', dest = 'flac', action = 'store_true',
                      help = 'use FLAC audio (MKV only)' +
                      _def_str(opts['flac'], True))
    auopts.add_option('--audio-br', dest = 'audio_br', metavar = 'BR',
                      type = 'int', default = opts['audio_br'], help =
                      'faac / vorbis audio bitrate (in KB/s) ' +
                      '[default: %default]')
    auopts.add_option('--audio-q', dest = 'audio_q', metavar = 'Q',
                      type = 'float', default = opts['audio_q'], help =
                      'neroAacEnc audio quality ratio [default: %default]')
    parser.add_option_group(auopts)
    mdopts = optparse.OptionGroup(parser, 'Metadata options')
    mdopts.add_option('--rating', dest = 'use_tvdb_rating',
                      action = 'store_true', default = opts['use_tvdb_rating'],
                      help = 'include Tvdb episode rating (1 to 10) ' +
                      'as voted by users' +
                      _def_str(opts['use_tvdb_rating'], True))
    mdopts.add_option('--no-rating', dest = 'use_tvdb_rating',
                      action = 'store_false', help = 'do not include Tvdb ' +
                      'episode rating' +
                      _def_str(opts['use_tvdb_rating'], False))
    mdopts.add_option('--tvdb-description', dest = 'use_tvdb_descriptions',
                      action = 'store_true',
                      default = opts['use_tvdb_descriptions'], help = 'use ' +
                      'episode descriptions from Tvdb when available' +
                      _def_str(opts['use_tvdb_descriptions'], True))
    parser.add_option_group(mdopts)
    myopts = optparse.OptionGroup(parser, 'MythTV options')
    myopts.add_option('--host', dest = 'host', metavar = 'IP',
                      default = opts['host'], help = 'MythTV database ' +
                      'host [default: %default]')
    myopts.add_option('--database', dest = 'database', metavar = 'DB',
                      default = opts['database'], help = 'MySQL database ' +
                      'for MythTV [default: %default]')
    myopts.add_option('--user', dest = 'user', metavar = 'USER',
                      default = opts['user'], help = 'MySQL username for ' +
                      'MythTV [default: %default]')
    myopts.add_option('--password', dest = 'password', metavar = 'PWD',
                      default = opts['password'], help = 'MySQL password ' +
                      'for MythTV [default: %default]')
    myopts.add_option('--pin', dest = 'pin', metavar = 'PIN', type = 'int',
                      default = opts['pin'], help = 'MythTV security PIN ' +
                      '[default: %04d]' % opts['pin'])
    parser.add_option_group(myopts)
    miopts = optparse.OptionGroup(parser, 'Miscellaneous options')
    miopts.add_option('-q', '--quiet', dest = 'quiet', action = 'store_true',
                      default = opts['quiet'], help = 'avoid printing to ' +
                      'stdout' + _def_str(opts['quiet'], True))
    miopts.add_option('-v', '--verbose', dest = 'verbose',
                      action = 'store_true', default = opts['verbose'],
                      help = 'print command output to stdout' +
                      _def_str(opts['verbose'], True))
    miopts.add_option('--thresh', dest = 'thresh', metavar = 'TH',
                      type = 'int', default = opts['thresh'], help = 'ignore ' +
                      'clip segments TH seconds from the beginning or end ' +
                      '[default: %default]')
    miopts.add_option('--project-x', dest = 'projectx', metavar = 'PATH',
                      default = opts['projectx'], help = 'path to the ' +
                      'Project-X JAR file                            ' +
                      '(used for noise cleaning / cutting)')
    miopts.add_option('--remuxtool', dest = 'remuxtool', metavar = 'PATH',
                      default = opts['remuxtool'], help = 'path to ' +
                      'remuxTool.jar                               ' +
                      '(used for extracting MPEG-2 data from WTV files)')
    parser.add_option_group(miopts)
    return parser

def _check_args(args, parser, opts):
    '''Checks to ensure the positional arguments are valid, and adjusts
    conflicting options if necessary.'''
    if len(args) == 2:
        try:
            ts = _convert_time(args[1])
        except ValueError:
            print 'Error: invalid timestamp.'
            exit(1)
    elif len(args) == 1:
        if not os.path.exists(args[0]):
            print 'Error: file not found.'
            exit(1)
        if not re.search(r'\.[Ww][Tt][Vv]$', args[0]):
            print 'Error: file is not a WTV recording.'
            exit(1)
    else:
        parser.print_help()
        exit(1)
    if opts.ipod and opts.webm:
        print 'Error: WebM and iPod options conflict.'
        exit(1)
    if (opts.vp8 or opts.flac or opts.webm) and not opts.matroska:
        opts.matroska = True
    if opts.ipod:
        opts.matroska = False
        opts.vp8 = False
        opts.flac = False
        opts.resolution = opts.ipod_resolution
        opts.preset = opts.ipod_preset
    elif opts.webm:
        opts.matroska = True
        opts.vp8 = True
        opts.flac = False
        opts.resolution = opts.webm_resolution
        opts.preset = opts.webm_preset
    elif opts.vp8:
        opts.preset = opts.vp8_preset
    else:
        opts.preset = opts.h264_preset
    loglvl = logging.INFO
    if opts.verbose:
        loglvl = logging.DEBUG
    if opts.quiet:
        loglvl = logging.CRITICAL
    logging.basicConfig(format = '%(message)s', level = loglvl)

class Subtitles:
    '''Extracts closed captions from source media using ccextractor and
    writes them as SRT timed-text subtitles.'''
    subs = 1
    marks = []
    
    def __init__(self, source):
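        self.source = source
        # Per-instance list; the class-level default would be shared
        # between all Subtitles objects.
        self.marks = []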
        self.srt = source.base + '.srt'
        self.enabled = self.check()
        if not self.enabled:
            logging.warning('*** ccextractor not found, ' +
                            'subtitles disabled ***')
    
    def check(self):
        '''Determines whether ccextractor is present, and if so,
        stores the version number for later.'''
        ver = _ver(['ccextractor'], '(CCExtractor [0-9]+\.[0-9]+),')
        return ver is not None and not self.source.opts.webm
    
    def mark(self, sec):
        'Marks a cutpoint in the video at a given point.'
        self.marks += [sec]
    
    def extract(self, video):
        '''Obtains the closed-caption data embedded as VBI data within the
        filtered video clip and writes it to an SRT file.'''
        if not self.enabled:
            return
        logging.info('*** Extracting subtitles ***')
        _cmd(['ccextractor', '-o', self.srt, '-utf8', '-ve',
              '--no_progress_bar', video], expected = 232)
    
    def adjust(self):
        '''Joining video can cause the closed-caption VBI data to become
        out-of-sync with the video, because some closed captions last longer
        than the individual clips and ccextractor doesn't know when to clip
        these captions. To compensate for this, any captions which
        extend longer than the cutpoint are clipped, and the difference is
        subtracted from the rest of the captions.'''
        if len(self.marks) == 0:
            return
        delay = 0.0
        curr = 0
        newsubs = ''
        ts = '(\d\d):(\d\d):(\d\d),(\d+)'
        regex = re.compile(ts + '\s*-+>\s*' + ts)
        with open(self.srt, 'r') as subs:
            for line in subs:
                match = re.search(regex, line)
                if match:
                    (start, end) = _convert_timestamp(match)
                    start += delay
                    end += delay
                    if curr < len(self.marks) and self.marks[curr] < end:
                        if self.marks[curr] > start:
                            delay += self.marks[curr] - end
                            end = self.marks[curr]
                        curr += 1
                    start = _seconds_to_time_frac(start, True)
                    end = _seconds_to_time_frac(end, True)
                    newsubs += '%s --> %s\n' % (start, end)
                else:
                    newsubs += line
        _clean(self.srt)
        with open(self.srt, 'w') as subs:
            subs.write(newsubs)
    
    def clean_tmp(self):
        'Removes temporary SRT files.'
        _clean(self.srt)

class MP4Subtitles(Subtitles):
    'Embeds SRT subtitles into the final MPEG-4 video file.'
    
    def write(self):
        'Invokes MP4Box to embed the SRT subtitles into an MP4 file.'
        if not self.enabled:
            return
        arg = '%s:name=Subtitles:layout=0x125x0x-1' % self.srt
        _cmd(['MP4Box', '-tmp', self.source.opts.final_path, '-add',
              arg, self.source.mp4])

class MKVSubtitles(Subtitles):
    'Embeds SRT subtitles into the final Matroska video file.'
    
    def write(self):
        '''Returns command-line arguments for mkvmerge to embed the SRT
        subtitles into an MKV file, or an empty list if subtitles are
        disabled.'''
        if not self.enabled:
            return []
        return ['--track-name', '0:Subtitles', self.srt]

class MP4Chapters:
    'Creates iOS-style chapter markers between designated cutpoints.'
    
    def __init__(self, source):
        self.source = source
        self.enabled = not self.source.opts.webm
        self._chap = source.base + '.xml'
        doc = xml.dom.minidom.Document()
        doc.appendChild(doc.createComment('GPAC 3GPP Text Stream'))
        stream = doc.createElement('TextStream')
        stream.setAttribute('version', '1.1')
        doc.appendChild(stream)
        header = doc.createElement('TextStreamHeader')
        stream.appendChild(header)
        sample = doc.createElement('TextSampleDescription')
        header.appendChild(sample)
        sample.appendChild(doc.createElement('FontTable'))
        self._doc = doc
    
    def add(self, pos, seg):
        '''Adds a new chapter marker at pos (in seconds).
        If seg is provided, the marker will be labelled 'Scene seg+1'.'''
        sample = self._doc.createElement('TextSample')
        sample.setAttribute('sampleTime', _seconds_to_time_frac(round(pos)))
        if seg is None:
            sample.setAttribute('text', '')
        else:
            text = self._doc.createTextNode('Scene %d' % (seg + 1))
            sample.appendChild(text)
            sample.setAttribute('xml:space', 'preserve')
        self._doc.documentElement.appendChild(sample)
    
    def write(self):
        '''Outputs the chapter data to XML format and invokes MP4Box to embed
        the chapter XML file into an MP4 file.'''
        if not self.enabled:
            return
        _clean(self._chap)
        data = self._doc.toprettyxml(encoding = 'UTF-8', indent = '  ')
        data = _filter_xml(data)
        logging.debug('Chapter XML file:')
        logging.debug(data)
        with open(self._chap, 'w') as dest:
            dest.write(data)
        args = ['MP4Box', '-tmp', self.source.opts.final_path,
                '-add', '%s:chap' % self._chap, self.source.mp4]
        try:
            _cmd(args)
        except RuntimeError:
            logging.warning('*** Old version of MP4Box, ' +
                            'chapter support unavailable ***')
    
    def clean_tmp(self):
        'Removes the temporary chapter XML file.'
        _clean(self._chap)

class MKVChapters:
    '''Creates a simple-format Matroska chapter file and embeds it into
    the final MKV media file.'''
    
    def __init__(self, source):
        self.source = source
        self.enabled = not source.opts.webm
        self._chap = source.base + '.chap'
        self._data = ''
    
    def add(self, pos, seg):
        '''Adds a new chapter marker at pos (in seconds).
        If seg is provided, the marker will be labelled 'Scene seg+1'.'''
        if seg is None or not self.enabled:
            return
        time = _seconds_to_time_frac(pos)
        self._data += 'CHAPTER%02d=%s\n' % (seg, time)
        self._data += 'CHAPTER%02dNAME=%s\n' % (seg, 'Scene %d' % (seg + 1))
    
    def write(self):
        '''Writes the chapter file and returns command-line arguments to embed
        the chapter data into an MKV file.'''
        if not self.enabled:
            return []
        _clean(self._chap)
        logging.debug('Chapter data file:')
        logging.debug(self._data)
        if len(self._data) > 0:
            with open(self._chap, 'w') as dest:
                dest.write(self._data)
            return ['--chapters', self._chap]
        else:
            return []
    
    def clean_tmp(self):
        'Removes the temporary chapter file.'
        _clean(self._chap)

class MP4Metadata:
    '''Translates previously fetched metadata (series name, episode name,
    episode number, credits...) into command-line arguments for AtomicParsley
    in order to embed it as iOS-compatible MP4 tags.'''
    _rating_ok = False
    _credits_ok = False
    
    def __init__(self, source):
        self.source = source
        self.enabled = self.check()
        if not self.enabled:
            logging.warning('*** AtomicParsley not found, ' +
                            'metadata disabled ***')
        elif not self._rating_ok or not self._credits_ok:
            logging.warning('*** Old version of AtomicParsley, ' +
                            'some functions unavailable ***')
    
    def check(self):
        '''Determines whether AtomicParsley is present, and if so, checks
        whether code to handle 'reverse DNS' style atoms (needed for content
        ratings and embedded XML credits) is present.'''
        ver = _ver(['AtomicParsley', '--help'], '(AtomicParsley)')
        if ver:
            rdns = ['AtomicParsley', '--reverseDNS-help']
            if _ver(rdns, '.*(--contentRating)'):
                self._rating_ok = True
            if _ver(rdns, '.*(--rDNSatom)'):
                self._credits_ok = True
            return not self.source.opts.webm
        else:
            return False
    
    def _sort_credits(self):
        '''Browses through the previously obtained list of credited people
        involved with the episode and returns three separate lists of
        actors/hosts/guest stars, directors, and producers/writers.'''
        cast = []
        directors = []
        producers = []
        c = self.source.get('credits')
        for person in c:
            if person[1] in ['actor', 'host', '']:
                cast.append(person[0])
            elif person[1] == 'director':
                directors.append(person[0])
            elif person[1] == 'executive_producer':
                producers.append(person[0])
        for person in c:
            if person[1] == 'guest_star':
                cast.append(person[0])
            elif person[1] == 'producer':
                producers.append(person[0])
        for person in c:
            if person[1] == 'writer':
                producers.append(person[0])
        return cast, directors, producers
    
    def _make_section(self, people, key_name, xml, doc):
        '''Creates an XML branch named key_name, adds each person contained
        in people to an array within it, and appends the branch to xml, using
        the XML document doc.'''
        if len(people) > 0:
            key = doc.createElement('key')
            key.appendChild(doc.createTextNode(key_name))
            xml.appendChild(key)
            array = doc.createElement('array')
            xml.appendChild(array)
            for person in people:
                dic = doc.createElement('dict')
                key = doc.createElement('key')
                key.appendChild(doc.createTextNode('name'))
                dic.appendChild(key)
                name = doc.createElement('string')
                name.appendChild(doc.createTextNode(person))
                dic.appendChild(name)
                array.appendChild(dic)
    
    def _make_credits(self):
        '''Returns an iOS-compatible XML document listing the actors, directors
        and producers involved with the episode.'''
        (cast, directors, producers) = self._sort_credits()
        imp = xml.dom.minidom.getDOMImplementation()
        dtd = '-//Apple Computer//DTD PLIST 1.0//EN'
        url = 'http://www.apple.com/DTDs/PropertyList-1.0.dtd'
        dt = imp.createDocumentType('plist', dtd, url)
        doc = imp.createDocument(url, 'plist', dt)
        root = doc.documentElement
        root.setAttribute('version', '1.0')
        top = doc.createElement('dict')
        root.appendChild(top)
        self._make_section(cast, 'cast', top, doc)
        self._make_section(directors, 'directors', top, doc)
        self._make_section(producers, 'producers', top, doc)
        return doc
    
    def _perform(self, args):
        '''Invokes AtomicParsley with the provided command-line arguments. If
        the program is successful and outputs a new MP4 file with the newly
        embedded metadata, copies the new file over the old.'''
        self.clean_tmp()
        args = ['AtomicParsley', self.source.mp4] + args
        _cmd(args)
        for old in glob.glob(self.source.final + '-temp-*.' +
                             self.source.mp4ext):
            shutil.move(old, self.source.mp4)
    
    def _simple_tags(self, version):
        'Adds single-argument or standalone tags into the MP4 file.'
        s = self.source
        args = ['--stik', 'TV Show', '--encodingTool', version,
                '--grouping', 'MythTV Recording']
        if s.get('time') is not None:
            utc = s.time.strftime('%Y-%m-%dT%H:%M:%SZ')
            args += ['--purchaseDate', utc]
        if s.get('title') is not None:
            t = s['title']
            args += ['--artist', t, '--album', t, '--albumArtist', t,
                     '--TVShowName', t]
        if s.get('subtitle') is not None:
            args += ['--title', s['subtitle']]
        if s.get('category') is not None:
            args += ['--genre', s['category']]
        if s.get('originalairdate') is not None:
            args += ['--year', str(s['originalairdate'])]
        if s.get('channel') is not None:
            args += ['--TVNetwork', s['channel']]
        if s.get('syndicatedepisodenumber') is not None:
            args += ['--TVEpisode', s['syndicatedepisodenumber']]
        if s.get('episode') is not None and \
                s.get('episodecount') is not None:
            track = '%s/%s' % (s['episode'], s['episodecount'])
            args += ['--tracknum', track, '--TVEpisodeNum', s['episode']]
        if s.get('season') is not None and \
                s.get('seasoncount') is not None:
            disk = '%s/%s' % (s['season'], s['seasoncount'])
            args += ['--disk', disk, '--TVSeasonNum', s['season']]
        if s.get('rating') is not None and self._rating_ok:
            args += ['--contentRating', s['rating']]
        self._perform(args)
    
    def _longer_tags(self):
        '''Adds lengthier tags (such as episode description or series artwork)
        into the MP4 file separately, to avoid problems.'''
        s = self.source
        # Only invoke AtomicParsley when there is a description to embed.
        if s.get('description') is not None:
            self._perform(['--description', s['description']])
        if s.get('albumart') is not None:
            try:
                self._perform(['--artwork', s['albumart']])
            except RuntimeError:
                logging.warning('*** Could not embed artwork ***')
    
    def _credits(self):
        'Adds the credits XML data directly as a command-line argument.'
        c = self.source.get('credits')
        if c is not None and c != [] and self._credits_ok:
            doc = self._make_credits()
            args = ['--rDNSatom', doc.toxml(encoding = 'UTF-8'),
                    'name=iTunMOVI', 'domain=com.apple.iTunes']
            self._perform(args)
    
    def write(self, version):
        '''Performs each of the above steps involved in embedding metadata,
        using version as the encodingTool tag.'''
        if self.enabled:
            logging.info('*** Adding metadata to %s ***' % self.source.mp4)
            self._simple_tags(version)
            self._longer_tags()
            self._credits()
    
    def clean_tmp(self):
        'Removes any leftover MP4 files created by AtomicParsley.'
        files = u'%s-temp-*.%s' % (self.source.final, self.source.mp4ext)
        for old in glob.glob(files):
            _clean(old)

class MKVMetadata:
    '''Translates previously fetched metadata (series name, episode name,
    episode number, credits...) into an XML tags file for mkvmerge
    in order to embed it as Matroska tags.'''
    
    def __init__(self, source):
        self.source = source
        self.enabled = not self.source.opts.webm
        self._tags = self.source.base + '-tags.xml'
        imp = xml.dom.minidom.getDOMImplementation()
        url = 'http://www.matroska.org/files/tags/matroskatags.dtd'
        dt = imp.createDocumentType('Tags', None, url)
        self._doc = imp.createDocument(url, 'Tags', dt)
        self._root = self._doc.documentElement
        self._show = self._make_tag('collection', 70)
        self._season = self._make_tag('season', 60)
        self._ep = self._make_tag('episode', 50)
    
    def _make_tag(self, targtype, targval):
        '''Creates an XML branch for Matroska tags using the desired level
        of specificity to which the included tags will apply.'''
        tag = self._doc.createElement('Tag')
        targ = self._doc.createElement('Targets')
        tv = self._doc.createElement('TargetTypeValue')
        tv.appendChild(self._doc.createTextNode(str(targval)))
        targ.appendChild(tv)
        tt = self._doc.createElement('TargetType')
        tt.appendChild(self._doc.createTextNode(targtype.upper()))
        targ.appendChild(tt)
        tag.appendChild(targ)
        self._root.appendChild(tag)
        return tag
    
    def _add_simple(self, tag, name, val):
        '''Creates an XML Matroska <Simple> tag with the specified name
        and value and attaches it to the given tag.'''
        simple = self._doc.createElement('Simple')
        n = self._doc.createElement('Name')
        n.appendChild(self._doc.createTextNode(name.upper()))
        simple.appendChild(n)
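        v = self._doc.createElement('String')
        # Leave str / unicode values untouched so non-ASCII text survives;
        # only numbers and other objects need converting.
        if not isinstance(val, basestring):
            val = str(val)
        v.appendChild(self._doc.createTextNode(val))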
        simple.appendChild(v)
        tag.appendChild(simple)
    
    def _add_date(self, tag, name, date):
        '''Formats a date or datetime object as YYYY-MM-DD and adds it to
        the specified XML tag.'''
        if date is not None:
            self._add_simple(tag, name, date.strftime('%Y-%m-%d'))
    
    def _add_tags(self, version):
        'Adds most simple metadata tags to the XML tree.'
        utc = datetime.datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')
        if self.source.get('title') is not None:
            self._add_simple(self._show, 'title', self.source.get('title'))
        if self.source.get('category') is not None:
            self._add_simple(self._show, 'genre', self.source.get('category'))
        if self.source.get('seasoncount') is not None:
            self._add_simple(self._show, 'part_number',
                             self.source.get('seasoncount'))
        if self.source.get('season') is not None:
            self._add_simple(self._season, 'part_number',
                             self.source.get('season'))
        if self.source.get('episodecount') is not None:
            self._add_simple(self._season, 'total_parts',
                             self.source.get('episodecount'))
        tags = {'subtitle' : 'subtitle', 'episode' : 'part_number',
                'syndicatedepisodenumber' : 'catalog_number',
                'channel' : 'distributed_by', 'description' : 'description',
                'rating' : 'law_rating'}
        for key, val in tags.iteritems():
            if self.source.get(key) is not None:
                self._add_simple(self._ep, val, self.source.get(key))
        if self.source.get('popularity'):
            # Scale popularity (assumed to range 0-255) to a 0-5 rating
            # rounded to the nearest half point.
            popularity = self.source.get('popularity') / 51.0 * 2
            popularity = round(popularity) / 2.0
            self._add_simple(self._ep, 'rating', popularity)
        self._add_date(self._ep, 'date_released',
                       self.source.get('originalairdate'))
        self._add_date(self._ep, 'date_recorded', self.source.time)
        self._add_simple(self._ep, 'date_encoded', utc)
        self._add_simple(self._ep, 'date_tagged', utc)
        self._add_simple(self._ep, 'encoder', version)
        self._add_simple(self._ep, 'fps', self.source.fps)
    
    def _credits(self):
        'Adds a list of credited people into the XML tree.'
        for person in self.source.get('credits') or []:
            if person[1] in ['actor', 'host', 'guest_star', '']:
                self._add_simple(self._ep, 'actor', person[0])
            elif person[1] == 'director':
                self._add_simple(self._ep, 'director', person[0])
            elif person[1] == 'executive_producer':
                self._add_simple(self._ep, 'executive_producer', person[0])
            elif person[1] == 'producer':
                self._add_simple(self._ep, 'producer', person[0])
            elif person[1] == 'writer':
                self._add_simple(self._ep, 'written_by', person[0])
    
    def write(self, version):
        '''Writes the metadata XML file and returns command-line arguments to
        mkvmerge in order to embed it, along with album artwork if available,
        into the final MKV file.'''
        _clean(self._tags)
        if not self.enabled:
            return []
        logging.info('*** Adding metadata to %s ***' % self.source.mkv)
        self._add_tags(version)
        self._credits()
        data = self._doc.toprettyxml(encoding = 'UTF-8', indent = '  ')
        data = _filter_xml(data)
        logging.debug('Tag XML file:')
        logging.debug(data)
        with open(self._tags, 'w') as dest:
            dest.write(data)
        args = ['--global-tags', self._tags]
        if self.source.get('albumart') is not None:
            args += ['--attachment-description', 'Episode preview']
            args += ['--attachment-mime-type', 'image/jpeg']
            args += ['--attach-file', self.source.get('albumart')]
        return args
    
    def clean_tmp(self):
        'Removes the XML tags file if it exists.'
        _clean(self._tags)

class Transcoder:
    '''Base class which invokes tools necessary to extract, split, demux
    and encode the media.'''
    seg = 0
    video = ''
    audio = ''
    subtitles = None
    chapters = None
    metadata = None
    _split = []
    _demuxed = []
    _demux_v = None
    _demux_a = None
    _frames = 0
    _extra = 0
    
    def __init__(self, source, opts):
        self.source = source
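        self.opts = opts
        # Per-instance lists; the class-level defaults would be shared.
        self._split = []
        self._demuxed = []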
        self._join = source.base + '-join.ts'
        self._demux = source.base + '-demux'
        self._wav = source.base + '.wav'
        self._h264 = source.base + '.h264'
        self._vp8 = source.base + '.vp8'
        self._aac = source.base + '.aac'
        self._ogg = source.base + '.ogg'
        self._flac = source.base + '.flac'
        self.check()
    
    def check(self):
        '''Determines whether ffmpeg is installed and has built-in support
        for the requested encoding method.'''
        codecs = ['ffmpeg', '-codecs']
        if not _ffmpeg_ver():
            raise RuntimeError('FFmpeg is not installed.')
        if not self.opts.vp8 and not _ver(codecs, '--enable-(libx264)'):
            raise RuntimeError('FFmpeg does not support libx264.')
        if self.opts.vp8 and not _ver(codecs, '--enable-(libvpx)'):
            raise RuntimeError('FFmpeg does not support libvpx.')
        if self.opts.nero:
            if not _nero_ver():
                raise RuntimeError('neroAacEnc is not installed.')
        elif self.opts.flac:
            if not _ver(codecs, 'EA.*(flac)'):
                raise RuntimeError('FFmpeg does not support FLAC.')
        elif self.opts.vp8:
            if not _ver(codecs, '--enable-(libvorbis)'):
                raise RuntimeError('FFmpeg does not support libvorbis.')
        elif not _ver(codecs, '--enable-(libfaac)'):
            raise RuntimeError('FFmpeg does not support libfaac. ' +
                               '(Perhaps try using neroAacEnc?)')
        if not self.source.meta_present:
            self.metadata.enabled = False
    
    def _extract(self, clip, elapsed):
        '''Creates a new chapter marker at elapsed and uses ffmpeg to extract
        an MPEG-TS video clip from clip[0] to clip[1]. All values are
        in seconds.'''
        logging.info('*** Extracting segment %d: [%s - %s] ***' %
                     (self.seg, _seconds_to_time(clip[0]),
                      _seconds_to_time(clip[1])))
        self.chapters.add(elapsed, self.seg)
        args = ['ffmpeg', '-y', '-i', self.source.orig, '-ss', str(clip[0]),
                '-t', str(clip[1] - clip[0])] + self.source.split_args[0]
        split = '%s-%d.ts' % (self.source.base, self.seg)
        args += [split]
        self._split += [split]
        if len(self.source.split_args) > 1:
            args += self.source.split_args[1]
        _cmd(args)
        self.seg += 1
        self.subtitles.mark(clip[1])
    
    def split(self):
        '''Uses the source's cutlist to mark specific video clips from the
        source video for extraction while also setting chapter markers.'''
        (pos, elapsed) = (0, 0)
        for cut in self.source.cutlist:
            (start, end) = cut
            if start > self.opts.thresh and start > pos:
                self._extract((pos, start), elapsed)
                # Only count time that was actually extracted.
                elapsed += start - pos
            pos = end
        if pos < self.source.duration - self.opts.thresh:
            self._extract((pos, self.source.duration), elapsed)
            elapsed += self.source.duration - pos
            pos = self.source.duration
        self.chapters.add(elapsed, None)
    
    def join(self):
        '''Uses ffmpeg's concat: protocol to rejoin the previously split
        video clips, and then extracts subtitles from the resulting video.'''
        logging.info('*** Joining video to %s ***' % self._join)
        self.subtitles.clean_tmp()
        concat = 'concat:'
        for seg in xrange(0, self.seg):
            name = '%s-%d.ts' % (self.source.base, seg)
            if os.path.exists(name):
                concat += '%s|' % name
            else:
                raise RuntimeError('Could not find video segment %s.' % name)
        concat = concat[:-1]
        args = ['ffmpeg', '-y', '-i', concat] + self.source.split_args[0]
        args += [self._join]
        if len(self.source.split_args) > 1:
            args += self.source.split_args[1]
        _cmd(args)
        self.subtitles.extract(self._join)
        self.subtitles.adjust()
    
    def _find_streams(self):
        '''Locates the PID numbers of the video and audio streams to be encoded
        using ffmpeg.'''
        (vstreams, astreams) = ([], [])
        stream = 'Stream.*\[0x([0-9a-fA-F]+)\].*'
        videoRE = re.compile(stream + 'Video:.*\s+([0-9]+)x([0-9]+)')
        audioRE = re.compile(stream + ':\s*Audio')
        proc = subprocess.Popen(['ffmpeg', '-i', self._join],
                                stdout = subprocess.PIPE,
                                stderr = subprocess.STDOUT)
        for line in proc.stdout:
            match = re.search(videoRE, line)
            if match:
                enabled = False
                if re.search('Video:.*\(Main\)', line):
                    logging.debug('Found video stream 0x%s' %
                                  match.group(1))
                    enabled = True
                vstreams += [(int(match.group(1), 16), enabled)]
            match = re.search(audioRE, line)
            if match:
                enabled = False
                pid = int(match.group(1), 16)
                lang = re.search('\]\(([A-Za-z]+)\)', line)
                if lang:
                    if lang.group(1) == _iso_639_2(self.opts.language):
                        logging.debug('Found audio stream 0x%x' % pid)
                        enabled = True
                astreams += [(pid, enabled)]
        proc.wait()
        if len(vstreams) == 0:
            raise RuntimeError('No video streams could be found.')
        if len(astreams) == 0:
            raise RuntimeError('No audio streams could be found.')
        if len([vid[0] for vid in vstreams if vid[1]]) == 0:
            logging.debug('No detected video streams, enabling %s' %
                          hex(vstreams[0][0]))
            vstreams[0] = (vstreams[0][0], True)
        if len([aud[0] for aud in astreams if aud[1]]) == 0:
            logging.debug('No detected audio streams, enabling %s' %
                          hex(astreams[0][0]))
            astreams[0] = (astreams[0][0], True)
        return vstreams, astreams
    
    def _find_demux(self):
        'Uses the Project-X log to obtain the separated video / audio files.'
        (vstreams, astreams) = self._find_streams()
        with open('%s_log.txt' % self._demux, 'r') as log:
            videoRE = re.compile('Video: PID 0x([0-9A-Fa-f]+)')
            audioRE = re.compile('Audio: PID 0x([0-9A-Fa-f]+)')
            fileRE = re.compile('\'(.*)\'\s*$')
            (curr_v, curr_a) = (1, 1)
            (found_v, found_a) = (0, 0)
            (targ_v, targ_a) = (0, 0)
            for line in log:
                match = re.search(videoRE, line)
                if match:
                    found_v += 1
                    pid = int(match.group(1), 16)
                    for vid in vstreams:
                        if vid[0] == pid and vid[1]:
                            targ_v = found_v
                match = re.search(audioRE, line)
                if match:
                    found_a += 1
                    pid = int(match.group(1), 16)
                    for aud in astreams:
                        if aud[0] == pid and aud[1]:
                            targ_a = found_a
                if re.match('\.Video ', line):
                    match = re.search(fileRE, line)
                    if match:
                        self._demuxed += [match.group(1)]
                        if curr_v == targ_v:
                            self._demux_v = match.group(1)
                    curr_v += 1
                elif re.match('Audio \d', line):
                    match = re.search(fileRE, line)
                    if match:
                        self._demuxed += [match.group(1)]
                        if curr_a == targ_a:
                            self._demux_a = match.group(1)
                    curr_a += 1
    
    def demux(self):
        '''Invokes Project-X to clean noise from raw MPEG-2 video capture data,
        split the media along previously specified cutpoints and combine into
        one file, and then extract each media stream (video / audio) from the
        container into separate raw data files.'''
        logging.info('*** Demuxing video ***')
        name = os.path.split(self._demux)[-1]
        try:
            _cmd(['java', '-jar', self.opts.projectx, '-out', self.opts.tmp,
                  '-name', name, '-demux', self._join])
        except RuntimeError:
            raise RuntimeError('Could not demux video.')
        self._find_demux()
        if self._demux_v is None or not os.path.exists(self._demux_v):
            raise RuntimeError('Could not locate demuxed video stream.')
        if self._demux_a is None or not os.path.exists(self._demux_a):
            raise RuntimeError('Could not locate demuxed audio stream.')
        logging.debug('Demuxed video: %s' % self._demux_v)
        logging.debug('Demuxed audio: %s' % self._demux_a)
    
    def _adjust_res(self):
        '''Adjusts the resolution of the transcoded video to be exactly the
        desired target resolution (if one is chosen), preserving aspect ratio
        by padding extra space with black pixels.'''
        size = []
        res = self.source.resolution
        target = self.opts.resolution
        if target is not None:
            aspect = res[0] * 1.0 / res[1]
            if aspect > target[0] * 1.0 / target[1]:
                vres = int(round(target[0] / aspect))
                if vres % 2 == 1:
                    vres += 1
                pad = (target[1] - vres) / 2
                size = ['-s', '%dx%d' % (target[0], vres), '-vf',
                        'pad=%d:%d:0:%d:black' % (target[0], target[1], pad)]
            else:
                hres = int(round(target[1] * aspect))
                if hres % 2 == 1:
                    hres += 1
                pad = (target[0] - hres) / 2
                size = ['-s', '%dx%d' % (hres, target[1]), '-vf',
                        'pad=%d:%d:%d:0:black' % (target[0], target[1], pad)]
        return size
    
    def encode_video(self):
        'Invokes ffmpeg to transcode the video stream to H.264 or VP8.'
        prof = []
        (fmt, codec, target) = ('', '', '')
        if self.opts.preset:
            prof = ['-vpre', self.opts.preset]
        if self.opts.ipod:
            (fmt, codec, target) = ('ipod', 'libx264', self._h264)
        elif self.opts.webm:
            (fmt, codec, target) = ('webm', 'libvpx', self._vp8)
        elif self.opts.vp8:
            (fmt, codec, target) = ('matroska', 'libvpx', self._vp8)
        else:
            (fmt, codec, target) = ('mp4', 'libx264', self._h264)
        _clean(target)
        size = self._adjust_res()
        common = ['ffmpeg', '-y', '-i', self._demux_v, '-vcodec', codec,
                  '-an', '-threads', 0, '-f', fmt] + prof + size
        if self.opts.two_pass:
            logging.info(u'*** Encoding video to %s - first pass ***' %
                         self.opts.tmp)
            common += ['-vb', '%dk' % self.opts.video_br]
            _cmd(common + ['-pass', 1, os.devnull],
                 cwd = self.opts.tmp)
            logging.info(u'*** Encoding video to %s - second pass ***' %
                         target)
            _cmd(common + ['-pass', 2, target],
                 cwd = self.opts.tmp)
        else:
            logging.info(u'*** Encoding video to %s ***' % target)
            _cmd(common + ['-crf', self.opts.video_crf, target])
        self.video = target
    
    def encode_audio(self):
        '''Invokes ffmpeg or neroAacEnc to transcode the audio stream to
        AAC, Vorbis or FLAC.'''
        (fmt, codec, target) = ('', '', '')
        if self.opts.flac:
            (fmt, codec, target) = ('flac', 'flac', self._flac)
        elif self.opts.vp8:
            (fmt, codec, target) = ('ogg', 'libvorbis', self._ogg)
        else:
            (fmt, codec, target) = ('aac', 'libfaac', self._aac)
        _clean(target)
        logging.info(u'*** Encoding audio to %s ***' % target)
        if fmt == 'aac' and self.opts.nero:
            _cmd(['ffmpeg', '-y', '-i', self._demux_a, '-vn', '-acodec',
                  'pcm_s16le', '-f', 'wav', self._wav])
            _cmd(['neroAacEnc', '-q', self.opts.audio_q, '-if', self._wav,
                  '-of', target])
        else:
            _cmd(['ffmpeg', '-y', '-i', self._demux_a, '-vn', '-acodec',
                  codec, '-ab', '%dk' % self.opts.audio_br, '-f', fmt,
                  target])
        self.audio = target
    
    def clean_video(self):
        'Removes the temporary video stream data.'
        _clean(self._demux_v)
    
    def clean_audio(self):
        'Removes the temporary audio stream data.'
        _clean(self._demux_a)
    
    def clean_tmp(self):
        'Removes any temporary files generated during encoding.'
        self.clean_video()
        self.clean_audio()
        for split in self._split:
            _clean(split)
        for demux in self._demuxed:
            _clean(demux)
        _clean(self._join)
        _clean(self._wav)
        _clean(self.video)
        _clean(self.audio)
        for log in ['ffmpeg2pass-0.log', 'x264_2pass.log',
                    'x264_2pass.log.mbtree', '%s_log.txt' % self._demux,
                    '%s_log.txt' % self.source.base]:
            _clean(os.path.join(self.opts.tmp, log))

class MP4Transcoder(Transcoder):
    'Remuxes and finalizes MPEG-4 media files.'
    
    def __init__(self, source, opts):
        Transcoder.__init__(self, source, opts)
        self.subtitles = MP4Subtitles(source)
        self.chapters = MP4Chapters(source)
        self.metadata = MP4Metadata(source)
        self.check()
    
    def check(self):
        'Determines whether MP4Box is installed.'
        if not _ver(['MP4Box', '-version'], 'version\s*([0-9.DEV]+)'):
            raise RuntimeError('MP4Box is not installed.')
        if not self.source.meta_present:
            self.metadata.enabled = False
    
    def remux(self):
        '''Invokes MP4Box to combine the audio, video and subtitle streams
        into the MPEG-4 target file, also embedding chapter data and
        metadata.'''
        logging.info(u'*** Remuxing to %s ***' % self.source.mp4)
        self.source.make_final_dir()
        _clean(self.source.mp4)
        common = ['MP4Box', '-tmp', self.opts.final_path]
        _cmd(common + ['-new', '-add', '%s#video:name=Video' % self.video,
                       '-add', '%s#audio:name=Audio' % self.audio,
                       self.source.mp4])
        _cmd(common + ['-isma', '-hint', self.source.mp4])
        self.subtitles.write()
        self.chapters.write()
        _cmd(common + ['-lang', self.opts.language, self.source.mp4])
        self.metadata.write(_version(self.opts))
    
    def clean_tmp(self):
        'Removes any temporary files generated during encoding.'
        Transcoder.clean_tmp(self)
        self.subtitles.clean_tmp()
        self.chapters.clean_tmp()
        self.metadata.clean_tmp()

class MKVTranscoder(Transcoder):
    'Remuxes and finalizes Matroska media files.'
    
    def __init__(self, source, opts):
        Transcoder.__init__(self, source, opts)
        self.subtitles = MKVSubtitles(source)
        self.chapters = MKVChapters(source)
        self.metadata = MKVMetadata(source)
        self.check()
    
    def check(self):
        'Determines whether mkvmerge is installed.'
        if not _ver(['mkvmerge', '--version'], '\sv*([0-9.]+)'):
            raise RuntimeError('mkvmerge is not installed.')
        if not self.source.meta_present:
            self.metadata.enabled = False
    
    def remux(self):
        '''Invokes mkvmerge to combine the audio, video and subtitle streams
        into the Matroska target file, also embedding chapter data and
        metadata.'''
        logging.info(u'*** Remuxing to %s ***' % self.source.mkv)
        self.source.make_final_dir()
        _clean(self.source.mkv)
        common = ['--no-chapters', '-B', '-T', '-M', '--no-global-tags']
        args = ['mkvmerge']
        args += ['--default-language', _iso_639_2(self.opts.language)]
        args += common + ['-A', '-S', '--track-name', '0:Video', self.video]
        args += common + ['-D', '-S', '--track-name', '0:Audio', self.audio]
        subs = self.subtitles.write()
        if len(subs) > 0:
            args += common + ['-A', '-D'] + subs
        args += self.chapters.write()
        args += self.metadata.write(_version(self.opts))
        if self.opts.webm:
            args += ['--webm']
        args += ['-o', self.source.mkv]
        _cmd(args)
    
    def clean_tmp(self):
        'Removes any temporary files generated during encoding.'
        Transcoder.clean_tmp(self)
        self.subtitles.clean_tmp()
        self.chapters.clean_tmp()
        self.metadata.clean_tmp()

class Source(dict):
    '''Acts as a base class for various raw video sources and handles
    Tvdb metadata.'''
    fps = None
    resolution = None
    duration = None
    cutlist = None
    vstreams = 0
    astreams = 0
    base = None
    orig = None
    rating = None
    final = None
    mp4 = None
    mkv = None
    meta_present = False
    split_args = None
    
    def __repr__(self):
        season = int(self.get('season', 0))
        episode = int(self.get('episode', 0))
        prodcode = self.get('syndicatedepisodenumber', '(?)')
        show = self.get('title')
        name = self.get('subtitle')
        s = ''
        if show and name:
            if season > 0 and episode > 0:
                s = u'%s %02dx%02d - %s' % (show, season, episode, name)
            else:
                s = u'%s %s - %s' % (show, prodcode, name)
        else:
            s = self.base
        return u'<Source \'%s\' at %s>' % (s, hex(id(self)))
    
    def __init__(self, opts):
        self.opts = opts
        if opts.tmp:
            self.remove_tmp = False
        else:
            opts.tmp = tempfile.mkdtemp(prefix = u'transcode_')
            self.remove_tmp = True
        if opts.ipod:
            self.mp4ext = 'm4v'
        else:
            self.mp4ext = 'mp4'
        self.tvdb = MythTV.ttvdb.tvdb_api.Tvdb(language = self.opts.language)
    
    def _check_split_args(self):
        '''Determines the arguments to pass to FFmpeg when copying video data
        during splits or frame queries. Versions 0.7 and older use -newaudio /
        -newvideo tags for each additional stream present. Versions 0.8 and
        newer use -map 0:v -map 0:a to automatically copy over all streams.'''
        match = re.search('[Ff]+mpeg\s+(.*)$', _ffmpeg_ver())
        if not match:
            raise RuntimeError('FFmpeg version could not be determined.')
        ver = match.group(1)
        match = re.match('^([0-9]+)\.([0-9]+)', ver)
        # Compare numerically so that 0.10 is not mistaken for 0.1.
        if match and (int(match.group(1)), int(match.group(2))) <= (0, 7):
            args = [['-acodec', 'copy', '-vcodec', 'copy',
                     '-f', 'mpegts']]
            for astream in xrange(1, self.astreams):
                args += [['-acodec', 'copy', '-newaudio']]
            for vstream in xrange(1, self.vstreams):
                args += [['-vcodec', 'copy', '-newvideo']]
        else:
            args = [['-map', '0:v', '-map', '0:a', '-c', 'copy',
                     '-f', 'mpegts']]
        self.split_args = args
    
    def video_params(self):
        '''Obtains source media parameters such as resolution and FPS
        using ffmpeg.'''
        (fps, resolution, duration) = (None, None, None)
        (vstreams, astreams) = (0, 0)
        stream = 'Stream.*\[0x[0-9a-fA-F]+\].*'
        fpsRE = re.compile('([0-9]+(?:\.[0-9]+)?) tbr')
        videoRE = re.compile(stream + 'Video:.*\s+([0-9]+)x([0-9]+)')
        audioRE = re.compile(stream + ':\s*Audio')
        duraRE = re.compile('Duration: ([0-9]+):([0-9]+):([0-9]+)\.([0-9]+)')
        try:
            proc = subprocess.Popen(['ffmpeg', '-i', self.orig],
                                    stdout = subprocess.PIPE,
                                    stderr = subprocess.STDOUT)
            for line in proc.stdout:
                match = re.search(fpsRE, line)
                if match:
                    fps = float(match.group(1))
                match = re.search(videoRE, line)
                if match:
                    vstreams += 1
                    if resolution is None:
                        width = int(match.group(1))
                        height = int(match.group(2))
                        resolution = (width, height)
                match = re.search(audioRE, line)
                if match:
                    astreams += 1
                match = re.search(duraRE, line)
                if match:
                    hour = int(match.group(1))
                    minute = int(match.group(2))
                    sec = int(match.group(3))
                    frac = int(match.group(4))
                    duration = hour * 3600 + minute * 60 + sec
                    duration += frac / (10. ** len(match.group(4)))
            proc.wait()
        except OSError:
            raise RuntimeError('FFmpeg is not installed.')
        if vstreams == 0:
            raise RuntimeError('No video streams could be found.')
        if astreams == 0:
            raise RuntimeError('No audio streams could be found.')
        self._check_split_args()
        return fps, resolution, duration, vstreams, astreams
    
    def parse_resolution(self, res):
        '''Translates the user-specified target resolution string into a
        width/height tuple using predefined resolution names like '1080p',
        or aspect ratios like '4:3', and so on.'''
        if res:
            predefined = {'480p' : (640, 480), '480p60' : (720, 480),
                          '720p' : (1280, 720), '1080p' : (1920, 1080)}
            if res in predefined:
                return predefined[res]
            match = re.match('(\d+)x(\d+)', res)
            if match:
                return (int(match.group(1)), int(match.group(2)))
            match = re.match('(\d+):(\d+)', res)
            if match:
                aspect = float(match.group(1)) / float(match.group(2))
                h = self.resolution[1]
                w = int(round(h * aspect))
                return (w, h)
            if res == 'close' or res == 'closest':
                (w, h) = self.resolution
                # Pick the predefined resolution nearest to the source size.
                dist = lambda val: (val[0] - w) ** 2 + (val[1] - h) ** 2
                return min(predefined.itervalues(), key = dist)
            logging.warning('*** Invalid resolution - %s ***' % res)
        return None
    
    def _align_episode(self):
        '''Returns a string for the episode number padded by as many zeroes
        needed so that the filename for the last episode in the season will
        align perfectly with this filename. Usually a field width of two,
        since most seasons tend to have fewer than 100 episodes.'''
        ep = self.get('episode')
        if ep is None:
            return ''
        if self.get('episodecount') is None:
            return str(ep)
        # Pad to the number of digits in the season's episode count.
        field = len(str(int(self['episodecount'])))
        return '%0*d' % (field, int(ep))
    
    def _find_episode(self, show):
        'Searches Tvdb for the episode using the original air date and title.'
        episodes = None
        airdate = self.get('originalairdate')
        subtitle = self.get('subtitle')
        if airdate is not None:
            episodes = show.search(airdate, key = 'firstaired')
        if not episodes and subtitle is not None:
            episodes = show.search(subtitle, key = 'episodename')
        if not episodes:
            return None
        if len(episodes) == 1:
            return episodes[0]
        for ep in episodes:
            air = ep.get('firstaired')
            if air:
                date = datetime.datetime.strptime(air, '%Y-%m-%d').date()
                if airdate == date:
                    return ep
        if subtitle is not None:
            for ep in episodes:
                st = ep.get('episodename')
                if st and (subtitle.find(st) >= 0 or
                           st.find(subtitle) >= 0):
                    return ep
        return None
    
    def fetch_tvdb(self):
        'Obtains missing metadata through Tvdb if episode is found.'
        if not self.get('title'):
            return
        try:
            show = self.tvdb[self.get('title')]
            self['seasoncount'] = len(show)
            if show.has_key(0):
                self['seasoncount'] -= 1
            ep = self._find_episode(show)
            logging.debug('Tvdb episode: %s' % ep)
            if ep:
                if self.get('subtitle') is None:
                    self['subtitle'] = ep.get('episodename')
                if ep.get('episodenumber') is not None:
                    self['episode'] = int(ep.get('episodenumber'))
                elif ep.get('combined_episodenumber') is not None:
                    self['episode'] = int(ep.get('combined_episodenumber'))
                season = ep.get('seasonnumber')
                if season is None:
                    season = ep.get('combined_season')
                if season is not None:
                    self['episodecount'] = len(show[int(season)])
                    self['season'] = int(season)
                if (self.get('season') is not None and
                    self.get('episode') is not None):
                    if self.get('syndicatedepisodenumber') is None:
                        self['syndicatedepisodenumber'] = '%d%s' % \
                            (self['season'], self._align_episode())
                overview = ep.get('overview')
                if self.get('description') is None:
                    self['description'] = overview
                elif overview is not None:
                    if self.opts.use_tvdb_descriptions:
                        if len(self['description']) < len(overview):
                            self['description'] = overview
                rating = ep.get('rating')
                if rating is not None:
                    self['popularity'] = int(float(rating) / 10 * 255)
                    if (self.opts.use_tvdb_rating and
                        self.get('description') is not None):
                        self['description'] += ' (%s / 10)' % rating
                filename = ep.get('filename')
                if filename is not None and len(filename) > 0:
                    ext = os.path.splitext(filename)[-1]
                    art = self.base + ext
                    try:
                        urllib.urlretrieve(filename, art)
                        self['albumart'] = art
                    except IOError:
                        logging.warning('*** Unable to download ' +
                                        'episode screenshot ***')
        except MythTV.ttvdb.tvdb_exceptions.tvdb_shownotfound:
            logging.warning('*** Unable to fetch Tvdb listings for show ***')
    
    def sort_credits(self):
        '''Sorts the list of credited actors, directors and such using the
        last name first.'''
        if self.get('credits'):
            key = lambda person: u'%s %s' % (person[1],
                                             _last_name_first(person[0]))
            self['credits'] = sorted(self.get('credits'), key = key)
    
    def print_metadata(self):
        'Outputs any metadata obtained to the log.'
        if not self.meta_present:
            return
        logging.info('*** Printing file metadata ***')
        for key in self.keys():
            if key != 'credits':
                logging.info(u'  %s: %s' % (key, self[key]))
        cred = self.get('credits')
        if cred:
            logging.info('  Credits:')
            for person in cred:
                name = person[0]
                role = re.sub('_', ' ', person[1]).lower()
                logging.info(u'    %s (%s)' % (name, role))
        else:
            logging.info('  No credits')
    
    def print_options(self):
        'Outputs the user-selected options to the log.'
        logging.info('*** Printing configuration ***')
        (fmt, audio, video, ext) = '', '', '', ''
        if self.opts.webm:
            (fmt, ext) = 'WebM', 'webm'
        elif self.opts.ipod:
            (fmt, ext) = 'iPod Touch compatible MPEG-4', 'm4v'
        elif self.opts.matroska:
            (fmt, ext) = 'Matroska', 'mkv'
        else:
            (fmt, ext) = 'MPEG-4', 'mp4'
        if self.opts.vp8:
            video = 'On2 VP8'
        else:
            video = 'H.264 AVC'
        if self.opts.flac:
            audio = 'FLAC'
        elif self.opts.vp8:
            audio = 'Ogg Vorbis'
        elif self.opts.nero:
            audio = 'AAC (neroAacEnc)'
        else:
            audio = 'AAC (libfaac)'
        logging.info('  Format: %s, %s, %s' % (fmt, video, audio))
        enc = '  Video options:'
        if self.opts.preset:
            enc += ' preset \'%s\',' % self.opts.preset
        if self.opts.resolution:
            enc += ' resolution %dx%d,' % self.opts.resolution
        if self.opts.two_pass:
            enc += ' two-pass, br: %dk' % self.opts.video_br
        else:
            enc += ' one-pass, crf: %d' % self.opts.video_crf
        logging.info(enc)
        logging.info('  Source file: %s' % self.orig)
        logging.info('  Target file: %s.%s' % (self.final, ext))
        logging.info('  Temporary directory: %s' % self.opts.tmp)
    
    def final_name(self):
        '''Obtains the filename of the target MPEG-4 file using the format
        string. Formatting is (mostly) compatible with Recorded.formatPath()
        from dataheap.py, and the format used in mythrename.pl.'''
        final = os.path.join(self.opts.final_path,
                             os.path.split(self.base)[-1])
        if self.meta_present:
            path = self.opts.format
            tags = [('%T', 'title'), ('%S', 'subtitle'), ('%R', 'description'),
                    ('%C', 'category'), ('%n', 'syndicatedepisodenumber'),
                    ('%s', 'season'), ('%E', 'episode'), ('%r', 'rating')]
            for tag, key in tags:
                if self.get(key) is not None:
                    val = unicode(self.get(key))
                    val = val.replace('/', self.opts.replace_char)
                    if os.name == 'nt':
                        val = val.replace('\\', self.opts.replace_char)
                    path = path.replace(tag, val)
                else:
                    path = path.replace(tag, '')
            if self.get('episode') is not None:
                path = path.replace('%e', self._align_episode())
            else:
                path = path.replace('%e', '')
            tags = [('%oy', '%y'), ('%oY', '%Y'), ('%on', '%m'),
                    ('%om', '%m'), ('%oj', '%d'), ('%od', '%d')]
            airdate = self.get('originalairdate')
            for tag, part in tags:
                if airdate is not None:
                    val = unicode(airdate.strftime(part))
                    path = path.replace(tag, val)
                else:
                    path = path.replace(tag, '')
            path = path.replace('%-', '-')
            path = path.replace('%%', '%')
            path = _sanitize(path, self.opts.replace_char)
            final = os.path.join(self.opts.final_path, *path.split('/'))
        return final
    
    def make_final_dir(self):
        'Creates the directory for the target MPEG-4 file.'
        path = os.path.dirname(self.final)
        if not os.path.isdir(path):
            os.makedirs(path, 0755)
    
    def clean_copy(self):
        'Removes the copied raw MPEG-2 video data.'
        _clean(self.orig)
    
    def clean_tmp(self):
        'Removes any temporary files created.'
        self.clean_copy()
        art = self.get('albumart')
        if art:
            _clean(art)
        if self.remove_tmp and os.path.isdir(self.opts.tmp):
            shutil.rmtree(self.opts.tmp)

class MythSource(Source):
    '''Obtains the raw MPEG-2 video data from a MythTV database along with
    metadata and a commercial-skip cutlist.'''
    prog = None
    
    class _Rating(MythTV.DBDataRef):
        'Query for the content rating within the MythTV database.'
        _table = 'recordedrating'
        _ref = ['chanid', 'starttime']
    
    def __init__(self, channel, time, opts):
        Source.__init__(self, opts)
        self.channel = channel
        self.time = _convert_time(time)
        self.base = os.path.join(opts.tmp, '%s_%s' % (str(channel), str(time)))
        self.orig = self.base + '-orig.mpg'
        self.db_info = {'DBHostName' : opts.host, 'DBName' : opts.database,
                        'DBUserName' : opts.user, 'DBPassword' : opts.password,
                        'SecurityPin' : opts.pin}
        self.db = MythTV.MythDB(**self.db_info)
        try:
            self.rec = MythTV.Recorded((channel, time), db = self.db)
        except MythTV.exceptions.MythError:
            err = 'Could not find recording at channel %d and time %s.'
            raise ValueError(err % (channel, time))
        try:
            self.prog = self.rec.getRecordedProgram()
            self.meta_present = True
        except MythTV.exceptions.MythError:
            logging.warning('*** No MythTV program data, ' +
                            'metadata disabled ***')
        self.rating = self._Rating(self.rec._wheredat, self.db)
        self._fetch_metadata()
        self.final = self.final_name()
        self.mp4 = '%s.%s' % (self.final, self.mp4ext)
        if self.opts.webm:
            self.mkv = self.final + '.webm'
        else:
            self.mkv = self.final + '.mkv'
    
    def _frame_to_timecode(self, frame):
        '''Uses ffmpeg to remux a given number of frames in the video file
        in order to determine the amount of time elapsed by those frames.'''
        time = 0
        args = ['ffmpeg', '-y', '-i', self.orig, '-vframes',
                str(frame)] + self.split_args[0]
        args += [os.devnull]
        for extra in self.split_args[1:]:
            args += extra
        proc = subprocess.Popen(args, stdout = subprocess.PIPE,
                                stderr = subprocess.STDOUT)
        logging.debug('$ %s' % u' '.join(args))
        regex = re.compile('time=(\d\d):(\d\d):(\d\d\.\d\d)(?!.*time)')
        for line in proc.stdout:
            match = re.search(regex, line)
            if match:
                time = 3600 * int(match.group(1)) + 60 * int(match.group(2))
                time += float(match.group(3))
        proc.wait()
        return time
    
    def _cut_list(self):
        'Obtains the MythTV commercial-skip cutlist from the database.'
        logging.info('*** Locating cut points ***')
        markup = self.rec.markup.getcutlist()
        self.cutlist = []
        for cut in xrange(0, len(markup)):
            start = self._frame_to_timecode(markup[cut][0])
            end = self._frame_to_timecode(markup[cut][1])
            self.cutlist.append((start, end))
    
    def _fetch_metadata(self):
        'Obtains any metadata MythTV might have stored in the database.'
        if not self.meta_present:
            return
        logging.info('*** Fetching metadata for %s ***' % self.base)
        channel = None
        ch = MythTV.Channel(db = self.db)
        for chan in ch._fromQuery("", (), db = self.db):
            if chan.get('chanid') == self.rec.get('chanid'):
                channel = chan
        if channel:
            self['channel'] = channel.get('name')
        for key in ['title', 'subtitle', 'description', 'category',
                    'originalairdate', 'syndicatedepisodenumber']:
            val = self.prog.get(key)
            if val is not None:
                self[key] = val
        if self.rec.get('originalairdate') is None: # MythTV bug
            self.rec['originalairdate'] = self.time
        for item in self.rating:
            self['rating'] = item.get('rating')
        cred = None
        for person in self.rec.cast:
            if cred is None:
                cred = []
            cred.append((person['name'], person['role']))
        self['credits'] = cred
        self.sort_credits()
        self.fetch_tvdb()
    
    def copy(self):
        '''Copies the recording for the given channel ID and start time to the
        specified path, if enough space is available.'''
        bs = 4096 * 1024
        logging.info('*** Copying video to %s ***' % self.orig)
        source = self.rec.open()
        try:
            with open(self.orig, 'wb') as dest:
                data = source.read(bs)
                while len(data) > 0:
                    dest.write(data)
                    data = source.read(bs)
        finally:
            source.close()
        (self.fps, self.resolution, self.duration,
         self.vstreams, self.astreams) = self.video_params()
        if not self.fps or not self.resolution or not self.duration:
            raise RuntimeError('Could not determine video parameters.')
        self.opts.resolution = self.parse_resolution(self.opts.resolution)
        self._cut_list()

class WTVSource(Source):
    '''Obtains the raw MPEG-2 video data from a Windows TV recording (.WTV)
    along with embedded metadata.'''
    channel = None
    time = None
    orig = None
    
    def __init__(self, wtv, opts):
        Source.__init__(self, opts)
        self.wtv = wtv
        match = re.search('[^_]*_([^_]*)_(\d\d\d\d)_(\d\d)_(\d\d)_' +
                          '(\d\d)_(\d\d)_(\d\d)\.[Ww][Tt][Vv]', wtv)
        if match:
            self.channel = match.group(1)
            t = ''
            for g in xrange(2, 8):
                t += match.group(g)
            self.time = _convert_time(long(t))
            self.base = os.path.join(opts.tmp, '%s_%s' % (self.channel, t))
        else:
            f = os.path.split(wtv)[-1]
            self.base = os.path.join(opts.tmp, os.path.splitext(f)[0])
        self.orig = self.base + '-orig.ts'
        self._fetch_metadata()
        self.final = self.final_name()
        self.mp4 = '%s.%s' % (self.final, self.mp4ext)
        self.mkv = self.final + '.mkv'
    
    def _cut_list(self):
        '''Obtains a commercial-skip cutlist from previously generated
        Comskip output, if it exists.'''
        self.cutlist = []
        base = os.path.splitext(self.wtv)[0]
        comskip = base + '.txt'
        if os.path.exists(comskip):
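            # Each Comskip line holds a start frame and an end frame.
            framesRE = re.compile(r'^(\d+)\s+(\d+)\s*$')
            with open(comskip, 'r') as text:
                for line in text:
                    match = framesRE.search(line)
                    if match:
                        # Convert frame numbers to seconds; float() guards
                        # against integer division under Python 2.
                        start = int(match.group(1)) / float(self.fps)
                        end = int(match.group(2)) / float(self.fps)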
                        self.cutlist.append((start, end))
    
    def _parse_genre(self, genre):
        'Translates the WTV genre line into a category tag.'
        self['category'] = genre.split(';')[0]
    
    def _parse_airdate(self, airdate):
        'Translates the UTC timecode for original airdate into a date object.'
        try:
            d = datetime.datetime.strptime(airdate, '%Y-%m-%dT%H:%M:%SZ')
            self['originalairdate'] = d.date()
        except ValueError:
            pass
    
    def _parse_credits(self, line):
        'Translates the WTV credits line into a list of credited people.'
        cred = None
        people = line.split(';')
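        roles = ['actor', 'director', 'host', 'guest_star']
        # zip() stops at the shorter sequence, so a credits line with fewer
        # than four ';'-separated fields is handled safely.
        for names, role in zip(people, roles):
            for person in names.split('/'):
                if person != '':
                    if cred is None:
                        cred = []
                    cred.append((person, role))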
        self['credits'] = cred
    
    def _fetch_metadata(self):
        'Obtains any metadata which might be embedded in the WTV.'
        logging.info('*** Fetching metadata for %s ***' % self.wtv)
        val = r'\s*:\s*(.*)$'
        tags = []
        tags.append((re.compile('service_provider' + val), 'channel'))
        tags.append((re.compile('service_name' + val), 'channel'))
        tags.append((re.compile(r'\s+Title' + val), 'title'))
        tags.append((re.compile('WM/SubTitle' + val), 'subtitle'))
        tags.append((re.compile('WM/SubTitleDescription' + val),
                     'description'))
        tags.append((re.compile('genre' + val), self._parse_genre))
        tags.append((re.compile('WM/MediaOriginalBroadcastDateTime' + val),
                     self._parse_airdate))
        tags.append((re.compile('WM/ParentalRating' + val), 'rating'))
        tags.append((re.compile('WM/MediaCredits' + val), self._parse_credits))
        try:
            proc = subprocess.Popen(['ffmpeg', '-i', self.wtv],
                                    stdout = subprocess.PIPE,
                                    stderr = subprocess.STDOUT)
            for line in proc.stdout:
                for tag in tags:
                    match = re.search(tag[0], line)
                    if match:
                        self.meta_present = True
                        if isinstance(tag[1], str):
                            self[tag[1]] = match.group(1).strip()
                        else:
                            tag[1](match.group(1).strip())
            proc.wait()
        except OSError:
            raise RuntimeError('FFmpeg is not installed.')
        self.sort_credits()
        self.fetch_tvdb()
    
    def copy(self):
        'Extracts the MPEG-2 data from the WTV file.'
        logging.info('*** Extracting data to %s ***' % self.orig)
        try:
            _cmd(['java', '-cp', self.opts.remuxtool, 'util.WtvToMpeg',
                  '-i', self.wtv, '-o', self.orig, '-lang',
                  _iso_639_2(self.opts.language)])
        except RuntimeError:
            raise RuntimeError('Could not extract video.')
        self._fetch_metadata()
        (self.fps, self.resolution, self.duration,
         self.vstreams, self.astreams) = self.video_params()
        if not self.fps or not self.resolution or not self.duration:
            raise RuntimeError('Could not determine video parameters.')
        self.opts.resolution = self.parse_resolution(self.opts.resolution)
        self._cut_list()

if __name__ == '__main__':
    sys.stdout = codecs.getwriter('utf8')(sys.stdout)
    parser = _get_options()
    (opts, args) = parser.parse_args()
    _check_args(args, parser, opts)
    s = None
    if len(args) == 1:
        wtv = args[0]
        s = WTVSource(wtv, opts)
    else:
        channel = int(args[0])
        timecode = long(args[1])
        s = MythSource(channel, timecode, opts)
    s.copy()
    s.print_metadata()
    s.print_options()
    if opts.matroska:
        t = MKVTranscoder(s, opts)
    else:
        t = MP4Transcoder(s, opts)
    t.split()
    t.join()
    t.demux()
    s.clean_copy()
    t.encode_audio()
    t.clean_audio()
    t.encode_video()
    t.clean_video()
    t.remux()
    t.clean_tmp()
    s.clean_tmp()

# Copyright (c) 2012, Lucas Jacobs
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met: 
#
# 1. Redistributions of source code must retain the above copyright notice,
#    this list of conditions and the following disclaimer. 
# 2. Redistributions in binary form must reproduce the above copyright notice,
#    this list of conditions and the following disclaimer in the documentation
#    and/or other materials provided with the distribution. 
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
#
# The views and conclusions contained in the software and documentation are
# those of the authors and should not be interpreted as representing official
# policies, either expressed or implied, of the copyright holder.
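
The commercial-cutting step for WTV sources (WTVSource._cut_list in the listing above) reads pairs of frame numbers from a Comskip .txt file and divides them by the frame rate. The following is a minimal standalone sketch of that conversion, written in Python 3 syntax rather than the Python 2 of the script. The helper name parse_comskip and the sample lines are made up for illustration and are not part of transcode.py.

```python
import re

# Matches a line holding two whitespace-separated frame numbers,
# the same pattern used by WTVSource._cut_list in the listing above.
FRAMES_RE = re.compile(r'^(\d+)\s+(\d+)\s*$')

def parse_comskip(lines, fps):
    """Convert frame ranges into (start, end) times in seconds.

    `lines` is any iterable of text lines and `fps` is the frame rate.
    Hypothetical helper for illustration; not part of transcode.py.
    """
    cutlist = []
    for line in lines:
        match = FRAMES_RE.search(line)
        if match:
            start = int(match.group(1)) / float(fps)
            end = int(match.group(2)) / float(fps)
            cutlist.append((start, end))
    return cutlist

# Made-up sample data: header lines that do not match are skipped.
sample = [
    'FILE PROCESSING COMPLETE  54000 FRAMES AT  2500',
    '-------------------',
    '0\t750',
    '15000\t18000',
]
cuts = parse_comskip(sample, 25.0)
print(cuts)
```

Lines that do not consist of exactly two integers, such as the header, are ignored, so the result contains only the commercial ranges expressed in seconds.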


transcode.conf

# transcode.conf - configuration file for transcode.py
# copy to ~/.transcode or to same directory as transcode.py


# --- General options ---


# directory to store encoded video file
final_path = /srv/video

# temporary directory to use while encoding
# (if left blank, a temporary directory will be created)
tmp = 

# format string for the encoded video filename. some examples:
# '%T/%T - %S' -> 'Show/Show - Episode'
# '%C/%T/%o - %S' -> 'Genre/Show/1x23 - Episode'
# '%oY/%T (%sx%e) - %S' -> '2012/Show (1x23) - Episode'
format = %T/%T - %S

# character to substitute for invalid filename characters
# (if left blank, invalid characters will be removed)
replace_char = 

# two-letter language code (ISO 639-1) of the audio track to be obtained
language = en


# --- MythTV options ---


# MythTV database host (change this if using MythTV)
host = 127.0.0.1

# MySQL database for MythTV (usually 'mythconverg')
database = mythconverg

# MySQL username for MythTV (usually 'mythtv')
user = mythtv

# MySQL password for MythTV (usually 'mythtv')
password = mythtv

# MythTV security PIN (or 0000 if not set)
pin = 0000


# --- Video format options ---


# use the Matroska (MKV) container format
# rather than the MPEG-4 (MP4, M4V) container format
matroska = no

# use On2's VP8 codec and Ogg Vorbis rather than H.264 and AAC
vp8 = no

# use iPod Touch compatibility settings:
# - use the MPEG-4 format, H.264 and AAC
# - use the x264 profile specified by ipod_preset
# - use ffmpeg output format 'ipod'
# - resize video to ipod_resolution preserving aspect ratio
ipod = no

# encode WebM compliant video:
# - use the Matroska format, VP8 and Ogg Vorbis
# - use the vpx profile specified by webm_preset
# - use ffmpeg output format 'webm'
# - resize video to webm_resolution preserving aspect ratio
# - WebM does not allow any tags, chapters or subtitles
webm = no


# --- Video encoding options ---


# whether to use two-pass encoding
two_pass = no

# one-pass constant rate factor (CRF); lower values mean higher quality and
# larger files (about 15 to 25 is ideal)
video_crf = 22

# two-pass target video bitrate (in kbit/s)
video_br = 1500

# ffmpeg x264 preset file to use (empty = no preset)
h264_preset =

# ffmpeg libvpx preset file to use (empty = no preset)
vp8_preset = 720p

# target video resolution or aspect ratio, using one of these formats:
# '640x480' - width and height
# '480p', '480p60', '720p', '1080p' - predefined TV resolutions
# '16:9' - use the video height and this ratio to calculate the desired width
# 'closest' - use whatever standard resolution is closest to the source
# the aspect ratio of the source file will be preserved, and black padding on
# either the top and bottom or sides will be added if necessary
# if not specified, the video resolution is unchanged
resolution =

# ffmpeg x264 profile to use for iPod Touch compatibility
ipod_preset = ipod640

# target video resolution for iPod Touch compatibility
ipod_resolution = 480p

# ffmpeg vpx profile to use for WebM video
webm_preset = 720p

# target video resolution for WebM video
webm_resolution = 720p


# --- Audio encoding options ---


# whether to use neroAacEnc instead of libfaac (must be installed)
nero = yes

# whether to use FLAC audio (MKV only)
flac = no

# neroAacEnc audio quality ratio
audio_q = 0.3

# libfaac / libvorbis audio bitrate (in kbit/s)
audio_br = 128


# --- Metadata options ---


# whether to include the Tvdb episode rating (1 to 10) as voted by users
use_tvdb_rating = yes

# whether to prefer to use episode descriptions from Tvdb when available
use_tvdb_descriptions = no


# --- Miscellaneous options ---


# avoid printing to stdout
quiet = no

# print command output to stdout
verbose = no

# when removing commercials, ignore any short clips within this many seconds
# of the beginning or end of the video
thresh = 5

# path to the Project-X JAR file (used for noise cleaning / cutting)
projectx = project-x/ProjectX.jar

# path to remuxTool.jar (used for extracting MPEG-2 data from WTV files)
remuxtool = remuxTool.jar
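
The file above uses plain key = value lines with no section headers. The part of transcode.py that reads it is not shown in this section, so the following is only a sketch of one way such a file could be loaded with the Python 3 standard library, not necessarily what the script does. The function name load_conf and the section name transcode are assumptions made for the example.

```python
import configparser

def load_conf(text):
    """Parse section-less 'key = value' text into a mapping.

    Hypothetical helper for illustration; not part of transcode.py.
    """
    # interpolation=None keeps '%T' and '%S' in the format string literal.
    parser = configparser.ConfigParser(interpolation=None)
    # configparser requires a section header, so prepend a dummy one.
    parser.read_string('[transcode]\n' + text)
    return parser['transcode']

# A short excerpt in the same style as transcode.conf.
sample = """
# directory to store encoded video file
final_path = /srv/video
tmp =
format = %T/%T - %S
two_pass = no
video_crf = 22
"""

conf = load_conf(sample)
print(conf['format'])
print(conf.getboolean('two_pass'))
print(conf.getint('video_crf'))
```

Blank values such as tmp come back as empty strings, and yes / no values can be read with getboolean.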