Transcoding Preserving Captions

From MythTV Official Wiki

Author: Christopher Neufeld

Description: This framework supplies the means for a user job to cut a MythTV recording while retaining MythTV's ability to display closed captions. It also provides methods for transcoding the video to H.264 format, with a valid seek table, while preserving captions.

A common request is to transcode a recording without losing the closed caption data carried in the ivtv data stream. For technical reasons, this is a difficult problem. The technique outlined here preserves captions through a transcode not by keeping the caption data in the transcoded stream, but by extracting it into a .srt file, which MythTV automatically detects and makes available when caption display is enabled. Once the .srt file is produced, it can accompany the stream however it is transcoded, so conversion to H.264 format is also supported and still preserves the captions.
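The sidecar convention this relies on can be sketched in a few lines. The recording path below is illustrative, and the helper name caption_filename is ours, not part of MythTV; the only real constraint is that the .srt file shares the recording's basename with the extension swapped:

```python
# Illustration of the .srt sidecar naming MythTV detects: the same
# basename as the recording, with the container extension replaced.
def caption_filename(recording_path: str) -> str:
    """Swap the recording's extension for .srt."""
    base, dot, _ext = recording_path.rpartition(".")
    return (base if dot else recording_path) + ".srt"

print(caption_filename("/var/lib/mythtv/recordings/1021_20130103010000.mpg"))
# → /var/lib/mythtv/recordings/1021_20130103010000.srt
```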

This script honours the cutlist, doing exact cuts even when the cuts are not located on keyframes. It does this by forcing extra keyframes at the edges of the cuts during the transcode, then cutting between these new keyframes.
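The keyframe arithmetic can be sketched as follows. This is a simplified Python rendering of the frame-to-time conversion the module performs when building the ffmpeg -force_key_frames argument; it omits the HD-PVR PTS-offset handling done in the full module, and the frame ranges in the example are invented:

```python
# Convert cutlist frame ranges ("start-end,start-end") into the times at
# which extra keyframes are forced, so the later cut lands exactly on a
# keyframe.  Each time is nudged 5 ms early to guard against floating-point
# roundoff pushing it past the cut edge.
def forced_keyframe_times(cutlist: str, fps: float = 29.97) -> str:
    times = []
    for interval in cutlist.split(","):
        start, end = (int(x) for x in interval.split("-"))
        times.append(f"{start / fps - 0.005:.2f}")        # first frame of the cut
        times.append(f"{(end + 1) / fps - 0.005:.2f}")    # first frame after the cut
    return ",".join(times)

# e.g. cut frames 300-599 and 18000-19499 of a 29.97 fps recording
print(forced_keyframe_times("300-599,18000-19499"))
# → 10.01,20.02,600.60,650.65
```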

This technique makes use of the pull-captions.pl script shown at Closed_captioning, which in turn depends on the ccextractor project (see that page for details).

To use this script, first put the following .pm file into your Perl library path. The name I've used is MythXCode.pm, but you can rename that with a global search/replace if it collides with something in your namespace. To view the documentation for MythXCode.pm, simply run:

pod2man MythXCode.pm | groff -man -Tascii | less

Changes

2013-01-03:

  • Updated for compatibility with MythTV 0.25.2. Added the ability to use .srt files that exist at the time of invocation, for instance those produced from an HD-PVR recording using the technique at Captions_With_HD_PVR. There have been some API changes since the original version: one now calls $worker->prepare_captions() in place of $worker->cut_and_caption(); the latter function no longer exists.
  • We now use -qmax instead of forced bitrate values when generating the H.264 transcode.
  • We now use the default (High) profile, rather than Baseline.
  • We no longer support generating a lossless transcode preserving captions by omitting the transcode_to_h264() function call. That behaviour can easily be obtained by running the pull-captions.pl script and then transcoding through the usual interfaces; this script isn't needed for that.
  • Tested on PVR-500 (ivtv capture), HDHomerun3 OTA (ATSC captions), and HD-PVR.

2013-03-06:

  • Edited some documentation related to the debug level and testing for new database schema. The backend process no longer logs the output of user jobs, so the script must now be run from the command line instead, and the on-screen output examined.

Library Module

Now, here is MythXCode.pm:

#! /usr/bin/perl

# A class to handle transcoding, for people with unusual requirements.
#

package MythXCode;

=pod

=head1 NAME

MythXCode Perl package - Support for transcoding and preserving captions

=head1 SYNOPSIS

    use MythXCode;
    my $worker = MythXCode->new();
    $worker->set_db_parms("mythconverg", "mythtv", "myth-db-password");
    $worker->set_program_identifier($chanid, $starttime, $jobid);
    $worker->init();
    $worker->prepare_captions();
    $worker->transcode_to_h264();
    undef $worker;


=head1 DESCRIPTION

The C<MythXCode> package provides configurable support for certain
operations connected to transcoding files in MythTV.  Its primary
function is to preserve closed-caption data as recordings are
manipulated, for instance when cutting to remove commercials, or when
creating DVDs.  It has been tested on NTSC recordings with ivtv
captions (Hauppauge PVR-500), on North American ATSC digital OTA
captions (HDHomerun3), and on (Hauppauge) HD-PVR recordings.

This package allows the user to convert recordings to H.264 with the
cut pieces removed.  The resulting transcoded recordings will have
valid seek tables and their closed captioning data will be intact.

The MythXCode objects have only been tested while working on a single
recording.  If multiple recordings are to be processed, each must be
done in its own MythXCode object, rather than reusing the object.

This package depends on the F<pull-captions.pl> script to extract
captions from recordings with cutlists.  That script can be found at
L<http://www.mythtv.org/wiki/Closed_captioning>, and it depends, in
turn, on the ccextractor binary found at
L<http://ccextractor.sourceforge.net/>.

The following methods should be considered public methods.  Methods
not documented here are intended for internal use within MythXCode,
and should not be called by users of the package.  The user should
call these methods in order, though not all applications will have to
use all methods.

=cut



use strict;
use DBI;
use File::Basename;
use Fcntl qw(:flock);

# This script pokes around in the database.  Things might go really
# wrong if the database innards are changed.  So, we sanity check
# against that.  This script has been tested on schema 1299.
my $schema_var = "DBSchemaVer";
my $verified_schema = 1299;

my $pts_time_offset = 1.00;   # have to add 1 second to keyframe times
			      # for HD-PVR recordings, because ffmpeg
			      # makes up PTS values in that context,
			      # and they start at 1 second, rather
			      # than at zero!  To be safe, we insert
			      # two keyframes, one for PTS offset 0,
			      # and another for offset 1.

my $pull_captions_pathname = "/home/mythtv/pull-captions.pl";
my $ffmpeg_pathname = "/usr/bin/ffmpeg";
my $ffprobe_pathname = "/usr/bin/ffprobe";
my $mplayer_pathname = "/usr/bin/mplayer";
my $mkvmerge_pathname = "/usr/bin/mkvmerge";

my $ffmpeg_containeropts = "-f matroska";

my $default_ffmpeg_qfactor = 26;
my $default_system_fps = 29.97;

# you must have "recordedprogram" here, other parts of the code use
# it.
my @tables_to_backup = ( "recorded", "recordedprogram", "recordedseek",
			 "recordedmarkup" );


sub new {
    my $class = shift;
    my $self = {};
    bless $self, $class;
    $self->check_dependent_executables();

    $self->{ ffmpeg_qfactor } = $default_ffmpeg_qfactor;
    $self->{ system_fps } = $default_system_fps;

    return $self;
}


sub check_dependent_executables {
    my $self = shift;

    die "Unable to locate pull-captions.pl" if ! -x "$pull_captions_pathname";
    die "Unable to locate ffmpeg" if ! -x "$ffmpeg_pathname";
    die "Unable to locate ffprobe" if ! -x "$ffprobe_pathname";

    if ( ! -x "$mplayer_pathname" ) {
	print "NOTICE:  unable to locate mplayer, no cropping will be done.\n";
	$self->{ cannot_crop } = 1;
    }
}



sub get_debug_level {

    my $self = shift;
    if ( $self->{ debug_level } ) {
	return $self->{ debug_level };
    } else {
	return 0;
    }
}



sub connect_to_db {

    my $self = shift;

    die "Failed to set database parameters" if ! $self->{ parms_set };

    my $dsn;

    if ( $self->{ dbhost } ) {
	if ( $self->{ dbportnum } ) {
	    $dsn = "DBI:mysql:database=$self->{dbname};host=$self->{dbhost};port=$self->{dbportnum}";
	} else {
	    $dsn = "DBI:mysql:database=$self->{dbname};host=$self->{dbhost}";
	}
    } else {
	$dsn = "DBI:mysql:database=$self->{dbname}";
    }

    if ( @_ && $_[0]->{ readonly } ) {

	if ( $self->get_debug_level() > 0) {
	    print "Opening read-only connection to database.\n";
	}

	$self->{ dbhandle } = DBI->connect($dsn, $self->{ dbuser }, $self->{ dbpasswd }, { RaiseError => 1, AutoCommit => 0, ReadOnly => 1 });
    } else {

	if ( $self->get_debug_level() > 0) {
	    print "Opening read-write connection to database.\n";
	}

	$self->{ dbhandle } = DBI->connect($dsn, $self->{ dbuser }, $self->{ dbpasswd }, { RaiseError => 1, AutoCommit => 0, ReadOnly => 0 });
    }

    die "Failed to connect to the database: $DBI::errstr" if ! $self->{ dbhandle };
}

sub disconnect_db {
    my $self = shift;

    if ( $self->get_debug_level() > 0) {
	print "Disconnecting from database\n";
    }

    $self->{ dbhandle }->disconnect;
    $self->{ dbhandle } = undef();
}

sub check_database_schema {
    my $self = shift;

    $self->connect_to_db( { readonly => 1 } );
    my $qstr = "SELECT data FROM settings WHERE value=?";
    my $query = $self->{ dbhandle }->prepare($qstr);
    $query->execute($schema_var);

    my @onerow = $query->fetchrow_array();
    $query->finish();
    $self->disconnect_db();

    die "Unable to read the database schema version from the settings table." if ! @onerow;
    die "Unverified database schema $onerow[0].  This script must be carefully inspected for the new version." if $onerow[0] != $verified_schema;
}


sub read_recording_data {
    my $self = shift;
    $self->connect_to_db( { readonly => 1 } );

    my @dirs;
    my @onerow;
    
    my $query = $self->{ dbhandle }->prepare("SELECT dirname FROM storagegroup");
    $query->execute();
    while ( @onerow = $query->fetchrow_array ) {
	push @dirs, $onerow[0];
    }

    die "No storage group directories located" if $#dirs == -1;

    $query = $self->{ dbhandle }->prepare("SELECT basename FROM recorded WHERE chanid=? AND starttime=?");
    $query->execute( $self->{ chanid }, $self->{ starttime } );
    @onerow = $query->fetchrow_array;
    die "Failed to locate recording basename" if $#onerow != 0;
    $self->{ orig_basename } = $onerow[0];

    @onerow = $query->fetchrow_array;
    die "Corrupted database, found multiple recordings at chanid and starttime" if $#onerow != -1;

    for ( @dirs ) {
	my $candidate = $_ . "$self->{ orig_basename }";
	if ( -e "$candidate" ) {
	    $self->{ orig_pathname } = $candidate;
	    last;
	}
    }

    die "Unable to locate recording file $self->{ orig_basename } in any storage group directory" if ! $self->{ orig_pathname };
    
    $self->{ backups } = {};
    for ( @tables_to_backup ) {
	@{$self->{ backups }->{ $_ }} = $self->retrieve_table ( $_ );
    }

    $self->disconnect_db();
}


sub retrieve_table {
    my $self = shift;
    my $tablename = shift;

    my $qstr = "SELECT * from $tablename WHERE chanid=? AND starttime=?";
    my $query = $self->{ dbhandle }->prepare($qstr);
    $query->execute($self->{ chanid }, $self->{ starttime });
    my $names = $query->{NAME};

    my @retval;
    push @retval, $names;

    while ( my @res = $query->fetchrow_array ) {
	push @retval, \@res;
    }

    return @retval;
}


sub generate_caption_filename {
    my $self = shift;
    
    my $localcopy = $self->{ orig_pathname };
    $localcopy =~ s/\.[^.]+$/.srt/;

    return $localcopy;
}

sub generate_new_caption_filename {
    my $self = shift;
    
    my $localcopy = $self->{ orig_pathname };
    $localcopy =~ s/\.[^.]+$/.srt.NEW/;

    return $localcopy;
}


sub captions_already_extracted {
    my $self = shift;

    return -s $self->generate_caption_filename();
}


sub has_cutlist {
    my $self = shift;

    $self->connect_to_db( { readonly => 1 } );

    my $qstr = "SELECT * FROM recordedmarkup WHERE chanid=? AND starttime=? AND ( type=0 OR type=1 )";

    my $query = $self->{ dbhandle }->prepare($qstr);
    $query->execute($self->{ chanid }, $self->{ starttime });
    my @onerow = $query->fetchrow_array;
    $query->finish();
    $self->disconnect_db();

    return $#onerow != -1;
}


sub seconds_to_hms {
    my $self = shift;
    my $timestamp = shift;

    my $hours = int($timestamp / 3600);
    $timestamp -= $hours * 3600;
    my $minutes = int($timestamp / 60);
    $timestamp -= $minutes * 60;
    my $whole_seconds = int($timestamp);
    my $milliseconds = 1000 * ($timestamp - $whole_seconds);

    return sprintf("%02d:%02d:%02d.%03d", $hours, $minutes, $whole_seconds, $milliseconds);
}


sub get_cutlist {
    my $self = shift;

    if ( $self->{ cutlist } ) {
	return $self->{ cutlist };
    }

    my $command = "mythutil -q -q --getcutlist --chanid=$self->{ chanid} --starttime='$self->{ starttime }'";
    print "EXECUTE:  $command\n";

    my $fh;
    if ( ! open ($fh, "$command |") ) {
	$self->set_job_status("Failed to extract cutlist");
	die "Cutlist extraction failed.";
    }

    while (<$fh>) {
	if ( m/^Cutlist: (.*)$/ ) {
	    $self->{ cutlist } = $1;
	}
    }

    print "cutlist= $self->{ cutlist }\n";

    if ( $self->{ cutlist } ) {
	# build the times for the new keyframes, and the cut times

	my $inverse_cutlist;
	my $inv_prev = 0;

	my $fkf = "-force_key_frames ";

	my $ctr = 0;
	my @intervals = split(",", $self->{ cutlist });
	foreach (@intervals) {
	    my $start = (split("-", $_))[0];
	    my $end = (split("-", $_))[1];

	    if ( $ctr != 0 ) {
		$fkf .= ",";
	    }

	    if ( $start > $inv_prev ) {
		my $decr = $start - 1;
		$inverse_cutlist .= "$inv_prev-$decr,";
	    }
	    $inv_prev = $end + 1;


	    my $start2 = $start / $self->{ system_fps };
	    my $end2 = ( $end + 1 ) / $self->{ system_fps };

	    # Make sure the order of new keyframes is correct.  For
	    # very short cuts, we have to reorder them because of the
	    # pts_time_offset silliness.
	    if ( $pts_time_offset != 0) {
		if ( $end2 - $start2 < $pts_time_offset ) {

		    $fkf .= sprintf("%.2f,%.2f,%.2f,%.2f", $start2 - 0.005, $end2 - 0.005, $start2 + $pts_time_offset - 0.005, $end2 + $pts_time_offset - 0.005 ); 

		} else {

		    $fkf .= sprintf("%.2f,%.2f,%.2f,%.2f", $start2 - 0.005, $start2 + $pts_time_offset - 0.005, $end2 - 0.005, $end2 + $pts_time_offset - 0.005 ); 

		}
	    } else {
		$fkf .= sprintf("%.2f,%.2f", $start2 - 0.005, $end2 - 0.005);
	    }

	    $ctr++;
	}

	$inverse_cutlist .= "$inv_prev-1300000";   # a nice 12-hour range

	my $ctr2 = 0;
	my $spl = "--split parts:";
	@intervals = split(",", $inverse_cutlist);
	foreach (@intervals) {
	    my $start = (split("-", $_))[0];
	    my $end = (split("-", $_))[1];

	    if ( $ctr2 != 0 ) {
		$spl .= ",+";
	    }

	    # Subtract 2 from start and from end+1, so that
	    # floating-point roundoff doesn't make us fall through to
	    # the next keyframe.  ffmpeg doesn't make consecutive
	    # keyframes without explicit directions to do so, so this
	    # should always work.
	    my $start2 = ( $start - 2) / $self->{ system_fps };
	    my $end2 = ( $end - 1 ) / $self->{ system_fps };

	    if ( $start2 < 0.005 ) {
		$start2 = 0.005;
	    }
	    if ( $end2 < 0.005 ) {
		$end2 = 0.005;
	    }

	    $spl .= sprintf("%s-%s", $self->seconds_to_hms($start2 - 0.005), $self->seconds_to_hms($end2 - 0.005));

	    $ctr2++;
	}

	if ( $ctr > 0 ) {
	    $self->{ forced_key_frames } = $fkf;
	    $self->{ split_arg } = $spl;
	    $self->{ inverse_cutlist } = $inverse_cutlist;
	}

    }

    return $self->{ cutlist };
}



sub clear_cutlist {
    my $self = shift;

    $self->connect_to_db();

    my $query;
    my $qstr = "DELETE FROM recordedmarkup WHERE chanid=? AND starttime=? AND type=0";

    if ( $self->get_debug_level() > 0 ) {
	print "DBExecute:  $qstr;   args= '$self->{ chanid }','$self->{ starttime }'\n";
    } else {
	$query = $self->{ dbhandle }->prepare($qstr);
	$query->execute($self->{ chanid }, $self->{ starttime });
    }

    $qstr = "DELETE FROM recordedmarkup WHERE chanid=? AND starttime=? AND type=1";

    if ( $self->get_debug_level() > 0 ) {
	print "DBExecute:  $qstr;   args= '$self->{ chanid }','$self->{ starttime }'\n";
    } else {
	$query = $self->{ dbhandle }->prepare($qstr);
	$query->execute($self->{ chanid }, $self->{ starttime });
    }

    $qstr = "UPDATE recorded SET cutlist=0 WHERE chanid=? AND starttime=?";
    if ( $self->get_debug_level() > 0 ) {
	print "DBExecute:  $qstr;   args= '$self->{ chanid }','$self->{ starttime }'\n";
    } else {
	$query = $self->{ dbhandle }->prepare($qstr);
	$query->execute($self->{ chanid }, $self->{ starttime });
    }

    $qstr = "UPDATE recorded SET commflagged=0 WHERE chanid=? AND starttime=?";
    if ( $self->get_debug_level() > 0 ) {
	print "DBExecute:  $qstr;   args= '$self->{ chanid }','$self->{ starttime }'\n";
    } else {
	$query = $self->{ dbhandle }->prepare($qstr);
	$query->execute($self->{ chanid }, $self->{ starttime });
    }

    if ( $self->get_debug_level() > 0 ) {
	print "COMMIT !!!\n";
    } else {
	$self->{ dbhandle }->commit || die "Database update failed clearing cutlist.\n";
    }

    $self->disconnect_db();
}


sub get_recordedprogram_table_data {
    my $self = shift;

    my $names = $self->{ backups }->{ "recordedprogram" }[0];
    my $res = $self->{ backups }->{ "recordedprogram" }[1];
    
    my %hashret;

    for (0 .. $#$res) {
	$hashret{$$names[$_]} = $$res[$_];
    }

    return %hashret;
}


sub safe_to_cut_before_captions {
    my $self = shift;

    return 0;
}

sub program_may_need_cropping {
    my $self = shift;

    return $self->{ video_res } == 480;
}

sub high_definition {
    my $self = shift;
    
    return ! ( $self->{ video_res } == 480 );
}


sub extract_captions {
    my $self = shift;

    my $command = "$pull_captions_pathname $self->{ chanid } '" . $self->{ starttime } . "' " . $self->generate_caption_filename() . " >/dev/null 2>&1";

    print "EXECUTE:  $command\n";
    if ( $self->get_debug_level() < 2 ) {
	$self->set_job_status("Extracting captions");
	if ( system($command) != 0) {
	    $self->set_job_status("Failed while extracting captions");
	    die "Failed to pull the captions from the recording.";
	}
	$self->set_job_status("Finished extracting captions");
    }
}



sub seconds_to_srt_ts {
    my $self = shift;
    my $timestamp_seconds = shift;
    my $rvstr;

    my $hours;
    my $minutes;
    my $whole_seconds;
    my $milliseconds;

    $hours = int($timestamp_seconds / 3600);
    $timestamp_seconds -= $hours * 3600;
    $minutes = int($timestamp_seconds / 60);
    $timestamp_seconds -= $minutes * 60;
    $whole_seconds = int($timestamp_seconds);
    $milliseconds = ( $timestamp_seconds - $whole_seconds ) * 1000;


    return sprintf("%02d:%02d:%02d,%03d", $hours, $minutes, $whole_seconds, $milliseconds);
}


sub generate_cut_captions
{
    my $self = shift;

    print "Applying cutlist to captions.\n";
    if ( $self->get_debug_level() > 1 ) {
	if ( $self->has_cutlist() ) {
	    print "Cut captions.\n";
	}
	return;
    }

    my $overlap_buffer = 0.002;   # minimum seconds between captions
    my $min_caption_duration = 0.3;  # minimum residence time of caption

    my $inputfile = $self->generate_caption_filename();
    my $outputfile = $self->generate_new_caption_filename();
    my $cutlist = $self->get_cutlist();

    my @intervals = split(",", $cutlist);

    my $cumulative_frames_dropped = 0;

    my $position = 0;
    my $marker = 0;
    my @ranges;
    my $start;
    my $end;

    foreach (@intervals) {
	$start = (split("-", $_))[0];
	$end = (split("-", $_))[1];

	$position = $start;
	
	if ( $position - $marker > 0) {
	    push @ranges, [ ( $marker, $position - 1, $cumulative_frames_dropped ) ];
	}
	$cumulative_frames_dropped += ( $end - $start + 1 );
	$marker = $end + 1;
    }
    push @ranges, [ ( $marker, 100000000, $cumulative_frames_dropped ) ];


    my $range_start;
    my $range_end;
    my $range_delay;

    my $ifile;
    my $outfile;

    open($ifile, "<", $inputfile) or die "Failed to open extracted captions file $inputfile.";
    open($outfile, ">", $outputfile) or die "Failed to open new .srt to hold cut captions.";

    my $holdslash = $/;
    $/ = "\r\n\r\n";

    my $recnum = 1;
    my $caption_time_1;
    my $caption_time_2;
    my $prev_caption_end = -1000;
    my $caption_frame_1;
    my $caption_frame_2;
    my $caption_text;

    while (<$ifile>) {

	next if ! m/\d+\r\n(\d+):(\d+):(\d+),(\d+) --> (\d+):(\d+):(\d+),(\d+)\r\n(.*)/s;
	
	$caption_time_1 = $1 * 3600 + $2 * 60 + $3 + $4 * 0.001;
	$caption_time_2 = $5 * 3600 + $6 * 60 + $7 + $8 * 0.001;
	$caption_frame_1 = $caption_time_1 * $self->{ system_fps };
	$caption_frame_2 = $caption_time_2 * $self->{ system_fps };

	$caption_text = $9;

	for my $i ( 0 .. $#ranges ) {
	    $range_start = $ranges[$i][0];
	    $range_end = $ranges[$i][1];
	    $range_delay = $ranges[$i][2];

	    if ( $caption_frame_2 >= $range_start && $caption_frame_1 <= $range_end ) {
		if ( $caption_frame_1 < $range_start ) {
		    $caption_frame_1 = $range_start;
		}
		if ( $caption_frame_2 > $range_end ) {
		    $caption_frame_2 = $range_end;
		}

		$caption_time_1 = ( $caption_frame_1 - $range_delay ) / $self->{ system_fps };
		$caption_time_2 = ( $caption_frame_2 - $range_delay ) / $self->{ system_fps };
		if ( $caption_time_1 < $prev_caption_end + $overlap_buffer ) {
		    $caption_time_1 = $prev_caption_end + $overlap_buffer;

		    if ( $caption_time_2 < $caption_time_1 + $min_caption_duration ) {
			$caption_time_2 = $caption_time_1 + $min_caption_duration;
		    }
		}

		printf $outfile "%d\r\n%s --> %s\r\n%s\r\n", $recnum, $self->seconds_to_srt_ts($caption_time_1), $self->seconds_to_srt_ts($caption_time_2), $caption_text;

		$recnum++;
		$prev_caption_end = $caption_time_2;
		last;
	    }
	}
    }

    close $ifile;
    close $outfile;

    $/ = $holdslash;
}



sub examine_source_media {
    my $self = shift;


    print "EXECUTE:  $ffprobe_pathname $self->{ orig_pathname } 2>&1\n";

    my $fh;
    open ($fh, "$ffprobe_pathname $self->{ orig_pathname } 2>&1 |") 
	|| die "Unable to run ffprobe on $self->{ orig_pathname }";


    my $saw_5_1;
    while (<$fh>) {
	if ( /Duration: ([0-9]+):([0-9]+):([0-9]+\.[0-9][0-9]), start: / ) {
	    $self->{ duration_secs } = $1 * 3600 + $2 * 60 + $3;
	    print "Media duration is $self->{ duration_secs } seconds.\n";
	}
	if ( /Stream #([0-9]+)[.:]([0-9]+)\[.* Video:.*(1920x1080|1280x720|720x480)/ ) {
	    $self->{ video_map } = "-map $1:$2";
	    if ( $self->{ map_dot } ) {
		$self->{ video_map } =~ s/:/./g;
	    }

	    if ( $3 =~ "1920x1080" ) {
		$self->{ video_res } = 1080;

	    } elsif ( $3 =~ "1280x720" ) {

		$self->{ video_res } = 720;
		
	    } elsif ( $3 =~ "720x480" ) {
		$self->{ video_res } = 480;
	    } else {
		die "Unable to determine video resolution";
	    }
	}

	if ( /Stream #([0-9]+)[.:]([0-9]+)\[.* Audio: / ) {

	    my $mcopy = "-map $1:$2";
	    if ( $self->{ map_dot } ) {
		$mcopy =~ s/:/./g;
	    }

	    if ( ! $saw_5_1 ) {

		if ( /, 5\.1(\(side\))?, / ) {
		    $saw_5_1 = 1;
		    $self->{ audio_map } = $mcopy;
		}

		if ( m/, stereo, / ||  m/, 2 channels, / ) {

		    $self->{ audio_map } = $mcopy;
		}
	    }
	}
    }
    close $fh;

    die "Can't find video information" if ! $self->{ video_map } || ! $self->{ video_res };
    die "Can't find audio information" if ! $self->{ audio_map };
}


sub get_cropping_argument {
    my $self = shift;

    $self->set_job_status("Determining cropping values");

    # We'll go 5 minutes into the recording, to avoid any issues with
    # opening credits being in a strange shape.  Then, draw 10 samples
    # at 15 second intervals.  We choose the most commonly reported
    # answer.

    my %votehash;

    for (0 .. 10) {
	my $time_delay = 300 + $_ * 15;
	my @oneshot = $self->probe_cropping_data($self->{ orig_pathname }, $time_delay);
	for (@oneshot) {
	    if ( ! $votehash{$_} ) {
		$votehash{$_} = 1;
	    } else {
		$votehash{$_}++;
	    }
	}
    }

    my $beststring = "";
    my $bestcount = -1;
    foreach my $key (keys %votehash) {
	if ( $votehash{$key} > $bestcount ) {
	    $beststring = $key;
	    $bestcount = $votehash{$key};
	}
    }
    return $beststring;
}



sub probe_cropping_data {
    my $self = shift;
    my $filename = shift;
    my $delay = shift;

    my $fh;
    open ($fh, "$mplayer_pathname $filename -ss $delay -identify -frames 10 -vo md5sum:outfile=/dev/null -ao null -nocache -vf pp=lb,cropdetect=20:16 2>/dev/null |") || die "Unable to probe for cropping settings.";

    my @result;
    while (<$fh>) {
	if ( /CROP.*X:.*Y:.*\((-vf crop=[0-9]+:[0-9]+:[0-9]+:[0-9]+)\)/ ) {
	    push @result, $1;
	}
    }
    close $fh;
    return @result;
}


sub set_job_status {
    my $self = shift;
    my $text = shift;

    die "Job status string too long" if length($text) > 127;

    if ( $self->get_debug_level() == 0 && $self->{ jobid } ) {
	$self->connect_to_db();

	my $qstr = "UPDATE jobqueue SET comment=? WHERE id=?";
	my $query = $self->{ dbhandle }->prepare($qstr);
	$query->execute($text, $self->{ jobid });
	$self->{ dbhandle }->commit();
	
	$self->disconnect_db();
    } else {
	print "STATUS:  $text\n";
    }
}


# This resubmits the job as another job with a later time.  Then it
# erases its own entry, which will make noise in the error log for the
# backend, but otherwise is safe.
sub resubmit_self_and_exit {
    my $self = shift;
    
    my $jobtype = $self->my_job_type();

    $self->connect_to_db();

# This copies my entry into the job queue, but submitted for "now".
    my $qstr = "INSERT INTO jobqueue (chanid,starttime,inserttime,type,cmds,flags,status,hostname,args,schedruntime) SELECT chanid,starttime,current_timestamp(),type,cmds,flags,1,hostname,args,current_timestamp() FROM jobqueue WHERE id=?";

    my $query = $self->{ dbhandle }->prepare($qstr);
    $query->execute($self->{ jobid });

    $qstr = "DELETE FROM jobqueue WHERE id=?";
    $query = $self->{ dbhandle }->prepare($qstr);
    $query->execute($self->{ jobid });

    # and this rewrites all the other queued MythXCode jobs, so they don't
    # have to do that on their own.

    $qstr = "UPDATE jobqueue SET inserttime=current_timestamp() WHERE type=? AND status=1";
    $query = $self->{ dbhandle }->prepare($qstr);
    $query->execute($jobtype);

    $self->{ dbhandle }->commit();

    $self->disconnect_db();

    exit 0;
}


sub my_job_type {
    my $self = shift;

    $self->connect_to_db( { readonly => 1 } );
    my $jobtype;

    my $qstr = "SELECT type FROM jobqueue WHERE id=?";
    my $query = $self->{ dbhandle }->prepare($qstr);
    $query->execute($self->{ jobid });
    my @onerow = $query->fetchrow_array;
    
    if ( @onerow ) {
	$jobtype = $onerow[0];
    }

    $query->finish();
    $self->disconnect_db();
    return $jobtype;
}


# We might want to let other jobs go first, if we're blocking waiting
# for another MythXCode process to complete.  This determines if there are
# any others waiting.
sub other_jobs_queued {
    my $self = shift;
    my $rv = undef;

    my $jobtype = $self->my_job_type;

    if ( $jobtype ) {

	$self->connect_to_db( { readonly => 1 } );
    
	# Now, find other queued jobs, not of the same type
	my $qstr = "SELECT COUNT(id) FROM jobqueue WHERE type != ? AND status=1";
	my $query = $self->{ dbhandle }->prepare($qstr);
	$query->execute($jobtype);
	my @onerow = $query->fetchrow_array;
	
	if ( $onerow[0] != 0 ) {
	    $rv = 1;
	}

	$query->finish();
	$self->disconnect_db();
    }


    return $rv;
}


sub wait_on_lock {
    my $self = shift;

    if ( $self->{ lockfile } ) {

	if ( ! $self->{ jobid } ) {
	    die "Asked for locking, but without a job ID.";
	}

	# We'll auto-close this on exit, releasing the lock.
	open($self->{ lockhandle }, '>', $self->{ lockfile }) or die "Failed to create lockfile.";

	# Try a non-blocking, exclusive lock
	while ( ! flock $self->{ lockhandle}, LOCK_EX | LOCK_NB ) {

	    if ( $self->other_jobs_queued() ) {
		$self->resubmit_self_and_exit();
	    }

	    # wait for 5 minutes and check again.
	    sleep 300;
	}
    }
}


sub perform_transcoding_to_h264 {
    my $self = shift;

    $self->examine_source_media();
    $self->get_cutlist();

    my $cropping_arg;
    my @keyframes;

    if ( ! $self->{ cannot_crop } && $self->program_may_need_cropping() ) {
	$cropping_arg = $self->get_cropping_argument();
    }

    my $xcoded_name = $self->{ orig_pathname };
    $xcoded_name =~ s/\.[^.]+$/.mkv/;

    # Don't re-transcode the same file
    if ( $self->{ orig_pathname } eq $xcoded_name ) {
	return;
    }

    # Before we get to .mkv we'll pass through a .tmp file and a .ts file
    my $tmpname1 = $xcoded_name . ".tmp";
    my $tmpname2 = $xcoded_name . ".ts";
    my $tmpname3 = $xcoded_name . ".tmp.2";

    if ( -e $tmpname1 ) {
	unlink $tmpname1;
    }
    if ( -e $tmpname2 ) {
	unlink $tmpname2;
    }
    if ( -e $tmpname3 ) {
	unlink $tmpname3;
    }

    # we use the h264_mp4toannexb filter so that we can rebuild the
    # seek table.  That's convenient.

    my $command;
    my $localname1;
    my $localname2;
    my $localname3;

    my $audioopts;
    if ( ! $self->high_definition() ) {
	$audioopts = "-acodec libfaac -ab 256k -ac 2";
    } else {
	$audioopts = "-acodec copy";
    }

    if ( $self->{ ssh_string } ) {
	$localname1 = basename($tmpname1);
	$localname2 = basename($tmpname2);
	$localname3 = basename($tmpname3);
	$command = "$self->{ ssh_string } $ffmpeg_pathname -y -i - -r $self->{ system_fps } -threads $self->{ ssh_threads } -c:v libx264 -qmax $self->{ ffmpeg_qfactor } $self->{ forced_key_frames } $cropping_arg $self->{ video_map } $audioopts $self->{ audio_map } $ffmpeg_containeropts $localname1 2\\>\\&1 < $self->{ orig_pathname }";
    } else {
	$command = "$ffmpeg_pathname -y -i $self->{ orig_pathname } -r $self->{ system_fps } -c:v libx264 -qmax $self->{ ffmpeg_qfactor } $self->{ forced_key_frames} $cropping_arg $self->{ video_map } $audioopts $self->{ audio_map } $ffmpeg_containeropts $tmpname1 2>&1 </dev/null";
    }

    print "EXECUTE:  $command\n";
    if ( $self->get_debug_level() > 1 ) {

	print "Transcode, introducing new keyframes.\n";
	print "Split out the cut parts.\n";
	print "Copy cut .srt file into position.\n";
	print "Rebuild keyframe index.\n";
	print "rename $tmpname2 $xcoded_name\n";
	print "unlink $self->{ orig_pathname }\n";
	print "Update database to reflect change\n";

    } else {

	$self->set_job_status("Beginning h.264 transcode");

	my $oldIRS = $/;
	$/ = "\r";

	my $fh;
	
	open ( $fh, "$command |") || die "Failed to execute ffmpeg for transcoding";

	my $last_percent = 0;
	while (<$fh>) {

	    my $timesecs;
	    my $fps;

	    if ( /^frame= *[0-9]+ +fps= *([0-9]+) +q=.*time= *([0-9]+\.[0-9]+) / ) {
		$fps = $1;
		$timesecs = $2;
	    }

	    if ( /^frame= *[0-9]+ +fps= *([0-9]+) +q=.*time= *([0-9]+):([0-9]+):([0-9]+\.[0-9]+) / ) {
		$fps = $1;
		$timesecs = $2 * 3600 + $3 * 60 + $4;
	    }

	    if ( $timesecs ) {
	    
		my $frac_done = $timesecs / $self->{ duration_secs };
		my $this_percent = int( $frac_done * 100 );

		if ($this_percent > $last_percent) {
		    $self->set_job_status("H.264 transcode $this_percent\% done.  $fps FPS");
		    $last_percent = $this_percent;
		}
	    }
	}

	close $fh;
	$/ = $oldIRS;

	if ( $? != 0 ) {
	    $self->set_job_status("H.264 transcode failed");
	    die "Transcoding to h.264 failed.";
	}

	if ( $self->has_cutlist() ) {

	    $self->set_job_status("Using mkvmerge to apply cutlist");

	    if ( $self->{ ssh_string } ) {
		$command = "$self->{ ssh_string} $mkvmerge_pathname $self->{ split_arg } -o $localname3 --compression 0:none $localname1 && $self->{ ssh_string } mv $localname3 $localname1";
	    } else {
		$command = "$mkvmerge_pathname $self->{ split_arg } -o $tmpname3 --compression 0:none $tmpname1 && mv $tmpname3 $tmpname1";
	    }

	    print "EXECUTE:  $command\n";

	    if ( system($command) != 0 ) {
		$self->set_job_status("mkvmerge cutting failed");
		die "Transcoding to h.264 failed.";
	    }
	    
	}

	$self->set_job_status("Running through mp4toannexb to allow rebuilding the keyframe index");

	if ( $self->{ ssh_string } ) {
	    $command = "$self->{ ssh_string} $ffmpeg_pathname -y -threads 1 -i $localname1 -vcodec copy -bsf:v h264_mp4toannexb -acodec copy -f mpegts $localname2 >/dev/null 2>&1";
	} else {
	    $command = "$ffmpeg_pathname -i $tmpname1 -threads 1 -vcodec copy -bsf:v h264_mp4toannexb -acodec copy -f mpegts $tmpname2 >/dev/null 2>&1";
	}
	print "EXECUTE:  $command\n";
	if ( system($command) != 0 ) {
	    $self->set_job_status("H.264 transcode failed");
	    die "Transcoding to h.264 failed.";
	}

	if ( $self->{ ssh_string } ) {

	    if ( $self->{ ffmpeg_rebuild } ) {
		@keyframes = $self->parse_keyframes_from_ffmpeg($localname2);
	    }


	    $self->set_job_status("Copying file back from remote ffmpeg engine.");
	    $command = "$self->{ ssh_string } cat $localname2 > $tmpname2";
	    if ( system($command) != 0 ) {
		$self->set_job_status("H.264 transcode failed");
		die "Copying file back from remote engine failed.";
	    }
	    $command = "$self->{ ssh_string} /bin/rm -f $localname1 $localname2";
	    if ( system($command) != 0 ) {
		print "Failed to unlink files on remote engine.\n";
	    }
	} else {

	    if ( $self->{ ffmpeg_rebuild } ) {
		@keyframes = $self->parse_keyframes_from_ffmpeg($tmpname2);
	    }

	    unlink $tmpname1;
	}


	$self->set_job_status("Finished h.264 transcode");

	rename $tmpname2, $xcoded_name;
	my $newlen = -s $xcoded_name;

	$self->connect_to_db();

	my $qstr = "DELETE FROM recordedseek WHERE chanid=? AND starttime=?";
	if ( $self->get_debug_level() == 0 ) {
	    my $query = $self->{ dbhandle }->prepare($qstr);
	    $query->execute($self->{ chanid }, $self->{ starttime });
	} else {
	    print "$qstr\n";
	}

	$qstr = "DELETE FROM recordedmarkup WHERE chanid=? AND starttime=?";
	if ( $self->get_debug_level() == 0 ) {
	    my $query = $self->{ dbhandle }->prepare($qstr);
	    $query->execute($self->{ chanid }, $self->{ starttime });
	} else {
	    print "$qstr\n";
	}

	$qstr = "UPDATE recorded SET basename=?,filesize=?,transcoded='1' WHERE chanid=? AND starttime=?";
	if ( $self->get_debug_level() == 0 ) {
	    my $query = $self->{ dbhandle }->prepare($qstr);
	    $query->execute(basename($xcoded_name), $newlen, $self->{ chanid }, $self->{ starttime });
	} else {
	    print "$qstr\n";
	}

	if ( $self->get_debug_level() == 0 ) {
	    if ( ! $self->{ dbhandle }->commit() ) {
		unlink $xcoded_name;
		die "Failed to move transcoded file into database.";
	    }

	    unlink $self->{ orig_pathname };

	    $self->disconnect_db();
	}

	# Clear the cut information, we've now finished modifying the file
	$self->clear_cutlist();

	# Move the new .srt into position
	my $cut_name = $self->generate_new_caption_filename();
	if ( -s $cut_name ) {
	    my $final_path = $self->generate_caption_filename();
	    if ( $self->get_debug_level() == 0) {
		rename $cut_name, $final_path;
	    } else {
		print "Renaming to move $cut_name to $final_path\n";
	    }
	}

	# Rebuild the seek table, but don't worry if it fails.
	$self->set_job_status("Rebuilding seek table after transcode");

	if ( $self->{ ffmpeg_rebuild } ) {

	    $self->db_commit_keyframes_manually(\@keyframes);

	} else {

	    my $command = "mythcommflag --rebuild --chanid=$self->{ chanid } --starttime='$self->{ starttime }'";
	    print "EXECUTE:  $command\n";
	    if ( $self->get_debug_level() == 0 ) {
		system($command);
	    }
	    $self->set_job_status("Finished rebuilding seek table");
	}
    }
}


sub parse_keyframes_from_ffmpeg {
    my $self = shift;
    my $transport_stream_file = shift;

    $self->set_job_status("Using ffmpeg to extract keyframe positions");

    my @rv;  # a vector of (n, pos) for all keyframes

    my $command = "$ffmpeg_pathname -y -nostats -i $transport_stream_file -vf showinfo -f null -";
    
    if ( $self->{ ssh_string } ) {
	$command = "$self->{ ssh_string } " . $command . " 2\\>\\&1";
    } else {
	$command = $command . " 2>&1";
    }

    my $prevpos = -1;
    my $fh;
    open ( $fh, "$command |") || return;
    while (<$fh>) {
	if ( /.* n:(\d+) .* pos:(\d+) .* iskey:1 .*/ ) {

	    if ( $2 == $prevpos ) {
		die "Obtained bad data trying to locate keyframes.";
	    }
	    $prevpos = $2;
	    push @rv, [$1, $2];
	}
    }

    close $fh;
    return @rv;
}
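
# For reference, the regex in parse_keyframes_from_ffmpeg() expects
# ffmpeg "showinfo" log lines of roughly this shape (an illustrative
# approximation, not verbatim output of any particular ffmpeg version):
#
#   [Parsed_showinfo_0 @ 0x1234] n:300 pts:270270 pts_time:9.009 pos:1880064 fmt:yuv420p iskey:1 type:I
#
# From a line like that it collects the pair [300, 1880064]: the frame
# number and the byte offset of the keyframe.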


sub db_commit_keyframes_manually {
    my $self = shift;
    my @kf_array = @{ shift() };

    if ( $self->get_debug_level() == 0 ) {

	$self->connect_to_db();
	my $query = $self->{ dbhandle }->prepare("DELETE FROM recordedseek WHERE chanid=? AND starttime=?");
	$query->execute( $self->{ chanid }, $self->{ starttime } );

	$query = $self->{ dbhandle }->prepare("INSERT INTO recordedseek VALUES(?,?,?,?,?)");
	for my $i ( 0 .. $#kf_array ) {
	    $query->execute( $self->{ chanid }, $self->{ starttime }, $kf_array[$i][0], $kf_array[$i][1], 9);
	}

	$self->{ dbhandle }->commit();
	$self->disconnect_db();

    } else {

	printf "DELETE FROM recordedseek WHERE chanid='%d' AND starttime='%s'\n", $self->{ chanid }, $self->{ starttime };

	for my $i ( 0 .. $#kf_array ) {
	    printf "INSERT INTO recordedseek VALUES(%d,%s,%d,%d,%d)\n", $self->{ chanid }, $self->{ starttime }, $kf_array[$i][0], $kf_array[$i][1], 9;
	}
    }
}



sub set_queue_status {
    my $self = shift;
    my $code = shift;

    $self->connect_to_db();

    my $qstr = "UPDATE jobqueue SET status=? WHERE id=?";
    my $query = $self->{ dbhandle }->prepare($qstr);
    $query->execute($code, $self->{ jobid } );

    $self->{ dbhandle }->commit();
    $self->disconnect_db();
}


#########################################################################
#
# The following methods are expected to be used by the consumer.  The
# methods above this line are internal, consider them 'private'.
#
#########################################################################

=head2 C<set_db_parms(DBNAME, DBUSER, DBPASSWD [, DBHOST [, DBPORT ] ])>

This method is required and must be called before C<init()>.  

This method informs the MythXCode object of the parameters needed to
access the MythTV database.  At least I<DBNAME>, I<DBUSER>, and
I<DBPASSWD> are required parameters.  I<DBHOST> is optional, and
declares that the database is running on a remote host, which can be
either a resolvable hostname or an IP number.  I<DBPORT> is the
non-standard port number on the remote host.  To use a non-standard
port number on the local host, set I<DBHOST> to the IP number of an
interface on the local machine that is listening for connections.

This method ABENDs if the three required parameters are not supplied.
Otherwise, it will always succeed.  It does not validate the
connection parameters.
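
A typical invocation, with illustrative values, might look like:

    $worker->set_db_parms("mythconverg", "mythtv", "secret");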

=cut

sub set_db_parms {
    my $self = shift;
    my $dbname = shift;
    my $dbuser = shift;
    my $dbpasswd = shift;
    my $dbhost = shift;
    my $dbport = shift;

    die "Required database name not supplied." if ! $dbname;
    die "Required database user not supplied." if ! $dbuser;
    die "Required database password not supplied." if ! $dbpasswd;

    $self->{ dbname } = $dbname;
    $self->{ dbuser } = $dbuser;
    $self->{ dbpasswd } = $dbpasswd;
    $self->{ dbhost } = $dbhost;
    $self->{ dbport } = $dbport;
    $self->{ parms_set } = 1;
}

=head2 C<set_backup_directory(DIRNAME)>

This method is optional, and must be run before C<init()>.

This method informs the MythXCode object that we want a backup made of
the working files before processing begins.  A new directory will be
created, with filename F<DIRNAME/transcode_PID> where PID is the
process ID of the process running MythXCode.  If that directory
already exists, it is an error, and execution terminates.

The directory must have sufficient free space to hold the recording
file, plus a relatively small amount of database information.

Note that merely calling C<set_backup_directory()> is not sufficient
to perform the backup.  The C<perform_backup()> function must be
invoked.

=cut

sub set_backup_directory {
    my $self = shift;
    my $backup_dir = shift;

    $self->{ backup_dir } = $backup_dir;
}

=head2 C<set_debug_level(NUM)>

This method is optional, and must be run before C<init()>.

MythTV systems are not all alike.  The operations that this package
performs are potentially destructive.  While every attempt has been
made to detect and gracefully handle error conditions, it is
recommended that the user test the procedure first.  The debug level
should be set first to 2 and the script executed from the command
line, as the same user as the mythbackend.  When run from the command
line, do not call the set_lock_file() method as it makes use of the
Job ID assigned by the backend, and that doesn't exist in this case.

The script output should then be examined for unexpected or
undesirable behaviour.  In debug level 2, no database writes are
performed, and no files are created or modified.

If everything looks safe at debug level 2, the debug level should be
reduced to 1 and the procedure repeated.  At debug level 1, no
database writes are performed, but transcoded/cut files are produced
and a .srt file with captions information is created.  These files
will occupy space on disk and will not be deleted by MythTV, which has
no knowledge of them, so they should be manually deleted after
debugging is complete.  Again, the output should be examined for any
unexpected or undesirable behaviour.

Finally, the debug level should be set to 0.  Files are created,
original recording files are deleted and replaced by the new files,
and database entries are changed to reflect this.  This is the normal
operation mode.  If anything goes wrong, the backup (if produced),
allows you to reset the recording to the state it was in before the
script executed.  Otherwise, you can delete the backup.

=cut

sub set_debug_level {
    my $self = shift;
    my $debug_level = shift;

    $self->{ debug_level } = $debug_level;
}


=head2 C<set_qmax(NUM)>

This method is optional, and must be run before C<init()>.

The ffmpeg transcoding requires a target quality factor.  The lower
this number, the higher the quality, but the higher the bitrate.  The
default is 26, but can be changed by calling this function.

=cut

sub set_qmax {
    my $self = shift;
    my $qmax = shift;

    $self->{ ffmpeg_qfactor } = $qmax;
}


=head2 C<set_fps(NUM)>

This method is optional, and must be run before C<init()>.

The conversion between frame counts and times currently requires a
fixed frame rate.  The default is 29.97 fps, appropriate for North
American broadcasts.  If that is incorrect in your region, set the
frame rate here.
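
For example, in a 25 fps region:

    $worker->set_fps(25);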

=cut

sub set_fps {
    my $self = shift;
    my $fps = shift;

    $self->{ system_fps } = $fps;
}


=head2 C<set_lock_file(FILENAME)>

This method is optional, and must be run before C<init()>.

A transcode job can run for a long time.  It is not hard to queue up
days of transcodes.  The user may want to allow other jobs, such as
commercial flagging runs, to execute as new recordings are created,
without waiting days for the queue of transcode jobs to empty.

To do this, supply a common absolute pathname, F<FILENAME>, to all
MythXCode objects using this method.  The MythTV job execution queue
must be set to allow at least two concurrent jobs.  Also, the
C<set_program_identifier()> method must supply the I<JOBID> argument
because the locking method must interact with the MythTV job
scheduler.

Implementation is via an advisory lock on the supplied filename.  When
the C<init()> method is called, an attempt is made to acquire the
lock.  If it succeeds, the C<init()> method returns.  If it fails, it
sleeps for 5 minutes and tries again.  Before sleeping, it checks the
job queue to see if there are any user jobs of a different type
waiting in the queue.  If there are, it finds all queued jobs of the
same type and resets their submission times to now.  It then resubmits
itself, and exits, freeing up the job queue entry.

The net effect, if MythTV is configured to allow two concurrent jobs
to run, is that there will initially be two running MythXCode jobs,
but one will not be making progress and will not be consuming CPU.
If, later, another job of a different type, such as commercial
flagging, appears on the queue, then the blocked MythXCode job will,
within 5 minutes, detect it, and reorder the queue so that the
commercial flagging job moves to the front of the line.  This allows
the commercial flagging job to run concurrently with the MythXCode, so
it can be done in a timely manner.

=cut

sub set_lock_file {
    my $self = shift;
    $self->{ lockfile } = shift;
}


=head2 C<set_remote_ffmpeg_engine(SSH_INVOCATION_STRING [, NUMTHREADS])>

This method is optional and must be called before C<init()>.  

The MythTV backend may not have the most computing resources available
on your network.  It may, therefore, be desirable to offload the
transcoding to H.264 to another computer.  In that case, this method
can be used.

The I<SSH_INVOCATION_STRING> must be a string that allows
password-free access to an account on the remote machine.  An example
might be C<S<ssh -l guest -i /home/mythtv/.ssh/key bigbox>>.  The second
argument is optional, and should be the integer number of threads to
use, passed to the C<-threads> option of F<ffmpeg>.  The default
number of threads is 1.

There must be enough space in the home directory of the remote machine
to hold the transcoded file.  The original file is not copied over; it
is streamed to the C<stdin> of F<ffmpeg>.  Once the file has been
copied back, it is deleted from the remote machine, releasing the disk
space.

=cut

sub set_remote_ffmpeg_engine {
    my $self = shift;
    my $ssh_string = shift;
    my $num_threads = shift;

    if ( ! $num_threads ) {
	$num_threads = 1;
    }

    die "Bad thread number" if $num_threads < 1;

    $self->{ ssh_string } = $ssh_string;
    $self->{ ssh_threads } = $num_threads;
}


=head2 C<set_ffmpeg_rebuild()>

This method is optional and must be called before C<init()>.  

Some versions of mythcommflag incorrectly mark some non-keyframes as
keyframes, leading to visible artefacting immediately after a seek.
In that case, if a fairly recent version of ffmpeg is available (1.0
works, 0.8.4 does not), the script can use ffmpeg to obtain the true
keyframe locations and update the seektable appropriately.  Call this
function to activate that mode of operation.

=cut

sub set_ffmpeg_rebuild {
    my $self = shift;

    $self->{ ffmpeg_rebuild } = 1;
}

=head2 C<ffmpeg_map_uses_colon()>

This method is optional, and must be called before C<init()>.

The command-line invocation for stream mapping in F<ffmpeg> has
changed over time.  Older versions use a syntax like C<-map 0.1>, while
newer ones use C<-map 0:1>.  This method asks that the colon separator
be used in F<ffmpeg> invocations.  This is also the default, so this
method need not be used, but is recommended in case the F<ffmpeg>
invocation changes again in the future.

=cut

sub ffmpeg_map_uses_colon {
    my $self = shift;
    undef $self->{ map_dot };
}

=head2 C<ffmpeg_map_uses_dot()>

This method is optional, and must be called before C<init()>.

The command-line invocation for stream mapping in F<ffmpeg> has
changed over time.  Older versions use a syntax like C<-map 0.1>, while
newer ones use C<-map 0:1>.  This method asks that the dot separator
be used in F<ffmpeg> invocations.  Use this method if your F<ffmpeg>
does not recognize the newer format.

=cut

sub ffmpeg_map_uses_dot {
    my $self = shift;
    $self->{ map_dot } = 1;
}


=head2 C<set_program_identifier(CHANID, STARTTIME [, JOBID])>

This method is required, and must be called before C<init()>.

It is used to set the channel ID and starting time of the recording to
be processed.  The optional third parameter holds the job ID, and must
be supplied if locking is being used (see C<set_lock_file>).

The expected formats for these three parameters are simply the strings
that MythTV supplies to user jobs when the pseudo-parameters
B<%CHANID%>, B<%STARTTIMEISO%>, and B<%JOBID%> are used in the command
line invocation.

=cut

sub set_program_identifier {
    my $self = shift;
    my $chanid = shift;
    my $starttime = shift;
    my $jobid = shift;

    die "It is not legal to call set_program_identifier more than once on a MythXCode object" if $self->{ chanid };

    die "Failed to supply a channel ID" if ! $chanid;
    die "Failed to supply a starttime" if ! $starttime;
    
    $self->{ chanid } = $chanid;
    $self->{ starttime } = $starttime;
    $self->{ jobid } = $jobid;

}



=head2 C<init()>

This method is required.  It validates some of the configurations set
up by the previously-described methods.  It verifies that the database
schema is one that has been tested and is known to work.  If locking
is enabled, it acquires the lock.

This method will never return an error, but it may not return.  If the
configuration settings supplied earlier are wrong, it may ABEND at
this time.  If the process has to move out of the way to allow a
commercial flagging job to move up in the queue, it will do a quiet
exit and not return.

=cut

sub init {
    my $self = shift;

    $self->check_database_schema();
    $self->wait_on_lock();
    $self->{ initialized } = 1;
    $self->read_recording_data();
}


=head2 C<perform_backup()>

This method is optional, and must be called after C<init()>.  

If a backup directory was set with C<set_backup_directory()>, the
backup is performed at this time.  See C<set_backup_directory()> for
details of the location of the backups.  The backup holds the original
MPEG recording
file, a file named F<recovery> that shows how to copy this file back
to its original location, and several MySQL command files suitable for
streaming directly into F<mysql> to restore the relevant database
entries for the recording.

B<NOTE>: no backups will be performed unless this method is explicitly
called.

=cut

sub perform_backup {
    my $self = shift;
    my $newdir = $self->{ backup_dir };

    die "Failed to call init()" if ! $self->{ initialized };

    if ( ! $newdir ) {
	return;
    }

    $newdir = $newdir . "/transcode_$$";
    die "Backup directory already exists" if -d $newdir;
    mkdir($newdir, 0700) or die "Failed to create backup directory.";
    my $caption_filename = $self->generate_caption_filename();

    if ( $self->get_debug_level() > 0 ) {
	print "cp $self->{ orig_pathname } $newdir\n";
	if ( -s $caption_filename) {
	    print "cp $caption_filename $newdir\n";
	}

    } else {
	$self->set_job_status("Backing up original file.");
	if (system( "cp", $self->{ orig_pathname }, $newdir ) != 0 ) {
	    die "Failed to make the backup copy.";
	}
	if ( -s $caption_filename ) {
	    if (system( "cp", $caption_filename, $newdir ) != 0 ) {
		die "Failed to make the backup copy of .srt file.";
	    }
	}
	my $hints;
	open $hints, ">$newdir/recovery" or die "Failed to create recovery file.";
	print $hints "cp $newdir/$self->{ orig_basename} $self->{ orig_pathname }\n";
	if ( -s $caption_filename ) {
	    my $caption_basename = basename($caption_filename);
	    my $caption_dirname = dirname($caption_filename);
	    print $hints "cp $newdir/$caption_basename $caption_dirname\n";
	}

	for ( @tables_to_backup ) {
	    print $hints "mysql $self->{ dbname } < $newdir/$_\n";
	}

	close $hints;
    }

    for ( @tables_to_backup ) {
	
	my $tablename = $_;

	my $hints;
	open $hints, ">$newdir/$tablename" or die "Failed to create backup data file $tablename";
	print $hints "DELETE FROM $_ WHERE chanid='$self->{ chanid }' AND starttime='$self->{ starttime }';\n";
	my @onetable = @{$self->{ backups }->{ $tablename }};
	for ( my $i = 1; $i <= $#onetable; $i++ ) {

	    print $hints "INSERT INTO $_ (", join(',', @{$onetable[0]}), " ) VALUES ('", join ("','", map { my $t = $_; $t =~ s|'|\\'|g; $t; } @{$onetable[$i]}), "');\n";
	}
	if ( ! close $hints ) {
	    die "Failed to write backup data file $tablename";
	}
    }

    if ( $self->get_debug_level() == 0 ) {
	$self->set_job_status("");
    }
}


=head2 C<prepare_captions()>

This method is optional, and must be called after C<init()>.

This method extracts the captions from the recording and applies the
cutlist (if any) to the resulting .srt file.

This method should be called before C<transcode_to_h264()>, as
otherwise captions and cutlist data will be lost.

=cut

sub prepare_captions {
    my $self = shift;

    die "Failed to call init()" if ! $self->{ initialized };

    if ( ! $self->captions_already_extracted() ) {
	$self->extract_captions();
    } else {
	print "Captions already present in a .srt file.\n";
	# Note that extract_captions applies the cutlist as it works
	if ( $self->has_cutlist() ) {
	    $self->generate_cut_captions();
	}
    }
}



=head2 C<transcode_to_h264()>

This method is optional, and must be called after C<init()>.

If a cutlist exists, or if captions are to be preserved,
C<prepare_captions()> should be called before this method.

This method transcodes a recording to H.264 while adding extra
keyframes at the cut points.  It then cuts the recording cleanly at
those positions, and converts the file to a transport stream so that
the seek table rebuild code will work on it.  It will use the remote
F<ffmpeg> host, if requested (see C<set_remote_ffmpeg_engine()>).

Unless running in debugging mode (see C<set_debug_level()>), this
operation makes permanent changes to the recording on disk and to the
database entries related to it.

=cut

sub transcode_to_h264 {
    my $self = shift;

    die "Failed to call init()" if ! $self->{ initialized };

    $self->perform_transcoding_to_h264();
}


=head1 AUTHOR

Christopher Neufeld.  Copyright 2013, released under the GPL version 3.

=cut


Invocation Script

The .pm file describes a class. A separate script is needed to invoke the methods defined in the class. Here is the one I use, xcode_to_h264.pl:


#! /usr/bin/perl
#
# Invoke as a user job with:
# /home/mythtv/xcode_to_h264.pl %CHANID% %STARTTIMEISO% %JOBID%

# New code to transcode.

use lib '/home/mythtv';

use MythXCode;
use strict;

umask 0022;

my $dbname = "mythconverg";
my $dbuser = "mythtv";
my $dbpasswd = "password";

# my $backup_dir = "/myth/tmp";
# my $lock_file = $backup_dir . "/transcode_lock";

my $dbglevel = 0;



my $chanid = $ARGV[0];
my $starttime = $ARGV[1];
my $jobid = $ARGV[2];

die "Bad channel ID" if ! $chanid;
die "Bad start time" if ! $starttime;

$starttime =~ tr/T/ /;


my $worker = MythXCode->new();

$worker->set_db_parms($dbname, $dbuser, $dbpasswd);
# $worker->set_backup_directory($backup_dir);
$worker->set_debug_level($dbglevel);
# $worker->set_lock_file($lock_file);
# $worker->set_remote_ffmpeg_engine("ssh -i /home/mythtv/.ssh/bigboxkey bigbox", 3);
# $worker->set_ffmpeg_rebuild();

$worker->set_program_identifier($chanid, $starttime, $jobid);

$worker->init();

# $worker->perform_backup();
$worker->prepare_captions();
$worker->transcode_to_h264();

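The `tr/T/ /` near the top of the script converts the ISO timestamp that MythTV passes in %STARTTIMEISO% into the space-separated datetime form used by the database's starttime columns. The same transformation at the shell (timestamp illustrative):

```shell
# MythTV supplies %STARTTIMEISO% like 2013-01-03T20:00:00 (value
# illustrative); the database starttime uses a space in place of the T.
echo '2013-01-03T20:00:00' | tr 'T' ' '
# prints 2013-01-03 20:00:00
```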
Read the documentation for MythXCode.pm for more details. You can set up a locking scheme that allows transcode jobs to move out of the way of commercial flagging jobs, in the event that your transcoding runs take a very long time. You can (and probably should) use the backup facility to produce files capable of exactly restoring the state of the recording in case something goes wrong. You can use a remote ffmpeg offload engine (I do, as a 64-bit box is much faster than my 32-bit backend on transcodes).
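The locking scheme described under set_lock_file() rests on an advisory file lock. A minimal sketch of the same idea using util-linux flock(1), with an illustrative lock path; unlike the module, which sleeps five minutes and retries, this attempt is non-blocking:

```shell
# Advisory-lock sketch: the first holder proceeds; a second non-blocking
# attempt on the same file would fail immediately.  Lock path is
# illustrative, not the one the module uses.
lockfile="${TMPDIR:-/tmp}/transcode_lock.$$"
(
    flock -n 9 || { echo "lock busy"; exit 1; }
    echo "lock acquired"
) 9>"$lockfile"
rm -f "$lockfile"
```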