Closed captioning

From MythTV Official Wiki

Enabling closed caption data in future recordings

In order to view Closed Captioning data you must first configure the backend to properly set the VBI format. The default value is None, which means no closed captioning.

Note that ATSC captions should not require the VBI setting.

Viewing closed captions when present in a recording

You also need to configure the frontend to set whether you want Closed Captioning to turn on whenever you select a channel. Regardless of your choice for the frontend, you can toggle Closed Captioning on and off by using the keyboard. The default key is T.

If captions do not appear, the player selected by your video profile may not support the kind of captions in the video. Try switching to a different video profile (one that uses VDPAU or ffmpeg, for example) to improve the chances of captions working.

The HD-PVR is unable to capture VBI data, so recordings it produces do not have captions. (Enabling captions on the cable box would work, but that would result in captions being "burned" into the HD-PVR recording's video.) For a workaround, see Captions_With_HD_PVR.

Extracting closed captions to a .srt file

If you have a recording that contains closed caption data, and a cutlist, you can extract the time-corrected .srt data. You will need CCExtractor. Install the following script:

#! /usr/bin/perl

# A script to produce a .srt closed-caption extract from a mythtv
# recording with a cutlist.

# Script written by Christopher Neufeld, released GPLv2.

# The output file is a single file, time-corrected for the missing
# pieces of the input file.

# Invocation:
#	<script> <channel> <timespec> <output-file>

# May not work correctly on recordings that span the autumn time change

require Date::Manip;

$ENV{'PATH'} = '/bin:/usr/bin:/root/bin';

$frames_per_sec = 29.97;

# Convert "YYYYMMDDhhmmss" to "YYYY-MM-DD hh:mm:ss".
sub short_time_to_long {
    my $st = shift;
    return substr($st, 0, 4) . "-" . substr($st, 4, 2) . "-" . substr($st, 6, 2) . " " . substr($st, 8, 2) . ":" . substr($st, 10, 2) . ":" . substr($st, 12, 2);
}

# Convert "YYYY-MM-DD hh:mm:ss" to "YYYYMMDDhhmmss".
sub long_time_to_short {
    my $lt = shift;
    $lt =~ tr/ :-//d;
    return $lt;
}

# Convert a count of seconds to "hh:mm:ss".
sub seconds_to_time {
    my $seconds = shift;
    my $hours = int($seconds / 3600);
    $seconds -= $hours * 3600;
    my $minutes = int($seconds / 60);
    $seconds -= $minutes * 60;
    return sprintf("%02d:%02d:%02d", $hours, $minutes, $seconds);
}

die "Usage: $0 <channel> <timespec> <output-file>" if $#ARGV != 2;
$channel = $ARGV[0];
$long_time = $ARGV[1];
print "long_time= $long_time\n";
$long_time =~ tr/T/ /;
print "long_time= $long_time\n";
$output = $ARGV[2];

$time = long_time_to_short($long_time);
$epoch_start = Date::Manip::UnixDate($long_time, "%s");

# Next, find out how long the recording is.

$mysql_user = "mythtv";
$mysql_passwd = "mythtv";
$mysql_db = "mythconverg";

$endtime=`mysql -s -s -u $mysql_user --password=$mysql_passwd $mysql_db -e "select endtime from recorded where chanid = $channel and starttime = '$long_time';"`;
chomp $endtime;

$basename=`mysql -s -s -u $mysql_user --password=$mysql_passwd $mysql_db -e "select basename from recorded where chanid = $channel and starttime = '$long_time';"`;
chomp $basename;
die "Unable to locate recording basename for channel $channel and time $long_time.\n" if $basename eq "";

@dirnames=split('\n', `mysql -s -s -u $mysql_user --password=$mysql_passwd $mysql_db -e "select dirname from storagegroup;"`);

$epoch_end = Date::Manip::UnixDate($endtime, "%s");

$duration = $epoch_end - $epoch_start;

die "Unable to compute length of recording" if ( $duration < 0 );

# Now, find the recording pathname
$recording_fullpath = "";
for (@dirnames) {
    $candidate = "$_" . $basename;
    if ( -e "$candidate" ) {
	$recording_fullpath = $candidate;
    }
}

die "Unable to locate recording pathname." if $recording_fullpath eq "";

open(RLIST, "mythcommflag --getcutlist -c $channel -s $time |");
while(<RLIST>) {
    if (m/^Cutlist: /) {
	@opset = split(',', (split(' '))[1]);
    }
}
close(RLIST);

# Fix weirdly formatted fields.  A bare field is the starting cut, and
# a field ending in '-' extends to the end of the file.
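for (@opset) {
    # Assumed handling, inferred from the description above.
    $_ = "0-" . $_ unless m/-/;
    $_ .= int($duration * $frames_per_sec) if m/-$/;
}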

# Now, we do everything relative to the start of the file 

$seconds_begin = 0;
$seconds_end = $duration;

$total_delay_frames = 0;

$current_start = $seconds_begin;
$current_end = $seconds_end;

$tmpfile = $output . ".tmp";

for (@opset) {
    $cut_start = (split('-'))[0];
    $cut_stop = (split('-'))[1];

    $cut_start_seconds = int($cut_start / $frames_per_sec);
    $cut_stop_seconds = int($cut_stop / $frames_per_sec);
    # Delay accumulated from the cuts before this one.
    $total_delay_milliseconds = 1000 * int($total_delay_frames / $frames_per_sec);
    $total_delay_frames += $cut_stop - $cut_start;

    if ( $cut_start_seconds > $current_start ) {
	$interval_start = seconds_to_time($current_start);
	$interval_end = seconds_to_time($cut_start_seconds);
	system "ccextractor -o $tmpfile -startat $interval_start -endat $interval_end -delay -$total_delay_milliseconds $recording_fullpath\n";
	system "cat $tmpfile >> $output";
    }
    $current_start = $cut_stop_seconds;
}

# Extract whatever remains after the last cut.
if ( $current_start < $duration ) {
    $interval_start = seconds_to_time($current_start);
    $interval_end = seconds_to_time($duration);
    $total_delay_milliseconds = 1000 * int($total_delay_frames / $frames_per_sec);
    system "ccextractor -o $tmpfile -startat $interval_start -endat $interval_end -delay -$total_delay_milliseconds $recording_fullpath\n";
    system "cat $tmpfile >> $output";
}

unlink $tmpfile;

Invoke the script like this:

<script> 1020 2009-05-30T14:00:00 <output-file>

This will extract the captions from the recording that starts at 2:00 PM on May 30, 2009, recorded on channel 1020 (probably channel 20 of source 1). This script may not work with older versions of MythTV, prior to the introduction of storage groups. You may have to modify the MySQL username, password, or database name if you are not using the default values, and you may have to modify the PATH variable if the mysql or ccextractor binaries are not in one of the listed directories.
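To make the time correction concrete, here is a minimal standalone sketch of the arithmetic the script applies to a cutlist. The cutlist string and the 1800-second duration are made-up sample values, and kept_intervals is a name introduced here for illustration; it is not part of the script or of MythTV.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $frames_per_sec = 29.97;   # NTSC frame rate, as assumed by the script

# Format whole seconds as hh:mm:ss.
sub seconds_to_time {
    my $s = shift;
    return sprintf("%02d:%02d:%02d", int($s / 3600), int(($s % 3600) / 60), $s % 60);
}

# Given a cutlist of "start-stop" frame ranges and the recording length in
# seconds, return the kept intervals as [start_s, end_s, delay_ms] triples.
sub kept_intervals {
    my ($cutlist, $duration) = @_;
    my @kept;
    my $current_start = 0;   # where the next kept interval begins
    my $cut_frames = 0;      # frames removed so far
    for my $range (split /,/, $cutlist) {
        my ($cut_start, $cut_stop) = split /-/, $range;
        my $start_s = int($cut_start / $frames_per_sec);
        my $stop_s = int($cut_stop / $frames_per_sec);
        my $delay_ms = 1000 * int($cut_frames / $frames_per_sec);
        push @kept, [$current_start, $start_s, $delay_ms] if $start_s > $current_start;
        $cut_frames += $cut_stop - $cut_start;
        $current_start = $stop_s;
    }
    my $delay_ms = 1000 * int($cut_frames / $frames_per_sec);
    push @kept, [$current_start, $duration, $delay_ms] if $current_start < $duration;
    return @kept;
}

# Sample: cut the first 900 frames and frames 18000-21000 of a 30-minute recording.
for my $iv (kept_intervals("0-900,18000-21000", 1800)) {
    printf "%s to %s, shift captions back %d ms\n",
        seconds_to_time($iv->[0]), seconds_to_time($iv->[1]), $iv->[2];
}
```

For the sample values this reports two kept intervals, 00:00:30 to 00:10:00 shifted back 30000 ms and 00:11:40 to 00:30:00 shifted back 130000 ms. These correspond to the -startat, -endat and -delay arguments passed to ccextractor.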

Viewing transcoded recordings with captions in mplayer

The mplayer program automatically loads .srt files that match the media basename. That is, if you are watching Babylon5.mpg, it will automatically load captions from Babylon5.srt in the same directory. By using the script shown above after setting the cutlist, but before transcoding, you can produce a pair of files that will allow you to view the transcoded, commercial-cut recording with optional closed captions.
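If you are scripting this step, the output name to pass to the extraction script can be derived from the name of the transcoded file. The following is only a sketch: srt_path_for and the /video/ directory are hypothetical names used for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Return the .srt path that shares a directory and basename with a media file.
sub srt_path_for {
    my $media = shift;
    # Split into name, directory and the final ".ext" suffix.
    my ($name, $dir, $suffix) = fileparse($media, qr/\.[^.]*/);
    return $dir . $name . ".srt";
}

print srt_path_for("/video/Babylon5.mpg"), "\n";
```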

Creating DVDs with captions present

It should be possible to use the script above to insert captions into DVDs created with mytharchive. These captions will be in their own subtitle stream, so they can be turned on and off; they are not overlaid directly on the original video stream. The mytharchive script will have to be modified to call the caption-extracting script above and to put the .srt file into the appropriate directory. Then, after the .mpg file has been extracted and optionally requantized, it should create a file similar to this, call it 'b5.xml':

<subpictures>
  <stream>
    <textsub filename="/some/directory/" characterset="ISO8859-1"
         fontsize="24.0" font="/some/directory/FreeSans.ttf" horizontal-alignment="center"
         vertical-alignment="bottom" left-margin="60" right-margin="60"
         top-margin="20" bottom-margin="30" subtitle-fps="30"
         movie-fps="30" movie-width="720" movie-height="480"/>
  </stream>
</subpictures>

The width and height are for NTSC 4x3 video; they will have to be changed appropriately for other formats. Then, the .mpg file should be augmented with the captions using the command:

spumux -s0 b5.xml < origfile.mpg > with-captions.mpg

This will result in the creation of DVDs with a caption track associated with each .mpg that went into it, which can be toggled by the usual CC selection mechanism on the DVD playback system in use. I have tested the creation of .mpg files with the embedded captioning track, but have not yet tried to change the mytharchive Python script to write these to a DVD filesystem.
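Since the control file differs only in the subtitle path, frame size and frame rate, it could be generated rather than written by hand. The sketch below is an untested illustration: spumux_xml is a made-up helper name, /some/directory/b5.srt is a hypothetical path, and the remaining attribute values are copied from the example above. It wraps the textsub element in the subpictures and stream elements that spumux expects, and uses PAL dimensions (720x576 at 25 frames per second) as an example of a non-NTSC format.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a spumux control file for one .srt file.
# Expected keys: srt, font, width, height, fps.
sub spumux_xml {
    my (%opt) = @_;
    return <<"XML";
<subpictures>
  <stream>
    <textsub filename="$opt{srt}" characterset="ISO8859-1"
         fontsize="24.0" font="$opt{font}" horizontal-alignment="center"
         vertical-alignment="bottom" left-margin="60" right-margin="60"
         top-margin="20" bottom-margin="30" subtitle-fps="$opt{fps}"
         movie-fps="$opt{fps}" movie-width="$opt{width}" movie-height="$opt{height}"/>
  </stream>
</subpictures>
XML
}

print spumux_xml(
    srt    => "/some/directory/b5.srt",        # hypothetical path
    font   => "/some/directory/FreeSans.ttf",
    width  => 720, height => 576, fps => 25,   # PAL instead of NTSC
);
```

Redirect the output to b5.xml and pass that file to spumux as shown earlier.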