Captions with HD-PVR
Revision as of 17:42, 5 February 2013
Author | Christopher Neufeld |
Description | This set of scripts provides a way, given the right hardware, to record closed-caption data for HD-PVR recordings. |
Supports |
Because there is no defined standard for the transmission of closed-caption information over high-definition connections such as component or HDMI, it is not currently possible to obtain closed-caption data from recordings produced by a Hauppauge HD-PVR. One alternative that may be suitable for some users is to have the STB (set-top box) render the captions, so that the HD-PVR records them as open captions. The disadvantage is that such captions cannot be turned off at viewing time, because they are an inextricable part of the video data. The technique described here instead records true closed captions, with caption text that can be turned on or off as desired while viewing a recording.
This works provided that the STB behaves as described under Prerequisites, and that you have a card capable of reading the VBI data from a composite or coaxial standard-definition stream. The procedure has been tested with a Hauppauge PVR-500.
Please note that this procedure includes post-processing after a recording is complete, so it does not work with live television, or with programs that are being watched while still being recorded. It will work for viewing recordings that have completed. The post-processing stage takes only seconds, so the show can be viewed almost immediately.
As with several other scripts, this works by dropping a .srt file into the storage directory. MythTV will automatically use such a file if it is found. The .srt file is obtained by scanning the standard-definition outputs of the STB.
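As a rough illustration of the naming involved, the finalize script further down derives the caption filename by swapping the recording's .mpg suffix for .srt. The helper below is a minimal sketch of that rule only; the function name srt_name_for and the sample filename are made up for this example and are not part of the scripts.

```shell
#!/bin/bash
# Hypothetical helper (not part of the hd-captions scripts): derive the
# .srt filename that sits next to a recording in the storage directory.
srt_name_for() {
    local recording="$1"
    case "$recording" in
        *.mpg) printf '%s\n' "${recording%.mpg}.srt" ;;
        *)     return 1 ;;   # refuse anything that is not an .mpg file
    esac
}

# Sample filename invented for illustration
srt_name_for "1234_20130205174200.mpg"   # prints 1234_20130205174200.srt
```

Refusing non-.mpg names mirrors the safety check in the finalize script, which exits rather than risk overwriting the recording itself.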
Changes
2013-02-05:
- Added a v4l2-ctl command to explicitly set the bitrate. There is some evidence that if the card's current bitrate (from either boot-time configuration or a previous use of the card) is too high, ccextractor may error out partway through with a message like "Error: Not enough memory. Please report this: 65536 bytes is not enough!"
2013-02-01:
- Put in changes to kill a ccextractor doing VBI reads from an earlier recording if it somehow is still running
- Following some suggestions from stichnot:
  - Explicitly use /bin/bash instead of /bin/sh to ensure we have pushd/popd available
  - Change the delay_bias to be in milliseconds rather than in seconds
  - Add a configurable post-roll overrun to ensure that captions are recorded during that interval when it is being used
Prerequisites
Before proceeding, first determine whether you have the hardware required.
- You must have a STB that has standard-definition outputs as well as the component outputs used by the HD-PVR. These can be composite or coaxial. If using a coaxial connection, some changes will have to be made to the script: you will have to choose the correct input number and set the tuner frequency.
- Your STB must produce output on the standard-definition outputs even when tuned to a high-definition channel.
- Your STB must include VBI data in the standard-definition outputs.
To test these requirements, connect your television set to the standard-definition outputs of the STB. Tune the STB to a high-definition channel, and then use the television's internal settings (not the STB settings) to select closed captions. If you see captions, then your STB is suitable for use with this technique. Note that not all programs will have captions, and sometimes commercials or promos don't have them, so you might have to check several high-definition channels to determine whether or not your STB transmits VBI data.
Next, you must have a hardware device capable of reading the VBI stream from a standard-definition analogue stream. In my case, my backend has a PVR-500 card, which can do that. I connected the composite outputs of the STB to the composite inputs of the PVR-500. Note that you only have to connect one cable, on the video plug. The two audio plugs aren't necessary for this operation; I've plugged in all three only because I don't have individual cables, only triplet cables.
You must be using at least MythTV 0.23, because we are using system events.
You must have installed the CCExtractor program. I have tested this with ccextractor version 0.59.
Parameters to determine
You may have to modify some parameters in these scripts. They should all be adjustable by editing the hd-captions-common.sh script, not the other two. The required parameters are:
- The CardID of your HD-PVR. Mine is '1'. You can determine this by running the MySQL command "select cardid,cardtype from capturecard;", or simply modify the script to write cardid to a file and exit, then start a recording on the HD-PVR.
- The pathname of the device file to the VBI-extracting hardware. In my case, that's the second module of my PVR-500, and on my system that's /dev/pvr_500_2
- The input number of the composite input on the VBI-extracting hardware. In my case, that's '2'.
- A working directory, writable by the UID that runs the mythbackend. I have a tmp directory in /myth, so I've set the working directory prefix to point under that. The script will create a new directory in which it will work, and will remove the directory when it completes.
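- A bias (delay_bias_ms) can be set here, or it can be left at zero. If set to a positive number, captions will appear that many milliseconds earlier in the stream. This is to allow for the possibility that there is an undetected systematic delay on your particular hardware, one that requires correction.
- A post-roll time (post_roll_seconds). If your recordings run past the scheduled end by some number of seconds, set this to at least that number so that captions are not lost during the post-recording interval.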
There is also a set of binaries used by the scripts. You should verify that the pathnames are correct for your system. In particular, ccextractor might be installed somewhere other than in /usr/bin.
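One way to do that check is sketched below. The function check_tools is a hypothetical helper, not part of the original scripts; it simply reports every path it is given that is not an executable file.

```shell
#!/bin/bash
# Hypothetical helper: report any listed path that is not an executable
# file. Returns 0 if every path is fine, 1 otherwise.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if [ ! -x "$tool" ]; then
            echo "missing or not executable: $tool" >&2
            missing=1
        fi
    done
    return $missing
}

# Intended use, after sourcing hd-captions-common.sh in the same shell:
#   check_tools "$CCEXTRACTOR" "$FFPROBE" "$V4L2_CTL" "$FUSER" "$EXPR" "$DATE"
```

Any path it reports should be corrected in hd-captions-common.sh before the system events are configured.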
Setting up
Copy all three scripts to the same directory. They should be made executable, and should be in a directory that is readable by the UID that runs the mythbackend.
In mythtv-setup, go to the screen "System Events". Add two new events. Under "Recording started", using the complete pathname, insert the script hd-captions-start.sh:
/SOME/DIR/hd-captions-start.sh "%CARDID%" "%CHANID%" "%STARTTIMEISOUTC%" "%ENDTIMEISOUTC%"
Under "Recording finished", insert the script hd-captions-finalize.sh:
/SOME/DIR/hd-captions-finalize.sh "%CARDID%" "%CHANID%" "%STARTTIMEISOUTC%" "%ENDTIMEISOUTC%" "%DIR%" "%FILE%"
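Both scripts convert the %ENDTIMEISOUTC% argument to seconds since the epoch by replacing the "T" separator with a space and handing the result to date. The sketch below isolates that step so it can be tried by hand. It assumes GNU date (for the --date option) and a timestamp of the form YYYY-MM-DDTHH:MM:SS; the function name iso_utc_to_epoch is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: convert an ISO-style UTC timestamp, as assumed to be
# passed in %ENDTIMEISOUTC%, to seconds since the epoch.
# Relies on GNU date's --date option.
iso_utc_to_epoch() {
    local stamp
    stamp=$(printf '%s' "$1" | tr 'T' ' ')   # "1970-01-02T00:00:00" -> "1970-01-02 00:00:00"
    date -u --date="$stamp" +%s
}

iso_utc_to_epoch "1970-01-02T00:00:00"   # prints 86400
```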
Also in mythtv-setup, under "Input Connections", on the second page for the HD-PVR, create a new recording group for the HD-PVR, and add the VBI-decoding hardware to that recording group. That ensures that the backend will not try to schedule a recording on your VBI-decoding hardware while you're using it to extract captions for the HD-PVR. Note that the backend must be restarted for this recording group change to be noticed by the scheduler.
Finished
You can now restart your backend, and you should get recordings with closed captions, selectable at viewing time.
This script sets up the variables used by the two other scripts.
#! /bin/bash
#
# Variables used by the hd-captions scripts

#########################################
#
# VERIFY THESE PATHS FOR YOUR SYSTEM
#
#########################################

CCEXTRACTOR=/usr/bin/ccextractor
FFPROBE=/usr/bin/ffprobe
V4L2_CTL=/usr/bin/v4l2-ctl
GREP=/bin/grep
AWK=/bin/awk
SED=/bin/sed
FUSER=/usr/bin/fuser
EXPR=/usr/bin/expr
DATE=/bin/date
PRINTF=/usr/bin/printf
CP=/bin/cp
RM=/bin/rm
TOUCH=/bin/touch
MKDIR=/bin/mkdir
SLEEP=/bin/sleep

#########################################
#
# EDIT THESE PARAMETERS IF NECESSARY
#
#########################################

CAPTION_DEVICE=/dev/pvr_500_2
CAPTION_INPUT_NUM=2      # the composite input
HD_PVR_CARDID=1
workdir_prefix=/myth/tmp/captions_
delay_bias_ms=0          # set this to any consistent value, a number of
                         # milliseconds earlier that you want to see all
                         # captions appearing
post_roll_seconds=0      # If you record past the end of a recording by
                         # some seconds, set this value to at least this
                         # number to avoid losing captions in this
                         # post-recording interval
Recording start script
This script performs the following functions, in order:
- Verifies that this recording is being made on the HD-PVR
- Builds a pathname for its working directory
- Parses out the passed parameters to determine how long the recording is
- Verifies that our working directory doesn't exist
- Creates the working directory, and chdirs into it
- Spawns a subshell to do the work
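- Makes sure that the VBI-decoding device isn't in use, and waits for it to be free (can happen if you have "end late" set to a negative number). If the device is still busy after about five seconds and the process holding it is ccextractor, that process is killed
- Enables sliced VBI capture and sets a moderate bitrate on the VBI-decoding device
- Selects the appropriate input on the VBI-decoding device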
- Records the time, in seconds since the epoch, when the caption extraction began, for use by the finalize script
- Runs ccextractor for the duration of the show, recording the results in its own binary format
- Records the PID of the ccextractor program in a file, so that the finalize script can kill it if the user manually requests an early stop to the recording.
#! /bin/bash
#
# Invoke with CARDID CHANID STARTTIMEISOUTC ENDTIMEISOUTC

. `dirname $0`/hd-captions-common.sh

cardid=$1
chanid=$2
starttime=$3
endtime=$4

# Only do this for the HD-PVR input
if [ $cardid -ne ${HD_PVR_CARDID} ]
then
    exit 0
fi

workdir=${workdir_prefix}${chanid}_${starttime}

epoch1=`${DATE} +%s`
subst=`echo $endtime | tr "T" " "`
epoch2=`${DATE} -u --date="$subst" +%s`

duration=`${EXPR} $epoch2 - $epoch1 + $post_roll_seconds`
hours=`${EXPR} $duration / 3600`
rem=`${EXPR} $duration % 3600`
mins=`${EXPR} $rem / 60`
secs=`${EXPR} $rem % 60 `

if [ -e $workdir ]
then
    echo "Working directory name collision"
    exit 1
fi

${MKDIR} $workdir
cd $workdir

(
    # Avoid a potential race condition if the previous recording
    # hasn't finished reading captions (can happen if the end-late
    # time is negative).

    ctr=0
    while ${FUSER} ${CAPTION_DEVICE} >/dev/null 2>&1
    do
        ${SLEEP} 1
        ctr=`${EXPR} $ctr + 1`

        # If it is taking too long, axe the ccextractor run that has
        # wedged the device.
        if [ $ctr -gt 5 ]
        then
            # fuser appends "e" to the PID of a process that is running
            # the named file as its executable, so the holder of the
            # device is ccextractor when its PID plus "e" matches.
            vidholder=`${FUSER} ${CAPTION_DEVICE} |& ${AWK} ' { print $2 } '`
            ccpid=`${FUSER} ${CCEXTRACTOR} |& ${AWK} ' { print $2 } '`
            if [ "${vidholder}e" = "${ccpid}" ]
            then
                kill ${vidholder}
            fi
        fi
    done

    # Enable VBI (thanks Jpoet)
    ${V4L2_CTL} -d ${CAPTION_DEVICE} --set-fmt-sliced-vbi=cc --set-ctrl=stream_vbi_format=1

    # Set bitrate to something medium to avoid ccextractor overflows
    ${V4L2_CTL} -d ${CAPTION_DEVICE} -c video_bitrate=4500000 -c video_peak_bitrate=6000000

    ${V4L2_CTL} -i ${CAPTION_INPUT_NUM} -d ${CAPTION_DEVICE}
    ${DATE} -u "+%s" > captions-start.txt
    ${CCEXTRACTOR} -s ${CAPTION_DEVICE} -utf8 -endat `${PRINTF} "%02d:%02d:%02d" $hours $mins $secs` -out=bin -o pass1.srtbin >/dev/null 2>&1 &
    ccpid=$!
    echo $ccpid > extractor_pid.txt
    wait $ccpid
    ${RM} extractor_pid.txt
) </dev/null >/dev/null 2>&1 &

exit 0
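The duration arithmetic in the script above can be checked in isolation. The sketch below uses bash's built-in arithmetic instead of expr, and the function name format_endat is made up for this example. It turns a number of seconds into the HH:MM:SS string that the script passes to ccextractor's -endat option.

```shell
#!/bin/bash
# Hypothetical helper: format a duration in seconds as HH:MM:SS, the same
# shape the start script builds for ccextractor's -endat option.
format_endat() {
    local duration="$1"
    printf '%02d:%02d:%02d\n' \
        $((duration / 3600)) \
        $((duration % 3600 / 60)) \
        $((duration % 60))
}

# A recording that ends 1800 seconds from now, with post_roll_seconds=30
format_endat $((1800 + 30))   # prints 00:30:30
```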
Recording finalize script
This script, executed once the recording has finished, performs the following steps, in order:
- Verifies that this was an HD-PVR recording
- Looks for the working directory, and exits if it wasn't found
- Enters the working directory
- Forks a subshell to do the work
- Uses ffprobe to parse out the duration of the HD-PVR recording. I find that it can take the HD-PVR up to dozens of seconds to start recording, so we can't assume that it is as long as the requested recording interval.
- Uses the end time of the recording (assumed correct) and the length of the recording to deduce the starting time of the HD-PVR stream
- Compares the starting time of the HD-PVR stream with that of the VBI stream, and computes the time offset
- Runs ccextractor on the previously-collected data, correcting for the time offset
- Kills the first ccextractor run from the startup script, if it's still running, to free up the device file for immediate use
- Builds the .srt filename
- Makes sure we don't clobber the recording file if something's wrong
- Copies the .srt file to its final position
- Removes the working directory and exits
#! /bin/bash
# Invoke with CARDID CHANID STARTTIMEISOUTC ENDTIMEISOUTC DIR FILE

. `dirname $0`/hd-captions-common.sh

cardid=$1
chanid=$2
starttime=$3
endtime=$4
dir=$5
file=$6

# Only do this for the HD-PVR input
if [ $cardid -ne ${HD_PVR_CARDID} ]
then
    exit 0
fi

workdir=${workdir_prefix}${chanid}_${starttime}

if [ ! -d $workdir ]
then
    exit 0
fi

pushd $workdir

# Fork off a subshell to do this work
(
    # The pattern allows any amount of leading whitespace before "Duration:"
    rec_duration=`${FFPROBE} $dir/$file 2>&1 | ${GREP} '^ *Duration: ' | \
        ${AWK} -F: ' { print $2 * 3600 + $3 * 60 + int($4) } '`

    # OK, it's a bit awkward here. The captions might have started early,
    # and might have ended early (recording past the end of the slot isn't
    # passed in ENDTIMEISOUTC for recording start scripts). The recording
    # should have ended on time, at ENDTIMEISOUTC. So, we can compute the
    # start time of the recording
    subst=`echo $endtime | tr "T" " "`
    epoch2=`${DATE} -u --date="$subst" +%s`
    recording_start=`${EXPR} $epoch2 - $rec_duration`
    captions_start=`cat captions-start.txt`

    # Offset in seconds from the start of the caption capture to the start
    # of the HD-PVR stream (positive when the captions started first)
    diff_start=`${EXPR} $recording_start - $captions_start`
    diff_msecs=`${EXPR} $diff_start '*' 1000 + $delay_bias_ms`
    diff_start=`${EXPR} $diff_msecs / 1000`
    diff_mins=`${EXPR} $diff_start / 60`
    diff_secs=`${EXPR} $diff_start % 60`

    if [ ${diff_start} -lt 0 ]
    then
        # Captions started after the recording: shift them later
        diff_msecs=`${EXPR} ${diff_msecs} / -1`
        ${CCEXTRACTOR} -utf8 -delay ${diff_msecs} pass1.srtbin -o result.srt
    else
        # Captions started first: skip the lead-in and shift them earlier
        ${CCEXTRACTOR} -utf8 -startat ${diff_mins}:${diff_secs} -delay -${diff_msecs} pass1.srtbin -o result.srt
    fi

    if [ -f extractor_pid.txt ]
    then
        kill `cat extractor_pid.txt`
    fi

    ofile=`echo $file | ${SED} -e 's|.mpg$|.srt|'`
    if [ $ofile = $file ]
    then
        exit 1
    fi

    ${CP} result.srt $dir/$ofile
    popd
    ${RM} -fr $workdir
) </dev/null >/dev/null 2>&1 &

exit 0
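The offset correction can be tried out without any hardware. The sketch below is an illustration under stated assumptions, not the script's own code: it assumes the offset is the HD-PVR stream's start time minus the caption capture's start time, it uses bash arithmetic rather than expr, and it decides the branch on the millisecond value rather than on whole seconds. The function name caption_offset_args is made up; the -startat and -delay options are the ones the finalize script passes to ccextractor.

```shell
#!/bin/bash
# Hypothetical helper: print the ccextractor timing options for a given
# recording start, caption start (both in epoch seconds) and bias (ms).
# Assumes offset = recording_start - captions_start.
caption_offset_args() {
    local recording_start="$1" captions_start="$2" delay_bias_ms="$3"
    local diff_msecs=$(( (recording_start - captions_start) * 1000 + delay_bias_ms ))

    if [ "$diff_msecs" -lt 0 ]; then
        # Captions began after the video: push caption times later
        echo "-delay $(( -diff_msecs ))"
    else
        # Captions began first: skip the lead-in, pull caption times earlier
        local diff_start=$(( diff_msecs / 1000 ))
        echo "-startat $(( diff_start / 60 )):$(( diff_start % 60 )) -delay -${diff_msecs}"
    fi
}

# Video started 25 s after caption capture, no bias
caption_offset_args 1000 975 0    # prints -startat 0:25 -delay -25000
```

With a positive delay_bias_ms the negative delay grows, which is what makes captions appear earlier in the stream.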