Mythpycutter

From MythTV Official Wiki

Warning: An updated script, based on both Mythcutprojectx and this one, is now at MythDVBcut.

MythDVBcut is not limited to mpeg2video recordings and uses MythTV tools from 0.25 onwards.

In it, the link between the editor-based cutlist and the cutpoints actually used differs slightly from that in both earlier versions. --Johnp 11:38, 8 March 2013 (UTC)


This script works in a similar way to mythcutprojectx but handles recordings in formats for which Project-X cannot be used. I'm not sure whether it duplicates the function of other packages, and it may sound like a throwback to the early days of digital TV, but I get the impression that it would be useful.

It just cuts at the keyframe nearest to each edit point, concatenates the wanted segments, and fixes the database to match. Unlike Project-X, it makes no corrections for transmission or other errors and does nothing about AV sync. This doesn't seem to be a problem for DVB recordings, where I believe the TS format does that job. The log file identifies the edit points in the output file, and there are sometimes brief audio glitches at those points. For DVB recordings, but maybe not for locally encoded ones, keyframes often occur at or near natural edit points.
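
As a rough illustration of the keyframe-snapping step, the following Python 3 sketch picks the keyframe nearest to an edit point. The function name `nearest_keyframe`, the in-memory list of (frame, byte offset) pairs and the sample values are assumptions made for the example; the script itself reads the seek table from the database. The 50-frame window mirrors the range used in the script's SQL query, and a tie goes to the earlier keyframe.

```python
from bisect import bisect_left


def nearest_keyframe(keyframes, edit_frame, window=50):
    """Return the (frame, byte_offset) keyframe closest to edit_frame.

    keyframes: list of (frame, byte_offset) tuples sorted by frame,
               standing in for seek-table rows (hypothetical data).
    Returns None if no keyframe lies within `window` frames.
    """
    frames = [frame for frame, _ in keyframes]
    pos = bisect_left(frames, edit_frame)
    # Only the keyframes just before and just after the edit point matter.
    candidates = keyframes[max(pos - 1, 0):pos + 1]
    candidates = [k for k in candidates if abs(k[0] - edit_frame) < window]
    if not candidates:
        return None
    # min() keeps the first of equal candidates, i.e. the earlier keyframe.
    return min(candidates, key=lambda k: abs(k[0] - edit_frame))


# Made-up seek table: keyframes every 12 or so frames.
seektable = [(0, 0), (12, 4000), (25, 9000), (40, 15000)]
snapped = nearest_keyframe(seektable, 14)
```

Each snapped edit point supplies a frame number for the log and a byte offset at which the file is actually cut.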

It comes in two parts: a Python module and a bash script that calls it. It isn't set up as a real 'user job' script; I run it from a terminal. Since I have only used it on my stand-alone laptop, it isn't set up to use Storage Groups. I don't think it would be difficult to convert it to do that, but that isn't how I've used it.

Anyone wishing to do this conversion might like to examine this post by Raymond Wagner, which includes a suggested translation of mythcutprojectx using Python Bindings. http://www.gossamer-threads.com/lists/mythtv/users/517197#517197

Please note that I'm not a full-time programmer and the usual caveats and lack of guarantee will apply.

Note, too, that the bash script is for 0.25 only. The main difference is that mythutil now handles the cutlist operations (--getcutlist, --clearcutlist), which took other forms in earlier versions.


pycutter.sh

#!/bin/bash
. ~/.mythtv/mysql.txt      # for DB access info

# Copyright (C) 2012 John Pilkington 
# Uses ideas from scripts posted by Tino Keitel and Kees Cook in the Mythtv lists.

# Usage: ./pycutter <recording>    ...or...
# ionice -c3 ./pycutter <recording> will reduce io priority and is recommended.
# <recording> is an mpeg2 file recorded by MythTV with a valid DB entry.

# This script is essentially a terminal-based replacement for the 'lossless' mpeg2 mythtranscode,
# but it might be usable for any format of recording that has a MythTV seektable.
# It was developed from mythcutprojectx for use with formats for which Project-X cannot be used.

# It will pass the recording and the MythTV cutlist to pycut.py, which copies and concatenates 
# the required chunks without any other processing.  No special measures are taken to preserve AV sync, 
# but for some recordings ( eg DVB TS format ) this may happen during normal playback.  
# If the cutlist is empty the entire recording will be processed - essentially a file copy.
  
# It then clears the cutlist, updates the filesize in the database, rebuilds the seek table and creates a new preview.
# The result should be acceptable as a recording within MythTV and perhaps as an input to MythArchive.
# The logfile includes the positions in the new file at which deletions have been made. 

# The script needs to be edited to define some local variables.  
# It has been used on a stand-alone laptop and will not recognise Storage Groups.

####################

# Variables RECDIR, LOGDIR need to be customised.
#
# TESTRUN=true gives a dry run: cutlists are shown and no changes are made to the input file.

##---------------
RECDIR=/home/john/Mythrecs
LOGDIR=/home/john/Logs

INVERT=false   # setting for use in 0.24 and later

#TESTRUN=true    # cutlists will be shown but the recording will be unchanged  
TESTRUN=false  # the recording will be processed

#CUTMODE="FRAMECOUNT"
CUTMODE="BYTECOUNT"
#################

if [ "$1" = "-h" ] || [ "$1" = "--help" ] ; then
echo "Usage: "$0" <recording>"
echo "<recording> is a file recorded by MythTV with a valid DB entry and seektable."
echo "e.g. 1234_20100405123400.mpg in one of the defined RECDIRs"
echo "The output file replaces the input file which is renamed to <recording>.old"
exit 0
fi

# exit if .old file exists

if  [ -f ${RECDIR}/"$1".old ] ; then 
    echo " ${RECDIR}/"$1".old exists: giving up." ; exit 1
fi


# Customize with paths to alternative recording and temp folders

cd "$RECDIR" || exit 1
if  [ ! -f "$1" ] ; then
       echo " "$1" not found.  Giving up"
       cd ~
       exit 1
fi 

#Now do the actual processing
# chanid and starttime identify the recording in the DB
chanid=$(echo "select chanid from recorded where basename=\"$1\";" |
mysql -N -u${DBUserName} -p${DBPassword} -h${DBHostName} $DBName )

starttime=$(echo "select starttime from recorded where basename=\"$1\";" |
mysql -N -u${DBUserName} -p${DBPassword} -h${DBHostName} $DBName )

#exit
echo -e "\nLogfile listing:\n" > log$$
echo -e "Logfile is log$$ \n" | tee -a log$$

echo "chanid ${chanid}   starttime ${starttime} " >> log$$
starttime=$(echo  ${starttime} | tr -d ': -') 
echo "Reformatted starttime ${starttime}" >> log$$

command="mythutil --getcutlist --chanid  $chanid --starttime $starttime -q"
echo "Running: ${command}" >> log$$
mythutilcutlist=$(${command})
echo "${mythutilcutlist}" >> log$$
if [ "${mythutilcutlist}" = "Cutlist: " ] ; then
  echo "Cutlist was empty; inserting dummy EOF" >> log$$
  mythutilcutlist=" 9999999 "
fi

echo -e "\nCutframe list from editor: " >> log$$
echo "${mythutilcutlist}" | tr -d '[:alpha:]' | tr '[:punct:]' ' ' | tee  edlist$$ >> log$$

#cat edlist$$ | tee -a log$$
echo

#
#Reverse the sense of the cutlist 
#
echo -n > revedlist$$
for i in  $(cat edlist$$) ; 
do
   if  [ $i -eq 0 ] ;  then
      for j in  $(cat edlist$$) ; 
      do
         if  [ $j != 0 ] ; then echo -n "$j " >> revedlist$$
         fi
      done
   else
         echo -n " 0 " >> revedlist$$
         for j in  $(cat edlist$$) ; 
         do
             echo -n "$j " >> revedlist$$
         done
   fi
   break
done
echo >> revedlist$$
echo -e "\nPassframe (reversed cutframe) list from editor: " >> log$$
cat revedlist$$ | tee -a log$$
 
# For a byte-count cutlist, (PX CutMode=0)
# mark is the frame count in the seektable and is compared here with the editpoint
# mark type 9 is MARK_GOP_BYFRAME (see eg trac ticket #1088) with adjacent values typically separated by 12 or more,
# so that is presumably the cutpoint frame granularity in bytecount mode. 
# Subjectively the granularity seems smaller, but this may not apply to locally encoded recordings; 
# in dvb-t and similar systems the spacing of keyframes apparently depends on pre-transmission edits of content. 
#
# Find the keyframe (mark type 9) nearest to each cut mark,
# write the byte offset and frame number into one-line cutlists.
#
for i in $(cat revedlist$$) ;
        do echo "SET SESSION sql_mode='NO_UNSIGNED_SUBTRACTION' ; 
             select offset, mark from recordedseek
             where chanid=$chanid and starttime='$starttime' and type=9 
             and mark > ($i-50) and mark < ($i+50) order by abs(2*(mark-$i)+1) limit 1 ;" |
             mysql -N -u${DBUserName} -p${DBPassword} -h${DBHostName} $DBName          
done > tmp$$
#cat tmp$$
echo
echo -e "\nKeyframe passlist via DB.  First is a cut-in:" >> log$$
cut -f2 tmp$$ | tr "\n" " "  | tee -a log$$ > keylist$$
echo >> keylist$$
echo >> log$$
#cat log$$
echo -e "\nByte offsets of switchpoints in original file, via DB.  First is a cut-in:" >> log$$
cut -f1 tmp$$ |  tr "\n" " "  | tee -a log$$ > bytelist$$
echo >> bytelist$$
echo >> log$$
rm tmp$$

echo -e "#!/bin/bash\n mv  '$RECDIR/$1' '$RECDIR/$1.old' "  > pyscript$$
echo -en " ionice -c3 ~/pycut.py '$RECDIR/$1.old' '$RECDIR/$1' " >> pyscript$$

#cat bytelist$$  | tr  "\n"  "  " | tee  -a  pyscript$$  | tee  -a log$$ # get switchpoints on one line
cat bytelist$$  >>  pyscript$$ 
#echo >> pyscript$$

echo -e "\nSwitchbyte positions in new file:" >> log$$
J=0
S=0                           # 0 or 1 for cut or pass lists
for i in  $(cat bytelist$$) ;  
do 
  if [ $S -eq 0 ] ; then
      J=$((J - i))           
      S=1
  else
      J=$((J + i))
      S=0
      echo -n "$J " >> log$$
   fi
done 
echo >> log$$

echo -e "\nSwitchframe positions in new file:" >> log$$
J=0
S=0                           # 0 or 1 for cut or pass lists
for i in  $(cat keylist$$) ;  
do 
  if [ $S -eq 0 ] ; then
      J=$((J - i))           
      S=1
  else
      J=$((J + i))
      S=0
      echo -n "$J " >> log$$
   fi
done 
echo >> log$$
echo
cat log$$

# create script to be run
echo -e "\nThis is pyscript$$, the script that will be run if TESTRUN is false:\n"
cat pyscript$$
chmod +x pyscript$$
echo -e "\nTo run this use ./pyscript$$. \n"


if $TESTRUN ; then
   echo "Quitting because TESTRUN is ${TESTRUN}"
   # Remove the intermediate lists; log$$ and pyscript$$ are left for inspection
   rm -f edlist$$ revedlist$$ keylist$$ bytelist$$
   cd ~
   exit 0
fi

# Now do the actual cutting and concatenation
# mv  "$1" "$1".old
echo "Running : " 
cat pyscript$$
echo
echo
./pyscript$$
echo

# Cutting completed.  Now clean up.
  
# tell mythDB about new filesize and clear myth cutlist
FILESIZE=`du -b "$1" | cut -f 1`
if [ "${FILESIZE}" -gt 1000000 ]; then
      echo "Running: update recorded set filesize=${FILESIZE} where basename=\"$1\";"
      echo "update recorded set filesize=${FILESIZE} where basename=\"$1\";" | 
      mysql -u${DBUserName} -p${DBPassword} -h${DBHostName} $DBName
#     mysql -u mythtv -p$PASSWD mythconverg

      echo -e "Filesize has been reset.\n"

      echo "Running: ionice -c3 mythutil  --clearcutlist  --chanid "$chanid" --starttime "$starttime" -q"
      ionice -c3 mythutil  --clearcutlist  --chanid "$chanid" --starttime "$starttime" -q

#      echo -e "Cutlist has been cleared.\n" 
fi

#rebuild seek table
echo "Running: ionice -c3 mythcommflag --rebuild --file $1 " 
ionice -c3 mythcommflag --rebuild --file "$1"

#echo -e "Seek table has been rebuilt.\n"

#echo  "The cutlist was applied in ** "$CUTMODE" ** mode."
echo -e "Output file is $1. \n" 

# Get tech details of output file into the log.
echo -e "\nRunning:  mythffmpeg -i "$1" 2>&1 | grep -C 4 Video" | tee -a log$$
echo
mythffmpeg -i "$1" 2>&1 | grep -C 4 Video | tee -a log$$
echo

mv log$$ ${LOGDIR}/$1.pycut$$.txt
rm bytelist$$
rm edlist$$
rm keylist$$
rm revedlist$$
rm pyscript$$
cd ~

# mythpreviewgen isn't essential here so put it where failure won't cause other problems.
#rm -f "$1".png  #  Delete the old preview
echo "Running: mythpreviewgen  --chanid "$chanid" --starttime "$starttime" -q "
ionice -c3 mythpreviewgen --chanid "$chanid" --starttime "$starttime" -q
echo -e "Preview created.\n"

exit 0
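
The cutlist handling in pycutter.sh may be easier to follow in Python. This is only a sketch in Python 3: `invert_cutlist` and `join_positions` are hypothetical names and the frame numbers are made up. The first function turns the editor's list of cut points into a list of wanted segments; the second gives the running total of kept material at the end of each complete segment, which is what the log reports as switch positions in the new file.

```python
def invert_cutlist(cut_points):
    """Turn [cut_start, cut_end, ...] into [keep_start, keep_end, ...]."""
    if cut_points and cut_points[0] == 0:
        # The recording starts inside a cut, so the first wanted
        # segment begins where that cut ends: drop the leading 0.
        return [p for p in cut_points if p != 0]
    # The recording starts with wanted material: keep from 0.
    return [0] + list(cut_points)


def join_positions(keep_points):
    """Positions in the output at which a wanted segment ends.

    An unpaired final start point (segment running to end of file)
    contributes nothing, as in the shell loop.
    """
    total = 0
    joins = []
    for start, end in zip(keep_points[0::2], keep_points[1::2]):
        total += end - start
        joins.append(total)
    return joins


# Made-up example: cut the first 500 frames and frames 3000 to 3600.
keep = invert_cutlist([0, 500, 3000, 3600])
joins = join_positions(keep)
```

The same arithmetic applies whether the points are frame numbers or byte offsets, which is why the shell script runs the loop twice.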




pycut.py

#! /usr/bin/env python
# -*- coding: utf-8 -*-

# Cut out and concatenate sections of a file

import sys, os
#print sys.argv


######################
## For tests
##
## echo "0123456789A123456789B123456789C123456789D123456789E123456789F123456789" > ~/test.txt 
## 
## fn1 = './test.txt'
## fn2 = './temp.txt'
## chunks = [ 3, 12, 35, 47, 53, 68  ]
## buflen = 5

## Doesn't recognise '~/test.txt' but this, or full path, seems ok
## python pycut.py './test.txt' './temp.txt' 3 12 35 47 53 68  

# Zambesi HD

#./printcutlist /home/john/Mythrecs/1054_20120328222600.mpg
# Generates byte-mode cutlist for use with Project-X  - and here
#CollectionPanel.CutMode=0

#fn1 = '/mnt/f10store/myth/reca/1054_20120323002600old.mpg'
#fn2 = '/mnt/sam1/recb/1054_20120323002600.mpg'
#chunks = [ 390284804, 4556742872 ]
#buflen = 1024*1024
#
########################

fn1 = sys.argv[1]  # input file
fn2 = sys.argv[2]  # output file
chunks = list( map( int, sys.argv [ 3 : ] ) )  # start and end bytes of chunks in infile
buflen = 1024*1024
#bignum = 10000000000                   # for use as EOF if needed
# less likely to be surprised if we use the actual filesize here

print("infile         %s" % fn1)
print("outfile        %s" % fn2)
print("switchpoints   %s" % chunks)

#######################

# sanity checks

chunklen = len(chunks)
if chunklen % 2 :    # odd number of switchpoints: last chunk runs to EOF
#    chunks.append(bignum)
    chunks.append( 1 + os.path.getsize(fn1))
    chunklen = len(chunks)

# adjust chunk-endpoints in the hope of keeping chain linkage in the data intact
n = 1
while n < chunklen :
  chunks[n] += -1
  n += 2
  
n=0
while n < chunklen - 2 :
   if chunks[n] > chunks[n+1] :
      print("Quitting: switchpoints out of order")
      sys.exit(98)
   n += 1

print("Adjusted switchpoints   %s" % chunks)

n = 0
m = 0
offset = [ 0 ]
while n < chunklen - 1 :
   m += 1 + chunks[ n+1 ] - chunks[ n ]
   offset.append( m )
   n += 2

print("")
print("Byte offsets of cutpoints in output file:  %s" % offset)
print("DB table is recordedseek, mark (framecount) is type 9.")
##################################
# Don't touch stuff below here 
## byte numbering starts at 0 and output includes both chunk-endpoints

chnklim = len(chunks) - 1 
nchnk = 0
chstart=chunks[nchnk]
chend=chunks[nchnk + 1]
bufstart = 0

f1 = open(fn1, 'rb')
f2 = open(fn2, 'wb')

while True :
  data = f1.read(buflen)
  lendat = len(data)
  if lendat == 0 :
       break
  bufend = bufstart + lendat 
  while chstart < bufend :
       if chend <  bufend :
           f2.write(data[chstart - bufstart : chend - bufstart + 1 ])
           nchnk += 2
           if nchnk > chnklim :             # job done
               chstart = bufend + buflen*2  # kill further looping
               break
           
           chstart = chunks[nchnk]
           chend   = chunks[nchnk + 1]
       else :
           f2.write(data[chstart - bufstart :  ])
           chstart = bufend 
 
  bufstart += lendat
 
f1.close()
f2.close()
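
For comparison, here is a minimal Python 3 sketch of the same cut-and-concatenate idea that seeks to each chunk instead of scanning the whole input. It is not the script above: `copy_chunks` is a hypothetical name, and it assumes each pair of switchpoints means "copy from the first byte up to, but not including, the second", which matches the effect of the endpoint adjustment in pycut.py. An odd number of switchpoints is taken to mean that the last chunk runs to the end of the file.

```python
import os


def copy_chunks(src_path, dst_path, switchpoints, buflen=1024 * 1024):
    """Copy bytes [start, end) for each (start, end) pair of switchpoints."""
    points = list(switchpoints)
    if len(points) % 2:
        # Unpaired final start point: copy through to end of file.
        points.append(os.path.getsize(src_path))
    if any(a > b for a, b in zip(points, points[1:])):
        raise ValueError("switchpoints out of order")
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for start, end in zip(points[0::2], points[1::2]):
            src.seek(start)
            remaining = end - start
            while remaining > 0:
                data = src.read(min(buflen, remaining))
                if not data:
                    break  # input ended before the chunk did
                dst.write(data)
                remaining -= len(data)
```

The test case from the comments at the top of pycut.py can be tried with `copy_chunks('./test.txt', './temp.txt', [3, 12, 35, 47, 53, 68], buflen=5)`.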
                  
