Difference between revisions of "Commercial detection with silences"
Stevegoodey (talk | contribs) (→Cluster Detecting Version: Changed from "loud silences") |
Dizygotheca (talk | contribs) m (→Algorithm) |
Revision as of 23:48, 26 February 2013
Author | Hippo |
Description | A python program based on Mythcommflag-wrapper (thank you Cowbut) that can be used on UK FreeviewHD channels and probably others. |
Original version
I tried out the scripts in Mythcommflag-wrapper and they worked well on the Freeview channels I receive but not on the FreeviewHD channels. The reason is that the audio on FreeviewHD is an AAC stream and not an MP3 stream. Fixing that would require decoding from AAC and encoding back to MP3 before letting the script analyse the MP3 stream. So I wrote a little C program to analyse an uncompressed audio stream and a Python program to wrap it up and turn the output into a commercial skip list.
To use this
- Compile the two C programs and put them somewhere the Python program can find them (e.g. /usr/local/bin).
- Copy the Python program to somewhere the backend can find it.
- Follow the instructions on Mythcommflag-wrapper except the job setting should be 'mausc-wrapper.py %JOBID%'
The Python program uses avconv to decode the programme file to an AU stream. If you don't have avconv, replace it with ffmpeg or mythffmpeg (avconv is the Libav fork's counterpart to ffmpeg). It up-converts the audio to 6 channels so that it works even when the audio layout switches around. If you know you only ever get stereo you can replace the 6 with 2 to save a bit of CPU power. The channel count might have to go up in future. Up-converting is better because it costs little CPU and always works, whereas down-converting may fail depending on your version of avconv/ffmpeg.
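The decoder swap and the channel count described above can be sketched as a small helper. This is only an illustration: build_decode_command is a hypothetical name, and the flags are simply the ones the wrapper script already passes to avconv.

```python
def build_decode_command(decoder="avconv", channels=6):
    """Return the argument list that decodes piped input to an AU stream.

    decoder  -- "avconv", "ffmpeg" or "mythffmpeg" (whichever is installed)
    channels -- number of channels to convert to (6 by default, 2 if
                you only ever receive stereo)
    """
    return [
        decoder,
        "-v", "8",              # same log level as the wrapper script
        "-i", "pipe:0",         # read the recording from stdin
        "-f", "au",             # write an AU stream
        "-ac", str(channels),   # channel count to convert to
        "-",                    # write to stdout
    ]


if __name__ == "__main__":
    print(build_decode_command())
    print(build_decode_command("mythffmpeg", 2))
```

The returned list can be passed straight to subprocess.Popen in place of the hard-coded list in the wrapper.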
This can do near-realtime commflagging by enabling the backend setting to start commflagging when the recording starts. (mythtv-setup/General/Page9-JobQueueGlobal). The programs mark entries in the cutlist <max-break-setting> after the start of a break is detected so this will be after the commercial break has ended. If you are displaying the programme and get too close to the end you will be in the commercials before they are flagged. C'est la vie.
It's low CPU in that it only decodes the audio stream and since it follows the end of the recording it shouldn't thrash the memory or disk. avconv takes about 2% to decode ITV1-HD on a 1.6GHz Atom Asus motherboard. catagrower takes about 1% and could be a lot better if made less portable.
/* Copyright 2012 Crackers Phipps. */
/* Public domain. */
/* Compile with gcc -std=c99 -O catagrower.c -o catagrower */

/* This program will stop when the file has not grown for this many seconds. */
#define TIMEOUT 60

/* MythTV files are often large. */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

static void usage(const char *name) {
  fprintf(stderr, "Usage: %s <file>\n", name);
  fprintf(stderr, "<file>: file to be monitored.\n");
  fprintf(stderr, "The contents of the file will be copied to stdout.\n");
  fprintf(stderr, "Copying will stop when the file has stopped growing.\n");
}

int main(int argc, char **argv) {

  /* Check usage. */
  if (2 != argc) {
    usage(argv[0]);
    exit(1);
  }

  /* Load options. */
  int fd;
  if (-1 == (fd = open(argv[1], O_RDONLY))) {
    fprintf(stderr, "Could not open %s for reading.\n", argv[1]);
    usage(argv[0]);
    exit(2);
  }

#define BUFFSIZE 4096
  int timer = TIMEOUT;
  char buffer[BUFFSIZE];
  int bytes;
  while (timer > 0) {
    /* Copy until end of file (0) or a read error (-1). */
    while (0 < (bytes = read(fd, buffer, BUFFSIZE))) {
      write (STDOUT_FILENO, buffer, bytes);
      timer = TIMEOUT;
    }
    sleep(1);
    timer--;
  }
  return 0;
}
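For anyone who wants to experiment without compiling, the follow-until-idle behaviour of catagrower can be approximated in Python. This is a rough sketch and not part of the original scripts; follow_file and its poll parameter are hypothetical names.

```python
import time


def follow_file(path, timeout=60, poll=1.0, bufsize=4096):
    """Yield chunks of a file until it has not grown for `timeout` seconds."""
    idle = 0.0
    with open(path, "rb") as f:
        while idle < timeout:
            chunk = f.read(bufsize)
            if chunk:
                idle = 0.0          # file is still growing: reset the timer
                yield chunk
            else:
                time.sleep(poll)    # nothing new yet: wait and try again
                idle += poll
```

Each chunk could be written to the stdin of the decoder process instead of piping from the C program, at the cost of a little more CPU.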
/* Copyright 2013 Tinsel Phipps. */
/* Public domain. Links with libsndfile which is GPL. */
/* Compile with
   gcc -std=c99 -O mausc.c -o mausc -lsndfile -lm
   You may need the libsndfile-dev package installed.
*/

#include <stdio.h>   /* fprintf, sscanf, printf, perror */
#include <stdlib.h>
#include <math.h>
#include <sndfile.h>
#include <errno.h>
#include <unistd.h>
#include <limits.h>

static void usage(const char *name) {
  fprintf(stderr, "Usage: %s <threshold> <min> <max> <rate>\n", name);
  fprintf(stderr, "<threshold>: silence threshold in dB.\n");
  fprintf(stderr, "<min>: minimum time for silence detection in seconds.\n");
  fprintf(stderr, "<max>: maximum length of breaks in seconds.\n");
  fprintf(stderr, "<rate>: frame rate of video.\n");
  fprintf(stderr, "An AU format file should be fed into this program.\n");
  fprintf(stderr, "Example: %s -70 0.15 400 25 < audio.au\n", name);
}

int main(int argc, char **argv) {

  /* Check usage. */
  if (5 != argc) {
    usage(argv[0]);
    exit(1);
  }

  /* Load options. */
  float threshold, min, max, rate;
  if (1 != sscanf(argv[1], "%f", &threshold)) {
    fprintf(stderr, "Could not parse threshold option into a number.\n");
    usage(argv[0]);
    exit(2);
  }
  if (1 != sscanf(argv[2], "%f", &min)) {
    fprintf(stderr, "Could not parse min option into a number.\n");
    usage(argv[0]);
    exit(2);
  }
  if (1 != sscanf(argv[3], "%f", &max)) {
    fprintf(stderr, "Could not parse max option into a number.\n");
    usage(argv[0]);
    exit(2);
  }
  if (1 != sscanf(argv[4], "%f", &rate)) {
    fprintf(stderr, "Could not parse rate option into a number.\n");
    usage(argv[0]);
    exit(2);
  }

  /* Scale threshold to integer range that libsndfile will use. */
  threshold = INT_MAX * pow(10, threshold / 20);

  /* Scale times to frames. */
  min = min * rate;
  max = max * rate;

  /* Check the input is an audiofile. */
  SNDFILE *input;
  SF_INFO metadata;
  input = sf_open_fd(STDIN_FILENO, SFM_READ, &metadata, SF_FALSE);
  if (NULL == input) {
    sf_perror(NULL);
    return sf_error(NULL);
  }

  /* Allocate data buffer to contain audio data from one video frame. */
  size_t frameSamples = metadata.channels * metadata.samplerate / rate;
  int *samples;
  samples = malloc(frameSamples * sizeof(int));
  if (NULL == samples) {
    perror(NULL);
    return errno;
  }

  /* Process the file one frame at a time and process cuts along the way. */
  int frames = 0;
  int silent = 0;
  int last_silent = 0;
  int gapend = 0;
  int gapstart = 0;
  int first_gapstart = 0;
  while (frameSamples == sf_read_int(input, samples, frameSamples)) {
    frames++;
    int maxabs = 0;
    for (unsigned i = 0; i < frameSamples; i++) {
      samples[i] = abs(samples[i]);
      maxabs = (maxabs > samples[i]) ? maxabs : samples[i];
    }
    last_silent = silent;
    silent = (maxabs < threshold);

    /* Remember first transition to silence. */
    if (silent && !gapstart) {
      gapstart = frames;
    }

    /* Store last transition out of silence. */
    if (!silent && last_silent) {
      /* Make sure it is long enough. */
      if (frames > gapstart + min) {
        gapend = frames;
        if (!first_gapstart) {
          first_gapstart = gapstart;
        }
      }
      gapstart = 0;
    }

    /* Create a skip when max frames have passed. */
    if (first_gapstart && gapend && frames > first_gapstart + max) {
      printf("%d %d\n", first_gapstart, gapend);
      fflush(stdout);
      gapstart = 0;
      gapend = 0;
      first_gapstart = 0;
    }
  }

  /* At end of file can have an unprocessed gap. */
  if (first_gapstart) {
    if (first_gapstart == gapstart) {
      gapend = frames;
    }
    printf("%d %d\n", first_gapstart, gapend);
  }

  return sf_close(input);
}
#!/usr/bin/env python
# Build a skiplist from silence in the audio track.
# Based on http://www.mythtv.org/wiki/Transcode_wrapper_stub

from MythTV import MythDB, Job, Recorded, findfile, MythLog
from os import path
from subprocess import Popen, PIPE
from optparse import OptionParser

def runjob(jobid=None, chanid=None, starttime=None):
    # Tunable settings (would like to retrieve per channel from the database)
    thresh = -70     # Silence threshold in dB.
    minquiet = 0.15  # Minimum time for silence detection in seconds.
    maxbreak = 400   # Maximum length of adverts breaks.
    rate = 25        # Frame rate of video. (should be automatic)

    db = MythDB()
    if jobid:
        job = Job(jobid, db=db)
        chanid = job.chanid
        starttime = job.starttime

    try:
        rec = Recorded((chanid, starttime), db=db)
    except:
        if jobid:
            job.update({'status':job.ERRORED,
                        'comment':'ERROR: Could not find recording.'})
        else:
            print 'Could not find recording.'
        exit(1)

    # Get program handle in standard format.
    starttime = rec.starttime
    chanid = rec.chanid

    sg = findfile(rec.basename, rec.storagegroup, db=db)
    if sg is None:
        if jobid:
            job.update({'status':job.ERRORED,
                        'comment':'ERROR: Local access to recording not found.'})
        else:
            print 'Local access to recording not found.'
        exit(1)

    infile = path.join(sg.dirname, rec.basename)

    # Purge any existing skip list.
    rec.markup.clean()
    rec.commflagged = 0
    rec.update()

    # Write out the file contents and keep going till recording is finished.
    p1 = Popen(["catagrower", infile], stdout = PIPE)
    # Pipe through avconv to extract uncompressed audio stream.
    p2 = Popen(["avconv", "-v", "8", "-i", "pipe:0", "-f", "au", "-ac", "6", "-"],
               stdin = p1.stdout, stdout = PIPE)
    # Pipe to mausc which will spit out a list of breaks.
    p3 = Popen(["mausc", str(thresh), str(minquiet), str(maxbreak), str(rate)],
               stdin = p2.stdout, stdout = PIPE)

    # Store breaks in the database.
    breaks = 0
    while 1:
        line = p3.stdout.readline()
        if not line:
            break
        start, end = line.split()
        rec.markup.append(start, rec.markup.MARK_COMM_START, None)
        rec.markup.append(end, rec.markup.MARK_COMM_END, None)
        rec.commflagged = 1
        rec.update()
        breaks = breaks + 1
        if jobid is None:
            print 'Got a break at frame %s' % start

    if jobid:
        job.update({'status':272,
                    'comment':'Audio commflag detected %s breaks.' % breaks })
    else:
        print 'Audio commflag detected %s breaks.' % breaks

def main():
    parser = OptionParser(usage="usage: %prog [options] [jobid]")

    parser.add_option('--chanid', action='store', type='int', dest='chanid',
                      help='Use chanid for manual operation')
    parser.add_option('--starttime', action='store', type='string', dest='stime',
                      help='Use starttime for manual operation')
    MythLog.loadOptParse(parser)

    opts, args = parser.parse_args()

    if len(args) == 1:
        runjob(jobid=args[0])
    elif opts.chanid and opts.stime:
        runjob(chanid=opts.chanid, starttime=opts.stime)
    else:
        print 'Script must be provided either jobid, or chanid and starttime.'
        parser.print_help()
        exit(1)

if __name__ == '__main__':
    main()
Cluster Detecting Version
The basic silence detection algorithm is easily thrown by odd silences that occur within 6 mins of an advert and performs poorly on animations/kids programmes. I was keen to cut adverts out of my kids' shows so I developed an algorithm that detects clusters of silences: adverts are characterised by many silences close together whilst isolated silences within programmes are ignored.
Hippo has provided a good platform for a commflagging script. New features of my version are:
- Determine ad breaks from clusters of silences. Solves those occasional glitches caused by silences within programmes and does a pretty good job on animations/kids programmes. Also allows the silence detection to be more sensitive (to pick up short and/or long silences) as rogue ones will be ignored.
- Integrates the script with Myth logging. Works well with rsyslog (Mythbuntu). Should also work with file logging but I haven't tested it.
- Allows parameters to be varied per-channel and per-programme. Useful for channels with 'noisier' ad breaks, e.g. Dave, and regular programmes where the defaults don't suit you.
- Sends ad breaks to mythplayer as they are found. If you start watching a prog before it has finished recording then the comm-skipping will still work (assuming you're not too close to real-time).
Algorithm
An advert is defined as a cluster of silences, at least <minbreak> long, that is composed of at least <mindetect> silences all within <maxsep> of each other.
In practice, silences are detected as a consecutive series of frames having an average audio power below <threshold> for at least <minquiet>. If the interval between a silence and the previous one is less than <maxsep> then they belong to the same cluster; otherwise they lie in different clusters. Clusters that are shorter than <minbreak> or composed of fewer than <mindetect> silences are ignored. Adverts are shortened by <padding> on both sides.
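The clustering rule can be sketched in a few lines of Python. This is a simplified illustration rather than the code in silence.cpp: it works in seconds instead of frames, ignores the pre-roll/post-roll special cases, and find_adverts is a hypothetical name. The default values are the ones listed in the preset file further down.

```python
def find_adverts(silences, maxsep=120.0, minbreak=120.0, mindetect=6, padding=0.48):
    """Group (start, end) silences, in seconds, into padded advert breaks."""
    clusters = []
    for start, end in sorted(silences):
        # Interval is measured from the end of the previous silence.
        if clusters and start - clusters[-1][-1][1] < maxsep:
            clusters[-1].append((start, end))   # close enough: same cluster
        else:
            clusters.append([(start, end)])     # too far apart: new cluster

    adverts = []
    for cluster in clusters:
        first, last = cluster[0][0], cluster[-1][1]
        if last - first < minbreak or len(cluster) < mindetect:
            continue                            # too short or too few silences
        adverts.append((first + padding, last - padding))
    return adverts


if __name__ == "__main__":
    # One isolated silence inside the programme, then seven silences 30 s apart.
    found = [(100.0, 100.5)] + [(600.0 + 30 * i, 600.4 + 30 * i) for i in range(7)]
    print(find_adverts(found))
```

In this example the isolated silence at 100 s forms a cluster of one and is ignored, while the seven closely spaced silences form a single advert break running from roughly 600.5 s to 779.9 s.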
Although adverts are reported in real-time, all silences and clusters are stored - I originally envisaged using post-scan analysis to amend the detected adverts. However, so far, this hasn't proved necessary or viable.
Building/Change Summary
- catagrower.cpp has minor mods to take its timeout from an arg. This is to allow the script to be run manually from the command line.
- silence.cpp replaces mausc.c. New algorithm. This now depends on Qt/Myth libs in order to send messages to mythplayer.
- silence.py replaces mausc-wrapper.py. I've updated the deprecated arg parsing, integrated Myth logging and added channel/prog preset handling. It can reside anywhere but I keep mine in /usr/local/bin. It expects the C++ executables to reside in /usr/local/bin/ though.
- Build the C++ files using the Makefile (ie. "sudo make"). This will put the executables in /usr/local/bin/. The Makefile works for me using gcc 4.6.3 (Ubuntu 12.04) & Myth 0.26. I'm no expert on C++ standards so earlier versions may need some tinkering.
Notes
- I only use Freeview SD, so I downmix my stereo reception to 1 channel to improve performance. Refer to Hippo's comments above regarding the number of channels and update silence.py (kUpmix_Channels) accordingly.
- I also reduce the audio sample rate (silence.py line 177, "-ar 8000") to reduce the data throughput. Ultimately all channels/samples are reduced to a single audio power per frame and I haven't noticed any qualitative difference from this optimisation. However it could affect the mythffmpeg load; if loading/performance is important to you, you may wish to experiment with this.
- silence.py utilises mythffmpeg but, as Hippo states, you can simply replace with avconv/ffmpeg. I notice no difference.
- <minbreak> and <mindetect> do not apply to pre-roll/post-roll (starting/ending) 'adverts'.
- All programmes will be processed. However, if a programme originates from a channel marked as comm-free, only pre-roll and post-roll breaks will be detected. This is useful for finding the start of BBC programmes.
- Mythplayer will not auto-skip pre-roll/post-roll breaks. When starting playback you need to manually comm-skip to the programme start.
- The log information can be initially confusing - bear in mind the algorithmic process when interpreting it. The interval of a silence always relates to the previous silence; the interval reported by a cluster always relates to the previous cluster. Silences report their audio power whereas clusters report the number of silences they contain.
- If processing manually, note that the Python script will clear any existing comm-skip list on startup. Be aware that this also appears to erase the bookmark (and maybe other mark-ups).
- UK commercials are usually 10-60 secs long. However I have found occasional film trailers that are 2 min long (of constant noise). Thus <maxsep> defaults to 120. If you don't mind the odd trailer then reducing <maxsep> to 60 or 90 would probably reduce erroneous cuts.
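As mentioned in the notes above, all channels and samples are ultimately reduced to a single audio power per frame, which is then compared with <threshold>. The arithmetic can be sketched in Python as follows. It mirrors what silence.cpp does (mean absolute sample value, threshold scaled from dB to the 32-bit integer range that libsndfile uses), but the function names are illustrative only.

```python
INT_MAX = 2 ** 31 - 1   # samples are read as 32-bit signed integers


def samples_per_frame(channels, samplerate, fps=25.0):
    """Number of audio samples that accompany one video frame."""
    return int(channels * samplerate / fps)


def db_to_level(threshold_db):
    """Scale a threshold in dB (0 dB = full scale) to the integer sample range."""
    return INT_MAX * 10 ** (threshold_db / 20.0)


def frame_power(samples):
    """Average absolute sample value for one frame's worth of audio."""
    return sum(abs(s) for s in samples) / len(samples)


def is_silent(samples, threshold_db=-75.0):
    """True if the frame's average power is below the threshold."""
    return frame_power(samples) < db_to_level(threshold_db)


if __name__ == "__main__":
    print(samples_per_frame(1, 8000))     # mono at 8 kHz
    print(samples_per_frame(6, 48000))    # 6 channels at 48 kHz
    print(is_silent([0, 100, -100]))
```

With mono audio at 8000 Hz and 25 fps, each frame is averaged over 320 samples rather than the 11520 needed for 6 channels at 48 kHz, which is why reducing the channel count and sample rate lowers the data throughput.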
Running
Assuming you use the same locations, your Advert-detection command should be:
/usr/local/bin/silence.py %JOBID% %VERBOSELEVEL% --loglevel debug
You can also run it manually from the command line like this:
silence.py --chanid 1004 --starttime 20130117220000 --loglevel debug
INFO logging reports details of the clusters/cuts; DEBUG logging also reports details of the detected silences.
Channel Presets
When run on its own the Python program uses decent defaults that work pretty well.
However it's also possible to specify parameters to use for specific channels or programmes. A preset file defines values that override the defaults according to programme title or channel callsign. Only one preset can apply - the first applicable - so care is needed when deciding the order. The title/callsigns are considered to be Python regular expressions so beware of the meta-characters. The 8th field is ignored and so can be used for comments/notes. Specify a preset file using the --presetfile option, like this:
/usr/local/bin/silence.py %JOBID% %VERBOSELEVEL% --loglevel debug --presetfile /home/eric/.mythtv/silence.preset
This is my preset file which customises the processing of 4 regular programmes and 'tunes' some channels.
# presets for silence.py
# use comma separated values: defaults are used for absent values
# For titles/callsign the name is a python regular expressions, case is ignored.
# Re Metachars are
# . ^ $ * + ? { } [ ] \ | ( )
# If a title contains one of these, then escape it (using \) or replace it with full stop
# Names are matched to the START of a title/callsign so "e4" also matches "e4+1"
# First name match is used so put specific presets (ie. programmes) before general ones (channels)
#
# title/callsign, threshold, minquiet, mindetect, minbreak, maxsep, padding
# defaults -75, 0.16, 6, 120, 120, 0.48,
#
frasier, , 0.28, , , 91, , long pauses in prog
channel 4 news, , 1.00, 1, 55, , 0, short advert, many silences
milkshake, , 0.48, 8, 60, 61, , ignore short silences in animation/links
rude tube, , 0.32, , 180, 61, , ignore short silences in links
channel 4, , 0.24,
more 4, , 0.24,
dave, -71, , , , , , loud silences
quest, , , , , 55, , short silences, long breaks, short ads
channel 5, , 0.24, 2, , 300, , cut news out of films
itv, , , , , , 1.0, long pad for films
film 4, , , , , , 1.0, long pad for films
bbc, , 0.48, 1, 20, 360, 0, pre/post-roll
cbeebies, , 0.48, 1, 20, 360, 0, pre/post-roll
cbbc, , 0.48, 1, 20, 360, 0, pre/post-roll
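The first-match rule is easier to see in code. The sketch below is my own illustration of how such a file could be interpreted, not an extract from silence.py: parse_presets and settings_for are hypothetical names, and checking the title before the callsign is an assumption.

```python
import re

FIELDS = ("threshold", "minquiet", "mindetect", "minbreak", "maxsep", "padding")
DEFAULTS = dict(zip(FIELDS, (-75.0, 0.16, 6.0, 120.0, 120.0, 0.48)))


def parse_presets(text):
    """Return a list of (compiled name pattern, overrides) in file order."""
    presets = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                               # skip blanks and comments
        cells = [c.strip() for c in line.split(",")]
        name, values = cells[0], cells[1:7]        # 8th field onwards is a note
        overrides = {f: float(v) for f, v in zip(FIELDS, values) if v}
        presets.append((re.compile(name, re.IGNORECASE), overrides))
    return presets


def settings_for(presets, title, callsign):
    """Apply the first preset whose name matches the start of title or callsign."""
    for pattern, overrides in presets:
        if pattern.match(title) or pattern.match(callsign):
            merged = dict(DEFAULTS)
            merged.update(overrides)
            return merged
    return dict(DEFAULTS)


if __name__ == "__main__":
    sample = "frasier, , 0.28, , , 91, , long pauses in prog\nchannel 4, , 0.24,\n"
    print(settings_for(parse_presets(sample), "Frasier", "Channel 4"))
```

Because matching stops at the first hit, "Frasier" shown on Channel 4 picks up the frasier preset and never reaches the channel 4 line, which is why programme presets belong above channel presets.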
Once you understand the logging information you can easily tune your own channels/programmes by experimenting with the --preset option directly from a command line until you get decent results. For example:
silence.py --chanid 1004 --starttime 20130117220000 --loglevel debug --preset "-80,,3,,180"
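To make the effect of that string concrete, here is a small sketch of how blank fields fall back to the defaults. It assumes the --preset value uses the same field order as the preset file minus the leading name; apply_preset is a hypothetical helper, not a function from silence.py.

```python
FIELDS = ("threshold", "minquiet", "mindetect", "minbreak", "maxsep", "padding")
DEFAULTS = dict(zip(FIELDS, (-75.0, 0.16, 6.0, 120.0, 120.0, 0.48)))


def apply_preset(preset, defaults=DEFAULTS):
    """Overlay a comma-separated preset string on the default settings."""
    settings = dict(defaults)
    for field, value in zip(FIELDS, preset.split(",")):
        if value.strip():               # blank field: keep the default
            settings[field] = float(value)
    return settings


if __name__ == "__main__":
    print(apply_preset("-80,,3,,180"))
```

Under that assumption, "-80,,3,,180" lowers the threshold to -80 dB, requires only 3 silences per cluster and widens <maxsep> to 180 seconds, leaving <minquiet>, <minbreak> and <padding> at their defaults.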
/* Copyright 2012 Crackers Phipps. */
/* Public domain. */
/* Compile with gcc -std=c99 -O catagrower.c -o catagrower */

/* MythTV files are often large. */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

static void usage(const char *name) {
  fprintf(stderr, "Usage: %s <file> <timeout>\n", name);
  fprintf(stderr, "<file>   : file to be monitored.\n");
  fprintf(stderr, "<timeout>: secs to wait for input.\n");
  fprintf(stderr, "The contents of the file will be copied to stdout.\n");
  fprintf(stderr, "Copying will stop when the file has stopped growing.\n");
}

int main(int argc, char **argv) {

  /* Check usage. */
  if (3 != argc) {
    usage(argv[0]);
    exit(1);
  }

  /* Load options. */
  int fd;
  if (-1 == (fd = open(argv[1], O_RDONLY))) {
    fprintf(stderr, "Could not open %s for reading.\n", argv[1]);
    usage(argv[0]);
    exit(2);
  }
  const int timeout = atoi(argv[2]);

#define BUFFSIZE 4096
  int timer = timeout;
  char buffer[BUFFSIZE];
  int bytes;
  while (timer > 0) {
    /* Copy until end of file (0) or a read error (-1). */
    while (0 < (bytes = read(fd, buffer, BUFFSIZE))) {
      if (-1 == write (STDOUT_FILENO, buffer, bytes)){
        fprintf(stderr, "Write failed.\n");
        exit(3);
      }
      timer = timeout;
    }
    sleep(1);
    timer--;
  }
  return 0;
}
// Based on mausc.c by Tinsel Phipps. // v1.0 Roger Siddons // Public domain. Requires libsndfile, libQtCore, libmythbase-0.x, libmyth-0.x // Detects commercial breaks using clusters of audio silences #include <sndfile.h> #include <unistd.h> #include <mythcorecontext.h> #include <mythcontext.h> #include <mythversion.h> #include <programtypes.h> #include <QCoreApplication> #define DELIMITER '@' // must correlate with python wrapper typedef unsigned frameNumber_t; typedef unsigned frameCount_t; class Silence // A detected silence { public: enum state_t {detected, progStart, progEnd, beyondEnd}; static const char state_log[5]; frameNumber_t start; // frame of start frameNumber_t end; // frame of end frameCount_t length; // number of frames frameCount_t interval; // frames between end of last silence & start of this one double power; // average power level state_t state; // type of silence Silence() : start(0), end(0), length(0), interval(0), power(0), state(detected) {} Silence(frameNumber_t _start, frameNumber_t _end, state_t _state) : start(_start), end(_end), length(_end - _start + 1), interval(0), power(0), state(_state) {} void restart(frameNumber_t frame, double _power, state_t _state = detected) // Define start of a silence { start = end = frame; length = 1; interval = 0; power = _power; state = _state; } void extend(frameNumber_t frame, double _power) // Define end of a silence { end = frame; length = frame - start + 1; // maintain running average power, allowing for missing frames // = (oldpower * (newlength - 1) + newpower)/ newlength // a missing frame is assigned a power of the current average power += (_power - power)/length; } }; // c++0x doesn't allow initialisation within class const char Silence::state_log[5] = {' ', '<', '>', 'v'}; class Cluster // A cluster of silences { public: enum state_t {unset, preroll, advert, tooshort, toofew, postroll}; static const char state_log[6]; Silence* start; // first silence Silence* end; // last silence unsigned 
silenceCount; // number of silences frameCount_t length; // number of frames frameCount_t interval; // frames between end of last cluster and start of this one state_t state; Cluster() : start(0), end(0), silenceCount(0), length(0), interval(0), state(unset) {} void restart(Silence* _start) // Define start of a cluster { start = end =_start; silenceCount = 1; length = _start->length; state = unset; } void extend(Silence* _end) // Define end of a cluster { end = _end; silenceCount++; length = _end->end - start->start + 1; } }; // c++0x doesn't allow initialisation within class const char Cluster::state_log[6] = {'.', '<', '-', '#', '?', '>'}; class ClusterList // Stores a list of detected silences and a list of assigned clusters { protected: const unsigned useMinDetect; const frameCount_t useMinLength; // list of detected silences QList<Silence*> silence; // list of deduced clusters of silences QList<Cluster*> cluster; public: ClusterList(unsigned minDetect, frameCount_t minLength) : useMinDetect(minDetect), useMinLength(minLength) {} ~ClusterList() { // release contents of lists qDeleteAll(silence); qDeleteAll(cluster); } Silence* addSilence(Silence& newSilence) // Adds a silence detection to the end of the silence list { Silence* ref = NULL; // set interval between this & previous good silence if (Silence* prev = getLastSilence()) newSilence.interval = newSilence.start - prev->end - 1; else // no previous newSilence.interval = newSilence.start - 1; // do not store fake silence if (newSilence.state != Silence::beyondEnd) { // store silence in ClusterList ref = new Silence(newSilence); silence.push_back(ref); } return ref; } Silence* insertStartSilence() // Inserts a fake silence at the front of the silence list { // create a single frame silence at frame 1 and insert it at front Silence* ref = new Silence(1, 1, Silence::progStart); silence.push_front(ref); return ref; } Silence* getLastSilence() // Returns previous silence in list { if (silence.isEmpty()) return 
NULL; else return silence.last(); } Cluster* addCluster(Cluster& newCluster) // Adds a cluster to end of the cluster list { // ignore empty cluster at prog end if (newCluster.end->state == Silence::progEnd && newCluster.length == 1) { // delete prog end silence as it serves no purpose silence.removeLast(); return NULL; } else { // link to previous cluster // set interval between new cluster & previous one (or prog start) if (cluster.isEmpty()) newCluster.interval = newCluster.start->start - 1; else newCluster.interval = newCluster.start->start - cluster.last()->end->end - 1; // set state if (newCluster.start->start == 1) newCluster.state = Cluster::preroll; else if (newCluster.end->state == Silence::progEnd) newCluster.state = Cluster::postroll; else if (newCluster.length < useMinLength) newCluster.state = Cluster::tooshort; else if (newCluster.silenceCount < useMinDetect) newCluster.state = Cluster::toofew; else newCluster.state = Cluster::advert; // store cluster in ClusterList Cluster* ref = new Cluster(newCluster); cluster.push_back(ref); return ref; } } }; // List of completed silences & clusters ClusterList* clist; // the items currently being detected/built Silence currentSilence; Cluster currentCluster; // Player update message QString updateMessage; // Audio detection settings from args // Audio level of silence int useThreshold; // Minimum length of a silence to register frameCount_t useMinQuiet; // Minimum number of silences that constitute an advert unsigned useMinDetect; // adverts must be at least this long frameCount_t useMinLength; // silence detections must be closer together than this to be in the same cluster frameCount_t useMaxSep; // padding for each cut frameCount_t usePad; // sample rate (maps time to frame count) const float kvideoRate = 25.0; // fps const frameCount_t krateInMins = kvideoRate * 60; // frames per min // true if prog originates from a comm-free channel int commfree; // bool static void usage(const char *name) { 
printf("err%cUsage: %s <threshold> <minquiet> <mindetect> <minlength> <maxsep> <pad> <commfree> <progid>\n", DELIMITER, name); printf("err%c<threshold>: (float) silence threshold in dB.\n", DELIMITER); printf("err%c<minquiet> : (float) minimum time for silence detection in seconds.\n", DELIMITER); printf("err%c<mindetect>: (float) minimum number of silences to constitute an advert.\n", DELIMITER); printf("err%c<minlength>: (float) minimum length of advert break in seconds.\n", DELIMITER); printf("err%c<maxsep> : (float) maximum time between silences in an advert break in seconds.\n", DELIMITER); printf("err%c<pad> : (float) padding for each cut point in seconds.\n", DELIMITER); printf("err%c<commfree> : (int) 1 if prog is comm-free, 0 otherwise.\n", DELIMITER); printf("err%c<progid> : (string) chan_starttime of program (for player updates).\n", DELIMITER); printf("err%cAn AU format file should be fed into this program.\n", DELIMITER); printf("err%cExample: %s -75 0.1 5 60 90 0 1004_20121003090000 < audio.au\n", DELIMITER, name); } void parseArgs(int argc, char **argv) // Parse args and convert to useable values (frames) { /* Check usage. */ if (9 != argc) { usage(argv[0]); exit(1); } float argThreshold; // db float argMinQuiet; // secs float argMinDetect; float argMinLength; // secs float argMaxSep; // secs float argPad; // secs char progid[100]; /* Load options. 
*/ if (1 != sscanf(argv[1], "%f", &argThreshold)) { printf("err%cCould not parse threshold option into a number", DELIMITER); exit(2); } if (1 != sscanf(argv[2], "%f", &argMinQuiet)) { printf("err%cCould not parse minquiet option into a number", DELIMITER); exit(2); } if (1 != sscanf(argv[3], "%f", &argMinDetect)) { printf("err%cCould not parse mindetect option into a number", DELIMITER); exit(2); } if (1 != sscanf(argv[4], "%f", &argMinLength)) { printf("err%cCould not parse minlength option into a number", DELIMITER); exit(2); } if (1 != sscanf(argv[5], "%f", &argMaxSep)) { printf("err%cCould not parse maxsep option into a number", DELIMITER); exit(2); } if (1 != sscanf(argv[6], "%f", &argPad)) { printf("err%cCould not parse pad option into a number", DELIMITER); exit(2); } if (1 != sscanf(argv[7], "%d", &commfree)) { printf("err%cCould not parse commfree option into a number", DELIMITER); exit(2); } if (1 != sscanf(argv[8], "%s", progid)) { printf("err%cCould not parse progid option into a string", DELIMITER); exit(2); } /* Scale threshold to integer range that libsndfile will use. */ useThreshold = rint(INT_MAX * pow(10, argThreshold / 20)); /* Scale times to frames. 
*/
    useMinQuiet = ceil(argMinQuiet * kvideoRate);
    useMinDetect = (int)argMinDetect;
    useMinLength = ceil(argMinLength * kvideoRate);
    useMaxSep = rint(argMaxSep * kvideoRate + 0.5);
    usePad = rint(argPad * kvideoRate + 0.5);

    updateMessage = "COMMFLAG_UPDATE " + QString(progid) + ' ';

    printf("debug%cThreshold=%.1f, MinQuiet=%.2f, MinDetect=%.1f, MinLength=%.1f, MaxSep=%.1f,"
           " Pad=%.2f\n", DELIMITER, argThreshold, argMinQuiet, argMinDetect, argMinLength, argMaxSep, argPad);
    printf("debug%cFrame rate is %.2f, Detecting silences below %d that last for at least %d frames\n",
           DELIMITER, kvideoRate, useThreshold, useMinQuiet);
    printf("debug%cClusters are composed of a minimum of %d silences closer than %d frames and must be\n",
           DELIMITER, useMinDetect, useMaxSep);
    printf("debug%clonger than %d frames in total. Cuts will be padded by %d frames\n",
           DELIMITER, useMinLength, usePad);
    printf("debug%c< preroll, > postroll, - advert, ? too few silences, # too short, = comm marked\n", DELIMITER);
    printf("info%c Start - End Start - End Duration Interval Level/Cnt\n", DELIMITER);
    printf("info%c frame - frame (mmm:ss-mmm:ss) frame (mm:ss.s) frame (mmm:ss)\n", DELIMITER);
}

void report(const char* err,
            const char type,
            const char* msg1,
            const frameNumber_t start,
            const frameNumber_t end,
            const frameNumber_t interval,
            const int power)
// Logs silences/clusters/cuts in standard format
{
    frameCount_t duration = end - start + 1;

    printf("%s%c%c %7s %6d-%6d (%3d:%02ld-%3d:%02ld), %4d (%2d:%04.1f), %5d (%3d:%02ld), [%7d]\n",
           err, DELIMITER, type, msg1, start, end,
           (start+13) / krateInMins, lrint(start / kvideoRate) % 60,
           (end+13) / krateInMins, lrint(end / kvideoRate) % 60,
           duration, (duration+1) / krateInMins, fmod(duration / kvideoRate, 60),
           interval, (interval+13) / krateInMins, lrint(interval / kvideoRate) % 60,
           power);
}

void makeCut(frameNumber_t start, frameNumber_t end)
// Logs cuts and sends player update message
{
    // log cut
    report("cut", '=', "Cut", start, end, 0, 0);

    // Update player
    // add comma unless it's first cut
    if (!updateMessage.endsWith(' '))
        updateMessage += ',';

    updateMessage += QString("%1:%2,%3:%4")
                     .arg(start).arg(MARK_COMM_START)
                     .arg(end).arg(MARK_COMM_END);

    gCoreContext->SendMessage(updateMessage);
    // printf("debug%c Sending %s\n", DELIMITER, updateMessage.toAscii().constData());
}

void analyseSilence()
// Determines if current silence belongs to existing cluster or starts a new one
{
    Silence* thisSilence;
    Silence* prevSilence = clist->getLastSilence();

    // record new detected silence
    thisSilence = clist->addSilence(currentSilence);

    // check for a cluster break
    if (currentSilence.interval <= useMaxSep)
    {
        if (!prevSilence)
        {
            // First silence is close to prog start so extend cluster to the start
            // by inserting a fake silence at prog start and starting the cluster there
            currentCluster.restart(clist->insertStartSilence());
        }
        // add this silence to current cluster
        currentCluster.extend(thisSilence);
    }
    else // start of new cluster
    {
        // for first silence there is no previous cluster
        if (prevSilence)
        {
            // complete previous cluster, which finished at previous silence
            currentCluster.end = prevSilence;
            Cluster* c = clist->addCluster(currentCluster);

            // log cluster
            report("info", c->state_log[c->state], "Cluster",
                   c->start->start, c->end->end, c->interval, c->silenceCount);

            // create cut
            switch (c->state)
            {
            case Cluster::preroll:
                makeCut(c->start->start, c->end->end - usePad);
                break;
            case Cluster::postroll:
                makeCut(c->start->start + usePad, c->end->end);
                break;
            case Cluster::advert:
                makeCut(c->start->start + usePad, c->end->end - usePad);
                break;
            default:;
            }
        }
        // this silence is the start of a new cluster
        currentCluster.restart(thisSilence);
    }

    // log silence
    report("debug", thisSilence->state_log[thisSilence->state], "Silence",
           thisSilence->start, thisSilence->end, thisSilence->interval, thisSilence->power);
}

int main(int argc, char **argv)
// Detect silences and allocate to clusters
{
    // Require Myth context for sending player update messages
    QCoreApplication a(argc, argv);
    QCoreApplication::setApplicationName("silence");

    MythContext* gContext = new MythContext(MYTH_BINARY_VERSION);
    if (!gContext->Init( false, /*use gui*/
                         false, /*prompt for backend*/
                         false, /*bypass auto discovery*/
                         false)) /*ignoreDB*/
    {
        printf("err%cContext initialisation failed\n", DELIMITER);
        exit(1);
    }
    gCoreContext->ConnectToMasterServer();

    /* Check the input is an audiofile. */
    SF_INFO metadata;
    SNDFILE* input = sf_open_fd(STDIN_FILENO, SFM_READ, &metadata, SF_FALSE);
    if (NULL == input) {
        sf_perror(NULL);
        return sf_error(NULL);
    }

    parseArgs(argc, argv);

    /* Allocate data buffer to contain audio data from one video frame. */
    const size_t frameSamples = metadata.channels * metadata.samplerate / kvideoRate;

    int* samples = (int*)malloc(frameSamples * sizeof(int));
    if (NULL == samples) {
        perror(NULL);
        return errno;
    }

    // initialise silence/cluster list with relevant limits
    clist = new ClusterList(useMinDetect, useMinLength);

    // flush output buffer after every line
    setvbuf(stdout, NULL, _IOLBF, 0);

    // start outside of a silence
    bool in_silence = false;

    // Process the input one frame at a time and process cuts along the way.
    frameNumber_t frames = 0;
    while (frameSamples == static_cast<size_t>(sf_read_int(input, samples, frameSamples)))
    {
        frames++;

        // determine average audio level in this frame
        long avgabs = 0;
        for (unsigned i = 0; i < frameSamples; i++)
            avgabs += abs(samples[i]);
        avgabs = avgabs / frameSamples;

        // check for a silence
        if (avgabs < useThreshold)
        {
            if (in_silence)
            {
                // extend current silence
                currentSilence.extend(frames, avgabs);
            }
            else // transition to silence
            {
                in_silence = true;
                // start of new silence
                currentSilence.restart(frames, avgabs);
            }
        }
        else if (in_silence) // transition out of silence
        {
            in_silence = false;
            // process completed silence if it's long enough
            if (currentSilence.length >= useMinQuiet)
                analyseSilence();
        }
    }

    // Ensure there's a silence at prog end so that any post-roll cluster extends right to the end.
    // If we're already in silence then use the existing start...
    if (!in_silence)
    {
        // ..otherwise generate a dummy silence at prog end
        currentSilence.restart(frames, 0, Silence::progEnd);
    }
    analyseSilence();

    // Generate a dummy silence a long time after prog end to complete any unfinished cluster
    currentSilence.restart(frames + useMaxSep + 2, 0, Silence::beyondEnd);
    analyseSilence();
}
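The detection and clustering steps in the C++ listing can be hard to follow in one pass, so here is a minimal Python 3 sketch of the same idea. It is an illustration under stated assumptions, not a port of the program: all limits are given in frames (the example assumes 25 frames per second), the interval between silences is assumed to run from the end of one silence to the start of the next, and the pre-roll/post-roll handling and the dummy silences at programme start and end are left out. The function names are hypothetical.

```python
from typing import List, Tuple

Silence = Tuple[int, int]  # (first_frame, last_frame), inclusive, 1-based


def detect_silences(levels: List[int], threshold: int, min_quiet: int) -> List[Silence]:
    """Find runs of frames quieter than threshold lasting at least min_quiet frames."""
    silences = []
    start = None  # first frame of the run in progress, if any
    for frame, level in enumerate(levels, start=1):
        if level < threshold:
            if start is None:
                start = frame  # transition into silence
        elif start is not None:
            # transition out of silence: keep the run if it is long enough
            if frame - start >= min_quiet:
                silences.append((start, frame - 1))
            start = None
    # Simplification: a run still open at the end is kept only if long enough
    if start is not None and len(levels) - start + 1 >= min_quiet:
        silences.append((start, len(levels)))
    return silences


def cluster_silences(silences: List[Silence], max_sep: int) -> List[List[Silence]]:
    """Group silences whose gap to the previous silence is at most max_sep frames."""
    clusters: List[List[Silence]] = []
    for silence in silences:
        # Assumption: the gap runs from the end of the previous silence
        if clusters and silence[0] - clusters[-1][-1][1] <= max_sep:
            clusters[-1].append(silence)
        else:
            clusters.append([silence])
    return clusters


def find_adverts(silences: List[Silence], max_sep: int, min_break: int,
                 min_detect: int, pad: int) -> List[Silence]:
    """Return padded (start, end) cuts for clusters that pass both limits."""
    cuts = []
    for cluster in cluster_silences(silences, max_sep):
        start, end = cluster[0][0], cluster[-1][1]
        if len(cluster) < min_detect:
            continue  # too few silences
        if end - start + 1 < min_break:
            continue  # too short
        cuts.append((start + pad, end - pad))  # shorten on both sides
    return cuts


if __name__ == "__main__":
    # Six silences about 30 s apart at an assumed 25 fps, then a lone one much later
    example = [(1000, 1005), (1750, 1756), (2500, 2504),
               (3250, 3255), (4000, 4006), (4750, 4755), (20000, 20004)]
    print(find_adverts(example, max_sep=3000, min_break=3000, min_detect=6, pad=12))
    # prints [(1012, 4743)]
```

The lone silence at frame 20000 forms its own cluster of one and is dropped by the min_detect limit, which is the behaviour that stops a quiet scene inside the programme from being flagged as an advert.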
#!/usr/bin/env python
# Build a skiplist from silence in the audio track.
# Roger Siddons v1.0

import MythTV
import os
import subprocess
import argparse
import collections
import re
import sys

kExe_Catagrower = '/usr/local/bin/catagrower'
kExe_Mausc = '/usr/local/bin/silence'
kUpmix_Channels = '1'
kInput_Timeout = '30'

class MYLOG( MythTV.MythLog ):
    "A specialised logger"

    def __init__(self, db):
        "Initialise logging"
        MythTV.MythLog.__init__(self, 'm', db)

    def log(self, msg, level=MythTV.MythLog.INFO):
        "Log message"
        # prepend string to msg so that rsyslog routes it to correct logfile
        MythTV.MythLog.log(self, MythTV.MythLog.COMMFLAG, level, 'mythcommflag: ' + msg.rstrip('\n'))

class PRESET:
    "Manages the presets (parameters passed to the detection algorithm)"

    # define arg ordering and default values
    argname = ['thresh', 'minquiet', 'mindetect', 'minbreak', 'maxsep', 'pad']
    argval = [ -75, 0.16, 6, 120, 120, 0.48]
    # dictionary holds value for each arg
    argdict = collections.OrderedDict(zip(argname, argval))

    def _validate(self, k, v):
        "Converts arg input from string to float or None if invalid/not supplied"
        if v is None or v == '':
            return k, None
        try:
            return k, float(v)
        except ValueError:
            self.logger.log('Preset ' + k + ' (' + str(v) + ') is invalid - will use default', MYLOG.ERR)
            return k, None

    def __init__(self, _logger):
        "Initialise preset manager"
        self.logger = _logger

    def getFromArg(self, line):
        "Parses preset values from command-line string"
        self.logger.log('Parsing presets from "' + line + '"', MYLOG.DEBUG)
        if line: # ignore empty string
            vals = [i.strip() for i in line.split(',')] # split individual params
            # convert supplied values to float & match to appropriate arg name
            validargs = map(self._validate, self.argname, vals)
            # remove missing/invalid values from list & replace default values with the rest
            self.argdict.update(dict(filter(lambda (k,v): False if v is None else (k,v), validargs)))

    def getFromFile(self, filename, title, callsign):
        "Gets preset values from a file"
        self.logger.log('Using preset file "' + filename + '"', MYLOG.DEBUG)
        try:
            with open(filename) as presets:
                for rawline in presets:
                    line = rawline.strip()
                    if line and (not line.startswith('#')): # ignore empty & comment lines
                        vals = [i.strip() for i in line.split(',')] # split individual params
                        # match preset name to recording title or channel
                        pattern = re.compile(vals[0], re.IGNORECASE)
                        if pattern.match(title) or pattern.match(callsign):
                            self.logger.log('Using preset "' + line.strip() + '"')
                            # convert supplied values to float & match to appropriate arg name
                            # (vals[0] is the preset name, so the slice ends at len(argname)+1
                            # to include the last parameter)
                            validargs = map(self._validate, self.argname,
                                            vals[1:min(len(vals),len(self.argname)+1)])
                            # remove missing/invalid values from list & replace default values with the rest
                            self.argdict.update(dict(filter(lambda (k,v): False if v is None else (k,v), validargs)))
                            break
                else:
                    self.logger.log('No preset found for "' + title + '" or "' + callsign + '"')
        except IOError:
            self.logger.log('Presets file "' + filename + '" not found', MYLOG.ERR)
        return self.argdict

    def getValues(self):
        "Returns params as a list of strings"
        return [str(i) for i in self.argdict.values()]

def main():
    "Commflag a recording"
    # define options
    parser = argparse.ArgumentParser(description='Commflagger')
    parser.add_argument('--preset', help='Specify values as "Threshold, MinQuiet, MinDetect, MinLength, MaxSep, Pad"')
    parser.add_argument('--presetfile', help='Specify file containing preset values')
    parser.add_argument('--chanid', help='Use chanid for manual operation')
    parser.add_argument('--starttime', help='Use starttime for manual operation')
    parser.add_argument('jobid', nargs='?', help='Myth job id')

    # must set up log attributes before Db locks them
    MYLOG.loadArgParse(parser)
    MYLOG._setmask(MYLOG.COMMFLAG)

    # parse options
    args = parser.parse_args()

    db = MythTV.MythDB()
    logger = MYLOG(db)

    if args.jobid:
        job = MythTV.Job(args.jobid, db)
        chanid = job.chanid
        starttime = job.starttime
        timeout = kInput_Timeout
    elif args.chanid and args.starttime:
        job = None
        chanid = args.chanid
        starttime = args.starttime
        timeout = '1'
    else:
        logger.log('Both chanid and starttime must be specified', MYLOG.ERR)
        sys.exit(1)

    # get recording
    try:
        rec = MythTV.Recorded((chanid, starttime), db)
    except:
        if job:
            job.update({'status':job.ERRORED, 'comment':'ERROR: Could not find recording.'})
        logger.log('Could not find recording', MYLOG.ERR)
        sys.exit(1)

    channel = MythTV.Channel(chanid, db)

    logger.log('')
    logger.log('Processing: ' + str(channel.callsign) + ', ' + str(rec.starttime)
               + ', "' + str(rec.title) + ' - ' + str(rec.subtitle) + '"')

    if rec.commflagged == 3:
        commfree = 1
        logger.log('--- Comm-free programme - will detect pre-roll & post-roll adverts only')
    else:
        commfree = 0

    sg = MythTV.findfile(rec.basename, rec.storagegroup, db)
    if sg is None:
        if job:
            job.update({'status':job.ERRORED, 'comment':'ERROR: Local access to recording not found.'})
        logger.log('Local access to recording not found', MYLOG.ERR)
        sys.exit(1)

    # player update message needs prog id
    progId = str(chanid) + '_' + str(starttime)

    # create params with default values
    param = PRESET(logger)
    if args.preset:
        param.getFromArg(args.preset)
    elif args.presetfile: # use preset file
        param.getFromFile(args.presetfile, rec.title, channel.callsign)

    infile = os.path.join(sg.dirname, rec.basename)

    # Purge any existing skip list and flag as in-progress
    rec.markup.clean()
    rec.commflagged = 2
    rec.update()

    # Write out the file contents and keep going till recording is finished.
    p1 = subprocess.Popen([kExe_Catagrower, infile, timeout], stdout = subprocess.PIPE)
    # Pipe through mythffmpeg to extract uncompressed audio stream.
    p2 = subprocess.Popen(["mythffmpeg", "-loglevel", "quiet", "-i", "pipe:0",
                           "-ar", "8000", "-f", "au", "-ac", kUpmix_Channels, "-"],
                          stdin = p1.stdout, stdout = subprocess.PIPE)
    # Pipe to mausc which will spit out a list of breaks.
    p3 = subprocess.Popen([kExe_Mausc] + param.getValues() + [str(commfree)] + [progId],
                          stdin = p2.stdout, stdout = subprocess.PIPE)

    # Store breaks in the database.
    breaks = 0
    level = {'info':MYLOG.INFO, 'debug':MYLOG.DEBUG, 'err':MYLOG.ERR}
    while True:
        line = p3.stdout.readline()
        if line:
            flag, info = line.split('@', 1)
            if flag == 'cut':
                # extract numbers from log
                numbers = re.findall('\d+', info)
                # mark advert in recording
                rec.markup.append(numbers[0], rec.markup.MARK_COMM_START, None)
                rec.markup.append(numbers[1], rec.markup.MARK_COMM_END, None)
                rec.update()
                breaks += 1
                logger.log(info)
            else:
                # use warning for unexpected log levels
                logger.log(info, level.get(flag, MYLOG.WARNING))
        else:
            break

    if job:
        job.update({'status':272, 'comment':'Audio commflag detected %s breaks.' % breaks})
    logger.log('Audio commflag detected %s breaks.' % breaks)

    # Signal commflagging has finished
    if commfree:
        rec.commflagged = 3
    else:
        rec.commflagged = 1
    rec.update()

if __name__ == '__main__':
    main()
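The preset handling in the script relies on Python 2 tuple-unpacking lambdas, which makes the merge rule hard to see. The following is a minimal Python 3 sketch of that rule only: values are positional, and a blank or non-numeric entry keeps the default. The parameter names and defaults are copied from the script; the function name `parse_preset` is hypothetical and the logging is omitted.

```python
import collections

# Names and defaults as listed in the PRESET class of the script
ARG_NAMES = ['thresh', 'minquiet', 'mindetect', 'minbreak', 'maxsep', 'pad']
DEFAULTS = [-75, 0.16, 6, 120, 120, 0.48]


def parse_preset(line):
    """Merge a comma-separated preset string over the default values.

    Hypothetical helper: values are matched to names by position;
    empty or non-numeric entries leave the default in place.
    """
    params = collections.OrderedDict(zip(ARG_NAMES, DEFAULTS))
    if not line:
        return params  # nothing supplied, use all defaults
    values = [v.strip() for v in line.split(',')]
    # zip stops at the shorter list, so surplus values are ignored
    for name, raw in zip(ARG_NAMES, values):
        if raw == '':
            continue  # not supplied
        try:
            params[name] = float(raw)
        except ValueError:
            pass  # invalid entry, keep the default
    return params


if __name__ == "__main__":
    # Override the threshold and the minimum silence count, skip minquiet
    print(list(parse_preset('-70,,5').values()))
    # prints [-70.0, 0.16, 5.0, 120, 120, 0.48]
```

Because matching is positional, a value can only be skipped by leaving its slot empty, as with the second field in the example.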
CC=g++
CFLAGS=-c -Wall -std=c++0x
LDFLAGS=
INCPATH = -I/usr/include/qt4/QtCore -I/usr/include/qt4/QtNetwork -I/usr/include/qt4/QtSql -I/usr/include/qt4 -I/usr/include/mythtv
LIBPATH = -L/usr/lib -L/usr/lib/i386-linux-gnu
LIBS = -lsndfile -lQtCore -lmythbase-0.26 -lmyth-0.26
PREFIX = /usr/local/bin

all: silence.cpp catagrower.cpp silence catagrower

catagrower: catagrower.o
	$(CC) $(LDFLAGS) catagrower.o -o $(PREFIX)/$@

silence: silence.o
	$(CC) $(LDFLAGS) silence.o -o $(PREFIX)/$@ $(LIBPATH) $(LIBS)

.cpp.o:
	$(CC) $(CFLAGS) $(INCPATH) $< -o $@

clean:
	rm -f $(PREFIX)/silence $(PREFIX)/catagrower *.o