Difference between revisions of "Commercial detection with silences"
Dizygotheca (talk | contribs) m (moved Commercial detection with silence for UK freeviewHD to Commercial detection with silences: UK reference is misleading: it also works for Australia, New Zealand, Germany) |
Dizygotheca (talk | contribs) (Add Relevance section & 0.27 tag) |
Revision as of 18:30, 26 December 2013
Author | Hippo |
Description | A python/C++ program based on Mythcommflag-wrapper (thank you Cowbut) that works by detecting short silent periods around commercials. |
Supports | MythTV 0.25, 0.26, 0.27 |
Relevance
- UK: Works well for Freeview/FreeSat SD/HD,
- Australia: Works for Freeview SD/HD,
- New Zealand: Works,
- Germany: Works
Initial Version (by Hippo)
I tried out the scripts in Mythcommflag-wrapper and they worked well on the Freeview channels I receive but not on the FreeviewHD channels. The reason is that the audio on FreeviewHD is an AAC stream and not an MP3 stream. Fixing that would require decoding from AAC and encoding back to MP3 before letting the script analyse the MP3 stream. So I wrote a little C program to analyse an uncompressed audio stream and a Python program to wrap it up and turn the output into a commercial skip list.
To use this
- Compile the C program and put it somewhere the Python program can find it. (e.g. /usr/local/bin)
- Copy the Python program to somewhere the backend can find it.
- Follow the instructions on Mythcommflag-wrapper except the job setting should be 'silence.py %JOBID%'
The Python program uses avconv to decode the programme file to an AU stream. If you don't have avconv, replace it with ffmpeg or mythffmpeg (avconv is the Libav fork's counterpart to ffmpeg). It upconverts the audio to 6 channels so that it works even when the audio switches around. If you know you only ever get stereo you can replace the 6 with 2 to save a bit of CPU power. The channel count might have to go up in future. Up-converting is better because it's low power and always works, whereas down-converting may fail depending on your version of avconv/ffmpeg.
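The decoding step described above can be sketched as a small helper that builds the decoder command line. This is only an illustration: the exact options used by silence.py are not reproduced here, the function name `build_decode_command` is hypothetical, and the option set (`-vn`, `-f au`, `-ac`, `-ar`, `pipe:0`/`pipe:1`) is an assumption based on standard ffmpeg/avconv usage.

```python
def build_decode_command(decoder="avconv", channels=6, sample_rate=None):
    """Return an argument list that decodes stdin to an AU stream on stdout.

    Hypothetical helper: the real script may use different options.
    """
    cmd = [
        decoder,
        "-i", "pipe:0",        # read the recording from stdin
        "-vn",                 # audio only, skip video decoding
        "-f", "au",            # write a Sun AU stream
        "-ac", str(channels),  # up-mix (or down-mix) to a fixed channel count
    ]
    if sample_rate is not None:
        cmd += ["-ar", str(sample_rate)]  # optional resampling
    cmd.append("pipe:1")       # write the result to stdout
    return cmd


if __name__ == "__main__":
    # Stereo-only setup using mythffmpeg instead of avconv
    print(" ".join(build_decode_command("mythffmpeg", channels=2)))
```

Swapping the decoder is then a one-argument change, which matches the advice that avconv, ffmpeg and mythffmpeg are interchangeable here.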
This can do near-realtime commflagging by enabling the backend setting to start commflagging when the recording starts. (mythtv-setup/General/Page9-JobQueueGlobal). The programs mark entries in the cutlist <max-break-setting> after the start of a break is detected so this will be after the commercial break has ended. If you are displaying the programme and get too close to the end you will be in the commercials before they are flagged. C'est la vie.
It's low CPU in that it only decodes the audio stream and since it follows the end of the recording it shouldn't thrash the memory or disk. avconv takes about 2% to decode ITV1-HD on a 1.6GHz Atom Asus motherboard.
Cluster Detecting Version (by dizygotheca)
The basic silence detection algorithm is easily thrown by odd silences that occur within 6 mins of an advert and performs poorly on animations/kids programmes. I was keen to cut adverts out of my kids' shows so I developed an algorithm that detects clusters of silences: adverts are characterised by many silences close together whilst isolated silences within programmes are ignored.
Hippo has provided a good platform for a commflagging script. New features of this version are:
- Determine ad breaks from clusters of silences. Solves those occasional glitches caused by silences within programmes and does a pretty good job on animations/kids programmes. Also allows the silence detection to be more sensitive (to pick up short and/or long silences) as rogue ones will be ignored.
- Integrates the script with Myth logging. Works well with rsyslog (Mythbuntu). Should also work with file logging but I haven't tested it.
- Allows parameters to be varied per-channel and per-programme. Useful for channels with 'noisier' ad breaks, e.g. Dave, and regular programmes where the defaults don't suit you.
- Sends ad breaks to mythplayer as they are found. If you start watching a prog before it has finished recording then the comm-skipping will still work (assuming you're not too close to real-time).
Algorithm
An advert is defined as a cluster of silences, at least <minbreak> long, that is composed of at least <mindetect> silences that occur within <maxsep> of each other.
In practice, silences are detected as a consecutive series of frames having an average audio power below <threshold> for at least <minquiet>. If the interval between a silence and the previous one is less than <maxsep> then they belong to the same cluster; otherwise they lie in different clusters. Clusters that are shorter than <minbreak> or composed of fewer than <mindetect> silences are ignored. Adverts are shortened by <padding> on both sides.
Although adverts are reported in real-time, all silences and clusters are stored - I originally envisaged using post-scan analysis to amend the detected adverts. However, so far, this hasn't proved necessary or viable.
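The clustering rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the code in silence.cpp: it works on silences that have already been detected (so <threshold> and <minquiet> are assumed to have been applied), uses seconds rather than frames, and ignores the special handling of pre-roll/post-roll. The names `Silence` and `find_adverts` are hypothetical.

```python
from collections import namedtuple

# A detected silence, with start and end times in seconds
Silence = namedtuple("Silence", "start end")


def find_adverts(silences, mindetect=6, minbreak=120.0, maxsep=120.0, padding=0.48):
    """Group silences into clusters and return padded (start, end) advert breaks."""
    clusters = []
    for s in sorted(silences):
        # Same cluster if the gap since the previous silence is under maxsep
        if clusters and s.start - clusters[-1][-1].end < maxsep:
            clusters[-1].append(s)
        else:
            clusters.append([s])

    adverts = []
    for c in clusters:
        start, end = c[0].start, c[-1].end
        # Keep only clusters that are long enough and contain enough silences
        if end - start >= minbreak and len(c) >= mindetect:
            adverts.append((start + padding, end - padding))
    return adverts


if __name__ == "__main__":
    # One isolated silence in the programme, then six silences in an ad break
    detected = [Silence(100.0, 100.2)]
    detected += [Silence(t, t + 0.2) for t in (600, 630, 660, 700, 740, 780)]
    print(find_adverts(detected))
```

The isolated silence at 100 seconds forms a cluster of its own and is discarded, which is the behaviour that makes the approach tolerant of quiet moments within a programme.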
Change Summary
- silence.cpp replaces mausc.c. New algorithm. Optionally uses Qt/Myth libs in order to send messages to mythplayer.
- silence.py replaces mausc-wrapper.py. I've updated the deprecated arg parsing, integrated Myth logging and added channel/prog preset handling. It can reside anywhere but I keep mine in /usr/local/bin. It expects the C++ executables to reside in /usr/local/bin/
Upgrading from previous versions
This version communicates with MythPlayer via the Myth Python bindings. Previous versions communicated directly which (optionally) needed Myth & Qt header files to be installed. If you installed libmyth-dev & libqt4-dev just for this reason then they are no longer needed. However be wary of simply uninstalling them - that may break Myth as they also contain libraries. To remove them correctly you will probably have to reinstall Qt & Myth afterwards. It's safer to leave them installed.
Requirements
- Compilation environment (gcc, make) - install package build-essential
- libsndfile for reading audio samples - install package libsndfile-dev
- Python 2.7 for the new argument parser
Building
- Copy silence.cpp, silence.py & Makefile to a new directory and cd there.
- Build the silence executable using "make"
- Install executables & Python script to /usr/local/bin/ using "sudo make install".
- The Makefile works for me using gcc 4.6.3 (Ubuntu 12.04) & Myth 0.26. I'm no expert on C++ standards so earlier versions may need some tinkering.
Notes
- I only use Freeview SD, so I downmix my stereo reception to 1 channel to improve performance. Refer to Hippo's comments above regarding the number of channels and update silence.py (kUpmix_Channels) accordingly.
- You can reduce the audio sample rate (add "-ar 8000" to the avconv command line) to reduce the data throughput. Ultimately all channels/samples are reduced to a single audio power per frame and I haven't noticed any qualitative difference in detection from this change. However, it does affect the decoder load: in my tests it doubled the CPU used by avconv without saving any measurable CPU in silence.cpp. If loading/performance is important to you, you may wish to experiment with this.
- silence.py uses mythffmpeg but, as Hippo states, you can simply replace with avconv/ffmpeg. I notice no difference.
- <minbreak> and <mindetect> do not apply to pre-roll/post-roll (starting/ending) 'adverts'.
- Mythplayer will not auto-skip pre-roll/post-roll breaks. When starting playback you need to manually comm-skip to the programme start.
- The log information can be initially confusing - bear in mind the algorithmic process when interpreting it. The interval of a silence always relates to the previous silence; the interval reported by a cluster always relates to the previous cluster. Silences report their audio power whereas clusters report the number of silences they contain.
- If processing manually, note that silence.py clears any existing comm-skip list on startup. Be aware that this also appears to erase the bookmark (and maybe other mark-ups).
- UK commercials are usually 10-60 secs long. However I have seen occasional film trailers that are 2 min long (of constant noise). Thus <maxsep> defaults to 120. If you don't mind the odd trailer then reducing <maxsep> to 60 or 90 would probably reduce erroneous cuts.
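The notes above mention that every channel and sample is reduced to a single audio power per frame. A minimal Python sketch of that reduction follows, modelled on the arithmetic in silence.cpp (25 frames per second, samples scaled to the 32-bit integer range, threshold given in dB). The function names are hypothetical and the code is illustrative only.

```python
import math

INT_MAX = 2**31 - 1   # samples are assumed to be scaled to the 32-bit int range
VIDEO_RATE = 25.0     # frames per second used to map time to frame counts


def threshold_from_db(db):
    """Convert a threshold in dB (e.g. -75) to a linear sample amplitude."""
    return round(INT_MAX * 10 ** (db / 20.0))


def samples_per_frame(channels, sample_rate):
    """Number of interleaved samples that make up one video frame."""
    return int(channels * sample_rate / VIDEO_RATE)


def frame_is_silent(samples, db=-75.0):
    """A frame is silent if its mean absolute amplitude is below the threshold."""
    avg = sum(abs(s) for s in samples) // len(samples)
    return avg < threshold_from_db(db)


def seconds_to_frames(seconds):
    """Round a duration in seconds up to a whole number of frames."""
    return math.ceil(seconds * VIDEO_RATE)


if __name__ == "__main__":
    print(samples_per_frame(1, 8000))   # mono at 8 kHz: 320 samples per frame
    print(threshold_from_db(-75))
```

Because the per-frame value is an average over all interleaved samples, changing the channel count or sample rate changes only how many samples feed each average, not the number of frames analysed.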
Running
Assuming you use the same locations, your 'Advert-detection command' (mythtv-setup/General/Page 8) should be:
/usr/local/bin/silence.py %JOBID% %VERBOSEMODE% --loglevel debug
You can also run it manually from the command line like this:
silence.py --chanid 1004 --starttime 20130117220000 --loglevel debug
INFO logging shows details of the clusters/cuts, DEBUG logging also shows details of the detected silences.
Performance on SD content with my ageing ASUS M2NPV-VM/Athlon 3500+:
- Flagging a completed recording on an idle system takes 2 min for a 1 hr recording
- Flagging whilst recording uses about 2% of cpu
Channel Presets
When run on its own the Python program uses decent defaults that work pretty well.
However it's also possible to specify parameters to use for specific channels or programmes. A preset file defines values that override the defaults according to programme title or channel callsign. Only one preset can apply - the first applicable - so care is needed when deciding the order. The title/callsigns are considered to be Python regular expressions so beware of the meta-characters. The 8th field is ignored and so can be used for comments/notes. Specify a preset file using the --presetfile option, like this:
/usr/local/bin/silence.py %JOBID% %VERBOSEMODE% --loglevel debug --presetfile /home/eric/.mythtv/silence.preset
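The preset lookup described above (first match wins, names are case-insensitive regular expressions matched from the start of the title or callsign, blank fields keep the default) can be sketched as follows. This is a simplified illustration modelled on the behaviour of silence.py rather than a copy of it; `match_preset` is a hypothetical name and invalid numbers are not handled.

```python
import re

ARG_NAMES = ["thresh", "minquiet", "mindetect", "minbreak", "maxsep", "pad"]
DEFAULTS = [-75, 0.16, 6, 120, 120, 0.48]


def match_preset(lines, title, callsign):
    """Return the parameters for a recording, applying the first matching preset."""
    params = dict(zip(ARG_NAMES, DEFAULTS))
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = [f.strip() for f in line.split(",")]
        # re.match anchors at the START of the string, so "e4" also matches "e4+1"
        pattern = re.compile(fields[0], re.IGNORECASE)
        if pattern.match(title) or pattern.match(callsign):
            # fields 1..6 are parameters; anything after that is a free-text note
            for name, value in zip(ARG_NAMES, fields[1:1 + len(ARG_NAMES)]):
                if value:
                    params[name] = float(value)
            break  # only the first applicable preset is used
    return params


if __name__ == "__main__":
    presets = [
        "# title/callsign, threshold, minquiet, mindetect, minbreak, maxsep, padding",
        "channel 4 news, , 1.00, 1, 55, , 0, short advert, many silences",
        "channel 4, , 0.24,",
    ]
    print(match_preset(presets, "Channel 4 News", "Channel 4"))
    print(match_preset(presets, "Frasier", "Channel 4+1"))
```

Listing the programme preset before the channel preset matters here: with the order reversed, "channel 4" would match first and the news-specific values would never be applied.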
Once you understand the logging information you can easily tune your own channels/programmes by experimenting with the --preset option directly from a command line until you get decent results. For example:
silence.py --chanid=1004 --starttime=20130117220000 --loglevel=debug --preset="-80,,3,,180"
As of v0.26 the Myth database uses UTC time. However the Python bindings (used by the script) use localtime by default. Therefore determining the proper starttime argument can be frustrating, as it depends on your timezone and DST. Using an ISO format starttime (YYYY-MM-DDThh:mm:ss+hh:mm) is useful here. For example, in a timezone of UTC+9 both of the following examples will find a recording that started at 9:58pm.
This will allow you to specify a UTC time (as derived from the Myth database 'recorded' table):
env TZ=UTC silence.py --chanid=1004 --starttime=2013-01-17T12:58:00
Or use local time and add a TZ qualifier:
silence.py --chanid=1004 --starttime=2013-01-17T21:58:00+09:00
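If you are unsure which UTC value corresponds to a local start time, the conversion can be checked with a few lines of Python. This is a standalone sketch using only the standard library; `local_to_utc` is a hypothetical helper and is not part of silence.py or the Myth bindings.

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(local_iso, utc_offset_hours):
    """Convert a local ISO timestamp (YYYY-MM-DDThh:mm:ss) to UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(local_iso, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=tz)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


if __name__ == "__main__":
    # 9:58pm in a UTC+9 timezone
    print(local_to_utc("2013-01-17T21:58:00", 9))  # 2013-01-17T12:58:00
```

The result matches the two command lines above: 21:58 at UTC+9 is 12:58 UTC on the same day.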
This is my preset file which customises the processing of 4 regular programmes and 'tunes' some channels.
# presets for silence.py
# use comma separated values: defaults are used for absent values
# For titles/callsign the name is a python regular expressions, case is ignored.
# Re Metachars are
# . ^ $ * + ? { } [ ] \ | ( )
# If a title contains one of these, then escape it (using \) or replace it with full stop
# Names are matched to the START of a title/callsign so "e4" also matches "e4+1"
# First name match is used so put specific presets (ie. programmes) before general ones (channels)
#
# title/callsign, threshold, minquiet, mindetect, minbreak, maxsep, padding
# defaults -75, 0.16, 6, 120, 120, 0.48,
#
frasier, , 0.28, , , 91, , long pauses in prog
channel 4 news, , 1.00, 1, 55, , 0, short advert, many silences
milkshake, , 0.48, 8, 60, 61, , ignore short silences in animation/links
rude tube, , 0.32, , 180, 61, , ignore short silences in links
channel 4, , 0.24,
more 4, , 0.24,
dave, -71, , , , , , loud silences
quest, , , , , 55, , short silences, long breaks, short ads
channel 5, , 0.24, 2, , 300, , cut news out of films
itv, , , , , , 1.0, long pad for films
film 4, , , , , , 1.0, long pad for films
bbc, , 0.48, 1, 20, 360, 0, pre/post-roll
cbeebies, , 0.48, 1, 20, 360, 0, pre/post-roll
cbbc, , 0.48, 1, 20, 360, 0, pre/post-roll
Australian Channel Presets
The following preset file is configured to suit Australian HD and SD Freeview channels. The defaults provided work well for many channels and shows, the exception being Nine's group of channels which require a different audio threshold. No effort has (yet) been made to tune for individual shows.
# presets for silence.py
# use comma separated values: defaults are used for absent values
# For titles/callsign the name is a python regular expressions, case is ignored.
# Re Metachars are
# . ^ $ * + ? { } [ ] \ | ( )
# If a title contains one of these, then escape it (using \) or replace it with full stop
# Names are matched to the START of a title/callsign so "e4" also matches "e4+1"
# First name match is used so put specific presets (ie. programmes) before general ones (channels)
#
# title/callsign, threshold, minquiet, mindetect, minbreak, maxsep, padding
# defaults -75, 0.16, 6, 120, 120, 0.48,
#
# Defaults for Australian Freeview channels.
NINE DIGITAL, -73, 0.16, 6, 150, 60, 0.48,
GEM, -73, 0.16, 5, 120, 60, 0.48,
GO!, -73, 0.16, 5, 120, 60, 0.48,
7 Digital, -75, 0.16, 5, 150, 60, 0.48,
#ABC1 - No ads, have not bothered attempting to configure for preroll or postroll.
#ABC News 24 - No ads
#ABC2 - ABC4 - No ads, have not bothered attempting to configure for preroll or postroll.
#ABC3 - No ads, have not bothered attempting to configure for preroll or postroll.
#7mate – defaults working okay with limited testing
#7TWO – defaults working okay so far
#TEN Digital – defaults working okay so far
#ELEVEN – defaults working okay so far
#ONE – defaults working okay so far
#SBS ONE – defaults working okay with limited testing
#SBS TWO – defaults working okay with limited testing
#SBS HD – defaults working okay with limited testing
German Channel Presets
The following preset file is configured to suit German HD and SD Freeview channels. The defaults provided work well for many channels and shows, the exception being ProSieben, which requires a different audio threshold and minquiet. Not all channels have been tested yet.
# presets for silence.py
# use comma separated values: defaults are used for absent values
#
# For titles/callsign the name is a python regular expression, case is ignored.
# Re Metachars are
# . ^ $ * + ? { } [ ] \ | ( )
# If a title contains one of these, then escape it (using \) or replace it with full stop
# Names are matched to the START of a title/callsign so "e4" also matches "e4+1"
# First name match is used so put specific presets (ie. programmes) before general ones (channels)
#
# threshold: (float) silence threshold in dB.
# minquiet : (float) minimum time for silence detection in seconds.
# mindetect: (float) minimum number of silences to constitute an advert.
# minlength: (float) minimum length of advert break in seconds.
# maxsep : (float) maximum time between silences in an advert break in seconds.
# padding : (float) padding for each cut point in seconds.
#
# title/callsign, threshold, minquiet, mindetect, minlength, maxsep, padding
# defaults , -75, 0.16, 6, 120, 120, 0.48
#
prosieben,-90,0.12,,,,1
# channels doing well with defaults
# kabel eins
# sat.1
# rtl austria
# rtl2
# super rtl
# vox
Trouble Shooting
- If you get "Local access to recording not found" errors then ensure your Storage Group directories (mythtv-setup/Storage Groups) have a trailing slash (/) on the end. See [[1]]
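- If your commflagging jobs report 126/127 adverts found, this signifies an error when trying to run the job: 126 and 127 are the shell exit statuses for a command that cannot be executed and a command that cannot be found. Check the path and file permissions of the executables.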
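The 126/127 figures can be reproduced outside Myth to confirm what they mean. The sketch below is an illustration only: it assumes a POSIX `sh` is available, and `shell_exit_code` is a hypothetical helper, not part of silence.py.

```python
import os
import shlex
import subprocess
import tempfile


def shell_exit_code(command):
    """Run a command line through sh and return its exit status."""
    return subprocess.call(["sh", "-c", command], stderr=subprocess.DEVNULL)


if __name__ == "__main__":
    # A path that does not exist: the shell reports "not found"
    print(shell_exit_code("/nonexistent/dir/silence"))

    # A file that exists but has no execute permission
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"#!/bin/sh\nexit 0\n")
    os.chmod(f.name, 0o644)
    print(shell_exit_code(shlex.quote(f.name)))
    os.remove(f.name)
```

If the first case matches what you see, check that the job command points at the right location; if the second, fix the permissions with chmod.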
Code
silence.cpp
// Based on mausc.c by Tinsel Phipps.
// v1.0 Roger Siddons
// v2.0 Roger Siddons: Flag clusters asap, fix segfaults, optional headers
// v3.0 Roger Siddons: Remove lib dependencies & commfree
// v4.0 Kill process argv[1] when idle for 30 seconds.
// v4.1 Fix averaging overflow
// Public domain. Requires libsndfile
// Detects commercial breaks using clusters of audio silences

#include <cstdlib>
#include <cmath>
#include <cerrno>
#include <climits>
#include <deque>
#include <sndfile.h>
#include <unistd.h>
#include <signal.h>

typedef unsigned frameNumber_t;
typedef unsigned frameCount_t;

// Output to python wrapper requires prefix to indicate level
#define DELIMITER "@" // must correlate with python wrapper
char prefixdebug[7] = "debug" DELIMITER;
char prefixinfo[6] = "info" DELIMITER;
char prefixerr[5] = "err" DELIMITER;
char prefixcut[5] = "cut" DELIMITER;

void error(const char* mesg, bool die = true)
{
    printf("%s%s\n", prefixerr, mesg);
    if (die)
        exit(1);
}

pid_t tail_pid = 0;

void watchdog(int sig)
{
    if (0 != tail_pid)
        kill(tail_pid, SIGTERM);
}

namespace Arg
// Program argument management
{
const float kvideoRate = 25.0; // sample rate in fps (maps time to frame count)
const frameCount_t krateInMins = kvideoRate * 60; // frames per min
unsigned useThreshold; // Audio level of silence
frameCount_t useMinQuiet; // Minimum length of a silence to register
unsigned useMinDetect; // Minimum number of silences that constitute an advert
frameCount_t useMinLength; // adverts must be at least this long
frameCount_t useMaxSep; // silences must be closer than this to be in the same cluster
frameCount_t usePad; // padding for each cut

void usage()
{
    error("Usage: silence <tail_pid> <threshold> <minquiet> <mindetect> <minlength> <maxsep> <pad>", false);
    error("<tail_pid> : (int) Process ID to be killed after idle timeout.", false);
    error("<threshold>: (float) silence threshold in dB.", false);
    error("<minquiet> : (float) minimum time for silence detection in seconds.", false);
    error("<mindetect>: (float) minimum number of silences to constitute an advert.", false);
    error("<minlength>: (float) minimum length of advert break in seconds.", false);
    error("<maxsep> : (float) maximum time between silences in an advert break in seconds.", false);
    error("<pad> : (float) padding for each cut point in seconds.", false);
    error("AU format audio is expected on stdin.", false);
    error("Example: silence 4567 -75 0.1 5 60 90 1 < audio.au");
}

void parse(int argc, char **argv)
// Parse args and convert to useable values (frames)
{
    if (8 != argc)
        usage();

    float argThreshold; // db
    float argMinQuiet; // secs
    float argMinDetect;
    float argMinLength; // secs
    float argMaxSep; // secs
    float argPad; // secs

    /* Load options. */
    if (1 != sscanf(argv[1], "%d", &tail_pid))
        error("Could not parse tail_pid option into a number");
    if (1 != sscanf(argv[2], "%f", &argThreshold))
        error("Could not parse threshold option into a number");
    if (1 != sscanf(argv[3], "%f", &argMinQuiet))
        error("Could not parse minquiet option into a number");
    if (1 != sscanf(argv[4], "%f", &argMinDetect))
        error("Could not parse mindetect option into a number");
    if (1 != sscanf(argv[5], "%f", &argMinLength))
        error("Could not parse minlength option into a number");
    if (1 != sscanf(argv[6], "%f", &argMaxSep))
        error("Could not parse maxsep option into a number");
    if (1 != sscanf(argv[7], "%f", &argPad))
        error("Could not parse pad option into a number");

    /* Scale threshold to integer range that libsndfile will use. */
    useThreshold = rint(INT_MAX * pow(10, argThreshold / 20));

    /* Scale times to frames. */
    useMinQuiet = ceil(argMinQuiet * kvideoRate);
    useMinDetect = (int)argMinDetect;
    useMinLength = ceil(argMinLength * kvideoRate);
    useMaxSep = rint(argMaxSep * kvideoRate + 0.5);
    usePad = rint(argPad * kvideoRate + 0.5);

    printf("%sThreshold=%.1f, MinQuiet=%.2f, MinDetect=%.1f, MinLength=%.1f, MaxSep=%.1f, Pad=%.2f\n",
           prefixdebug, argThreshold, argMinQuiet, argMinDetect, argMinLength, argMaxSep, argPad);
    printf("%sFrame rate is %.2f, Detecting silences below %d that last for at least %d frames\n",
           prefixdebug, kvideoRate, useThreshold, useMinQuiet);
    printf("%sClusters are composed of a minimum of %d silences closer than %d frames and must be\n",
           prefixdebug, useMinDetect, useMaxSep);
    printf("%slonger than %d frames in total. Cuts will be padded by %d frames\n",
           prefixdebug, useMinLength, usePad);
    printf("%s< preroll, > postroll, - advert, ? too few silences, # too short, = comm flagged\n", prefixdebug);
    printf("%s Start - End Start - End Duration Interval Level/Count\n", prefixinfo);
    printf("%s frame - frame (mmm:ss-mmm:ss) frame (mm:ss.s) frame (mmm:ss)\n", prefixinfo);
}
}

class Silence
// Defines a silence
{
public:
    enum state_t {progStart, detection, progEnd};
    static const char state_log[3];

    const state_t state; // type of silence
    const frameNumber_t start; // frame of start
    frameNumber_t end; // frame of end
    frameCount_t length; // number of frames
    frameCount_t interval; // frames between end of last silence & start of this one
    double power; // average power level

    Silence(frameNumber_t _start, double _power = 0, state_t _state = detection)
        : state(_state), start(_start), end(_start), length(1), interval(0), power(_power) {}

    void extend(frameNumber_t frame, double _power)
    // Define end of the silence
    {
        end = frame;
        length = frame - start + 1;
        // maintain running average power: = (oldpower * (newlength - 1) + newpower)/ newlength
        power += (_power - power)/length;
    }
};
// c++0x doesn't allow initialisation within class
const char Silence::state_log[3] = {'<', ' ', '>'};

class Cluster
// A cluster of silences
{
private:
    void setState()
    {
        if (this->start->start == 1)
            state = preroll;
        else if (this->end->state == Silence::progEnd)
            state = postroll;
        else if (length < Arg::useMinLength)
            state = tooshort;
        else if (silenceCount < Arg::useMinDetect)
            state = toofew;
        else
            state = advert;
    }

public:
    // tooshort..unset are transient states - they may be updated, preroll..postroll are final
    enum state_t {tooshort, toofew, unset, preroll, advert, postroll};
    static const char state_log[6];
    static frameNumber_t completesAt; // frame where the most recent cluster will complete

    state_t state; // type of cluster
    const Silence* start; // first silence
    Silence* end; // last silence
    frameNumber_t padStart, padEnd; // padded cluster start/end frames
    unsigned silenceCount; // number of silences
    frameCount_t length; // number of frames
    frameCount_t interval; // frames between end of last cluster and start of this one

    Cluster(Silence* s)
        : state(unset), start(s), end(s), silenceCount(1), length(s->length), interval(0)
    {
        completesAt = end->end + Arg::useMaxSep; // finish cluster <maxsep> beyond silence end
        setState();
        // pad everything except pre-rolls
        padStart = (state == preroll ? 1 : start->start + Arg::usePad);
    }

    void extend(Silence* _end)
    // Define end of a cluster
    {
        end = _end;
        silenceCount++;
        length = end->end - start->start + 1;
        completesAt = end->end + Arg::useMaxSep; // finish cluster <maxsep> beyond silence end
        setState();
        // pad everything except post-rolls
        padEnd = end->end - (state == postroll ? 0 : Arg::usePad);
    }
};
// c++0x doesn't allow initialisation within class
const char Cluster::state_log[6] = {'#', '?', '.', '<', '-', '>'};
frameNumber_t Cluster::completesAt = 0;

class ClusterList
// Manages a list of detected silences and a list of assigned clusters
{
protected:
    // list of detected silences
    std::deque<Silence*> silence;
    // list of deduced clusters of the silences
    std::deque<Cluster*> cluster;

public:
    Silence* insertStartSilence()
    // Inserts a fake silence at the front of the silence list
    {
        // create a single frame silence at frame 1 and insert it at front
        Silence* ref = new Silence(1, 0, Silence::progStart);
        silence.push_front(ref);
        return ref;
    }

    void addSilence(Silence* newSilence)
    // Adds a silence detection to the end of the silence list
    {
        // set interval between this & previous silence/prog start
        newSilence->interval = newSilence->start
                - (silence.empty() ? 1 : silence.back()->end - 1);
        // store silence
        silence.push_back(newSilence);
    }

    void addCluster(Cluster* newCluster)
    // Adds a cluster to end of the cluster list
    {
        // set interval between new cluster & previous one/prog start
        newCluster->interval = newCluster->start->start
                - (cluster.empty() ? 1 : cluster.back()->end->end - 1);
        // store cluster
        cluster.push_back(newCluster);
    }
};

Silence* currentSilence; // the silence currently being detected/built
Cluster* currentCluster; // the cluster currently being built
ClusterList* clist; // List of completed silences & clusters

void report(const char* err,
            const char type,
            const char* msg1,
            const frameNumber_t start,
            const frameNumber_t end,
            const frameNumber_t interval,
            const int power)
// Logs silences/clusters/cuts in a standard format
{
    frameCount_t duration = end - start + 1;

    printf("%s%c %7s %6d-%6d (%3d:%02ld-%3d:%02ld), %4d (%2d:%04.1f), %5d (%3d:%02ld), [%7d]\n",
           err, type, msg1, start, end,
           (start+13) / Arg::krateInMins, lrint(start / Arg::kvideoRate) % 60,
           (end+13) / Arg::krateInMins, lrint(end / Arg::kvideoRate) % 60,
           duration, (duration+1) / Arg::krateInMins, fmod(duration / Arg::kvideoRate, 60),
           interval, (interval+13) / Arg::krateInMins, lrint(interval / Arg::kvideoRate) % 60,
           power);
}

void processSilence()
// Process a silence detection
{
    // ignore detections that are too short
    if (currentSilence->state == Silence::detection && currentSilence->length < Arg::useMinQuiet)
    {
        // throw it away
        delete currentSilence;
        currentSilence = NULL;
    }
    else
    {
        // record new silence
        clist->addSilence(currentSilence);

        // assign it to a cluster
        if (currentCluster)
        {
            // add to existing cluster
            currentCluster->extend(currentSilence);
        }
        else if (currentSilence->interval <= Arg::useMaxSep) // only possible for very first silence
        {
            // First silence is close to prog start so extend cluster to the start
            // by inserting a fake silence at prog start and starting the cluster there
            currentCluster = new Cluster(clist->insertStartSilence());
            currentCluster->extend(currentSilence);
        }
        else
        {
            // this silence is the start of a new cluster
            currentCluster = new Cluster(currentSilence);
        }
        report(prefixdebug, currentSilence->state_log[currentSilence->state], "Silence",
               currentSilence->start, currentSilence->end,
               currentSilence->interval, currentSilence->power);

        // silence is now owned by the list, start looking for next
        currentSilence = NULL;
    }
}

void processCluster()
// Process a completed cluster
{
    // record new cluster
    clist->addCluster(currentCluster);

    report(prefixinfo, currentCluster->state_log[currentCluster->state], "Cluster",
           currentCluster->start->start, currentCluster->end->end,
           currentCluster->interval, currentCluster->silenceCount);

    // only flag clusters at final state
    if (currentCluster->state > Cluster::unset)
        report(prefixcut, '=', "Cut", currentCluster->padStart, currentCluster->padEnd, 0, 0);

    // cluster is now owned by the list, start looking for next
    currentCluster = NULL;
}

int main(int argc, char **argv)
// Detect silences and allocate to clusters
{
    // Remove logging prefixes if writing to terminal
    if (isatty(1))
        prefixcut[0] = prefixinfo[0] = prefixdebug[0] = prefixerr[0] = '\0';

    // flush output buffer after every line
    setvbuf(stdout, NULL, _IOLBF, 0);

    Arg::parse(argc, argv);

    /* Check the input is an audiofile. */
    SF_INFO metadata;
    SNDFILE* input = sf_open_fd(STDIN_FILENO, SFM_READ, &metadata, SF_FALSE);
    if (NULL == input)
    {
        error("libsndfile error:", false);
        error(sf_strerror(NULL));
    }

    /* Allocate data buffer to contain audio data from one video frame. */
    const size_t frameSamples = metadata.channels * metadata.samplerate / Arg::kvideoRate;
    int* samples = (int*)malloc(frameSamples * sizeof(int));
    if (NULL == samples)
        error("Couldn't allocate memory");

    // create silence/cluster list
    clist = new ClusterList();

    // Kill head of pipeline if timeout happens.
    signal(SIGALRM, watchdog);
    alarm(30);

    // Process the input one frame at a time and process cuts along the way.
    frameNumber_t frames = 0;
    while (frameSamples == static_cast<size_t>(sf_read_int(input, samples, frameSamples)))
    {
        alarm(30);
        frames++;

        // determine average audio level in this frame
        unsigned long long avgabs = 0;
        for (unsigned i = 0; i < frameSamples; i++)
            avgabs += abs(samples[i]);
        avgabs = avgabs / frameSamples;

        // check for a silence
        if (avgabs < Arg::useThreshold)
        {
            if (currentSilence)
            {
                // extend current silence
                currentSilence->extend(frames, avgabs);
            }
            else // transition to silence
            {
                // start a new silence
                currentSilence = new Silence(frames, avgabs);
            }
        }
        else if (currentSilence) // transition out of silence
        {
            processSilence();
        }
        // in noise: check for cluster completion
        else if (currentCluster && frames > currentCluster->completesAt)
        {
            processCluster();
        }
    }

    // Complete any current silence (prog may have finished in silence)
    if (currentSilence)
    {
        processSilence();
    }

    // extend any cluster close to prog end
    if (currentCluster && frames <= currentCluster->completesAt)
    {
        // generate a silence at prog end and extend cluster to it
        currentSilence = new Silence(frames, 0, Silence::progEnd);
        processSilence();
    }

    // Complete any final cluster
    if (currentCluster)
    {
        processCluster();
    }
}
silence.py
#!/usr/bin/env python
# Build a skiplist from silence in the audio track.
# Roger Siddons v1.0
# v2.0 Fix progid for job/player messages
# v3.0 Send player messages via Python
# v3.1 Fix commflag status, pad preset. Improve style & make Python 3 compatible
# v4.0 silence.cpp will kill the head of the pipeline (tail) when recording finished
# v4.1 Use unicode for foreign chars
# v4.2 Prevent BE writeStringList errors

import MythTV
import os
import subprocess
import argparse
import collections
import re
import sys

kExe_Silence = '/usr/local/bin/silence'
kUpmix_Channels = '6'  # Change this to 2 if you never have surround sound in your recordings.


class MYLOG(MythTV.MythLog):
    "A specialised logger"

    def __init__(self, db):
        "Initialise logging"
        MythTV.MythLog.__init__(self, 'm', db)

    def log(self, msg, level=MythTV.MythLog.INFO):
        "Log message"
        # prepend string to msg so that rsyslog routes it to correct logfile
        MythTV.MythLog.log(self, MythTV.MythLog.COMMFLAG, level,
                           'mythcommflag: ' + msg.rstrip('\n'))


class PRESET:
    "Manages the presets (parameters passed to the detection algorithm)"

    # define arg ordering and default values
    argname = ['thresh', 'minquiet', 'mindetect', 'minbreak', 'maxsep', 'pad']
    argval = [-75, 0.16, 6, 120, 120, 0.48]
    # dictionary holds value for each arg
    argdict = collections.OrderedDict(list(zip(argname, argval)))

    def _validate(self, k, v):
        "Converts arg input from string to float or None if invalid/not supplied"
        if v is None or v == '':
            return k, None
        try:
            return k, float(v)
        except ValueError:
            self.logger.log('Preset ' + k + ' (' + str(v) + ') is invalid - will use default',
                            MYLOG.ERR)
            return k, None

    def __init__(self, _logger):
        "Initialise preset manager"
        self.logger = _logger

    def getFromArg(self, line):
        "Parses preset values from command-line string"
        self.logger.log('Parsing presets from "' + line + '"', MYLOG.DEBUG)
        if line:  # ignore empty string
            vals = [i.strip() for i in line.split(',')]  # split individual params
            # convert supplied values to float & match to appropriate arg name
            validargs = list(map(self._validate, self.argname, vals[0:len(self.argname)]))
            # remove missing/invalid values from list & replace default values with the rest
            self.argdict.update(v for v in validargs if v[1] is not None)

    def getFromFile(self, filename, title, callsign):
        "Gets preset values from a file"
        self.logger.log('Using preset file "' + filename + '"', MYLOG.DEBUG)
        try:
            with open(filename) as presets:
                for rawline in presets:
                    line = rawline.strip()
                    if line and (not line.startswith('#')):  # ignore empty & comment lines
                        vals = [i.strip() for i in line.split(',')]  # split individual params
                        # match preset name to recording title or channel
                        pattern = re.compile(vals[0], re.IGNORECASE)
                        if pattern.match(title) or pattern.match(callsign):
                            self.logger.log('Using preset "' + line.strip() + '"')
                            # convert supplied values to float & match to appropriate arg name
                            validargs = list(map(self._validate, self.argname,
                                                 vals[1:1 + len(self.argname)]))
                            # remove missing/invalid values from list &
                            # replace default values with the rest
                            self.argdict.update(v for v in validargs if v[1] is not None)
                            break
                else:
                    self.logger.log('No preset found for "' + title.encode('utf-8') + '" or "'
                                    + callsign.encode('utf-8') + '"')
        except IOError:
            self.logger.log('Presets file "' + filename + '" not found', MYLOG.ERR)
        return self.argdict

    def getValues(self):
        "Returns params as a list of strings"
        return [str(i) for i in list(self.argdict.values())]


def main():
    "Commflag a recording"
    # define options
    parser = argparse.ArgumentParser(description='Commflagger')
    parser.add_argument('--preset',
                        help='Specify values as "Threshold, MinQuiet, MinDetect, MinLength, MaxSep, Pad"')
    parser.add_argument('--presetfile', help='Specify file containing preset values')
    parser.add_argument('--chanid', help='Use chanid for manual operation')
    parser.add_argument('--starttime', help='Use starttime for manual operation')
    parser.add_argument('jobid', nargs='?', help='Myth job id')

    # must set up log attributes before Db locks them
    MYLOG.loadArgParse(parser)
    MYLOG._setmask(MYLOG.COMMFLAG)

    # parse options
    args = parser.parse_args()

    db = MythTV.MythDB()
    logger = MYLOG(db)
    be = MythTV.BECache(db=db)

    if args.jobid:
        job = MythTV.Job(args.jobid, db)
        chanid = job.chanid
        starttime = job.starttime
    elif args.chanid and args.starttime:
        job = None
        chanid = args.chanid
        starttime = args.starttime
    else:
        logger.log('Both chanid and starttime must be specified', MYLOG.ERR)
        sys.exit(1)

    # get recording
    try:
        rec = MythTV.Recorded((chanid, starttime), db)
    except:
        if job:
            job.update({'status': job.ERRORED, 'comment': 'ERROR: Could not find recording.'})
        logger.log('Could not find recording', MYLOG.ERR)
        sys.exit(1)

    channel = MythTV.Channel(chanid, db)

    logger.log('')
    logger.log('Processing: ' + channel.callsign.encode('utf-8') + ', ' + str(rec.starttime)
               + ', "' + rec.title.encode('utf-8') + ' - ' + rec.subtitle.encode('utf-8') + '"')

    sg = MythTV.findfile(rec.basename, rec.storagegroup, db)
    if sg is None:
        if job:
            job.update({'status': job.ERRORED,
                        'comment': 'ERROR: Local access to recording not found.'})
        logger.log('Local access to recording not found', MYLOG.ERR)
        sys.exit(1)

    # player update message needs prog id (with time in Qt::ISODate format)
    progId = str(chanid) + '_' + str(starttime).replace(' ', 'T')

    # create params with default values
    param = PRESET(logger)
    # read any supplied presets
    if args.preset:
        param.getFromArg(args.preset)
    elif args.presetfile:  # use preset file
        param.getFromFile(args.presetfile, rec.title, channel.callsign)

    infile = os.path.join(sg.dirname, rec.basename)

    # Purge any existing skip list and flag as in-progress
    rec.commflagged = 2
    rec.markup.clean()
    rec.update()

    # Write out the file contents and keep going till recording is finished.
    p1 = subprocess.Popen(["tail", "--follow", "--bytes=+1", infile], stdout=subprocess.PIPE)
    # Pipe through ffmpeg to extract uncompressed audio stream.
    p2 = subprocess.Popen(["mythffmpeg", "-loglevel", "quiet", "-i", "pipe:0",
                           "-f", "au", "-ac", kUpmix_Channels, "-"],
                          stdin=p1.stdout, stdout=subprocess.PIPE)
    # Pipe to silence which will spit out formatted log lines
    p3 = subprocess.Popen([kExe_Silence, "%d" % p1.pid] + param.getValues(),
                          stdin=p2.stdout, stdout=subprocess.PIPE)

    # Process log output
    breaks = 0
    level = {'info': MYLOG.INFO, 'debug': MYLOG.DEBUG, 'err': MYLOG.ERR}
    while True:
        line = p3.stdout.readline()
        if line:
            flag, info = line.split('@', 1)
            if flag == 'cut':
                # extract numbers from log
                numbers = re.findall('\d+', info)
                logger.log(info)
                # mark advert in database
                rec.markup.append(int(numbers[0]), rec.markup.MARK_COMM_START, None)
                rec.markup.append(int(numbers[1]), rec.markup.MARK_COMM_END, None)
                rec.update()
                breaks += 1
                # send advert skiplist to MythPlayers
                tuplelist = [(str(x) + ':' + str(rec.markup.MARK_COMM_START),
                              str(y) + ':' + str(rec.markup.MARK_COMM_END))
                             for x, y in rec.markup.getskiplist()]
                mesg = 'COMMFLAG_UPDATE ' + progId + ' ' \
                       + ','.join([x for tuple in tuplelist for x in tuple])
                # logger.log(' Sending ' + mesg, MYLOG.DEBUG)
                result = be.backendCommand("MESSAGE[]:[]" + mesg)
                if result != 'OK':
                    logger.log('Backend message failed, response = %s, message = MESSAGE[]:[]%s'
                               % (result, mesg), MYLOG.ERR)
            elif flag in level:
                logger.log(info, level.get(flag))
            else:  # unexpected prefix
                # use warning for unexpected log levels
                logger.log(flag, MYLOG.WARNING)
        else:
            break

    # Signal comflagging has finished
    rec.commflagged = 1
    rec.update()

    if job:
        job.update({'status': 272, 'comment': 'Detected %s adverts.' % breaks})
    logger.log('Detected %s adverts.' % breaks)

    # Finishing too quickly can cause writeStringList/socket errors in the BE.
    # A short delay prevents this
    import time
    time.sleep(1)


if __name__ == '__main__':
    main()
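The way a --preset string overrides the defaults can be tried without a MythTV installation. The following is a minimal sketch that mirrors the behaviour of PRESET.getFromArg under stated assumptions: merge_preset is a hypothetical helper that is not part of silence.py, and the argument names and default values are copied from the PRESET class above. Blank or invalid fields keep their defaults, and fields beyond the sixth are ignored.

```python
import collections

# Argument order and default values copied from the PRESET class in silence.py.
ARG_NAMES = ['thresh', 'minquiet', 'mindetect', 'minbreak', 'maxsep', 'pad']
ARG_DEFAULTS = [-75, 0.16, 6, 120, 120, 0.48]


def merge_preset(line):
    """Return the parameters after applying a comma-separated preset string.

    Hypothetical helper for illustration only; silence.py does this work
    inside PRESET.getFromArg and also logs invalid values.
    """
    params = collections.OrderedDict(zip(ARG_NAMES, ARG_DEFAULTS))
    if not line:  # empty string: nothing to override
        return params
    fields = [f.strip() for f in line.split(',')]
    # zip stops at the shorter sequence, so surplus fields are dropped
    # and missing trailing fields leave their defaults untouched.
    for name, field in zip(ARG_NAMES, fields):
        if field == '':
            continue  # blank field: keep the default
        try:
            params[name] = float(field)
        except ValueError:
            pass  # not a number: keep the default
    return params


if __name__ == '__main__':
    # Override only the threshold and the minimum detection count.
    for name, value in merge_preset('-70,,5').items():
        print(name, value)
```

With the preset '-70,,5', only thresh and mindetect change; minquiet and the remaining parameters stay at their defaults.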
Makefile
CC = g++
CFLAGS = -c -Wall -std=c++0x
LIBPATH = -L/usr/lib
TARGETDIR = /usr/local/bin

.PHONY: clean install

all: silence

silence: silence.o
	$(CC) silence.o -o $@ $(LIBPATH) -lsndfile

.cpp.o:
	$(CC) $(CFLAGS) $< -o $@

install: silence silence.py
	install -p -t $(TARGETDIR) $^

clean:
	-rm -f silence *.o