MythNetvision Grabber Script Format

From MythTV Official Wiki
Revision as of 02:20, 8 January 2010 by Iamlindoro (talk | contribs) (A MythNetvision Video Item)

MythNetvision Grabbers

Grabber API Search Tree view Channel icon Download Location
[1] bliptv.png Included in MythNetvision
[2] dailymotion.jpg Included in MythNetvision
[3] joost.png Included in MythNetvision
[4] mtv.png Included in MythNetvision
[5] tmdb_nv.png Included in MythNetvision
[6] vimeo.jpg Included in MythNetvision
[7] youtube.png Included in MythNetvision
  • If you develop your own grabber, please add its details to this table.
  • Grabbers posted here must not violate the source site's TOS (Terms of Service). If your script does not comply with the site's TOS, please host and advertise it yourself.
  • Grabbers listed here do not have to be based on an API. Grabbers can be "screen scrapers" as long as they do not violate TOS. Just be prepared for more frequent support requests as site changes will often break a screen scraping grabber.

Grabber Script Standards

  • Every grabber must support a search option, a tree view option, or both.
  • Grabbers expect that the environment is set to UTF-8.
  • A search grabber by default should return a maximum of 20 video items per page. This usually keeps the response time to a reasonable duration.
  • A search grabber must support paging, with the page number given through the "-p" or "--pagenumber" command line option. The grabber must translate the page number into whatever the source API requires. Sometimes that is a page number and sometimes a video index (e.g. the 21st video of the search results).
  • A grabber's file name must have a matching image file name for an icon to be displayed by the MythNetvision plugin.
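
The page-number translation mentioned above can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes a source API that wants a 1-based video index and the default page size of 20.

```python
RESULTS_PER_PAGE = 20  # default maximum number of items per page


def page_to_start_index(page_number, per_page=RESULTS_PER_PAGE):
    """Translate a 1-based page number into a 1-based video index."""
    if page_number < 1:
        raise ValueError("page numbers start at 1")
    return (page_number - 1) * per_page + 1


# Page 2 starts at the 21st video of the search results.
print(page_to_start_index(2))
```

A source API that takes page numbers directly needs no translation at all; one that uses 0-based offsets would drop the "+ 1".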

Grabber Script Command Line Options

Command Line Option Arguments Example Description
-v None /usr/share/mythtv/mythnetvision/scripts/ -v
  • This option is used by MythNetvision to identify the grabber's name and its capabilities, which can include search and/or tree view.
  • The format is grabber title then a "|" character followed by the grabber's supported options search and/or tree view.
  • Example: "YouTube|ST" says that the "YouTube" grabber supports both MythNetvision search and tree view options.
-p Unsigned Integer /usr/share/mythtv/mythnetvision/scripts/ -p 2 -S "A Search Term"
  • Page number. Always takes an integer argument and is only used with the search option (-S). When not provided, the page number defaults to one.
-S Text String /usr/share/mythtv/mythnetvision/scripts/ -p 2 -S "A Search Term"
  • Search option. Requires a search term made of one or more words.
  • Only required if the grabber supports a Search view.
-T None /usr/share/mythtv/mythnetvision/scripts/ -T
  • Tree view. Returns a properly formatted tree view RSS feed.
  • Only required if a grabber supports a Tree View.
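
As a rough sketch, the options in the table could be parsed with Python's argparse module. The grabber title "Example", the helper names, and the dest names are all made up for illustration; only the option letters, "--pagenumber", and the "Title|ST" format come from the table above.

```python
import argparse

GRABBER_TITLE = "Example"  # hypothetical grabber title
CAPABILITIES = "ST"        # S = search, T = tree view


def positive_int(text):
    """argparse type for the page number: an integer of 1 or more."""
    value = int(text)
    if value < 1:
        raise argparse.ArgumentTypeError("page number must be 1 or greater")
    return value


def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical grabber")
    parser.add_argument("-v", dest="version", action="store_true",
                        help="print the grabber title and capabilities")
    parser.add_argument("-p", "--pagenumber", dest="page",
                        type=positive_int, default=1,
                        help="search page number (defaults to 1)")
    parser.add_argument("-S", dest="search", metavar="TERM",
                        help="search term of one or more words")
    parser.add_argument("-T", dest="tree", action="store_true",
                        help="return the tree view")
    return parser


def version_line():
    """The text printed for -v, for example "Example|ST"."""
    return "%s|%s" % (GRABBER_TITLE, CAPABILITIES)


args = build_parser().parse_args(["-p", "2", "-S", "A Search Term"])
print(args.page, args.search)
print(version_line())
```

A grabber that supports only one view would leave out the other option and shorten CAPABILITIES to match.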

MythNetvision Search Scripts

Let's examine the following example of a search script:

Command Line:

/ -p 1 -S "Peep Show"

Returned XML:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
        <description>Share your videos with friends, family, and the world.</description>
            <title>Peep Show S02E02 P01</title>
            <pubDate>Sun, 06 Jan 2008 21:44:36 GMT</pubDate>
            <description>Peep Show season 2, episode 2, part 1/3</description>
                <media:thumbnail url=''/>
                <media:content url='' duration='' width='' height='' lang=''/>
            <title>Peep Show S02E01 P01</title>
            <pubDate>Sun, 06 Jan 2008 20:13:10 GMT</pubDate>
            <description>Peep Show Season 2, episode 1, part 1/3</description>
                <media:thumbnail url=''/>
                <media:content url='' duration='' width='' height='' lang=''/>

The above is a search command which returns two items. Let's break down the results so that you can see how it works.

Establishing The Namespace

 <?xml version="1.0" encoding="UTF-8"?>
 <rss version="2.0"

All MythNetvision returns (Search and Tree) start with the above. This declares the XML namespaces. A namespace is a set of custom RSS/XML tag definitions that can be used in your document. Presently the media, amp, and itunes tags are used to some extent, but others may be supported in the future, so for now the above should be used on all returns.
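
If you build the feed with Python's xml.etree.ElementTree instead of printing strings, the namespace declaration is written for you once the prefix is registered. The sketch below assumes the standard Media RSS namespace URI for the media prefix. That URI is not quoted from this page, so compare it with what a shipped grabber emits before relying on it.

```python
import xml.etree.ElementTree as ET

# Assumption: the standard Media RSS namespace URI for the "media" prefix.
MEDIA_NS = "http://search.yahoo.com/mrss/"
ET.register_namespace("media", MEDIA_NS)


def new_rss_root():
    """Create the <rss version="2.0"> root element of a return."""
    return ET.Element("rss", {"version": "2.0"})


rss = new_rss_root()
# Namespaced tags are written as "{uri}name" and serialize as media:name.
ET.SubElement(rss, "{%s}group" % MEDIA_NS)

xml_text = ET.tostring(rss, encoding="unicode")
print('<?xml version="1.0" encoding="UTF-8"?>')
print(xml_text)
```

Only namespaces that are actually used in the tree get declared this way, so a grabber that must emit the full declaration list may prefer to write the opening rss tag by hand.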

An RSS "Channel"

        <description>Share your videos with friends, family, and the world.</description>

All RSS returns in MythNetvision are segmented into "channels." Although it's not generally necessary to have more than one channel in a return, it is possible and acceptable to do so. A channel is a logical grouping of results with a title, description, and some information to allow control of paging in search scripts. Let's take a look at what each of these tags means.

Tag name Description
channel Tag identifies the start of a channel.
title The title of the source of the metadata which will be displayed by the MythNetvision plugin.
link URL of the home page for the source of the metadata.
description Descriptive text about the metadata source.
numresults The number of results returned from a search. Usually this is total items or total pages of items. This tag is only used in search mode.
returned This number reflects the maximum number of items per page or the number returned when a search does not find a full page worth of video items. This tag is only used in search mode.
startindex The start index of the next page or next video item. If this number is less than or equal to the number of results the MythNetvision plugin will not allow the user to request a next page. This tag is only used in search mode.
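
Here is a small sketch of a channel header built from the tags in the table. The function name, the site name, the URL, and the numbers are invented for the example; a tree view return would leave out the three search-only tags.

```python
import xml.etree.ElementTree as ET


def build_channel(title, link, description,
                  numresults, returned, startindex):
    """Build a search-mode <channel> element with its paging tags."""
    channel = ET.Element("channel")
    fields = (
        ("title", title),
        ("link", link),
        ("description", description),
        ("numresults", numresults),   # search mode only
        ("returned", returned),       # search mode only
        ("startindex", startindex),   # search mode only
    )
    for tag, value in fields:
        ET.SubElement(channel, tag).text = str(value)
    return channel


# Hypothetical first page: 42 matches, 20 returned, next page starts at 21.
channel = build_channel("Example Site", "http://www.example.com/",
                        "A made-up metadata source.", 42, 20, 21)
print(ET.tostring(channel, encoding="unicode"))
```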

A MythNetvision Video Item

            <title>Peep Show S02E02 P01</title>
            <pubDate>Sun, 06 Jan 2008 21:44:36 GMT</pubDate>
            <description>Peep Show season 2, episode 2, part 1/3</description>
                <media:thumbnail url=''/>
                <media:content url='' length='' duration='' width='' height='' lang=''/>

The above represents a single viewable video in MythNetvision. By manipulating the format of the information, you can mark the item as downloadable or web view only, and present different types of metadata.

Tag name Description
item Identifies the start of a video item.
title The title of the video.
author The author who submitted the video to the source web site. Often this is submitter's user id or username.
pubDate The date that the video was submitted to the site. The format of "Thu, 17 Dec 2009 13:51:34 GMT" must always be used. If you cannot retrieve the timezone, simply specify GMT. If you do not know the time then set it to all zeros "00:00:00".
description The text description of the video. A description should not contain any HTML or line-feeds.
link A URL to the web page containing the video or directly to the video player. If this link is not identical to the "media:content url" tag, that tells MythNetvision that the "media:content url" file can be downloaded. Unless the source site's TOS allows the video file to be downloaded, the "link" and "media:content url" URLs should be identical (Web View Only).
player The executable name to an external player application. An example might be something like:

Not yet fully implemented.

playerargs The arguments passed to the external binary in the <player> tags. An example might be something like:
<playerargs>-fs -zoom mms://</playerargs>

Not yet fully implemented.

download The executable name to an external downloader application. An example might be something like:

Not yet fully implemented.

downloadargs The arguments passed to the external binary in the <download> tags. An example might be something like:
<downloadargs>-outpath /home/someguy/ -outfile somefile.flv</downloadargs>

Not yet fully implemented.

media:group Indicates the start of a video's thumbnail and metadata. The item must eventually be closed with a "</media:group>"
media:thumbnail url A URL to the thumbnail image for the video file. The source site does not always have a thumbnail for a video. When there are multiple thumbnails choose the first one over 200 pixels wide as it will display well but keep download times reasonable.
media:content The main tag for the video link and video metadata.
url A URL to the video file/player/web page. This link should only differ from the URL "link" tag when the URL points to a downloadable video file.
length Unlike what it appears to be, length is actually the filesize in bytes. This is only used for downloadable media.
duration The duration of the video in seconds. Some sources use a format of "MM:SS". This value must be converted to seconds.
width The width in pixels of the video.
height The height in pixels of the video.
lang The two-character ISO 639-1 language code of the video (for example, "en").
rating A floating point number representing a numerical rating for the video. There is no consistency for this value between the source sites. The most common values are X out of 5 or 10.
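
A few of the rules in the table are easy to get wrong, so here is a hedged sketch of helpers for the pubDate format, the duration conversion, and the link comparison. The function names are invented for illustration and are not part of MythNetvision.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def format_pubdate(moment):
    """Format a datetime like "Thu, 17 Dec 2009 13:51:34 GMT"."""
    if moment.tzinfo is None:
        # Timezone unknown: simply specify GMT.
        moment = moment.replace(tzinfo=timezone.utc)
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def duration_to_seconds(text):
    """Convert "SS", "MM:SS", or "HH:MM:SS" into a number of seconds."""
    seconds = 0
    for part in text.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def is_downloadable(link, content_url):
    """An item is downloadable when the two URLs are not identical."""
    return link != content_url


print(format_pubdate(datetime(2009, 12, 17, 13, 51, 34)))
print(format_pubdate(datetime(2009, 12, 17)))  # time unknown: all zeros
print(duration_to_seconds("4:19"))
```

The "4:19" input gives 259 seconds, the same duration value that appears in the tree view example on this page.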

Clean This Mess Up, Mister!


For those unfamiliar with XML, RSS, or HTML: *all* tags that you open must be closed, or parsing the document will fail.

MythNetvision Tree Grabber Scripts

Let's examine the following example of a Site Tree building script:

Command Line:

/ -T

Returned XML:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
        <description>Share your videos with friends, family, and the world.</description>
            <directory name="Feeds" thumbnail="/usr/share/mythtv/mythnetvision/icons/youtube.png">
                <directory name="Highest Rated" thumbnail="/usr/share/mythtv/mythnetvision/icons/directories/topics/rated.png">
                        <title>Project for Awesome - My Public Access Channel!</title>
                        <pubDate>Thu, 17 Dec 2009 13:51:34 GMT</pubDate>
                        <description>Please support all the Project for Awesome videos today with ratings/comments! Thank you! Thank you to Hank and John Green, Dan Brown and everyone involved!!!  I chose the public access station where I began What the Buck! They are fundraising to help with their new building project.  You can help buy simply signing up for this site and then when you shop, they get donations from that! Yay! Thanks if you can sign up! (its Free!) LOL xoxo Michael   Please sign up:</description>
                            <media:thumbnail url=''/>
                            <media:content url='' duration='259' width='' height='' lang=''/>

Wow, this is remarkably familiar, isn't it? The format of a site tree is very similar to the search view, and extremely similar to any old RSS feed. Let's only look at the items which differ from a search script.

Directories/Tree Hierarchy

            <directory name="Feeds" thumbnail="/usr/share/mythtv/mythnetvision/icons/youtube.png">
                <directory name="Highest Rated" thumbnail="/usr/share/mythtv/mythnetvision/icons/directories/topics/rated.png">

Shocking as it might seem, the only new element added in a tree grabber is the directory tag, which allows you to create a filesystem-like structure for your tree. The above example contains two nested directories, meaning the represented path would be:

YouTube/Feeds/Highest Rated/

Directories include a name attribute, which is the directory name, and a thumbnail attribute, which is an optional URL or absolute path to a folder icon.

Tag name Description
directory Identifies the start of a directory. The directory must eventually be closed with a "</directory>"
name Descriptive label for the directory. This description will be displayed by the MythNetvision plugin.
thumbnail An image URL or absolute file path for the image icon that will be displayed by the MythNetvision plugin.
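
Nested directories map naturally onto nested elements. Below is a minimal sketch using xml.etree.ElementTree. The helper names are invented, and the pruning step is one possible way to honour the rule that a directory without any video items should not be returned.

```python
import xml.etree.ElementTree as ET

ICONS = "/usr/share/mythtv/mythnetvision/icons"


def add_directory(parent, name, thumbnail=""):
    """Append a <directory> element with name and thumbnail attributes."""
    return ET.SubElement(parent, "directory",
                         {"name": name, "thumbnail": thumbnail})


def prune_empty(parent):
    """Remove directories that hold no items and no non-empty directories."""
    for child in list(parent.findall("directory")):
        prune_empty(child)
        if child.find("item") is None and child.find("directory") is None:
            parent.remove(child)


channel = ET.Element("channel")
feeds = add_directory(channel, "Feeds", ICONS + "/youtube.png")
rated = add_directory(feeds, "Highest Rated",
                      ICONS + "/directories/topics/rated.png")
ET.SubElement(rated, "item")            # stands in for a real video item
add_directory(feeds, "Nothing Here")    # no items, so it gets pruned

prune_empty(channel)
print(ET.tostring(channel, encoding="unicode"))
```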

Metadata standards

Consider the following "rules of thumb" when writing your grabber:

  • When a search returns no results, return no RSS data and exit with a return code of zero.
  • Do not include any video items or directories that do not have at least one video link.
  • Always include every tag even if the source does not have any data for that tag. Leave the content empty.
  • Whenever possible, use a video link that auto-starts the video and displays it in full screen.
    • Sometimes the method to auto start or display in full screen is not included in the API documentation. The video's HTML Web page source can sometimes provide this information.
  • Replace any ampersand characters "&" with "&amp;".
  • Remove line feed characters and HTML tags from any text fields. This is usually an issue with the description and title text of a video.
  • Do not use the word "video" in any of your Tree view directory descriptions or grabber names as it is redundant (MythNetvision is a video plugin, after all).
  • Additional tags may be included by a grabber, but they will be ignored by the MythNetvision plugin.
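
The text clean-up rules above could be sketched like this for a grabber that writes its XML as plain strings. The function name is invented, and the tag-stripping regular expression is deliberately simple rather than a full HTML parser.

```python
import html
import re
from xml.sax.saxutils import escape

TAG_PATTERN = re.compile(r"<[^>]+>")


def clean_text(raw):
    """Strip HTML tags and line feeds, then escape text for XML output."""
    text = TAG_PATTERN.sub(" ", raw)   # drop anything that looks like a tag
    text = html.unescape(text)         # avoid double-escaping "&amp;" later
    text = " ".join(text.split())      # collapse line feeds and extra spaces
    return escape(text)                # "&" becomes "&amp;", "<" "&lt;", etc.


print(clean_text("Tom &amp; Jerry<br/>\nPart 1 & 2"))
```

If you serialize with a library such as xml.etree.ElementTree, it escapes text for you, so skip the final escape step there or the ampersands will be encoded twice.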

Tree view thumbnail icon images

  • Each directory tag can include a thumbnail image.
  • The thumbnail may be an absolute path or a URL.
  • The MythNetvision core icon set is found in "/usr{/local}/share/mythtv/mythnetvision/icons"
  • Included with MythNetvision are a set of generic directory icon images that can be used. See "/usr{/local}/share/mythtv/mythnetvision/icons/directories".

Error handling

  • Standard out (stdout) is reserved exclusively for valid RSS data read by the MythNetvision plugin.
  • Standard error (stderr) is reserved exclusively for error or grabber processing messages.
  • Logging is optional.
  • Return code is zero for any processing that completed without a script abort. This includes a search or treeview that did not return any results.
  • Return code is one if an error forced the premature termination of the grabber.
  • If a grabber script skips bad data during processing, an error about the bad data should be sent to stderr, but the return code should still be zero to indicate that the grabber completed its processing successfully.
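
One way to keep these rules in a single place is a small wrapper around the code that produces the feed. This is a sketch only: run_grabber and the callables passed to it are hypothetical names, not part of MythNetvision.

```python
import io
import sys


def run_grabber(produce_rss, stdout=sys.stdout, stderr=sys.stderr):
    """Run a feed-producing callable and return the grabber's exit code."""
    try:
        rss = produce_rss()
    except Exception as error:
        # An abort: report on stderr only and signal failure.
        stderr.write("grabber aborted: %s\n" % error)
        return 1
    if rss:
        stdout.write(rss)  # stdout carries nothing but RSS data
    return 0               # also covers "no results", which is not an error


def failing_search():
    raise RuntimeError("site unreachable")


captured_out, captured_err = io.StringIO(), io.StringIO()
print(run_grabber(lambda: "", captured_out, captured_err))
print(run_grabber(failing_search, captured_out, captured_err))
```

In a real script the last line would typically be sys.exit(run_grabber(...)), and messages about skipped bad data would go to stderr from inside the callable without changing the zero return code.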

Python grabber development

Although a grabber can be developed in any computer language that can support the grabber standards, a framework for grabber development is available for Python. If you use the framework as the basis for your grabber, you will be able to focus on site-specific processing without having to rewrite generic grabber functions.

See this wiki link for details: MythNetvision Python grabber development