MythNetvision Grabber Script Format

From MythTV Official Wiki
Revision as of 22:38, 7 January 2010 by Iamlindoro (talk | contribs)


Grabbers

Grabber | API | Search | Tree view | Channel icon
bliptv.py | [1] | | | bliptv.png
dailymotion.py | [2] | | | dailymotion.jpg
joost.py | [3] | | | joost.png
mtv.py | [4] | | | mtv.png
tmdb_nv.py | [5] | | | tmdb_nv.png
vimeo.py | [6] | | | vimeo.jpg
youtube.py | [7] | | | youtube.png
  • If you develop your own grabber, please add its details to this table.
  • Grabbers posted here must not violate the source site's TOS (Terms of Service).
  • Grabbers do not have to be based on an API; a screen scraper is acceptable as long as it does not violate the TOS. Just be prepared for more frequent support requests, as site changes will often break a screen-scraping grabber.


Grabber Script Standards

  • Every grabber must support at least one of the search and tree view options; supporting both is optional
  • Grabbers expect that the environment is set to UTF-8
  • A search grabber by default should return a maximum of 20 video items per page. This usually keeps the response time to a reasonable duration.
  • A search grabber must support paging, with the page number given through the "-p" or "--pagenumber" command line option. The grabber must translate the page number into whatever the source API requires. Sometimes that is a page number and sometimes a video index (e.g. the 21st video of the search results).
  • A grabber's file name must have a matching image file name for an icon to be displayed by the Netvision plug-in
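
The paging rule above can be sketched as a small helper for sources that expect a video index rather than a page number. This is only an illustration: the function name and the assumption that the source counts videos from 1 are hypothetical, and a real source API may count from 0 or use a different page size.

```python
DEFAULT_PAGE_SIZE = 20  # the default maximum number of items per page


def page_to_start_index(page_number, page_size=DEFAULT_PAGE_SIZE):
    """Translate a 1-based page number into the 1-based index of its first video.

    Assumption: the source API counts videos from 1.
    """
    if page_number < 1:
        raise ValueError("page numbers start at 1")
    return (page_number - 1) * page_size + 1


print(page_to_start_index(1))  # 1
print(page_to_start_index(2))  # 21
```

With the default page size, page 2 begins at the 21st video of the search results, matching the example in the list above.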

Command line options

"-v", "--version"

  • This option is used by Netvision to identify the grabber's name and its supported options (search and/or tree view)
  • The format is grabber title then a "|" character followed by the grabber's supported options search and/or tree view
  • Example: "YouTube|ST" says that the "YouTube" grabber supports both Netvision search and tree view options
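
A minimal sketch of building the version response follows. The function name is hypothetical, and it assumes that a grabber supporting only one option reports a single letter ("S" or "T"); the text above only shows the combined "ST" form.

```python
def version_line(title, search=False, treeview=False):
    """Build the "-v" response: grabber title, "|", then "S" and/or "T"."""
    if not (search or treeview):
        raise ValueError("a grabber must support search, tree view, or both")
    # Assumption: single-option grabbers report just "S" or just "T".
    flags = ("S" if search else "") + ("T" if treeview else "")
    return "%s|%s" % (title, flags)


print(version_line("YouTube", search=True, treeview=True))  # YouTube|ST
```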

"-h", "--help"

  • Display all options and the command line format
  • Include script version number and author's name

"-u", "--usage"

  • Command line examples on how to invoke the grabber and an abridged example of the RSS results

"-d", "--debug"

  • Debug is optional but recommended
  • Debug when supported should at least display the URL used to access a source's video data
  • Debug when supported can also display the raw data returned by the source. This is useful when a source uses a one-time or time-specific URL that cannot be used directly in a browser (e.g. Vimeo)
  • The debug data is displayed to standard out (stdout) or in a log

"-p", "--pagenumber"

  • Always includes an integer and is only used with the search option (-S). When not provided the page number defaults to one.

"-l", "--language"

  • Optional parameter, but when used it must include a two-character ISO-compliant language code. At this time support for language codes in searches and tree views is limited. If the source site does not support language filtering based on a language code then this option is ignored.

"-S", "--search"

  • The search option requires a search term made of one or more words.

"-T", "--treeview"

  • Tree view does not require or support any additional options
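
The options described in this section could be parsed along the following lines. This is a sketch using Python's standard argparse module, not the parser used by any existing grabber; the helper names are hypothetical, and treating "-S" and "-T" as mutually exclusive is an assumption drawn from the note that tree view takes no additional options.

```python
import argparse


def language_code(value):
    """Accept only a two-letter code such as "en"; reject anything else."""
    if len(value) != 2 or not value.isalpha():
        raise argparse.ArgumentTypeError("expected a two-character language code")
    return value.lower()


def build_parser():
    # argparse adds "-h"/"--help" automatically.
    parser = argparse.ArgumentParser(description="Hypothetical Netvision grabber")
    parser.add_argument("-v", "--version", action="store_true",
                        help="print the grabber title and supported options")
    parser.add_argument("-u", "--usage", action="store_true",
                        help="print command line examples")
    parser.add_argument("-d", "--debug", action="store_true",
                        help="show the URLs used to access the source")
    parser.add_argument("-p", "--pagenumber", type=int, default=1,
                        help="search page to return (defaults to 1)")
    parser.add_argument("-l", "--language", type=language_code,
                        help="two-character ISO language code")
    # Assumption: search and tree view are never requested together.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("-S", "--search", nargs="+", metavar="WORD",
                      help="search term made of one or more words")
    mode.add_argument("-T", "--treeview", action="store_true",
                      help="return the tree view")
    return parser


args = build_parser().parse_args(["-S", "funny", "cats", "-p", "2"])
print(" ".join(args.search), args.pagenumber)  # funny cats 2
```

Collecting the search words with nargs="+" lets a multi-word term be passed without quoting; the grabber can join the words before building the source request.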


Netvision RSS standard format

XML and RSS tag format

Both Tree view and Search results start with:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:cnettv="http://cnettv.com/mrss/"
xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
xmlns:media="http://search.yahoo.com/mrss/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:amp="http://www.adobe.com/amp/1.0"
xmlns:dc="http://purl.org/dc/elements/1.1/">

The last line in the RSS results must be:

</rss>

Channel tag format

  • Searches are organised into channels
  • Tree views are organised into channels and directories
  • A channel is usually the description of the source of the videos
   <channel>
       <title>YouTube</title>
       <link>http://www.youtube.com/</link>
       <description>Share your videos with friends, family, and the world.</description>
       <numresults>2092</numresults>
       <returned>20</returned>
       <startindex>40</startindex>
Tag name | Description
channel | Identifies the start of a channel. The channel must eventually be closed with a "</channel>"
title | The title of the source of the meta data, which will be displayed by the Netvision plug-in
link | URL of the home page for the source of the meta data
description | Descriptive text relevant for the source of the meta data
numresults | The number of results returned from a search. Usually this is total items or total pages of items. This value is not used in a tree view.
returned | The maximum number of items per page, or the number actually returned when a search does not find a full page worth of video items. This value is not used in a tree view.
startindex | The start index of the next page or next video item. If this number is greater than the number of results the Netvision plug-in will not allow the user to request a next page. This value is not used in a tree view.
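
As an illustration of the channel tags, the sketch below builds the channel header of a search result with Python's standard xml.etree.ElementTree module. The function name is hypothetical; the values are taken from the example above. ElementTree escapes special characters such as "&" in text when serialising.

```python
import xml.etree.ElementTree as ET


def build_channel(title, link, description, numresults, returned, startindex):
    """Create a <channel> element holding the search header tags."""
    channel = ET.Element("channel")
    for tag, value in (("title", title),
                       ("link", link),
                       ("description", description),
                       ("numresults", numresults),
                       ("returned", returned),
                       ("startindex", startindex)):
        ET.SubElement(channel, tag).text = str(value)
    return channel


channel = build_channel(
    "YouTube",
    "http://www.youtube.com/",
    "Share your videos with friends, family, and the world.",
    2092, 20, 40)
print(ET.tostring(channel, encoding="unicode"))
```

Video items would then be appended to the returned element before the whole document is written to stdout.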


Directory tag format

  • Directories include a label and icon image used to organise the tree view.
  • Directories can be nested
  • This is how the grabber tells Netvision how to display and organise the video items
<directory name="Feeds" thumbnail="/usr/local/share/mythtv/mythnetvision/icons/youtube.png">
Tag name | Description
directory | Identifies the start of a directory. The directory must eventually be closed with a "</directory>"
name | Descriptive label for the directory. This description will be displayed by the Netvision plug-in
thumbnail | An image URL or absolute file path for the image icon that will be displayed by the Netvision plug-in
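
Nested directories can be sketched with xml.etree.ElementTree as shown below. The helper name and the "Most Viewed" label are hypothetical; the icon path is the one from the example above.

```python
import xml.etree.ElementTree as ET

ICON = "/usr/local/share/mythtv/mythnetvision/icons/youtube.png"


def add_directory(parent, name, thumbnail):
    """Append a <directory> with its label and icon, and return it."""
    return ET.SubElement(parent, "directory",
                         {"name": name, "thumbnail": thumbnail})


channel = ET.Element("channel")
feeds = add_directory(channel, "Feeds", ICON)
# A directory nested inside "Feeds" (hypothetical label).
most_viewed = add_directory(feeds, "Most Viewed", ICON)
print(ET.tostring(channel, encoding="unicode"))
```

Because each call returns the new element, a grabber can keep nesting directories to any depth and attach video items to the innermost one.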


Item tag format

Each item contains meta data exclusive to a single video

   <item>
       <title>Project for Awesome - My Public Access Channel!</title>
       <author>peron75</author>
       <pubDate>Thu, 17 Dec 2009 13:51:34 GMT</pubDate>
       <description>Please support all the Project for Awesome videos today with ratings/comments! Thank you! Thank you to Hank and John Green, Dan Brown and everyone involved!!!  I chose the public access station where I began What the Buck! They are fundraising to help with their new building project.  You can help buy simply signing up for this site and then when you shop, they get donations from that! Yay! Thanks if you can sign up! (its Free!) LOL xoxo Michael   Please sign up: http://igive.com/wpaa</description>
       <link>http://www.youtube.com/v/tdBHzkoXB_8?f=standard&amp;app=youtube_gdata&amp;autoplay=1</link>
       <media:group>
           <media:thumbnail url='http://i.ytimg.com/vi/tdBHzkoXB_8/hqdefault.jpg'/>
           <media:content url='http://www.youtube.com/v/tdBHzkoXB_8?f=standard&amp;app=youtube_gdata&amp;autoplay=1' duration='259' width='640' height='480' lang='en'/>
       </media:group>
       <rating>4.972514</rating>
   </item>
Tag name | Description
item | Identifies the start of a video item. The video item must eventually be closed with a "</item>"
title | Descriptive label for the video
author | The author who submitted the video to the source web site. Often this is the submitter's user id.
pubDate | The date that the video was submitted to the site. The format of "Thu, 17 Dec 2009 13:51:34 GMT" must always be used. If you do not know the GMT time then set it to all zeros "00:00:00".
description | The text description of the video. A description should not contain any HTML or line feeds.
link | A URL to the web page containing the video or directly to the video player. If this link is not identical to the "media:content url" tag, that indicates to Netvision that the "media:content url" file can be downloaded. Unless the source site's TOS allows the video file to be downloaded, the "link" and "media:content url" URLs should be identical.
media:group | Indicates the start of a video's thumbnail and meta data. The group must eventually be closed with a "</media:group>"
media:thumbnail url | A URL to the thumbnail image for the video file. The source site does not always have a thumbnail for a video. When there are multiple thumbnails, choose the first one over 200 pixels wide, as it will display well while keeping download times reasonable.
media:content | The main tag for the video link and video meta data
url | A URL to the video file/player/web page. This link should only differ from the "link" tag URL when the URL points to a downloadable video file.
duration | The duration of the video in seconds. Some sources use a format of "MM:SS". This value must be converted to seconds.
width | The width in pixels of the video
height | The height in pixels of the video
lang | The two-character ISO language code of the video
rating | A floating point number representing a numerical rating for the video. There is no consistency for this value between the source sites. The most common values are X out of 5 or 10.
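
Three of the item rules above (duration in seconds, the fixed pubDate format, and the thumbnail choice) lend themselves to small helpers. The following is a sketch with hypothetical function names. Handling an "HH:MM:SS" duration and falling back to the first thumbnail when none is wider than 200 pixels are assumptions that go beyond the text.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def duration_to_seconds(value):
    """Convert "259", "4:19" or (assumption) "1:04:19" to whole seconds."""
    seconds = 0
    for part in str(value).strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def format_pubdate(moment):
    """Format a timezone-aware datetime like "Thu, 17 Dec 2009 13:51:34 GMT"."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def pick_thumbnail(thumbnails, min_width=200):
    """Return the URL of the first thumbnail wider than min_width pixels.

    thumbnails is a list of (url, width) tuples in source order.
    Assumption: fall back to the first thumbnail, or "" if there are none.
    """
    for url, width in thumbnails:
        if width > min_width:
            return url
    return thumbnails[0][0] if thumbnails else ""


print(duration_to_seconds("4:19"))  # 259
print(format_pubdate(datetime(2009, 12, 17, 13, 51, 34, tzinfo=timezone.utc)))
```

Note that format_pubdate expects a timezone-aware datetime; a naive one would be interpreted as local time by astimezone.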


Meta data standards

  • When a search returns no results, no RSS data is returned; the grabber simply exits with a return code of zero
  • Do not include any video items or directories that do not have at least one video link
  • Always include every tag even if the source does not have any data for that tag. Leave the content empty.
width=""
<description></description>
  • Whenever possible, use a video link that auto-starts the video and displays it in full screen
    • The method to auto-start or display in full screen is sometimes missing from the API documentation. The HTML source of the video's Web page can sometimes provide this information.
  • Replace any ampersand characters "&" with "&amp;"
  • Remove line feed characters and HTML tags from any text fields. This is usually an issue with the description and title text of a video.
  • Do not use the word "video" in any of your Tree view directory descriptions as it is redundant
  • Additional tags may be included by a grabber but they will be ignored by the Netvision plug-in
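
The text clean-up rules above (no HTML tags, no line feeds, escaped ampersands) might be combined into one helper like the sketch below. The function name is hypothetical, and the regular expression is a simple approximation rather than a full HTML parser. It assumes the input has not already been escaped; text that already contains "&amp;" would be escaped a second time.

```python
import re
from xml.sax.saxutils import escape

_TAG_RE = re.compile(r"<[^>]+>")


def clean_text(raw):
    """Strip HTML tags and line feeds, then escape "&", "<" and ">"."""
    without_tags = _TAG_RE.sub(" ", raw)
    # split()/join collapses line feeds and runs of whitespace to one space.
    single_line = " ".join(without_tags.split())
    return escape(single_line)


print(clean_text("Tom &\n<b>Jerry</b>"))  # Tom &amp; Jerry
```

Grabbers that build their output with an XML library which escapes text itself should skip the final escape step to avoid double escaping.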

Tree view thumbnail icon images

  • Each directory tag should have a thumbnail image
  • The default is a channel URL image or the grabber specific icon image found in ".../mythtv/mythnetvision/icons"
  • Netvision includes a set of generic directory icon images that can be used. See ".../mythtv/mythnetvision/icons/directories". You can contribute additional directory icon images, but please be consistent with the provided icon image theme.

Error handling

  • Standard out (stdout) is reserved exclusively for valid RSS data read by the MythTV Netvision plug-in
  • Standard error (stderr) is reserved exclusively for error or grabber processing messages
  • Logging is optional
  • Return code is zero for any processing that completed without a script abort. This includes a search or treeview that did not return any results.
  • Return code of one if an error forced the premature termination of the grabber
  • If a grabber script skips bad data during processing, an error about the bad data should be sent to stderr, but a return code of zero is still returned to indicate that the grabber completed its processing successfully.
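
The stdout/stderr and return code rules above can be sketched as a thin wrapper around the grabber's processing. The names run_grabber and produce_rss are hypothetical; the wrapper assumes the processing function returns the complete RSS text, or an empty string when nothing was found.

```python
import io
import sys


def run_grabber(produce_rss, out=sys.stdout, err=sys.stderr):
    """Call produce_rss() and map the outcome to a return code."""
    try:
        rss = produce_rss()
    except Exception as exc:
        # An abort: the message goes to stderr, nothing goes to stdout.
        err.write("grabber aborted: %s\n" % exc)
        return 1
    if rss:
        out.write(rss)  # stdout carries only valid RSS data
    return 0  # also covers a search that found nothing


# Demonstration with in-memory streams instead of the real stdout/stderr.
demo_out, demo_err = io.StringIO(), io.StringIO()
print(run_grabber(lambda: "", demo_out, demo_err))  # 0
```

A script would typically pass the result to sys.exit() so that Netvision sees the return code.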

Python grabber development

Although a grabber can be developed in any computer language that can support the grabber standards, a framework for grabber development is available for Python. If your Python grabber adopts the framework, you will be able to focus on source-site-specific processing without having to recode generic grabber functions. For details, see the wiki page "Netvision python grabber development".