[mythtv-users] Re: Bleb.org down??

Simon Kenyon simon at koala.ie
Tue Apr 20 08:30:13 EDT 2004


On Tuesday 20 April 2004 13:01, David wrote:
> I think the first problem here is copyright issues.

so what does bleb do about this?

>
> How many users do you think we have in the UK? 10, 20? 1000?
> If there are <100 (my guess) then it's not likely to draw attention and
> it's worth the minimal risk - we're just avoiding hammering the RT site
> and acting as a collective proxy.

the number will grow.
>
> If we make this a publicised / 'official' xmltv feed then we're asking
> for trouble :)
>
> 2nd - your design sounds complex.

yes. i'm a technical architect by trade. complex is what i do :-)

>
> I think the simple thing is to set up 2 systems (you don't need a slew of
> them, simple redundancy is fine) running the grabbers and have a simple
> http pull.
>
> bittorrent is great if you're a BB user - not so cool if you want to
> pull a file ;)
>
> Databases are great - but... did you know that some people just use text
> files and directories ;)
> I think modern filesystems tend not to barf at lots of files - old ones
> limit you to a mere 65000 or so ;)
> (even that should see us through the next 20 yrs - in just 1 dir!)
>
> Why not write a nice easy xml file? (gonna have to handle XML anyway so
> no overhead here)
>
> run grab_tv_uk_rt in daily mode.
> Use an XML parser to split out the data on a 1 file per channel/day basis.
> Then provide a rebuild XML parser (simple concatenation?) and let people
> issue 1 wget per channel/day
>
> Almost simple ;)
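
for what it's worth, the split/rebuild idea above could look roughly like the sketch below. it is only an illustration, not anything bleb or the grabbers actually do: it assumes the grabber output is a standard XMLTV document (a <tv> root holding <channel id="..."> and <programme start="YYYYMMDD..." channel="..."> elements), and the function names are made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_by_channel_day(xmltv_text):
    """Return {(channel_id, 'YYYYMMDD'): xml_string}, one small document per key."""
    root = ET.fromstring(xmltv_text)
    channels = {c.get("id"): c for c in root.findall("channel")}

    # Group programmes by (channel, day); XMLTV start times begin with YYYYMMDD.
    groups = defaultdict(list)
    for prog in root.findall("programme"):
        day = prog.get("start")[:8]
        groups[(prog.get("channel"), day)].append(prog)

    pieces = {}
    for (channel_id, day), progs in groups.items():
        tv = ET.Element("tv")
        if channel_id in channels:
            tv.append(channels[channel_id])  # keep the channel definition with its data
        tv.extend(progs)
        pieces[(channel_id, day)] = ET.tostring(tv, encoding="unicode")
    return pieces


def rebuild(pieces):
    """Merge per-channel/day documents back into one <tv> document."""
    merged = ET.Element("tv")
    seen_channels = set()
    programmes = []
    for xml_string in pieces:
        part = ET.fromstring(xml_string)
        for chan in part.findall("channel"):
            if chan.get("id") not in seen_channels:
                seen_channels.add(chan.get("id"))
                merged.append(chan)
        programmes.extend(part.findall("programme"))
    merged.extend(programmes)  # channel definitions first, then all programmes
    return ET.tostring(merged, encoding="unicode")
```

each value returned by split_by_channel_day would be written to its own file (say one directory per channel, one file per day) and served over plain http; a client fetches the channel/day files it wants and passes their contents to rebuild. note that the rebuild is slightly more than raw concatenation, because each piece carries its own <tv> root element.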

fine. i like what you're saying. can i ask questions?

does each server run the grabbers in their entirety, so they are independent?
how do we spread the load?
how does the list of servers get distributed/updated?
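
one possible answer to the load question, sketched below purely as an assumption: if every server holds the same files, each client can shuffle the server list and try the servers in turn until one answers. that spreads requests roughly evenly and gives failover for free. the function name, the mirror urls and the file layout are all hypothetical, and how the list itself gets distributed is still open.

```python
import random
import urllib.request


def fetch_from_mirrors(mirrors, path, fetch=None, rng=random):
    """Try mirrors in random order; return the first successful response body."""
    if fetch is None:
        # Default fetcher: a plain HTTP GET with a timeout.
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as resp:
                return resp.read()

    order = list(mirrors)
    rng.shuffle(order)  # a random order spreads clients across the servers

    errors = []
    for base in order:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError are OSError subclasses
            errors.append((url, exc))  # remember the failure, try the next mirror
    raise RuntimeError("all mirrors failed: %r" % errors)
```

a client would call something like fetch_from_mirrors(server_list, "bbc1/20040420.xml"), where both the list and the path scheme are placeholders. the fetch parameter exists only so the logic can be exercised without a network.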

what has really surprised me is that there is no obvious/off the shelf system 
for doing this. that is, mediating between a screen scraper and a set of 
clients in a distributed way. seems like it would be a common enough problem.

--
simon
>
> David
>
> Rob Willett wrote:
> >Simon,
> >
> >Most of the software needed is readily available.
> >
> >1. My understanding is that bleb.org doesn't simply use one of the
> > existing trawl programs. I *thought* I had read somewhere that Andrew
> > stated this wasn't the case. I could be wrong though...
> >
> >2. If the XML file is generated automatically each day then all we are
> > really doing is spreading the download across many servers. The
> > synchronisation of these servers is easy using something like wget. If the
> > XML file is generated on each request for the XML file then we have to
> > sync the databases. A little difficult but not that bad: simply dump
> > the data into a file (e.g. with mysqldump), wget it and then reinsert it
> > into the mirror (assuming we're running MySQL).
> >
> >3. My first thought is that something like bittorrent
> >http://bitconjurer.org/BitTorrent/ might be the quickest and easiest way
> > to make this available. Admittedly bittorrent is somewhat overkill for a
> > small file but it's the thought that counts. <grin>. This addresses most
> > of your last few points.
> >
> >If we use bittorrent then the problem becomes a little easier. I've never
> > set bittorrent up but will have a look at it later this week. (Just after
> > I've solved world hunger...)
> >
> >Rob.
> >
> >Quoting Simon Kenyon <simon at koala.ie>:
> >>On Tuesday 20 April 2004 08:01, Rob Willett wrote:
> >>>It may be that bleb.org actually constructs the XML files out of a DB to be
> >>>returned each time. I thought that this would be inefficient since we could
> >>>generate the entire XML each day, but I don't know how it's set up either.
> >>>
> >>>Who needs to approach Andrew about this?
> >>
> >>a while ago i asked andrew about the software he runs
> >>i was concerned about the single point of failure issue
> >>i also wanted to spread the load and increase the range of channels
> >>supported
> >>[i have ntl cable and sky digital]
> >>
> >>he ignored the question (for whatever reason)
> >>
> >>i would like to build a network of servers to provide this facility
> >>that way it would be fault tolerant
> >>
> >>required:
> >>
> >>a set of grabbers from the various tv stations (tvlive, bbc, etc)
> >>a database to store them
> >>a web system for providing the listings in xmltv format
> >>a mechanism for providing the list of servers
> >>a mechanism for keeping the server list current
> >>a mechanism for sharing the load
> >>
> >>i am willing and capable of helping with the design, code and hosting of such
> >>a system. i need partners to help in the design.
> >>
> >>i would suggest the following:
> >>
> >>a unix like os (i use linux but any should be acceptable - bsd, solaris, aix,
> >>etc)
> >>mysql at least, but possibly postgres for the database
> >>perl or python for the grabbers
> >>php or mod_perl for the servers
> >>
> >>? for the server sync - there must be software in this space already!
> >>
> >>--
> >>simon
> >>_______________________________________________
> >>mythtv-users mailing list
> >>mythtv-users at mythtv.org
> >>http://mythtv.org/cgi-bin/mailman/listinfo/mythtv-users
> >

