[mythtv-users] Radio Times XMLTV failing

stan at stanandliz.net
Tue Oct 3 10:42:28 UTC 2006


Not sure if this is related, but as of last night, mythfilldatabase is
failing miserably. Perhaps RadioTimes has changed something? Is anyone
else getting this problem?

Output below:

channel ukhistory.tv (801) not seen on site at /usr/bin/tv_grab_uk_rt line 216.
channel plus-1.thehistorychannel.co.uk (183) not seen on site at /usr/bin/tv_grab_uk_rt line 216.
channel east.bbc2.bbc.co.uk (106) not seen on site at /usr/bin/tv_grab_uk_rt line 216.
channel europe.cnbc.com (125) not seen on site at /usr/bin/tv_grab_uk_rt line 216.
channel plus-1.discoveryeurope.com (152) not seen on site at /usr/bin/tv_grab_uk_rt line 216.
channel scuzz.tv (1143) not seen on site at /usr/bin/tv_grab_uk_rt line 216.
channel theamp.tv (1144) not seen on site at /usr/bin/tv_grab_uk_rt line 216.
channel magictv.co.uk (588) not seen on site at /usr/bin/tv_grab_uk_rt line 216.

Stephen

> On Monday 02 October 2006 23:20, malcolm torrent wrote:
>> I'd like to echo Simon's thanks to Neil for the fix.
>> I tried to diagnose this myself (unsuccessfully) so if possible I'd be
>> interested in a short explanation as to how the problem was
>> approached, resolved and why this fix works.
>> Mal.
>
> OK. The problem is corruption in the datafile 1961.dat, which corresponds
> to the schedules for ITV4 (running mythfilldatabase from the command line
> shows that the Unicode wide character \u0000 is not acceptable within an
> XML document). So I wget'ed the offending URL, looked at the file with a
> binary editor (bvi), and searched for the sequence of null characters.
>
> First thing I thought was to stop the script dying (comment out the
> "croak" instruction in the XMLTV code), but then it just died with an
> "unexpected end-of-file" error. So I had to replace the offending text
> with something else, so I stuck in that line in tv_grab_uk_rt which
> substitutes \u0000 with the text ".." (i.e., something harmless). Now, all
> of that said, there may very well be a legitimate use of a sequence of two
> nulls in Unicode (e.g., for 3 or 4 byte wide characters), so this kludge
> can't stay in: it replaces the nulls without regard for their context in
> the file.
>
> In the end, I suspect it's just a bit of file corruption from Radio Times.
> It's not happening anywhere else in the data feed, and it'll disappear
> from the schedules on Saturday, and we can say goodbye to ugly kludges.
>
> A longer term fix would be for XMLTV to replace offending Unicode
> characters with harmless ones, just to be a bit more robust when dealing
> with partially corrupted data. I may have a look at this over the weekend.
>
> Cheers,
>
> Neil
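
For anyone wanting to repeat the diagnosis Neil describes, here is a rough
Python sketch of the two steps: locating the null bytes in a downloaded
data file, and replacing characters that XML 1.0 does not allow. This is
not the change made to tv_grab_uk_rt (which is Perl); the function names,
the "." replacement and the UTF-8 decoding are assumptions made for
illustration only. The scrub runs on decoded text rather than raw bytes,
which sidesteps the worry about context: zero bytes occur inside ordinary
characters in UTF-16 and UTF-32, but in decoded text a U+0000 is always a
real null character.

```python
import re

# Characters outside the set XML 1.0 permits:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_ILLEGAL = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_nul_offsets(data: bytes) -> list[int]:
    """Return the byte offset of every NUL byte in the raw data."""
    return [i for i, b in enumerate(data) if b == 0]


def scrub_for_xml(text: str, replacement: str = ".") -> str:
    """Replace each character XML 1.0 forbids (including U+0000).

    The "." default is a hypothetical stand-in for "something harmless".
    """
    return _XML_ILLEGAL.sub(replacement, text)


if __name__ == "__main__":
    import sys

    # Usage: python scrub.py 1961.dat
    # (any data file previously fetched with wget)
    with open(sys.argv[1], "rb") as fh:
        raw = fh.read()
    print("NUL bytes at offsets:", find_nul_offsets(raw))
    # Assumption: the feed is UTF-8; undecodable bytes become U+FFFD.
    cleaned = scrub_for_xml(raw.decode("utf-8", errors="replace"))
    sys.stdout.write(cleaned)
```

Tabs, newlines and carriage returns are left alone, since XML allows them;
only the other control characters and similar code points are replaced.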



