[mythtv-users] error when fixing database forced by mythtv upgrade

Michael T. Dean mtdean at thirdcontact.com
Sun Sep 27 19:48:06 UTC 2009


On 09/27/2009 01:59 PM, Petr Stehlik wrote:
> Michael T. Dean píše v Ne 27. 09. 2009 v 08:51 -0400:
>   
>>> BTW, the SELECT * FROM mythconverg.people where name like "Samuel Iva%"
>>> looks as follows:
>>>
>>> "person","name"
>>> 2871,"Samuel Ivaška (Slovenská republika 1983)"
>>> 4094,"Samuel Ivaška (Slovenská republika 1988)"
>>> 644,"Samuel Ivaška Hrají: Vladimír Hajdu"
>>> 3686,"Samuel Ivaška. Hrají: Ferdinand Libant"
>>> 37451,"Samuel Ivaška. Hrají: Ján Havrila. (Slovenská republika
>>> 1988)"
>>>
>>> I cannot see any duplicate entry there.
>>>       
>> The duplication is caused by truncation during conversion due to invalid 
>> encoding of some of your entries.
>>     
>
> See the following output of the same SELECT ran on the original
> 0.21-fixes - no duplicates:
>
> +--------+-------------------------------------------------------------------+
> | person | name                                                              |
> +--------+-------------------------------------------------------------------+
> |   2871 | Samuel Ivaška (Slovenská republika 1983)                        | 
> |   4094 | Samuel Ivaška (Slovenská republika 1988)                        | 
> |    644 | Samuel Ivaška Hrají: Vladimír Hajdu                            | 
> |   3686 | Samuel Ivaška. Hrají: Ferdinand Libant                          | 
> |  37451 | Samuel Ivaška. Hrají: Ján Havrila. (Slovenská republika 1988) | 
> +--------+-------------------------------------------------------------------+
> 5 rows in set (0.05 sec)
>   

If that's what you're seeing from the mysql command-line client when 
queried against your 0.21-fixes database, it proves that every single 
one of those lines is corrupt.  The "š", the "í", and the "á" characters 
should not show up properly in the mysql client unless the column or 
connection was actually set up as a UTF-8 column or connection at one 
time (when the data was inserted).  They should show up as 2 
characters.  ("š" = "Å¡", "í" = "í­­­" (which is 2 chars, but one isn't 
visible), and "á" = "á")  You'll note that in your query after 
restoring the "uncorrupted" backup, they're showing the proper characters.

(You never followed the instructions that used to be on the wiki that 
changed the database schema, converting some columns to utf8, to try to 
"fix" the OSD data, etc., did you?  If so, that's how your data got 
corrupted.)

> It's probably the right time to show the output of MySQL status:
>
> Server version:		5.0.51a-24+lenny2 (Debian)
> Server characterset:	latin1
> Db     characterset:	latin1
> Client characterset:	latin1
> Conn.  characterset:	latin1
>
> Weird, now I have the database in a different state than before. It used
> to be (sometime around the 0.22 upgrade, not sure if before or after the
> failed upgrade) as follows:
>
> Server characterset: latin1
> Db     characterset: utf-8
> Client characterset: latin1
> Conn.  characterset: latin1
>
> This is different from the example on the web...
>   

The Db characterset will get changed to utf-8 when you first run a 
mythtv trunk program or if you use the mythtv trunk mc.sql to create the 
DB (instead of using the 0.21-fixes mc.sql to create the DB).  It 
/should/ not cause any problems if you're using a good/not-corrupt full 
DB backup or creating a brand-new DB.

> Funny, so if now I have all in 'latin1' maybe I could re-try the
> upgrade?
>   

It's your data that's bad, not your (current) DB config.

>> Since the original upgrade encoding check failure occurred during the 
>> oldrecorded table test, that means that the people and oldprogram tables 
>> are not corrupt.  Therefore, you have the dreaded partial corruption 
>> problem.  Unfortunately, since one of the corrupt tables is 
>> oldrecorded--which is one of the ones you need to restore during a 
>> partial restore--even a partial restore won't work.
>>     
> Do I really need those tables? All I want is the setup configuration,
> the recorded programs and the recording rules. Oldrecorded sounds like
> something old that I no longer need.
>   

oldrecorded is the recording history.  If you want to throw away years 
of history and start over with duplicate matching, feel free to throw 
away that table.

>> Your database likely became partially corrupt due to either switching 
>> your MySQL server from a properly-configured one to an improperly 
>> configured one (i.e. moving the database between distros)
>>     
> well, I have upgraded Etch to Lenny recently... Does that count as
> moving between distros?
>   

All that matters is how the database server was configured--if it was 
ever misconfigured at any time during your use of MythTV, data during 
that time was corrupted.

>> My recommendation is--for now--stay on 0.21-fixes 
>>     
> I am back on 0.21-fixes now but I'd like to try that upgrading again.
> Could you please suggest which tables could eventually be truncated
> safely given that I don't need anything old, just the actual recordings
> that are currently saved on the disk? For example the "people" table - I
> didn't know that MythTV collects such information and I never used it
> (and likely never will).

The recommended data to keep when doing a partial restore is that in 
record (recording rules), recorded (current recordings), oldrecorded 
(recording history), recordedprogram and recordedrating (additional 
information about current recordings), and recordedmarkup and 
recordedseek (bookmarks, commercial flagging, cutlists, and seektables).

recordedprogram and recordedrating aren't /that/ important (especially 
for "watch it and delete it" people), and recordedmarkup and 
recordedseek can be completely re-created from scratch (with no data 
loss) by running either mythtranscode --buildindex or mythcommflag 
--rebuild and then running mythcommflag on every recording.

Unfortunately, though, chances are recordedprogram and, especially, 
recordedrating won't cause any issues and recordedmarkup/recordedseek 
can not be corrupted, so skipping any of those is unlikely to make any 
difference.

Mike



More information about the mythtv-users mailing list