[mythtv-users] MythTV, phpMyAdmin and German umlauts
Michael T. Dean
mtdean at thirdcontact.com
Thu Jun 26 21:28:54 UTC 2008
On 06/26/2008 05:10 PM, Torsten Crass wrote:
> German umlauts seem to be a recurring issue with MythTV... Actually,
> everything works fine for me (DVB-T EPG data, display in frontend etc.)
> -- until I try to edit some mythconverg.recorded entry containing
> umlauts in the title/subtitle/description field using phpMyAdmin.
This will /not/ work. MythTV uses a character encoding that differs
from the one MySQL thinks is in use, so phpMyAdmin will misinterpret
and/or mis-encode the data.
Therefore, Brad's advice is spot-on: "Don't edit those fields with
phpMyAdmin."
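To illustrate why the umlauts turn into double characters, here is a rough Python sketch. It assumes the stored bytes are UTF-8 and that the tool reading them trusts the declared latin1 charset. The function name is made up for illustration, and Python's cp1252 codec is only a stand-in for what MySQL calls latin1.

```python
# Sketch only: cp1252 approximates MySQL's "latin1"; names are hypothetical.
def seen_by_latin1_reader(text: str) -> str:
    """Return what a reader sees if it decodes UTF-8 bytes as latin1."""
    raw = text.encode("utf-8")    # the bytes assumed to be in the column
    return raw.decode("cp1252")   # the reader trusts the declared charset

original = "Die größte Waldlandschaft der Erde..."
garbled = seen_by_latin1_reader(original)
print(garbled)  # each umlaut now shows up as two characters
```

Each non-ASCII character occupies two bytes in UTF-8, and each of those bytes gets rendered as its own latin1 character.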
>
> Although the umlauts show fine in the MySQL command-line client, in
> phpMyAdmin they get displayed as funny double-characters. Example
> (though I don't know whether this will survive e-mailing):
>
> "Die größte Waldlandschaft der Erde..."
>
> shows up as
>
> "Die größte Waldlandschaft der Erde..."
>
> I tried changing the affected column's and/or the connection's collation
> to utf8 (which is the default locale on all my systems),
Doing that will likely mean that the data will be corrupted when you
upgrade to 0.22 (if the process you used to change it didn't already
corrupt it)...
> but the only
> change was that little boxes were displayed instead of funny characters...
>
> Someone once said something like "At the first glance, character
> encoding may look rather difficult. But in reality, it's even worse". I
> tend to agree.
>
> So any ideas, anyone?
>
The entire mechanism used for character encoding in MythTV is changing
significantly between 0.21 and 0.22, so the development version of Myth
no longer has the issue you have (though it may have other character
encoding issues, as the conversion is still occurring). That means that
if you really want it fixed, you'll probably need to fix it yourself
since fixing a small UI bug in the already released version that doesn't
apply to the development version is probably low priority for any devs.
Note, however, that the changes you made to your DB mean that if you
"fix" it without first fixing your DB, your fix will be incorrect.
Therefore, you need to change the database schema back /and/ fix the
data that was in the columns you changed (TRUNCATE TABLE ... is the
best way, so I hope it's only short-term data, like that in program).
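As a rough picture of how a column change can damage the data, the Python sketch below assumes the change made the server transcode the stored bytes from latin1 to utf8. Whether your particular change did that is an assumption on my part. The helper name is hypothetical, and cp1252 again only approximates MySQL's latin1.

```python
# Sketch under assumptions: simulates a server-side latin1 -> utf8 transcode.
def convert_column_latin1_to_utf8(raw: bytes) -> bytes:
    """Treat the stored bytes as latin1 text and re-encode them as UTF-8."""
    return raw.decode("cp1252").encode("utf-8")

stored = "größte".encode("utf-8")                  # bytes already UTF-8
converted = convert_column_latin1_to_utf8(stored)  # now double-encoded

print(len(stored), len(converted))
# Reading the converted bytes as UTF-8 no longer yields the original text.
print(converted.decode("utf-8") == "größte")
```

Changing the schema back does not undo this by itself, which is why the data in the affected columns needs fixing too.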
In MythTV 0.21, the entire MySQL DB /must/ be in latin1 encoding (table
names, column names, and VARCHAR and other text columns). However, the
data within the DB should be proper UTF-8 encoding. Myth encodes the
data before inserts and decodes it after queries. Likely the part
you're using just forgets to do the appropriate conversion (and the fix
is as simple as adding a toUtf8() somewhere or vice versa).
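The round trip, and what a single forgotten conversion does to it, can be sketched in Python as follows. All function names are hypothetical and a dict stands in for a latin1 VARCHAR column; this is not MythTV's actual code.

```python
# Hypothetical helpers; a dict stands in for the database column.
def to_db(text: str) -> bytes:
    return text.encode("utf-8")        # conversion before insert

def from_db(raw: bytes) -> str:
    # errors="replace" makes bad bytes visible as U+FFFD instead of raising
    return raw.decode("utf-8", errors="replace")

def to_db_forgotten(text: str) -> bytes:
    return text.encode("latin-1")      # bug: UTF-8 conversion skipped

column = {}
column["ok"] = to_db("Größte")
column["buggy"] = to_db_forgotten("Größte")

good = from_db(column["ok"])      # round-trips cleanly
bad = from_db(column["buggy"])    # contains replacement characters
print(good, bad)
```

Only one code path needs to skip the conversion for its data to come back damaged while everything else looks fine.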
The reason these bugs still existed in 0.21 is that users who
aren't using latin character sets didn't report them as bugs, so the
devs--most of whom are using latin character sets--didn't notice them.
Unfortunately, many users instead followed the instructions on broken
wiki pages, blogs, or forum posts (such as
http://www.mythtv.org/wiki/index.php/Utf8_Text_in_OSD ).
The worst part is that all the data in those users' databases is likely
to be corrupted during the upgrade to 0.22--likely in a way that
prevents the database upgrade for 0.22 from occurring. That means those
users will probably have to start their MythTV databases over from
scratch (losing all sorts of info, like previous recordings, recording
rules, ...). If they're lucky, they'll be able to get by with a "new
host" database restore (though they'll have to follow a /very/ strict
upgrade procedure to make that work).
If nothing else, 0.22 should fix a bunch of the issues (and those that
remain are far more likely to be reported).
Anyway, lots of "extra" info, just to let those users who have broken
their database schemas start planning for problems with 0.22.
Mike