[mythtv] [mythtv-commits] Ticket #9918: Incorrect character encoding in xml status (patch provided)

Michael T. Dean mtdean at thirdcontact.com
Thu Jul 14 14:01:12 UTC 2011


On 07/13/2011 09:22 AM, MythTV wrote:
> #9918: Incorrect character encoding in xml status (patch provided)
> -------------------------------------------------+-------------------------
>   Reporter:  Ian Dall<ian@…>                      |          Owner:
>       Type:  Bug Report - General                 |         Status:
>   Priority:  minor                                |  infoneeded_new
> Component:  MythTV - General                     |      Milestone:  unknown
>   Severity:  medium                               |        Version:
>   Keywords:  xml encoding                         |  Unspecified
>                                                   |     Resolution:
>                                                   |  Ticket locked:  0
> -------------------------------------------------+-------------------------
>
> Comment (by Ian Dall<ian@…>):
>
>   Thanks, setting LANG as above does result in UTF-8 encoded xml which
>   parses without error.
>
>   The thing is, if LANG isn't *.UTF-8, then the xml is invalid, as any
>   encoding except UTF-8 must have a "Text Declaration" (or a "Byte Order
>   Mark" if it is UTF-16). [w3c xml 1.0 Section 4.3.3]
>
>   So, I guess it is low priority given there is a workaround, but I would
>   maintain that either a UTF-8 locale should be forced (as my patch does) or
>   the text declaration should be added with the encoding which is actually
>   used. Something like:
>
>   {{{
>       QTextCodec *default_enc = QTextCodec::codecForLocale();

This sets a global value which overrides the Qt autodetection--it 
affects all code that's executed, not just the code you're adding.

>       QDomProcessingInstruction encoding =
>           doc.createProcessingInstruction("xml",
>               "version=\"1.0\" encoding=\"" + default_enc->name() + "\"");
>       doc.appendChild(encoding);
>   }}}
>
>   Unfortunately this doesn't work because QTextCodec::codecForLocale()
>   always has a name of "System" in Qt 4.7, so I can't see any clean way to
>   do this.

Right.  This is also why we don't have a log line that tells us which 
encoding Qt is using so we can tell you that you're hitting the Qt 
bug...  I planned to add one, but as you found, the name reported by 
codecForLocale() is useless for debugging.
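
For illustration only, here is a rough sketch of one way to get a usable
encoding name without relying on codecForLocale()->name(): read the codeset
straight out of a POSIX-style locale name (the value of LANG, say) and build
the data string for the "xml" processing instruction from it.  Both helper
names are hypothetical and nothing like this exists in MythTV or Qt.  It also
assumes the codeset spelling in the locale name is one an XML parser accepts,
which is not guaranteed (e.g. "utf8" vs. "UTF-8").

```cpp
#include <string>

// Hypothetical helper: extract the codeset from a POSIX-style locale name of
// the form language[_territory][.codeset][@modifier], e.g. "en_AU.ISO-8859-1".
// Returns an empty string when the name carries no codeset (e.g. "C").
std::string codesetFromLocaleName(const std::string &locale)
{
    const std::string::size_type dot = locale.find('.');
    if (dot == std::string::npos)
        return "";

    // Drop an optional "@modifier" suffix such as "@euro".
    const std::string::size_type at = locale.find('@', dot);
    const std::string::size_type len =
        (at == std::string::npos) ? std::string::npos : at - dot - 1;
    return locale.substr(dot + 1, len);
}

// Hypothetical helper: build the data part of the "xml" processing
// instruction.  With an empty encoding no encoding attribute is emitted,
// which an XML parser will read as UTF-8 (or UTF-16 if there is a BOM).
std::string xmlDeclarationData(const std::string &encoding)
{
    std::string data = "version=\"1.0\"";
    if (!encoding.empty())
        data += " encoding=\"" + encoding + "\"";
    return data;
}
```

The result of xmlDeclarationData() would take the place of the hand-built
string passed to doc.createProcessingInstruction("xml", ...) in the snippet
quoted above.  This only papers over the symptom; it says nothing about
which encoding Qt actually ends up writing.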

So, regarding this bug, I would say that it /needs/ to be fixed in Qt, 
not here.  Qt does autodetection of system character encoding, and if 
QDomDocument creates an invalid XML stream unless developers override 
that autodetected encoding (and ignore the user-/system-specified 
encoding), Qt is broken.

I started down the rabbit hole to try to figure out what Qt is doing 
wrong so I could report a good bug, but I never got to the root of the 
problem--and was spending far too much of my time on the issue.  I will 
mention that it has an effect on HTTP parsing (thereby affecting 
MythNetvision's requests and MythVideo Storage Group processing stuff) 
and a lot more.

http://www.gossamer-threads.com/lists/mythtv/dev/439348#439348

(Note that some stuff has been changed in MythTV since I last looked, 
and I have a feeling that we just made a previously-not-working case 
work while breaking other previously-working cases.)

Mike

