<div dir="ltr">On Mon, Oct 14, 2013 at 10:05 PM, faginbagin <span dir="ltr">&lt;<a href="mailto:mythtv@hbuus.com" target="_blank">mythtv@hbuus.com</a>&gt;</span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class=""><div class="h5">On 10/12/2013 5:19 PM, MythTV wrote:<br>


&gt; #11899: Garbled CCs on some stations<br>

&gt; -----------------------------------+----------------------------<br>

&gt;   Reporter:  faginbagin &lt;mythtv@…&gt;  |          Owner:  stichnot<br>

&gt;       Type:  Bug Report - General   |         Status:  accepted<br>

&gt;   Priority:  minor                  |      Milestone:  unknown<br>

&gt; Component:  MythTV - Captions      |        Version:  0.27-fixes<br>

&gt;   Severity:  medium                 |     Resolution:<br>

&gt;   Keywords:                         |  Ticket locked:  0<br>

&gt; -----------------------------------+----------------------------<br>

&gt;<br>

&gt; Comment (by stichnot):<br>

&gt;<br>

&gt;   A few observations so far.<br>

&gt;<br>

&gt;   1. Mostly (but not entirely), the garbling is because roughly half the<br>

&gt;   characters aren&#39;t being displayed.<br>

&gt;<br>

&gt;   2. xine shows almost exactly the same garbled captions as MythTV.<br>

&gt;<br>

&gt;   3. As noted, ccextractor (version 0.65 in my case) produces very clean<br>

&gt;   captions.<br>

&gt;<br>

&gt;   4. If you combine the captions from fields 0 and 1 by setting field=0 at<br>

&gt;   the beginning of CC608Decoder::FormatCCField(), most (but not all) of the<br>

&gt;   caption characters are correctly displayed.<br>

&gt;<br>

&gt;   Especially in light of the xine behavior, I suspect an upstream ffmpeg<br>

&gt;   problem, though it&#39;s also possible that it&#39;s a problem in our<br>

&gt;   avformatdecoder.cpp.<br>

<br>

</div></div>Thanks for looking at this. Some questions:<br>

<br>

Where is the &quot;upstream&quot; version of ffmpeg used by xine and mythtv? When I look at:<br>

<a href="https://github.com/FFmpeg/FFmpeg/blob/release/1.2/libavcodec/mpeg12.c" target="_blank">https://github.com/FFmpeg/FFmpeg/blob/release/1.2/libavcodec/mpeg12.c</a><br>

<a href="http://git.videolan.org/?p=ffmpeg.git;a=blob_plain;f=libavcodec/mpeg12.c;hb=refs/heads/release/1.2" target="_blank">http://git.videolan.org/?p=ffmpeg.git;a=blob_plain;f=libavcodec/mpeg12.c;hb=refs/heads/release/1.2</a><br>


these versions of mpeg12.c don&#39;t have any reference to the data members tmp_atsc_cc_len or tmp_scte_cc_len, which leads me to believe I&#39;m not looking at the right upstream repositories, at least when it comes to CC support in ffmpeg. I found this commit:<br>


<a href="http://git.videolan.org/?p=ffmpeg.git;a=commit;h=33d699a4e73d5281b2cfcd0fa355c0d80241dd23" target="_blank">http://git.videolan.org/?p=ffmpeg.git;a=commit;h=33d699a4e73d5281b2cfcd0fa355c0d80241dd23</a><br>

which seems to match the one referenced in this mythtv commit:<br>

<a href="https://github.com/MythTV/mythtv/commit/9c728cb8f19100e9976196709b8258480e72d30b" target="_blank">https://github.com/MythTV/mythtv/commit/9c728cb8f19100e9976196709b8258480e72d30b</a><br>

This leads me to believe videolan is the source of mythtv&#39;s ffmpeg code, but videolan must do something different when it comes to CC processing.<br></blockquote><div><br></div><div>I didn&#39;t think to actually check the pristine upstream version of ffmpeg.  It looks like most of the code in mpeg_decode_user_data() is a MythTV-specific addition in our repository.  This explains a lot.</div>

<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<br>

I&#39;ve skimmed the mpeg12.c code and wondered about the way it copies CC data from the source video stream into temp buffers that are later copied to other buffers by mpegvideo.c and then handled by avformatdecoder.cpp. I suspect the mpeg12.c code may be stripping out needed info from the control bytes. I haven&#39;t looked &quot;super&quot; close, but maybe it&#39;s time to do so. It would help to know where that code came from.<br>

</blockquote><div><br></div><div>I believe the error is introduced in the parsing stage, not the copying stages.  Specifically, at</div><div><a href="https://github.com/MythTV/mythtv/blob/master/mythtv/external/FFmpeg/libavcodec/mpeg12.c#L2258">https://github.com/MythTV/mythtv/blob/master/mythtv/external/FFmpeg/libavcodec/mpeg12.c#L2258</a><br>

</div><div>with the rollup.mpg sample, I can set a breakpoint and observe it parsing cc_data_1 = &#39;X&#39; and cc_data_2 = &#39;E&#39;, and it subsequently adds this bogus pair of characters to the stream.  I&#39;ve compared this code, the corresponding ccextractor code (user_data() in es_userdata.cpp in the section with the comment &quot;// SCTE 20 user data&quot;), and section 5.5 of the spec <a href="http://www.scte.org/documents/pdf/standards/SCTE%2020%202012.pdf">http://www.scte.org/documents/pdf/standards/SCTE%2020%202012.pdf</a>, line by line, and didn&#39;t find any meaningful differences that would explain this &quot;XE&quot; difference.</div>
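<div><br></div><div>For anyone following along in the debugger: the parser passes each caption byte through the av_reverse[] lookup table before using it, which I take to mean the SCTE 20 payload carries the bits of cc_data_1 and cc_data_2 in reversed order. A minimal sketch of that step, assuming the table is a plain 8-bit bit reversal (reverse8 is a hypothetical stand-in name, not a function in our tree):</div>

```c
#include <stdint.h>

/* Hypothetical stand-in for the av_reverse[] lookup table:
 * returns the input byte with its 8 bits in reversed order. */
static uint8_t reverse8(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0) >> 4) | ((b & 0x0F) << 4)); /* swap nibbles      */
    b = (uint8_t)(((b & 0xCC) >> 2) | ((b & 0x33) << 2)); /* swap bit pairs    */
    b = (uint8_t)(((b & 0xAA) >> 1) | ((b & 0x55) << 1)); /* swap adjacent bits */
    return b;
}
```

<div>Under that assumption, a raw byte of 0x1A in the user data would come out as 0x58, which is ASCII &#39;X&#39;.</div>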

<div><br></div><div>There are a couple of differences from the ccextractor code that explain most of the cc608 omissions, at least in the rollup.mpg sample:</div><div><br></div><div>==========</div><div><div>diff --git a/mythtv/external/FFmpeg/libavcodec/mpeg12.c b/mythtv/external/FFmpeg/libavcodec/mpeg12.c</div>

<div>index 6d24889..a813600 100644</div><div>--- a/mythtv/external/FFmpeg/libavcodec/mpeg12.c</div><div>+++ b/mythtv/external/FFmpeg/libavcodec/mpeg12.c</div><div>@@ -2257,13 +2257,13 @@ static void mpeg_decode_user_data(AVCodecContext *avctx,</div>

<div>                 uint8_t line_offset = get_bits(&amp;gb, 5);</div><div>                 uint8_t cc_data_1 = av_reverse[get_bits(&amp;gb, 8)];</div><div>                 uint8_t cc_data_2 = av_reverse[get_bits(&amp;gb, 8)];</div>

<div>-                uint8_t type = (1 == field_no) ? 0x00 : 0x01;</div><div>+                uint8_t type = (2 == field_no) ? 0x01 : 0x00;</div><div>                 (void) priority; // we use all the data, don&#39;t need priority</div>

<div>                 marker &amp;= get_bits(&amp;gb, 1);</div><div>                 // dump if marker bit missing</div><div>                 valid = marker;</div><div>                 // ignore forbidden and repeated (3:2 pulldown) field numbers</div>

<div>-                valid = valid &amp;&amp; (1 == field_no || 2 == field_no);</div><div>+                valid = valid &amp;&amp; (0 != field_no);</div><div>                 // ignore content not in line 21</div><div>                 valid = valid &amp;&amp; (11 == line_offset);</div>

<div>                 if (!valid)</div><div>diff --git a/mythtv/libs/libmythtv/cc608decoder.cpp b/mythtv/libs/libmythtv/cc608decoder.cpp</div><div>index e055718..e25257e 100644</div><div>--- a/mythtv/libs/libmythtv/cc608decoder.cpp</div>

<div>+++ b/mythtv/libs/libmythtv/cc608decoder.cpp</div><div>@@ -143,6 +143,7 @@ static const QChar extendedchar3[] =</div><div> </div><div> void CC608Decoder::FormatCCField(int tc, int field, int data)</div><div> {</div><div>

+    field = 0;</div><div>     int b1, b2, len, x;</div><div>     int mode;</div><div> </div></div><div>==========</div><div><br></div></div></div></div>
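<div dir="ltr"><div>The effect of the mpeg12.c hunk above can be sketched in isolation. This is only an illustration of the two changed expressions, lifted from the diff into hypothetical helper functions (the *_before and *_after names are mine and do not exist in mpeg12.c):</div>

```c
#include <stdint.h>

/* Field type as computed before the patch: only field_no 1 maps to 0x00. */
static uint8_t cc_type_before(uint8_t field_no)
{
    return (1 == field_no) ? 0x00 : 0x01;
}

/* Field type after the patch: only field_no 2 maps to 0x01. */
static uint8_t cc_type_after(uint8_t field_no)
{
    return (2 == field_no) ? 0x01 : 0x00;
}

/* Validity before the patch: field_no must be 1 or 2. */
static int cc_valid_before(int marker, uint8_t field_no, uint8_t line_offset)
{
    int valid = marker;                                /* dump if marker bit missing */
    valid = valid && (1 == field_no || 2 == field_no); /* rejects 0 and 3            */
    valid = valid && (11 == line_offset);              /* line 21 only               */
    return valid;
}

/* Validity after the patch: only field_no 0 is rejected. */
static int cc_valid_after(int marker, uint8_t field_no, uint8_t line_offset)
{
    int valid = marker;
    valid = valid && (0 != field_no);                  /* rejects only 0             */
    valid = valid && (11 == line_offset);
    return valid;
}
```

<div>The only input whose treatment changes is field_no == 3, which the existing comment describes as the repeated (3:2 pulldown) field: it was dropped before, and with the patch it is accepted and given type 0x00, the same as field_no == 1.</div></div>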