[mythtv] ringbuffer.cpp

Fri Feb 14 09:24:43 UTC 2014

On 14 Feb 2014, at 04:32, Jonas Arndt <jonas.arndt at thearndtfamily.com> wrote:

> Hi Guys,
> 
> I have some questions on ringbuffer.cpp. Please realize that I am fully aware that there is a complex         logic behind all this that I do not understand yet. This is where I hope somebody can help out.
> 
> I run a single back-end (Ubuntu) with multiple frond-ends (ZBOXes, also Ubuntu)
> 
> It started out by me being tortured with the never ending 
> 
> =========================================================================
> Decoder ringbuffer.cpp:1086 (WaitForAvail) RingBuf(/second_drive/recordings/3130_20120103210000.mpg): Waited 2.0 seconds for data #012#011#011#011t
> =========================================================================
Maybe I’m totally wrong but could it be that this is the symptom of an underlying problem ?
In the past before kernel 2.6.33 I didn’t have these problems, later on sometimes more or less. What I discovered then was that kswapd was using 100% of CPU because of waiting for I/O. There was plenty of memory, I even disabled swap, but the problem persisted. By disabling swap kswapd in theory has no function at all, nevertheless it ‘halted’ the system foor several seconds. Going to a pre 2.6.33 kernel solved my problems. My conclusion FWIW is that kswapd also only is the messenger not the cause.
> 
> in the backend log file. I was on 0.25 then. I tried running the backend in a virtual machine, in a stand-alone machine, with a new hard drive, difference file systems (ext3, ext4). Nothing helped. I then saw the bug 10428, which was for 0.26. I backported that one but the error message did not go away. I only had real freezes on the front-end between shows and they would last around 4-5 seconds. Finally I took the leap and upgraded to 0.27, but the error persisted. Now it is
> 
> =========================================================================
> Feb 13 11:27:53 myth1204 mythbackend: mythbackend[5567]: I ProcessRequest ringbuffer.cpp:1098 (WaitForAvail) RingBuf(/Reco
> rdings/7730_20140213171157.mpg): Waited 0.2 seconds for data #012#011#011#011to become available... 214228 < 425984
> =========================================================================
> 
> This is when I started to look at ringbuffer.cpp, and specifically the function RingBuffer::WaitForAvail(int count), where the error comes from. I have many questions here and I hope this is the place to ask it.
> 
> Prior to starting the big while loop, that seem to wait for the buffer to fill up, there is a timer t (MythTimer) instantiated and started
> 
> MythTimer t;
>     t.start();
> 
> This seems straight forward. Then the big while loop is started, which seems to be waiting for avail to be at least as big as count (which seems to be the amount of data we want). It test other things as well but they all seem external to the WaitForAvial function.
> 
> In the while loop we call the QWaitCondition wait (generalWait.wait(&rwlock, 250), with a timeout of max 250 milliseconds. Once we are out of that (worst case 250 ms later) we go onto checking if avail now is at least count big. If not we do something I don't understand this and this is where I have questions. First out, here is the if statement
> 
> ==================================================
>         if (avail < count)
>         {
>             int elapsed = t.elapsed();
>             if (elapsed > 500 && low_buffers && avail >= fill_min)
>                 count = avail;
>             else if  (((elapsed > 250) && (elapsed < 500))  ||
>                      ((elapsed >  500) && (elapsed < 750))  ||
>                      ((elapsed > 1000) && (elapsed < 1250)) ||
>                      ((elapsed > 2000) && (elapsed < 2250)) ||
>                      ((elapsed > 4000) && (elapsed < 4250)) ||
>                      ((elapsed > 8000) && (elapsed < 8250)) ||
>                      ((elapsed > 9000)))
>             {
>                 LOG(VB_GENERAL, LOG_INFO, LOC + "Waited " +
>                     QString("%1").arg((elapsed / 250) * 0.25f, 3, 'f', 1) +
>                     " seconds for data \n\t\t\tto become available..." +
>                     QString(" %2 < %3") .arg(avail).arg(count));
>             }
> 
>             if (elapsed > 16000)
>             {
>                 LOG(VB_GENERAL, LOG_ERR, LOC + "Waited " +
>                     QString("%1").arg(elapsed/1000) +
>                     " seconds for data, aborting.");
>                 return false;
>             }
>         }
> ==================================================
> 
> So why do we expect elapsed to be over 500 ms here? In worst case it should be only 250 ms, right? At least the first time we are in the while loop. We would therefore automatically enter the "else if" statement the first time we are in the while loop, and therefore risk outputting the Error.
> 
> Then looking at the "if else" statement, what is going on there?
> 
> else if  (((elapsed > 250) && (elapsed < 500))  ||
>                      ((elapsed >  500) && (elapsed < 750))  ||
>                      ((elapsed > 1000) && (elapsed < 1250)) ||
>                      ((elapsed > 2000) && (elapsed < 2250)) ||
>                      ((elapsed > 4000) && (elapsed < 4250)) ||
>                      ((elapsed > 8000) && (elapsed < 8250)) ||
>                      ((elapsed > 9000)))
>             {
>                 Log ugly message
>             }
> 
> So what is the logic?
> if elapsed  below  500 ms => OK (can't happen the first lapse in the while loop)
> if elapsed between 750 and 1000 => OK
> if elapsed between 1250 and 2000 => OK
> and so on....
> 
> It is obvious that I am missing part of the logic here and for that I apologize. Perhaps if somebody can point me to where I can read up?
> 
> To me it seems that one test for a max value would be enough. That max value would be something that would actually cause a problem for the client. Now it seems we are just polluting the log file with messages for something that seems to be perfectly okay with the client. I also notice that the log level is LOG_INFO so I am wondering why it is visible in the LOG. I run Ubuntu's package and it seems that thing comes with this log level enabled.
> 
> Just for the heck of it I made some changes (and clearly I don't know what I am doing here). 
> I changed the 
> 
> generalWait.wait(&rwlock, 250) to generalWait.wait(&rwlock, 500)
> This change was made mostly as I felt there might be another thread that needed some more time (perhaps something that is writing to the buffer?)
> 
> I then removed everything in the else-if statement and just tested if it was over 1000. 
> With these changes I very rarely get anything in the log. I still get the 4-5 seconds freeze between shows though (even though that doesn't always happen).
> 
> Any clarifications here would be awesome. I have also seen multiple other folks having these error messages (searching the Internet).
> 
> Thanks,
> 
> // Jonas
> _______________________________________________
> mythtv-dev mailing list
> mythtv-dev at mythtv.org
> http://www.mythtv.org/mailman/listinfo/mythtv-dev
> http://wiki.mythtv.org/Mailing_List_etiquette
> MythTV Forums: https://forum.mythtv.org