[mythtv] SBE core dump due to MythSocket misuse?

faginbagin mythtv at hbuus.com
Fri Nov 18 17:45:26 UTC 2011


On 11/18/2011 10:30 AM, Tom Lichti wrote:
> On Fri, Nov 18, 2011 at 12:06 AM, faginbagin<mythtv at hbuus.com>  wrote:
>> I'm in the process of debugging a deadlock that occurs in my MBE. It seems to be caused by my SBE dumping core after it finishes a recording. I should also mention that I'm running 0.23-fixes, version 26863. However, I think I've found a problem that still exists in the master branch. I hope what I've learned will help the project, so here goes.
>>
>> My SBE is dumping core in the method MainServer::reconnectTimeout() when it calls MythSocket::Unlock() immediately after a call to MythSocket::DownRef(). The call to MythSocket::DownRef() deletes the MythSocket object because the reference count goes negative. When I compare the 0.23 versions of programs/mythbackend/mainserver.cpp and libs/libmythdb/mythsocket.cpp to the master branch versions (mythsocket.cpp has moved to libs/libmythbase), I see the same code, which I think means the master branch is vulnerable to the same core dump I'm seeing on 0.23. In particular:
>>
>> The master version of mythsocket.cpp contains the following, starting at line 102:
>>
>> bool MythSocket::DownRef(void)
>> {
>>     m_ref_lock.lock();
>>     int ref = --m_ref_count;
>>     m_ref_lock.unlock();
>>
>>     LOG(VB_SOCKET, LOG_DEBUG, LOC + QString("DownRef: %1").arg(ref));
>>
>>     if (m_cb&&  ref == 0)
>>     {
>>         m_cb = NULL;
>>         s_readyread_thread->RemoveFromReadyRead(this);
>>         // thread will downref&  delete obj
>>         return true;
>>     }
>>     else if (ref<  0)
>>     {
>>         delete this;
>>         return true;
>>     }
>>
>>     return false;
>> }
>>
>> The 0.23 version is almost identical. The only difference is the log statement:
>> Master version:
>>     LOG(VB_SOCKET, LOG_DEBUG, LOC + QString("DownRef: %1").arg(ref));
>> 0.23 version:
>>     VERBOSE(VB_SOCKET, LOC + QString("DownRef: %1").arg(m_ref_count));
>>
>> The master version of programs/mythbackend/mainserver.cpp contains the following, starting at line 6091 in MainServer::reconnectTimeout():
>>
>>     if (!masterServerSock->writeStringList(strlist) ||
>>         !masterServerSock->readStringList(strlist) ||
>>         strlist.empty() || strlist[0] == "ERROR")
>>     {
>>         masterServerSock->DownRef();
>>         masterServerSock->Unlock();
>>         masterServerSock = NULL;
>>
>> The same code can be found in the 0.23 version starting at line 5217.
>>
>> I have confirmed that my SBE is in fact deleting the MythSocket object in the MythSocket::DownRef() method by turning on socket logging. Here's a sample of the log output:
>>
>> 2011-11-17 22:01:07.127 MythSocket(9083638:34): readStringList: Error, timed out after 30000 ms.
>> 2011-11-17 22:01:07.132 MythSocket(9083638:34): state change Connected ->  Idle
>> 2011-11-17 22:01:07.141 MythSocket(9083638:-1): DownRef: -1
>> 2011-11-17 22:01:07.148 MythSocket(9083638:-1): delete socket
>>
>> Then mainserver.cpp calls MythSocket:Unlock() and promptly dumps core. I can think of two fixes:
>>
>> 1) Swap the two calls to MythSocket:DownRef() and MythSocket:Unlock() in mainserver.cpp.
>>
>> 2) Change MythSocket:DownRef() so it doesn't delete itself when ref<  0, perhaps by replacing the "delete this;" statement with "s_readyread_thread->RemoveFromReadyRead(this);"? I don't know the code well enough to know if that would be appropriate, but seeing an object delete itself makes me very uncomfortable.
> 
> Please post your patches, as I see the same problem with the master
> branch. If my master backend is down for whatever reason, my slave
> backend will segfault eventually.
> 
> Thanks!
> Tom

Hi Tom,

I just made the change late last night, and I haven't had a chance to reproduce the circumstances since then. Also, I'm running 0.23, so I don't know my patch would apply cleanly, line numbers are different by over 800. If the following patch doesn't apply cleanly, you could use an editor and make the change by hand. The lines in question are around line number 6095 in master.

--- programs/mythbackend/mainserver.cpp.orig  2010-10-12 19:02:53.000000000 -0400
+++ programs/mythbackend/mainserver.cpp       2011-11-18 00:48:44.000000000 -0500
@@ -5218,8 +5218,8 @@
         !masterServerSock->readStringList(strlist) ||
         strlist.empty() || strlist[0] == "ERROR")
     {
-        masterServerSock->DownRef();
         masterServerSock->Unlock();
+        masterServerSock->DownRef();
         masterServerSock = NULL;
         if (strlist.empty())
         {


More information about the mythtv-dev mailing list