[mythtv-users] Hard crashes of mythtv backends
Michael T. Dean
mtdean at thirdcontact.com
Thu Nov 8 04:53:33 UTC 2007
On 11/05/2007 11:07 PM, Peter Schachte wrote:
> Michael T. Dean wrote:
>> On 11/05/2007 01:53 PM, Michael Rice wrote:
>>> One nagging problem that persists is hard crashes of the
>>> backend every so often.
>> I was having similar issues with my master backend... It was purely
>> hardware--and I knew it from the start, but that didn't stop me from
>> living with it for almost exactly a year.
> How did you deduce it was a hardware problem, Mike,
Let's just say that me and my software are a little too close, so I knew
the software wasn't the issue.
I also had many other indications (besides just "knowing" the software
was correct)--2 other machines with identical installs (from an image)
working perfectly (though kernel drivers differed because of differing
hardware); the fact that this particular hardware's reliability was
relatively "untested" in a "high-stress" environment (i.e. the hardware
components I owned--not saying anything of the vendor/model/... in
general); the fact that the motherboard was brand new, so it had never
been known to work reliably; the specific failure modes (i.e. kernel
panics occurring in everything from the OpenSSH daemon to the network
driver to the BOINC program to the video driver to the bash shell--all
panics due to memory corruption). Mainly though, I just "knew" because
I was sure I did the software correctly. :)
> and how did you pin it down
> to the motherboard?
I knew it was hardware from the start. It took me almost the entire
year I lived with it to convince myself it was the (brand-new)
motherboard I had just bought to replace the one I had originally
planned to use--the one whose caps exploded within 2 days of moving it
from being my main desktop to being my master backend. The "convincing"
had a lot to do with having replaced every other piece of hardware (with
the exception of the CPU, hard drives, and capture cards) in the
system. And I mean /every/ other piece: new motherboard, new RAM, new
PSU, new CPU fan, new chipset fan (the one on the brand new MB failed
within a couple of months). From the original system I tried to reuse
(to save money), I ended up only reusing the CPU and the case, and since
the CPU was a socket 462 (socket A) CPU, the replacement motherboard was
more expensive and less functional than newer K8/Core 2-based MBs.
> Intermittent bugs are a PITA to track down; ones that
> surface only once a week, even more so.
Yep. Mine tended to happen once every 2 weeks to once per month--and
always seemed to occur early in the week when I was out of town on
week-long business trips (of course :)--almost never when I was at
home. (I really need to work out some solution for remote power cycling
so I can "fix" issues that occur when I'm out of town with a remote
reboot. After a kernel panic, shelling in to issue the reboot command
just doesn't work. Ideas, anyone?)
Anyway, there's a reason I lived with it for a year--and only part
(though, admittedly, a large part) of that reason is obstinacy. In
many cases, I thought a change (hardware or kernel/driver configuration)
had "fixed" (or at least worked around) the issue because the system
would run for about a month without a failure.