[mythtv-users] Track down unstable hardware?

Steven Adeff adeffs.mythtv at gmail.com
Thu Feb 12 17:13:35 UTC 2009


On Thu, Feb 12, 2009 at 11:39 AM,  <jarpublic at gmail.com> wrote:
> On Wed, Feb 11, 2009 at 11:36 PM, Steven Adeff <adeffs.mythtv at gmail.com>
> wrote:
>> On Wed, Feb 11, 2009 at 9:10 PM,  <jarpublic at gmail.com> wrote:
>> > On Wed, Feb 11, 2009 at 8:16 PM, Brian Wood <beww at beww.org> wrote:
>> >> On Wednesday 11 February 2009 17:42:37 jarpublic at gmail.com wrote:
>> >> > At this point I am getting off topic for this list. It is certainly
>> >> > some hardware failure. When it fails I can't get it to reboot. When
>> >> > I try to boot from a live CD I get the same kernel panic. However, I
>> >> > would hate to get rid of the whole system just because I am too
>> >> > ignorant to track down exactly which piece of hardware is failing.
>> >> > Does anybody know a good Linux list that may be able to help me track
>> >> > down which bit of hardware is going bad? It is especially challenging
>> >> > because if I let the system sit for a while it will boot up and work
>> >> > fine for some indeterminate amount of time. I have used lm-sensors to
>> >> > track temps and nothing seems to be hot, all of the fans are running,
>> >> > and I have checked all of the drives for bad blocks. I don't know
>> >> > what else to do at this point. I don't want to bother the list
>> >> > anymore, but does somebody know the right group to bother about
>> >> > troubleshooting Linux hardware?
>> >>
>> >> A machine that always works after being off for a while probably has
>> >> some sort of thermal problem. Sensors are seldom helpful, as this
>> >> could be on just about anything: chips, resistors, or even solder
>> >> connections.
>> >>
>> >> You might try cooling various components with freeze-spray; that
>> >> sometimes helps identify this sort of trouble. Remember that if the
>> >> problem is on a chip die or the like it will take several seconds at
>> >> least before things start to work after you spray it. Don't be
>> >> impatient, or you will have sprayed lots of components and not know
>> >> which one it was if it starts working.
>> >>
>> >> Otherwise, unless you have a lab full of test gear, the only practical
>> >> troubleshooting method is substitution: replace things one by one with
>> >> known good replacements until you find the problem.
>> >>
>> >> I'd suspect the PSU first, but YMMV.
>> >
>> >
>> > A thermal problem seemed to be the most likely problem to me, but I
>> > wasn't sure how to narrow this thing down. I didn't really consider the
>> > power supply because it doesn't completely crash. It just freezes on
>> > the current screen, and I lose all input and network. Even if I had
>> > hardware around to switch out, the problem is made complicated by the
>> > fact that even the bad hardware works some of the time. So it would be
>> > hard to say if switching a component out helped things work because of
>> > that component or because the failing component happened to be working
>> > at that moment. The kernel panic comes up immediately after grub,
>> > before anything else happens. So I was hoping that it would be simple
>> > to narrow it down to a drive, or perhaps there was some way to get
>> > some more verbose error messages.
>> >
>>
>> I've only been peripherally following this thread, but I have to agree
>> with Brian that the first thing I would check is the power supply. I've
>> seen similar issues arise from power supplies on their last legs.
>> Other than that, without one of those PCI slot-based hardware testers
>> it could be very hard to figure out without swapping out hardware
>> piece by piece.
>>
>> -
>
> Unfortunately it is an old Dell workstation that was decommissioned from a
> school. It has some large flat proprietary PSU that covers the bottom of the
> whole case. I don't think it would be easy to replace. It is a P4 beast and
> is big and loud. Maybe it is time to move on. I just have a hard time
> getting rid of old hardware if I can keep it working for something.

If you have another computer, you could borrow its PSU to confirm that
is the issue, then just buy a new case and PSU so you don't have to buy
all new hardware. Of course, when you can buy a modern motherboard, CPU
and RAM for <$200 then yes, it may be time to move on =P

I'm actually having similar issues with my living room frontend right
now. It just randomly freezes and doesn't seem to respond to keyboard
input at BIOS POST time. I replaced the PSU (before I noticed the
keyboard issue) since I knew that the one I had was underpowered for
what the system eventually grew into, but it still randomly freezes
(and now seems like it won't even boot up). Luckily I have some spare
parts I can swap around to try to find the issue, but the keyboard
thing scares me; I'll probably have to buy a new motherboard/etc.
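
Since the hard part with intermittent freezes is telling whether a swap
actually helped, one thing that might be worth trying is a heartbeat
log: write a timestamp to disk every so often, and after a hang look at
how long each run lasted before and after the swap. This is only a rough
sketch, not something I've run on either box. The file name, the
30 second interval and the "gap of more than 3 intervals means a freeze"
rule are all my own assumptions.

```python
"""Heartbeat log: a rough way to measure how long a box runs before it hangs."""
import os
import time

LOG_PATH = "heartbeat.log"  # hypothetical path; any disk that survives the hang
INTERVAL = 30               # assumed seconds between heartbeats


def write_heartbeat(path, now=None):
    """Append one timestamp and force it to disk so a hard freeze cannot lose it."""
    stamp = time.time() if now is None else now
    with open(path, "a") as log:
        log.write(f"{stamp:.0f}\n")
        log.flush()
        os.fsync(log.fileno())


def read_heartbeats(path):
    """Return every logged timestamp as a float, oldest first."""
    with open(path) as log:
        return [float(line) for line in log if line.strip()]


def run_lengths(stamps, interval=INTERVAL):
    """Return the length in seconds of each uninterrupted run.

    A gap of more than 3 intervals between heartbeats is treated as a
    freeze or reboot (an assumed threshold, adjust to taste).
    """
    if not stamps:
        return []
    runs = []
    start = prev = stamps[0]
    for stamp in stamps[1:]:
        if stamp - prev > 3 * interval:
            runs.append(prev - start)  # previous run ended at the last heartbeat
            start = stamp
        prev = stamp
    runs.append(prev - start)          # the run still in progress
    return runs
```

Call write_heartbeat(LOG_PATH) from a loop or a cron job at roughly the
chosen interval, then feed read_heartbeats(LOG_PATH) to run_lengths()
after a few freezes. If the runs get clearly longer after swapping a
part, that part is a better suspect than a single lucky boot would
suggest.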

-- 
Steve
http://www.mythtv.org/wiki/index.php/User:Steveadeff
Before you ask, read the FAQ!
http://www.mythtv.org/wiki/index.php/Frequently_Asked_Questions
then search the Wiki, and this list,
http://www.gossamer-threads.com/lists/mythtv/
Mailinglist etiquette -
http://www.mythtv.org/wiki/index.php/Mailing_List_etiquette

