[mythtv-users] Track down unstable hardware?
jarpublic at gmail.com
Thu Feb 12 16:39:25 UTC 2009
On Wed, Feb 11, 2009 at 11:36 PM, Steven Adeff <adeffs.mythtv at gmail.com>wrote:
> On Wed, Feb 11, 2009 at 9:10 PM, <jarpublic at gmail.com> wrote:
> > On Wed, Feb 11, 2009 at 8:16 PM, Brian Wood <beww at beww.org> wrote:
> >> On Wednesday 11 February 2009 17:42:37 jarpublic at gmail.com wrote:
> >> > At this point I am getting off topic for this list. It is certainly
> >> > hardware failure. When it fails I can't get it to reboot. When I try to
> >> > boot from a live CD I get the same kernel panic. However, I would hate to
> >> > get rid of the whole system just because I am too ignorant to track down
> >> > exactly which piece of hardware is failing. Does anybody know a good
> >> > Linux list that may be able to help me track down which bit of hardware
> >> > is going bad? It is especially challenging because if I let the system
> >> > sit for a while it will boot up and work fine for some indeterminate
> >> > amount of time. I have used lm-sensors to track temps and nothing seems
> >> > to be overheating, all of the fans are running, and I have checked all
> >> > of the drives for bad blocks. I don't know what else to do at this
> >> > point. I don't want to bother the list anymore, but does somebody know
> >> > the right group to bother about troubleshooting Linux hardware?
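[Archive note: one way to get more out of the lm-sensors check described
above is to log readings to disk so the last reading before a hard freeze
survives. A minimal sketch follows; the log path, file names, and interval
are arbitrary choices, not from this thread, and `sensors` is assumed to
come from the lm-sensors package.]

```shell
#!/bin/sh
# Sketch: capture one timestamped sensor reading. If lm-sensors is not
# installed, fall back to a note so the log line is still written.
log_once() {
    date
    sensors 2>/dev/null || echo "sensors not available"
}

# To watch continuously before a freeze, run something like:
#   while true; do log_once >> /var/log/temp-watch.log; sync; sleep 10; done
# (the sync forces the write to disk so it survives a hard lockup)
log_once > /tmp/temp-watch-sample.log
```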
> >> A machine that always works after being off for a while probably has
> >> some sort of thermal problem. Sensors are seldom helpful, as this could
> >> be on just about anything: chips, resistors, or even solder connections.
> >> You might try cooling various components with freeze-spray, which helps
> >> identify this sort of trouble. Remember that if the problem is on a chip
> >> die or the like, it will take several seconds at least before things
> >> start to work after you spray it. Don't be impatient, or you will have
> >> sprayed lots of components and not know which one it was when it starts
> >> working.
> >> Otherwise, unless you have a lab full of test gear, the only practical
> >> troubleshooting method is substitution: replace things one by one with
> >> known good replacements until you find the problem.
> >> I'd suspect the PSU first, but YMMV.
> > A thermal problem seemed to be the most likely cause to me, but I wasn't
> > sure how to narrow this thing down. I didn't really consider the power
> > supply because it doesn't completely crash. It just freezes on the
> > screen, and I lose all input and network. Even if I had hardware around
> > to switch out, the problem is made complicated by the fact that even the
> > bad hardware works some of the time. So it would be hard to say if
> > swapping a component out helped things work because of that component or
> > because the failing component happens to be working at that moment. The
> > kernel panic comes up immediately after GRUB, before anything happens. So
> > I was hoping that it would be simple to narrow it down to a drive, or
> > perhaps there is some way to get some more verbose error messages.
> I've been peripherally following this thread, but I have to agree with
> Brian that the first thing I would check is the power supply. I've seen
> similar issues arise from power supplies on their last legs.
> Other than that, without one of those PCI slot-based hardware testers,
> it could be very hard to figure out without swapping out hardware
> piece by piece.
Unfortunately it is an old Dell workstation that was decommissioned from a
school. It has some large flat proprietary PSU that covers the bottom of the
whole case, so I don't think it would be easy to replace. It is a P4 beast
and is big and loud. Maybe it is time to move on. I just have a hard time
getting rid of old hardware if I can keep it working for something.