RH: system hangs, network connections work, services don't
I have a serious problem with Red Hat 7.3 and 8.0:
- Server crashes after 1 to 21 days
- It's not possible to foresee these events
- They mainly occurred during peak hours, but today's crash didn't
- Server still accepts TCP/IP connections on ports where services were running before, but (most) services don't respond. E.g. you can telnet to port 110 (POP3) and get a connection, but you can't use POP3 because the service doesn't respond.
- DNS still works. I guess that's because it runs completely in memory and, AFAIK, doesn't fork any subprocesses or read from the hard disk
- ssh etc. don't work either
- Console doesn't work (not even Ctrl-Alt-Del etc.)
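The "connects but doesn't respond" symptom above can be checked from another machine without typing into telnet by hand. Here is a rough sketch in Python, assuming the service sends a greeting right after the connection opens (as POP3 does). The function name `probe_banner` and the timeout values are made up for illustration.

```python
import socket

def probe_banner(host, port, timeout=5.0):
    """Return (connected, banner).

    connected is True if the TCP handshake completed. banner is the first
    data the service sent, or None if nothing arrived within the timeout.
    """
    try:
        # The kernel completes the handshake for a listening socket even if
        # the service process itself is stuck, so this can succeed on a
        # hung machine.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return (False, None)
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(512)
        except OSError:
            # Covers the timeout case too: connected, but no greeting.
            return (True, None)
    return (True, data.decode("ascii", "replace").strip() or None)
```

A result of `(True, None)` for port 110 would match what I see by hand: the connection opens but the service never answers. Running this from cron on a second box and logging the result would at least give a timestamp for when the services stopped responding.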
I exchanged the hardware completely and moved from RH 7.3 to RH 8.0 in the same step. I always used the latest kernels from Red Hat Network.
I'm lost!
Has anybody had similar problems, or can anyone think of a solution?
I even changed the hard disks: the only thing my two servers had in common was that both used Maxtor hard disks. I replaced them in the new server with Seagate ones and it seemed to work. But today it crashed again, so that didn't help :-(
Any suggestions? PLEASE
It's not a hardware problem - definitely!
It's not a hardware problem because I have the same issue on two completely different machines with completely different hardware. Until two weeks ago the only thing common to both servers was the Maxtor hard disks (although the new PC had brand-new hard disks!), but I replaced them with Seagate ones. That wasn't the problem either :-((
On the old server I used Red Hat 7.3 with all updates from RHN (Red Hat Network); on the new one I use 8.0 with all updates. So even the software is mostly different.
I don't know what else to do about it or what else to try. I can't even say when (or for what reason) the error occurs. It recurs unexpectedly after anywhere from 1 day to 4 weeks, and not because of high load or anything like that.
Anybody got an idea how to track these problems down?
There are no suspicious entries in the log files either :-(
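Since the normal logs show nothing, one possible way to narrow down the moment of the hang is a heartbeat file that is forced to disk on every write. The sketch below is only an idea, not a known fix: the names `write_heartbeat` and `run`, the file name `heartbeat.log` and the 10-second interval are all hypothetical. It records a timestamp and the load averages, then calls `os.fsync` so the last line written before a hang should survive the reboot.

```python
import os
import time

def write_heartbeat(path):
    """Append one timestamped line with the load averages and force it to disk."""
    load1, load5, load15 = os.getloadavg()
    line = "%s load=%.2f %.2f %.2f\n" % (
        time.strftime("%Y-%m-%d %H:%M:%S"), load1, load5, load15)
    with open(path, "a") as fh:
        fh.write(line)
        fh.flush()
        # fsync asks the kernel to push the data to the disk now, so the
        # line is not lost in a write cache when the machine locks up.
        os.fsync(fh.fileno())
    return line

def run(path="heartbeat.log", interval=10):
    """Write a heartbeat every `interval` seconds until interrupted."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

Calling `run()` in the background and looking at the last line after the next crash would show when the machine stopped being able to write, and whether the load was climbing beforehand. If the heartbeat stops well before the services stop answering, that would point towards disk I/O rather than the network side.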