- 02-14-2009 #1 Just Joined!
- Join Date
- Feb 2009
- Posts
- 4
Daily freeze, not in Xorg, RAM ok
I have a fatal Linux problem that occurs every day or so. The system slows to a bare crawl (cpu loop?), then quickly freezes (locks up).
* Restarting Xorg (with ctrl-alt-backspace) does not release the freeze.
* Memtest (17 passes) detected no memory errors.
* I have a dual-boot system, but freeze has not appeared (yet) in Windows.
* Commands do start during crawl phase, but hang. For example,
* * the shutdown command started, issued shutdown warning message, then stopped
* * ctrl-alt-backspace terminated Xorg, brought up the boot message screen (VC 1), and restarted Xorg, but left me with the scrambled screen that precedes the clock and the login box
I'd like to know how I can determine whether this problem has been fixed in 11.0. Is it a known problem? I have a bash file that dumps ps and df output every two seconds. What additional diagnostics can I run? How can I find the instructions involved in the apparent loop?
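One possible extension of the ps/df dump described above, as a rough sketch only. The function names and the log path are invented for illustration, and the `ps` options assume the procps version of `ps`. Each snapshot records the load average, free memory and swap, and the processes with the highest `ps` CPU figures together with their state and kernel wait channel; `sync` is called after each write so the last entries are more likely to reach the disk before a hard freeze.

```shell
# Print one timestamped snapshot of load, memory, and the busiest processes.
snapshot() {
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
    cat /proc/loadavg                               # load averages and run queue
    grep -E '^(MemFree|SwapFree):' /proc/meminfo    # free RAM and free swap
    # Note: the %CPU shown by ps is averaged over each process's lifetime,
    # so a loop that started recently may not stand out here.
    ps -eo pid,stat,pcpu,pmem,wchan:20,comm --sort=-pcpu | head -n 8
}

# Append a snapshot to LOGFILE every INTERVAL seconds (default 2),
# flushing to disk after each one.
log_forever() {
    local logfile=$1 interval=${2:-2}
    while true; do
        snapshot >> "$logfile"
        sync
        sleep "$interval"
    done
}
```

Usage (hypothetical path): `log_forever /var/log/freeze-watch.log 2`, started from a root shell before the slowdown begins. A process stuck in state `D` with the same wait channel across many snapshots would be worth a closer look.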
.
Summary of symptoms:
* runlevel: terminating Xorg does NOT free up the system!
* processing: system first slows to a crawl
* processing: shutdown and Xorg commands DO start, but THEN hang
* harddrive: inactive (indicator light off)
* keyboard: auto-repeat function usually stops working first
* keyboard: eventually locks up entirely
* mouse: often locks up as well
* sound: sometimes get a continuous beep or alarm-like wail
* kde: applications refuse to close normally
* kde: receive "not responding" prompt box
* kde: forced termination sometimes has no effect
* kde: if applications do terminate, wallpaper is blank -- no icons
* command-line shutdown:
* : sometimes it is possible to enter a shutdown command
* : shutdown issues "system shutdown message", then hangs
Prior posts pertaining to this FATAL Linux problem:
[opensuse] 10.2 still crashing; KDE freezes; command-line shutdown fails -- 30 Jan 2009
[opensuse] recurring freeze / lock-out while in KDE; xorg in loop? -- 23 Dec 2008
.
I ran the firmware test on the installation dvd and got four FAILS: DMI, PCI, APIC and HPET. Could these be relevant?
* DMI: Out of spec value found
* PCI: Device 000:00:02.0
* APIC: Non-legacy interrupts 0, 1, 6, 8, and 14 incorrectly edge triggered
* HPET: Failed to locate HPET base
.
Here are my system specs:
X Window System Version 7.1.99.902 (7.2.0 RC 2)
Release Date: 13 November 2006
X Protocol Version 11, Revision 0, Release 7.1.99.902
Build Operating System: openSUSE SUSE LINUX
Current Operating System: Linux linux-8ez7 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 i686
Build Date: 02 June 2008
SaX2 log : SaX2 version 8.1 - SVN Release: 1.49 2003/03/17
kde version: 3.5.5 "release 45.10"
uname gives: Linux h136 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 i686 i686 i386 GNU/Linux
Monitor:
IBM ThinkVision
Model: L170
Microcode: L170 M-L170-110
Size: 1024x768
Aspect: 4x3
Refresh rate: 47.8kHZ 59Hz
Startup log:
boot.videobios start
Patching video bios Intel 800/900 Series VBIOS Hack : version 0.5.2
Chipset: 845G
BIOS: TYPE 3
Mode Table Offset: $C0000 + $4d2
Mode Table Entries: 25
Patch mode 3c to resolution 1024x768 complete
done
'boot.videobios start' exits with status 0
Last edited by Iconoclast; 02-14-2009 at 07:56 PM. Reason: Add hardrive status
- 02-14-2009 #2 Linux Newbie
- Join Date
- Jan 2008
- Location
- UK
- Posts
- 211
Hi,
Have you checked all connections are good on the motherboard?
If so, do you have anything in your PCI slots? If you can, remove the cards and test.
Are you overclocking the motherboard? I ask because I have had similar problems.
One problem was traced to the PCI implementation on the motherboard in conjunction with the card I was using.
The other was traced to overclocking; it happened with both CPU and bus overclocking. Basically a memory read/write problem.
This can also be caused by mixed spec memory.
- 02-14-2009 #3 Linux Guru
- Join Date
- Jan 2009
- Location
- Dover, NH
- Posts
- 1,633
I've seen this happen with SuSE 10.2 a lot on medium- and low-RAM systems (less than 384MB). I don't know if it's the result of a memory leak or whatever, but the system eventually goes into what I call swap hell, where the hard drive just starts and doesn't stop. If you have xosview running at the time, you can just watch the swap area slowly climb and the paging peak out.
However, I have not found any consistency to the cause. I have a laptop with 256MB that can run 10.2 for days on end with no problem, but a Dell desktop that had 256MB in it spiraled to a slow crash every other day or so. I just upgraded that computer to 384MB and it hasn't done it since.
I'm guessing that this is a code issue specific to SuSE 10.2. I don't know exactly what causes it or the right way to fix it. This issue is why, on a friend's (almost identical) desktop, I skipped SuSE altogether and went with Xubuntu. No complaints yet.
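Without xosview, the same swap climb can be watched from a terminal. A minimal sketch, assuming the usual `SwapTotal:` and `SwapFree:` lines in `/proc/meminfo`; the function name is made up for illustration:

```shell
# Read /proc/meminfo-style text on stdin and print the swap in use, in kB.
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END           { print total - free }'
}
```

Usage: `while sleep 5; do swap_used_kb < /proc/meminfo; done`. A figure that keeps rising while the disk light stays on would fit the "swap hell" pattern; a flat figure with an idle disk would point elsewhere.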
- 02-14-2009 #4 Just Joined!
- Join Date
- Feb 2009
- Posts
- 4
I'm amazed -- I wasn't expecting such a quick response. Thank you!
I have had swap overflow problems now and then -- mainly due to zen tying up the system, I believe -- but this is different. The harddrive light is off. It has to be a cpu loop at a level lower than Xorg.
Although I have a good keyboard, I'm wondering whether it could be keyboard related -- usually, when it happens, it is just after I press a key. I'll try unplugging the keyboard when it happens again.
Would gdb (the Gnu debugger) help me to get the instruction address or module at the time of the loop?
- 02-14-2009 #5 Linux Guru
- Join Date
- Jan 2009
- Location
- Dover, NH
- Posts
- 1,633
If you think it's a CPU loop, you can open a terminal and run top at the first sign of it and see if you can't find what's taking up the juice.
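If the machine is already too sluggish for an interactive top, a batch-mode run redirected to a file may be easier to capture. A minimal sketch, assuming the procps `top`, whose batch output starts each iteration with a `top - ` header line; the function names are invented for illustration. Two iterations are requested and only the last is kept, since the first iteration has no earlier sample to compare against.

```shell
# Keep only the last iteration from `top -b -n N` output read on stdin.
last_iteration() {
    awk '/^top - / { buf = "" }          # a new iteration starts: discard the old one
         { buf = buf $0 "\n" }
         END { printf "%s", buf }'
}

# Print the top LINES lines (default 15) of a fresh sample taken over
# DELAY seconds (default 2).
top_snapshot() {
    top -b -n 2 -d "${1:-2}" | last_iteration | head -n "${2:-15}"
}
```

Usage: `top_snapshot 2 15 >> /root/top.log; sync` (the log path is only an example).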
Quote: Would gdb (the Gnu debugger) help me to get the instruction address or module at the time of the loop?
I have no idea, since I've never used it, but the debuggers I used many years ago worked on the source, not compiled machine code, so I kind of doubt it. I'll admit ignorance here, though, and leave it at that.
- 02-14-2009 #6
A tone alert may indicate CPU overheating. Check your CPU fan. Overheating can lead to a slowdown and then a shutdown.
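One way to check the overheating theory is to log what temperature the OS is being told. A rough sketch, assuming a 2.6-era kernel with ACPI enabled, where thermal zones usually appear as `/proc/acpi/thermal_zone/<zone>/temperature` with a line like `temperature:   52 C`. Zone names vary by machine, the files are absent if the BIOS exposes no thermal zone or the kernel was booted with `acpi=off`, and newer kernels use `/sys/class/thermal` instead. The function names are made up.

```shell
# Extract the number from a line such as "temperature:             52 C".
parse_acpi_temp() {
    awk '/^temperature:/ { print $2 }'
}

# Print the reading of every ACPI thermal zone that exists and is readable.
show_temps() {
    local f
    for f in /proc/acpi/thermal_zone/*/temperature; do
        [ -r "$f" ] || continue      # skip the unexpanded glob if no zone exists
        echo "$f: $(parse_acpi_temp < "$f") C"
    done
}
```

If `show_temps` prints nothing, the kernel has no reading to act on, which by itself says nothing about the real temperature; the BIOS hardware monitor screen is then the better source.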
- 02-15-2009 #7 Linux Guru
- Join Date
- Jan 2009
- Location
- Dover, NH
- Posts
- 1,633
Quote: * sound: sometimes get a continuous beep or alarm-like wail
Good catch, I didn't see that. Gogalthorp's dead on; this is typical of overheating. In some laptops, the BIOS reports the wrong temperature to the OS (offset by some 30-50 degrees). A proprietary Windows driver will be aware of this and compensate, operating the cooling fans correctly. However, that renders the default settings useless in another OS, as the fans will never turn on until your processor is already roasted (I suspect this is MS-inspired sabotage).
Edit your /boot/grub/menu.lst: on the kernel line for the default boot option, add the option acpi=off. This will leave the fans (and a few other power-saving features) under BIOS control, and they should operate correctly. The following example is from my laptop:
Code:
title openSUSE 10.2 - 2.6.18.8-0.7
    root (hd0,1)
    kernel /boot/vmlinuz-2.6.18.8-0.7-default root=/dev/hda2 vga=773 acpi=off resume=/dev/hda5 splash=silent showopts
    initrd /boot/initrd-2.6.18.8-0.7-default
This is not letter for letter what yours should read, but rather just to show about where to insert. I'm a little out of date (but it works, so I don't see a need to fix it).
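For anyone who prefers not to hand-edit, the change can be sketched as a filter. This is an illustration under assumptions, not a tested installer: the function names are invented, and unlike the advice above it touches every `kernel` line in the file (failsafe entries included), so the result should be reviewed before it replaces the real menu.lst.

```shell
# Read a GRUB legacy menu on stdin and write it to stdout with acpi=off
# appended to each "kernel" line that does not already carry it.
add_acpi_off() {
    sed -E '/^[[:space:]]*kernel[[:space:]]/ {
        /(^|[[:space:]])acpi=off([[:space:]]|$)/!s/[[:space:]]*$/ acpi=off/
    }'
}

# Succeed if the given kernel command line (default: the running kernel's)
# contains acpi=off.
booted_with_acpi_off() {
    grep -qw 'acpi=off' "${1:-/proc/cmdline}"
}
```

Usage: `add_acpi_off < /boot/grub/menu.lst > /tmp/menu.lst.new`, then `diff /boot/grub/menu.lst /tmp/menu.lst.new` before copying anything back. After a reboot, `booted_with_acpi_off && echo "ACPI is off"` confirms the option actually took effect.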
- 02-25-2009 #8 Just Joined!
- Join Date
- Feb 2009
- Posts
- 4
Thanks for the suggestion, D-cat. I did try it, and I've been testing it for the last few days. I added acpi=off to the boot parameters, and, to ensure there was no actual overheating, I changed the bios fan setting to have the fan run constantly. I also opened up the box and used an external fan to blow on the innards.
No freezes for several days: I was about to come back and report the problem solved.
Then, a few minutes ago, the freeze recurred. Once again, using ctrl-alt-backspace, I was able to kill Xorg. I got the VC 1 / startup messages screen with a "login" prompt, but nothing beyond that.
I tried disconnecting the peripherals -- keyboard, mouse, monitor, ethernet line -- to no avail. When I switched the monitor off and back on, the screen did refresh. The caps-lock light did work -- is that indicative? Apart from that, however, mouse and keyboard were dead, and Xorg was not restarting.
- 02-25-2009 #9 Linux Guru
- Join Date
- Jan 2009
- Location
- Dover, NH
- Posts
- 1,633
Well, it seems like the first problem was solved, but prior overheating may have weakened the system. I don't think it's the exact same problem because you were able to kill the X server before things went dead and you had control over the keyboard lights.
Were you able to restart the computer? Anything jump out on restart? Does it take an inordinate amount of time to start? If so, do you remember if your root partition was ext3 or Reiser?
- 03-12-2009 #10 Just Joined!
- Join Date
- Feb 2009
- Posts
- 4
It looks like the last crash was a fluke. I'm going to consider this problem solved. Thanks for the help, everyone. Sorry it's taken me so long to get back here, but I wanted to be sure the problem is not recurring.
It's good to know that Linux is still basically sound -- though the ACPI=ON code leaves room for improvement! It would be nice to get a message or prompt before the system decides to freeze up!
Just out of curiosity, I'd like to find the code where the problem occurs -- and if I were a Linux programmer, I'd pursue it further. The symptoms do narrow things down quite a bit:
* FIRST a drastic slowdown, THEN a complete freeze
* Commands entered before the complete freeze start, issue a message, and THEN hang
* Ctrl-alt-backspace kills and then attempts to restart Xorg, if I hit the keys in the slowdown phase, so problem is not in Xorg.
* No excessive CPU usage reported -- does this mean the loop is in the kernel?
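One way to probe that last question is to compare user time with system time in `/proc/stat`. A minimal sketch, assuming the standard field order on the aggregate `cpu` line (user, nice, system, idle, iowait, irq, softirq); the function names are hypothetical, and irq and softirq are counted as system time here.

```shell
# Given two "cpu ..." lines from /proc/stat (earlier one first), print the
# share of elapsed CPU time spent in user, system, iowait and idle.
cpu_split() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 8; i++) a[i] = $i }
        NR == 2 {
            for (i = 2; i <= 8; i++) { d[i] = $i - a[i]; total += d[i] }
            if (total == 0) { print "no ticks elapsed"; exit }
            printf "user %d%% system %d%% iowait %d%% idle %d%%\n",
                100 * (d[2] + d[3]) / total,
                100 * (d[4] + d[7] + d[8]) / total,
                100 * d[6] / total,
                100 * d[5] / total
        }'
}

# Take two samples SECONDS apart (default 5) and print the split.
sample_cpu() {
    local before after
    before=$(grep '^cpu ' /proc/stat)
    sleep "${1:-5}"
    after=$(grep '^cpu ' /proc/stat)
    cpu_split "$before" "$after"
}
```

Run during the slowdown phase, a high system share with a low user share would suggest the time is going to kernel code or interrupt handling rather than to any process; a mostly idle result would suggest the machine is waiting on something rather than looping.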

