Old 07-14-2005   #1 (permalink)
Just Joined!
 
Join Date: Nov 2004
Posts: 43
Where to find the errors when machine hangs up?

Hey, I wrote a module and ran it, but it made the machine hang, and I had to restart the computer.

After restarting, I typed `dmesg' and found no information. So my question is: how do you check for errors when your code makes the computer hang? Where can I find the debugging information?

Any help is appreciated.
walking is offline  


Reply With Quote
Old 07-14-2005   #2 (permalink)
Just Joined!
 
Join Date: Nov 2004
Posts: 43
Even more weirdly, when I run the same code, sometimes it makes the box hang (crashed/dead), and sometimes it works fine.
walking is offline   Reply With Quote
Old 07-14-2005   #3 (permalink)
Just Joined!
 
Join Date: Jul 2005
Posts: 4
KGDB

Maybe you can use KGDB to debug your modules.
PlayDog is offline   Reply With Quote
Old 07-14-2005   #4 (permalink)
Linux Guru
 
Join Date: Mar 2004
Location: Lat: 39:03:51N Lon: 77:14:37W
Posts: 2,397
Type

dmesg

to see the current error output; otherwise, check your logs (the location varies by distribution).

/var/log is a common one, but if things aren't booting it's not very likely that logging is working either, now is it? But that's the fun of kernel programming... I guess.
__________________
Avoid the Gates of Hell. Use Linux
A Penny for your Thoughts

Formerly Known as qub333
kkubasik is offline   Reply With Quote
Old 07-14-2005   #5 (permalink)
Just Joined!
 
Join Date: Nov 2004
Posts: 43
Quote:
Originally Posted by qub333
Type

dmesg

to see the current error output; otherwise, check your logs (the location varies by distribution).

/var/log is a common one, but if things aren't booting it's not very likely that logging is working either, now is it? But that's the fun of kernel programming... I guess.
Thanks, I found the information in /var/log/debug. But how do I read that information?
It is so hard to understand...
walking is offline   Reply With Quote
Old 07-14-2005   #6 (permalink)
Just Joined!
 
Join Date: Nov 2004
Posts: 43
Hey, I found the error information, but I'm very confused. How do I read the following information?
Thanks a lot!!



Jul 14 10:49:57 kernel: CPU: Before vendor init, caps: 0387fbff 00000000 00000000, vendor = 0
Jul 14 10:49:57 kernel: CPU: After vendor init, caps: 0387fbff 00000000 00000000 00000000
Jul 14 10:49:57 kernel: CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
Jul 14 10:49:57 kernel: CPU: Common caps: 0383fbff 00000000 00000000 00000000
Jul 14 10:49:57 kernel: CPU: Before vendor init, caps: 0383fbff 00000000 00000000, vendor = 0
Jul 14 10:49:57 kernel: CPU: After vendor init, caps: 0383fbff 00000000 00000000 00000000
Jul 14 10:49:57 kernel: CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
Jul 14 10:49:57 kernel: CPU: Common caps: 0383fbff 00000000 00000000 00000000
Jul 14 10:49:57 kernel: init IO_APIC IRQs
Jul 14 10:49:57 kernel: IO-APIC (apicid-pin) 4-0, 4-5, 4-7, 4-9, 4-11, 5-0, 5-1, 5-2, 5-3, 5-7, 5-8, 5-9, 5-10, 5-12, 5-13, 5-15 not connected.
Jul 14 10:49:57 kernel: number of MP IRQ sources: 17.
Jul 14 10:49:57 kernel: number of IO-APIC #4 registers: 16.
Jul 14 10:49:57 kernel: number of IO-APIC #5 registers: 16.
Jul 14 10:49:57 kernel: IO APIC #4......
Jul 14 10:49:57 kernel: .... register #00: 04000000
Jul 14 10:49:57 kernel: ....... : physical APIC id: 04
Jul 14 10:49:57 kernel: .... register #01: 000F0011
Jul 14 10:49:57 kernel: ....... : max redirection entries: 000F
Jul 14 10:49:57 kernel: ....... : PRQ implemented: 0
Jul 14 10:49:57 kernel: ....... : IO APIC version: 0011
Jul 14 10:49:57 kernel: .... register #02: 00000000
Jul 14 10:49:57 kernel: ....... : arbitration: 00
Jul 14 10:49:57 kernel: .... IRQ redirection table:
Jul 14 10:49:57 kernel: NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
Jul 14 10:49:57 kernel: 00 001 01 0 0 0 0 0 1 1 31
Jul 14 10:49:57 kernel: 01 001 01 0 0 0 0 0 1 1 39
Jul 14 10:49:57 kernel: 02 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 03 001 01 0 0 0 0 0 1 1 41
Jul 14 10:49:57 kernel: 04 001 01 0 0 0 0 0 1 1 49
Jul 14 10:49:57 kernel: 05 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 07 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 08 001 01 0 0 0 0 0 1 1 59
Jul 14 10:49:57 kernel: 09 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 0a 001 01 1 1 0 1 0 1 1 61
Jul 14 10:49:57 kernel: 0b 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 0c 001 01 0 0 0 0 0 1 1 69
Jul 14 10:49:57 kernel: 0d 001 01 0 0 0 0 0 1 1 71
Jul 14 10:49:57 kernel: 0e 001 01 0 0 0 0 0 1 1 79
Jul 14 10:49:57 kernel: 0f 001 01 0 0 0 0 0 1 1 81
Jul 14 10:49:57 kernel: IO APIC #5......
Jul 14 10:49:57 kernel: .... register #00: 05000000
Jul 14 10:49:57 kernel: ....... : physical APIC id: 05
Jul 14 10:49:57 kernel: .... register #01: 000F0011
Jul 14 10:49:57 kernel: ....... : max redirection entries: 000F
Jul 14 10:49:57 kernel: ....... : PRQ implemented: 0
Jul 14 10:49:57 kernel: ....... : IO APIC version: 0011
Jul 14 10:49:57 kernel: .... register #02: 01000000
Jul 14 10:49:57 kernel: ....... : arbitration: 01
Jul 14 10:49:57 kernel: .... IRQ redirection table:
Jul 14 10:49:57 kernel: NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
Jul 14 10:49:57 kernel: 00 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 01 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 02 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 03 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 04 001 01 1 1 0 1 0 1 1 89
Jul 14 10:49:57 kernel: 05 001 01 1 1 0 1 0 1 1 91
Jul 14 10:49:57 kernel: 06 001 01 1 1 0 1 0 1 1 99
Jul 14 10:49:57 kernel: 07 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 08 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 09 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 0a 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 0b 001 01 1 1 0 1 0 1 1 A1
Jul 14 10:49:57 kernel: 0c 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 0d 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: 0e 001 01 1 1 0 1 0 1 1 A9
Jul 14 10:49:57 kernel: 0f 000 00 1 0 0 0 0 0 0 00
Jul 14 10:49:57 kernel: IRQ to pin mappings:
Jul 14 10:49:57 kernel: IRQ to pin mappings:
Jul 14 10:49:57 kernel: IRQ0 -> 0:2
Jul 14 10:49:57 kernel: IRQ1 -> 0:1
Jul 14 10:49:57 kernel: IRQ3 -> 0:3
Jul 14 10:49:57 kernel: IRQ4 -> 0:4
Jul 14 10:49:57 kernel: IRQ6 -> 0:6
Jul 14 10:49:57 kernel: IRQ8 -> 0:8
Jul 14 10:49:57 kernel: IRQ10 -> 0:10
Jul 14 10:49:57 kernel: IRQ12 -> 0:12
Jul 14 10:49:57 kernel: IRQ13 -> 0:13
Jul 14 10:49:57 kernel: IRQ14 -> 0:14
Jul 14 10:49:57 kernel: IRQ15 -> 0:15
Jul 14 10:49:57 kernel: IRQ20 -> 1:4
Jul 14 10:49:57 kernel: IRQ21 -> 1:5
Jul 14 10:49:57 kernel: IRQ22 -> 1:6
Jul 14 10:49:57 kernel: IRQ27 -> 1:11
Jul 14 10:49:57 kernel: IRQ30 -> 1:14
Jul 14 10:49:57 kernel: agpgart: no supported devices found.


Thanks again.
walking is offline   Reply With Quote
Old 07-15-2005   #7 (permalink)
Linux Guru
 
Join Date: Oct 2001
Location: Täby, Sweden
Posts: 7,578
Those aren't errors, those are just normal operational messages of the kernel setting up APICs, routing IRQs and so forth.

The thing is, you cannot save a panic message to a file. Since the kernel panics, it doesn't run any processes, filesystem drivers or disk device drivers anymore. If it did, then the kernel corruption that caused the panic could potentially corrupt the filesystem as well.

The only generic way to save a panic message is to take a photograph of your monitor. There are patches to send them over a serial link or save to a floppy disk as well, if you can find them.

Of course, you'll have to run the system in text mode for the panic message to even get printed on the screen.
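
As a rough illustration of the distinction drawn in this reply, a saved log can be scanned for the phrases that kernel oops and panic reports typically contain, instead of reading every boot message. The sketch below is a user-space C helper. The function name looks_like_kernel_error and the marker list are assumptions made for illustration, and the list is not exhaustive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Substrings that commonly appear in kernel oops/panic reports.
 * This list is an illustrative assumption, not a complete catalogue. */
static const char *const error_markers[] = {
    "Oops",
    "Kernel panic",
    "kernel BUG at",
    "Unable to handle kernel",
};

/* Return 1 if the log line contains one of the markers above, else 0.
 * Passing NULL is treated as a programming error. */
int looks_like_kernel_error(const char *line)
{
    size_t i;

    assert(line != NULL);
    for (i = 0; i < sizeof(error_markers) / sizeof(error_markers[0]); i++) {
        if (strstr(line, error_markers[i]) != NULL)
            return 1;
    }
    return 0;
}
```

Lines such as "kernel: init IO_APIC IRQs" do not match any marker, which fits the point that they are routine messages. As the reply notes, a hard hang may leave nothing in the log at all, so an empty result does not prove the module is correct.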
Dolda2000 is offline   Reply With Quote
Old 07-15-2005   #8 (permalink)
Just Joined!
 
Join Date: Nov 2004
Posts: 43
Quote:
Originally Posted by Dolda2000
Those aren't errors, those are just normal operational messages of the kernel setting up APICs, routing IRQs and so forth.

The thing is, you cannot save a panic message to a file. Since the kernel panics, it doesn't run any processes, filesystem drivers or disk device drivers anymore. If it did, then the kernel corruption that caused the panic could potentially corrupt the filesystem as well.

The only generic way to save a panic message is to take a photograph of your monitor. There are patches to send them over a serial link or save to a floppy disk as well, if you can find them.

Of course, you'll have to run the system in text mode for the panic message to even get printed on the screen.
Hi, thank you so much for your reply! However, I don't understand "take a photograph of your monitor". Could you please say more? I am running the module in text mode. Sometimes my module works, and sometimes it makes the computer crash or freeze with no response.

I can't see any information on the console or in the log files after restarting.
I even modified the printk file so that all the information would appear on the
console. However, I still didn't see anything. The computer just hung
and froze, with no response at all...
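
Assuming "the printk file" here means /proc/sys/kernel/printk, its four numbers are, in order, the console log level, the default message log level, the minimum console log level and the default console log level. A message is printed on the console only when its level is numerically lower than the console log level. The following user-space C sketch (the struct and function names are hypothetical) parses such a line and checks whether a message of a given level would be shown:

```c
#include <assert.h>
#include <stdio.h>

/* The four values found in /proc/sys/kernel/printk, in order. */
struct printk_levels {
    int console_loglevel;         /* messages with a lower number reach the console */
    int default_message_loglevel; /* level used by printk() calls that give none */
    int minimum_console_loglevel; /* lowest value console_loglevel may be set to */
    int default_console_loglevel; /* default value of console_loglevel */
};

/* Parse a line such as "4 4 1 7".
 * Returns 0 on success, -1 if four integers could not be read. */
int parse_printk_levels(const char *text, struct printk_levels *out)
{
    assert(text != NULL && out != NULL);
    if (sscanf(text, "%d %d %d %d",
               &out->console_loglevel,
               &out->default_message_loglevel,
               &out->minimum_console_loglevel,
               &out->default_console_loglevel) != 4)
        return -1;
    return 0;
}

/* Return 1 if a message at message_level would be printed on the console. */
int reaches_console(int message_level, const struct printk_levels *levels)
{
    assert(levels != NULL);
    return message_level < levels->console_loglevel;
}
```

With the content "4 4 1 7", for instance, a KERN_ERR message (level 3) reaches the console while a KERN_DEBUG message (level 7) does not; with "8 4 1 7" every level is shown. This only covers messages the kernel manages to print before the machine locks up.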
walking is offline   Reply With Quote
Old 07-15-2005   #9 (permalink)
Just Joined!
 
Join Date: Nov 2004
Posts: 43
BTW: I found the following information in /var/log/syslog

/usr/sbin/exim -a -f /etc/exim/exim.conf ]; then /usr/sbin/exim -q ; fi)

Jul 14 17:23:01 /USR/SBIN/CRON[1074]: (mail) CMD ( if [ -x /usr/sbin/exim -a -f /etc/exim/exim.conf ]; then /usr/sbin/exim -q ; fi)

Jul 14 17:38:01 /USR/SBIN/CRON[1079]: (mail) CMD ( if [ -x /usr/sbin/exim -a -f /etc/exim/exim.conf ]; then /usr/sbin/exim -q ; fi)

------

What are they? Are they errors? Thanks again!
walking is offline   Reply With Quote
Old 07-15-2005   #10 (permalink)
Linux Guru
 
Join Date: Oct 2001
Location: Täby, Sweden
Posts: 7,578
Those aren't errors -- they're just info messages from various system services, in this case cron running a couple of programs.

I find it strange indeed that you don't get a panic message on the text console. Maybe the machine hangs before the kernel even gets to detect that, although I haven't seen anything like that previously. What does your module do, really? If you printk() some messages immediately upon module initialization, would they appear on the text console?
Dolda2000 is offline   Reply With Quote