Where to find the errors when machine hangs up?

07-15-2005   #11, posted by walking
Quote:
Originally Posted by Dolda2000
Those aren't errors -- they're just info messages from various system services, in this case cron running a couple of programs.

I find it strange indeed that you don't get a panic message on the text console. Maybe the machine hangs before the kernel even gets to detect that, although I haven't seen anything like that previously. What does your module do, really? If you printk() some messages immediately upon module initialization, would they appear on the text console?
Thank you very much! I am sending packets with my modified module, which works
like UDP/IP. Sometimes the packets go out smoothly, and sometimes sending crashes the
computer (only the source node). Yesterday I found that it sometimes causes a
"segmentation fault" instead of hanging, which is much better for me. Here is the error:


Code:
Unable to handle kernel NULL pointer dereference at virtual address 0000007f
 printing eip:
d0861472
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<d0861472>]    Not tainted
EFLAGS: 00010297
eax: 00000073   ebx: cfa16964   ecx: 00000000   edx: cfc22800
esi: cfafc140   edi: cfc22800   ebp: cf05dcd4   esp: cf05dcb4
ds: 0018   es: 0018   ss: 0018
Process my_test (pid: 264, stackpage=cf05d000)
Stack: cf05dcdc d085f153 cfafc140 08049578 cf046c2e 00000020 cfafc140 0000000e
       cf05dd14 d0860bf9 cfa16d40 cfafc140 ca000000 d0860b29 ca000000 cfa16d40
       00000001 d08614f5 d0865923 d0865967 00000000 ffffffff 00000001 0005df08
Call Trace: [<d085f153>] [<d0860bf9>] [<d0860b29>] [<d08614f5>] [<d0865923>]
   [<d0865967>] [<d0860630>] [<d08636b1>] [<d0863ede>] [<d0863d04>] [<d085f564>]
   [<c01ff3f5>] [<c02002f5>] [<c0112d2a>] [<c0112d2a>] [<c0200afb>] [<c0106e5b>]
However, I don't think I access a NULL pointer in my module. Is there any kernel function
I used wrongly? Can I get some information from the above errors? I read the file under the directory /var/log/ksymoops trying to find the meaning of the xxxxxxx codes above, but I couldn't work them out.

What's more, when I tried to remove the module, it said "Device or resource busy". Why did this happen? Thanks a lot!

Thanks again!

Mod edit - added code tags.
07-15-2005   #12, posted by Dolda2000
Quote:
Originally Posted by walking
However, I don't think I access a NULL pointer in my module.
How can you be so sure? Is it really impossible that you stored 0 or NULL in a pointer somewhere, or alternatively initialized some structure insufficiently, and then dereferenced it?

Quote:
Originally Posted by walking
Is there any kernel function I used wrongly? Can I get some information from the above errors?
The first thing you should do is recompile the kernel with debugging support. That way, the panic handler will be able to print symbolic function names instead of addresses in the backtrace. That'll make it much easier to debug.
Quote:
Originally Posted by walking
I read the file under the directory /var/log/ksymoops trying to find the meaning of the xxxxxxx codes above, but I couldn't work them out.
That could just be because the addresses were inside your module, in which case they won't be found by ksymoops. Anyhow, like I said, recompiling the kernel with debug support will solve that part of the problem.
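
One rough way to check this from the oops itself, offered as a sketch rather than a definitive tool: compare each call-trace address with the range of the core kernel text. The bounds below are placeholder assumptions; on a real system they would come from the `_text` and `_etext` entries in that kernel's System.map. Addresses outside the range, like the d08xxxxx ones in the trace, are probably inside a loaded module.

```c
#include <stdio.h>

/* ASSUMPTION: placeholder bounds for the core kernel text. The real values
 * are the _text and _etext symbols in the running kernel's System.map. */
#define KERNEL_TEXT_START 0xc0100000UL
#define KERNEL_TEXT_END   0xc0300000UL

/* Returns 1 if addr lies in the core kernel text, 0 otherwise. */
int in_kernel_text(unsigned long addr)
{
    return addr >= KERNEL_TEXT_START && addr < KERNEL_TEXT_END;
}

/* Counts how many addresses fall outside the core kernel text. */
int count_outside(const unsigned long *addrs, int n)
{
    int i, outside = 0;

    for (i = 0; i < n; i++)
        if (!in_kernel_text(addrs[i]))
            outside++;
    return outside;
}

/* Prints one line per address saying where it probably belongs. */
void classify_trace(const unsigned long *addrs, int n)
{
    int i;

    for (i = 0; i < n; i++)
        printf("%08lx  %s\n", addrs[i],
               in_kernel_text(addrs[i]) ? "core kernel" : "probably a module");
}
```

Feeding it the addresses from the Call Trace line would separate the module frames from the c0xxxxxx kernel frames, which are the ones a symbol lookup can resolve.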

Quote:
Originally Posted by walking
What's more, when I tried to remove the module, it said "Device or resource busy". Why did this happen?
Probably because of some refcount mismatch. Are you releasing all resources properly? Either way, if you're segfaulting inside the kernel, there's really no telling what may happen.
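
A mismatch like that usually comes from an exit path that returns without dropping a reference taken earlier. The user-space sketch below only illustrates the pattern; `use_count`, `do_send`, `send_buggy`, `send_fixed` and `can_unload` are hypothetical names, not kernel APIs.

```c
#include <errno.h>

/* Hypothetical stand-in for a module's use count. */
static int use_count;

/* Hypothetical operation that can fail on request. */
static int do_send(int fail)
{
    return fail ? -EIO : 0;
}

/* Buggy: the error path returns while the count is still raised. */
int send_buggy(int fail)
{
    use_count++;
    if (do_send(fail) < 0)
        return -EIO;        /* BUG: missing use_count-- */
    use_count--;
    return 0;
}

/* Fixed: every path drops the reference it took. */
int send_fixed(int fail)
{
    int ret;

    use_count++;
    ret = do_send(fail);
    use_count--;
    return ret;
}

/* Mimics the unload check: refuse while anything still holds a reference. */
int can_unload(void)
{
    return use_count == 0 ? 0 : -EBUSY;
}
```

After one failed `send_buggy()` call the count never returns to zero, so `can_unload()` keeps reporting `-EBUSY`, which is the "Device or resource busy" symptom.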
07-16-2005   #13, posted by walking
Thank you so much for your reply! I really appreciate it.

Quote:
Originally Posted by Dolda2000
How can you be so sure? Is it really impossible that you stored 0 or NULL in a pointer somewhere, or alternatively initialized some structure insufficiently, and then dereferenced it?
I found the error that makes the kernel panic. I was trying to print "skb->mac.ethernet->h_proto", and it is probably a NULL pointer sometimes. I am not
very sure, as I haven't read how the MAC header works. However, after I removed the printk
statement, there was no segmentation fault any more.
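
The numbers in the earlier oops are consistent with this. In an Ethernet header, two 6-byte addresses come before the protocol field, so h_proto sits 12 bytes (0xc) into the header, and eax (00000073) plus 0xc gives exactly the faulting address 0000007f. That suggests the header pointer held a small garbage value rather than exactly NULL, perhaps because it had not been set yet on the send path. Below is a minimal user-space sketch of that arithmetic and of a possible guard; `eth_hdr_sketch` is a hypothetical mirror of the kernel's `struct ethhdr`, and the guard is an assumption about what a check could look like, not a kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the kernel's struct ethhdr layout. */
struct eth_hdr_sketch {
    unsigned char h_dest[6];    /* destination MAC address */
    unsigned char h_source[6];  /* source MAC address */
    uint16_t      h_proto;      /* protocol ID field */
};

/* Address the CPU would read for hdr->h_proto, given the raw pointer value. */
unsigned long h_proto_address(unsigned long hdr_ptr)
{
    return hdr_ptr + offsetof(struct eth_hdr_sketch, h_proto);
}

/* Sketch of a guard: trust a header pointer only if it lies inside the
 * packet buffer [buf_start, buf_end) with room for a full header. */
int header_ptr_plausible(unsigned long hdr_ptr,
                         unsigned long buf_start, unsigned long buf_end)
{
    return hdr_ptr >= buf_start &&
           hdr_ptr <= buf_end &&
           buf_end - hdr_ptr >= sizeof(struct eth_hdr_sketch);
}
```

In the module, the equivalent check would presumably compare skb->mac.raw against skb->head and skb->end before the printk, assuming a 2.4-style sk_buff. A plain NULL test would not have caught a value like 0x73.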


However, that brings me back to the original question. Sometimes when I send traffic through the module,
everything works. Sometimes, though, the whole machine hangs, with no kernel panic
information in the log file or on the console. To find where the problem is,
I put printk statements in every function. But with the printk statements enabled,
the machine doesn't hang (or at least hangs far less often than with them disabled). Why could that be?
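
One possible explanation, offered only as a guess, is a timing-dependent bug such as a race: printk is slow enough to change the timing and hide it. A lighter alternative is to record events in a preallocated ring buffer and print them only afterwards. The following is a minimal user-space sketch of that idea. All names are hypothetical, and a kernel version would also need locking or per-CPU buffers, plus some way to read the buffer back after a hang.

```c
#include <stdio.h>

#define TRACE_SLOTS 8   /* hypothetical size; tune to taste */

struct trace_entry {
    int  id;      /* which trace point fired */
    long value;   /* one value worth recording, e.g. a packet length */
};

static struct trace_entry trace_buf[TRACE_SLOTS];
static unsigned long trace_next;   /* total number of events recorded so far */

/* Record one event; overwrites the oldest entry once the buffer is full. */
void trace_event(int id, long value)
{
    struct trace_entry *e = &trace_buf[trace_next % TRACE_SLOTS];

    e->id = id;
    e->value = value;
    trace_next++;
}

/* Number of entries currently held (at most TRACE_SLOTS). */
unsigned long trace_count(void)
{
    return trace_next < TRACE_SLOTS ? trace_next : TRACE_SLOTS;
}

/* Returns the i-th oldest entry still in the buffer (0 = oldest). */
const struct trace_entry *trace_get(unsigned long i)
{
    unsigned long first = trace_next - trace_count();

    return &trace_buf[(first + i) % TRACE_SLOTS];
}

/* Print the buffer, oldest first; call this outside the hot path. */
void trace_dump(void)
{
    unsigned long i;

    for (i = 0; i < trace_count(); i++) {
        const struct trace_entry *e = trace_get(i);
        printf("trace id=%d value=%ld\n", e->id, e->value);
    }
}
```

Each `trace_event()` call is just two stores and an increment, so it should disturb timing and throughput far less than a printk in every function.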

I don't want to leave the printk statements enabled, because they really affect my measurements,
such as throughput.

Any advice? Thanks a lot!