- 02-09-2010 #1Just Joined!
- Join Date
- Feb 2010
- Posts
- 2
Converting a device driver for real time kernel
What is the best way of converting an existing device driver to work under a real-time kernel?
A device driver provided for a special serial card (Curtiss Wright SL240 sFDFP card) works under vanilla Linux (SLES 10 SP2, kernel 2.6.16.60-0.21-smp), but when the driver is compiled and loaded under the real-time kernel (2.6.22.19-0.14-rt), the system locks up and crashes. The RT kernel is being used to write large amounts of data from the serial card to disk, and testing has found that it gives the most consistent write performance.
I have gone back to the supplier and they are not interested in developing a real time version of the driver. So I am trying to modify the driver myself. The driver code can be provided if anyone wants a look (released under GPLv2).
Any assistance would be useful.
- 02-09-2010 #2Just Joined!
- Join Date
- Jul 2009
- Posts
- 49
when you say Real Time kernel...
The RT patches to the Linux kernel add some extra API calls so that drivers can take advantage of additional scheduling features, high-resolution timers and such, but the result is not an actual real-time kernel. To get true real time you need either the RTAI Linux kernel or a move to VxWorks (ka-ching $$$$), QNX, OSE, etc.
I suspect that you are running into a general compatibility issue rather than an actual real-time issue, since your driver was never written to use the RT API calls.
You need to start sending single characters to the I/F and debugging the code using printk() and the mighty panic() if it still dies without a whimper. Welcome to kernel debugging. Good Luck.
Cheers!!
- 02-11-2010 #3Just Joined!
- Join Date
- Feb 2010
- Posts
- 2
Thanks for all the suggestions.
I have uploaded the driver code, if anyone wants a look:
dl.dropbox.com/u/4618054/nsl_driver.zip
Using the existing RT kernel, I have enabled debugging in the driver and the crash output from dmesg is below.
START dmesg dump
GTDBG: dcfiIoctl()
GTDBG: <4>dcfiIoctlXfer begin
GTDBG: dcfiXferChunk() begin
GTDBG: dcfiXferChunk() end
GTDBG: <4>dcfiIoctlXfer end
GTDBG: dcfiIoctl() exit...
GTDBG: dcfiIoctl()
GTDBG: <4>dcfiIoctlXfer begin
stopped custom tracer.
Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
<6>GTDBG: dcfiXferChunk() end
[<ffffffff80255b58>] lock_hrtimer_base+0x18/0x60 PGD 5848c7067 PUD 584d19067 PMD 0
Oops: 0000 [1] PREEMPT SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
CPU 1
Modules linked in: dcfi_nsl_module nfsd exportfs lockd nfs_acl sunrpc ipv6 button sg battery ac st apparmor<6>GTDBG: <4>dcfiIoctlXfer end xfs loop dm_mod usbhid hid ff_memless generic nx_nic ide_core ipmi_si uhci_hcd ehci_hcd usbcore ipmi_msghandler firmware_class rtc_cmos rtc_core rtc_lib reiserfs edd fan thermal processor cciss ata_piix libata sd_mod scsi_mod
Pid: 7023, comm: nsltp Tainted: G N 2.6.22.19-0.14-rt #1
RIP: 0010:[<ffffffff80255b58>] [<ffffffff80255b58>] lock_hrtimer_base+0x18/0x60
RSP: 0018:ffff81057d9afb38 EFLAGS: 00010292
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff810308c50b30
RDX: ffff81058c080040 RSI: ffff81057d9afb68 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffff81057d9ae000 R09: 0000000000000000
R10: 0000000000000000 R11: ffffffff8021cec0 R12: ffff81057d9afb68
R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000292
FS: 0000000041001940(0063) GS:ffff81059762e5c0(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000028 CR3: 000000058d250000 CR4: 00000000000006e0 Process nsltp (pid: 7023, threadinfo ffff81057d9ae000, task ffff81058c080040)
Stack: 0000000000000000 00000000ffffffff 0000000000000004 ffffffff80255c08
0000000000000004 ffffffff8025e97d 0000000000400040 0000000000000000<6>GTDBG: dcfiIoctl() exit...
ffff810308c50b08 ffffffff80255c76 0000000000000000 ffffffff80468cd2 Call Trace:
GTDBG: dcfiXferChunk() end
GTDBG: <4>dcfiIoctlXfer end
GTDBG: dcfiIoctl() exit...
[<ffffffff80255c08>] hrtimer_try_to_cancel+0x18/0x70
GTDBG: dcfiIoctl()
[<ffffffff8025e97d>] do_try_to_take_rt_mutex+0x16d/0x1b0
[<ffffffff80255c76>] hrtimer_cancel+0x16/0x20
GTDBG: <4>dcfiIoctlXfer begin
GTDBG: dcfiXferChunk() begin
[<ffffffff80468cd2>] rt_mutex_slowlock+0x242/0x2d8 [<ffffffff8025f4fd>] rt_down+0x1d/0x60 [<ffffffff8843c5cd>] :dcfi_nsl_module:dcfiIoctlXfer+0x4bd/0x6a0
GTDBG: dcfiIoctl()
GTDBG: <4>dcfiIoctlXfer begin
GTDBG: dcfiXferChunk() begin
[<ffffffff8843c965>] :dcfi_nsl_module:dcfiIoctl+0x1b5/0x440
[<ffffffff8025e97d>] do_try_to_take_rt_mutex+0x16d/0x1b0
[<ffffffff80468c26>] rt_mutex_slowlock+0x196/0x2d8 [<ffffffff8843c7b0>] :dcfi_nsl_module:dcfiIoctl+0x0/0x440
[<ffffffff802bb8b3>] do_ioctl+0x93/0xe0 [<ffffffff802bb974>] vfs_ioctl+0x74/0x2d0 [<ffffffff802bbc65>] sys_ioctl+0x95/0xb0 [<ffffffff8020a04e>] system_call+0x7e/0x83
Code: 48 8b 5d 28 48 85 db 74 33 48 8b 3b e8 87 3b 21 00 49 89 04 RIP [<ffffffff80255b58>] lock_hrtimer_base+0x18/0x60 RSP <ffff81057d9afb38>
CR2: 0000000000000028
GTDBG: dcfiXferChunk() end
GTDBG: <4>dcfiIoctlXfer end
GTDBG: dcfiIoctl() exit...
GTDBG: dcfiXferChunk() end
GTDBG: <4>dcfiIoctlXfer end
GTDBG: dcfiIoctl() exit...
END dmesg dump
- 02-11-2010 #4Just Joined!
- Join Date
- Jul 2009
- Posts
- 49
So it died at this point. You can get the actual source line in the module by using the "objdump" command to dump the source code and assembly code intermixed. If you are cross-compiling for another architecture (by "another" I mean building on Intel/AMD for something like a PPC or a Blackfin), the objdump command name will have the architecture name prepended or appended.
I have an alias for the objdump command to get all of the switches turned on for the dump:
alias sd='objdump -Sldw'
I run this command against the installed module that is crashing and get the source code and the assembly code intermixed. You will see that the addresses of the functions are included. The kernel dump gives you the offset into the function where it died, along with the length of the function (symbol+0xoffset/0xlength). You just need to add the offset to the function address shown by objdump and you will have the exact assembly line it died at. Looking just above that assembly line in the objdump output, there should be a file name and line number for the source code that corresponds to it.
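As a rough illustration of that arithmetic, here is a minimal Python sketch. It splits a frame such as dcfiIoctlXfer+0x4bd/0x6a0 (taken from the trace above) into its offset and length, then adds the offset to a function start address. The start address 0x110 is a made-up value for illustration only; use whatever objdump prints for the function in your own build. The helper names parse_frame and target_address are hypothetical as well.

```python
import re

# Matches the "symbol+0xOFFSET/0xLENGTH" form used in the oops trace.
FRAME_RE = re.compile(r"(\w+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)")


def parse_frame(frame):
    """Return (symbol, offset, length) for a 'symbol+0xOFF/0xLEN' string."""
    m = FRAME_RE.fullmatch(frame.strip())
    if m is None:
        raise ValueError("not a symbol+offset/length frame: %r" % frame)
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


def target_address(function_start, frame):
    """Add the frame's offset to the function start address from objdump."""
    symbol, offset, length = parse_frame(frame)
    if offset > length:
        raise ValueError("offset lies outside %s" % symbol)
    return function_start + offset


if __name__ == "__main__":
    # 0x110 is an assumed objdump start address, not a value from the real module.
    print(hex(target_address(0x110, "dcfiIoctlXfer+0x4bd/0x6a0")))
```

Note that entries in the call trace are return addresses, so for those frames the call itself is the instruction just before the computed address.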
Your module is dying in a kernel API call, so unless you look at the kernel source you won't be able to trace the actual source line. You do have the stack trace, though, so you can see where your module makes the call that ends up in lock_hrtimer_base(). I would say that the low-level API has changed and your driver needs to be updated. Your problems are occurring in the use of the hrtimers (High Resolution Timers), which are not in fact really part of the RT patches (I misspoke earlier) but are a separate patch.
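To pick the module's own frames out of a noisy oops, something like the following Python sketch may help. It assumes the frame format seen in the dump above ([<address>] :module:symbol+0xoffset/0xlength); the function name module_frames is hypothetical, and the sample lines are copied from the trace.

```python
import re

# One module frame: [<address>] :module:symbol+0xoffset/0xlength
MODULE_FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+:(\w+):(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def module_frames(oops_text, module):
    """Return (symbol, address, offset, length) for each frame in `module`."""
    frames = []
    for m in MODULE_FRAME_RE.finditer(oops_text):
        address, mod, symbol, offset, length = m.groups()
        if mod == module:
            frames.append(
                (symbol, int(address, 16), int(offset, 16), int(length, 16))
            )
    return frames


SAMPLE = """\
[<ffffffff80468cd2>] rt_mutex_slowlock+0x242/0x2d8 [<ffffffff8025f4fd>] rt_down+0x1d/0x60 [<ffffffff8843c5cd>] :dcfi_nsl_module:dcfiIoctlXfer+0x4bd/0x6a0
[<ffffffff8843c965>] :dcfi_nsl_module:dcfiIoctl+0x1b5/0x440
"""

if __name__ == "__main__":
    for symbol, address, offset, length in module_frames(SAMPLE, "dcfi_nsl_module"):
        # address - offset is where the function was loaded at run time.
        print(symbol, hex(offset), "loaded at", hex(address - offset))
```

Kernel frames such as rt_down are skipped because they carry no :module: prefix, which leaves only the driver functions to look up in the objdump output.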
Cheers!!



