  1. #1
    Just Joined!
    Join Date
    Feb 2010
    Posts
    2

    Converting a device driver for real time kernel


    What is the best way of converting an existing device driver to work under a real-time kernel?

    A device driver provided for a special serial card (Curtiss Wright SL240 sFDFP card) works under vanilla Linux (SLES 10 SP2, kernel 2.6.16.60-0.21-smp), but when the driver is compiled and used under the real-time kernel (2.6.22.19-0.14-rt), the system locks up and crashes. The rt kernel is being used to write large amounts of data from the serial card to disk, and testing has found that the rt kernel gives the most consistent write performance.

    I have gone back to the supplier and they are not interested in developing a real time version of the driver. So I am trying to modify the driver myself. The driver code can be provided if anyone wants a look (released under GPLv2).

    Any assistance would be useful.

  2. #2
    Just Joined!
    Join Date
    Jul 2009
    Posts
    49

    when you say Real Time kernel...

    The RT patches to the Linux kernel add some more API calls so that drivers can take advantage of extra scheduling features, high resolution timers and such, but the result is not an actual real-time kernel. To get true real time you need either the RTAI Linux kernel or a move to VxWorks (ka-ching$$$$), QNX, OSE, etc.

    I suspect that you are running into a general compatibility issue rather than an actual real-time issue, since your driver was never written to use the RT API calls.

    You need to start by sending single characters to the interface and debugging the code using printk(), and the mighty panic() if it still dies without a whimper. Welcome to kernel debugging. Good luck.

    Cheers!!

  3. #3
    Just Joined!
    Join Date
    Feb 2010
    Posts
    2
    Thanks for all the suggestions.

    I have uploaded the driver code, if anyone wants a look:
    dl.dropbox.com/u/4618054/nsl_driver.zip

    Using the existing RT kernel, I have enabled debugging in the driver and the crash output from dmesg is below.

    START dmesg dump

    GTDBG: dcfiIoctl()
    GTDBG: <4>dcfiIoctlXfer begin
    GTDBG: dcfiXferChunk() begin
    GTDBG: dcfiXferChunk() end
    GTDBG: <4>dcfiIoctlXfer end
    GTDBG: dcfiIoctl() exit...
    GTDBG: dcfiIoctl()
    GTDBG: <4>dcfiIoctlXfer begin
    stopped custom tracer.
    Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
    <6>GTDBG: dcfiXferChunk() end
    [<ffffffff80255b58>] lock_hrtimer_base+0x18/0x60 PGD 5848c7067 PUD 584d19067 PMD 0
    Oops: 0000 [1] PREEMPT SMP
    last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
    CPU 1
    Modules linked in: dcfi_nsl_module nfsd exportfs lockd nfs_acl sunrpc ipv6 button sg battery ac st apparmor<6>GTDBG: <4>dcfiIoctlXfer end xfs loop dm_mod usbhid hid ff_memless generic nx_nic ide_core ipmi_si uhci_hcd ehci_hcd usbcore ipmi_msghandler firmware_class rtc_cmos rtc_core rtc_lib reiserfs edd fan thermal processor cciss ata_piix libata sd_mod scsi_mod
    Pid: 7023, comm: nsltp Tainted: G N 2.6.22.19-0.14-rt #1
    RIP: 0010:[<ffffffff80255b58>] [<ffffffff80255b58>] lock_hrtimer_base+0x18/0x60
    RSP: 0018:ffff81057d9afb38 EFLAGS: 00010292
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff810308c50b30
    RDX: ffff81058c080040 RSI: ffff81057d9afb68 RDI: 0000000000000000
    RBP: 0000000000000000 R08: ffff81057d9ae000 R09: 0000000000000000
    R10: 0000000000000000 R11: ffffffff8021cec0 R12: ffff81057d9afb68
    R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000292
    FS: 0000000041001940(0063) GS:ffff81059762e5c0(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000028 CR3: 000000058d250000 CR4: 00000000000006e0 Process nsltp (pid: 7023, threadinfo ffff81057d9ae000, task ffff81058c080040)
    Stack: 0000000000000000 00000000ffffffff 0000000000000004 ffffffff80255c08
    0000000000000004 ffffffff8025e97d 0000000000400040 0000000000000000<6>GTDBG: dcfiIoctl() exit...

    ffff810308c50b08 ffffffff80255c76 0000000000000000 ffffffff80468cd2 Call Trace:
    GTDBG: dcfiXferChunk() end
    GTDBG: <4>dcfiIoctlXfer end
    GTDBG: dcfiIoctl() exit...
    [<ffffffff80255c08>] hrtimer_try_to_cancel+0x18/0x70
    GTDBG: dcfiIoctl()
    [<ffffffff8025e97d>] do_try_to_take_rt_mutex+0x16d/0x1b0
    [<ffffffff80255c76>] hrtimer_cancel+0x16/0x20
    GTDBG: <4>dcfiIoctlXfer begin
    GTDBG: dcfiXferChunk() begin
    [<ffffffff80468cd2>] rt_mutex_slowlock+0x242/0x2d8 [<ffffffff8025f4fd>] rt_down+0x1d/0x60 [<ffffffff8843c5cd>] :dcfi_nsl_module:dcfiIoctlXfer+0x4bd/0x6a0
    GTDBG: dcfiIoctl()
    GTDBG: <4>dcfiIoctlXfer begin
    GTDBG: dcfiXferChunk() begin
    [<ffffffff8843c965>] :dcfi_nsl_module:dcfiIoctl+0x1b5/0x440
    [<ffffffff8025e97d>] do_try_to_take_rt_mutex+0x16d/0x1b0
    [<ffffffff80468c26>] rt_mutex_slowlock+0x196/0x2d8 [<ffffffff8843c7b0>] :dcfi_nsl_module:dcfiIoctl+0x0/0x440
    [<ffffffff802bb8b3>] do_ioctl+0x93/0xe0 [<ffffffff802bb974>] vfs_ioctl+0x74/0x2d0 [<ffffffff802bbc65>] sys_ioctl+0x95/0xb0 [<ffffffff8020a04e>] system_call+0x7e/0x83


    Code: 48 8b 5d 28 48 85 db 74 33 48 8b 3b e8 87 3b 21 00 49 89 04 RIP [<ffffffff80255b58>] lock_hrtimer_base+0x18/0x60 RSP <ffff81057d9afb38>
    CR2: 0000000000000028
    GTDBG: dcfiXferChunk() end
    GTDBG: <4>dcfiIoctlXfer end
    GTDBG: dcfiIoctl() exit...
    GTDBG: dcfiXferChunk() end
    GTDBG: <4>dcfiIoctlXfer end
    GTDBG: dcfiIoctl() exit...

    END dmesg dump

  5. #4
    Just Joined!
    Join Date
    Jul 2009
    Posts
    49
    Quote Originally Posted by aleggo View Post
    Thanks for all the suggestion.
    [<ffffffff80255b58>] lock_hrtimer_base+0x18/0x60 PGD 5848c7067 PUD 584d19067 PMD 0

    END dmesg dump
    So it died at this point. You can find the corresponding source line in the module by using the "objdump" command to dump the source code and assembly code intermixed. If you are cross-compiling for another architecture (for example, building on Intel/AMD for something like a PPC or a Blackfin), the objdump command name will carry the architecture name as a prefix or suffix.

    I have an alias for the objdump command to get all of the switches turned on for the dump:
    alias sd='objdump -Sldw'

    I run this command against the installed module that is crashing, and the output shows the source code and the assembly code intermixed, including the address of each function. The kernel dump gives you the offset into the function that it died in, along with the length of the function. Add the offset to the function's address in the objdump output and you will see the exact assembly instruction it died at. Just above that instruction in the objdump output there should be the file name, line number and source code that correspond to it.
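
    The offset arithmetic described above can be sketched in a few lines of Python. This is only an illustration: parse_symbol and objdump_address are hypothetical helper names, and the function start address 0x1000 is a made-up value standing in for whatever objdump prints for the function in your build.

```python
import re

# Matches kernel oops symbols such as "lock_hrtimer_base+0x18/0x60" or
# ":dcfi_nsl_module:dcfiIoctlXfer+0x4bd/0x6a0" (module prefix is optional).
SYMBOL_RE = re.compile(
    r"(?::(?P<module>\w+):)?(?P<func>\w+)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_symbol(text):
    """Return (module or None, function, offset, size) from an oops symbol."""
    match = SYMBOL_RE.search(text)
    if match is None:
        raise ValueError("no symbol+offset/size found in %r" % text)
    return (
        match.group("module"),
        match.group("func"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


def objdump_address(func_start, offset, size):
    """Add the oops offset to the function start address shown by objdump."""
    if offset > size:
        raise ValueError("offset lies outside the function")
    return func_start + offset


if __name__ == "__main__":
    module, func, offset, size = parse_symbol(
        ":dcfi_nsl_module:dcfiIoctlXfer+0x4bd/0x6a0"
    )
    # 0x1000 is an assumed start address; read the real one from objdump.
    print(func, hex(objdump_address(0x1000, offset, size)))
```

    For frames in the call trace (as opposed to the faulting RIP), the recorded address is a return address, so the call instruction of interest is the one just before the computed address.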

    Your module is dying inside a kernel API call, so unless you look at the kernel source you won't be able to trace the actual source code line. But you do have the stack trace, so you can see where in your module the call that leads to lock_hrtimer_base() is made. I would say that the low-level API has changed and your driver needs to be updated. Your problems are occurring in the use of the hrtimers (High Resolution Timers), which are not in fact part of the RT patches (I misspoke earlier) but are a separate patch.
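
    To pick the driver's own frames out of an interleaved dump like the one posted, something along these lines may help. It is a rough sketch, assuming the frame format seen in the posted trace ("[<address>] :module:function+0xoffset/0xsize"); module_frames is a hypothetical helper, and the sample text is copied from two lines of that trace.

```python
import re

# One call-trace entry: "[<addr>] func+0xoff/0xsize", optionally with a
# ":module:" prefix in front of the function name.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(?::(\w+):)?(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def module_frames(oops_text, module):
    """Return (function, offset, size) for each trace frame in `module`."""
    frames = []
    for match in FRAME_RE.finditer(oops_text):
        _addr, mod, func, offset, size = match.groups()
        if mod == module:
            frames.append((func, int(offset, 16), int(size, 16)))
    return frames


if __name__ == "__main__":
    sample = (
        "[<ffffffff80468cd2>] rt_mutex_slowlock+0x242/0x2d8 "
        "[<ffffffff8025f4fd>] rt_down+0x1d/0x60 "
        "[<ffffffff8843c5cd>] :dcfi_nsl_module:dcfiIoctlXfer+0x4bd/0x6a0\n"
        "[<ffffffff8843c965>] :dcfi_nsl_module:dcfiIoctl+0x1b5/0x440\n"
    )
    for func, offset, size in module_frames(sample, "dcfi_nsl_module"):
        print("%s+0x%x/0x%x" % (func, offset, size))
```

    In the posted trace the first driver frame is dcfiIoctlXfer+0x4bd, listed immediately after rt_down, so that call site in dcfiIoctlXfer is a reasonable place to start reading the objdump output. Because the printk output is interleaved, the trace may also contain stale entries, so treat this as a starting point and not as a definitive call chain.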

    Cheers!!
