Hi All,

I am currently trying to debug a Linux kernel module that very occasionally causes a "scheduling while atomic" oops.

Here is the oops trace:

Code:
BUG: scheduling while atomic: dragon/2185/0x00010001
Modules linked in: sm_xscale wlan_scan_ap ath_rate_sample ath_pci wlan ath_hal(P)

Pid: 2185, comm: dragon Tainted: P           (2.6.28-tuxonice-r5 #31)  
EIP: 0060:[<c012a22d>] EFLAGS: 00000246 CPU: 0
EIP is at down_interruptible+0x48/0x4a
EAX: 00000000 EBX: f70e0000 ECX: f6dc2000 EDX: 00000000
ESI: 00000246 EDI: 00000208 EBP: f70f1058 ESP: f6dc3f64
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
CR0: 8005003b CR2: b80a7cca CR3: 36dda000 CR4: 00000090
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Call Trace:
[<f9025378>] ? smyers_read+0x7d/0xb2 [sm_xscale]
[<f90252fb>] ? smyers_read+0x0/0xb2 [sm_xscale]
[<c01696fd>] ? vfs_read+0x81/0xfa
[<c016980e>] ? sys_read+0x3c/0x63
[<c0102cde>] ? syscall_call+0x7/0xb
dragon[2145]: Exception reading from USB // My user land program noticing something has gone wrong

And here is the code that generates it:

Code:
static ssize_t smyers_read(struct file *file, char *buffer, size_t count, loff_t *ppos)
{
	struct usb_smyers *dev;
	int retval = 0;

	dev = (struct usb_smyers *)file->private_data;
	while (atomic_read(&dev->buffers_in_use) == 0)
	{
		if (interruptible_sleep_on_timeout(&dev->read_queue, SLEEP_TIMEOUT)
			== SLEEP_TIMEOUT)
		{
			break;
		}
	}

	/* verify that the device wasn't unplugged */

	if (dev->udev == NULL || atomic_read(&dev->buffers_in_use) == 0)
	{
		retval = -ENODEV;
	}
	else
	{
		if (atomic_read(&dev->buffers_in_use) != 0)
		{
			int index = atomic_read(&dev->current_read_buffer);
			if (count > dev->read_buffers[index].length)
			{
				count = dev->read_buffers[index].length;
			}	
			if (copy_to_user(buffer,
							 dev->read_buffers[index].buffer,
							 count))
			{
				retval = -EFAULT;
			}
			else
			{
				retval = count;
			}

			/* lock this object */
		
			if (down_interruptible(&dev->sem))
			{
				retval = -ERESTARTSYS;
			}
			else
			{
		
				atomic_set(&dev->current_read_buffer, (index + 1) & (READ_BUFFER_COUNT - 1));
				atomic_dec(&dev->buffers_in_use);

				/* unlock the device */

				up (&dev->sem);
			}
		}
		else
		{ 	
			retval = 0;
		}
	}
	return retval;
}

From the oops trace I have to assume the problem is occurring at or during the call to down_interruptible. I can't figure out why this causes the "scheduling while atomic" issue, as I can't see any way the kernel could be in atomic context at the point down_interruptible is called.

I am looking for some pointers on how to find the cause of this, and a little more explanation of the oops trace. What exactly does "EIP is at down_interruptible+0x48/0x4a" mean?

A couple of things to note:
  1. The module is reading data from a connected USB device
  2. The problem only occurs every few hours when the device is heavily loaded with reads
  3. This is my first go at kernel programming
  4. I did not originally write the kernel module that contains this code
  5. I have fixed some other "scheduling while atomic" bugs in this module already


If you need the rest of the module code before you give advice, just ask and I will post the rest of it.

Thanks in advance.

Simon.