This is the first part of a series of articles about Linux kernel modules. In this series we will see some examples of module programming, along with techniques and general rules to keep in mind when working in kernel mode. This is not an in-depth series, but an introduction for people who want to know more about kernel internals. A background in C programming will be helpful.

OS basics

The kernel is the core of an operating system. It is the component that manages the hardware and does the real work on behalf of every program in the computer. Because of this, it is important to design the kernel to be efficient, robust and, where possible, portable. The rest of the operating system depends strongly on the kernel's capabilities.

There are several kernel architectures, but the two most important are:

  • Monolithic
  • Micro-kernel
  • (Yes, I explicitly forget exokernels)

Let's take a look at them.

Monolithic kernels.

The term monolithic means that the kernel is implemented as a single program running in a single address space. It is like writing a single .c file that is executed as a single process. People in favour of this model cite simplicity as its main advantage. With a monolithic kernel, communication is simple: any function defined in the kernel can be called from anywhere else in the kernel. Most UNIX kernels are monolithic (Hurd is an exception).

Micro-kernels

The approach for micro-kernel design is simple: let the kernel perform only the essential operations. Everything else that can be performed in user space should be done there. In a micro-kernel scenario, the kernel must communicate with the other parts of the operating system. Typically, services such as file systems and memory management policy are not inside the kernel; they run as separate servers that talk to each other through the kernel's IPC mechanism. This model can take advantage of the memory protection features provided by modern processors. The drawback is the extra communication caused by this distributed scheme, where every subsystem has its own address space.

A brief comparison

The strength of the monolithic approach is the simplicity of its design and the good performance obtained without great effort. However, all kernel code runs in kernel mode. As we will see later, this is dangerous unless the code is written with extreme care. Micro-kernels, on the other hand, implement the services offered by the kernel as separate processes. This gives more protection between them, at the cost of extra communication. Moreover, in theory only one component should run in kernel mode and the rest should run in user space. In the real world, this is not always true.

If you are interested in these concepts and their benefits and drawbacks, you can read the interesting thread between Linus Torvalds (creator of Linux) and Andy Tanenbaum (creator of Minix).

And the Linux kernel is...?

Monolithic, but also modular: the kernel can dynamically load parts of the kernel code. This is a really nice feature, since it makes it possible to extend the kernel's capabilities without modifying the rest of the code. With traditional UNIX kernels, it was necessary to recompile the kernel to add a new feature. Now a module can be inserted while the kernel is running, without recompiling the kernel. Modern UNIX kernels support module loading.

The Linux Kernel Modules
What is a module?

As I said before, a kernel module is a part of the kernel code. In fact, if you have ever compiled your own kernel, you already know this: when we configure a customized kernel, we can choose whether a feature is compiled as a module or built into the kernel image.
Almost everything in the kernel can be implemented as a module. In practice, modules are most commonly used for file systems, drivers and programs that need kernel mode (e.g. performance analysers like vtune).
The usual rule is that if you can do it in user space, you should do it in user space. Since modules run in kernel mode (in fact, they become part of the kernel itself), they can corrupt kernel structures if they are not careful. A dangling pointer can bring your whole system down.
But if your program needs to work in kernel mode, the best you can do is to implement it as a module instead of modifying the rest of the kernel. Sometimes modifying the kernel is inevitable, e.g. when you need to change the task_struct, but this is not usual.

Modules implementation

Now we know what a module is. At this point, we inspect the kernel code to learn more about how modules are implemented. Looking in /usr/include/linux, we find three files related to modules:
  • module.h: most of the code definitions for loading modules are here.
  • moduleparam.h: implements various functions and macros to deal with module and kernel parameters. The parse_args function is used when a module is loaded.
  • moduleloader.h: code for the different architectures that support modules. These functions are only prototypes.

In /usr/src/linux/kernel we find:

  • module.c: The actual work is done here.
  • kmod.c: the kernel module loader.

A module is represented by a module struct. It is defined in module.h:

struct module
{
    enum module_state state;
    struct list_head list;
    char name[MODULE_NAME_LEN];
    ...
    const struct kernel_symbol *syms;
    ...
    const struct kernel_symbol *gpl_syms;
    unsigned int num_exentries;
    const struct exception_table_entry *extable;
    int (*init)(void);
    void *module_init;
    unsigned long init_size, core_size;
    ...
    int unsafe;
    int license_gplok;
#ifdef CONFIG_MODULE_UNLOAD
    struct module_ref ref[NR_CPUS];
    struct list_head modules_which_use_me;
    struct task_struct *waiter;
    void (*exit)(void);
#endif
    ...
    void *percpu;
    char *args;
};

Above you can see a summary of the module struct. The most interesting fields are:

  • state can be set to one of these: MODULE_STATE_LIVE, MODULE_STATE_COMING or MODULE_STATE_GOING
  • *syms and *gpl_syms are the symbols exported by the module. The reason for this separation is the license of the symbols (GPL vs. not GPL)
  • int (*init)(void) and void (*exit)(void) are pointers that will point to our initialising and clean-up functions (see Module anatomy).
  • modules_which_use_me is a list of modules that need the services provided by this module.
  • *waiter represents the process that is waiting for this module to be unloaded.

The module struct is embedded as a member of the module_kobject struct. This struct has two fields: a module struct and a kobject struct. The kobject struct (/usr/include/linux/kobject.h) offers a common foundation for managing kernel objects, so that each subsystem does not have to implement the same features again (that is, it avoids redundancy).

kmod.c:

Kmod is the replacement for the kerneld daemon, which acted as an intermediary between user programs and the kernel for loading modules and communicated with the kernel using IPC mechanisms. With kmod, the request_module function loads a required module by invoking modprobe from user space. This is done via the call_usermodehelper function. This is a bit strange, because it is an example of a user-space program being invoked from within the kernel. This strategy is commonly used in exokernels (do you remember I forgot them?)

Modules dependencies.

Good programmers are lazy. They don't try to reinvent the wheel in every program they write. Module programmers are no exception. When they start to write a new module, they search the documentation to find out whether an existing module could help them. This produces dependencies between modules. For instance, if you want to write a driver for your new USB digital camera, you can write it from scratch, but it will be faster to use the usb-core module to support the operations that your module performs. If you do that, your module will depend on the usb-core module.
This is a kind of stacked arrangement. The modules at the base of the stack provide basic, generic operations. As we go up, we find more sophisticated modules that use the lower ones to achieve more specific goals.

Loading and unloading modules

The basic commands to deal with modules are:
  • insmod: insert a module.
  • modprobe: same as insmod but handles dependencies
  • lsmod: lists the loaded modules
  • rmmod: deletes a module from the kernel

For instance, when I load my wireless card, and run lsmod, I see:

ipw2200 78644 0
ieee80211 21124 1 ipw2200

The ipw2200 module is the driver itself, and ieee80211 is a module that implements the standard protocol. As the last column shows, another module, ipw2200, uses the ieee80211 module.
Now we can try to remove the ipw2200 module. Run rmmod ipw2200, then run lsmod again and look for ieee80211:

ieee80211 21124 0

As you can see, this is a clear example of stacked modules.
As we said before, the kernel uses the modprobe utility from userland to load modules. But what happens if we move the modprobe command to another location? To answer this, take a look at /proc/sys/kernel/modprobe. A cat command reveals:

[root@Hammer kernel]# cat modprobe
/sbin/modprobe

This is the path where the kernel will look for the modprobe command. Now, I change this path:

[root@Hammer kernel]# echo /nothing/nothing > modprobe
[root@Hammer kernel]# cat modprobe
/nothing/nothing

If I now try to switch on my wireless card, I get an error in dmesg. It reports an error when loading ieee80211_crypt_wep:

eth1: could not initialize WEP: load module ieee80211_crypt_wep

Loading modules: behind the scenes

Now let's look at the internals of module loading, i.e. how the kernel manages the symbols and allocates memory for the new module.
There are two ways to load a module:

  • Explicitly
  • On demand

Let's take a look at the explicit loading process:

We use the insmod utility to load a new module. This command uses a system call named create_module to allocate the new module structure (this is the interface of kernels before 2.6; in 2.6 the create_module call is gone and the init_module system call does the whole job). First of all, the kernel searches for another instance of the module. If one is found, the system call ends; if not, the kernel allocates memory (with vmalloc) for the module structure and inserts the module in the module list. If the module exports some symbols, they are now exported to the entire kernel. After this, the kernel invokes the init_module system call to initialize the module.
The on-demand process is related to dependencies. Suppose that our module needs another module to be inserted in the kernel. We run insmod on the first module, but when it is about to be linked into the kernel there are unresolved symbols, i.e. the kernel does not provide some features we need, so the module that satisfies these symbols must be located.
The modprobe command is a wrapper for the insmod utility. It tries to load a module but handles its dependencies as well. If you want to load a module by hand, it is better to use modprobe instead of insmod.

Module anatomy

To conclude this article, I will explain the structure of a kernel module. This last section links with my next article on the same topic. In the next article you will compile real modules, but for the moment, let's see the parts of a module. A schema of the parts of a generic kernel module is shown below.

The first thing we need to write a module is to include the proper headers. These headers are separated from normal headers. The kernel headers are in the linux subdirectory in /usr/include.
The essential headers are:

#include <linux/kernel.h> /* We are working in kernel mode */
#include <linux/module.h> /* Specifically a module */

These two files provide basic definitions to start working with modules.
Now we need to indicate some information about our module and ourselves. This is not strictly necessary but it is a good practice after all. To do this, we use the following macros:

MODULE_AUTHOR
MODULE_LICENSE
MODULE_DESCRIPTION

These macros take a string as argument. We continue writing our module:

MODULE_AUTHOR("Fernando Apesteguia");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Dumb module");

Now we need to declare the functions that will serve as the module's entry and exit points. You can name these functions as you wish, but it is recommended to use names like mymodule_init or exit_mymodule.
So, let's write our functions:

int enter_module(void)
{
    printk("This is written from kernel-land\n");
    return 0;
}

void exit_module(void)
{
    printk("Bye!\n");
}

Both of the functions take void as argument. The exit function also returns void. The init function returns an int to identify the error if any. In this example, the function returns zero indicating no error.
Now that we have the functions, we need to make the bindings to the module struct, i.e. we need to mark our functions to be used as the first one and the last one to be executed. This is done with:

module_init (enter_module);
module_exit (exit_module);

You may see the __init and __exit macros used in some modules. These macros have the following meanings:

  • __init indicates that the memory of the init function should be discarded after the function finishes. This macro, however, only has an effect in built-in code, not in modules.
  • __exit causes the exit function to be omitted for built-in code, but has no effect for modules.
These macros are used in code that can be compiled either as a module or as part of the kernel itself.
The full code of our first module is listed below:

/* Very simple module */
#include <linux/kernel.h>
#include <linux/module.h>

int enter_module(void)
{
    printk("This is written from kernel-land\n");
    return 0;
}

void exit_module(void)
{
    printk("Bye!\n");
}

module_init(enter_module);
module_exit(exit_module);

MODULE_AUTHOR("Fernando Apesteguia");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Dumb module");

To compile this module, you can create a Makefile as follows (replace test.o to match the name of your source file, and note that the command line under all: must start with a tab):

obj-m += test.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

And now you can load your module, for example with insmod ./test.ko (modprobe only looks for modules installed under /lib/modules).

Conclusion

In this first article we have seen some different approaches to kernel design. We have also learnt about the internal structures the kernel uses to keep track of modules and to load them. We have tried some user-land commands used to deal with modules, such as insmod, rmmod, etc.
The next article of this series will be about some of the techniques, functions and variables provided by the kernel for use by modules and built-in code. The basic rules of kernel programming will be explained too.
And of course, I'll write more examples to play with.
 
Comments about this article
Question about the example
writen by: someone on 2006-05-01 04:04:32
what with those empty includes in the last example?
RE: Question about the example
writen by: jaya1 on 2010-02-17 06:04:14
include
include
nice
writen by: Andy on 2006-05-01 14:17:48
interesting read thanks
Great Article
writen by: bsesser on 2006-05-02 16:15:08
Great Article it really opened the door for thought and for advancement. Keep up the good work.
broken article text!
writen by: Michael Shigotin on 2006-05-03 08:52:42
At least one place is oversanitized (> in command line) and pretty much every #include resembles PoC due to undersanitization (even if #include can be read in page source this time ;-).
happy to see
writen by: William on 2006-05-03 11:57:42
Hi Man, happy to see a guide like this... go ahead :) Good stuff!
Sorry for the missings
writen by: fernape on 2006-05-07 15:24:17
Sorry for the missing includes. They are: #include #include I think the "less than" symbol was mixed with "tag delimiter" for HTML. Best regards PS: I'm writing a second part...
Sorry... Once again
writen by: fernape on 2006-05-09 12:10:24
#include &l tkernel.h > #include &l tmodule.h > I hope this is showed right.... Best regards
Question: How to run LKM ?
writen by: Mike on 2006-05-22 23:41:57
I have been using Linux for a year and I am desperate to do experiments with it. I have redhat 7,8 and SUSE 10 installed. I have one basic question. Will I be able to run this demo programm just like that shown here? I have heard that I will have to configure kernel from scratch..is it true ?? If yes, then can you guide me exact article for it. There are plenty of articles on internet but they are not complete. I have tried them but couldn't succeed. Thank you all, Please reply.........
RE: Question: How to run LKM ?
writen by: jaya1 on 2010-02-17 06:13:55
hi mike
in case of kernel with loadable modules enabled there is no need to configure the kernel and in the make file u should add the path for the kernel which u will complie the module in the $KERNELDIR then i will work normal commands
in case u have a problem in inserting please check the /var/log/messages for reason to not insert the module . the log file is used to report the error of the modules and all
Re: Question: How to run LKM ?
writen by: hhro on 2006-05-24 10:08:14
rtfm! if you fail after one year of linux experience you are too dumb.
No, You don't need it
writen by: fernape on 2006-05-24 16:14:30
No, you don't need to re-compile your kernel. This are modules for: you compile only your module and then you can insert on the fly so your kernel remains untouched. Best regards
Opened my eye's thx!
writen by: lall on 2006-08-09 09:00:27
Interesting article, looking forward reading part II.
Intersing kernel modules programmaticall
writen by: Prasanna Nandaragi on 2006-12-12 04:30:33
Hello, I want to create device file programmitically for the kernel module I want to insert, so that I can get back the device major number. I tried using 'insmod', I could not find a way to get the device major number back in user mode. I tried passing user space integer pointer as a command line argument for kernel module(ex: insmod module major_num_addres = &int_maj_num) and use this pointer in init_module() for getting the major number using copy_to_user() function. This resulted in 'sementation fault'. Can anybody suggest me how can I insert a module being in user space and get the major number which is allocated in init_module() by the kernel. Help in this regard is appreciated. Thanks in advance, Prasanna
Kernel type
writen by: Kiran on 2007-07-05 01:14:50
How to find my kernel (RH Linux) is monolithic or micro kernel???
RE: Kernel type
writen by: jaya1 on 2010-02-17 06:06:39
i answer iam giving is right or wrong i dont know but to know the kernel is micro kernel or monolithical check the version and menuconfig will decide the kernel
Monolithic.
writen by: Tom on 2007-07-06 15:07:31
Linux Kernal
writen by: Ratheesh Nair on 2008-01-03 01:33:38
hello prasanna, You can insert the kernal module by using insmod , and you can remove it from the kernal , by using rmmod command , and you can find the respective major number (device number) by checking the /proc file system , which will add all the running modules on FLY , check by using the following command vi /proc/modules , in that you can find your respective major number . I think (not sure)by defult the major number is 256 . try to execute the following program - #define MODULE #include<linux/kernel.h> main(){ int init_process(void *){ printk("i am in kernal"); return 0; } void cleanup_process(){ printk(bye bye kernal"); } }
RE: Linux Kernal
writen by: jaya1 on 2010-02-17 06:09:36
but using modprobe unknown symbol are auotmatically resovled and if that symbols are not exportted form the other module then in case of insmod an error message will be shown where as in modprobe it automatically exports the symbols . for insmod to work only modprobe;conf is needed . many will say modprobe is simple compare to insmod
nice work
writen by: Adrian on 2008-08-19 16:10:06
Great Job, Thanks!