and <unistd.h> includes. Using them, a system call service routine can be invoked with the following syntax:
syscall(the_syscall, parameter, parameter...)
Here is an example:
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    int ret;

    /* Invoke the open system call directly, bypassing the libc open() wrapper. */
    ret = syscall(SYS_open, "my_file.txt", O_RDONLY);
    if (ret == -1)
        printf("Error\n");
    return 0;
}
This code performs the same operation as the previous example, but it invokes the system call directly.
The unistd.h file provides the identification numbers for every defined system call. It can be found at include/asm-$arch/ inside your kernel source directory.
#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
#define __NR_close 6
#define __NR_waitpid 7
#define __NR_creat 8
#define __NR_link 9
#define __NR_unlink 10
#define __NR_execve 11
#define __NR_chdir 12
#define __NR_time 13
#define __NR_mknod 14
#define __NR_chmod 15
...
...
Most of the syscalls have the same names as their user-land libc equivalents, so they will not be explained here.
Playing with syscalls.
In the 2.4 kernel series, the syscall table was exported to modules. This means that we could declare the syscall table inside our module as follows:
extern void *sys_call_table[];
This gave modules some flexibility, since they could replace entries in the table to make them point to their own custom system calls. However, this is dangerous, because system calls are critical. Imagine that one of these custom syscalls corrupts kernel memory, or does not perform the expected task safely. Because of this, the syscall table is no longer exported to modules, which prevents them from accessing it and changing the addresses it contains.
Having said this, I would like to present the final example of this article, but always keep this in mind: the method used here is for learning purposes only. Many things can go wrong with this code because of its nature, so this kind of hack should not be used in production software in any way. In particular, this code is not safe against module unloading and has race conditions. Some synchronization would be needed to ensure that unloading the module does not produce an Oops (in the worst case, the Oops can bring your system down). I deliberately omitted this additional code to keep the example clear.
Code example:
First of all, we need to declare both the syscall table and the routine that we want to replace. This is done in the first two lines:
static unsigned long **sct;
int (* old_unlink)(void);
The syscall table is an array of pointers, and the function that we will replace (unlink in this case) is held as a pointer to a function.
The next step is to get the syscall table's address. Because the table is not exported, we need another way to obtain it. We use the address found in the System.map file, which contains the memory addresses of the symbols declared in the kernel. This file is used by applications such as the klogd daemon, which relies on it to translate kernel debug information into a more human-readable format.
We can use grep to find this address:
fernape@localhost:/boot$ grep sys_call_table System.map-2.6.15.6
c02aa560 D sys_call_table
So, now that we have the two addresses we need (remember that unlink and others are exported, so they can be used freely), we can perform the swap:
sct=(void *)0xc02aa560;
old_unlink=(void *)sct[__NR_unlink];
sct[__NR_unlink]=(unsigned long *)&my_unlink;
First, we set the syscall table address, and then we replace the original unlink syscall with our own routine, which does nothing but print an "unlink unavailable" message from the kernel. We also save the original address so that the syscall table can be restored when the module is unloaded.
Here is the complete code:
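#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/types.h>
#include <asm/unistd.h>

/* Syscall table and saved pointer to the original unlink routine. */
static unsigned long **sct;
int (* old_unlink)(void);

/* Replacement routine: it only logs a message and reports success,
 * so no file is actually removed. */
int my_unlink(void)
{
    printk(KERN_INFO "unlink unavailable\n");
    return 0;
}

static int enter(void)
{
    /* Address taken from System.map; it is specific to this kernel build. */
    sct = (void *)0xc02aa560;
    old_unlink = (void *)sct[__NR_unlink];
    sct[__NR_unlink] = (unsigned long *)&my_unlink;
    return 0;
}

static void go_out(void)
{
    printk(KERN_INFO "Bye, restoring syscalls\n");
    /* Put the original routine back before the module goes away. */
    sct[__NR_unlink] = (unsigned long *)old_unlink;
}

module_init(enter);
module_exit(go_out);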
Now let's test it. Create a file named "myfile" (touch myfile), then insert the module with insmod or modprobe. After this, try to delete "myfile". You will see the "unlink unavailable" message, and if you run ls, you will see that your file remains intact. If you don't see the message, check the log with tail /var/log/messages.
Don't forget to unload the module and check that you can delete files again.
Jun 24 17:21:47 localhost kernel: unlink unavailable
Conclusion
This second article about LKM programming tried to offer an overview of the mechanism that makes system calls possible. Although the method shown above for getting the syscall table address was used by some programs, such as ancient OProfile versions and the (not so ancient) Intel VTune driver, I would like to stress again that this should not be done in real production modules. There are better ways to achieve these goals, such as the Linux Trace Toolkit or kprobes (and there are other, even uglier methods that scan part of the kernel address space to find the syscall table). I just hope that you enjoyed reading this and playing with the code as much as I did.