Procfs is a pseudo-filesystem (like sysfs and several others), which means that the files in /proc do not exist on your hard drive; the information they contain is computed on demand.

Like the other filesystems used in Linux, procfs sits behind the Virtual File System (VFS). VFS is a kernel layer that abstracts over filesystems: it hides the differences between them and presents a common interface for working with them. Although other Unix-like systems provide a procfs (FreeBSD, for example), the format varies between systems. Linux uses a plain text format, while FreeBSD uses a binary format in some places. The first approach is more convenient when working with shell commands like cat, grep, etc., but the second is more convenient when programming.

Under /proc we can find general system information as well as per-process information and statistics. Linux tells these kinds of information apart by the inode number. An inode number in Linux is represented as a 32-bit number and a PID (Process Identifier) as a 16-bit number. With this scheme, Linux splits the inode number into two 16-bit halves. The upper (left) half is interpreted as a PID and the lower (right) half as a class of information. Since PID 0 is not a valid process, Linux uses that value to indicate that the inode holds global information.
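
The split described above can be sketched in a few lines of C. This is only an illustration of the scheme as the text describes it: the names proc_make_ino, proc_ino_pid, proc_ino_class and proc_ino_is_global are hypothetical, not kernel identifiers, and the kernel's actual encoding may differ between versions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for the 16/16 split described in the text. */
#define PROC_PID_SHIFT  16
#define PROC_CLASS_MASK 0xFFFFu

/* Pack a PID (upper half) and an information class (lower half). */
uint32_t proc_make_ino(unsigned int pid, unsigned int info_class)
{
    /* Both values must fit in 16 bits for the scheme to work. */
    assert(pid <= 0xFFFFu && info_class <= 0xFFFFu);
    return ((uint32_t)pid << PROC_PID_SHIFT) | (uint32_t)info_class;
}

/* Upper 16 bits: the PID the inode belongs to (0 means "global"). */
unsigned int proc_ino_pid(uint32_t ino)
{
    return (unsigned int)(ino >> PROC_PID_SHIFT);
}

/* Lower 16 bits: which class of information the inode represents. */
unsigned int proc_ino_class(uint32_t ino)
{
    return (unsigned int)(ino & PROC_CLASS_MASK);
}

/* PID 0 is never a real process, so it marks system-wide entries. */
int proc_ino_is_global(uint32_t ino)
{
    return proc_ino_pid(ino) == 0;
}
```

For instance, packing PID 1234 with class 7 gives 0x04D20007, and unpacking that value returns the same pair.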

What the kernel does when we type, for example, cat /proc/cpuinfo is shown in the illustration below.

[Illustration: Process File System]

First of all, the process created by the shell requests data by reading the file. VFS catches the request and determines that the file to read is a procfs file (actually, a pseudo-file). The procfs subsystem then queries the kernel tables to find the information required by the process. Which kernel structures are consulted depends on the type of information the process wants (global, specific, about the CPU, about a process, etc.). After the data has been collected, the process's buffer is filled.

The most important aspect is that this process of information gathering is completely transparent from an external point of view.

OS basics: user-land vs. Kernel-land

Modern operating systems use features of modern processors. These processors can execute code at two or more privilege levels. The Linux kernel uses two of the available levels to implement user mode and kernel mode (i.e., user-land and kernel-land).

The main goal is to provide an isolated environment for kernel execution. This way, if a user process crashes it may affect other user-space processes, but never kernel space.

The added complexity is that data has to be copied from user-land to kernel-land and back. This is what procfs does, in a way that is well known to Linux users: using files.

Looking at cpuinfo

cpuinfo is one of the files that can be found in /proc. It provides information about your CPU, and a lot of useful detail is displayed. This is what cpuinfo says about my CPU:

processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 4
model name : Mobile AMD Athlon(tm) 64 Processor 2800+
stepping : 10
cpu MHz : 801.854
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips : 1576.96
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
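
Because the format is plain "key : value" text, a program can pick out a single field with ordinary string handling. The following is a minimal sketch under that assumption; cpuinfo_field is a hypothetical helper, and it works on text that has already been read from /proc/cpuinfo into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of the first "key : value" line whose key is exactly
 * `key` into `out`. Returns 0 on success, -1 if no such line exists.
 */
int cpuinfo_field(const char *text, const char *key, char *out, size_t outsz)
{
    size_t keylen;
    const char *line = text;

    assert(text && key && out && outsz > 0);
    keylen = strlen(key);

    while (*line) {
        const char *eol = strchr(line, '\n');
        size_t linelen = eol ? (size_t)(eol - line) : strlen(line);
        const char *colon = memchr(line, ':', linelen);

        if (colon) {
            /* Trim the blanks or tabs that pad the key before the colon. */
            const char *kend = colon;
            while (kend > line && (kend[-1] == ' ' || kend[-1] == '\t'))
                kend--;

            if ((size_t)(kend - line) == keylen &&
                strncmp(line, key, keylen) == 0) {
                const char *val = colon + 1;
                const char *vend = line + linelen;
                size_t n;

                /* Skip the blanks that follow the colon. */
                while (val < vend && (*val == ' ' || *val == '\t'))
                    val++;
                n = (size_t)(vend - val);
                if (n >= outsz)
                    n = outsz - 1;      /* truncate to fit the buffer */
                memcpy(out, val, n);
                out[n] = '\0';
                return 0;
            }
        }
        line += linelen;
        if (*line == '\n')
            line++;                     /* move to the next line */
    }
    return -1;
}
```

Called with the listing above and the key "model name", the helper would fill the buffer with "Mobile AMD Athlon(tm) 64 Processor 2800+". The key must match in full, so asking for "model" does not accidentally return the "model name" line.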

Where does this information come from? If we browse the Linux kernel headers, we find a directory such as /usr/include/asm/asmx86_64 (your path may be different, of course). In this directory we can see the processor.h file. Let's look inside and search for a structure named cpuinfo_x86.

struct cpuinfo_x86 {
    __u8 x86;                 /* CPU family */
    __u8 x86_vendor;          /* CPU vendor */
    __u8 x86_model;
    __u8 x86_mask;
    /* We know that wp_works_ok = 1, hlt_works_ok = 1, hard_math = 1, etc... */
    char wp_works_ok;         /* It doesn't on 386's */
    char hlt_works_ok;        /* Problems on some 486Dx4's and old 386's */
    char hard_math;
    char rfu;
    int cpuid_level;          /* Maximum supported CPUID level, -1 = no CPUID */
    __u32 x86_capability[NCAPINTS];
    char x86_vendor_id[16];
    char x86_model_id[64];
    int x86_cache_size;       /* in KB, valid for CPUs which support this call */
    int fdiv_bug;
    int f00f_bug;
    int coma_bug;
    unsigned long loops_per_jiffy;
} ____cacheline_aligned;

This is one of the structures where the Linux kernel keeps information about your system's characteristics.

It is difficult to obtain this information from user-land, so the kernel gathers it in kernel-land and offers it to user-land via procfs. This is a good approach, and it avoids extensive use of the copy_from_user and copy_to_user functions.

Implementing procfs

The Linux kernel implements a set of functions to create and manage procfs pseudo-files. These functions are exported by the kernel, so you can use them from a module. I will explain some of them here. (This is the interface of 2.6-era kernels; create_proc_entry was removed in Linux 3.10 in favour of proc_create.)

  • create_proc_entry: Creates a new file under /proc. Its parameters are the name of the file, the mode of the file and the parent directory (if NULL, /proc is assumed). You can also create symbolic links and directories with proc_symlink and proc_mkdir.
  • remove_proc_entry: Removes a file from the /proc hierarchy.

The creation functions return a pointer to a proc_dir_entry structure. This structure specifies how the file behaves when a read or write operation is performed. The proc_dir_entry structure is shown below (from /usr/include/linux/proc_fs.h):

struct proc_dir_entry {
    unsigned int low_ino;
    unsigned short namelen;
    const char *name;
    mode_t mode;
    nlink_t nlink;
    uid_t uid;
    gid_t gid;
    unsigned long size;
    struct inode_operations *proc_iops;
    struct file_operations *proc_fops;
    get_info_t *get_info;
    struct module *owner;
    struct proc_dir_entry *next, *parent, *subdir;
    void *data;
    read_proc_t *read_proc;
    write_proc_t *write_proc;
    atomic_t count;   /* use count */
    int deleted;      /* delete flag */
};

The read_proc and write_proc pointers are the hooks for the read and write functions respectively. Let's assume that we have defined two functions:

static int my_read(char *page, char **start, off_t off, int count,
                   int *eof, void *data)
{
    ...
    ...
}

and

static int my_write(struct file *file, const char *buffer,
                    unsigned long count, void *data)
{
    ...
    ...
}

All we need to do to link these operations to our file is:

my_entry->read_proc=my_read;
my_entry->write_proc=my_write;

What we do in the body of these functions can be very complex. In the simplest case, if we want to read data from the kernel and make it available to user-land, we have to print the data into the page buffer (with sprintf or any other function of your choice).

When a user writes data, we first have to read it from buffer. But watch out! This buffer is in user-land, so we need to copy it to kernel-land with the copy_from_user function.

Let's see an example. We will create a file in /proc that returns the number of times the file has been read:

/* test.c */

#include <linux/kernel.h>   /* Working in kernel mode */
#include <linux/module.h>   /* Writing a module */
#include <linux/proc_fs.h>  /* We use procfs functions */

static struct proc_dir_entry *my_entry; /* Our file entry */

/* This is executed when the file is read */
static int
my_read_function (char *page, char **buffer_location,
                  off_t offset, int buffer_length, int *eof, void *data)
{
    int len;
    static int my_count = 1;

    /* The whole message is delivered by the first call; a later call
       with a non-zero offset reports end of file */
    if (offset > 0)
        return 0;
    len = sprintf (page, "File read %d times\n", my_count);
    my_count++;
    *buffer_location = page;
    return len;
}

/* This function is executed when the module is loaded */
static int
enter (void)
{
    printk (KERN_INFO "Module loaded\n");
    my_entry = create_proc_entry ("my_test", 0444, NULL);
    if (my_entry == NULL)
    {
        /* Failed when creating file */
        printk (KERN_ALERT "Error while creating file\n");
        return -ENOMEM;
    }
    /* Set the read function; the prototype matches read_proc_t, so no cast is needed */
    my_entry->read_proc = my_read_function;
    my_entry->owner = THIS_MODULE;
    return 0;
}

/* This is executed when the module is unloaded */
static void
finish (void)
{
    printk (KERN_INFO "Unloading module...\n");
    remove_proc_entry ("my_test", NULL);
}

module_init (enter);
module_exit (finish);

To compile this, create a Makefile with these lines (the command line under all: must be indented with a tab):

obj-m += test.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

This program creates a file named my_test that counts the number of reads.
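
To see how the offset check makes each read of the file produce exactly one line, the handler's logic can be exercised outside the kernel. The following is a user-space sketch only, not kernel code: sim_read_function is a hypothetical stand-in that follows the same logic with the six-argument prototype of my_read shown earlier, and the caller plays the role of procfs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>   /* off_t */

/*
 * User-space stand-in for the module's read handler. The first call of a
 * read (off == 0) formats the message; a follow-up call with off > 0
 * returns 0, which the caller treats as end of file.
 */
int sim_read_function(char *page, char **start, off_t off, int count,
                      int *eof, void *data)
{
    static int my_count = 1;   /* survives between calls, like in the module */
    int len;

    (void)count;               /* unused in this simple handler */
    (void)eof;
    (void)data;
    assert(page && start);

    if (off > 0)
        return 0;              /* everything was already delivered */

    len = sprintf(page, "File read %d times\n", my_count);
    my_count++;
    *start = page;
    return len;
}
```

A caller that invokes the function with offset 0, then again with the returned length as the offset, gets the message followed by 0, which is the pattern a single cat of the file would be expected to trigger. Starting again at offset 0 yields the next count.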

I know this may seem a bit confusing and obscure. Writing kernel modules is not as easy as writing a Hello world application with libc and other support libraries, but I hope this helps you understand the internals of procfs. Reentrancy and concurrency are non-trivial problems that we have to deal with when working in kernel mode, so take it easy. This document does not try to cover everything about kernel programming or module writing; it is only an overview of what the kernel does to make procfs work.

Some examples on using procfs

Procfs is a very useful filesystem. You can gather a lot of information from it without worrying about libraries, system calls or writing C programs: you just read files. We have seen how procfs works from the inside. Let's see what we can do with its files:

System temperature:

[fernape@Hammer]$ while true;
> do cat /proc/acpi/thermal_zone/THRM/temperature ;
> sleep 5;
> done;

You can easily change the script above to read /proc/acpi/battery/BAT0/state and check your battery charge. Isn't this easier than writing a C program?

Change host and domain names:

Certain files under /proc are writable (by root only). You can change the hostname and domain name by writing to:

/proc/sys/kernel/domainname and /proc/sys/kernel/hostname

Please take care. Changing the host and domain names is not a dangerous task, but with procfs you can change a lot of settings that can degrade performance.
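
When a program rather than a shell needs such a value, reading it is ordinary file I/O. The following is a minimal sketch; read_proc_value is a hypothetical helper name, and it assumes the file holds a single line of text, as /proc/sys/kernel/hostname does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of `path` into `out`, dropping the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 */
int read_proc_value(const char *path, char *out, size_t outsz)
{
    FILE *fp;

    assert(path && out && outsz > 0);

    fp = fopen(path, "r");
    if (!fp)
        return -1;

    if (!fgets(out, (int)outsz, fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    out[strcspn(out, "\n")] = '\0';   /* strip the newline, if any */
    return 0;
}
```

Calling read_proc_value("/proc/sys/kernel/hostname", buf, sizeof buf) would leave the current hostname in buf. The same helper works for any other single-value file under /proc/sys.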

See the kernel boot parameters:

You can see them in /proc/cmdline.

[fernape@Hammer proc]$ cat cmdline
ro root=LABEL=/1 rhgb quiet console=tty0
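
A program that needs one particular boot parameter can split this line on whitespace and look for name or name=value. The following is a simplified sketch: cmdline_param is a hypothetical helper, and it deliberately ignores quoted values that contain spaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look for `name` in a kernel command line. Returns 1 if it is present,
 * copying its value into `out` (an empty string for bare flags such as
 * "quiet"), or 0 if it is absent. Quoted values are not handled.
 */
int cmdline_param(const char *cmdline, const char *name,
                  char *out, size_t outsz)
{
    size_t namelen;
    const char *p = cmdline;

    assert(cmdline && name && out && outsz > 0);
    namelen = strlen(name);

    while (*p) {
        size_t toklen;

        p += strspn(p, " \t\n");        /* skip separators */
        toklen = strcspn(p, " \t\n");   /* length of the current token */
        if (toklen == 0)
            break;

        /* Match "name" exactly or "name=" as a prefix of the token. */
        if (toklen >= namelen && strncmp(p, name, namelen) == 0 &&
            (toklen == namelen || p[namelen] == '=')) {
            const char *val = p + namelen;
            size_t n = toklen - namelen;

            if (n > 0) {                /* step over the '=' */
                val++;
                n--;
            }
            if (n >= outsz)
                n = outsz - 1;          /* truncate to fit the buffer */
            memcpy(out, val, n);
            out[n] = '\0';
            return 1;
        }
        p += toklen;
    }
    return 0;
}
```

With the command line shown above, asking for root yields LABEL=/1 and asking for console yields tty0, while quiet is reported as present with an empty value.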

These are only a few examples. With procfs you can set the maximum number of queued POSIX real-time signals, the number of semaphores the system can manage, the maximum number of files the kernel can have open, etc. Procfs can also be a good ally when debugging programs.

If you want to read more about files in /proc you can read this excellent article: http://www.linuxforums.org/misc/understanding_/proc.html

Conclusions

As we said before, other UNIX-like systems have a procfs. However, it is in Linux where it has become powerful and widely used. FreeBSD has moved away from procfs: it no longer mounts it by default and collects system information through system calls (sysctl) instead. But if you want to run Linux programs on FreeBSD, you need its Linux-compatible procfs emulation (linprocfs).

Several applications and libraries use procfs, among them procps, libgtop and lkmonitor (I take part in the last one).

Although there are other ways to gather system information, procfs provides a simple way to achieve this goal. Everything in Linux is a file (sockets, devices, pipes...), so I think procfs uses a good and consistent strategy to offer this information.
