2.3. Kernel Modules Versus Applications


Before we go further, it's worth underlining the various differences between a kernel module and an application.

While most small and medium-sized applications perform a single task from beginning to end, every kernel module just registers itself in order to serve future requests, and its initialization function terminates immediately. In other words, the task of the module's initialization function is to prepare for later invocation of the module's functions; it's as though the module were saying, "Here I am, and this is what I can do." The module's exit function (hello_exit in the example) gets invoked just before the module is unloaded. It should tell the kernel, "I'm not there anymore; don't ask me to do anything else." This kind of approach to programming is similar to event-driven programming, but while not all applications are event-driven, each and every kernel module is. Another major difference between event-driven applications and kernel code is in the exit function: whereas an application that terminates can be lazy in releasing resources or avoid cleanup altogether, the exit function of a module must carefully undo everything the init function built up, or the pieces remain around until the system is rebooted.
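The register-and-return pattern just described can be sketched as a minimal module skeleton. The function names mirror the hello.c example from earlier in the chapter; the printed messages are illustrative:

```c
#include <linux/init.h>
#include <linux/module.h>

MODULE_LICENSE("Dual BSD/GPL");

/* Runs once at load time: announce ourselves, register, and return
 * immediately; the real work happens when our functions are invoked later. */
static int __init hello_init(void)
{
    printk(KERN_ALERT "Here I am, and this is what I can do\n");
    return 0;              /* a nonzero return would mean the load failed */
}

/* Must carefully undo everything the init function built up. */
static void __exit hello_exit(void)
{
    printk(KERN_ALERT "I'm not there anymore\n");
}

module_init(hello_init);
module_exit(hello_exit);
```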

Incidentally, the ability to unload a module is one of the features of modularization that you'll most appreciate, because it helps cut down development time; you can test successive versions of your new driver without going through the lengthy shutdown/reboot cycle each time.

As a programmer, you know that an application can call functions it doesn't define: the linking stage resolves external references using the appropriate library of functions. printf is one of those callable functions and is defined in libc. A module, on the other hand, is linked only to the kernel, and the only functions it can call are the ones exported by the kernel; there are no libraries to link to. The printk function used in hello.c earlier, for example, is the version of printf defined within the kernel and exported to modules. It behaves similarly to the original function, with a few minor differences, the main one being lack of floating-point support.
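As a sketch (the function and message are made up), a call to printk looks almost like a call to printf, except that the format string is prefixed with a loglevel macro and floating-point conversions such as %f are unavailable:

```c
#include <linux/kernel.h>

static void example_report(int value)
{
    /* KERN_INFO is one of the kernel's loglevel strings; the C
     * preprocessor pastes it onto the front of the format string. */
    printk(KERN_INFO "example: value is %d\n", value);
}
```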

Figure 2-1 shows how function calls and function pointers are used in a module to add new functionality to a running kernel.

Figure 2-1. Linking a module to the kernel


Because no library is linked to modules, source files should never include the usual header files, <stdarg.h> and very special situations being the only exceptions. Only functions that are actually part of the kernel itself may be used in kernel modules. Anything related to the kernel is declared in headers found in the kernel source tree you have set up and configured; most of the relevant headers live in include/linux and include/asm, but other subdirectories of include have been added to host material associated to specific kernel subsystems.

The role of individual kernel headers is introduced throughout the book as each of them is needed.

Another important difference between kernel programming and application programming is in how each environment handles faults: whereas a segmentation fault is harmless during application development and a debugger can always be used to trace the error to the problem in the source code, a kernel fault kills the current process at least, if not the whole system. We see how to trace kernel errors in Chapter 4.

2.3.1. User Space and Kernel Space

A module runs in kernel space, whereas applications run in user space. This concept is at the base of operating systems theory.

The role of the operating system, in practice, is to provide programs with a consistent view of the computer's hardware. In addition, the operating system must account for independent operation of programs and protection against unauthorized access to resources. This nontrivial task is possible only if the CPU enforces protection of system software from the applications.

Every modern processor is able to enforce this behavior. The chosen approach is to implement different operating modalities (or levels) in the CPU itself. The levels have different roles, and some operations are disallowed at the lower levels; program code can switch from one level to another only through a limited number of gates. Unix systems are designed to take advantage of this hardware feature, using two such levels. All current processors have at least two protection levels, and some, like the x86 family, have more levels; when several levels exist, the highest and lowest levels are used. Under Unix, the kernel executes in the highest level (also called supervisor mode), where everything is allowed, whereas applications execute in the lowest level (the so-called user mode), where the processor regulates direct access to hardware and unauthorized access to memory.

We usually refer to the execution modes as kernel space and user space. These terms encompass not only the different privilege levels inherent in the two modes, but also the fact that each mode can have its own memory mapping (its own address space) as well.

Unix transfers execution from user space to kernel space whenever an application issues a system call or is suspended by a hardware interrupt. Kernel code executing a system call is working in the context of a process: it operates on behalf of the calling process and is able to access data in the process's address space. Code that handles interrupts, on the other hand, is asynchronous with respect to processes and is not related to any particular process.

The role of a module is to extend kernel functionality; modularized code runs in kernel space. Usually a driver performs both the tasks outlined previously: some functions in the module are executed as part of system calls, and some are in charge of interrupt handling.

2.3.2. Concurrency in the Kernel

One way in which kernel programming differs greatly from conventional application programming is the issue of concurrency. Most applications, with the notable exception of multithreading applications, typically run sequentially, from the beginning to the end, without any need to worry about what else might be happening to change their environment. Kernel code does not run in such a simple world, and even the simplest kernel modules must be written with the idea that many things can be happening at once.

There are a few sources of concurrency in kernel programming. Naturally, Linux systems run multiple processes, more than one of which can be trying to use your driver at the same time. Most devices are capable of interrupting the processor; interrupt handlers run asynchronously and can be invoked at the same time that your driver is trying to do something else. Several software abstractions (such as kernel timers, introduced in Chapter 7) run asynchronously as well. Moreover, of course, Linux can run on symmetric multiprocessor (SMP) systems, with the result that your driver could be executing concurrently on more than one CPU. Finally, in 2.6, kernel code has been made preemptible; this change causes even uniprocessor systems to have many of the same concurrency issues as multiprocessor systems.

As a result, Linux kernel code, including driver code, must be reentrant: it must be capable of running in more than one context at the same time. Data structures must be carefully designed to keep multiple threads of execution separate, and the code must take care to access shared data in ways that prevent corruption of the data. Writing code that handles concurrency and avoids race conditions (situations in which an unfortunate order of execution causes undesirable behavior) requires thought and can be tricky. Proper management of concurrency is required to write correct kernel code; for that reason, every sample driver in this book has been written with concurrency in mind. The techniques used are explained as we come to them; Chapter 5 has also been dedicated to this issue and the kernel primitives available for concurrency management.
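Chapter 5 covers the kernel's locking primitives in detail, but as a foretaste, protecting a piece of shared driver state might look roughly like this (the structure and function names are hypothetical; spin_lock_irqsave is one of several primitives the kernel offers):

```c
#include <linux/spinlock.h>

/* A hypothetical per-device structure: the lock lives next to
 * the data it guards. */
struct example_dev {
    spinlock_t lock;
    unsigned long count;    /* shared state, touched from several contexts */
};

static void example_increment(struct example_dev *dev)
{
    unsigned long flags;

    /* The _irqsave variant also disables local interrupts, in case
     * an interrupt handler touches dev->count on this same CPU. */
    spin_lock_irqsave(&dev->lock, flags);
    dev->count++;
    spin_unlock_irqrestore(&dev->lock, flags);
}
```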

A common mistake made by driver programmers is to assume that concurrency is not a problem as long as a particular segment of code does not go to sleep (or "block"). Even in previous kernels (which were not preemptive), this assumption was not valid on multiprocessor systems. In 2.6, kernel code can (almost) never assume that it can hold the processor over a given stretch of code. If you do not write your code with concurrency in mind, it will be subject to catastrophic failures that can be exceedingly difficult to debug.

2.3.3. The Current Process

Although kernel modules don't execute sequentially as applications do, most actions performed by the kernel are done on behalf of a specific process. Kernel code can refer to the current process by accessing the global item current, defined in <asm/current.h>, which yields a pointer to struct task_struct, defined by <linux/sched.h>. The current pointer refers to the process that is currently executing. During the execution of a system call, such as open or read, the current process is the one that invoked the call. Kernel code can use process-specific information by using current, if it needs to do so. An example of this technique is presented in Chapter 6.

Actually, current is not truly a global variable. The need to support SMP systems forced the kernel developers to develop a mechanism that finds the current process on the relevant CPU. This mechanism must also be fast, since references to current happen frequently. The result is an architecture-dependent mechanism that, usually, hides a pointer to the task_struct structure on the kernel stack. The details of the implementation remain hidden to other kernel subsystems though, and a device driver can just include <linux/sched.h> and refer to the current process. For example, the following statement prints the process ID and the command name of the current process by accessing certain fields in struct task_struct:

printk(KERN_INFO "The process is \"%s\" (pid %i)\n",
        current->comm, current->pid);

 

The command name stored in current->comm is the base name of the program file (trimmed to 15 characters if need be) that is being executed by the current process.

2.3.4. A Few Other Details

Kernel programming differs from user-space programming in many ways. We'll point things out as we get to them over the course of the book, but there are a few fundamental issues which, while not warranting a section of their own, are worth a mention. So, as you dig into the kernel, the following issues should be kept in mind.

Applications are laid out in virtual memory with a very large stack area. The stack, of course, is used to hold the function call history and all automatic variables created by currently active functions. The kernel, instead, has a very small stack; it can be as small as a single, 4096-byte page. Your functions must share that stack with the entire kernel-space call chain. Thus, it is never a good idea to declare large automatic variables; if you need larger structures, you should allocate them dynamically at call time.
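A sketch of the point about large automatic variables (the buffer size and function are made up; kmalloc and GFP_KERNEL are the kernel's general-purpose allocation interface):

```c
#include <linux/slab.h>

#define EXAMPLE_BUF_SIZE 8192   /* far too big for a 4096-byte kernel stack */

static int example_do_work(void)
{
    char *buf;

    /* Declaring "char buf[EXAMPLE_BUF_SIZE];" here would risk
     * overflowing the small kernel stack; allocate at call time instead. */
    buf = kmalloc(EXAMPLE_BUF_SIZE, GFP_KERNEL);
    if (!buf)
        return -ENOMEM;

    /* ... use buf ... */

    kfree(buf);
    return 0;
}
```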

Often, as you look at the kernel API, you will encounter function names starting with a double underscore (_ _). Functions so marked are generally a low-level component of the interface and should be used with caution. Essentially, the double underscore says to the programmer: "If you call this function, be sure you know what you are doing."

Kernel code cannot do floating point arithmetic. Enabling floating point would require that the kernel save and restore the floating point processor's state on each entry to, and exit from, kernel space—at least, on some architectures. Given that there really is no need for floating point in kernel code, the extra overhead is not worthwhile.
