2.4. Compiling and Loading

The "hello world" example at the beginning of this chapter included a brief demonstration of building a module and loading it into the system. There is, of course, a lot more to that whole process than we have seen so far. This section provides more detail on how a module author turns source code into an executing subsystem within the kernel.

2.4.1. Compiling Modules

As the first step, we need to look a bit at how modules must be built. The build process for modules differs significantly from that used for user-space applications; the kernel is a large, standalone program with detailed and explicit requirements on how its pieces are put together. The build process also differs from how things were done with previous versions of the kernel; the new build system is simpler to use and produces more correct results, but it looks very different from what came before. The kernel build system is a complex beast, and we just look at a tiny piece of it. The files found in the Documentation/kbuild directory in the kernel source are required reading for anybody wanting to understand all that is really going on beneath the surface.

There are some prerequisites that you must get out of the way before you can build kernel modules. The first is to ensure that you have sufficiently current versions of the compiler, module utilities, and other necessary tools. The file Documentation/Changes in the kernel documentation directory always lists the required tool versions; you should consult it before going any further. Trying to build a kernel (and its modules) with the wrong tool versions can lead to no end of subtle, difficult problems. Note that, occasionally, a version of the compiler that is too new can be just as problematic as one that is too old; the kernel source makes a great many assumptions about the compiler, and new releases can sometimes break things for a while.

If you still do not have a kernel tree handy, or have not yet configured and built that kernel, now is the time to go do it. You cannot build loadable modules for a 2.6 kernel without this tree on your filesystem. It is also helpful (though not required) to be actually running the kernel that you are building for.

Once you have everything set up, creating a makefile for your module is straightforward. In fact, for the "hello world" example shown earlier in this chapter, a single line will suffice:

obj-m := hello.o

Readers who are familiar with make, but not with the 2.6 kernel build system, are likely to be wondering how this makefile works. The above line is not how a traditional makefile looks, after all. The answer, of course, is that the kernel build system handles the rest. The assignment above (which takes advantage of the extended syntax provided by GNU make) states that there is one module to be built from the object file hello.o. The resulting module is named hello.ko after being built from the object file.

If, instead, you have a module called module.ko that is generated from two source files (called, say, file1.c and file2.c), the correct incantation would be:

obj-m := module.o
module-objs := file1.o file2.o

For a makefile like those shown above to work, it must be invoked within the context of the larger kernel build system. If your kernel source tree is located in, say, your ~/kernel-2.6 directory, the make command required to build your module (typed in the directory containing the module source and makefile) would be:

make -C ~/kernel-2.6 M=`pwd` modules

This command starts by changing its directory to the one provided with the -C option (that is, your kernel source directory). There it finds the kernel's top-level makefile. The M= option causes that makefile to move back into your module source directory before trying to build the modules target. This target, in turn, refers to the list of modules found in the obj-m variable, which we've set to module.o in our examples.

Typing the previous make command can get tiresome after a while, so the kernel developers have developed a sort of makefile idiom, which makes life easier for those building modules outside of the kernel tree. The trick is to write your makefile as follows:

# If KERNELRELEASE is defined, we've been invoked from the
# kernel build system and can use its language.
ifneq ($(KERNELRELEASE),)
    obj-m := hello.o
# Otherwise we were called directly from the command
# line; invoke the kernel build system.
else
    KERNELDIR ?= /lib/modules/$(shell uname -r)/build
    PWD  := $(shell pwd)
default:
	$(MAKE) -C $(KERNELDIR) M=$(PWD) modules
endif

Once again, we are seeing the extended GNU make syntax in action. This makefile is read twice on a typical build. When the makefile is invoked from the command line, it notices that the KERNELRELEASE variable has not been set. It locates the kernel source directory by taking advantage of the fact that the symbolic link build in the installed modules directory points back at the kernel build tree. If you are not actually running the kernel that you are building for, you can supply a KERNELDIR= option on the command line, set the KERNELDIR environment variable, or rewrite the line that sets KERNELDIR in the makefile. Once the kernel source tree has been found, the makefile invokes the default: target, which runs a second make command (parameterized in the makefile as $(MAKE)) to invoke the kernel build system as described previously. On the second reading, the makefile sets obj-m, and the kernel makefiles take care of actually building the module.

This mechanism for building modules may strike you as a bit unwieldy and obscure. Once you get used to it, however, you will likely appreciate the capabilities that have been programmed into the kernel build system. Do note that the above is not a complete makefile; a real makefile includes the usual sort of targets for cleaning up unneeded files, installing modules, etc. See the makefiles in the example source directory for a complete example.

2.4.2. Loading and Unloading Modules

After the module is built, the next step is loading it into the kernel. As we've already pointed out, insmod does the job for you. The program loads the module code and data into the kernel, which, in turn, performs a function similar to that of ld, in that it links any unresolved symbol in the module to the symbol table of the kernel. Unlike the linker, however, the kernel doesn't modify the module's disk file, but rather an in-memory copy. insmod accepts a number of command-line options (for details, see the manpage), and it can assign values to parameters in your module before linking it to the current kernel. Thus, if a module is correctly designed, it can be configured at load time; load-time configuration gives the user more flexibility than compile-time configuration, which is still used sometimes. Load-time configuration is explained in Section 2.8 later in this chapter.

Interested readers may want to look at how the kernel supports insmod: it relies on a system call defined in kernel/module.c. The function sys_init_module allocates kernel memory to hold a module (this memory is allocated with vmalloc; see Section 8.4 in Chapter 8); it then copies the module text into that memory region, resolves kernel references in the module via the kernel symbol table, and calls the module's initialization function to get everything going.
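The symbol-resolution step can be pictured with a small user-space model. The sketch below is not kernel code: struct toy_symbol, toy_lookup, and toy_resolve are hypothetical names invented for illustration, and the "symbol table" is just an array. It only shows the idea that every undefined name in the module must be found in the table, and that a single miss is enough to refuse the load.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the kernel symbol table. */
struct toy_symbol {
    const char *name;
    unsigned long addr;
};

/* Look up one name; returns 0 when the table does not define it. */
static unsigned long toy_lookup(const struct toy_symbol *table, size_t n,
                                const char *name)
{
    size_t i;

    assert(table != NULL && name != NULL);
    for (i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].addr;
    return 0;
}

/*
 * Resolve every undefined reference of a "module" against the table,
 * storing the addresses in out[]. Returns 0 on success, or -1 at the
 * first name that cannot be resolved (the load would be refused).
 */
static int toy_resolve(const struct toy_symbol *table, size_t n,
                       const char *const *undefined, size_t count,
                       unsigned long *out)
{
    size_t i;

    for (i = 0; i < count; i++) {
        out[i] = toy_lookup(table, n, undefined[i]);
        if (out[i] == 0)
            return -1;
    }
    return 0;
}
```

A caller would fill an array of toy_symbol entries, pass the list of names its "module" needs, and treat a return value of -1 as a failed load.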

If you actually look in the kernel source, you'll find that the names of the system calls are prefixed with sys_. This is true for all system calls and no other functions; it's useful to keep this in mind when grepping for the system calls in the sources.

The modprobe utility is worth a quick mention. modprobe, like insmod, loads a module into the kernel. It differs in that it will look at the module to be loaded to see whether it references any symbols that are not currently defined in the kernel. If any such references are found, modprobe looks for other modules in the current module search path that define the relevant symbols. When modprobe finds those modules (which are needed by the module being loaded), it loads them into the kernel as well. If you use insmod in this situation instead, the command fails with an "unresolved symbols" message left in the system logfile.
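The difference between the two utilities comes down to load order. As a rough illustration only, the following user-space sketch models "load what I need first, then load me" with a recursive walk. The names struct toy_module and toy_modprobe are hypothetical, dependencies are given as array indices rather than discovered from symbols, and the sketch assumes the dependency graph has no cycles.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPS 4

/* Hypothetical description of a module and the modules it needs. */
struct toy_module {
    const char *name;
    int deps[TOY_MAX_DEPS];   /* indices into the module array */
    int ndeps;                /* number of valid entries in deps[] */
    int loaded;               /* nonzero once "inserted" */
};

/*
 * Load module idx after everything it depends on, appending each index
 * to order[] as it is "inserted". Returns the new length of order[].
 * Assumes the dependency graph has no cycles.
 */
static int toy_modprobe(struct toy_module *mods, int idx, int *order, int len)
{
    int i;

    assert(mods != NULL && order != NULL);
    if (mods[idx].loaded)
        return len;           /* already in the "kernel": nothing to do */
    for (i = 0; i < mods[idx].ndeps; i++)
        len = toy_modprobe(mods, mods[idx].deps[i], order, len);
    mods[idx].loaded = 1;
    order[len++] = idx;
    return len;
}
```

Asking for a module whose dependencies are not yet loaded yields an order in which those dependencies come first; an insmod-like loader, by contrast, would attempt only the requested module and fail.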

As mentioned before, modules may be removed from the kernel with the rmmod utility. Note that module removal fails if the kernel believes that the module is still in use (e.g., a program still has an open file for a device exported by the modules), or if the kernel has been configured to disallow module removal. It is possible to configure the kernel to allow "forced" removal of modules, even when they appear to be busy. If you reach a point where you are considering using this option, however, things are likely to have gone wrong badly enough that a reboot may well be the better course of action.

The lsmod program produces a list of the modules currently loaded in the kernel. Some other information, such as any other modules making use of a specific module, is also provided. lsmod works by reading the /proc/modules virtual file. Information on currently loaded modules can also be found in the sysfs virtual filesystem under /sys/module.
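Reading /proc/modules yourself is not hard. The sketch below is a minimal user-space approximation of what lsmod does, not its actual source. It assumes that each line begins with the module name, its size, a numeric use count, and a comma-separated list of users (or "-" when there are none); any further fields on the line are ignored. The names struct mod_info, parse_module_line, and list_modules are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields taken from the start of one /proc/modules line. */
struct mod_info {
    char name[64];
    unsigned long size;      /* memory used by the module, in bytes */
    int refcount;            /* current use count */
    char users[256];         /* comma-separated users, or "-" if none */
};

/* Parse one line; returns 0 on success, -1 if the line is malformed. */
static int parse_module_line(const char *line, struct mod_info *out)
{
    assert(line != NULL && out != NULL);
    if (sscanf(line, "%63s %lu %d %255s",
               out->name, &out->size, &out->refcount, out->users) != 4)
        return -1;
    return 0;
}

/* Print name, size, use count, and users for each module listed in path. */
static int list_modules(const char *path)
{
    char line[512];
    struct mod_info info;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL)
        if (parse_module_line(line, &info) == 0)
            printf("%-24s %8lu %3d %s\n", info.name, info.size,
                   info.refcount,
                   strcmp(info.users, "-") == 0 ? "" : info.users);
    fclose(fp);
    return 0;
}
```

Calling list_modules("/proc/modules") on a running system prints one line per loaded module; lines that do not fit the assumed layout are silently skipped.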

2.4.3. Version Dependency

Bear in mind that your module's code has to be recompiled for each version of the kernel that it is linked to (at least, in the absence of modversions, which are not covered here as they are more for distribution makers than developers). Modules are strongly tied to the data structures and function prototypes defined in a particular kernel version; the interface seen by a module can change significantly from one kernel version to the next. This is especially true of development kernels, of course.

The kernel does not just assume that a given module has been built against the proper kernel version. One of the steps in the build process is to link your module against a file (called vermagic.o) from the current kernel tree; this object contains a fair amount of information about the kernel the module was built for, including the target kernel version, compiler version, and the settings of a number of important configuration variables. When an attempt is made to load a module, this information can be tested for compatibility with the running kernel. If things don't match, the module is not loaded; instead, you see something like:

# insmod hello.ko
Error inserting './hello.ko': -1 Invalid module format

A look in the system log file (/var/log/messages or whatever your system is configured to use) will reveal the specific problem that caused the module to fail to load.

If you need to compile a module for a specific kernel version, you will need to use the build system and source tree for that particular version. A simple change to the KERNELDIR variable in the example makefile shown previously does the trick.

Kernel interfaces often change between releases. If you are writing a module that is intended to work with multiple versions of the kernel (especially if it must work across major releases), you likely have to make use of macros and #ifdef constructs to make your code build properly. This edition of this book only concerns itself with one major version of the kernel, so you do not often see version tests in our example code. But the need for them does occasionally arise. In such cases, you want to make use of the definitions found in linux/version.h. This header file, automatically included by linux/module.h, defines the following macros:

UTS_RELEASE

This macro expands to a string describing the version of this kernel tree. For example, "2.6.10".

LINUX_VERSION_CODE

This macro expands to the binary representation of the kernel version, one byte for each part of the version release number. For example, the code for 2.6.10 is 132618 (i.e., 0x02060a).[2] With this information, you can (almost) easily determine what version of the kernel you are dealing with.

[2] This allows up to 256 development versions between stable versions.

KERNEL_VERSION(major,minor,release)

This is the macro used to build an integer version code from the individual numbers that build up a version number. For example, KERNEL_VERSION(2,6,10) expands to 132618. This macro is very useful when you need to compare the current version and a known checkpoint.
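The one-byte-per-part encoding is easy to check by hand: 2 * 65536 + 6 * 256 + 10 = 132618, which is 0x02060a. The user-space sketch below reproduces that arithmetic so you can experiment with it outside a kernel tree. TOY_KERNEL_VERSION, toy_decode_version, and toy_version_string are names invented here; in a module you would use the macros from linux/version.h instead.

```c
#include <assert.h>
#include <stdio.h>

/* User-space copy of the arithmetic described above: one byte per part. */
#define TOY_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Split a version code back into its three parts. */
static void toy_decode_version(unsigned int code, unsigned int *major,
                               unsigned int *minor, unsigned int *release)
{
    assert(major != NULL && minor != NULL && release != NULL);
    *major = (code >> 16) & 0xff;
    *minor = (code >> 8) & 0xff;
    *release = code & 0xff;
}

/* Format a version code as "major.minor.release" into buf. */
static int toy_version_string(unsigned int code, char *buf, size_t len)
{
    unsigned int major, minor, release;

    toy_decode_version(code, &major, &minor, &release);
    return snprintf(buf, len, "%u.%u.%u", major, minor, release);
}
```

Because the code is a plain integer, comparing two versions is an ordinary numeric comparison, which is what makes tests against a known checkpoint convenient.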

Most dependencies based on the kernel version can be worked around with preprocessor conditionals by exploiting KERNEL_VERSION and LINUX_VERSION_CODE. Version dependency should, however, not clutter driver code with hairy #ifdef conditionals; the best way to deal with incompatibilities is by confining them to a specific header file. As a general rule, code which is explicitly version (or platform) dependent should be hidden behind a low-level macro or function. High-level code can then just call those functions without concern for the low-level details. Code written in this way tends to be easier to read and more robust.
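To make the advice concrete, here is one way such a compatibility layer might be laid out. Everything in this sketch is hypothetical: toy_register_old and toy_register_new stand in for an interface that changed between kernel versions, compat_register is the wrapper that hides the difference, and the two fallback #define lines exist only so the fragment compiles outside a kernel tree (in a module, both macros come from linux/version.h).

```c
#include <assert.h>

/*
 * Stand-ins so that this sketch compiles outside a kernel tree; in a
 * real module both macros come from <linux/version.h>.
 */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(2, 6, 10)
#endif

/* Hypothetical old and new forms of an interface that changed. */
static inline int toy_register_old(int id)
{
    assert(id >= 0);
    return id;
}

static inline int toy_register_new(int id, int flags)
{
    assert(id >= 0);
    return id + flags;
}

/* ---- the part that would live in a compatibility header ---- */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 0)
#define compat_register(id) toy_register_new((id), 0)
#else
#define compat_register(id) toy_register_old(id)
#endif

/* ---- high-level driver code: no version tests in sight ---- */
static int driver_setup(void)
{
    return compat_register(42);
}
```

The only #if in the whole fragment sits next to the wrapper's definition; driver_setup reads the same no matter which kernel it is built against.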

2.4.4. Platform Dependency

Each computer platform has its peculiarities, and kernel designers are free to exploit all the peculiarities to achieve better performance in the target object file.

Unlike application developers, who must link their code with precompiled libraries and stick to conventions on parameter passing, kernel developers can dedicate some processor registers to specific roles, and they have done so. Moreover, kernel code can be optimized for a specific processor in a CPU family to get the best from the target platform: unlike applications that are often distributed in binary format, a custom compilation of the kernel can be optimized for a specific computer set.

For example, the IA32 (x86) architecture has been subdivided into several different processor types. The old 80386 processor is still supported (for now), even though its instruction set is, by modern standards, quite limited. The more modern processors in this architecture have introduced a number of new capabilities, including faster instructions for entering the kernel, interprocessor locking, copying data, etc. Newer processors can also, when operated in the correct mode, employ 36-bit (or larger) physical addresses, allowing them to address more than 4 GB of physical memory. Other processor families have seen similar improvements. The kernel, depending on various configuration options, can be built to make use of these additional features.

Clearly, if a module is to work with a given kernel, it must be built with the same understanding of the target processor as that kernel was. Once again, the vermagic.o object comes into play. When a module is loaded, the kernel checks the processor-specific configuration options for the module and makes sure they match the running kernel. If the module was compiled with different options, it is not loaded.

If you are planning to write a driver for general distribution, you may well be wondering just how you can possibly support all these different variations. The best answer, of course, is to release your driver under a GPL-compatible license and contribute it to the mainline kernel. Failing that, distributing your driver in source form and a set of scripts to compile it on the user's system may be the best answer. Some vendors have released tools to make this task easier. If you must distribute your driver in binary form, you need to look at the different kernels provided by your target distributions, and provide a version of the module for each. Be sure to take into account any errata kernels that may have been released since the distribution was produced. Then, there are licensing issues to be considered, as we discussed in Section 1.6. As a general rule, distributing things in source form is an easier way to make your way in the world.