5.5. Spinlocks
Semaphores are a useful tool for mutual exclusion, but they are not the only such tool provided by the kernel. Instead, most locking is implemented with a mechanism called a spinlock. Unlike semaphores, spinlocks may be used in code that cannot sleep, such as interrupt handlers. When properly used, spinlocks offer higher performance than semaphores in general. They do, however, bring a different set of constraints on their use.
Spinlocks are simple in concept. A spinlock is a mutual exclusion device that can have only two values: "locked" and "unlocked." It is usually implemented as a single bit in an integer value. Code wishing to take out a particular lock tests the relevant bit. If the lock is available, the "locked" bit is set and the code continues into the critical section. If, instead, the lock has been taken by somebody else, the code goes into a tight loop where it repeatedly checks the lock until it becomes available. This loop is the "spin" part of a spinlock.
Of course, the real implementation of a spinlock is a bit more complex than the description above. The "test and set" operation must be done in an atomic manner so that only one thread can obtain the lock, even if several are spinning at any given time. Care must also be taken to avoid deadlocks on hyperthreaded processors (chips that implement multiple, virtual CPUs sharing a single processor core and cache). So the actual spinlock implementation is different for every architecture that Linux supports. The core concept is the same on all systems, however: when there is contention for a spinlock, the processors that are waiting execute a tight loop and accomplish no useful work.
Spinlocks are, by their nature, intended for use on multiprocessor systems, although a uniprocessor workstation running a preemptive kernel behaves like SMP, as far as concurrency is concerned. If a nonpreemptive uniprocessor system ever went into a spin on a lock, it would spin forever; no other thread would ever be able to obtain the CPU to release the lock. For this reason, spinlock operations on uniprocessor systems without preemption enabled are optimized to do nothing, with the exception of the ones that change the IRQ masking status. Because of preemption, even if you never expect your code to run on an SMP system, you still need to implement proper locking.
5.5.1. Introduction to the Spinlock API
The required include file for the spinlock primitives is <linux/spinlock.h>. An actual lock has the type spinlock_t. Like any other data structure, a spinlock must be initialized. This initialization may be done at compile time as follows:
spinlock_t my_lock = SPIN_LOCK_UNLOCKED;
or at runtime with:
void spin_lock_init(spinlock_t *lock);
Before entering a critical section, your code must obtain the requisite lock with:
void spin_lock(spinlock_t *lock);
Note that all spinlock waits are, by their nature, uninterruptible. Once you call spin_lock, you will spin until the lock becomes available.
To release a lock that you have obtained, pass it to:
void spin_unlock(spinlock_t *lock);
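To tie these calls together, here is a minimal sketch, assuming a made-up my_dev structure with a count field protected by the lock (and with <linux/spinlock.h> included):

struct my_dev {
    spinlock_t lock;
    int count;
};

static void my_dev_setup(struct my_dev *dev)
{
    spin_lock_init(&dev->lock);     /* runtime initialization */
    dev->count = 0;
}

static void my_dev_increment(struct my_dev *dev)
{
    spin_lock(&dev->lock);          /* enter the critical section */
    dev->count++;                   /* quick, non-sleeping work only */
    spin_unlock(&dev->lock);        /* leave the critical section */
}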
There are many other spinlock functions, and we will look at them all shortly. But none of them depart from the core idea shown by the functions listed above. There is very little that one can do with a lock, other than lock it and release it. However, there are a few rules about how you must work with spinlocks. We will take a moment to look at those before getting into the full spinlock interface.
5.5.2. Spinlocks and Atomic Context
Imagine for a moment that your driver acquires a spinlock and goes about its business within its critical section. Somewhere in the middle, your driver loses the processor. Perhaps it has called a function (copy_from_user, say) that puts the process to sleep. Or, perhaps, kernel preemption kicks in, and a higher-priority process pushes your code aside. Your code is now holding a lock that it will not release any time in the foreseeable future. If some other thread tries to obtain the same lock, it will, in the best case, wait (spinning in the processor) for a very long time. In the worst case, the system could deadlock entirely.
Most readers would agree that this scenario is best avoided. Therefore, the core rule that applies to spinlocks is that any code must, while holding a spinlock, be atomic. It cannot sleep; in fact, it cannot relinquish the processor for any reason except to service interrupts (and sometimes not even then).
The kernel preemption case is handled by the spinlock code itself. Any time kernel code holds a spinlock, preemption is disabled on the relevant processor. Even uniprocessor systems must disable preemption in this way to avoid race conditions. That is why proper locking is required even if you never expect your code to run on a multiprocessor machine.
Avoiding sleep while holding a lock can be more difficult; many kernel functions can sleep, and this behavior is not always well documented. Copying data to or from user space is an obvious example: the required user-space page may need to be swapped in from the disk before the copy can proceed, and that operation clearly requires a sleep. Just about any operation that must allocate memory can sleep; kmalloc can decide to give up the processor and wait for more memory to become available unless it is explicitly told not to. Sleeps can happen in surprising places; writing code that will execute under a spinlock requires paying attention to every function that you call.
Here's another scenario: your driver is executing and has just taken out a lock that controls access to its device. While the lock is held, the device issues an interrupt, which causes your interrupt handler to run. The interrupt handler, before accessing the device, must also obtain the lock. Taking out a spinlock in an interrupt handler is a legitimate thing to do; that is one of the reasons that spinlock operations do not sleep. But what happens if the interrupt routine executes in the same processor as the code that took out the lock originally? While the interrupt handler is spinning, the noninterrupt code will not be able to run to release the lock. That processor will spin forever.
Avoiding this trap requires disabling interrupts (on the local CPU only) while the spinlock is held. There are variants of the spinlock functions that will disable interrupts for you (we'll see them in the next section). However, a complete discussion of interrupts must wait until Chapter 10.
The last important rule for spinlock usage is that spinlocks must always be held for the minimum time possible. The longer you hold a lock, the longer another processor may have to spin waiting for you to release it, and the chance of it having to spin at all is greater. Long lock hold times also keep the current processor from scheduling, meaning that a higher-priority process (which really should be able to get the CPU) may have to wait. The kernel developers put a great deal of effort into reducing kernel latency (the time a process may have to wait to be scheduled) in the 2.5 development series. A poorly written driver can wipe out all that progress just by holding a lock for too long. To avoid creating this sort of problem, make a point of keeping your lock-hold times short.
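As a hedged sketch of this discipline (the my_dev and item structures, and the item_list field, are invented for illustration and assume <linux/slab.h> and <linux/list.h> are included), anything that might sleep, such as a GFP_KERNEL allocation, is done before the lock is taken, and the critical section is kept to the few lines that actually touch shared data:

struct item {                       /* hypothetical list element */
    struct list_head list;
    int data;
};

static int my_add_item(struct my_dev *dev, int data)
{
    struct item *new;

    new = kmalloc(sizeof(*new), GFP_KERNEL);  /* may sleep; call it before locking */
    if (!new)
        return -ENOMEM;
    new->data = data;

    spin_lock(&dev->lock);                    /* short, atomic critical section */
    list_add(&new->list, &dev->item_list);
    spin_unlock(&dev->lock);
    return 0;
}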
5.5.3. The Spinlock Functions
We have already seen two functions, spin_lock and spin_unlock, that manipulate spinlocks. There are several other functions, however, with similar names and purposes. We will now present the full set. This discussion will take us into ground we will not be able to cover properly for a few chapters yet; a complete understanding of the spinlock API requires an understanding of interrupt handling and related concepts.
There are actually four functions that can lock a spinlock:
void spin_lock(spinlock_t *lock);
void spin_lock_irqsave(spinlock_t *lock, unsigned long flags);
void spin_lock_irq(spinlock_t *lock);
void spin_lock_bh(spinlock_t *lock);
We have already seen how spin_lock works. spin_lock_irqsave disables interrupts (on the local processor only) before taking the spinlock; the previous interrupt state is stored in flags. If you are absolutely sure nothing else might have already disabled interrupts on your processor (or, in other words, you are sure that you should enable interrupts when you release your spinlock), you can use spin_lock_irq instead and not have to keep track of the flags. Finally, spin_lock_bh disables software interrupts before taking the lock, but leaves hardware interrupts enabled.
If you have a spinlock that can be taken by code that runs in (hardware or software) interrupt context, you must use one of the forms of spin_lock that disables interrupts. Doing otherwise can deadlock the system, sooner or later. If you do not access your lock in a hardware interrupt handler, but you do via software interrupts (in code that runs out of a tasklet, for example, a topic covered in Chapter 7), you can use spin_lock_bh to safely avoid deadlocks while still allowing hardware interrupts to be serviced.
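A hedged sketch of that pattern (the my_dev structure and the tasklet are invented for illustration): the process-context path takes the lock with spin_lock_bh so that the tasklet cannot run on the same CPU while the lock is held, while the tasklet itself, already running in software interrupt context, can use plain spin_lock:

static void my_update(struct my_dev *dev)       /* called from process context */
{
    spin_lock_bh(&dev->lock);        /* keep software interrupts (and our tasklet) out */
    /* ... work on data shared with the tasklet ... */
    spin_unlock_bh(&dev->lock);
}

static void my_tasklet_func(unsigned long arg)  /* runs in software interrupt context */
{
    struct my_dev *dev = (struct my_dev *) arg;

    spin_lock(&dev->lock);           /* plain spin_lock suffices inside the tasklet */
    /* ... work on the same shared data ... */
    spin_unlock(&dev->lock);
}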
There are also four ways to release a spinlock; the one you use must correspond to the function you used to take the lock:
void spin_unlock(spinlock_t *lock);
void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags);
void spin_unlock_irq(spinlock_t *lock);
void spin_unlock_bh(spinlock_t *lock);
Each spin_unlock variant undoes the work performed by the corresponding spin_lock function. The flags argument passed to spin_unlock_irqrestore must be the same variable passed to spin_lock_irqsave. You must also call spin_lock_irqsave and spin_unlock_irqrestore in the same function; otherwise, your code may break on some architectures.
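Here is a minimal sketch of that rule, assuming a hypothetical my_dev structure whose data is also touched by the driver's interrupt handler; note that flags is a local variable and that the save and restore happen in the same function:

static void my_start_io(struct my_dev *dev)
{
    unsigned long flags;

    spin_lock_irqsave(&dev->lock, flags);       /* take the lock, disabling local interrupts */
    /* ... manipulate data shared with the interrupt handler ... */
    spin_unlock_irqrestore(&dev->lock, flags);  /* release and restore the saved interrupt state */
}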
There is also a set of nonblocking spinlock operations:
int spin_trylock(spinlock_t *lock);
int spin_trylock_bh(spinlock_t *lock);
These functions return nonzero on success (the lock was obtained), 0 otherwise. There is no "try" version that disables interrupts.
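A minimal sketch of a nonblocking attempt, assuming a hypothetical statistics counter that is nice to update but safe to skip when the lock is busy:

if (spin_trylock(&dev->lock)) {
    dev->stat_updates++;             /* optional bookkeeping, done only if the lock was free */
    spin_unlock(&dev->lock);
}
/* else: the lock was busy; skip the update rather than spin */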
5.5.4. Reader/Writer Spinlocks
The kernel provides a reader/writer form of spinlocks that is directly analogous to the reader/writer semaphores we saw earlier in this chapter. These locks allow any number of readers into a critical section simultaneously, but writers must have exclusive access. Reader/writer locks have a type of rwlock_t, defined in <linux/spinlock.h>. They can be declared and initialized in two ways:
rwlock_t my_rwlock = RW_LOCK_UNLOCKED; /* Static way */
rwlock_t my_rwlock;
rwlock_init(&my_rwlock); /* Dynamic way */
The list of functions available should look reasonably familiar by now. For readers, the following functions are available:
void read_lock(rwlock_t *lock);
void read_lock_irqsave(rwlock_t *lock, unsigned long flags);
void read_lock_irq(rwlock_t *lock);
void read_lock_bh(rwlock_t *lock);
void read_unlock(rwlock_t *lock);
void read_unlock_irqrestore(rwlock_t *lock, unsigned long flags);
void read_unlock_irq(rwlock_t *lock);
void read_unlock_bh(rwlock_t *lock);
Interestingly, there is no read_trylock.
The functions for write access are similar:
void write_lock(rwlock_t *lock);
void write_lock_irqsave(rwlock_t *lock, unsigned long flags);
void write_lock_irq(rwlock_t *lock);
void write_lock_bh(rwlock_t *lock);
int write_trylock(rwlock_t *lock);
void write_unlock(rwlock_t *lock);
void write_unlock_irqrestore(rwlock_t *lock, unsigned long flags);
void write_unlock_irq(rwlock_t *lock);
void write_unlock_bh(rwlock_t *lock);
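As a hedged sketch of how these calls might be paired, assume a made-up device structure with a lookup table protected by an rwlock_t; my_table_find and my_table_set are hypothetical, non-sleeping helpers:

static int my_lookup(struct my_dev *dev, int key)
{
    int result;

    read_lock(&dev->rwlock);           /* any number of readers may hold this at once */
    result = my_table_find(dev, key);
    read_unlock(&dev->rwlock);
    return result;
}

static void my_change(struct my_dev *dev, int key, int value)
{
    write_lock(&dev->rwlock);          /* writers get exclusive access */
    my_table_set(dev, key, value);
    write_unlock(&dev->rwlock);
}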
Reader/writer locks can starve readers just as rwsems can. This behavior is rarely a problem; however, if there is enough lock contention to bring about starvation, performance is poor anyway.