8.2. Lookaside Caches
A device driver often ends up allocating many objects of the same size, over and over. Given that the kernel already maintains a set of memory pools of objects that are all the same size, why not add some special pools for these high-volume objects? In fact, the kernel does implement a facility to create this sort of pool, which is often called a lookaside cache. Device drivers normally do not exhibit the sort of memory behavior that justifies using a lookaside cache, but there can be exceptions; the USB and SCSI drivers in Linux 2.6 use caches.
The cache manager in the Linux kernel is sometimes called the "slab allocator." For that reason, its functions and types are declared in <linux/slab.h>. The slab allocator implements caches that have a type of kmem_cache_t; they are created with a call to kmem_cache_create:
kmem_cache_t *kmem_cache_create(const char *name, size_t size,
                                size_t offset,
                                unsigned long flags,
                                void (*constructor)(void *, kmem_cache_t *,
                                                    unsigned long flags),
                                void (*destructor)(void *, kmem_cache_t *,
                                                   unsigned long flags));
The function creates a new cache object that can host any number of memory areas all of the same size, specified by the size argument. The name argument is associated with this cache and functions as housekeeping information usable in tracking problems; usually, it is set to the name of the type of structure that is cached. The cache keeps a pointer to the name, rather than copying it, so the driver should pass in a pointer to a name in static storage (usually the name is just a literal string). The name cannot contain blanks.
The offset is the offset of the first object in the page; it can be used to ensure a particular alignment for the allocated objects, but you most likely will use 0 to request the default value. flags controls how allocation is done and is a bit mask of the following flags:
SLAB_NO_REAP
Setting this flag protects the cache from being reduced when the system is looking for memory. Setting this flag is normally a bad idea; it is important to avoid restricting the memory allocator's freedom of action unnecessarily.
SLAB_HWCACHE_ALIGN
This flag requires each data object to be aligned to a cache line; actual alignment depends on the cache layout of the host platform. This option can be a good choice if your cache contains items that are frequently accessed on SMP machines. The padding required to achieve cache line alignment can end up wasting significant amounts of memory, however.
SLAB_CACHE_DMA
This flag requires each data object to be allocated in the DMA memory zone.
There is also a set of flags that can be used during the debugging of cache allocations; see mm/slab.c for the details. Usually, however, these flags are set globally via a kernel configuration option on systems used for development.
The constructor and destructor arguments to the function are optional functions (but there can be no destructor without a constructor); the former can be used to initialize newly allocated objects, and the latter can be used to "clean up" objects prior to their memory being released back to the system as a whole.
Constructors and destructors can be useful, but there are a few constraints that you should keep in mind. A constructor is called when the memory for a set of objects is allocated; because that memory may hold several objects, the constructor may be called multiple times. You cannot assume that the constructor will be called as an immediate effect of allocating an object. Similarly, destructors can be called at some unknown future time, not immediately after an object has been freed. Constructors and destructors may or may not be allowed to sleep, according to whether they are passed the SLAB_CTOR_ATOMIC flag (where CTOR is short for constructor).
For convenience, a programmer can use the same function for both the constructor and destructor; the slab allocator always passes the SLAB_CTOR_CONSTRUCTOR flag when the callee is a constructor.
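As an illustration, a single combined constructor/destructor might be structured as follows; note that the object type and field names here are invented for this sketch:

```c
#include <linux/slab.h>

struct my_obj {            /* hypothetical cached object */
    int state;
};

/* One function serving both roles; the SLAB_CTOR_CONSTRUCTOR
 * bit in flags tells us which role we are playing. */
static void my_obj_ctor(void *object, kmem_cache_t *cache,
                        unsigned long flags)
{
    struct my_obj *obj = object;

    if (flags & SLAB_CTOR_CONSTRUCTOR) {
        obj->state = 0;    /* initialize a newly allocated object */
        /* must not sleep here if SLAB_CTOR_ATOMIC is also set */
    } else {
        /* destructor path: nothing to release in this sketch */
    }
}
```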
Once a cache of objects is created, you can allocate objects from it by calling kmem_cache_alloc:
void *kmem_cache_alloc(kmem_cache_t *cache, int flags);
Here, the cache argument is the cache you have created previously; the flags are the same as you would pass to kmalloc and are consulted if kmem_cache_alloc needs to go out and allocate more memory itself.
To free an object, use kmem_cache_free:
void kmem_cache_free(kmem_cache_t *cache, const void *obj);
When driver code is finished with the cache, typically when the module is unloaded, it should free its cache as follows:
int kmem_cache_destroy(kmem_cache_t *cache);
The destroy operation succeeds only if all objects allocated from the cache have been returned to it. Therefore, a module should check the return status from kmem_cache_destroy; a failure indicates some sort of memory leak within the module (since some of the objects have been dropped).
One side benefit to using lookaside caches is that the kernel maintains statistics on cache usage. These statistics may be obtained from /proc/slabinfo.
8.2.1. A scull Based on the Slab Caches: scullc
Time for an example. scullc is a cut-down version of the scull module that implements only the bare device: the persistent memory region. Unlike scull, which uses kmalloc, scullc uses memory caches. The size of the quantum can be modified at compile time and at load time, but not at runtime; that would require creating a new memory cache, and we didn't want to deal with these unneeded details.
scullc is a complete example that can be used to try out the slab allocator. It differs from scull only in a few lines of code. First, we must declare our own slab cache:
/* declare one cache pointer: use it for all devices */
kmem_cache_t *scullc_cache;
The creation of the slab cache is handled (at module load time) in this way:
/* scullc_init: create a cache for our quanta */
scullc_cache = kmem_cache_create("scullc", scullc_quantum,
0, SLAB_HWCACHE_ALIGN, NULL, NULL); /* no ctor/dtor */
if (!scullc_cache) {
    scullc_cleanup( );
    return -ENOMEM;
}
This is how it allocates memory quanta:
/* Allocate a quantum using the memory cache */
if (!dptr->data[s_pos]) {
    dptr->data[s_pos] = kmem_cache_alloc(scullc_cache, GFP_KERNEL);
    if (!dptr->data[s_pos])
        goto nomem;
    memset(dptr->data[s_pos], 0, scullc_quantum);
}
And these lines release memory:
for (i = 0; i < qset; i++)
    if (dptr->data[i])
        kmem_cache_free(scullc_cache, dptr->data[i]);
Finally, at module unload time, we have to return the cache to the system:
/* scullc_cleanup: release the cache of our quanta */
if (scullc_cache)
    kmem_cache_destroy(scullc_cache);
The main differences in passing from scull to scullc are a slight speed improvement and better memory use. Since quanta are allocated from a pool of memory fragments of exactly the right size, their placement in memory is as dense as possible, as opposed to scull quanta, which bring in an unpredictable memory fragmentation.
8.2.2. Memory Pools
There are places in the kernel where memory allocations cannot be allowed to fail. As a way of guaranteeing allocations in those situations, the kernel developers created an abstraction known as a memory pool (or "mempool"). A memory pool is really just a form of a lookaside cache that tries to always keep a list of free memory around for use in emergencies.
A memory pool has a type of mempool_t (defined in <linux/mempool.h>); you can create one with mempool_create:
mempool_t *mempool_create(int min_nr,
                          mempool_alloc_t *alloc_fn,
                          mempool_free_t *free_fn,
                          void *pool_data);
The min_nr argument is the minimum number of allocated objects that the pool should always keep around. The actual allocation and freeing of objects is handled by alloc_fn and free_fn, which have these prototypes:
typedef void *(mempool_alloc_t)(int gfp_mask, void *pool_data);
typedef void (mempool_free_t)(void *element, void *pool_data);
The final parameter to mempool_create (pool_data) is passed to alloc_fn and free_fn.
If need be, you can write special-purpose functions to handle memory allocations for mempools. Usually, however, you just want to let the kernel slab allocator handle that task for you. There are two functions (mempool_alloc_slab and mempool_free_slab) that perform the impedance matching between the memory pool allocation prototypes and kmem_cache_alloc and kmem_cache_free. Thus, code that sets up memory pools often looks like the following:
cache = kmem_cache_create(. . .);
pool = mempool_create(MY_POOL_MINIMUM,
mempool_alloc_slab, mempool_free_slab,
cache);
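Should you instead need special-purpose allocation functions, a minimal sketch might look like the following; the function names are invented here, and pool_data is used (as one possible convention) to carry the object size:

```c
#include <linux/mempool.h>
#include <linux/slab.h>

/* pool_data carries the object size, so one pair of functions
 * can serve pools of differently sized objects. */
static void *my_mempool_alloc(int gfp_mask, void *pool_data)
{
    return kmalloc((size_t) pool_data, gfp_mask);
}

static void my_mempool_free(void *element, void *pool_data)
{
    kfree(element);
}

/* Such a pool would then be created with, e.g.:
 *   pool = mempool_create(4, my_mempool_alloc, my_mempool_free,
 *                         (void *) 128);
 */
```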
Once the pool has been created, objects can be allocated and freed with:
void *mempool_alloc(mempool_t *pool, int gfp_mask);
void mempool_free(void *element, mempool_t *pool);
When the mempool is created, the allocation function will be called enough times to create a pool of preallocated objects. Thereafter, calls to mempool_alloc attempt to acquire additional objects from the allocation function; should that allocation fail, one of the preallocated objects (if any remain) is returned. When an object is freed with mempool_free, it is kept in the pool if the number of preallocated objects is currently below the minimum; otherwise, it is returned to the system.
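Putting the pieces together, the life cycle of a slab-backed mempool might be sketched as follows; MY_POOL_MINIMUM, the cache name, and the object type are all invented for this illustration:

```c
#include <linux/mempool.h>
#include <linux/slab.h>

#define MY_POOL_MINIMUM 4          /* invented reserve size */

struct my_obj { int data; };       /* hypothetical object type */

static kmem_cache_t *my_cache;
static mempool_t *my_pool;

static int my_setup(void)
{
    my_cache = kmem_cache_create("my_objs", sizeof(struct my_obj),
                                 0, 0, NULL, NULL);
    if (!my_cache)
        return -ENOMEM;

    /* Preallocates MY_POOL_MINIMUM objects up front */
    my_pool = mempool_create(MY_POOL_MINIMUM,
                             mempool_alloc_slab, mempool_free_slab,
                             my_cache);
    if (!my_pool) {
        kmem_cache_destroy(my_cache);
        return -ENOMEM;
    }
    return 0;
}

static void my_teardown(void)
{
    mempool_destroy(my_pool);      /* all objects must be back first */
    kmem_cache_destroy(my_cache);
}
```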
A mempool can be resized with:
int mempool_resize(mempool_t *pool, int new_min_nr, int gfp_mask);
This call, if successful, resizes the pool to have at least new_min_nr objects.
If you no longer need a memory pool, return it to the system with:
void mempool_destroy(mempool_t *pool);
You must return all allocated objects before destroying the mempool, or a kernel oops results.
If you are considering using a mempool in your driver, please keep one thing in mind: mempools allocate a chunk of memory that sits in a list, idle and unavailable for any real use. It is easy to consume a great deal of memory with mempools. In almost every case, the preferred alternative is to do without the mempool and simply deal with the possibility of allocation failures instead. If there is any way for your driver to respond to an allocation failure in a way that does not endanger the integrity of the system, do things that way. Use of mempools in driver code should be rare.
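For instance, rather than keeping memory idle in a mempool reserve, a request path can simply report the failure back up to its caller; the names in this fragment are illustrative:

```c
/* Graceful degradation instead of a mempool reserve: the
 * allocation may fail, and the caller is expected to cope
 * (typically by retrying later). */
struct my_req *req = kmem_cache_alloc(my_cache, GFP_KERNEL);
if (!req)
    return -ENOMEM;
```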