8.4. vmalloc and Friends
The next memory allocation function that we show you is vmalloc, which allocates a contiguous memory region in the virtual address space. Although the pages are not consecutive in physical memory (each page is retrieved with a separate call to alloc_page), the kernel sees them as a contiguous range of addresses. vmalloc returns 0 (the NULL address) if an error occurs; otherwise, it returns a pointer to a linear memory area of size at least size.
We describe vmalloc here because it is one of the fundamental Linux memory allocation mechanisms. We should note, however, that use of vmalloc is discouraged in most situations. Memory obtained from vmalloc is slightly less efficient to work with, and, on some architectures, the amount of address space set aside for vmalloc is relatively small. Code that uses vmalloc is likely to get a chilly reception if submitted for inclusion in the kernel. If possible, you should work directly with individual pages rather than trying to smooth things over with vmalloc.
That said, let's see how vmalloc works. The prototypes of the function and its relatives (ioremap, which is not strictly an allocation function, is discussed later in this section) are as follows:
#include <linux/vmalloc.h>
void *vmalloc(unsigned long size);
void vfree(void * addr);
void *ioremap(unsigned long offset, unsigned long size);
void iounmap(void * addr);
It's worth stressing that memory addresses returned by kmalloc and __get_free_pages are also virtual addresses. Their actual value is still massaged by the MMU (the memory management unit, usually part of the CPU) before it is used to address physical memory.[4] vmalloc is not different in how it uses the hardware, but rather in how the kernel performs the allocation task.
[4] Actually, some architectures define ranges of "virtual" addresses as reserved to address physical memory. When this happens, the Linux kernel takes advantage of the feature, and both the kernel and __get_free_pages addresses lie in one of those memory ranges. The difference is transparent to device drivers and other code that is not directly involved with the memory-management kernel subsystem.
The (virtual) address range used by kmalloc and __get_free_pages features a one-to-one mapping to physical memory, possibly shifted by a constant PAGE_OFFSET value; the functions don't need to modify the page tables for that address range. The address range used by vmalloc and ioremap, on the other hand, is completely synthetic, and each allocation builds the (virtual) memory area by suitably setting up the page tables.
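The one-to-one mapping can be pictured as plain arithmetic. The following user-space sketch is only an illustration: the constant EXAMPLE_PAGE_OFFSET (set to 0xC0000000, the value commonly used on 32-bit x86) and the helper names example_virt_to_phys and example_phys_to_virt are assumptions made for this example, not kernel interfaces.

```c
#include <stdint.h>

/* Assumed value for illustration: the usual PAGE_OFFSET on 32-bit x86. */
#define EXAMPLE_PAGE_OFFSET UINT32_C(0xC0000000)

/* Hypothetical helper: linear-mapped virtual address -> physical address. */
static uint32_t example_virt_to_phys(uint32_t vaddr)
{
    return vaddr - EXAMPLE_PAGE_OFFSET;
}

/* Hypothetical helper: physical address -> linear-mapped virtual address. */
static uint32_t example_phys_to_virt(uint32_t paddr)
{
    return paddr + EXAMPLE_PAGE_OFFSET;
}
```

Under these assumptions, virtual address 0xc0100000 corresponds to physical address 0x00100000. No such shortcut exists for addresses in the vmalloc range, which can be resolved only through the page tables.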
This difference can be perceived by comparing the pointers returned by the allocation functions. On some platforms (for example, the x86), addresses returned by vmalloc are just beyond the addresses that kmalloc uses. On other platforms (for example, MIPS, IA-64, and x86_64), they belong to a completely different address range. Addresses available for vmalloc are in the range from VMALLOC_START to VMALLOC_END. Both symbols are defined in <asm/pgtable.h>.
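A range test against these two symbols is enough to tell whether a pointer belongs to the vmalloc area. The sketch below models that test in user space. EXAMPLE_VMALLOC_START, EXAMPLE_VMALLOC_END, and example_in_vmalloc_range are hypothetical names with made-up values, because the real limits depend on the architecture and on the amount of installed memory; treating the range as half-open is also an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Made-up limits for illustration only; real values come from <asm/pgtable.h>. */
#define EXAMPLE_VMALLOC_START UINT32_C(0xD0800000)
#define EXAMPLE_VMALLOC_END   UINT32_C(0xFF000000)

/* Hypothetical helper: true if addr lies in [START, END). */
static bool example_in_vmalloc_range(uint32_t addr)
{
    return addr >= EXAMPLE_VMALLOC_START && addr < EXAMPLE_VMALLOC_END;
}
```

With these illustrative limits, an address such as 0xd087a000 falls inside the range, while 0xccc58000 does not.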
Addresses allocated by vmalloc can't be used outside of the microprocessor, because they make sense only on top of the processor's MMU. When a driver needs a real physical address (such as a DMA address, used by peripheral hardware to drive the system's bus), you can't easily use vmalloc. The right time to call vmalloc is when you are allocating memory for a large sequential buffer that exists only in software. It's important to note that vmalloc has more overhead than __get_free_pages, because it must both retrieve the memory and build the page tables. Therefore, it doesn't make sense to call vmalloc to allocate just one page.
An example of a function in the kernel that uses vmalloc is the create_module system call, which uses vmalloc to get space for the module being created. Code and data of the module are later copied to the allocated space using copy_from_user. In this way, the module appears to be loaded into contiguous memory. You can verify, by looking in /proc/kallsyms, that kernel symbols exported by modules lie in a different memory range from symbols exported by the kernel proper.
Memory allocated with vmalloc is released by vfree, in the same way that kfree releases memory allocated by kmalloc.
Like vmalloc, ioremap builds new page tables; unlike vmalloc, however, it doesn't actually allocate any memory. The return value of ioremap is a special virtual address that can be used to access the specified physical address range; the virtual address obtained is eventually released by calling iounmap.
ioremap is most useful for mapping the (physical) address of a PCI buffer to (virtual) kernel space. For example, it can be used to access the frame buffer of a PCI video device; such buffers are usually mapped at high physical addresses, outside of the address range for which the kernel builds page tables at boot time. PCI issues are explained in more detail in Chapter 12.
It's worth noting that for the sake of portability, you should not directly access addresses returned by ioremap as if they were pointers to memory. Rather, you should always use readb and the other I/O functions introduced in Chapter 9. This requirement applies because some platforms, such as the Alpha, are unable to directly map PCI memory regions to the processor address space because of differences between PCI specs and Alpha processors in how data is transferred.
Both ioremap and vmalloc are page oriented (they work by modifying the page tables); consequently, the relocated or allocated size is rounded up to the nearest page boundary. ioremap simulates an unaligned mapping by "rounding down" the address to be remapped and by returning an offset into the first remapped page.
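The rounding just described can be sketched as simple arithmetic. This user-space example is an illustration under stated assumptions, not the kernel's implementation: the 4096-byte page size, the structure example_mapping, and the function example_page_round are all hypothetical names introduced here.

```c
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE UINT64_C(4096)              /* assumed page size */
#define EXAMPLE_PAGE_MASK (~(EXAMPLE_PAGE_SIZE - 1))

/* Hypothetical description of a page-granular mapping. */
struct example_mapping {
    uint64_t base;    /* page-aligned start of the mapped region */
    uint64_t length;  /* mapped length, a multiple of the page size */
    uint64_t offset;  /* offset of the requested address in the first page */
};

static struct example_mapping example_page_round(uint64_t addr, uint64_t size)
{
    struct example_mapping m;

    m.offset = addr & ~EXAMPLE_PAGE_MASK;   /* low bits lost by rounding down */
    m.base   = addr & EXAMPLE_PAGE_MASK;    /* round the start address down */
    /* Round the end of the request up to the next page boundary. */
    m.length = (m.offset + size + EXAMPLE_PAGE_SIZE - 1) & EXAMPLE_PAGE_MASK;
    return m;
}
```

For instance, with a made-up physical address of 0xfebc1010 and a size of 0x2000 bytes, the sketch maps three pages (0x3000 bytes) starting at 0xfebc1000, and the caller would be handed the mapped base plus an offset of 0x10.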
One minor drawback of vmalloc is that it can't be used in atomic context because, internally, it uses kmalloc(GFP_KERNEL) to acquire storage for the page tables, and therefore could sleep. This shouldn't be a problem: if the use of __get_free_page isn't good enough for an interrupt handler, the software design needs some cleaning up.
8.4.1. A scull Using Virtual Addresses: scullv
Sample code using vmalloc is provided in the scullv module. Like scullp, this module is a stripped-down version of scull that uses a different allocation function to obtain space for the device to store data.
The module allocates memory 16 pages at a time. The allocation is done in large chunks to achieve better performance than scullp and to show something that takes too long with other allocation techniques to be feasible. Allocating more than one page with __get_free_pages is failure prone, and even when it succeeds, it can be slow. As we saw earlier, vmalloc is faster than other functions in allocating several pages, but somewhat slower when retrieving a single page, because of the overhead of page-table building. scullv is designed like scullp. order specifies the "order" of each allocation and defaults to 4. The only difference between scullv and scullp is in allocation management. These lines use vmalloc to obtain new memory:
/* Allocate a quantum using virtual addresses */
if (!dptr->data[s_pos]) {
    dptr->data[s_pos] =
        (void *)vmalloc(PAGE_SIZE << dptr->order);
    if (!dptr->data[s_pos])
        goto nomem;
    memset(dptr->data[s_pos], 0, PAGE_SIZE << dptr->order);
}
and these lines release memory:
/* Release the quantum-set */
for (i = 0; i < qset; i++)
    if (dptr->data[i])
        vfree(dptr->data[i]);
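The allocate-on-first-use and release pattern of the two fragments can be tried outside the kernel with a small model. The sketch below is a user-space approximation, not the scullv source: malloc and free stand in for vmalloc and vfree, the page size is assumed to be 4096 bytes, and struct example_dev, example_get_quantum, and example_release are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_PAGE_SIZE 4096UL   /* assumed page size */

/* Hypothetical, simplified stand-in for the per-device structure. */
struct example_dev {
    void **data;   /* array of qset quantum pointers */
    int order;     /* each quantum holds PAGE_SIZE << order bytes */
    int qset;      /* number of slots in data[] */
};

/* Allocate and zero slot s_pos on first use; malloc stands in for vmalloc. */
static int example_get_quantum(struct example_dev *dev, int s_pos)
{
    size_t bytes = EXAMPLE_PAGE_SIZE << dev->order;

    if (!dev->data[s_pos]) {
        dev->data[s_pos] = malloc(bytes);
        if (!dev->data[s_pos])
            return -1;              /* the driver would take its nomem path */
        memset(dev->data[s_pos], 0, bytes);
    }
    return 0;
}

/* Release every allocated quantum; free stands in for vfree. */
static void example_release(struct example_dev *dev)
{
    int i;

    for (i = 0; i < dev->qset; i++) {
        free(dev->data[i]);         /* free(NULL) is a no-op */
        dev->data[i] = NULL;        /* avoid leaving a dangling pointer */
    }
}
```

With order set to 4, each quantum in this model is 16 pages, or 65,536 bytes under the assumed page size. A second request for the same slot reuses the existing buffer rather than allocating again.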
If you compile both modules with debugging enabled, you can look at their data allocation by reading the files they create in /proc. This snapshot was taken on an x86_64 system:
salma% cat /tmp/bigfile > /dev/scullp0; head -5 /proc/scullpmem
Device 0: qset 500, order 0, sz 1535135
item at 000001001847da58, qset at 000001001db4c000
0:000001001db56000
1:000001003d1c7000
salma% cat /tmp/bigfile > /dev/scullv0; head -5 /proc/scullvmem
Device 0: qset 500, order 4, sz 1535135
item at 000001001847da58, qset at 0000010013dea000
0:ffffff0001177000
1:ffffff0001188000
The following output, instead, came from an x86 system:
rudo% cat /tmp/bigfile > /dev/scullp0; head -5 /proc/scullpmem
Device 0: qset 500, order 0, sz 1535135
item at ccf80e00, qset at cf7b9800
0:ccc58000
1:cccdd000
rudo% cat /tmp/bigfile > /dev/scullv0; head -5 /proc/scullvmem
Device 0: qset 500, order 4, sz 1535135
item at cfab4800, qset at cf8e4000
0:d087a000
1:d08d2000
The values show two different behaviors. On x86_64, physical addresses and virtual addresses are mapped to completely different address ranges (0x100 and 0xffffff00), while on x86 computers, vmalloc returns virtual addresses just above the mapping used for physical memory.