17.10. The Socket Buffers


We've now covered most of the issues related to network interfaces. What's still missing is some more detailed discussion of the sk_buff structure. The structure is at the core of the network subsystem of the Linux kernel, and we now introduce both the main fields of the structure and the functions used to act on it.

Although there is no strict need to understand the internals of sk_buff, the ability to look at its contents can be helpful when you are tracking down problems and when you are trying to optimize your code. For example, if you look in loopback.c, you'll find an optimization based on knowledge of the sk_buff internals. The usual warning applies here: if you write code that takes advantage of knowledge of the sk_buff structure, you should be prepared to see it break with future kernel releases. Still, sometimes the performance advantages justify the additional maintenance cost.

We are not going to describe the whole structure here, just the fields that might be used from within a driver. If you want to see more, you can look at <linux/skbuff.h>, where the structure is defined and the functions are prototyped. Additional details about how the fields and functions are used can be easily retrieved by grepping in the kernel sources.

17.10.1. The Important Fields

The fields introduced here are the ones a driver might need to access. They are listed in no particular order.

 

struct net_device *dev;

The device receiving or sending this buffer.

 

union { /* ... */ } h;

 

union { /* ... */ } nh;

 

union { /* ... */ } mac;

Pointers to the various levels of headers contained within the packet. Each field of the union is a pointer to a different type of data structure. h hosts pointers to transport layer headers (for example, struct tcphdr *th); nh includes network layer headers (such as struct iphdr *iph); and mac collects pointers to link-layer headers (such as struct ethhdr *ethernet).

If your driver needs to look at the source and destination ports of a TCP packet, it can find them in skb->h.th. See the header file for the full set of header types that can be accessed in this way.

Note that network drivers are responsible for setting the mac pointer for incoming packets. This task is normally handled by eth_type_trans, but non-Ethernet drivers have to set skb->mac.raw directly, as shown in Section 17.11.3.

 

unsigned char *head;

 

unsigned char *data;

 

unsigned char *tail;

 

unsigned char *end;

Pointers used to address the data in the packet. head points to the beginning of the allocated space, data is the beginning of the valid octets (and is usually slightly greater than head), tail is the end of the valid octets, and end points to the maximum address tail can reach. Another way to look at it is that the available buffer space is skb->end - skb->head, and the currently used data space is skb->tail - skb->data.
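The pointer arithmetic described here can be tried out in user space. The following is a minimal sketch, not kernel code: struct skb_model and the model_* helpers are hypothetical names invented for illustration, and only the four pointers are represented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the four buffer pointers of sk_buff. */
struct skb_model {
    unsigned char *head;  /* beginning of the allocated space */
    unsigned char *data;  /* beginning of the valid octets */
    unsigned char *tail;  /* end of the valid octets */
    unsigned char *end;   /* maximum address tail can reach */
};

/* Lay a model over a caller-supplied array, leaving `headroom` unused
 * octets in front of `used` valid octets. */
static struct skb_model model_init(unsigned char *buf, size_t size,
                                   size_t headroom, size_t used)
{
    struct skb_model skb;

    skb.head = buf;
    skb.data = buf + headroom;
    skb.tail = skb.data + used;
    skb.end = buf + size;
    return skb;
}

/* Available buffer space: end - head. */
static size_t model_buffer_space(const struct skb_model *skb)
{
    return (size_t)(skb->end - skb->head);
}

/* Currently used data space: tail - data. */
static size_t model_used_space(const struct skb_model *skb)
{
    return (size_t)(skb->tail - skb->data);
}
```

With a 128-octet array, 16 octets of headroom, and 60 valid octets, the two helpers report 128 and 60 respectively.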

 

unsigned int len;

 

unsigned int data_len;

len is the full length of the data in the packet, while data_len is the length of the portion of the packet stored in separate fragments. The data_len field is 0 unless scatter/gather I/O is being used.

 

unsigned char ip_summed;

The checksum policy for this packet. The field is set by the driver on incoming packets, as described in Section 17.6.

 

unsigned char pkt_type;

Packet classification used in its delivery. The driver is responsible for setting it to PACKET_HOST (this packet is for me), PACKET_OTHERHOST (no, this packet is not for me), PACKET_BROADCAST, or PACKET_MULTICAST. Ethernet drivers don't modify pkt_type explicitly because eth_type_trans does it for them.
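To make the four classes concrete, here is a simplified user-space sketch of how a destination Ethernet address could be sorted into them. It is not the kernel's eth_type_trans: the MODEL_* names and model_classify are hypothetical, and the enum values are local stand-ins rather than the kernel's PACKET_* constants.

```c
#include <assert.h>
#include <string.h>

#define MODEL_ETH_ALEN 6  /* octets in an Ethernet address */

/* Local stand-ins for the PACKET_* classes (illustrative values only). */
enum model_pkt_type {
    MODEL_PACKET_HOST,
    MODEL_PACKET_BROADCAST,
    MODEL_PACKET_MULTICAST,
    MODEL_PACKET_OTHERHOST
};

/* Classify a frame by comparing its destination address with the
 * address of the receiving interface. */
static enum model_pkt_type model_classify(const unsigned char *dest,
                                          const unsigned char *dev_addr)
{
    static const unsigned char bcast[MODEL_ETH_ALEN] =
        { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    if (memcmp(dest, bcast, MODEL_ETH_ALEN) == 0)
        return MODEL_PACKET_BROADCAST;
    if (dest[0] & 0x01)  /* group bit set: a multicast address */
        return MODEL_PACKET_MULTICAST;
    if (memcmp(dest, dev_addr, MODEL_ETH_ALEN) == 0)
        return MODEL_PACKET_HOST;
    return MODEL_PACKET_OTHERHOST;
}
```

A driver for a non-Ethernet medium would have to make an equivalent decision itself, using whatever addressing its link layer provides.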

 

shinfo(struct sk_buff *skb);

 

unsigned int shinfo(skb)->nr_frags;

 

skb_frag_t shinfo(skb)->frags;

For performance reasons, some skb information is stored in a separate structure that appears immediately after the skb in memory. This "shared info" (so called because it can be shared among copies of the skb within the networking code) must be accessed via the shinfo macro. There are several fields in this structure, but most of them are beyond the scope of this book. We saw nr_frags and frags in Section 17.5.3.

The remaining fields in the structure are not particularly interesting. They are used to maintain lists of buffers, to account for memory belonging to the socket that owns the buffer, and so on.

17.10.2. Functions Acting on Socket Buffers

Network devices that use an sk_buff structure act on it by means of the official interface functions. Many functions operate on socket buffers; here are the most interesting ones:

 

struct sk_buff *alloc_skb(unsigned int len, int priority);

 

struct sk_buff *dev_alloc_skb(unsigned int len);

Allocate a buffer. The alloc_skb function allocates a buffer and initializes both skb->data and skb->tail to skb->head. The dev_alloc_skb function is a shortcut that calls alloc_skb with GFP_ATOMIC priority and reserves some space between skb->head and skb->data. This data space is used for optimizations within the network layer and should not be touched by the driver.
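The difference between the two allocators is easiest to see in the pointer layout they leave behind. The sketch below imitates that layout in user space with malloc; struct skb_model, the model_* functions, and the MODEL_HEADROOM value are all assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_HEADROOM 16  /* assumed size of the reserved space, illustration only */

/* Hypothetical user-space stand-in for the buffer-related sk_buff fields. */
struct skb_model {
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
};

/* In the spirit of alloc_skb: data and tail both start at head. */
static int model_alloc(struct skb_model *skb, unsigned int size)
{
    skb->head = malloc(size);
    if (!skb->head)
        return -1;
    skb->data = skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return 0;
}

/* In the spirit of dev_alloc_skb: allocate extra room and move data
 * and tail past it, so the caller still gets `size` usable octets. */
static int model_dev_alloc(struct skb_model *skb, unsigned int size)
{
    if (model_alloc(skb, size + MODEL_HEADROOM) != 0)
        return -1;
    skb->data += MODEL_HEADROOM;
    skb->tail += MODEL_HEADROOM;
    return 0;
}

/* Release the storage owned by the model. */
static void model_free(struct skb_model *skb)
{
    free(skb->head);
    skb->head = NULL;
}
```

After model_dev_alloc, the gap between head and data is the space that the text says a driver should leave alone.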

 

void kfree_skb(struct sk_buff *skb);

 

void dev_kfree_skb(struct sk_buff *skb);

 

void dev_kfree_skb_irq(struct sk_buff *skb);

 

void dev_kfree_skb_any(struct sk_buff *skb);

Free a buffer. The kfree_skb call is used internally by the kernel. A driver should use one of the forms of dev_kfree_skb instead: dev_kfree_skb for noninterrupt context, dev_kfree_skb_irq for interrupt context, or dev_kfree_skb_any for code that can run in either context.

 

unsigned char *skb_put(struct sk_buff *skb, int len);

 

unsigned char *__skb_put(struct sk_buff *skb, int len);

Update the tail and len fields of the sk_buff structure; they are used to add data to the end of the buffer. Each function's return value is the previous value of skb->tail (in other words, it points to the data space just created). Drivers can use the return value to copy data by invoking memcpy(skb_put(...), data, len) or an equivalent. The difference between the two functions is that skb_put checks to be sure that the data fits in the buffer, whereas __skb_put omits the check.
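The bookkeeping behind the two variants can be imitated in user space. This is a sketch under stated assumptions: struct skb_model and the model_put* functions are hypothetical, and the checked variant returns NULL on overflow purely so the example can be exercised, whereas the kernel treats that condition as fatal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for the buffer-related sk_buff fields. */
struct skb_model {
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
};

/* Unchecked variant, in the spirit of __skb_put: grow the data area
 * at its end and return the previous tail. */
static unsigned char *model_put_unchecked(struct skb_model *skb, unsigned int n)
{
    unsigned char *old_tail = skb->tail;

    skb->tail += n;
    skb->len += n;
    return old_tail;
}

/* Checked variant, in the spirit of skb_put. Returning NULL on
 * overflow is an assumption of this model, not kernel behavior. */
static unsigned char *model_put(struct skb_model *skb, unsigned int n)
{
    if (n > (size_t)(skb->end - skb->tail))
        return NULL;
    return model_put_unchecked(skb, n);
}
```

The memcpy idiom from the text then becomes memcpy(model_put(&skb, n), src, n), provided the return value has been checked in this model.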

 

unsigned char *skb_push(struct sk_buff *skb, int len);

 

unsigned char *__skb_push(struct sk_buff *skb, int len);

Functions to decrement skb->data and increment skb->len. They are similar to skb_put, except that data is added to the beginning of the packet instead of the end. The return value points to the data space just created. The functions are used to add a hardware header before transmitting a packet. Once again, __skb_push differs in that it does not check for adequate available space.
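Prepending works on the opposite end of the data area, which a small user-space model can show. As before, struct skb_model and model_push are hypothetical names, and returning NULL when the headroom is too small is an assumption of the sketch rather than what the kernel does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the buffer-related sk_buff fields. */
struct skb_model {
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
};

/* In the spirit of skb_push: move data backward by n octets, count
 * them in len, and return the start of the newly created space. */
static unsigned char *model_push(struct skb_model *skb, unsigned int n)
{
    if (n > (size_t)(skb->data - skb->head))
        return NULL;  /* model only: not enough room in front of data */
    skb->data -= n;
    skb->len += n;
    return skb->data;
}
```

Starting from 16 octets of headroom and a 20-octet payload, pushing a 14-octet hardware header leaves 2 octets of headroom and a length of 34.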

 

int skb_tailroom(struct sk_buff *skb);

Returns the amount of space available for putting data in the buffer. If a driver puts more data into the buffer than it can hold, the system panics. Although you might object that a printk would be sufficient to tag the error, memory corruption is so harmful to the system that the developers decided to take definitive action. In practice, you shouldn't need to check the available space if the buffer has been correctly allocated. Since drivers usually get the packet size before allocating a buffer, only a severely broken driver puts too much data in the buffer, and a panic might be seen as due punishment.

 

int skb_headroom(struct sk_buff *skb);

Returns the amount of space available in front of data, that is, how many octets one can "push" to the buffer.
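For a buffer held in one contiguous piece, both quantities reduce to a pointer difference. The following user-space sketch uses hypothetical names (struct skb_model, model_headroom, model_tailroom) and ignores fragmented buffers entirely.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for the four buffer pointers of sk_buff. */
struct skb_model {
    unsigned char *head, *data, *tail, *end;
};

/* Octets that can still be pushed in front of the data. */
static int model_headroom(const struct skb_model *skb)
{
    return (int)(skb->data - skb->head);
}

/* Octets that can still be put after the data. */
static int model_tailroom(const struct skb_model *skb)
{
    return (int)(skb->end - skb->tail);
}
```

In a 100-octet buffer whose 60 valid octets start at offset 16, the headroom is 16 and the tailroom is 24.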

 

void skb_reserve(struct sk_buff *skb, int len);

Increments both data and tail. The function can be used to reserve headroom before filling the buffer. Most Ethernet interfaces reserve two bytes in front of the packet; thus, the IP header is aligned on a 16-byte boundary, after a 14-byte Ethernet header. snull does this as well, although the instruction was not shown in Section 17.6 to avoid introducing extra concepts at that point.
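The alignment argument is simple arithmetic: 2 reserved bytes plus a 14-byte Ethernet header place the IP header 16 bytes from the start of the buffer. The user-space sketch below checks that; struct skb_model, model_reserve, model_ip_header_offset, and MODEL_ETH_HLEN are hypothetical names, and the offset is measured from head, which is assumed to be suitably aligned itself.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ETH_HLEN 14  /* length of an Ethernet header in bytes */

/* Hypothetical user-space stand-in for the buffer-related sk_buff fields. */
struct skb_model {
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
};

/* In the spirit of skb_reserve: shift data and tail together, so the
 * buffer stays empty but gains headroom. */
static void model_reserve(struct skb_model *skb, unsigned int n)
{
    skb->data += n;
    skb->tail += n;
}

/* Offset of the IP header from head, for a frame that starts at data
 * with an Ethernet header. */
static size_t model_ip_header_offset(const struct skb_model *skb)
{
    return (size_t)((skb->data + MODEL_ETH_HLEN) - skb->head);
}
```

Without the two reserved bytes the IP header would sit at offset 14, which is not a multiple of 4, let alone 16.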

 

unsigned char *skb_pull(struct sk_buff *skb, int len);

Removes data from the head of the packet. The driver won't need to use this function, but it is included here for completeness. It decrements skb->len and increments skb->data; this is how the hardware header (Ethernet or equivalent) is stripped from the beginning of incoming packets.
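Stripping a header is the mirror image of pushing one. This user-space sketch uses the hypothetical struct skb_model and model_pull; refusing to pull more than len octets by returning NULL is how the model handles the error case and should be read as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for the buffer-related sk_buff fields. */
struct skb_model {
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
};

/* In the spirit of skb_pull: drop n octets from the front of the data
 * and return the new start of the data. */
static unsigned char *model_pull(struct skb_model *skb, unsigned int n)
{
    if (n > skb->len)
        return NULL;  /* model only: cannot remove more than is present */
    skb->len -= n;
    skb->data += n;
    return skb->data;
}
```

Pulling 14 octets from an 18-octet frame leaves data pointing at the 4-octet payload that followed the hardware header.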

 

int skb_is_nonlinear(struct sk_buff *skb);

Returns a true value if this skb is separated into multiple fragments for scatter/gather I/O.

 

int skb_headlen(struct sk_buff *skb);

Returns the length of the first segment of the skb (that part pointed to by skb->data).
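Both helpers follow directly from the len and data_len fields described earlier: a buffer is nonlinear when data_len is nonzero, and the first segment holds whatever is not in fragments. The user-space sketch below shows that relationship; struct skb_len_model and the model_* functions are hypothetical names.

```c
#include <assert.h>

/* Hypothetical stand-in for the two length fields of sk_buff. */
struct skb_len_model {
    unsigned int len;       /* full length of the packet data */
    unsigned int data_len;  /* portion stored in separate fragments */
};

/* True when part of the packet lives in fragments. */
static int model_is_nonlinear(const struct skb_len_model *skb)
{
    return skb->data_len != 0;
}

/* Length of the first segment, the part reachable through data. */
static unsigned int model_headlen(const struct skb_len_model *skb)
{
    return skb->len - skb->data_len;
}
```

A 1500-octet packet with 1400 octets in fragments is nonlinear and has a 100-octet first segment; with data_len equal to 0, the first segment is the whole packet.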

 

void *kmap_skb_frag(skb_frag_t *frag);

void kunmap_skb_frag(void *vaddr);

If you must directly access fragments in a nonlinear skb from within the kernel, these functions map and unmap them for you. An atomic kmap is used, so you cannot have more than one fragment mapped at a time.

The kernel defines several other functions that act on socket buffers, but they are meant to be used in higher layers of networking code, and the driver doesn't need them.
