17.6. Packet Reception
Receiving data from the network is trickier than transmitting it, because an sk_buff must be allocated and handed off to the upper layers from within an atomic context. There are two modes of packet reception that may be implemented by network drivers: interrupt driven and polled. Most drivers implement the interrupt-driven technique, and that is the one we cover first. Some drivers for high-bandwidth adapters may also implement the polled technique; we look at this approach in Section 17.8.
The implementation of snull separates the "hardware" details from the device-independent housekeeping. Therefore, the function snull_rx is called from the snull "interrupt" handler after the hardware has received the packet, and it is already in the computer's memory. snull_rx receives a pointer to the data and the length of the packet; its sole responsibility is to send the packet and some additional information to the upper layers of networking code. This code is independent of the way the data pointer and length are obtained.
void snull_rx(struct net_device *dev, struct snull_packet *pkt)
{
    struct sk_buff *skb;
    struct snull_priv *priv = netdev_priv(dev);

    /*
     * The packet has been retrieved from the transmission
     * medium. Build an skb around it, so upper layers can handle it
     */
    skb = dev_alloc_skb(pkt->datalen + 2);
    if (!skb) {
        if (printk_ratelimit())
            printk(KERN_NOTICE "snull rx: low on mem - packet dropped\n");
        priv->stats.rx_dropped++;
        goto out;
    }
    memcpy(skb_put(skb, pkt->datalen), pkt->data, pkt->datalen);

    /* Write metadata, and then pass to the receive level */
    skb->dev = dev;
    skb->protocol = eth_type_trans(skb, dev);
    skb->ip_summed = CHECKSUM_UNNECESSARY; /* don't check it */
    priv->stats.rx_packets++;
    priv->stats.rx_bytes += pkt->datalen;
    netif_rx(skb);
  out:
    return;
}
The function is sufficiently general to act as a template for any network driver, but some explanation is necessary before you can reuse this code fragment with confidence.
The first step is to allocate a buffer to hold the packet. Note that the buffer allocation function (dev_alloc_skb) needs to know the data length, which it uses to allocate space for the buffer. dev_alloc_skb calls kmalloc with atomic priority, so it can be used safely at interrupt time. The kernel offers other interfaces to socket-buffer allocation, but they are not worth introducing here; socket buffers are explained in detail in Section 17.10.
Of course, the return value from dev_alloc_skb must be checked, and snull does so. We call printk_ratelimit before complaining about failures, however. Generating hundreds or thousands of console messages per second is a good way to bog down the system entirely and hide the real source of problems; printk_ratelimit helps prevent that problem by returning 0 when too much output has gone to the console, and things need to be slowed down a bit.
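To make the idea of rate-limited logging concrete, here is a minimal userspace sketch of a token-bucket limiter. It is not the kernel's printk_ratelimit implementation; the names msg_ratelimit, msg_ratelimit_init, msg_ratelimit_ok, and log_failures are hypothetical, and time is passed in as an abstract tick count so that the behavior is easy to follow.

```c
#include <stdio.h>

/* Hypothetical userspace model; NOT the kernel's printk_ratelimit. */
struct msg_ratelimit {
    long interval;     /* ticks needed to earn one message token */
    int  burst;        /* maximum number of tokens that can accumulate */
    int  tokens;       /* tokens currently available */
    long last_refill;  /* tick at which tokens were last credited */
};

static void msg_ratelimit_init(struct msg_ratelimit *rl,
                               long interval, int burst, long now)
{
    rl->interval = interval;
    rl->burst = burst;
    rl->tokens = burst;        /* start with a full bucket */
    rl->last_refill = now;
}

/* Return 1 if a message may be printed at tick `now`, 0 if it should be
 * suppressed (the same convention the text describes). */
static int msg_ratelimit_ok(struct msg_ratelimit *rl, long now)
{
    long earned = (now - rl->last_refill) / rl->interval;

    if (earned > 0) {
        long total = rl->tokens + earned;

        rl->tokens = total > rl->burst ? rl->burst : (int)total;
        rl->last_refill += earned * rl->interval;
    }
    if (rl->tokens > 0) {
        rl->tokens--;
        return 1;
    }
    return 0;
}

/* Simulate n allocation failures at the same tick; return how many of
 * them actually produced a message. */
static int log_failures(struct msg_ratelimit *rl, int n, long now)
{
    int printed = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (msg_ratelimit_ok(rl, now)) {
            fprintf(stderr, "rx: low on mem - packet dropped\n");
            printed++;
        }
    }
    return printed;
}
```

With a burst of 3, a storm of 1000 failures in one tick produces only three messages; the remaining failures would still be counted in the driver's statistics, as snull does with rx_dropped.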
Once there is a valid skb pointer, the packet data is copied into the buffer by calling memcpy; the skb_put function updates the end-of-data pointer in the buffer and returns a pointer to the newly created space.
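The pointer bookkeeping performed by skb_put can be illustrated with a small userspace model. This is only a sketch: struct mini_skb and the mini_* functions are hypothetical stand-ins, not the kernel's struct sk_buff or its API, and malloc takes the place of the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for struct sk_buff. */
struct mini_skb {
    unsigned char *head;  /* start of the allocated storage */
    unsigned char *tail;  /* end of the valid data */
    unsigned char *end;   /* end of the allocated storage */
    size_t len;           /* number of valid data bytes */
};

static struct mini_skb *mini_alloc(size_t size)
{
    struct mini_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->head = malloc(size);
    if (!skb->head) {
        free(skb);
        return NULL;
    }
    skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return skb;
}

/* Model of skb_put: advance the end-of-data pointer by len bytes and
 * return a pointer to the start of the newly created space. */
static unsigned char *mini_put(struct mini_skb *skb, size_t len)
{
    unsigned char *old_tail = skb->tail;

    assert((size_t)(skb->end - skb->tail) >= len);  /* must fit */
    skb->tail += len;
    skb->len += len;
    return old_tail;
}

static void mini_free(struct mini_skb *skb)
{
    free(skb->head);
    free(skb);
}

/* Same shape as the copy step in snull_rx: allocate, then copy the
 * packet into the space returned by the put operation. */
static struct mini_skb *mini_rx_copy(const void *data, size_t datalen)
{
    struct mini_skb *skb = mini_alloc(datalen + 2);

    if (!skb)
        return NULL;
    memcpy(mini_put(skb, datalen), data, datalen);
    return skb;
}
```

Because the put operation returns the old end-of-data pointer, it can be used directly as the destination argument of memcpy, which is the idiom snull_rx relies on.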
If you are writing a high-performance driver for an interface that can do full bus-mastering I/O, there is a possible optimization that is worth considering here. Some drivers allocate socket buffers for incoming packets prior to their reception, then instruct the interface to place the packet data directly into the socket buffer's space. The networking layer cooperates with this strategy by allocating all socket buffers in DMA-capable space (which may be in high memory if your device has the NETIF_F_HIGHDMA feature flag set). Doing things this way avoids the need for a separate copy operation to fill the socket buffer, but requires being careful with buffer sizes because you won't know in advance how big the incoming packet is. The implementation of a change_mtu method is also important in this situation, since it allows the driver to respond to a change in the maximum packet size.
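The preallocation strategy can be sketched in userspace as a ring of receive buffers sized for the largest frame the current MTU allows. Everything here is an assumption made for illustration: rx_ring, rx_slot, RX_RING_SIZE, and the helper functions are hypothetical names, malloc stands in for socket-buffer allocation, and fake_dma_receive uses memcpy only to imitate what a bus-mastering device would do on its own without involving the CPU.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define RX_RING_SIZE 4   /* hypothetical ring depth */
#define LINK_HDR_LEN 14  /* Ethernet header length */

struct rx_slot {
    unsigned char *buf;  /* buffer preallocated for the device */
    size_t len;          /* bytes the "device" wrote into it */
};

struct rx_ring {
    struct rx_slot slot[RX_RING_SIZE];
    size_t buf_size;     /* size of every buffer in the ring */
    unsigned int next;   /* slot the device fills next */
};

/* Buffers must hold the largest possible frame, since the size of an
 * incoming packet is not known in advance. */
static size_t rx_buf_size(unsigned int mtu)
{
    return (size_t)mtu + LINK_HDR_LEN;
}

static void rx_ring_free(struct rx_ring *r)
{
    unsigned int i;

    for (i = 0; i < RX_RING_SIZE; i++) {
        free(r->slot[i].buf);
        r->slot[i].buf = NULL;
    }
}

/* Allocate all buffers up front; a change of MTU would require the
 * ring to be refilled with buffers of the new size. */
static int rx_ring_fill(struct rx_ring *r, unsigned int mtu)
{
    unsigned int i;

    memset(r, 0, sizeof(*r));
    r->buf_size = rx_buf_size(mtu);
    for (i = 0; i < RX_RING_SIZE; i++) {
        r->slot[i].buf = malloc(r->buf_size);
        if (!r->slot[i].buf) {
            rx_ring_free(r);
            return -1;
        }
    }
    return 0;
}

/* Stand-in for the device writing a frame straight into the buffer. */
static int fake_dma_receive(struct rx_ring *r, const void *frame, size_t len)
{
    struct rx_slot *s = &r->slot[r->next];

    if (len > r->buf_size)
        return -1;       /* frame does not fit the preallocated buffer */
    memcpy(s->buf, frame, len);
    s->len = len;
    return 0;
}

/* Driver side: hand the filled buffer upward without copying it and
 * put a fresh buffer in its place. Returns NULL (packet dropped, old
 * buffer kept in the ring) if no replacement can be allocated. */
static unsigned char *rx_ring_take(struct rx_ring *r, size_t *len)
{
    struct rx_slot *s = &r->slot[r->next];
    unsigned char *filled = s->buf;
    unsigned char *fresh = malloc(r->buf_size);

    if (!fresh)
        return NULL;
    *len = s->len;
    s->buf = fresh;
    s->len = 0;
    r->next = (r->next + 1) % RX_RING_SIZE;
    return filled;
}
```

The caller of rx_ring_take owns the returned buffer. The design choice worth noting is that the replacement buffer is allocated before the filled one is given away, so an allocation failure costs one packet rather than one ring slot.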
The network layer needs to have some information spelled out before it can make sense of the packet. To this end, the dev and protocol fields must be assigned before the buffer is passed upstairs. The Ethernet support code exports a helper function (eth_type_trans), which finds an appropriate value to put into protocol. Then we need to specify how checksumming is to be performed or has been performed on the packet (snull does not need to perform any checksums). The possible policies for skb->ip_summed are:
CHECKSUM_HW
The device has already performed checksums in hardware. An example of a hardware checksum is the SPARC HME interface.
CHECKSUM_NONE
Checksums have not yet been verified, and the task must be accomplished by system software. This is the default in newly allocated buffers.
CHECKSUM_UNNECESSARY
Don't do any checksums. This is the policy in snull and in the loopback interface.
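As a rough illustration of what the three policies mean for the receiving side, the following userspace sketch decides whether a packet's header checksum still needs to be verified in software. The enum rx_csum_policy and the functions inet_sum16 and rx_csum_ok are hypothetical names, not kernel interfaces, and the hardware case is reduced to a single verdict flag; how the kernel actually consumes a hardware-computed checksum is not modeled here. The software check is the standard Internet ones'-complement sum over 16-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins named after the policies above; values are arbitrary. */
enum rx_csum_policy {
    RX_CSUM_NONE,         /* not verified yet: software must check */
    RX_CSUM_HW,           /* the device already did the work */
    RX_CSUM_UNNECESSARY   /* no check wanted at all */
};

/* Ones'-complement sum of big-endian 16-bit words, folded to 16 bits.
 * A header whose checksum field is correct sums to 0xffff. */
static uint16_t inet_sum16(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;  /* pad odd byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);   /* fold the carries */
    return (uint16_t)sum;
}

/* Return 1 if the packet should be accepted, 0 otherwise. hw_ok is the
 * (simplified, assumed) verdict reported by the device. */
static int rx_csum_ok(enum rx_csum_policy policy, int hw_ok,
                      const uint8_t *hdr, size_t len)
{
    switch (policy) {
    case RX_CSUM_UNNECESSARY:
        return 1;                              /* don't check it */
    case RX_CSUM_HW:
        return hw_ok;                          /* trust the hardware */
    case RX_CSUM_NONE:
        return inet_sum16(hdr, len) == 0xffff; /* verify in software */
    }
    return 0;
}
```

The sketch shows why a driver such as snull, whose packets never cross a real wire, can mark them so that the software pass is skipped entirely.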
You may be wondering why the checksum status must be specified here when we have already set a flag in the features field of our net_device structure. The answer is that the features flag tells the kernel about how our device treats outgoing packets. It is not used for incoming packets, which must, instead, be marked individually.
Finally, the driver updates its statistics counter to record that a packet has been received. The statistics structure is made up of several fields; the most important are rx_packets, rx_bytes, tx_packets, and tx_bytes, which contain the number of packets received and transmitted and the total number of octets transferred. All the fields are thoroughly described in Section 17.13.
The last step in packet reception is performed by netif_rx, which hands off the socket buffer to the upper layers. netif_rx actually returns an integer value; NET_RX_SUCCESS (0) means that the packet was successfully received; any other value indicates trouble. There are three return values (NET_RX_CN_LOW, NET_RX_CN_MOD, and NET_RX_CN_HIGH) that indicate increasing levels of congestion in the networking subsystem; NET_RX_DROP means the packet was dropped. A driver could use these values to stop feeding packets into the kernel when congestion gets high, but, in practice, most drivers ignore the return value from netif_rx. If you are writing a driver for a high-bandwidth device and wish to do the right thing in response to congestion, the best approach is to implement NAPI, which we get to after a quick discussion of interrupt handlers.
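For a driver that does want to react to the return value, the logic might look something like the following userspace sketch. The enum rx_status mirrors the return codes named above, but only the value 0 for success comes from the text; the other numeric values, the rx_throttle structure, and the policy of pausing on high congestion or on a drop are assumptions made purely for illustration.

```c
/* Local stand-ins for the return codes described above. Only the value
 * of RX_SUCCESS (0) is taken from the text; the rest are arbitrary. */
enum rx_status {
    RX_SUCCESS = 0,
    RX_CN_LOW,
    RX_CN_MOD,
    RX_CN_HIGH,
    RX_DROP
};

/* Hypothetical per-device throttling state. */
struct rx_throttle {
    unsigned long dropped;  /* packets the upper layers dropped */
    int paused;             /* nonzero: stop feeding packets for now */
};

/* Record the outcome of handing one packet upward and decide whether
 * the driver should keep feeding packets. The policy is an assumption:
 * back off on high congestion or a drop, resume otherwise. */
static void rx_note_status(struct rx_throttle *t, enum rx_status st)
{
    switch (st) {
    case RX_SUCCESS:
    case RX_CN_LOW:
    case RX_CN_MOD:
        t->paused = 0;
        break;
    case RX_CN_HIGH:
        t->paused = 1;
        break;
    case RX_DROP:
        t->dropped++;
        t->paused = 1;
        break;
    }
}
```

A real driver would translate the paused flag into something the hardware understands, such as masking receive interrupts for a while; that part depends entirely on the device and is left out of the sketch.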