17.8. Receive Interrupt Mitigation

When a network driver is written as we have described above, the processor is interrupted for every packet received by your interface. In many cases, that is the desired mode of operation, and it is not a problem. High-bandwidth interfaces, however, can receive thousands of packets per second. With that sort of interrupt load, the overall performance of the system can suffer.

As a way of improving the performance of Linux on high-end systems, the networking subsystem developers have created an alternative interface (called NAPI)[1] based on polling. "Polling" can be a dirty word among driver developers, who often see polling techniques as inelegant and inefficient. Polling is inefficient, however, only if the interface is polled when there is no work to do. When the system has a high-speed interface handling heavy traffic, there are always more packets to process. There is no need to interrupt the processor in such situations; it is enough that the new packets be collected from the interface every so often.

[1] NAPI stands for "new API"; the networking hackers are better at creating interfaces than naming them.

Stopping receive interrupts can take a substantial amount of load off the processor. NAPI-compliant drivers can also be told not to feed packets into the kernel if those packets are just dropped in the networking code due to congestion, which can also help performance when that help is needed most. For various reasons, NAPI drivers are also less likely to reorder packets.

Not all devices can operate in the NAPI mode, however. A NAPI-capable interface must be able to store several packets (either on the card itself, or in an in-memory DMA ring). The interface should be capable of disabling interrupts for received packets, while continuing to interrupt for successful transmissions and other events. There are other subtle issues that can make writing a NAPI-compliant driver harder; see Documentation/networking/NAPI_HOWTO.txt in the kernel source tree for the details.

Relatively few drivers implement the NAPI interface. If you are writing a driver for an interface that may generate a huge number of interrupts, however, taking the time to implement NAPI may well prove worthwhile.

The snull driver, when loaded with the use_napi parameter set to a nonzero value, operates in the NAPI mode. At initialization time, we have to set up a couple of extra struct net_device fields:

if (use_napi) {
    dev->poll        = snull_poll;
    dev->weight      = 2;
}
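
For reference, use_napi itself can be declared as an ordinary module parameter; a minimal sketch (the default of zero is an assumption, matching the non-NAPI behavior described above):

static int use_napi = 0;          /* polling mode is off by default */
module_param(use_napi, int, 0);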

 

The poll field must be set to your driver's polling function; we look at snull_poll shortly. The weight field describes the relative importance of the interface: how much traffic should be accepted from the interface when resources are tight. There are no strict rules for how the weight parameter should be set; by convention, 10-Mbps Ethernet interfaces set weight to 16, while faster interfaces use 64. You should not set weight to a value greater than the number of packets your interface can store. In snull, we set the weight to two as a way of demonstrating deferred packet reception.

The next step in the creation of a NAPI-compliant driver is to change the interrupt handler. When your interface (which should start with receive interrupts enabled) signals that a packet has arrived, the interrupt handler should not process that packet. Instead, it should disable further receive interrupts and tell the kernel that it is time to start polling the interface. In the snull "interrupt" handler, the code that responds to packet reception interrupts has been changed to the following:

if (statusword & SNULL_RX_INTR) {
    snull_rx_ints(dev, 0);  /* Disable further interrupts */
    netif_rx_schedule(dev);
}

 

When the interface tells us that a packet is available, the interrupt handler leaves it in the interface; all that needs to happen at this point is a call to netif_rx_schedule, which causes our poll method to be called at some future point.
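
The snull_rx_ints call seen in the handler is just a snull helper; a minimal sketch, assuming it does nothing more than record, in a priv->rx_int_enabled flag, whether the simulated hardware may raise receive interrupts:

static void snull_rx_ints(struct net_device *dev, int enable)
{
    struct snull_priv *priv = netdev_priv(dev);
    priv->rx_int_enabled = enable;  /* consulted when the "hardware" receives a packet */
}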

The poll method has this prototype:

int (*poll)(struct net_device *dev, int *budget);

 

The snull implementation of the poll method looks like this:

static int snull_poll(struct net_device *dev, int *budget)
{
    int npackets = 0, quota = min(dev->quota, *budget);
    struct sk_buff *skb;
    struct snull_priv *priv = netdev_priv(dev);
    struct snull_packet *pkt;

    while (npackets < quota && priv->rx_queue) {
        pkt = snull_dequeue_buf(dev);
        skb = dev_alloc_skb(pkt->datalen + 2);
        if (! skb) {
            if (printk_ratelimit( ))
                printk(KERN_NOTICE "snull: packet dropped\n");
            priv->stats.rx_dropped++;
            snull_release_buffer(pkt);
            continue;
        }
        memcpy(skb_put(skb, pkt->datalen), pkt->data, pkt->datalen);
        skb->dev = dev;
        skb->protocol = eth_type_trans(skb, dev);
        skb->ip_summed = CHECKSUM_UNNECESSARY; /* don't check it */
        netif_receive_skb(skb);

        /* Maintain stats */
        npackets++;
        priv->stats.rx_packets++;
        priv->stats.rx_bytes += pkt->datalen;
        snull_release_buffer(pkt);
    }
    /* If we processed all packets, we're done; tell the kernel and reenable ints */
    *budget -= npackets;
    dev->quota -= npackets;
    if (! priv->rx_queue) {
        netif_rx_complete(dev);
        snull_rx_ints(dev, 1);
        return 0;
    }
    /* We couldn't process everything. */
    return 1;
}

 

The central part of the function is concerned with the creation of an skb holding the packet; this code is the same as what we saw in snull_rx before. A number of things are different, however:

The budget parameter provides a maximum number of packets that we are allowed to pass into the kernel. Within the device structure, the quota field gives another maximum; the poll method must respect the lower of the two limits. It should also decrement both dev->quota and *budget by the number of packets actually received. The budget value is a maximum number of packets that the current CPU can receive from all interfaces, while quota is a per-interface value that usually starts out as the weight assigned to the interface at initialization time.

Packets should be fed to the kernel with netif_receive_skb, rather than netif_rx.
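
The distinction is worth spelling out; as a sketch (the non-NAPI fragment mirrors the snull_rx code shown earlier in this chapter):

/* Classic interrupt-driven path: netif_rx queues the packet for later
 * softirq processing and may be called from hard-interrupt context. */
netif_rx(skb);

/* NAPI poll path: netif_receive_skb hands the packet directly to the
 * protocol layers; poll itself already runs in softirq context. */
netif_receive_skb(skb);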

If the poll method is able to process all of the available packets within the limits given to it, it should re-enable receive interrupts, call netif_rx_complete to turn off polling, and return 0. A return value of 1 indicates that there are packets remaining to be processed.

The networking subsystem guarantees that any given device's poll method will not be called concurrently on more than one processor. Calls to poll can still happen concurrently with calls to your other device methods, however.
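
Since poll can race with the interrupt handler, state they share, such as snull's receive queue, must still be protected. A minimal sketch of how snull_dequeue_buf can lock the queue, assuming the priv->lock spinlock and pkt->next link used by the full snull source:

struct snull_packet *snull_dequeue_buf(struct net_device *dev)
{
    struct snull_priv *priv = netdev_priv(dev);
    struct snull_packet *pkt;
    unsigned long flags;

    spin_lock_irqsave(&priv->lock, flags);  /* the interrupt handler takes this lock too */
    pkt = priv->rx_queue;
    if (pkt != NULL)
        priv->rx_queue = pkt->next;         /* unlink the first queued packet */
    spin_unlock_irqrestore(&priv->lock, flags);
    return pkt;
}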
