17.3. The net_device Structure in Detail
The net_device structure is at the very core of the network driver layer and deserves a complete description. This list describes all the fields, but more to provide a reference than to be memorized. The rest of this chapter briefly describes each field as soon as it is used in the sample code, so you don't need to keep referring back to this section.
17.3.1. Global Information
The first part of struct net_device is composed of the following fields:
char name[IFNAMSIZ];
The name of the device. If the name set by the driver contains a %d format string, register_netdev replaces it with a number to make a unique name; assigned numbers start at 0.
unsigned long state;
Device state. The field includes several flags. Drivers do not normally manipulate these flags directly; instead, a set of utility functions has been provided. These functions are discussed shortly when we get into driver operations.
struct net_device *next;
Pointer to the next device in the global linked list. This field shouldn't be touched by the driver.
int (*init)(struct net_device *dev);
An initialization function. If this pointer is set, the function is called by register_netdev to complete the initialization of the net_device structure. Most modern network drivers do not use this function any longer; instead, initialization is performed before registering the interface.
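The naming rule for the name field can be pictured with a small userspace model. The sketch below is not the kernel's implementation: model_alloc_name, its taken[] list of existing names, and the search limit of 100 candidates are all hypothetical, and exist only to show how a %d template turns into the first unused name, counting from 0.

```c
#include <stdio.h>
#include <string.h>

#define MODEL_IFNAMSIZ 16   /* same size as IFNAMSIZ */

/*
 * Hypothetical userspace model: expand a template such as "sn%d" into the
 * first name that does not appear in taken[]. Returns the index that was
 * chosen, or -1 if the template has no "%d" or no candidate fits.
 */
static int model_alloc_name(const char *tmpl, const char *const taken[],
                            int ntaken, char out[MODEL_IFNAMSIZ])
{
    const char *p = strstr(tmpl, "%d");
    int prefix_len, idx, i, n, used;

    if (p == NULL)
        return -1;
    prefix_len = (int)(p - tmpl);

    for (idx = 0; idx < 100; idx++) {       /* arbitrary search limit */
        /* Build "<prefix><idx><suffix>" without using tmpl as a format. */
        n = snprintf(out, MODEL_IFNAMSIZ, "%.*s%d%s",
                     prefix_len, tmpl, idx, p + 2);
        if (n < 0 || n >= MODEL_IFNAMSIZ)
            return -1;                      /* name would not fit */

        used = 0;
        for (i = 0; i < ntaken; i++) {
            if (strcmp(taken[i], out) == 0)
                used = 1;
        }
        if (!used)
            return idx;
    }
    return -1;
}
```

With sn0 and sn1 already present, the template sn%d produces sn2 in this model.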
17.3.2. Hardware Information
The following fields contain low-level hardware information for relatively simple devices. They are a holdover from the earlier days of Linux networking; most modern drivers do not make use of them (with the possible exception of if_port). We list them here for completeness.
unsigned long rmem_end;
unsigned long rmem_start;
unsigned long mem_end;
unsigned long mem_start;
Device memory information. These fields hold the beginning and ending addresses of the shared memory used by the device. If the device has different receive and transmit memories, the mem fields are used for transmit memory and the rmem fields for receive memory. The rmem fields are never referenced outside of the driver itself. By convention, the end fields are set so that end - start is the amount of available onboard memory.
unsigned long base_addr;
The I/O base address of the network interface. This field, like the previous ones, is assigned by the driver during the device probe. The ifconfig command can be used to display or modify the current value. The base_addr can be explicitly assigned on the kernel command line at system boot (via the netdev= parameter) or at module load time. The field, like the memory fields described above, is not used by the kernel.
unsigned char irq;
The assigned interrupt number. The value of dev->irq is printed by ifconfig when interfaces are listed. This value can usually be set at boot or load time and modified later using ifconfig.
unsigned char if_port;
The port in use on multiport devices. This field is used, for example, with devices that support both coaxial (IF_PORT_10BASE2) and twisted-pair (IF_PORT_100BASET) Ethernet connections. The full set of known port types is defined in <linux/netdevice.h>.
unsigned char dma;
The DMA channel allocated by the device. The field makes sense only with some peripheral buses, such as ISA. It is not used outside of the device driver itself but for informational purposes (in ifconfig).
17.3.3. Interface Information
Most of the information about the interface is correctly set up by the ether_setup function (or whatever other setup function is appropriate for the given hardware type). Ethernet cards can rely on this general-purpose function for most of these fields, but the flags and dev_addr fields are device specific and must be explicitly assigned at initialization time.
Some non-Ethernet interfaces can use helper functions similar to ether_setup. drivers/net/net_init.c exports a number of such functions, including the following:
void ltalk_setup(struct net_device *dev);
Sets up the fields for a LocalTalk device
void fc_setup(struct net_device *dev);
Initializes fields for fiber-channel devices
void fddi_setup(struct net_device *dev);
Configures an interface for a Fiber Distributed Data Interface (FDDI) network
void hippi_setup(struct net_device *dev);
Prepares fields for a High-Performance Parallel Interface (HIPPI) high-speed interconnect driver
void tr_setup(struct net_device *dev);
Handles setup for token ring network interfaces
Most devices are covered by one of these classes. If yours is something radically new and different, however, you need to assign the following fields by hand:
unsigned short hard_header_len;
The hardware header length, that is, the number of octets that lead the transmitted packet before the IP header, or other protocol information. The value of hard_header_len is 14 (ETH_HLEN) for Ethernet interfaces.
unsigned mtu;
The maximum transfer unit (MTU). This field is used by the network layer to drive packet transmission. Ethernet has an MTU of 1500 octets (ETH_DATA_LEN). This value can be changed with ifconfig.
unsigned long tx_queue_len;
The maximum number of frames that can be queued on the device's transmission queue. This value is set to 1000 by ether_setup, but you can change it. For example, plip uses 10 to avoid wasting system memory (plip has a lower throughput than a real Ethernet interface).
unsigned short type;
The hardware type of the interface. The type field is used by ARP to determine what kind of hardware address the interface supports. The proper value for Ethernet interfaces is ARPHRD_ETHER, and that is the value set by ether_setup. The recognized types are defined in <linux/if_arp.h>.
unsigned char addr_len;
unsigned char broadcast[MAX_ADDR_LEN];
unsigned char dev_addr[MAX_ADDR_LEN];
Hardware (MAC) address length and device hardware addresses. The Ethernet address length is six octets (we are referring to the hardware ID of the interface board), and the broadcast address is made up of six 0xff octets; ether_setup arranges for these values to be correct. The device address, on the other hand, must be read from the interface board in a device-specific way, and the driver should copy it to dev_addr. The hardware address is used to generate correct Ethernet headers before the packet is handed over to the driver for transmission. The snull device doesn't use a physical interface, and it invents its own hardware address.
unsigned short flags;
int features;
Interface flags (detailed next).
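To make the Ethernet values quoted above concrete, here is a userspace sketch of the assignments a setup helper has to perform. It is a model under stated assumptions, not the kernel's ether_setup: struct model_netdev, model_ether_setup, and model_set_hw_addr are hypothetical names, the structure holds only the fields discussed in this list, and the array size MODEL_MAX_ADDR_LEN is an assumed value chosen for the sketch.

```c
#include <string.h>

#define MODEL_MAX_ADDR_LEN 32     /* assumed array size for this sketch */
#define MODEL_ETH_ALEN     6      /* octets in an Ethernet address */
#define MODEL_ETH_HLEN     14     /* destination + source + type */
#define MODEL_ETH_DATA_LEN 1500   /* Ethernet MTU in octets */
#define MODEL_ARPHRD_ETHER 1      /* ARP hardware type for Ethernet */

/* Hypothetical stand-in holding only the fields described above. */
struct model_netdev {
    unsigned short hard_header_len;
    unsigned       mtu;
    unsigned long  tx_queue_len;
    unsigned short type;
    unsigned char  addr_len;
    unsigned char  broadcast[MODEL_MAX_ADDR_LEN];
    unsigned char  dev_addr[MODEL_MAX_ADDR_LEN];
};

/* Fill in the class-wide Ethernet defaults quoted in the text. */
static void model_ether_setup(struct model_netdev *dev)
{
    memset(dev, 0, sizeof(*dev));
    dev->hard_header_len = MODEL_ETH_HLEN;
    dev->mtu             = MODEL_ETH_DATA_LEN;
    dev->tx_queue_len    = 1000;
    dev->type            = MODEL_ARPHRD_ETHER;
    dev->addr_len        = MODEL_ETH_ALEN;
    memset(dev->broadcast, 0xff, MODEL_ETH_ALEN);   /* six 0xff octets */
}

/* The device address is device specific: the driver copies it in itself. */
static void model_set_hw_addr(struct model_netdev *dev,
                              const unsigned char *mac)
{
    memcpy(dev->dev_addr, mac, dev->addr_len);
}
```

The split into two functions mirrors the division of labor described above: the class-wide defaults come from the setup helper, while the hardware address must come from the driver.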
The flags field is a bit mask including the following bit values. The IFF_ prefix stands for "interface flags." Some flags are managed by the kernel, and some are set by the interface at initialization time to assert various capabilities and other features of the interface. The valid flags, which are defined in <linux/if.h>, are:
IFF_UP
This flag is read-only for the driver. The kernel turns it on when the interface is active and ready to transfer packets.
IFF_BROADCAST
This flag (maintained by the networking code) states that the interface allows broadcasting. Ethernet boards do.
IFF_DEBUG
This marks debug mode. The flag can be used to control the verbosity of your printk calls or for other debugging purposes. Although no in-tree driver currently uses this flag, it can be set and reset by user programs via ioctl, and your driver can use it. The misc-progs/netifdebug program can be used to turn the flag on and off.
IFF_LOOPBACK
This flag should be set only in the loopback interface. The kernel checks for IFF_LOOPBACK instead of hardwiring the lo name as a special interface.
IFF_POINTOPOINT
This flag signals that the interface is connected to a point-to-point link. It is set by the driver or, sometimes, by ifconfig. For example, plip and the PPP driver have it set.
IFF_NOARP
This means that the interface can't perform ARP. For example, point-to-point interfaces don't need to run ARP, which would only impose additional traffic without retrieving useful information. snull runs without ARP capabilities, so it sets the flag.
IFF_PROMISC
This flag is set (by the networking code) to activate promiscuous operation. By default, Ethernet interfaces use a hardware filter to ensure that they receive broadcast packets and packets directed to that interface's hardware address only. Packet sniffers such as tcpdump set promiscuous mode on the interface in order to retrieve all packets that travel on the interface's transmission medium.
IFF_MULTICAST
This flag is set by drivers to mark interfaces that are capable of multicast transmission. ether_setup sets IFF_MULTICAST by default, so if your driver does not support multicast, it must clear the flag at initialization time.
IFF_ALLMULTI
This flag tells the interface to receive all multicast packets. The kernel sets it when the host performs multicast routing, only if IFF_MULTICAST is set. IFF_ALLMULTI is read-only for the driver. Multicast flags are used in Section 17.14 later in this chapter.
IFF_MASTER
IFF_SLAVE
These flags are used by the load equalization code. The interface driver doesn't need to know about them.
IFF_PORTSEL
IFF_AUTOMEDIA
These flags signal that the device is capable of switching between multiple media types; for example, unshielded twisted pair (UTP) versus coaxial Ethernet cables. If IFF_AUTOMEDIA is set, the device selects the proper medium automatically. In practice, the kernel makes no use of either flag.
IFF_DYNAMIC
This flag, set by the driver, indicates that the address of this interface can change. It is not currently used by the kernel.
IFF_RUNNING
This flag indicates that the interface is up and running. It is mostly present for BSD compatibility; the kernel makes little use of it. Most network drivers need not worry about IFF_RUNNING.
IFF_NOTRAILERS
This flag is unused in Linux, but it exists for BSD compatibility.
When a program changes IFF_UP, the open or stop device method is called. Furthermore, when IFF_UP or any other flag is modified, the set_multicast_list method is invoked. If the driver needs to perform some action in response to a modification of the flags, it must take that action in set_multicast_list. For example, when IFF_PROMISC is set or reset, set_multicast_list must notify the onboard hardware filter. The responsibilities of this device method are outlined in Section 17.14.
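The decision a driver has to make when the flags change can be sketched as a pure function. This is a userspace model, not driver code: the MODEL_IFF_ constants are local copies of the corresponding IFF_ bit values, and model_pick_rx_mode, enum model_rx_mode, and the hw_filter_slots parameter (how many multicast addresses an imaginary hardware filter can hold) are hypothetical. A real set_multicast_list method would go on to program the device's registers with the result.

```c
/* Local copies of the IFF_PROMISC and IFF_ALLMULTI bit values. */
#define MODEL_IFF_PROMISC  0x100
#define MODEL_IFF_ALLMULTI 0x200

/* Receive modes of an imaginary hardware filter. */
enum model_rx_mode {
    RX_UNICAST_ONLY,      /* own address and broadcast only */
    RX_MULTICAST_FILTER,  /* plus the listed multicast addresses */
    RX_ALL_MULTICAST,     /* plus every multicast packet */
    RX_PROMISCUOUS        /* everything on the medium */
};

/*
 * Choose the receive mode from the interface flags and the length of
 * the multicast list. hw_filter_slots is a hypothetical hardware limit.
 */
static enum model_rx_mode model_pick_rx_mode(unsigned short flags,
                                             int mc_count,
                                             int hw_filter_slots)
{
    if (flags & MODEL_IFF_PROMISC)
        return RX_PROMISCUOUS;

    /* Asked for all multicast, or too many addresses for the filter. */
    if ((flags & MODEL_IFF_ALLMULTI) || mc_count > hw_filter_slots)
        return RX_ALL_MULTICAST;

    if (mc_count == 0)
        return RX_UNICAST_ONLY;

    return RX_MULTICAST_FILTER;
}
```

Checking IFF_PROMISC first reflects the fact that promiscuous mode already includes every other case, so nothing else needs to be examined once it is set.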
The features field of the net_device structure is set by the driver to tell the kernel about any special hardware capabilities that this interface has. We will discuss some of these features; others are beyond the scope of this book. The full set is:
NETIF_F_SG
NETIF_F_FRAGLIST
Both of these flags control the use of scatter/gather I/O. If your interface can transmit a packet that has been split into several distinct memory segments, you should set NETIF_F_SG. Of course, you have to actually implement the scatter/gather I/O (we describe how that is done in Section 17.5.3). NETIF_F_FRAGLIST states that your interface can cope with packets that have been fragmented; only the loopback driver does this in 2.6.
Note that the kernel does not perform scatter/gather I/O to your device if it does not also provide some form of checksumming as well. The reason is that, if the kernel has to make a pass over a fragmented ("nonlinear") packet to calculate the checksum, it might as well copy the data and coalesce the packet at the same time.
NETIF_F_IP_CSUM
NETIF_F_NO_CSUM
NETIF_F_HW_CSUM
These flags are all ways of telling the kernel that it need not apply checksums to some or all packets leaving the system by this interface. Set NETIF_F_IP_CSUM if your interface can checksum IP packets but not others. If no checksums are ever required for this interface, set NETIF_F_NO_CSUM. The loopback driver sets this flag, and snull does, too; since packets are only transferred through system memory, there is (one hopes!) no opportunity for them to be corrupted, and no need to check them. If your hardware does checksumming itself, set NETIF_F_HW_CSUM.
NETIF_F_HIGHDMA
Set this flag if your device can perform DMA to high memory. In the absence of this flag, all packet buffers provided to your driver are allocated in low memory.
NETIF_F_HW_VLAN_TX
NETIF_F_HW_VLAN_RX
NETIF_F_HW_VLAN_FILTER
NETIF_F_VLAN_CHALLENGED
These options describe your hardware's support for 802.1q VLAN packets. VLAN support is beyond what we can cover in this chapter. If VLAN packets confuse your device (which they really shouldn't), set the NETIF_F_VLAN_CHALLENGED flag.
NETIF_F_TSO
Set this flag if your device can perform TCP segmentation offloading. TSO is an advanced feature that we cannot cover here.
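The interaction between scatter/gather and checksumming noted earlier in this list can be expressed as a small check on the feature mask. The following is a userspace sketch of that rule only: the MODEL_F_ constants use illustrative bit values chosen for this example (the real definitions live in <linux/netdevice.h>), and model_usable_features is a hypothetical helper, not a kernel function.

```c
/* Illustrative bit values for this sketch only. */
#define MODEL_F_SG      0x01
#define MODEL_F_IP_CSUM 0x02
#define MODEL_F_NO_CSUM 0x04
#define MODEL_F_HW_CSUM 0x08
#define MODEL_F_HIGHDMA 0x20

#define MODEL_F_ANY_CSUM \
    (MODEL_F_IP_CSUM | MODEL_F_NO_CSUM | MODEL_F_HW_CSUM)

/*
 * Return the features that can actually be honored: scatter/gather is
 * dropped when the device offers no form of checksumming, because the
 * checksum pass over a fragmented packet could just as well copy and
 * coalesce the data.
 */
static int model_usable_features(int features)
{
    if ((features & MODEL_F_SG) && !(features & MODEL_F_ANY_CSUM))
        features &= ~MODEL_F_SG;
    return features;
}
```

A driver that sets only the scatter/gather bit therefore ends up, in this model, with no scatter/gather at all; pairing it with any one of the three checksum flags keeps it.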
17.3.4. The Device Methods
As happens with the char and block drivers, each network device declares the functions that act on it. Operations that can be performed on network interfaces are listed in this section. Some of the operations can be left NULL, and others are usually untouched because ether_setup assigns suitable methods to them.
Device methods for a network interface can be divided into two groups: fundamental and optional. Fundamental methods include those that are needed to be able to use the interface; optional methods implement more advanced functionalities that are not strictly required. The following are the fundamental methods:
int (*open)(struct net_device *dev);
Opens the interface. The interface is opened whenever ifconfig activates it. The open method should register any system resource it needs (I/O ports, IRQ, DMA, etc.), turn on the hardware, and perform any other setup your device requires.
int (*stop)(struct net_device *dev);
Stops the interface. The interface is stopped when it is brought down. This function should reverse operations performed at open time.
int (*hard_start_xmit) (struct sk_buff *skb, struct net_device *dev);
Method that initiates the transmission of a packet. The full packet (protocol headers and all) is contained in a socket buffer (sk_buff) structure. Socket buffers are introduced later in this chapter.
int (*hard_header) (struct sk_buff *skb, struct net_device *dev, unsigned
short type, void *daddr, void *saddr, unsigned len);
Function (called before hard_start_xmit) that builds the hardware header from the source and destination hardware addresses that were previously retrieved; its job is to organize the information passed to it as arguments into an appropriate, device-specific hardware header. eth_header is the default function for Ethernet-like interfaces, and ether_setup assigns this field accordingly.
int (*rebuild_header)(struct sk_buff *skb);
Function used to rebuild the hardware header after ARP resolution completes but before a packet is transmitted. The default function used by Ethernet devices uses the ARP support code to fill the packet with missing information.
void (*tx_timeout)(struct net_device *dev);
Method called by the networking code when a packet transmission fails to complete within a reasonable period, on the assumption that an interrupt has been missed or the interface has locked up. It should handle the problem and resume packet transmission.
struct net_device_stats *(*get_stats)(struct net_device *dev);
Whenever an application needs to get statistics for the interface, this method is called. This happens, for example, when ifconfig or netstat -i is run. A sample implementation for snull is introduced in Section 17.13.
int (*set_config)(struct net_device *dev, struct ifmap *map);
Changes the interface configuration. This method is the entry point for configuring the driver. The I/O address for the device and its interrupt number can be changed at runtime using set_config. This capability can be used by the system administrator if the interface cannot be probed for. Drivers for modern hardware normally do not need to implement this method.
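Of the fundamental methods, hard_header is the easiest to picture in isolation, because an Ethernet hardware header is just 14 octets: destination address, source address, and a 16-bit protocol type in network byte order. The sketch below builds such a header in a plain buffer. It is a userspace illustration and not the kernel's eth_header: model_eth_header and its buffer-based signature are hypothetical, and the real method works on a socket buffer rather than a byte array.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_ETH_ALEN 6    /* octets per hardware address */
#define MODEL_ETH_HLEN 14   /* 6 + 6 + 2, the Ethernet hard_header_len */

/*
 * Write an Ethernet header at the start of buf. Returns the header
 * length, or -1 if the buffer is too small to hold it.
 */
static int model_eth_header(unsigned char *buf, size_t buflen,
                            const unsigned char *daddr,
                            const unsigned char *saddr,
                            unsigned short type)
{
    if (buflen < MODEL_ETH_HLEN)
        return -1;

    memcpy(buf, daddr, MODEL_ETH_ALEN);                   /* destination */
    memcpy(buf + MODEL_ETH_ALEN, saddr, MODEL_ETH_ALEN);  /* source */

    /* Protocol type, most significant octet first. */
    buf[12] = (unsigned char)(type >> 8);
    buf[13] = (unsigned char)(type & 0xff);

    return MODEL_ETH_HLEN;
}
```

Passing 0x0800 as the type marks the payload as an IP packet; the value returned on success is the same 14 that Ethernet interfaces store in hard_header_len.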
The remaining device operations are optional:
int weight;
int (*poll)(struct net_device *dev, int *quota);
Method provided by NAPI-compliant drivers to operate the interface in a polled mode, with interrupts disabled. NAPI (and the weight field) are covered in Section 17.8.
void (*poll_controller)(struct net_device *dev);
Function that asks the driver to check for events on the interface in situations where interrupts are disabled. It is used for specific in-kernel networking tasks, such as remote consoles and kernel debugging over the network.
int (*do_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd);
Performs interface-specific ioctl commands. (Implementation of those commands is described in Section 17.12.) The corresponding field in struct net_device can be left as NULL if the interface doesn't need any interface-specific commands.
void (*set_multicast_list)(struct net_device *dev);
Method called when the multicast list for the device changes and when the flags change. See Section 17.14 for further details and a sample implementation.
int (*set_mac_address)(struct net_device *dev, void *addr);
Function that can be implemented if the interface supports the ability to change its hardware address. Many interfaces don't support this ability at all. Others use the default eth_mac_addr implementation (from drivers/net/net_init.c). eth_mac_addr only copies the new address into dev->dev_addr, and it does so only if the interface is not running. Drivers that use eth_mac_addr should set the hardware MAC address from dev->dev_addr in their open method.
int (*change_mtu)(struct net_device *dev, int new_mtu);
Function that takes action if there is a change in the maximum transfer unit (MTU) for the interface. If the driver needs to do anything particular when the MTU is changed by the user, it should declare its own function; otherwise, the default does the right thing. snull has a template for the function if you are interested.
int (*header_cache) (struct neighbour *neigh, struct hh_cache *hh);
header_cache is called to fill in the hh_cache structure with the results of an ARP query. Almost all Ethernet-like drivers can use the default eth_header_cache implementation.
int (*header_cache_update) (struct hh_cache *hh, struct net_device *dev,
unsigned char *haddr);
Method that updates the destination address in the hh_cache structure in response to a change. Ethernet devices use eth_header_cache_update.
int (*hard_header_parse) (struct sk_buff *skb, unsigned char *haddr);
The hard_header_parse method extracts the source address from the packet contained in skb, copying it into the buffer at haddr. The return value from the function is the length of that address. Ethernet devices normally use eth_header_parse.
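The contract described for set_mac_address (copy the new address only while the interface is down, then let open push it to the hardware) can be modeled in a few lines. Everything here is a hypothetical userspace stand-in: struct model_mac_dev, its running flag, and the hw_mac array that plays the role of the card's address register are inventions for this sketch, and returning -EBUSY for a running interface is an assumption about how a refusal might be reported.

```c
#include <errno.h>
#include <string.h>

#define MODEL_ETH_ALEN 6

/* Hypothetical device model: software state plus a fake register. */
struct model_mac_dev {
    int           running;                   /* nonzero once opened */
    unsigned char addr_len;
    unsigned char dev_addr[MODEL_ETH_ALEN];  /* software copy */
    unsigned char hw_mac[MODEL_ETH_ALEN];    /* stands in for hardware */
};

/* Copy the new address only if the interface is not running. */
static int model_set_mac_address(struct model_mac_dev *dev,
                                 const unsigned char *addr)
{
    if (dev->running)
        return -EBUSY;                       /* assumed error code */
    memcpy(dev->dev_addr, addr, dev->addr_len);
    return 0;
}

/* The open method loads the hardware from the software copy. */
static int model_open(struct model_mac_dev *dev)
{
    memcpy(dev->hw_mac, dev->dev_addr, dev->addr_len);
    dev->running = 1;
    return 0;
}
```

Because the hardware is written only in the open method, an address change made while the interface is down takes effect the next time it is brought up, and a change attempted while it is up is refused without touching either copy.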
17.3.5. Utility Fields
The remaining struct net_device data fields are used by the interface to hold useful status information. Some of the fields are used by ifconfig and netstat to provide the user with information about the current configuration. Therefore, an interface should assign values to these fields:
unsigned long trans_start;
unsigned long last_rx;
Fields that hold a jiffies value. The driver is responsible for updating these values when transmission begins and when a packet is received, respectively. The trans_start value is used by the networking subsystem to detect transmitter lockups. last_rx is currently unused, but the driver should maintain this field anyway to be prepared for future use.
int watchdog_timeo;
The minimum time (in jiffies) that should pass before the networking layer decides that a transmission timeout has occurred and calls the driver's tx_timeout function.
void *priv;
The equivalent of filp->private_data. In modern drivers, this field is set by alloc_netdev and should not be accessed directly; use netdev_priv instead.
struct dev_mc_list *mc_list;
int mc_count;
Fields that handle multicast transmission. mc_count is the count of items in mc_list. See Section 17.14 for further details.
spinlock_t xmit_lock;
int xmit_lock_owner;
The xmit_lock is used to avoid multiple simultaneous calls to the driver's hard_start_xmit function. xmit_lock_owner is the number of the CPU that has obtained xmit_lock. The driver should make no changes to these fields.
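The relationship between trans_start and watchdog_timeo described earlier in this list comes down to one comparison of jiffies values. The sketch below shows that comparison in userspace form. model_time_after and model_tx_timed_out are hypothetical helpers, the counter values are passed in as plain arguments, and the sketch ignores any other conditions the networking layer may check before calling tx_timeout.

```c
/*
 * Wraparound-safe "a is later than b" for a free-running counter such
 * as jiffies: the unsigned difference is reinterpreted as signed.
 */
static int model_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/*
 * A transmission is considered timed out once more than watchdog_timeo
 * ticks have elapsed since trans_start.
 */
static int model_tx_timed_out(unsigned long now,
                              unsigned long trans_start,
                              int watchdog_timeo)
{
    return model_time_after(now,
                            trans_start + (unsigned long)watchdog_timeo);
}
```

Comparing through a signed difference, rather than writing now > trans_start + watchdog_timeo directly, keeps the test correct when the counter wraps around between the start of the transmission and the check. This is also why the driver must refresh trans_start every time it begins a transmission.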
There are other fields in struct net_device, but they are not used by network drivers.