12.1. The PCI Interface

Although many computer users think of PCI as a way of laying out electrical wires, it is actually a complete set of specifications defining how different parts of a computer should interact.

The PCI specification covers most issues related to computer interfaces. We are not going to cover it all here; in this section, we are mainly concerned with how a PCI driver can find its hardware and gain access to it. The probing techniques discussed in Chapter 9 and Chapter 10 can be used with PCI devices, but the specification offers an alternative that is preferable to probing.

The PCI architecture was designed as a replacement for the ISA standard, with three main goals: to get better performance when transferring data between the computer and its peripherals, to be as platform independent as possible, and to simplify adding and removing peripherals to the system.

The PCI bus achieves better performance by using a higher clock rate than ISA; its clock runs at 25 or 33 MHz (its actual rate being a factor of the system clock), and 66-MHz and even 133-MHz implementations have recently been deployed as well. Moreover, it is equipped with a 32-bit data bus, and a 64-bit extension has been included in the specification. Platform independence is often a goal in the design of a computer bus, and it's an especially important feature of PCI, because the PC world has always been dominated by processor-specific interface standards. PCI is currently used extensively on IA-32, Alpha, PowerPC, SPARC64, and IA-64 systems, and some other platforms as well.

What is most relevant to the driver writer, however, is PCI's support for autodetection of interface boards. PCI devices are jumperless (unlike most older peripherals) and are automatically configured at boot time. Then, the device driver must be able to access configuration information in the device in order to complete initialization. This happens without the need to perform any probing.

12.1.1. PCI Addressing

Each PCI peripheral is identified by a bus number, a device number, and a function number. The PCI specification permits a single system to host up to 256 buses, but because 256 buses are not sufficient for many large systems, Linux now supports PCI domains. Each PCI domain can host up to 256 buses. Each bus hosts up to 32 devices, and each device can be a multifunction board (such as an audio device with an accompanying CD-ROM drive) with a maximum of eight functions. Therefore, each function can be identified at hardware level by a 16-bit address, or key. Device drivers written for Linux, though, don't need to deal with those binary addresses, because they use a specific data structure, called pci_dev, to act on the devices.

Most recent workstations feature at least two PCI buses. Plugging more than one bus in a single system is accomplished by means of bridges, special-purpose PCI peripherals whose task is joining two buses. The overall layout of a PCI system is a tree where each bus is connected to an upper-layer bus, up to bus 0 at the root of the tree. The CardBus PC-card system is also connected to the PCI system via bridges. A typical PCI system is represented in Figure 12-1, where the various bridges are highlighted.

Figure 12-1. Layout of a typical PCI system

The 16-bit hardware addresses associated with PCI peripherals, although mostly hidden in the struct pci_dev object, are still visible occasionally, especially when lists of devices are being used. One such situation is the output of lspci (part of the pciutils package, available with most distributions) and the layout of information in /proc/pci and /proc/bus/pci. The sysfs representation of PCI devices also shows this addressing scheme, with the addition of the PCI domain information.[1] When the hardware address is displayed, it can be shown as two values (an 8-bit bus number and an 8-bit device and function number), as three values (bus, device, and function), or as four values (domain, bus, device, and function); all the values are usually displayed in hexadecimal.

[1] Some architectures also display the PCI domain information in the /proc/pci and /proc/bus/pci files.

For example, /proc/bus/pci/devices uses a single 16-bit field (to ease parsing and sorting), while /proc/bus/busnumber splits the address into three fields. The following shows how those addresses appear, showing only the beginning of the output lines:

$ lspci | cut -d: -f1-3
0000:00:00.0 Host bridge
0000:00:00.1 RAM memory
0000:00:00.2 RAM memory
0000:00:02.0 USB Controller
0000:00:04.0 Multimedia audio controller
0000:00:06.0 Bridge
0000:00:07.0 ISA bridge
0000:00:09.0 USB Controller
0000:00:09.1 USB Controller
0000:00:09.2 USB Controller
0000:00:0c.0 CardBus bridge
0000:00:0f.0 IDE interface
0000:00:10.0 Ethernet controller
0000:00:12.0 Network controller
0000:00:13.0 FireWire (IEEE 1394)
0000:00:14.0 VGA compatible controller
$ cat /proc/bus/pci/devices | cut -f1
0000
0001
0002
0010
0020
0030
0038
0048
0049
004a
0060
0078
0080
0090
0098
00a0
$ tree /sys/bus/pci/devices/
/sys/bus/pci/devices/
|-- 0000:00:00.0 -> ../../../devices/pci0000:00/0000:00:00.0
|-- 0000:00:00.1 -> ../../../devices/pci0000:00/0000:00:00.1
|-- 0000:00:00.2 -> ../../../devices/pci0000:00/0000:00:00.2
|-- 0000:00:02.0 -> ../../../devices/pci0000:00/0000:00:02.0
|-- 0000:00:04.0 -> ../../../devices/pci0000:00/0000:00:04.0
|-- 0000:00:06.0 -> ../../../devices/pci0000:00/0000:00:06.0
|-- 0000:00:07.0 -> ../../../devices/pci0000:00/0000:00:07.0
|-- 0000:00:09.0 -> ../../../devices/pci0000:00/0000:00:09.0
|-- 0000:00:09.1 -> ../../../devices/pci0000:00/0000:00:09.1
|-- 0000:00:09.2 -> ../../../devices/pci0000:00/0000:00:09.2
|-- 0000:00:0c.0 -> ../../../devices/pci0000:00/0000:00:0c.0
|-- 0000:00:0f.0 -> ../../../devices/pci0000:00/0000:00:0f.0
|-- 0000:00:10.0 -> ../../../devices/pci0000:00/0000:00:10.0
|-- 0000:00:12.0 -> ../../../devices/pci0000:00/0000:00:12.0
|-- 0000:00:13.0 -> ../../../devices/pci0000:00/0000:00:13.0
`-- 0000:00:14.0 -> ../../../devices/pci0000:00/0000:00:14.0

All three lists of devices are sorted in the same order, since lspci uses the /proc files as its source of information. Taking the VGA video controller as an example, 0x00a0 means 0000:00:14.0 when split into domain (16 bits), bus (8 bits), device (5 bits) and function (3 bits).
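The split of the 16-bit key is easy to reproduce outside the kernel. What follows is a minimal user-space sketch, not kernel code; the names pci_key_fields, pci_key_decode, and pci_key_encode are our own inventions for illustration, and the domain is left out because it is not part of the 16-bit key.

```c
#include <stdint.h>

/* Hypothetical helper type: the three fields packed into the 16-bit key. */
struct pci_key_fields {
    unsigned int bus;      /* 8 bits: 0-255 */
    unsigned int device;   /* 5 bits: 0-31  */
    unsigned int function; /* 3 bits: 0-7   */
};

/* Split a 16-bit key (as printed in /proc/bus/pci/devices) into its parts. */
static struct pci_key_fields pci_key_decode(uint16_t key)
{
    struct pci_key_fields f;

    f.bus      = (key >> 8) & 0xff;
    f.device   = (key >> 3) & 0x1f;
    f.function = key & 0x07;
    return f;
}

/* Pack bus, device, and function numbers back into the 16-bit key. */
static uint16_t pci_key_encode(unsigned int bus, unsigned int device,
                               unsigned int function)
{
    return (uint16_t)(((bus & 0xff) << 8) | ((device & 0x1f) << 3) |
                      (function & 0x07));
}
```

Calling pci_key_decode(0x00a0) yields bus 0, device 0x14, and function 0, which is the 00:14.0 entry in the listings above.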

The hardware circuitry of each peripheral board answers queries pertaining to three address spaces: memory locations, I/O ports, and configuration registers. The first two address spaces are shared by all the devices on the same PCI bus (i.e., when you access a memory location, all the devices on that PCI bus see the bus cycle at the same time). The configuration space, on the other hand, exploits geographical addressing. Configuration queries address only one slot at a time, so they never collide.

As far as the driver is concerned, memory and I/O regions are accessed in the usual ways via inb, readb, and so forth. Configuration transactions, on the other hand, are performed by calling specific kernel functions to access configuration registers. With regard to interrupts, every PCI slot has four interrupt pins, and each device function can use one of them without being concerned about how those pins are routed to the CPU. Such routing is the responsibility of the computer platform and is implemented outside of the PCI bus. Since the PCI specification requires interrupt lines to be shareable, even a processor with a limited number of IRQ lines, such as the x86, can host many PCI interface boards (each with four interrupt pins).

The I/O space in a PCI bus uses a 32-bit address bus (leading to 4 GB of I/O ports), while the memory space can be accessed with either 32-bit or 64-bit addresses. 64-bit addresses are available on more recent platforms. Addresses are supposed to be unique to one device, but software may erroneously configure two devices to the same address, making it impossible to access either one. But this problem never occurs unless a driver is willingly playing with registers it shouldn't touch. The good news is that every memory and I/O address region offered by the interface board can be remapped by means of configuration transactions. That is, the firmware initializes PCI hardware at system boot, mapping each region to a different address to avoid collisions.[2] The addresses to which these regions are currently mapped can be read from the configuration space, so the Linux driver can access its devices without probing. After reading the configuration registers, the driver can safely access its hardware.

[2] Actually, that configuration is not restricted to the time the system boots; hotpluggable devices, for example, cannot be available at boot time and appear later instead. The main point here is that the device driver must not change the address of I/O or memory regions.

The PCI configuration space consists of 256 bytes for each device function (except for PCI Express devices, which have 4 KB of configuration space for each function), and the layout of the configuration registers is standardized. Four bytes of the configuration space hold a unique function ID, so the driver can identify its device by looking for the specific ID for that peripheral.[3] In summary, each device board is geographically addressed to retrieve its configuration registers; the information in those registers can then be used to perform normal I/O access, without the need for further geographic addressing.

[3] You'll find the ID of any device in its own hardware manual. A list is included in the file pci.ids, part of the pciutils package and the kernel sources; it doesn't pretend to be complete but just lists the most renowned vendors and devices. The kernel version of this file will not be included in future kernel series.

It should be clear from this description that the main innovation of the PCI interface standard over ISA is the configuration address space. Therefore, in addition to the usual driver code, a PCI driver needs the ability to access the configuration space, in order to save itself from risky probing tasks.

For the remainder of this chapter, we use the word device to refer to a device function, because each function in a multifunction board acts as an independent entity. When we refer to a device, we mean the tuple "domain number, bus number, device number, and function number."

12.1.2. Boot Time

To see how PCI works, we start from system boot, since that's when the devices are configured.

When power is applied to a PCI device, the hardware remains inactive. In other words, the device responds only to configuration transactions. At power on, the device has no memory and no I/O ports mapped in the computer's address space; every other device-specific feature, such as interrupt reporting, is disabled as well.

Fortunately, every PCI motherboard is equipped with PCI-aware firmware, called the BIOS, NVRAM, or PROM, depending on the platform. The firmware offers access to the device configuration address space by reading and writing registers in the PCI controller.

At system boot, the firmware (or the Linux kernel, if so configured) performs configuration transactions with every PCI peripheral in order to allocate a safe place for each address region it offers. By the time a device driver accesses the device, its memory and I/O regions have already been mapped into the processor's address space. The driver can change this default assignment, but it never needs to do that.

As suggested, the user can look at the PCI device list and the devices' configuration registers by reading /proc/bus/pci/devices and /proc/bus/pci/*/*. The former is a text file with (hexadecimal) device information, and the latter are binary files that report a snapshot of the configuration registers of each device, one file per device. The individual PCI device directories in the sysfs tree can be found in /sys/bus/pci/devices. A PCI device directory contains a number of different files:

The file config is a binary file that allows the raw PCI config information to be read from the device (just as /proc/bus/pci/*/* provides). The files vendor, device, subsystem_device, subsystem_vendor, and class all refer to the specific values of this PCI device (all PCI devices provide this information). The file irq shows the current IRQ assigned to this PCI device, and the file resource shows the current memory resources allocated by this device.
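A user-space tool that reads the identification files needs to turn their text into numbers. The sketch below assumes the files hold a single hexadecimal value followed by a newline, such as "0x8086\n"; that format is an assumption about typical sysfs output rather than something stated above, and the function name parse_sysfs_hex is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse the text of a sysfs attribute such as "0x8086\n" (assumed format).
 * Returns 0 and stores the value on success, -1 on malformed input.
 */
static int parse_sysfs_hex(const char *text, unsigned long *value)
{
    char *end;
    unsigned long v;

    errno = 0;
    v = strtoul(text, &end, 16);    /* base 16 accepts an optional 0x prefix */
    if (errno != 0 || end == text)
        return -1;                  /* overflow, or no digits at all */
    if (*end != '\0' && *end != '\n')
        return -1;                  /* unexpected trailing characters */
    *value = v;
    return 0;
}
```

A caller would read the file contents into a buffer with the usual fopen and fgets calls and pass the buffer to this function; the error return makes it easy to skip files that do not follow the assumed format.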

12.1.3. Configuration Registers and Initialization

In this section, we look at the configuration registers that PCI devices contain. All PCI devices feature at least a 256-byte address space. The first 64 bytes are standardized, while the rest are device dependent. Figure 12-2 shows the layout of the device-independent configuration space.

Figure 12-2. The standardized PCI configuration registers

As the figure shows, some of the PCI configuration registers are required and some are optional. Every PCI device must contain meaningful values in the required registers, whereas the contents of the optional registers depend on the actual capabilities of the peripheral. The optional fields are not used unless the contents of the required fields indicate that they are valid. Thus, the required fields assert the board's capabilities, including whether the other fields are usable.

It's interesting to note that the PCI registers are always little-endian. Although the standard is designed to be architecture independent, the PCI designers sometimes show a slight bias toward the PC environment. The driver writer should be careful about byte ordering when accessing multibyte configuration registers; code that works on the PC might not work on other platforms. The Linux developers have taken care of the byte-ordering problem (see Section 12.1.8), but the issue must be kept in mind. If you ever need to convert data from host order to PCI order or vice versa, you can resort to the functions defined in <asm/byteorder.h>, introduced in Chapter 11, knowing that PCI byte order is little-endian.
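To make the byte-ordering issue concrete, here is a small user-space sketch that assembles multibyte values from raw configuration bytes without depending on the host's byte order. It is only an illustration of what "little-endian" means for these registers; the helper names pci_le16_to_host and pci_le32_to_host are hypothetical and are not the kernel helpers from <asm/byteorder.h>.

```c
#include <stdint.h>

/* Build a 16-bit value from two bytes stored in PCI (little-endian) order. */
static uint16_t pci_le16_to_host(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Build a 32-bit value from four bytes stored in PCI (little-endian) order. */
static uint32_t pci_le32_to_host(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Because the helpers work byte by byte, they give the same result on big-endian and little-endian hosts; a plain pointer cast to a wider integer type would be correct only on the latter.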

Describing all the configuration items is beyond the scope of this book. Usually, the technical documentation released with each device describes the supported registers. What we're interested in is how a driver can look for its device and how it can access the device's configuration space.

Three or five PCI registers identify a device: vendorID, deviceID, and class are the three that are always used. Every PCI manufacturer assigns proper values to these read-only registers, and the driver can use them to look for the device. Additionally, the fields subsystem vendorID and subsystem deviceID are sometimes set by the vendor to further differentiate similar devices.

Let's look at these registers in more detail:

vendorID

This 16-bit register identifies a hardware manufacturer. For instance, every Intel device is marked with the same vendor number, 0x8086. There is a global registry of such numbers, maintained by the PCI Special Interest Group, and manufacturers must apply to have a unique number assigned to them.

deviceID

This is another 16-bit register, selected by the manufacturer; no official registration is required for the device ID. This ID is usually paired with the vendor ID to make a unique 32-bit identifier for a hardware device. We use the word signature to refer to the vendor and device ID pair. A device driver usually relies on the signature to identify its device; you can find what value to look for in the hardware manual for the target device.

class

Every peripheral device belongs to a class. The class register is a 16-bit value whose top 8 bits identify the "base class" (or group). For example, "ethernet" and "token ring" are two classes belonging to the "network" group, while the "serial" and "parallel" classes belong to the "communication" group. Some drivers can support several similar devices, each of them featuring a different signature but all belonging to the same class; these drivers can rely on the class register to identify their peripherals, as shown later.

subsystem vendorID

subsystem deviceID

These fields can be used for further identification of a device. If the chip is a generic interface chip to a local (onboard) bus, it is often used in several completely different roles, and the driver must identify the actual device it is talking with. The subsystem identifiers are used to this end.
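The relationship between these registers can be sketched in a few lines of user-space C. The helper names are hypothetical, and placing the device ID in the upper half of the signature is an assumption of this sketch (it mirrors the order in which the two registers sit at the start of the configuration space).

```c
#include <stdint.h>

/* Combine the vendor and device IDs into one 32-bit signature.
 * Assumption of this sketch: device ID in the upper 16 bits. */
static uint32_t pci_signature(uint16_t vendor, uint16_t device)
{
    return ((uint32_t)device << 16) | vendor;
}

/* The top 8 bits of the 16-bit class register select the base class. */
static uint8_t pci_base_class(uint16_t class_reg)
{
    return (uint8_t)(class_reg >> 8);
}

/* The low 8 bits distinguish the classes within one base class. */
static uint8_t pci_sub_class(uint16_t class_reg)
{
    return (uint8_t)(class_reg & 0xff);
}
```

With the class codes commonly assigned to network devices, 0x0200 (ethernet) and 0x0201 (token ring) both yield base class 0x02, which is how a driver can recognize "some kind of network device" without knowing the signature.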

Using these different identifiers, a PCI driver can tell the kernel what kind of devices it supports. The struct pci_device_id structure is used to define a list of the different types of PCI devices that a driver supports. This structure contains the following fields:

__u32 vendor;

__u32 device;

These specify the PCI vendor and device IDs of a device. If a driver can handle any vendor or device ID, the value PCI_ANY_ID should be used for these fields.

__u32 subvendor;

__u32 subdevice;

These specify the PCI subsystem vendor and subsystem device IDs of a device. If a driver can handle any type of subsystem ID, the value PCI_ANY_ID should be used for these fields.

__u32 class;

__u32 class_mask;

These two values allow the driver to specify that it supports a type of PCI class device. The different classes of PCI devices (a VGA controller is one example) are described in the PCI specification. If a driver can handle any class of device, both fields can be left as 0, because a class_mask of 0 matches every class.

kernel_ulong_t driver_data;

This value is not used to match a device but is used to hold information that the PCI driver can use to differentiate between different devices if it wants to.

There are two helper macros that should be used to initialize a struct pci_device_id structure:

PCI_DEVICE(vendor, device)

This creates a struct pci_device_id that matches only the specific vendor and device ID. The macro sets the subvendor and subdevice fields of the structure to PCI_ANY_ID.

PCI_DEVICE_CLASS(device_class, device_class_mask)

This creates a struct pci_device_id that matches a specific PCI class.

An example of using these macros to define the type of devices a driver supports can be found in the following kernel files:

drivers/usb/host/ehci-hcd.c:
static const struct pci_device_id pci_ids[] = { {
        /* handle any USB 2.0 EHCI controller */
        PCI_DEVICE_CLASS(((PCI_CLASS_SERIAL_USB << 8) | 0x20), ~0),
        .driver_data =  (unsigned long) &ehci_driver,
        },
        { /* end: all zeroes */ }
};

drivers/i2c/busses/i2c-i810.c:
static struct pci_device_id i810_ids[] = {
    { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82810_IG1) },
    { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82810_IG3) },
    { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82810E_IG) },
    { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82815_CGC) },
    { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82845G_IG) },
    { 0, },
};
};

These examples create a list of struct pci_device_id structures, with an empty structure set to all zeros as the last value in the list. This array of IDs is used in the struct pci_driver (described below), and it is also used to tell user space which devices this specific driver supports.
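To see how such a table might be consulted, consider the following user-space model of the matching step. It is a sketch under stated assumptions, not the PCI core's code: the sk_ names are hypothetical, SK_ANY_ID stands in for PCI_ANY_ID and is assumed to be all ones, the class field is called class_code here, and the rule that a class matches when the bits selected by class_mask agree is our reading of how the mask is meant to work.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_ANY_ID 0xffffffffu   /* stand-in for PCI_ANY_ID (assumed all ones) */

struct sk_device_id {           /* user-space mirror of the matching fields */
    uint32_t vendor, device;
    uint32_t subvendor, subdevice;
    uint32_t class_code, class_mask;
};

struct sk_device {              /* identification values read from one device */
    uint32_t vendor, device;
    uint32_t subvendor, subdevice;
    uint32_t class_code;
};

/* A table field matches if it is the wildcard or equals the device's value. */
static int sk_field_matches(uint32_t wanted, uint32_t actual)
{
    return wanted == SK_ANY_ID || wanted == actual;
}

/* Return the first table entry that matches dev, or NULL if none does.
 * The table must end with an all-zero entry, as in the examples above. */
static const struct sk_device_id *
sk_match(const struct sk_device_id *table, const struct sk_device *dev)
{
    const struct sk_device_id *id;

    for (id = table; id->vendor || id->subvendor || id->class_mask; id++) {
        if (sk_field_matches(id->vendor, dev->vendor) &&
            sk_field_matches(id->device, dev->device) &&
            sk_field_matches(id->subvendor, dev->subvendor) &&
            sk_field_matches(id->subdevice, dev->subdevice) &&
            !((id->class_code ^ dev->class_code) & id->class_mask))
            return id;
    }
    return NULL;
}
```

In this model, an entry built the way PCI_DEVICE builds one (wildcard subsystem IDs, zero class and mask) matches on the signature alone, while an entry built the way PCI_DEVICE_CLASS builds one (wildcard IDs, full mask) matches on the class alone.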

12.1.4. MODULE_DEVICE_TABLE

This pci_device_id structure needs to be exported to user space to allow the hotplug and module loading systems to know what module works with what hardware devices. The macro MODULE_DEVICE_TABLE accomplishes this. An example is:

MODULE_DEVICE_TABLE(pci, i810_ids);

This statement creates a local variable called __mod_pci_device_table that points to the list of struct pci_device_id. Later in the kernel build process, the depmod program searches all modules for the symbol __mod_pci_device_table. If that symbol is found, it pulls the data out of the module and adds it to the file /lib/modules/KERNEL_VERSION/modules.pcimap. After depmod completes, all PCI devices that are supported by modules in the kernel are listed, along with their module names, in that file. When the kernel tells the hotplug system that a new PCI device has been found, the hotplug system uses the modules.pcimap file to find the proper driver to load.

12.1.5. Registering a PCI Driver

The main structure that all PCI drivers must create in order to be registered with the kernel properly is the struct pci_driver structure. This structure consists of a number of function callbacks and variables that describe the PCI driver to the PCI core. Here are the fields in this structure that a PCI driver needs to be aware of:

const char *name;

The name of the driver. It must be unique among all PCI drivers in the kernel and is normally set to the same name as the module name of the driver. It shows up in sysfs under /sys/bus/pci/drivers/ when the driver is in the kernel.

const struct pci_device_id *id_table;

Pointer to the struct pci_device_id table described earlier in this chapter.

int (*probe) (struct pci_dev *dev, const struct pci_device_id *id);

Pointer to the probe function in the PCI driver. This function is called by the PCI core when it has a struct pci_dev that it thinks this driver wants to control. A pointer to the struct pci_device_id that the PCI core used to make this decision is also passed to this function. If the PCI driver claims the struct pci_dev that is passed to it, it should initialize the device properly and return 0. If the driver does not want to claim the device, or an error occurs, it should return a negative error value. More details about this function follow later in this chapter.

void (*remove) (struct pci_dev *dev);

Pointer to the function that the PCI core calls when the struct pci_dev is being removed from the system, or when the PCI driver is being unloaded from the kernel. More details about this function follow later in this chapter.

int (*suspend) (struct pci_dev *dev, u32 state);

Pointer to the function that the PCI core calls when the struct pci_dev is being suspended. The suspend state is passed in the state variable. This function is optional; a driver does not have to provide it.

int (*resume) (struct pci_dev *dev);

Pointer to the function that the PCI core calls when the struct pci_dev is being resumed. It is always called after suspend has been called. This function is optional; a driver does not have to provide it.

In summary, to create a proper struct pci_driver structure, only four fields need to be initialized:

static struct pci_driver pci_driver = {
    .name = "pci_skel",
    .id_table = ids,
    .probe = probe,
    .remove = remove,
};

To register the struct pci_driver with the PCI core, a call to pci_register_driver is made with a pointer to the struct pci_driver. This is traditionally done in the module initialization code for the PCI driver:

static int __init pci_skel_init(void)
{
    return pci_register_driver(&pci_driver);
}

Note that the pci_register_driver function either returns a negative error number or 0 if everything was registered successfully. It does not return the number of devices that were bound to the driver or an error number if no devices were bound to the driver. This is a change from kernels prior to the 2.6 release and was done because of the following situations:

• On systems that support PCI hotplug, or CardBus systems, a PCI device can appear or disappear at any point in time. It is helpful if drivers can be loaded before the device appears, to reduce the time it takes to initialize a device.

• The 2.6 kernel allows new PCI IDs to be dynamically allocated to a driver after it has been loaded. This is done through the file new_id that is created in all PCI driver directories in sysfs. This is very useful if a new device is being used that the kernel doesn't know about just yet. A user can write the PCI ID values to the new_id file, and then the driver binds to the new device. If a driver was not allowed to load until a device was present in the system, this interface would not be able to work.

When the PCI driver is to be unloaded, the struct pci_driver needs to be unregistered from the kernel. This is done with a call to pci_unregister_driver. When this call happens, any PCI devices that were currently bound to this driver are removed, and the remove function for this PCI driver is called before the pci_unregister_driver function returns.

static void __exit pci_skel_exit(void)
{
    pci_unregister_driver(&pci_driver);
}

12.1.6. Old-Style PCI Probing

In older kernel versions, the function pci_register_driver was not always used by PCI drivers. Instead, they would either walk the list of PCI devices in the system by hand, or they would call a function that could search for a specific PCI device. The ability to walk the list of PCI devices in the system within a driver has been removed from the 2.6 kernel in order to prevent drivers from crashing the kernel if they happened to modify the PCI device lists while a device was being removed at the same time.

If the ability to find a specific PCI device is really needed, the following functions are available:

struct pci_dev *pci_get_device(unsigned int vendor, unsigned int device,
                               struct pci_dev *from);

This function scans the list of PCI devices currently present in the system, and if the input arguments match the specified vendor and device IDs, it increments the reference count on the struct pci_dev variable found, and returns it to the caller. This prevents the structure from disappearing without any notice and ensures that the kernel does not oops. After the driver is done with the struct pci_dev returned by the function, it must call the function pci_dev_put to decrement the usage count again, which allows the kernel to clean up the device if it is removed.

The from argument is used to get hold of multiple devices with the same signature; the argument should point to the last device that has been found, so that the search can continue instead of restarting from the head of the list. To find the first device, from is specified as NULL. If no (further) device is found, NULL is returned.

An example of how to use this function properly is:

struct pci_dev *dev;
dev = pci_get_device(PCI_VENDOR_FOO, PCI_DEVICE_FOO, NULL);
if (dev) {
    /* Use the PCI device */
    ...
    pci_dev_put(dev);
}

This function cannot be called from interrupt context. If it is, a warning is printed out to the system log.

struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
                               unsigned int ss_vendor, unsigned int ss_device,
                               struct pci_dev *from);

This function works just like pci_get_device, but it allows the subsystem vendor and subsystem device IDs to be specified when looking for the device.

This function cannot be called from interrupt context. If it is, a warning is printed out to the system log.

struct pci_dev *pci_get_slot(struct pci_bus *bus, unsigned int devfn);

This function searches the list of PCI devices in the system on the specified struct pci_bus for the specified device and function number of the PCI device. If a device is found that matches, its reference count is incremented and a pointer to it is returned. When the caller is finished accessing the struct pci_dev, it must call pci_dev_put.

None of these functions can be called from interrupt context. If they are, a warning is printed out to the system log.
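The role of the from argument described for pci_get_device is easiest to see in a loop. The following user-space mock models only the search semantics: it starts just after from, or at the head of the list when from is NULL. Everything in it is hypothetical (mock_dev, mock_get_device, mock_count_devices), and it deliberately ignores reference counting, which the real functions perform.

```c
#include <stddef.h>

struct mock_dev {               /* hypothetical stand-in for struct pci_dev */
    unsigned int vendor, device;
};

/*
 * Mock of the search semantics only: scan devs[0..n-1], starting just after
 * from, or at the beginning when from is NULL. No reference counting here.
 */
static struct mock_dev *mock_get_device(struct mock_dev *devs, size_t n,
                                        unsigned int vendor,
                                        unsigned int device,
                                        struct mock_dev *from)
{
    size_t i = from ? (size_t)(from - devs) + 1 : 0;

    for (; i < n; i++)
        if (devs[i].vendor == vendor && devs[i].device == device)
            return &devs[i];
    return NULL;
}

/* Count every device with a given signature by chaining calls through from. */
static unsigned int mock_count_devices(struct mock_dev *devs, size_t n,
                                       unsigned int vendor,
                                       unsigned int device)
{
    struct mock_dev *dev = NULL;
    unsigned int count = 0;

    while ((dev = mock_get_device(devs, n, vendor, device, dev)) != NULL)
        count++;
    return count;
}
```

The loop in mock_count_devices passes the previous result back in as from, so each call resumes where the last one stopped and the loop ends when NULL is returned.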

12.1.7. Enabling the PCI Device

In the probe function for the PCI driver, before the driver can access any device resource (I/O region or interrupt) of the PCI device, the driver must call the pci_enable_device function:

int pci_enable_device(struct pci_dev *dev);

This function actually enables the device. It wakes up the device and in some cases also assigns its interrupt line and I/O regions. This happens, for example, with CardBus devices (which have been made completely equivalent to PCI at the driver level).

12.1.8. Accessing the Configuration Space

After the driver has detected the device, it usually needs to read from or write to the three address spaces: memory, port, and configuration. In particular, accessing the configuration space is vital to the driver, because it is the only way it can find out where the device is mapped in memory and in the I/O space.

Because the microprocessor has no way to access the configuration space directly, the computer vendor has to provide a way to do it. To access configuration space, the CPU must write and read registers in the PCI controller, but the exact implementation is vendor dependent and not relevant to this discussion, because Linux offers a standard interface to access the configuration space.

As far as the driver is concerned, the configuration space can be accessed through 8-bit, 16-bit, or 32-bit data transfers. The relevant functions are prototyped in <linux/pci.h>:

int pci_read_config_byte(struct pci_dev *dev, int where, u8 *val);

int pci_read_config_word(struct pci_dev *dev, int where, u16 *val);

int pci_read_config_dword(struct pci_dev *dev, int where, u32 *val);

Read one, two, or four bytes from the configuration space of the device identified by dev. The where argument is the byte offset from the beginning of the configuration space. The value fetched from the configuration space is returned through the val pointer, and the return value of the functions is an error code. The word and dword functions convert the value just read from little-endian to the native byte order of the processor, so you need not deal with byte ordering.

int pci_write_config_byte(struct pci_dev *dev, int where, u8 val);

int pci_write_config_word(struct pci_dev *dev, int where, u16 val);

int pci_write_config_dword(struct pci_dev *dev, int where, u32 val);

Write one, two, or four bytes to the configuration space. The device is identified by dev as usual, and the value being written is passed as val. The word and dword functions convert the value to little-endian before writing to the peripheral device.

All of the previous functions are implemented as inline functions that really call the following functions. Feel free to use these functions instead of the above in case the driver does not have access to a struct pci_dev at any particular moment in time:

int pci_bus_read_config_byte(struct pci_bus *bus, unsigned int devfn,
                             int where, u8 *val);

int pci_bus_read_config_word(struct pci_bus *bus, unsigned int devfn,
                             int where, u16 *val);

int pci_bus_read_config_dword(struct pci_bus *bus, unsigned int devfn,
                              int where, u32 *val);

Just like the pci_read_ functions, but struct pci_bus * and devfn variables are needed instead of a struct pci_dev *.

int pci_bus_write_config_byte(struct pci_bus *bus, unsigned int devfn,
                              int where, u8 val);

int pci_bus_write_config_word(struct pci_bus *bus, unsigned int devfn,
                              int where, u16 val);

int pci_bus_write_config_dword(struct pci_bus *bus, unsigned int devfn,
                               int where, u32 val);

Just like the pci_write_ functions, but struct pci_bus * and devfn variables are needed instead of a struct pci_dev *.

The best way to address the configuration variables using the pci_read_ functions is by means of the symbolic names defined in <linux/pci.h>. For example, the following small function retrieves the revision ID of a device by passing the symbolic name for where to pci_read_config_byte:

static unsigned char skel_get_revision(struct pci_dev *dev)
{
    u8 revision;
    pci_read_config_byte(dev, PCI_REVISION_ID, &revision);
    return revision;
}

12.1.9. Accessing the I/O and Memory Spaces

A PCI device implements up to six I/O address regions. Each region consists of either memory or I/O locations. Most devices implement their I/O registers in memory regions, because it's generally a saner approach. However, unlike normal memory, I/O registers should not be cached by the CPU because each access can have side effects. The PCI device that implements I/O registers as a memory region marks the difference by setting a "memory-is-prefetchable" bit in its configuration register.[4] If the memory region is marked as prefetchable, the CPU can cache its contents and do all sorts of optimization with it; nonprefetchable memory access, on the other hand, can't be optimized because each access can have side effects, just as with I/O ports. Peripherals that map their control registers to a memory address range declare that range as nonprefetchable, whereas something like video memory on PCI boards is prefetchable. In this section, we use the word region to refer to a generic I/O address space that is memory-mapped or port-mapped.

[4] The information lives in one of the low-order bits of the base address PCI registers. The bits are defined in <linux/pci.h>.
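As an illustration of those low-order bits, the following user-space sketch decodes a raw 32-bit base address register value. The bit definitions reproduce the ones named in <linux/pci.h>; the bar_ helper functions are hypothetical and exist only for this example. A driver normally has no need for such decoding, since the kernel has already done it.

```c
#include <stdint.h>

/* Low-order bits of a base address register. */
#define PCI_BASE_ADDRESS_SPACE          0x01  /* 0 = memory, 1 = I/O */
#define PCI_BASE_ADDRESS_MEM_TYPE_MASK  0x06
#define PCI_BASE_ADDRESS_MEM_TYPE_64    0x04  /* 64-bit memory region */
#define PCI_BASE_ADDRESS_MEM_PREFETCH   0x08  /* prefetchable memory */

/* Nonzero if the register describes an I/O port region. */
static int bar_is_io(uint32_t bar)
{
    return (bar & PCI_BASE_ADDRESS_SPACE) != 0;
}

/* Nonzero if the register describes a prefetchable memory region. */
static int bar_is_prefetchable(uint32_t bar)
{
    return !bar_is_io(bar) && (bar & PCI_BASE_ADDRESS_MEM_PREFETCH) != 0;
}

/* Nonzero if the register is the low half of a 64-bit memory region. */
static int bar_is_64bit(uint32_t bar)
{
    return !bar_is_io(bar) &&
           (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64;
}

/* Strip the flag bits: two for I/O regions, four for memory regions. */
static uint32_t bar_address(uint32_t bar)
{
    return bar_is_io(bar) ? (bar & ~0x03u) : (bar & ~0x0fu);
}
```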

An interface board reports the size and current location of its regions using configuration registers: the six 32-bit registers shown in Figure 12-2, whose symbolic names are PCI_BASE_ADDRESS_0 through PCI_BASE_ADDRESS_5. Since the I/O space defined by PCI is a 32-bit address space, it makes sense to use the same configuration interface for memory and I/O. If the device uses a 64-bit address bus, it can declare regions in the 64-bit memory space by using two consecutive PCI_BASE_ADDRESS registers for each region, low bits first. It is possible for one device to offer both 32-bit regions and 64-bit regions.
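The "low bits first" arrangement for 64-bit regions can be sketched as follows. This is a user-space illustration with a hypothetical helper name; it assumes the lower-numbered register carries the four flag bits in its low-order positions and the next register holds address bits 63 through 32.

```c
#include <stdint.h>

/* Combine two consecutive base address registers into one 64-bit address.
 * 'low' is the lower-numbered register (address bits 31..4 plus four flag
 * bits); 'high' is the following register (address bits 63..32). */
static uint64_t bar64_address(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | (low & ~0x0fu);
}
```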

In the kernel, the I/O regions of PCI devices have been integrated into the generic resource management. For this reason, you don't need to access the configuration variables in order to know where your device is mapped in memory or I/O space. The preferred interface for getting region information consists of the following functions:

unsigned long pci_resource_start(struct pci_dev *dev, int bar);

The function returns the first address (memory address or I/O port number) associated with one of the six PCI I/O regions. The region is selected by the integer bar (the base address register), ranging from 0 to 5 (inclusive).

unsigned long pci_resource_end(struct pci_dev *dev, int bar);

The function returns the last address that is part of the I/O region number bar. Note that this is the last usable address, not the first address after the region.

unsigned long pci_resource_flags(struct pci_dev *dev, int bar);

This function returns the flags associated with this resource.

Resource flags are used to define some features of the individual resource. For PCI resources associated with PCI I/O regions, the information is extracted from the base address registers, but can come from elsewhere for resources not associated with PCI devices.

All resource flags are defined in <linux/ioport.h>; the most important are:

IORESOURCE_IO

IORESOURCE_MEM

If the associated I/O region exists, one and only one of these flags is set.

IORESOURCE_PREFETCH

IORESOURCE_READONLY

These flags tell whether a memory region is prefetchable and/or write protected. The latter flag is never set for PCI resources.
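A driver typically tests these flags to decide how to treat a region. The following user-space sketch shows one way to do that. The MOCK_ constants are placeholders whose numeric values are assumptions made for this example; real code uses the IORESOURCE_ definitions from <linux/ioport.h> on the value returned by pci_resource_flags. The region_kind type and classify_region function are likewise hypothetical.

```c
/* Placeholder flag values for this mock only; not the kernel's definitions. */
#define MOCK_IORESOURCE_IO        0x0100UL
#define MOCK_IORESOURCE_MEM       0x0200UL
#define MOCK_IORESOURCE_PREFETCH  0x1000UL

enum region_kind {
    REGION_NONE,          /* neither IO nor MEM: region does not exist */
    REGION_IO,            /* port-mapped */
    REGION_MEM,           /* memory-mapped, nonprefetchable */
    REGION_MEM_PREFETCH   /* memory-mapped, prefetchable */
};

/* Classify a region from its resource flags. */
static enum region_kind classify_region(unsigned long flags)
{
    if (flags & MOCK_IORESOURCE_IO)
        return REGION_IO;
    if (flags & MOCK_IORESOURCE_MEM)
        return (flags & MOCK_IORESOURCE_PREFETCH) ? REGION_MEM_PREFETCH
                                                  : REGION_MEM;
    return REGION_NONE;
}
```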

By making use of the pci_resource_ functions, a device driver can completely ignore the underlying PCI registers, since the system already used them to structure resource information.
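Because the end address is the last usable address rather than the first one past the region, the length of a region is end - start + 1. A minimal user-space sketch of that arithmetic follows; struct mock_region is a hypothetical holder for the values a driver would obtain from pci_resource_start and pci_resource_end, and treating start == end == 0 as "region not implemented" is an assumption of this example.

```c
/* Hypothetical holder for the start and end addresses of one region. */
struct mock_region {
    unsigned long start;  /* first address of the region */
    unsigned long end;    /* last usable address of the region */
};

/* Length of the region in bytes (or ports); 0 if the region is absent. */
static unsigned long region_len(const struct mock_region *r)
{
    if (r->start == 0 && r->end == 0)
        return 0;
    return r->end - r->start + 1;
}
```

A region that spans 0xe000 through 0xe0ff therefore has a length of 256, not 255.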

12.1.10. PCI Interrupts

As far as interrupts are concerned, PCI is easy to handle. By the time Linux boots, the computer's firmware has already assigned a unique interrupt number to the device, and the driver just needs to use it. The interrupt number is stored in configuration register 60 (PCI_INTERRUPT_LINE), which is one byte wide. This allows for as many as 256 interrupt lines, but the actual limit depends on the CPU being used. The driver doesn't need to bother checking the interrupt number, because the value found in PCI_INTERRUPT_LINE is guaranteed to be the right one.

If the device doesn't support interrupts, register 61 (PCI_INTERRUPT_PIN) is 0; otherwise, it's nonzero. However, since the driver knows if its device is interrupt driven or not, it doesn't usually need to read PCI_INTERRUPT_PIN.

Thus, PCI-specific code for dealing with interrupts just needs to read the configuration byte to obtain the interrupt number that is saved in a local variable, as shown in the following code. Beyond that, the information in Chapter 10 applies.

result = pci_read_config_byte(dev, PCI_INTERRUPT_LINE, &myirq);
if (result) {
    /* deal with error */
}
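The relationship between the two interrupt registers can also be tried out in user space against a mock configuration space. In the sketch below, the 256-byte array and the mock_get_irq function are hypothetical; the offsets 0x3c and 0x3d correspond to registers 60 and 61 mentioned in the text.

```c
#include <stdint.h>

#define PCI_INTERRUPT_LINE  0x3c  /* register 60: assigned interrupt number */
#define PCI_INTERRUPT_PIN   0x3d  /* register 61: 0 means no interrupt pin */

/* Return the interrupt number stored in a mock configuration space, or -1
 * if the device reports that it does not use an interrupt pin. */
static int mock_get_irq(const uint8_t config[256])
{
    if (config[PCI_INTERRUPT_PIN] == 0)
        return -1;
    return config[PCI_INTERRUPT_LINE];
}
```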

The rest of this section provides additional information for the curious reader but isn't needed for writing drivers.

A PCI connector has four interrupt pins, and peripheral boards can use any or all of them. Each pin is individually routed to the motherboard's interrupt controller, so interrupts can be shared without any electrical problems. The interrupt controller is then responsible for mapping the interrupt wires (pins) to the processor's hardware; this platform-dependent operation is left to the controller in order to achieve platform independence in the bus itself.

The read-only configuration register located at PCI_INTERRUPT_PIN is used to tell the computer which single pin is actually used. It's worth remembering that each device board can host up to eight devices; each device uses a single interrupt pin and reports it in its own configuration register. Different devices on the same device board can use different interrupt pins or share the same one.
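For readers who want to interpret the value found in PCI_INTERRUPT_PIN, the sketch below maps it to a pin name. It assumes the usual encoding in which 0 means no pin and 1 through 4 select the four pins, conventionally called INTA# through INTD#; the function name is hypothetical.

```c
/* Map a PCI_INTERRUPT_PIN value to a conventional pin name.
 * 0 = device uses no interrupt pin; 1..4 = the four connector pins. */
static const char *interrupt_pin_name(unsigned char pin)
{
    static const char *const names[] = {
        "none", "INTA#", "INTB#", "INTC#", "INTD#"
    };

    return pin <= 4 ? names[pin] : "invalid";
}
```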

The PCI_INTERRUPT_LINE register, on the other hand, is read/write. When the computer is booted, the firmware scans its PCI devices and sets the register for each device according to how the interrupt pin is routed for its PCI slot. The value is assigned by the firmware, because only the firmware knows how the motherboard routes the different interrupt pins to the processor. For the device driver, however, the PCI_INTERRUPT_LINE register is read-only. Interestingly, recent versions of the Linux kernel under some circumstances can assign interrupt lines without resorting to the BIOS.

12.1.11. Hardware Abstractions

We complete the discussion of PCI by taking a quick look at how the system handles the plethora of PCI controllers available on the marketplace. This is just an informational section, meant to show the curious reader how the object-oriented layout of the kernel extends down to the lowest levels.

The mechanism used to implement hardware abstraction is the usual structure containing methods. It's a powerful technique that adds just the minimal overhead of dereferencing a pointer to the normal overhead of a function call. In the case of PCI management, the only hardware-dependent operations are the ones that read and write configuration registers, because everything else in the PCI world is accomplished by directly reading and writing the I/O and memory address spaces, and those are under direct control of the CPU.

Thus, the relevant structure for configuration register access includes only two fields:

struct pci_ops {
    int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size,
                u32 *val);
    int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size,
                 u32 val);
};

The structure is defined in <linux/pci.h> and used by drivers/pci/pci.c, where the actual public functions are defined.

The two functions that act on the PCI configuration space have more overhead than dereferencing a pointer; they use cascading pointers due to the high object-orientedness of the code, but the overhead is not an issue in operations that are performed quite rarely and never in speed-critical paths. The actual implementation of pci_read_config_byte(dev, where, val), for instance, expands to:

dev->bus->ops->read(bus, devfn, where, 1, val);

The various PCI buses in the system are detected at system boot, and that's when the struct pci_bus items are created and associated with their features, including the ops field.
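The "structure containing methods" technique can be sketched in user space as follows. Everything here is a mock: struct sketch_ops has the same shape as struct pci_ops, but struct sketch_bus, struct sketch_dev, the ram_ functions, and sketch_read_config_byte are hypothetical, and the "controller" is simply an array in memory. The sketch assumes that the size argument is a byte count (1, 2, or 4).

```c
#include <stdint.h>

struct sketch_bus;

/* Same shape as struct pci_ops, using mock types. */
struct sketch_ops {
    int (*read)(struct sketch_bus *bus, unsigned int devfn, int where,
                int size, uint32_t *val);
    int (*write)(struct sketch_bus *bus, unsigned int devfn, int where,
                 int size, uint32_t val);
};

struct sketch_bus {
    const struct sketch_ops *ops;  /* set when the bus is "detected" */
    uint8_t config[256][256];      /* one 256-byte config space per devfn */
};

struct sketch_dev {
    struct sketch_bus *bus;
    unsigned int devfn;
};

/* Reject accesses outside the mock configuration space. */
static int ram_check(unsigned int devfn, int where, int size)
{
    return devfn > 255 || where < 0 || size < 1 || size > 4 ||
           where + size > 256;
}

/* One possible "controller": read size bytes, little-endian. */
static int ram_read(struct sketch_bus *bus, unsigned int devfn, int where,
                    int size, uint32_t *val)
{
    uint32_t v = 0;
    int i;

    if (ram_check(devfn, where, size))
        return -1;
    for (i = size - 1; i >= 0; i--)
        v = (v << 8) | bus->config[devfn][where + i];
    *val = v;
    return 0;
}

/* Matching write method: store size bytes, least significant first. */
static int ram_write(struct sketch_bus *bus, unsigned int devfn, int where,
                     int size, uint32_t val)
{
    int i;

    if (ram_check(devfn, where, size))
        return -1;
    for (i = 0; i < size; i++)
        bus->config[devfn][where + i] = (uint8_t)(val >> (8 * i));
    return 0;
}

static const struct sketch_ops ram_ops = { ram_read, ram_write };

/* Public-style wrapper: all hardware dependence hides behind bus->ops. */
static int sketch_read_config_byte(struct sketch_dev *dev, int where,
                                   uint8_t *val)
{
    uint32_t data = 0;
    int res = dev->bus->ops->read(dev->bus, dev->devfn, where, 1, &data);

    *val = (uint8_t)data;
    return res;
}
```

Swapping ram_ops for a different pair of methods changes how configuration space is reached without touching sketch_read_config_byte, which is the point of the abstraction.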

Implementing hardware abstraction via "hardware operations" data structures is typical in the Linux kernel. One important example is the struct alpha_machine_vector data structure. It is defined in <asm-alpha/machvec.h> and takes care of everything that may change across different Alpha-based computers.