Chapter 43
Linux Driver Set: liboskit_linux_dev.a

The Linux device driver library consists of various native Linux device drivers coupled with glue code to export OSKit interfaces such as blkio, netio, and bufio (see Chapter 7). See the source file linux/dev/README for a list of devices and their status.

The header files oskit/dev/linux_ethernet.h and oskit/dev/linux_scsi.h determine which network and SCSI drivers are compiled into liboskit_linux_dev.a. Those files also influence driver probing; see the oskit_linux_init routines below.

43.1 Initialization and Registration

There are several ways to initialize this library. One can initialize all the compiled-in drivers (oskit_linux_init_devs), initialize a specific class of drivers (e.g., oskit_linux_init_ethernet), or initialize specific drivers (e.g., oskit_linux_init_scsi_ncr53c8xx).

These initialization functions initialize various glue code and register the appropriate device(s) in the device tree, to be probed with oskit_dev_probe.

43.1.1 oskit_linux_init_devs: Initialize and register all known drivers

SYNOPSIS

#include <oskit/dev/linux.h>

void oskit_linux_init_devs(void);

DESCRIPTION

This function initializes and registers all known drivers. The known drivers are: the IDE disk driver, and all drivers listed in the <oskit/dev/linux_ethernet.h> and <oskit/dev/linux_scsi.h> files.

Once drivers are registered, their devices may be probed via oskit_dev_probe.

43.1.2 oskit_linux_init_net: Initialize and register known network drivers

SYNOPSIS

#include <oskit/dev/linux.h>

void oskit_linux_init_net(void);

DESCRIPTION

This function initializes and registers all available network drivers. Currently this means Ethernet drivers only, but in the future there may be other network drivers supported such as Myrinet. The known Ethernet drivers are listed in the <oskit/dev/linux_ethernet.h> file.

Once drivers are registered, their devices may be probed via oskit_dev_probe.

43.1.3 oskit_linux_init_ethernet: Initialize and register known Ethernet network drivers

SYNOPSIS

#include <oskit/dev/linux.h>

void oskit_linux_init_ethernet(void);

DESCRIPTION

This function initializes and registers all available Ethernet network drivers. The known Ethernet drivers are listed in the <oskit/dev/linux_ethernet.h> file.

Once drivers are registered, their devices may be probed via oskit_dev_probe.

43.1.4 oskit_linux_init_scsi: Initialize and register known SCSI disk drivers

SYNOPSIS

#include <oskit/dev/linux.h>

void oskit_linux_init_scsi(void);

DESCRIPTION

This function initializes and registers all available SCSI disk drivers. The known SCSI drivers are listed in the <oskit/dev/linux_scsi.h> file.

Once drivers are registered, their devices may be probed via oskit_dev_probe.

43.1.5 oskit_linux_init_ide: Initialize and register known IDE disk drivers

SYNOPSIS

#include <oskit/dev/linux.h>

void oskit_linux_init_ide(void);

DESCRIPTION

This function initializes and registers all available IDE disk drivers. There is currently only one IDE driver.

Once drivers are registered, their devices may be probed via oskit_dev_probe.

43.1.6 oskit_linux_init_scsi_name: Initialize and register a specific SCSI disk driver

SYNOPSIS

#include <oskit/dev/linux.h>

void oskit_linux_init_scsi_name(void);

DESCRIPTION

This function initializes and registers a specific SCSI disk driver. The name part of the function name must be one from the name field of the drivers listed in the <oskit/dev/linux_scsi.h> file; for example, oskit_linux_init_scsi_ncr53c8xx.

Once drivers are registered, their devices may be probed via oskit_dev_probe.

43.1.7 oskit_linux_init_ethernet_name: Initialize and register a specific Ethernet network driver

SYNOPSIS

#include <oskit/dev/linux.h>

void oskit_linux_init_ethernet_name(void);

DESCRIPTION

This function initializes and registers a specific Ethernet network driver. The name part of the function name must be one from the name field of the drivers listed in the <oskit/dev/linux_ethernet.h> file.

Once drivers are registered, their devices may be probed via oskit_dev_probe.

43.2 Obtaining object references

Once the desired drivers are initialized, registered, and probed, one can obtain references to their blkio, netio, etc. interfaces (see Chapter 7) in two different ways.

The first way is to look them up via their Linux name, e.g., “sd0” for a SCSI disk, or “eth0” for an Ethernet device. This is described here as it is specific to Linux.

The second, and preferred, way is to use osenv_device_lookup to find a detected device with the desired interface, such as oskit_etherdev_iid (see Chapter 17).

43.2.1 oskit_linux_block_open: Open a disk given its Linux name

SYNOPSIS

#include <oskit/dev/linux.h>

oskit_error_t oskit_linux_block_open(const char *name, unsigned flags, [out] oskit_blkio_t **out_io);

DESCRIPTION

This function takes the Linux name of a disk, e.g., “sd0” or “wd1”, and returns an oskit_blkio_t that can be used to access the device.

The oskit_blkio interface is described in Chapter 7.

PARAMETERS

name:
The Linux name of the device, e.g., “sd0” or “wd1”.
flags:
Formed by or’ing the following values:
OSKIT_DEV_OPEN_READ
OSKIT_DEV_OPEN_WRITE
OSKIT_DEV_OPEN_ALL
out_io:
Upon success, is set to point to an oskit_blkio_t that can be used to access the device.

RETURNS

Returns 0 on success, or an error code specified in <oskit/error.h>, on error.

43.2.2 oskit_linux_block_open_kdev: Open a disk given its Linux kdev

43.2.3 oskit_linux_netdev_find: Open a netcard given its Linux name

43.2.4 oskit_linux_net_open: Open a netcard given its Linux name

The rest of this chapter is very incomplete. Some of the internal details of the Linux driver emulation are described, but not the aspects relevant for typical use of the library.

43.3 Introduction

XXX

Much of the data here on Linux device driver internals is out-of-date with respect to the newer device drivers that are now part of the OSKit. This section documents drivers from Linux 1.3.6.8 or earlier; the current OSKit drivers are from Linux 2.2.12, so parts of this section are likely no longer correct.

XXX Library can be used either as one component or can be used to produce many separate components, depending on how it is used.

43.4 Partially-compliant Drivers

There are a number of assumptions made by some drivers: if a given assumption is not met by the OS using the framework, then the drivers that make the assumption will not work, but other drivers may still be usable. The specific assumptions made by each partially-compliant driver are listed in a table in the appropriate section below; here is a summary of the assumptions some of the drivers make:

43.5 Internals

The following sections document all the variables and functions that Linux drivers can refer to. These variables and functions are provided by the glue code supplied as part of the library, so this information should not be needed for normal use of the library under the device driver framework. However, they are documented here for the benefit of those working on this library or upgrading it to new versions of the Linux drivers, or for those who wish to “short-cut” through the framework directly to the Linux device drivers in some situations, e.g., for performance reasons.

43.5.1 Namespace Management Rules

For an outline of our namespace management conventions, see Section 4.7.2 in our SOSP paper, http://www.cs.utah.edu/flux/papers/index.html#SOSKIT.

43.5.2 Variables

current:

This is a global variable that points to the state for the current process. It is mostly used by drivers to set or clear the interruptible state of the process.

jiffies:

Many Linux device drivers depend on a global variable called jiffies, which in Linux contains a clock tick counter that is incremented by one at each 10-millisecond (100Hz) clock tick. The device drivers typically read this counter while polling a device during a (hopefully short) interrupt-enabled busy-wait loop. Although a few drivers take the global clock frequency symbol HZ into account when determining timeout values and such, most of the drivers just use hard-coded values when using the jiffies counter for timeouts, and therefore assume that jiffies increments “about” 100 times per second.
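To make the difference between hard-coded and HZ-aware timeouts concrete, here is a minimal sketch. The helper names ms_to_ticks and ticks_elapsed are hypothetical; they are not part of Linux or of the OSKit glue code. The sketch converts a wall-clock interval into ticks for a given tick rate and tests for expiry in a way that tolerates wrap-around of the counter.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: convert milliseconds to clock ticks for a given
 * tick rate (hz), rounding up so that a short wait never becomes zero ticks.
 */
unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999) / 1000;
}

/*
 * Hypothetical helper: has at least `timeout` ticks passed between the
 * counter values `start` and `now`?  Unsigned subtraction keeps the
 * answer correct even if the counter wrapped around in between.
 */
bool ticks_elapsed(unsigned long start, unsigned long now,
                   unsigned long timeout)
{
    return (now - start) >= timeout;
}
```

A driver that hard-codes a timeout of 50 ticks gets half a second only when the counter really advances 100 times per second. Computing the count as ms_to_ticks(500, hz) keeps the interval at half a second for other tick rates as well.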

irq2dev_map:

This variable is an array of pointers to network device structures. The array is indexed by the interrupt request line (IRQ) number. Linux network drivers use it in interrupt handlers to find the interrupting network device given the IRQ number passed to them by the kernel.
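The lookup pattern described above can be modeled in a few lines of ordinary C. This is only an illustrative sketch: the model_ names, the stand-in device structure, and the table size of 16 are assumptions made for the example, not the declarations used by the Linux drivers or the glue code.

```c
#include <stddef.h>

#define MODEL_NR_IRQS 16   /* assumed table size for the example */

/* Stand-in for the Linux network device structure. */
struct model_netdev {
    const char *name;
};

/* One slot per IRQ line; NULL means no network device owns that line. */
static struct model_netdev *model_irq2dev_map[MODEL_NR_IRQS];

/*
 * What an interrupt handler does first: map the IRQ number it was
 * handed back to the device that raised the interrupt.
 */
struct model_netdev *model_dev_for_irq(int irq)
{
    if (irq < 0 || irq >= MODEL_NR_IRQS)
        return NULL;
    return model_irq2dev_map[irq];
}
```

A handler that gets NULL back from such a lookup has been called for an IRQ that no registered device claimed, and would normally just return.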

blk_dev:

This variable is an array of “struct blk_dev_struct” structures. It is indexed by the major device number. Each element contains the I/O request queue and a pointer to the I/O request function in the driver. The kernel queues I/O requests on the request queue, and calls the request function to process the queue.

blk_size:

This variable is an array of pointers to integers. It is indexed by the major device number. The subarray is indexed by the minor device number. Each cell of the subarray contains the size of the device in 1024 byte units. The subarray pointer can be NULL, in which case, the kernel does not check the size and range of an I/O request for the device.

blksize_size:

This variable is an array of pointers to integers. It is indexed by the major device number. The subarray is indexed by the minor device number. Each cell of the subarray contains the block size of the device in bytes. The subarray can be NULL, in which case, the kernel uses the global definition BLOCK_SIZE (currently 1024 bytes) in its calculations.

hardsect_size:

This variable is an array of pointers to integers. It is indexed by the major device number. The subarray is indexed by the minor device number. Each cell of the subarray contains the hardware sector size of the device in bytes. If the subarray is NULL, the kernel uses 512 bytes in its calculations.
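The two-level indexing shared by blk_size, blksize_size, and hardsect_size, together with the NULL-subarray defaults, can be sketched as follows. The model_ names and the table size are hypothetical and exist only for this example; the default values are the ones stated in the text.

```c
#include <stddef.h>

#define MODEL_MAX_MAJOR   64    /* assumed table size for the example */
#define MODEL_BLOCK_SIZE  1024  /* default block size named in the text */
#define MODEL_SECTOR_SIZE 512   /* default hardware sector size in the text */

/* First level: indexed by major number.  Second level: by minor number. */
int *model_blksize_size[MODEL_MAX_MAJOR];
int *model_hardsect_size[MODEL_MAX_MAJOR];

/* Block size in bytes; falls back to the default if no subarray is set.
 * The caller must pass major < MODEL_MAX_MAJOR and a valid minor. */
int model_block_size(unsigned major, unsigned minor)
{
    int *sub = model_blksize_size[major];
    return sub ? sub[minor] : MODEL_BLOCK_SIZE;
}

/* Hardware sector size in bytes, with the same NULL-subarray rule. */
int model_sector_size(unsigned major, unsigned minor)
{
    int *sub = model_hardsect_size[major];
    return sub ? sub[minor] : MODEL_SECTOR_SIZE;
}
```

A driver that leaves its subarray pointer NULL therefore implicitly accepts the defaults for every minor device under its major number.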

read_ahead:

This variable is an array of integers indexed by the major device number. It specifies how many sectors of read-ahead the kernel should perform on the device. The drivers only initialize the values in this array; the Linux kernel block buffer code is the actual user of these values.

wait_for_request:

The Linux kernel uses a static array of I/O request structures. When all I/O request structures are in use, a process sleeps on this variable. When a driver finishes an I/O request and frees the I/O request structure, it performs a wake up on this variable.

EISA_bus:

If this variable is non-zero, it indicates that the machine has an EISA bus. It is initialized by the Linux kernel prior to device configuration.

high_memory:

This variable contains the address of the last byte of physical memory plus one. It is initialized by the Linux kernel prior to device configuration.

intr_count:

This variable gets incremented on entry to an interrupt handler, and decremented on exit. Its purpose is to let driver code determine if it was called from an interrupt handler.

kstat:

This variable contains Linux kernel statistics counters. Linux drivers increment various fields in it when certain events occur.

tq_timer:

Linux has a notion of “bottom half” handlers. These handlers have a higher priority than any user level process but lower priority than hardware interrupts. They are analogous to software interrupts in BSD. Linux checks if any “bottom half” handlers need to be run when it is returning to user mode. Linux provides a number of lists of such handlers that are scheduled on the occurrence of specific events. tq_timer is one such list. On every clock interrupt, Linux checks if any handlers are on this list, and if there are, immediately schedules the handlers to run.
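A list of deferred handlers of this kind can be modeled as a singly linked queue that is emptied each time it is run. The sketch below is a simplified illustration under that assumption; the model_ names and the structure layout are hypothetical and do not reproduce the Linux task queue declarations.

```c
#include <stddef.h>

/* One deferred handler waiting on a list such as tq_timer. */
struct model_task {
    struct model_task *next;
    int queued;                 /* nonzero while the task is on a queue */
    void (*routine)(void *);    /* handler to run later */
    void *data;                 /* argument passed to the handler */
};

/* Put a task on the queue unless it is already queued.
 * Returns 1 if the task was added, 0 if it was already pending. */
int model_queue_task(struct model_task *t, struct model_task **q)
{
    if (t->queued)
        return 0;
    t->queued = 1;
    t->next = *q;
    *q = t;
    return 1;
}

/* Detach the whole list, then run every handler that was on it.
 * Returns the number of handlers that ran. */
int model_run_task_queue(struct model_task **q)
{
    struct model_task *t = *q;
    int ran = 0;

    *q = NULL;
    while (t != NULL) {
        struct model_task *next = t->next;
        t->queued = 0;          /* the handler may queue itself again */
        t->routine(t->data);
        t = next;
        ran++;
    }
    return ran;
}

/* Example handler: adds one to the int that data points at. */
void model_bump(void *data)
{
    ++*(int *)data;
}
```

In this model, a clock interrupt would call model_run_task_queue on the timer list. Detaching the list before running it means a handler can safely put itself back on the queue for the next tick.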

timer_active:

This integer variable indicates which of the timers in timer_table (described below) are active. A bit is set if the timer is active, otherwise it is clear.

timer_table:

This variable is an array of “struct timer_struct” elements. The array is indexed by global constants defined in <linux/timer.h>. Each element contains the duration of the timeout and a pointer to a function that will be invoked when the timer expires.
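The relationship between the timer_active bit mask and the timer_table array can be illustrated with a small model. The model_ names, the table size of 32, and the choice to store the expiry as an absolute tick count are assumptions made for this sketch; consult the Linux sources for the real declarations.

```c
#define MODEL_NTIMERS 32   /* assumed: one table slot per bit of the mask */

struct model_timer {
    unsigned long expires;  /* assumed: absolute tick at which the timer is due */
    void (*fn)(void);       /* handler called on expiry; may be NULL here */
};

struct model_timer model_timer_table[MODEL_NTIMERS];
unsigned long model_timer_active;   /* bit i set: model_timer_table[i] is active */

/*
 * Run every active timer that is due at tick `now`.  A timer is
 * deactivated (its bit cleared) before its handler is called.
 * Returns the number of timers that expired.
 */
int model_run_timers(unsigned long now)
{
    int fired = 0;

    for (int i = 0; i < MODEL_NTIMERS; i++) {
        unsigned long mask = 1UL << i;

        if (!(model_timer_active & mask))
            continue;                       /* slot not active */
        if (now < model_timer_table[i].expires)
            continue;                       /* active but not yet due */
        model_timer_active &= ~mask;
        if (model_timer_table[i].fn)
            model_timer_table[i].fn();
        fired++;
    }
    return fired;
}
```

To arm slot i in this model, a driver fills in model_timer_table[i] and then sets bit i in model_timer_active. Clearing the bit cancels the timer without touching the table entry.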

system_utsname:

This variable holds the Linux version number. Some drivers check the kernel version to account for feature differences between different kernel releases.

43.5.3 Functions

autoirq_setup:
int autoirq_setup(int waittime);

This function is called by a driver to set up for probing IRQs. The function attaches a handler on each available IRQ, waits for waittime ticks, and returns a bit mask of available IRQs. The driver should then force the device to generate an interrupt.

autoirq_report:
int autoirq_report(int waittime);

This function is called by a driver after it has programmed the device to generate an interrupt. The function waits waittime ticks, and returns the IRQ number on which the device interrupted. If no interrupt occurred, 0 is returned.

register_blkdev:
int register_blkdev(unsigned major, const char *name, struct file_operations *fops);

This function registers a driver for the major number major. When an access is made to a device with the specified major number, the kernel accesses the driver through the operations vector fops. The function returns 0 on success, non-zero otherwise.

unregister_blkdev:
int unregister_blkdev(unsigned major, const char *name);

This function removes the association between a driver and the major number major, previously established by register_blkdev. The function returns 0 on success, non-zero otherwise.

getblk:
struct buffer_head *getblk(kdev_t dev, int block, int size);

This function is called by a driver to allocate a buffer size bytes in length and associate it with device dev, and block number block.

brelse:
void brelse(struct buffer_head *bh);

This function frees the buffer bh, previously allocated by getblk.

bread:
struct buffer_head *bread(kdev_t dev, int block, int size);

This function allocates a buffer size bytes in length, and fills it with data from device dev, starting at block number block.

block_write:
int block_write(struct inode *inode, struct file *file, const char *buf, int count);

This function is the default implementation of file write. It is used by most of the Linux block drivers. The function writes count bytes of data to the device specified by i_rdev field of inode, starting at byte offset specified by f_pos of file, from the buffer buf. The function returns 0 for success, non-zero otherwise.

block_read:
int block_read(struct inode *inode, struct file *file, const char *buf, int count);

This function is the default implementation of file read. It is used by most of the Linux block drivers. The function reads count bytes of data from the device specified by i_rdev field of inode, starting at byte offset specified by f_pos field of file, into the buffer buf. The function returns 0 for success, non-zero otherwise.

check_disk_change:
int check_disk_change(kdev_t dev);

This function checks if media has been removed or changed in a removable medium device specified by dev. It does so by invoking the check_media_change function in the driver’s file operations vector. If a change has occurred, it calls the driver’s revalidate function to validate the new media. The function returns 0 if no medium change has occurred, non-zero otherwise.

request_dma:
int request_dma(unsigned drq, const char *name);

This function allocates the DMA request line drq for the calling driver. It returns 0 on success, non-zero otherwise.

free_dma:
void free_dma(unsigned drq);

This function frees the DMA request line drq previously allocated by request_dma.

disable_irq:
void disable_irq(unsigned irq);

This function masks the interrupt request line irq at the interrupt controller.

enable_irq:
void enable_irq(unsigned irq);

This function unmasks the interrupt request line irq at the interrupt controller.

request_irq:
int request_irq(unsigned int irq, void (*handler)(int, struct pt_regs *), unsigned long flags, const char *device);

This function allocates the interrupt request line irq and attaches the interrupt handler handler to it. It returns 0 on success, non-zero otherwise.

free_irq:
void free_irq(unsigned int irq);

This function frees the interrupt request line irq, previously allocated by request_irq.

kmalloc:
void *kmalloc(unsigned int size, int priority);

This function allocates size bytes of memory. The priority argument is a set of bitfields defined as follows:

GFP_BUFFER:
Not used by the drivers.
GFP_ATOMIC:
Caller cannot sleep.
GFP_USER:
Not used by the drivers.
GFP_KERNEL:
Memory must be physically contiguous.
GFP_NOBUFFER:
Not used by the drivers.
GFP_NFS:
Not used by the drivers.
GFP_DMA:
Memory must be usable by the DMA controller. This means, on the x86, it must be below 16 MB, and it must not cross a 64K boundary. This flag implies GFP_KERNEL.
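The two GFP_DMA constraints stated above (below 16 MB, and not crossing a 64K boundary) can be checked with simple address arithmetic. The function below is a hypothetical helper written for illustration; it is not part of Linux or of the OSKit glue code.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_DMA_LIMIT   (16UL * 1024 * 1024)  /* 16 MB */
#define MODEL_DMA_SEGMENT (64UL * 1024)         /* 64K */

/*
 * Would a buffer of `size` bytes at physical address `phys` satisfy the
 * x86 constraints described for GFP_DMA?
 */
bool model_dma_usable(uint32_t phys, uint32_t size)
{
    uint64_t end;

    if (size == 0)
        return false;
    end = (uint64_t)phys + size;        /* one past the last byte */
    if (end > MODEL_DMA_LIMIT)
        return false;                   /* extends beyond 16 MB */
    /* The first and last byte must lie in the same 64K segment. */
    return (phys / MODEL_DMA_SEGMENT) == ((end - 1) / MODEL_DMA_SEGMENT);
}
```

For example, a 32-byte buffer at 0xFFF0 fails the check because it straddles the boundary at 0x10000, while a full 64K buffer starting exactly at 0x10000 passes.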

kfree:
void kfree(void *p);

This function frees the memory p previously allocated by kmalloc.

vmalloc:
void *vmalloc(unsigned long size);

This function allocates size bytes of memory in kernel virtual space that need not have underlying contiguous physical memory.

check_region:
int check_region(unsigned port, unsigned size);

Check if the I/O address space region starting at port and size bytes in length, is available for use. Returns 0 if region is free, non-zero otherwise.

request_region:
void request_region(unsigned port, unsigned size, const char *name);

Allocate the I/O address space region starting at port and size bytes in length. It is the caller’s responsibility to make sure the region is free by calling check_region, prior to calling this routine.

release_region:
void release_region(unsigned port, unsigned size);

Free the I/O address space region starting at port and size bytes in length, previously allocated by request_region.
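The check, request, and release protocol for I/O regions can be modeled with a small fixed-size table. This is a sketch for illustration only: the model_ names and the table size are assumptions, and the code makes no claim about how Linux or the OSKit glue code track regions internally.

```c
#define MODEL_MAX_REGIONS 16   /* assumed table size for the example */

struct model_region {
    unsigned port;      /* first port of the region */
    unsigned size;      /* number of ports in the region */
    const char *name;   /* owner, for diagnostics */
    int used;           /* nonzero if this slot holds an allocation */
};

static struct model_region model_regions[MODEL_MAX_REGIONS];

/* Returns 0 if [port, port + size) overlaps no allocated region. */
int model_check_region(unsigned port, unsigned size)
{
    for (int i = 0; i < MODEL_MAX_REGIONS; i++) {
        struct model_region *r = &model_regions[i];
        if (r->used && port < r->port + r->size && r->port < port + size)
            return -1;
    }
    return 0;
}

/* Records the region.  As in the interface described above, there is
 * no return value, so the caller must have checked the region first. */
void model_request_region(unsigned port, unsigned size, const char *name)
{
    for (int i = 0; i < MODEL_MAX_REGIONS; i++) {
        struct model_region *r = &model_regions[i];
        if (!r->used) {
            r->port = port;
            r->size = size;
            r->name = name;
            r->used = 1;
            return;
        }
    }
    /* Table full: this model silently drops the request. */
}

/* Frees a region previously recorded with the same port and size. */
void model_release_region(unsigned port, unsigned size)
{
    for (int i = 0; i < MODEL_MAX_REGIONS; i++) {
        struct model_region *r = &model_regions[i];
        if (r->used && r->port == port && r->size == size) {
            r->used = 0;
            return;
        }
    }
}
```

Because the request routine cannot report failure, a probe routine following this protocol would call the check routine first and claim the region only when it returns 0.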

add_wait_queue:
void add_wait_queue(struct wait_queue **q, struct wait_queue *wait);

Add the wait element wait to the wait queue q.

remove_wait_queue:
void remove_wait_queue(struct wait_queue **q, struct wait_queue *wait);

Remove the wait element wait from the wait queue q.

down:
void down(struct semaphore *sem);

Perform a down operation on the semaphore sem. The caller blocks if the value of the semaphore is less than or equal to 0.

sleep_on:
void sleep_on(struct wait_queue **q, int interruptible);

Add the caller to the wait queue q, and block it. If the interruptible flag is non-zero, the caller can be woken up from its sleep by a signal.

wake_up:
void wake_up(struct wait_queue **q);

Wake up anyone waiting on the wait queue q.

wait_on_buffer:
void wait_on_buffer(struct buffer_head *bh);

Put the caller to sleep, waiting on the buffer bh. Called by drivers to wait for I/O completion on the buffer.

schedule:
void schedule(void);

Call the scheduler to pick the next task to run.

add_timer:
void add_timer(struct timer_list *timer);

Schedule a time out. The length of the time out and function to be called on timer expiry are specified in timer.

del_timer:
int del_timer(struct timer_list *timer);

Cancel the time out timer.

43.5.4 Directory Structure

The linux subdirectory in the OSKit source tree is organized as follows. The top-level linux/dev directory contains all the glue code implemented by the Flux project to squash the Linux drivers into the OSKit driver framework. linux/fs contains our glue for Linux filesystems, and linux/shared contains glue used by both components. In general, everything except the code in the linux/src directory was written by us, whereas everything under linux/src comes verbatim from Linux. Each of the subdirectories of linux/src corresponds to the identically named subdirectory in the Linux kernel source tree.

Of course, there are a few necessary deviations from this rule: a few of the Linux header and source files are slightly modified, and a few of the Linux header files (but no source files) were completely replaced. The header files that were heavily modified include:

linux/src/include/linux/sched.h:
Linux task and scheduling declarations

43.6 Block device drivers

Name  Description  V = P  jiffies  P+Y  current
cmd640.c  CMD640 IDE Chipset
floppy  Floppy drive  * * * *
ide-cd.c  IDE CDROM  * * *
ide.c  IDE Disk
rz1000.c  RZ1000 IDE Chipset
sd.c  SCSI disk  *
sr.c  SCSI CDROM
triton.c  Triton IDE Chipset  *

Table 43.1: Linux block device drivers

43.7 Network drivers

Things drivers may want to do that make emulation difficult:

Name  Description  V = P  jiffies  P+Y  current
3c501.c  3Com 3c501 ethernet  *
3c503.c  NS8390 ethernet  *
3c505.c  3Com Etherlink Plus (3C505)  *
3c507.c  3Com EtherLink16  *
3c509.c  3c509 EtherLink3 ethernet  *
3c59x.c  3Com 3c590/3c595 “Vortex”  *
ac3200.c  Ansel Comm. EISA ethernet  *
apricot.c  Apricot  * *
at1700.c  Allied Telesis AT1700  *
atp.c  Attached (pocket) ethernet  *
de4x5.c  DEC DE425/434/435/500  *
de600.c  D-link DE-600  *
de620.c  D-link DE-620  *
depca.c  DEC DEPCA & EtherWORKS  *
e2100.c  Cabletron E2100  *
eepro.c  Intel EtherExpress Pro/10  *
eexpress.c  Intel EtherExpress  *
eth16i.c  ICL EtherTeak 16i & 32  *
ewrk3.c  DEC EtherWORKS 3  *
hp-plus.c  HP PCLAN/plus  *
hp.c  HP LAN  *
hp100.c  HP10/100VG ANY LAN  *
lance.c  AMD LANCE  * *
ne.c  Novell NE2000  *
ni52.c  NI5210 (i82586 chip)  *
ni65.c  NI6510 (am7990 ‘lance’ chip)  * *
seeq8005.c  SEEQ 8005  *
sk_g16.c  Schneider & Koch G16  *
smc-ultra.c  SMC Ultra  *
tulip.c  DEC 21040  * *
wavelan.c  AT&T GIS (NCR) WaveLAN  *
wd.c  Western Digital WD80x3  *
znet.c  Zenith Z-Note  *

Table 43.2: Linux network drivers

43.8 SCSI drivers

The Linux SCSI driver set includes both the low-level SCSI host adapter drivers and the high-level SCSI drivers for generic SCSI disks, tapes, etc.

Name  Description  V = P  jiffies  P+Y  current
53c7,8xx.c  NCR 53C7x0, 53C8x0  * *
AM53C974.c  AM53/79C974 (PCscsi)  *
BusLogic.c  BusLogic MultiMaster adapters  * *
NCR53c406a.c  NCR53c406a  * *
advansys.c  AdvanSys SCSI Adapters  * *
aha152x.c  Adaptec AHA-152x  *
aha1542.c  Adaptec AHA-1542  * *
aha1740.c  Adaptec AHA-1740  *
aic7xxx.c  Adaptec AIC7xxx  * *
eata.c  EATA 2.x DMA host adapters  *
eata_dma.c  EATA/DMA host adapters  * *
eata_pio.c  EATA/PIO host adapters  *
fdomain.c  Future Domain TMC-16x0  *
in2000.c  Always IN 2000  *
NCR5380.c  Generic NCR5380  * * *
pas16.c  Pro Audio Spectrum/Studio 16
qlogic.c  Qlogic FAS408  *
scsi.c  SCSI middle layer  * * * *
scsi_debug.c  SCSI debug layer  *
seagate.c  ST01,ST02, TMC-885  *
t128.c  Trantor T128/128F/228
u14-34f.c  UltraStor 14F/34F  * *
ultrastor.c  UltraStor 14F/24F/34F  *
wd7000.c  WD-7000  * *

Table 43.3: Linux SCSI drivers