\chapter{Device Drivers}

This section was written by Peter De Schrijver and Daniel Wagner.


\section{Requirements}

\begin{itemize}
\item Performance: Speed is important!
\item Portability: Framework should work on different architectures.
    
  It should also be usable in a non-Hurdish environment with only
  small changes.

\item Flexibility
\item Convenient interfaces
\item Consistency 
\item Safety: A driver failure should have as little impact on the
  system as possible.
\end{itemize}


\section{Overview}

The framework consists of: 
\begin{itemize}
\item Bus drivers
\item Device drivers
\item Service servers (plugin managers, $\omega_0$, deva)
\end{itemize}

\subsection{Layer of the drivers}

The device driver framework consists only of the lower-level drivers
and doesn't need a complicated scheme for access control.
This is because it should be possible to share devices, e.g. with a
neighbour Hurd.  Authentication is done by installing a virtual
driver in each OS/neighbour Hurd.  The driver framework trusts these
virtual drivers.  A non-Hurdish system can therefore use
the driver framework just by implementing these virtual drivers.
  
Only threads which have registered as trusted are allowed to access
device drivers.  The check simply compares the sender's ID
against a table of known threads.
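
The check can be illustrated with a minimal C sketch.  The names
\texttt{thread\_id\_t}, \texttt{trust\_table}, \texttt{trust\_register} and
\texttt{trust\_check} are hypothetical and not part of any existing
interface.  A fixed-size linear table is assumed only because the
number of virtual drivers is expected to be small.

\begin{verbatim}
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an L4 thread ID. */
typedef uint64_t thread_id_t;

#define MAX_TRUSTED 16

struct trust_table {
    thread_id_t ids[MAX_TRUSTED];
    size_t count;
};

/* Register a thread as trusted; returns false if the table is full. */
bool trust_register(struct trust_table *t, thread_id_t id)
{
    assert(t != NULL);
    if (t->count == MAX_TRUSTED)
        return false;
    t->ids[t->count++] = id;
    return true;
}

/* Access check: compare the sender's ID against every known thread. */
bool trust_check(const struct trust_table *t, thread_id_t sender)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (t->ids[i] == sender)
            return true;
    return false;
}
```
\end{verbatim}

A driver would call \texttt{trust\_check} on the sender of every
incoming request and reject the request if it returns false.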

\subsection{Address spaces}

Drivers always reside in their own address space (AS).  The overhead
of cross-AS IPC is small enough to make this practical.

\subsection{Zero copying and DMA}

It is assumed that there are no differences between physical memory
pages: every physical memory page can be used for DMA
transfers.  Of course, this means that older hardware such as ISA
devices cannot be supported.

Some support for ISA devices such as serial ports and the PS/2
keyboard is still needed, however.
  
With this assumption, the device driver framework can be given any
physical memory page for DMA operation.  This physical memory page
must be pinned down.
  
If an application wants to send or receive data to/from a device
driver it has to tell the virtual driver the page on which the
operation has to be executed.  Since the application doesn't know the
virtual-real memory mapping, it has to ask the physical memory manager
for the real memory address of the page in question.  If the page is
not directly mapped from the physical memory manager the application
asks the mapper (another application which has mapped this memory
region to the first application) to resolve the mapping.  This can be
done recursively.  Normally, resolving a mapping can be sped
up by a caching service, since a small number of pages are reused
very often.
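
The recursive resolution can be sketched in C as follows.  The
\texttt{mapping} structure and \texttt{resolve\_phys} are hypothetical:
in the real system each step would be an IPC to the mapper rather than
a pointer dereference, and the caching service mentioned above is left
out for brevity.

\begin{verbatim}
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One page mapping as seen by a task (hypothetical model). */
struct mapping {
    /* Mapping in the mapper's address space, or NULL if the page
       was mapped directly by the physical memory manager. */
    const struct mapping *from;
    /* Physical address; only meaningful if from == NULL. */
    uintptr_t phys;
};

/* Follow the chain of mappers until physmem is reached. */
uintptr_t resolve_phys(const struct mapping *m)
{
    assert(m != NULL);
    if (m->from == NULL)
        return m->phys;            /* mapped directly by physmem */
    return resolve_phys(m->from);  /* ask the mapper to resolve it */
}
```
\end{verbatim}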
  
With this scheme, the drivers do not have to take special care of zero
copying if there is only one virtual driver.  When there is more than
one virtual driver, pages have to be copied for all other virtual
drivers.

\subsection{Physical versus logical device view}

The device driver framework will only offer a physical device view,
i.e. a tree with devices as the leaves, connected by
various bus technologies.  Any logical view and naming persistence
will have to be built on top of this (translator).

\subsection{Things for the future}

\begin{itemize}
\item Interaction with the task server (e.g. listing driver threads
  with ps, etc.)
\item Power management
\end{itemize}


\section{Bus Drivers}

A bus driver is responsible for managing the bus and providing access to
devices connected to it.  In practice, this means a bus driver has to
perform the following tasks:

\begin{itemize}
\item Handle hotplug events

  Busses which do not support hotplugging will be treated as if there
  were one insertion event for every device connected to them when the
  bus driver is started.  Drivers which don't support autoprobing of
  devices will probably have to read some configuration data from a
  file\footnote{It might be a good idea if the device driver has no
  notion of how the configuration is stored.  It just asks the bus driver,
  which should know how to get the configuration.} or, if the driver is
  needed for bootstrapping, the configuration can be given as an
  argument on its stack.  In some cases the bus doesn't generate
  insertion/removal events, but can still support some form of hotplug
  functionality if the user tells the driver when a change to the bus
  configuration has happened (e.g. SCSI).
\item Configure client device drivers

  The bus driver should start the appropriate client device driver
  translator when an insertion event is detected.  It should also
  provide the client device driver with all necessary configuration
  info, so it can access the device it needs.  This configuration data
  typically consists of the bus addresses of the device and possibly
  IRQ numbers or DMA channel IDs.  The device driver is loaded by the
  associated plugin manager.
\item Provide access to devices

  This means the bus driver should be able to perform a bus
  transaction on behalf of a client device driver.  In some cases this
  involves sending a message and waiting for a reply (e.g. SCSI, USB,
  IEEE 1394, Fibre Channel,...).  The driver should provide
  send/receive message primitives in this case.  In other cases
  devices on the bus can be accessed by memory accesses or by using
  special I/O instructions.  In this case the driver should provide
  mapping and unmapping primitives so a client device driver can get
  access to the memory range or is allowed to access the I/O
  addresses.  The client device driver should use a library, which is
  bus dependent, to access the device on the bus.  This library hides
  the platform specific details of accessing the bus.
\item Rescans

  Furthermore, the bus driver must also support rescans for hardware.
  Not all drivers may be found during bootstrapping, so
  drivers may be loaded later on.  This is done by generating
  new attach notifications, which are sent to the bus's plugin manager.
  The plugin manager then loads a new driver, if possible.  A probe
  function is not needed since all supported hardware can be identified
  by vendor/device identification (unlike ISA hardware).  For hardware
  busses which don't support such identification, only static
  configuration is possible (configuration scripts etc.).
\end{itemize}
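
Because hardware is identified by vendor/device IDs, the driver lookup
done by a plugin manager after an attach notification can be a plain
table search.  The following C sketch assumes a hypothetical
\texttt{driver\_entry} table and a hypothetical \texttt{find\_driver}
function; how the table is filled is not specified here.

\begin{verbatim}
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One known driver, keyed by vendor/device ID (hypothetical layout). */
struct driver_entry {
    uint16_t vendor;
    uint16_t device;
    const char *driver; /* name of the driver binary */
};

/* Return the driver name for a device, or NULL if none is known.
   A NULL result is not final: a later rescan may find a driver
   that has become available in the meantime. */
const char *find_driver(const struct driver_entry *tbl, size_t n,
                        uint16_t vendor, uint16_t device)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (tbl[i].vendor == vendor && tbl[i].device == device)
            return tbl[i].driver;
    return NULL;
}
```
\end{verbatim}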

\subsection{Root bus driver}

The root bus is the entry point for looking up devices.

\subsection{Generic Bus Driver}

Operations:
\begin{itemize}
\item notify (attach, detach)
\item string enumerate
\end{itemize}


\subsection{ISA Bus Driver}

Inherits from:
\begin{itemize}
\item Generic Bus Driver
\end{itemize}

Operations:
\begin{itemize}
\item (none)
\end{itemize}


\subsection{PCI Bus Driver}

Inherits from:
\begin{itemize}
\item Generic Bus Driver
\end{itemize}

Operations:
\begin{itemize}
\item map\_mmio: map a PCI BAR for MMIO
\item map\_io: map a PCI BAR for I/O
\item map\_mem: map a PCI BAR for memory
\item read\_mmio\_\{8,16,32,64\}: read from an MMIO register
\item write\_mmio\_\{8,16,32,64\}: write to an MMIO register
\item read\_io\_\{8,16,32,64\}: read from an I/O register
\item write\_io\_\{8,16,32,64\}: write to an I/O register
\item read\_config\_\{8,16,32,?\}: read from a PCI config register
\item write\_config\_\{8,16,32,?\}: write to a PCI config register
\item alloc\_dma\_mem (for non-zero-copying): allocate main memory
  usable for DMA
\item free\_dma\_mem (for non-zero-copying): free main memory
  usable for DMA
\item prepare\_dma\_read: write back CPU cachelines for DMAable memory area
\item sync\_dma\_write: discard CPU cachelines for DMAable memory area
\item alloc\_consistent\_mem: allocate memory which is consistent 
  between CPU and device
\item free\_consistent\_mem: free memory which
  is consistent between CPU and device
\item get\_irq\_mapping (A,B,C,D): get the IRQ matching the 
  INT(A,B,C,D) line
\end{itemize}
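
As an illustration of the MMIO accessors, a minimal C sketch of the
32-bit variants follows.  The signatures (a base pointer obtained from
\texttt{map\_mmio} plus a byte offset) are an assumption, since the
interface has not been fixed, and a real implementation would
additionally need the memory barriers required by the platform.

\begin{verbatim}
```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit register at byte offset off from a mapped BAR.
   The volatile access keeps the compiler from caching or
   reordering the device access. */
uint32_t read_mmio_32(volatile void *base, uintptr_t off)
{
    assert(off % 4 == 0);
    return *(volatile uint32_t *)((volatile uint8_t *)base + off);
}

/* Write a 32-bit register at byte offset off in a mapped BAR. */
void write_mmio_32(volatile void *base, uintptr_t off, uint32_t val)
{
    assert(off % 4 == 0);
    *(volatile uint32_t *)((volatile uint8_t *)base + off) = val;
}
```
\end{verbatim}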


\section{Device Drivers}

\subsection{Classes}

\begin{itemize}
\item character: This is the standard tty as known in the Unix environment.
\item block
\item human input: Keyboard, mouse, ...
\item packet switched network
\item circuit switched network
\item framebuffer
\item streaming audio
\item streaming video
\item solid state storage: flash memory
\end{itemize}

\subsection{Human input devices (HID) and the console}

The HIDs and the console are critical for user interaction with the
system.  Furthermore, the console should be working as soon as possible
to give feedback.  Log messages which are sent to the console before
the hardware has been initialized should be buffered.

\subsection{Generic Device Driver}

Operations:
\begin{itemize}
\item init : prepare hardware for use
\item start : start normal operation
\item stop : stop normal operation
\item deinit : shutdown hardware
\item change\_irq\_peer : change the peer thread to which IRQ messages are propagated
\end{itemize}
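
The four lifecycle operations suggest a small state machine.  The C
sketch below is one possible reading, not a decided interface: the
state names, the \texttt{drv\_apply} function and the rule that
\texttt{deinit} is only allowed on a stopped driver are all
assumptions.

\begin{verbatim}
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum drv_state { DRV_UNINITIALIZED, DRV_INITIALIZED, DRV_RUNNING };
enum drv_op { OP_INIT, OP_START, OP_STOP, OP_DEINIT };

struct driver {
    enum drv_state state;
};

/* Apply an operation; returns false if it is not valid in the
   current state, in which case the state is left unchanged. */
bool drv_apply(struct driver *d, enum drv_op op)
{
    assert(d != NULL);
    switch (op) {
    case OP_INIT:   /* prepare hardware for use */
        if (d->state != DRV_UNINITIALIZED) return false;
        d->state = DRV_INITIALIZED;
        return true;
    case OP_START:  /* start normal operation */
        if (d->state != DRV_INITIALIZED) return false;
        d->state = DRV_RUNNING;
        return true;
    case OP_STOP:   /* stop normal operation */
        if (d->state != DRV_RUNNING) return false;
        d->state = DRV_INITIALIZED;
        return true;
    case OP_DEINIT: /* shut down hardware */
        if (d->state != DRV_INITIALIZED) return false;
        d->state = DRV_UNINITIALIZED;
        return true;
    }
    return false;
}
```
\end{verbatim}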

\subsection{ISA Devices}

Inherits from:
\begin{itemize}
\item Generic Device Driver
\end{itemize}

Supported devices:
\begin{itemize}
\item Keyboard (ps2)
\item Serial port (mainly for debugging purposes)
\end{itemize}


\subsection{PCI Devices}

Inherits from:
\begin{itemize}
\item Generic Device Driver
\end{itemize}

Supported devices:
\begin{itemize}
\item block devices
\end{itemize}


\section{Service Servers}

\subsection{Plugin Manager}

Each bus driver has a handle/reference to which insert/remove events
are sent.  The owner of the handle/reference must then take
appropriate action, such as loading the drivers.  These actors are
called plugin managers.

The plugin manager is also the pager for the loaded driver.

\begin{comment}
Obviously, the plugin manager needs some sort of exec format
support.  Maybe its own ELF loader.
\end{comment}

\subsection{Deva}

Deva stands for \emph{Device Access Server}.  This server implements basic
services for the device driver framework like thread creation, thread
deletion, etc.  The device driver framework itself doesn't depend on
any Hurd code.  The interaction with the Hurd system will be
abstracted by deva. 

Deva must provide the following services:
\begin{itemize}
\item task/thread manipulation (creation, deletion)
\item memory (de)allocation (virtual, physical)
\item io ports 
\item driver (un)loading
\item bootstrapping
\end{itemize}

%% Deva is also the pager for the first plugin manager.

\subsection{$\omega_0$}

$\omega_0$ is a system-central IRQ-logic server.  It runs in the
privileged address space so that it is allowed to reroute IRQ IPC.

If an IRQ is shared between several devices, the drivers are daisy
chained and have to notify their peers if an IRQ IPC has arrived.

For more details see http://os.inf.tu-dresden.de/\~{}hohmuth/prj/omega0.ps.gz

Operations:
\begin{itemize}
\item attach\_irq : attach an ISR thread to the IRQ 
\item detach\_irq : detach an ISR thread from the IRQ
\end{itemize}
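
The daisy chain for a shared IRQ can be modelled as an ordered list of
ISR threads in which each driver forwards the IRQ message to its peer.
The C sketch below is a simplified model under that assumption.  The
\texttt{irq\_chain} structure and the \texttt{irq\_peer} helper are
hypothetical, and the signatures of \texttt{attach\_irq} and
\texttt{detach\_irq} are not taken from the $\omega_0$ specification.

\begin{verbatim}
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned thread_id_t;   /* hypothetical thread ID type */
#define NO_THREAD 0u            /* marks "no peer" */
#define MAX_ISR 4

/* ISR threads attached to one IRQ, in notification order. */
struct irq_chain {
    thread_id_t isr[MAX_ISR];
    size_t n;
};

/* Append an ISR thread to the chain; false if the chain is full. */
bool attach_irq(struct irq_chain *c, thread_id_t tid)
{
    assert(c != NULL && tid != NO_THREAD);
    if (c->n == MAX_ISR)
        return false;
    c->isr[c->n++] = tid;
    return true;
}

/* Remove an ISR thread, closing the gap so the chain stays linked. */
bool detach_irq(struct irq_chain *c, thread_id_t tid)
{
    assert(c != NULL);
    for (size_t i = 0; i < c->n; i++) {
        if (c->isr[i] == tid) {
            for (size_t j = i + 1; j < c->n; j++)
                c->isr[j - 1] = c->isr[j];
            c->n--;
            return true;
        }
    }
    return false;
}

/* Peer to notify after tid, or NO_THREAD if tid is last or unknown. */
thread_id_t irq_peer(const struct irq_chain *c, thread_id_t tid)
{
    assert(c != NULL);
    for (size_t i = 0; i + 1 < c->n; i++)
        if (c->isr[i] == tid)
            return c->isr[i + 1];
    return NO_THREAD;
}
```
\end{verbatim}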


\section{Resource Management}

\subsection{IRQ handling}

\subsubsection{IRQ based interrupt vectors}

Some CPU architectures (e.g. 68k, IA32) can directly jump to an
interrupt vector depending on the IRQ number.  This is typically the
case on CISC CPUs.  In this case there is some prioritization scheme.  On
IA32, for example, the lowest IRQ number has the highest priority.
Sometimes the priorities are programmable.  Most RISC CPUs have only
a few interrupt vectors which are connected to external IRQs (typically
1 or 2).  This means the IRQ handler has to read a register in the
interrupt controller to determine which IRQ handler has to be
executed.  Sometimes the hardware assists here by providing a register
which indicates the highest priority interrupt according to some
(programmable) scheme.
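
Without such hardware assistance, the handler has to decode the
pending register in software.  The following C sketch assumes a
hypothetical 32-bit pending register in which bit $n$ stands for IRQ
$n$ and lower numbers have higher priority; real interrupt controllers
differ in both respects.

\begin{verbatim}
```c
#include <assert.h>
#include <stdint.h>

/* Return the highest-priority pending IRQ (lowest set bit),
   or -1 if no IRQ is pending. */
int highest_pending_irq(uint32_t pending)
{
    for (int irq = 0; irq < 32; irq++)
        if (pending & (UINT32_C(1) << irq))
            return irq;
    return -1;
}
```
\end{verbatim}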

\subsubsection{IRQ acknowledgement}

The IRQ acknowledgement is done in two steps. First inform the
hardware about the successful IRQ acceptance. Then inform the ISRs
about the IRQ event.

\subsubsection{Edge versus level triggered IRQs}

Edge triggered IRQs typically don't need explicit acknowledgment by
the CPU at the device level.  You can just acknowledge them at the
interrupt controller level.  Level triggered IRQs typically need to
be explicitly acknowledged by the CPU at the device level.  The CPU has to
read or write a register of the IRQ generating peripheral to make
the IRQ go away.  If this is not done, the IRQ handler will be
reentered immediately after it ended, effectively creating an endless
loop.  Another way of preventing this would be to mask the IRQ.

\subsubsection{Multiple interrupt controllers}

Some systems have multiple interrupt controllers in cascade.  This is
for example the case on a PC, where you have two 8259 interrupt
controllers.  The second controller is connected to the IRQ 2 pin of
the first controller.  It is also common in non-PC systems which still
use some standard PC components such as a Super IO controller.  In this
case the two 8259s are connected to one pin of the primary interrupt
controller.  Important for the software here is that you need to
acknowledge IRQs at each controller.  So to acknowledge an IRQ from
the second 8259 connected to the first 8259 connected to another
interrupt controller, you have to give an ACK command to each of those
controllers.  Another important fact is that on the PC architecture the
order of the ACKs is important.
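
For the PC case, the acknowledgement of a cascaded 8259 pair can be
sketched in C as follows.  The command ports (0x20 and 0xA0) and the
non-specific EOI command (0x20) are the standard PC values.  The
slave-before-master order shown is the conventional one.  The
\texttt{outb\_fn} hook and the recording function
\texttt{log\_outb} are hypothetical and only serve to keep the sketch
independent of real port I/O.

\begin{verbatim}
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIC1_CMD 0x20   /* master 8259 command port */
#define PIC2_CMD 0xA0   /* slave 8259 command port */
#define PIC_EOI  0x20   /* non-specific end-of-interrupt command */

/* Port write hook, so the sketch does not depend on real I/O. */
typedef void (*outb_fn)(uint16_t port, uint8_t val);

/* Acknowledge an IRQ on the cascaded pair: IRQs 8..15 come in via
   the slave, so both controllers need an EOI, slave first. */
void pic_ack(unsigned irq, outb_fn outb)
{
    assert(irq < 16 && outb != NULL);
    if (irq >= 8)
        outb(PIC2_CMD, PIC_EOI);
    outb(PIC1_CMD, PIC_EOI);
}

/* Recording stand-in for port I/O (illustration only). */
uint16_t port_log[4];
size_t port_log_len;

void log_outb(uint16_t port, uint8_t val)
{
    (void)val;
    if (port_log_len < 4)
        port_log[port_log_len++] = port;
}
```
\end{verbatim}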

\subsubsection{Shared IRQs}

Some systems have shared IRQs. In this case the IRQ handler has to
look at all devices using the same IRQ...

\subsubsection{IRQ priorities}

All IRQs on L4 have priorities, so if an IRQ occurs, any IRQ with a
lower priority than the first IRQ will be blocked until the first IRQ
has been acknowledged.  ISR priorities must match the hardware priority
(danger of priority inversion).  Furthermore, the IRQ acknowledgment
order is important.

The 8259 also supports a specific IRQ acknowledge, if we recall
correctly.  However, this scheme does not work in most level triggered
IRQ environments.  In these environments you must acknowledge (or mask)
the IRQ before leaving the IRQ handler; otherwise the CPU will
immediately reenter the IRQ handler, effectively creating an endless
loop.  In this case L4 would have to mask the IRQ.  The IRQ thread
would have to unmask it after acknowledgement and processing.

\subsubsection{IRQ handling by L4/x86}

The L4 kernel does handle IRQ acknowledgment.

\subsection{Memory}

If no physical memory pages are provided by the OS, the device driver
framework allocates pages from the physical memory manager.  The device
driver framework never has to handle any virtual to
physical page mapping.


\section{Bootstrapping}

The device driver framework will be started by deva, which is started
by wortel.  All drivers and servers (e.g. the plugin manager) are
stored in an archive which will be extracted by deva.

\subsection{deva}

For bootstrapping, deva will only have a subset of drivers ready.
As soon as the filesystem is running, deva can ask for drivers from the
hard disk.  If new drivers are available, it has to inform the plugin
manager to ask for unresolved drivers again.

The first task deva starts is a plugin server.  The plugin server
then does the rest of the bootstrapping process.

\subsection{Plugin Manager}

A plugin manager handles driver loading for devices.  It asks deva for
drivers.

The first plugin server also does some bootstrapping.  First, it starts
the root bus driver.

\section{Order of implementation}

\begin{enumerate}
\item deva, plugin manager
\item root bus server
\item pci bus
\item isa bus
\item serial port  (isa bus)
\item console 
\end{enumerate}


\section{Scenarios}

\subsection{Insert Event}

\begin{figure}
  \begin{center}
    \leavevmode 
    \includegraphics[scale=0.8]{ddf_insert_event.eps}
  \end{center}
  \caption[New hardware detected]{A new hardware device is detected (a
  network card) by the PCI root bus driver.  The PCI root bus driver
  initiates the loading of the correct driver for the new hardware
  device.}
  \label{fig:ddf_insert_event}
\end{figure}

If a simple hardware device is found, the device driver framework (ddf)
will load a driver for the new hardware device as follows (see
Figure~\ref{fig:ddf_insert_event}):

\begin{enumerate}
\item The PCI Bus Driver detects a hardware device for which no driver
has been loaded yet.  It generates an insert event which it sends to
one (all?) registered entity.  The interface for the event handler has
not been decided yet.
\item The Root Bus Driver receives the event signal.  Note it is not
necessary that the Root Bus Driver handles the insert signal for all
drivers.  It forwards the signal to the/a Plugin Manager (PLM).
\item The/a Plugin Manager (PLM) asks Deva to load the driver binary
for the new device.
\item Deva forwards the loading request to the ext2 filesystem
process.  During bootstrapping Deva will handle the request by itself.
Deva has an archive of drivers loaded by grub.
\item The ext2 process decides where it finds the device driver binary
(block address)
\item The ddwrapper (device driver wrapper) forwards the read call
from the ext2 process to the IDE Driver.
\item After checking if the caller is allowed to start a read command,
the IDE Driver reads the device driver from the disk.
\item The IDE Driver returns the data.
\item ddwrapper returns the data. XXX This might be wrong.  IIRC, the
data is returned in a container and only the handle of the container is
transferred.
\item Ext2 returns the device driver (data).
\item Deva returns the device driver (data).
\item The PLM asks Deva to create a new address space.
\item Deva asks wortel to create new address space.
\item wortel returns ``a new address space''.
\item Deva returns ``a new address space''.
\item PLM is registered as pagefault handler for the new driver
address space.  The bootstrap thread starts to run and generates a
page fault.
\item PLM asks Deva for memory.
\item Deva asks physmem for memory.
\item physmem returns memory pages.
\item Deva returns memory pages.
\item PLM maps the device driver binary into the address space of the
new driver.
\end{enumerate}

\subsection{Several Plugin Managers}

\begin{figure}
  \begin{center}
    \leavevmode 
    \includegraphics{ddf_several_plms.eps}
  \end{center}
  \caption[Several plugin managers]{For the new NIC driver a
  specialised plugin manager is loaded first.}
  \label{fig:ddf_several_plms}
\end{figure}

For certain drivers it makes sense to have specialised plugin
managers.  The default plugin manager (dPLM) has to be asked to create
a new plugin manager.  It is loaded like a normal driver.  The default
plugin manager will also act as pager for the new plugin manager.
When the new plugin manager is activated, it registers itself with
Deva as a new plugin manager.  Deva will send all signals/messages from
outside of the ddf to all registered plugin managers.
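
The registration and fan-out can be sketched in C as follows.  All
names (\texttt{plm}, \texttt{deva}, \texttt{deva\_register\_plm},
\texttt{deva\_broadcast}) are hypothetical, and message delivery is
modelled as a plain field update where the real system would use IPC.

\begin{verbatim}
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PLM 8

/* A plugin manager as seen by Deva (hypothetical model). */
struct plm {
    int last_msg;       /* last message delivered to this PLM */
    unsigned received;  /* number of messages delivered so far */
};

struct deva {
    struct plm *plms[MAX_PLM];
    size_t n;
};

/* Called by a newly activated plugin manager to register itself. */
bool deva_register_plm(struct deva *d, struct plm *p)
{
    assert(d != NULL && p != NULL);
    if (d->n == MAX_PLM)
        return false;
    d->plms[d->n++] = p;
    return true;
}

/* Deliver a message from outside the ddf to every registered PLM. */
void deva_broadcast(struct deva *d, int msg)
{
    assert(d != NULL);
    for (size_t i = 0; i < d->n; i++) {
        d->plms[i]->last_msg = msg;
        d->plms[i]->received++;
    }
}
```
\end{verbatim}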