<html>
<body bgcolor=white>
<h1>Haiku Network Stack Architecture</h1>
<p>
The Haiku Network Stack is a modular, layered networking stack, very
similar to what you may know as BONE (the BeOS Networking Environment).
</p>
<p>
The entry point when talking to the stack is a dedicated device
driver that publishes itself in /dev/net. The userland library libnetwork.so
(which combines libsocket.so and libbind.so) talks directly to this
driver, mostly via ioctl()<sup><a href="#foot1">1</a></sup>.
</p>
<p>
The driver either creates sockets or passes every command on to the socket
module<sup><a href="#foot2">2</a></sup>. Depending on the address family and
type of the socket, the lower layers will be loaded and connected.
</p>
<p>
For example, with a TCP/IP socket, the stack could look like this:
</p>
<table cellspacing=1 cellpadding=5 border=0>
<tr><td colspan=2 bgcolor="#aaaadd">Socket</td></tr>
<tr><td bgcolor="#ccccff">TCP</td>
<td rowspan=2 bgcolor="#ddddff">Protocols<br>
<font size="-2">defined by the socket (address family, type)<br>
(session, transport, network layers)</font></td>
</tr>
<tr><td bgcolor="#ccccff">IPv4</td></tr>
<tr><td colspan=2 bgcolor="#ddcc88">Datalink</td></tr>
<tr><td bgcolor="#ffee88">ARP</td>
<td rowspan=2 bgcolor="#ffee99">Datalink Protocols<br>
<font size="-2">defined by the interface (IP address, device)<br>
(datalink layer)</font></td>
</tr>
<tr><td bgcolor="#ffee88">Ethernet framing</td></tr>
<tr><td bgcolor="#ffdd00">Ethernet device</td><td bgcolor="#ffdd55"><font size="-2">(physical layer)</font></td></tr>
</table>
<p>
Here, TCP and IPv4 are net_protocol modules, while ARP and the Ethernet framing are
net_datalink_protocol modules. All modules are connected in a chain, even though the
datalink layer introduces more than one path (one for each interface).
</p>
<p>
When sending data through a socket, a net_buffer is created in the socket module and passed
on to the lower levels, where each protocol processes it before passing it on to the next
protocol in the chain. When the buffer reaches the datalink level, an accompanying net_route
object determines which interface the buffer is destined for (and thereby which datalink
protocols are in the chain). The route has to be specified by the upper protocols before
the buffer enters the datalink level; a buffer that arrives without a valid route
is discarded.
</p>
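<p>
The send path described above can be sketched in a few lines of C. This is only an
illustration under stated assumptions: every name in it (<code>sketch_protocol</code>,
<code>sketch_buffer</code>, <code>sketch_route</code>, <code>sketch_send()</code>) is
hypothetical and much simpler than the real stack structures. Each protocol merely adds
its header size, and the datalink hand-over is reduced to returning an interface index.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real Haiku structures. */
struct sketch_route {
    int interface_index;               /* interface this route points to */
};

struct sketch_buffer {
    size_t size;                       /* bytes currently in the packet */
    const struct sketch_route *route;  /* NULL until an upper protocol sets it */
};

struct sketch_protocol {
    const char *name;
    size_t header_size;                /* bytes this layer prepends */
    struct sketch_protocol *next;      /* next lower protocol, or NULL */
};

enum { SKETCH_NO_ROUTE = -1 };

/*
 * Pass the buffer down the protocol chain, then hand it to the datalink
 * level. Returns the index of the target interface, or SKETCH_NO_ROUTE
 * if the buffer has no route and is therefore discarded.
 */
int sketch_send(struct sketch_protocol *chain, struct sketch_buffer *buffer)
{
    assert(buffer != NULL);

    for (struct sketch_protocol *p = chain; p != NULL; p = p->next)
        buffer->size += p->header_size;   /* each layer adds its header */

    if (buffer->route == NULL)
        return SKETCH_NO_ROUTE;           /* no valid route: discard */

    return buffer->route->interface_index;
}
```

<p>
With a TCP entry chained to an IPv4 entry (20 header bytes each in this sketch), a
100 byte buffer grows to 140 bytes and ends up at the interface named by its route;
the same buffer without a route is rejected.
</p>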
<p>
The protocol modules are loaded and unloaded as needed. The stack itself stays loaded
as long as there are interfaces defined; as soon as the last interface is removed,
the stack is supposed to be unloaded (although this is not yet implemented).
</p>
<h3>The Structures and Classes</h3>
<h4>net_domain</h4>
<p>
Every supported address family gets its own domain. A domain consists of that family,
the net_protocol module that handles the domain, and a list of interfaces and routes.
It also gets a name: for example, the IPv4 module registers the "internet" domain
(AF_INET).
</p>
<p>
The domain protocol module is responsible for managing the domain: it has to register
the domain when the module is loaded, and unregister it again when the module is unloaded
by the networking stack.
</p>
<h4>net_interface</h4>
<p>
An interface makes an underlying net_device accessible to the stack. When creating
a new interface, you have to specify a domain and a device to be used. The stack
will then look through the registered datalink protocols and build a chain of
them for that interface.
</p>
<p>
The interface usually gets a network address and a route that directs outgoing buffers
to it. If there is no route to an interface, it will never be used for
outgoing data, but it may well receive data from other hosts.
</p>
<p>
An interface can be "up" (when <code>IFF_UP</code> is set in its <code>flags</code>
member), in which case it accepts data; when that flag is not set, it discards
all data it gets. The interface also specifies the maximum buffer size that can be
sent over it (the <code>mtu</code> member, a.k.a. maximum transmission
unit).
</p>
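<p>
As a minimal sketch of the two checks just described, the following C fragment tests the
"up" flag and the <code>mtu</code> before a buffer is accepted. The structure, the function
name, and the flag value <code>SKETCH_IFF_UP</code> are assumptions made for illustration
(the real code would use <code>IFF_UP</code> from the system headers), and rejecting an
oversized buffer outright, rather than fragmenting it, is a simplification.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for IFF_UP; the value is an assumption for this sketch. */
#define SKETCH_IFF_UP 0x0001u

/* Hypothetical, reduced interface: only the two members discussed above. */
struct sketch_interface {
    uint32_t flags;
    size_t mtu;      /* maximum transmission unit in bytes */
};

/* Returns true when the interface would take a buffer of the given size. */
bool sketch_interface_accepts(const struct sketch_interface *interface,
    size_t size)
{
    assert(interface != NULL);

    if ((interface->flags & SKETCH_IFF_UP) == 0)
        return false;               /* interface is down: discard everything */

    return size <= interface->mtu;  /* larger buffers cannot be sent as-is */
}
```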
<p>
Interfaces are configured via ioctl()s (SIOCAIFADDR, ...). You can use the command
line tool "ifconfig" to do this for you.
</p>
<h4>net_device</h4>
<p>
A networking device is used to actually send and receive the buffers. It points either
to an actual hardware device (as with Ethernet) or to a virtual device (as with
loopback). Every device has a unique name that identifies it. When creating a device,
the name also decides which net_device module will be chosen; for example, every name
that starts with "loop" will end up in the loopback device, while the Ethernet device
accepts names that start with "/dev/net/".
</p>
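<p>
The name-based selection can be pictured as a simple prefix test. The sketch below is
not the stack's actual lookup code; the enum and function names are hypothetical, and
only the two prefixes mentioned above ("loop" and "/dev/net/") are taken from the text.
</p>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type: which kind of device module claims a name. */
enum sketch_device_kind {
    SKETCH_DEVICE_NONE,
    SKETCH_DEVICE_LOOPBACK,
    SKETCH_DEVICE_ETHERNET
};

/* Returns non-zero if name begins with prefix. */
static int has_prefix(const char *name, const char *prefix)
{
    return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Decide from the device name alone which module would handle it. */
enum sketch_device_kind sketch_pick_device_module(const char *name)
{
    assert(name != NULL);

    if (has_prefix(name, "loop"))
        return SKETCH_DEVICE_LOOPBACK;
    if (has_prefix(name, "/dev/net/"))
        return SKETCH_DEVICE_ETHERNET;

    return SKETCH_DEVICE_NONE;   /* no module accepts this name */
}
```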
<p>
A device can be shared by many interfaces at the same time. The device to be used by
an interface is specified when the interface is created.
The device also has an <code>mtu</code> member, which sets the upper limit for the
<code>mtu</code> of every interface that uses it.
</p>
<h4>net_buffer</h4>
<p>
A buffer holds exactly one packet, and has a source as well as a destination address.
The addresses may be changed in every layer the buffer passes through. For example,
the datalink protocols usually use sockaddr_dl structures with family AF_DLI, while
the upper levels may use sockaddr_in structures with family AF_INET. Every protocol
supports only a small number of address types, and it is the responsibility of the upper
protocols to prepare the address for use in the lower protocols (which is also one
reason why protocols cannot be stacked onto each other arbitrarily).
</p>
<p>
The net_buffer module can be used to access the data within the buffer, append new
data to the buffer, or remove chunks of data from it. Internally, a net_buffer consists
of smaller, usually fixed-size (2048-byte) buffers that can be shared or connected as needed.
</p>
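<p>
To illustrate the idea of a packet stored in connected fixed-size pieces, here is a small
C sketch of an append operation over a linked list of 2048 byte chunks. It is not the
net_buffer implementation: the names (<code>sketch_chunk</code>,
<code>sketch_chunked_buffer</code>, <code>sketch_append()</code>,
<code>sketch_free()</code>) are hypothetical, and sharing of chunks between buffers is
left out entirely.
</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_CHUNK_SIZE 2048

/* One fixed-size piece of a packet (hypothetical layout). */
struct sketch_chunk {
    struct sketch_chunk *next;
    size_t used;                          /* bytes filled in data[] */
    unsigned char data[SKETCH_CHUNK_SIZE];
};

struct sketch_chunked_buffer {
    struct sketch_chunk *first;
    struct sketch_chunk *last;
    size_t size;                          /* total payload bytes */
    size_t chunk_count;
};

/*
 * Append size bytes, filling the last chunk first and linking new chunks
 * as needed. Returns 0 on success, -1 if memory runs out (bytes copied
 * before the failure stay in the buffer).
 */
int sketch_append(struct sketch_chunked_buffer *buffer, const void *data,
    size_t size)
{
    const unsigned char *source = data;
    assert(buffer != NULL && (data != NULL || size == 0));

    while (size > 0) {
        struct sketch_chunk *chunk = buffer->last;
        if (chunk == NULL || chunk->used == SKETCH_CHUNK_SIZE) {
            chunk = calloc(1, sizeof(*chunk));
            if (chunk == NULL)
                return -1;
            if (buffer->last != NULL)
                buffer->last->next = chunk;
            else
                buffer->first = chunk;
            buffer->last = chunk;
            buffer->chunk_count++;
        }

        size_t space = SKETCH_CHUNK_SIZE - chunk->used;
        size_t count = size < space ? size : space;
        memcpy(chunk->data + chunk->used, source, count);
        chunk->used += count;
        buffer->size += count;
        source += count;
        size -= count;
    }
    return 0;
}

/* Release all chunks and reset the buffer to its empty state. */
void sketch_free(struct sketch_chunked_buffer *buffer)
{
    assert(buffer != NULL);

    struct sketch_chunk *chunk = buffer->first;
    while (chunk != NULL) {
        struct sketch_chunk *next = chunk->next;
        free(chunk);
        chunk = next;
    }
    buffer->first = buffer->last = NULL;
    buffer->size = buffer->chunk_count = 0;
}
```

<p>
Appending 3000 bytes to an empty buffer in this sketch yields two chunks: a full one
with 2048 bytes and a second one holding the remaining 952 bytes.
</p>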
<h4>net_socket</h4>
<p>
The socket is of interest only to the net_protocol modules, as it stores options
that may affect the protocol's performance. It is the direct counterpart
to a socket file descriptor in userland, but has very little logic of its own.
</p>
<p>
When a socket is created, the networking stack creates a chain of net_protocol
modules for the socket that will then do the real work. When the socket is closed,
the net_protocol chain is freed, and the modules are eventually unloaded (if they
are no longer in use).
</p>
<h4>net_protocol</h4>
<p>
The protocols are bound to a specific socket, process the outgoing buffers as needed
(i.e. add or remove headers, compute checksums, ...), and pass them on to the next
protocol. The last protocol in the chain is always a special datalink bridge protocol
that redirects the calls to the net_datalink_protocol modules.
</p>
<p>
The domain protocol (the net_protocol module that registered the domain, i.e. IPv4,
but also everything that would come after it) needs to be able to execute send_data()
and get_route() without a pointer to its net_protocol object, as those may be called
outside of the socket context.
</p>
<p>
The same holds for incoming data, which is always handled outside of the socket
context, as the actual target socket is unknown during processing. Only the top-most
protocol is able to forward the packet to the target socket(s). To receive
incoming data, a protocol must register itself as a receiving protocol with the
networking stack. The domain protocol is usually registered automatically by a
net_datalink_protocol module that knows about both ends (for example, the ARP
module is both IPv4 and Ethernet specific, and therefore registers the AF_INET
domain to receive Ethernet packets of type IP).
</p>
<h4>net_datalink_protocol</h4>
<p>
The datalink protocols are bound to a specific net_interface, and therefore to a
specific net_device as well. Outgoing data is processed so that it can be sent
via the net_device. For example, the ARP protocol replaces the sockaddr_in structures
in the buffer with sockaddr_dl structures describing the Ethernet MAC addresses of
the source and destination hosts; the ethernet_frame protocol then adds the usual
Ethernet header, and so on.
</p>
<p>
The last protocol in this chain is again a bridge: the device interface bridge protocol,
which redirects the calls to the underlying net_device.
</p>
<p>
Incoming data is handled differently again: when you want to receive data coming directly
from a device, you can register either a deframing function for it, or a
handler that will be called depending on the data type the deframing module reported.
For example, the ethernet_frame module registers an Ethernet deframing function, while
the ARP module registers a handler for Ethernet ARP packets with the device. When the
deframing function reports an <code>ETHER_TYPE_ARP</code> packet, the ARP receiving
function is called.
</p>
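<p>
The interplay of a deframing function and per-type handlers might look roughly like the
following C sketch. All identifiers are hypothetical and the registration API of the real
stack is not reproduced here; the only facts relied upon are the Ethernet frame layout
(a 14 byte header whose last two bytes carry the type in network byte order) and the
well-known type values 0x0800 for IP and 0x0806 for ARP.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETHER_HEADER_SIZE 14
#define SKETCH_ETHER_TYPE_IP     0x0800
#define SKETCH_ETHER_TYPE_ARP    0x0806
#define SKETCH_MAX_HANDLERS      8

typedef int (*sketch_receive_func)(const uint8_t *payload, size_t size);

struct sketch_handler {
    uint16_t type;                 /* frame type this handler wants */
    sketch_receive_func receive;
};

/* Hypothetical device: just a small table of registered handlers. */
struct sketch_device {
    struct sketch_handler handlers[SKETCH_MAX_HANDLERS];
    size_t handler_count;
};

/* Register a handler for one frame type; -1 if the table is full. */
int sketch_register_handler(struct sketch_device *device, uint16_t type,
    sketch_receive_func receive)
{
    assert(device != NULL && receive != NULL);

    if (device->handler_count == SKETCH_MAX_HANDLERS)
        return -1;
    device->handlers[device->handler_count].type = type;
    device->handlers[device->handler_count].receive = receive;
    device->handler_count++;
    return 0;
}

/* Deframing: report the type stored in bytes 12 and 13 of the header. */
int sketch_ethernet_deframe(const uint8_t *frame, size_t size, uint16_t *type)
{
    if (size < SKETCH_ETHER_HEADER_SIZE)
        return -1;                 /* too short to be an Ethernet frame */
    *type = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}

/* Deframe, then call the handler registered for the reported type. */
int sketch_receive_frame(const struct sketch_device *device,
    const uint8_t *frame, size_t size)
{
    uint16_t type;
    assert(device != NULL && frame != NULL);

    if (sketch_ethernet_deframe(frame, size, &type) != 0)
        return -1;
    for (size_t i = 0; i < device->handler_count; i++) {
        if (device->handlers[i].type == type) {
            return device->handlers[i].receive(
                frame + SKETCH_ETHER_HEADER_SIZE,
                size - SKETCH_ETHER_HEADER_SIZE);
        }
    }
    return -1;                     /* nobody registered for this type */
}

/* Sample handler standing in for ARP: reports the payload length. */
int sketch_arp_receive(const uint8_t *payload, size_t size)
{
    (void)payload;
    return (int)size;
}
```

<p>
After registering <code>sketch_arp_receive()</code> for
<code>SKETCH_ETHER_TYPE_ARP</code>, a 42 byte frame with type 0x0806 reaches the handler
with its 28 byte payload, while a frame of any other type is dropped because no handler
matches.
</p>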
<h4>net_route</h4>
<p>
A route determines the target interface of an outgoing packet. A route is always
owned by a specific domain, and the route is chosen by comparing the network
address of the outgoing buffer with the mask and address of the route.
</p>
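<p>
For IPv4-style addresses, the comparison against mask and address can be sketched as
below. The structure and function names are hypothetical, addresses are plain 32 bit
values in host byte order, and preferring the route with the most specific mask when
several match is an assumption of this sketch, not something the text above states.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced route entry. */
struct sketch_route {
    uint32_t address;       /* network address, host byte order */
    uint32_t mask;          /* contiguous netmask, host byte order */
    int interface_index;    /* target interface */
};

/*
 * Return the matching route with the most specific mask, or NULL if no
 * route matches the destination.
 */
const struct sketch_route *sketch_find_route(const struct sketch_route *routes,
    size_t count, uint32_t destination)
{
    const struct sketch_route *best = NULL;
    assert(routes != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        const struct sketch_route *route = &routes[i];

        /* Compare only the bits selected by the route's mask. */
        if ((destination & route->mask) != (route->address & route->mask))
            continue;

        /* For contiguous masks, a larger value means a longer prefix. */
        if (best == NULL || route->mask > best->mask)
            best = route;
    }
    return best;
}
```

<p>
With a default route (mask 0) on interface 1 and a route for 192.168.1.0/24 on
interface 2, the destination 192.168.1.5 selects interface 2, and any other destination
falls back to interface 1.
</p>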
<p>
A protocol will usually not use the routes directly, but use a net_route_info
object (see below) that makes sure the route is updated automatically
whenever the routing table changes.
</p>
<h4>net_route_info</h4>
<p>
A net_route_info is a routing helper for protocols: it stores the target address as well as the
route to be used, and has to be registered with the networking stack via
<code>register_route_info()</code>.
</p>
<p>
The stack will then automatically update the route whenever the
routing table of the domain changes, so that it always matches the target address
stored in the net_route_info. When the routing is no longer needed, you must unregister
the net_route_info again.
</p>
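<p>
The register, update, unregister cycle can be illustrated with the following C sketch.
The signature of the real <code>register_route_info()</code> is not shown in this
document, so the functions below use their own hypothetical names and a deliberately
simple domain structure; the route lookup just takes the first entry whose masked
address matches.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_INFOS 16

struct sketch_route {
    uint32_t address;
    uint32_t mask;
    int interface_index;
};

/* Hypothetical helper: a target address plus the route currently in use. */
struct sketch_route_info {
    uint32_t target;
    const struct sketch_route *route;   /* kept current by the domain */
};

struct sketch_domain {
    const struct sketch_route *routes;
    size_t route_count;
    struct sketch_route_info *infos[SKETCH_MAX_INFOS];
    size_t info_count;
};

/* First route whose masked address matches the target, or NULL. */
static const struct sketch_route *lookup(const struct sketch_domain *domain,
    uint32_t target)
{
    for (size_t i = 0; i < domain->route_count; i++) {
        const struct sketch_route *route = &domain->routes[i];
        if ((target & route->mask) == (route->address & route->mask))
            return route;
    }
    return NULL;
}

/* Start tracking info and resolve its route right away. */
int sketch_register_route_info(struct sketch_domain *domain,
    struct sketch_route_info *info)
{
    assert(domain != NULL && info != NULL);

    if (domain->info_count == SKETCH_MAX_INFOS)
        return -1;
    domain->infos[domain->info_count++] = info;
    info->route = lookup(domain, info->target);
    return 0;
}

/* Stop tracking info; its route pointer is cleared. */
void sketch_unregister_route_info(struct sketch_domain *domain,
    struct sketch_route_info *info)
{
    assert(domain != NULL && info != NULL);

    for (size_t i = 0; i < domain->info_count; i++) {
        if (domain->infos[i] == info) {
            domain->infos[i] = domain->infos[--domain->info_count];
            info->route = NULL;
            return;
        }
    }
}

/* Replace the routing table and refresh every registered info. */
void sketch_set_routes(struct sketch_domain *domain,
    const struct sketch_route *routes, size_t count)
{
    assert(domain != NULL && (routes != NULL || count == 0));

    domain->routes = routes;
    domain->route_count = count;
    for (size_t i = 0; i < domain->info_count; i++)
        domain->infos[i]->route = lookup(domain, domain->infos[i]->target);
}
```

<p>
A protocol holding a registered <code>sketch_route_info</code> never has to repeat the
lookup itself in this model: after every call to <code>sketch_set_routes()</code>, its
<code>route</code> member already points at the current match, or is <code>NULL</code>
if no route matches any more.
</p>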
<hr>
<small>
<a name="foot1">1</a> You can find the definition of the driver interface
in <a href="http://svn.berlios.de/viewcvs/haiku/haiku/branches/team/network/new_stack/headers_private_net/">headers/private/net/net_stack_driver.h</a>, as well as
the driver itself at <a href="http://svn.berlios.de/viewcvs/haiku/haiku/branches/team/network/new_stack/add-ons_kernel_drivers_network_stack/">src/add-ons/kernel/drivers/network/stack/</a><br>
<a name="foot2">2</a> <a href="http://svn.berlios.de/viewcvs/haiku/haiku/branches/team/network/new_stack/add-ons_kernel_network/stack/">src/add-ons/kernel/network/stack/</a>
</small>
</body>
</html>