<html>
<body bgcolor=white>
<h1>Haiku Network Stack Architecture</h1>
<p>
The Haiku Network Stack is a modular and layered networking stack, very
similar to what you may know as BONE.
</p>
<p>
The entry point when talking to the stack is through a dedicated device
driver that publishes itself in /dev/net. The userland library libnetwork.so
(which combines libsocket.so and libbind.so) talks directly to this
driver, mostly via ioctl()<sup><a href="#foot1">1</a></sup>.
</p>
<p>
The driver either creates sockets, or passes every command on to the socket
module<sup><a href="#foot2">2</a></sup>. Depending on the address family and
type of the socket, the lower layers will be loaded and connected.
</p>
<p>
For example, with a TCP/IP socket, the stack could look like this:
<table cellspacing=1 cellpadding=5 border=0>
<tr><td colspan=2 bgcolor="#aaaadd">Socket</td></tr>
<tr><td bgcolor="#ccccff">TCP</td>
<td rowspan=2 bgcolor="#ddddff">Protocols<br>
<font size="-2">defined by the socket (address family, type)<br>
(session, transport, network layers)</font></td>
</tr>
<tr><td bgcolor="#ccccff">IPv4</td></tr>
<tr><td colspan=2 bgcolor="#ddcc88">Datalink</td></tr>
<tr><td bgcolor="#ffee88">ARP</td>
<td rowspan=2 bgcolor="#ffee99">Datalink Protocols<br>
<font size="-2">defined by the interface (IP address, device)<br>
(datalink layer)</font></td>
</tr>
<tr><td bgcolor="#ffee88">Ethernet framing</td></tr>
<tr><td bgcolor="#ffdd00">Ethernet device</td><td bgcolor="#ffdd55"><font size="-2">(physical layer)</font></td></tr>
</table>
Here, TCP and IPv4 are net_protocol modules, while ARP and the Ethernet framing are
net_datalink_protocol modules. All modules are connected in a chain, even though the
datalink layer introduces more than one path (one for each interface).
</p>
<p>
When sending data through a socket, a net_buffer is created in the socket module and passed
on to the lower levels, where each protocol processes it before passing it on to the next
protocol in the chain. The last protocol in the chain is always a domain protocol; it
forwards the buffers directly to the datalink module. When the buffer reaches the datalink
level, an accompanying net_route object determines the interface for which the buffer is
destined (and thereby the datalink protocols in the chain). The route has to be specified
by the upper protocols before the buffer enters the datalink level; a buffer that arrives
without a valid route is discarded.
</p>
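<p>
The send path described above can be sketched as a toy model in C. This is only an
illustration under stated assumptions: every <code>toy_</code> name is hypothetical and
none of the types match the stack's real structures. Each protocol merely accounts
for the size of the header it would add, and the datalink entry point rejects buffers
that arrive without a route.
</p>

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the stack's real structures. */
struct toy_route {
    int interface_index;            /* which interface the buffer is destined for */
};

struct toy_buffer {
    size_t header_bytes;            /* total size of the headers added so far */
    const struct toy_route *route;  /* must be set before the datalink level */
};

struct toy_protocol {
    const char *name;
    size_t header_size;             /* bytes this protocol prepends */
    struct toy_protocol *next;      /* NULL after the last (domain) protocol */
};

enum { TOY_OK = 0, TOY_NO_ROUTE = -1 };

/* Datalink entry point: a buffer without a valid route is discarded. */
int toy_datalink_send(struct toy_buffer *buffer)
{
    if (buffer->route == NULL)
        return TOY_NO_ROUTE;
    return TOY_OK;
}

/* Walks the chain from the top-most protocol down to the domain protocol,
 * letting each one "process" the buffer, then hands it to the datalink level. */
int toy_send(struct toy_protocol *protocol, struct toy_buffer *buffer)
{
    for (; protocol != NULL; protocol = protocol->next)
        buffer->header_bytes += protocol->header_size;
    return toy_datalink_send(buffer);
}
```

<p>
With a two-element chain (a TCP stand-in followed by an IPv4 stand-in), calling
<code>toy_send()</code> on a buffer whose <code>route</code> is <code>NULL</code> returns
<code>TOY_NO_ROUTE</code>, mirroring the rule that unrouted buffers are dropped.
</p>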
<p>
The protocol modules are loaded and unloaded as needed. The stack itself stays loaded
as long as there are interfaces defined; as soon as the last interface is removed,
the stack is meant to be unloaded (this unloading is not yet implemented).
</p>
<h3>The Structures and Classes</h3>
<h4>net_domain</h4>
<p>
Every supported address family gets its own domain. A domain comprises such a family,
a net_protocol module that handles this domain, and a list of interfaces and routes.
It also gets a name: for example, the IPv4 module registers the "internet" domain
(AF_INET).
</p>
<p>
The domain protocol module is responsible for managing the domain: it has to register
the domain when it is loaded, and unregister it when it is unloaded by the networking
stack.
</p>
<h4>net_interface</h4>
<p>
An interface makes an underlying net_device accessible to the stack. When creating
a new interface, you have to specify a domain and a device to be used. The stack
will then look through the registered datalink protocols and build a chain of
them for that interface.
</p>
<p>
The interface usually gets a network address, and a route that directs buffers to
be sent to it. If there is no route to an interface, it will never be used for
outgoing data, but may well receive data from other hosts.
</p>
<p>
An interface can be "up" (when <code>IFF_UP</code> is set in its <code>flags</code>
member), in which case it accepts data; when that flag is not set, it discards
all data it gets. The interface also specifies the maximum buffer size that can be
sent over it (the <code>mtu</code> member, a.k.a. maximum transmission
unit).
</p>
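<p>
As a rough sketch of these two rules, the following C fragment checks whether a toy
interface would accept a buffer of a given size. The structure, the function, and the
<code>TOY_IFF_UP</code> constant are hypothetical stand-ins chosen for illustration
(the real flag is <code>IFF_UP</code>), and rejecting oversized buffers outright is an
assumption of this sketch, not a statement about what the stack does with them.
</p>

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for IFF_UP; the value is chosen for illustration only. */
#define TOY_IFF_UP 0x1u

/* Hypothetical interface with just the two members discussed in the text. */
struct toy_interface {
    unsigned int flags;  /* TOY_IFF_UP set means the interface is "up" */
    size_t mtu;          /* largest buffer that can be sent over it */
};

/* Returns true if the interface is up and the buffer fits into its mtu.
 * Assumption: oversized buffers are simply refused in this sketch. */
bool toy_interface_accepts(const struct toy_interface *interface, size_t size)
{
    if ((interface->flags & TOY_IFF_UP) == 0)
        return false;  /* a "down" interface discards everything */
    return size <= interface->mtu;
}
```

<p>
For an interface with <code>flags = TOY_IFF_UP</code> and <code>mtu = 1500</code>, a
1500 byte buffer is accepted and a 1501 byte buffer is not; with the flag cleared,
nothing is accepted.
</p>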
<p>
Interfaces are configured via ioctl()s (SIOCAIFADDR, ...). You can use the command
line tool "ifconfig" to do this for you.
</p>
<h4>net_device</h4>
<p>
A networking device is used to actually send and receive the buffers. It either points
to an actual hardware device (in the case of ethernet), or to a virtual device (in the case
of loopback). Every device has a unique name that identifies it. When creating a device,
the name also decides which net_device module will be chosen; for example, everything
that starts with "loop" will end up in the loopback device, while the ethernet device
accepts names that start with "/dev/net/".
</p>
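<p>
The name-based selection can be illustrated with a small prefix match. This is a
minimal sketch: the enum and the function are hypothetical, and only the two prefixes
("loop" and "/dev/net/") come from the description above. How the real stack asks its
net_device modules is not shown here.
</p>

```c
#include <string.h>

/* Hypothetical labels for the two device modules mentioned in the text. */
enum toy_device_kind {
    TOY_DEVICE_NONE,
    TOY_DEVICE_LOOPBACK,
    TOY_DEVICE_ETHERNET
};

/* Picks a device module purely from the prefix of the device name. */
enum toy_device_kind toy_match_device(const char *name)
{
    if (strncmp(name, "loop", 4) == 0)
        return TOY_DEVICE_LOOPBACK;
    if (strncmp(name, "/dev/net/", 9) == 0)
        return TOY_DEVICE_ETHERNET;
    return TOY_DEVICE_NONE;  /* no module in this sketch accepts the name */
}
```

<p>
A name such as "loop" selects the loopback stand-in, and a made-up name such as
"/dev/net/example/0" selects the ethernet stand-in.
</p>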
<p>
A device can be shared by many interfaces at the same time. The device to be used by
an interface is specified at the time the interface is created.
The device also has an <code>mtu</code> member, which sets the upper limit for the
<code>mtu</code> of every interface that uses it.
</p>
<h4>net_buffer</h4>
<p>
A buffer holds exactly one packet, and has a source as well as a destination address.
The addresses may be changed in every layer the buffer passes through. For example,
the datalink protocols usually use sockaddr_dl structures with family AF_DLI, while
the upper levels may use sockaddr_in structures with family AF_INET. Every protocol
only supports a small number of address types, and it is the responsibility of the upper
protocols to prepare the address for use in the lower protocols (which is also one
reason why protocols cannot be stacked onto each other arbitrarily).
</p>
<p>
The net_buffer module can be used to access the data within the buffer, append new
data to the buffer, or remove chunks of data from it. Internally, the buffer consists
of smaller buffers, usually of a fixed size (2048 bytes), that can be shared or
connected as needed.
</p>
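<p>
To get a feeling for the fixed-size layout, the following sketch computes how many
2048 byte pieces a packet of a given size would occupy. It assumes, purely for
illustration, that every byte of a piece is available for packet data; the real
net_buffer implementation may reserve part of each piece for bookkeeping. The names
are hypothetical.
</p>

```c
#include <stddef.h>

/* Size of one internal piece, as given in the text. */
#define TOY_CHUNK_SIZE 2048

/* Number of fixed-size pieces needed to hold packet_size bytes
 * (rounded up; assumes the whole piece is usable for data). */
size_t toy_chunks_needed(size_t packet_size)
{
    return (packet_size + TOY_CHUNK_SIZE - 1) / TOY_CHUNK_SIZE;
}
```

<p>
Under that assumption, a 1500 byte packet fits into a single piece, while a 2049 byte
packet needs two connected pieces.
</p>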
<h4>net_socket</h4>
<p>
The socket is only of interest to the net_protocol modules, as it stores options
that may affect the protocol's performance. It is the direct counterpart
to a socket file descriptor in userland, but has very little logic bound to it.
</p>
<p>
When a socket is created, the networking stack creates a chain of net_protocol
modules for the socket that will then do the real work. When the socket is closed,
the net_protocol chain is freed, and the modules are eventually unloaded (if they
are no longer in use).
</p>
<h4>net_protocol</h4>
<p>
The protocols are bound to a specific socket, process the outgoing buffers as needed
(i.e. add or remove headers, compute checksums, ...), and pass them on to the next
protocol. The last protocol in the chain is always a domain protocol that will forward
the calls to the datalink module directly, if needed.
</p>
<p>
A domain protocol is a net_protocol that has registered a domain, e.g. IPv4. Unlike ordinary
protocols, domain protocols have some special requirements:
<ul>
<li>they need to be able to execute send_data() and get_domain() without a pointer to
their net_protocol object, as those may be called outside of the socket context.</li>
<li>as mentioned, they do not talk to the next protocol in the chain (as they are
always the last one), but to the datalink module directly.</li>
</ul>
</p>
<p>
Similar to the need to perform send_data() outside of the socket context, all protocols
that can receive data need to handle incoming data without the socket context: incoming
data is always handled outside of the socket context, as the actual target socket
is unknown during processing.
</p>
<p>
Only the top-most protocol will be able to forward the packet to the target socket(s).
To receive incoming data, a protocol must register itself as a receiving protocol with
the networking stack. The domain protocol is usually registered automatically by a
net_datalink_protocol module that knows about both ends (for example, the ARP
module is both IPv4 and ethernet specific, and therefore registers the AF_INET
domain to receive ethernet packets of type IP).
</p>
<h4>net_datalink_protocol</h4>
<p>
The datalink protocols are bound to a specific net_interface, and therefore to a
specific net_device as well. Outgoing data is processed so that it can be sent
via the net_device. For example, the ARP protocol will replace sockaddr_in structures
in the buffer with sockaddr_dl structures describing the ethernet MAC addresses of
the source and destination hosts, the ethernet_frame protocol will add the usual
ethernet header, etc.
</p>
<p>
The last protocol in the chain is also a special device interface bridge protocol
that redirects the calls to the underlying net_device.
</p>
<p>
Incoming data is handled differently again: when you want to receive data directly
coming from a device, you can either register a deframing function for it, or a
handler that will be called depending on what data type the deframing module reported.
For example, the ethernet_frame module registers an ethernet deframing function, while
the ARP module registers a handler for ethernet ARP packets with the device. When the
deframing function reports an <code>ETHER_TYPE_ARP</code> packet, the ARP receiving
function will be called.
</p>
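<p>
The interplay between a deframing function and type-specific handlers can be sketched
as follows. This is a toy model, not the stack's API: all <code>toy_</code> names, the
fixed-size handler table, and the counting ARP handler are hypothetical. What it does
take from the Ethernet format is the 14 byte header with the EtherType in bytes 12
and 13, and the values 0x0800 (IPv4) and 0x0806 (ARP).
</p>

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ETHER_HEADER_SIZE 14
#define TOY_ETHER_TYPE_IP     0x0800
#define TOY_ETHER_TYPE_ARP    0x0806
#define TOY_MAX_HANDLERS      8

/* Hypothetical handler signature: receives the payload after deframing. */
typedef int (*toy_handler)(const uint8_t *payload, size_t length);

struct toy_handler_entry {
    uint16_t type;
    toy_handler handler;
};

/* Hypothetical device that only keeps a table of registered handlers. */
struct toy_device {
    struct toy_handler_entry handlers[TOY_MAX_HANDLERS];
    size_t handler_count;
};

/* Registers a handler for one packet type; returns -1 if the table is full. */
int toy_register_handler(struct toy_device *device, uint16_t type,
    toy_handler handler)
{
    if (device->handler_count == TOY_MAX_HANDLERS)
        return -1;
    device->handlers[device->handler_count].type = type;
    device->handlers[device->handler_count].handler = handler;
    device->handler_count++;
    return 0;
}

/* Deframing function: reports the EtherType of an Ethernet frame. */
int toy_ethernet_deframe(const uint8_t *frame, size_t length, uint16_t *type)
{
    if (length < TOY_ETHER_HEADER_SIZE)
        return -1;  /* too short to carry an Ethernet header */
    *type = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}

/* Deframes the data, then calls the handler registered for the reported type.
 * Returns -1 if deframing fails or no handler matches. */
int toy_receive(struct toy_device *device, const uint8_t *frame, size_t length)
{
    uint16_t type;
    if (toy_ethernet_deframe(frame, length, &type) != 0)
        return -1;
    for (size_t i = 0; i < device->handler_count; i++) {
        if (device->handlers[i].type == type) {
            return device->handlers[i].handler(frame + TOY_ETHER_HEADER_SIZE,
                length - TOY_ETHER_HEADER_SIZE);
        }
    }
    return -1;
}

/* Example handler standing in for the ARP receiving function. */
int toy_arp_packets_seen = 0;

int toy_arp_receive(const uint8_t *payload, size_t length)
{
    (void)payload;
    (void)length;
    toy_arp_packets_seen++;
    return 0;
}
```

<p>
After registering <code>toy_arp_receive</code> for <code>TOY_ETHER_TYPE_ARP</code>,
a frame whose type field is 0x0806 ends up in that handler, while a frame of any
unregistered type is rejected by <code>toy_receive()</code>.
</p>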
<h4>net_route</h4>
<p>
A route determines the target interface of an outgoing packet. A route is always
owned by a specific domain, and the route is chosen by comparing the networking
address of the outgoing buffer with the mask and address of the route.
</p>
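<p>
For IPv4-style addresses, comparing a destination with the mask and address of a
route might look like the sketch below. The structure and function are hypothetical,
addresses are plain 32 bit integers in host byte order, and returning the first
matching route is an assumption of this sketch; the text above does not say how the
stack orders or prioritizes routes.
</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route: network address, netmask, and the target interface. */
struct toy_route {
    uint32_t address;
    uint32_t mask;
    int interface_index;
};

/* Returns the first route whose masked address equals the masked destination,
 * or NULL if no route matches. Assumption: the table is ordered so that more
 * specific routes come before less specific ones. */
const struct toy_route *toy_find_route(const struct toy_route *routes,
    size_t count, uint32_t destination)
{
    for (size_t i = 0; i < count; i++) {
        if ((destination & routes[i].mask)
                == (routes[i].address & routes[i].mask))
            return &routes[i];
    }
    return NULL;
}
```

<p>
With a table holding 192.168.1.0/24 followed by a default route (address 0, mask 0),
the destination 192.168.1.23 matches the first entry, and any other destination falls
through to the default route.
</p>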
<p>
A protocol will usually not use the routes directly, but use a net_route_info
object (see below) that makes sure the route is updated automatically
whenever the routing table is changed.
</p>
<h4>net_route_info</h4>
<p>
A routing helper for protocol usage: it stores the target address as well as the
route to be used, and has to be registered with the networking stack via
<code>register_route_info()</code>.
</p>
<p>
The stack will then automatically update the route as needed whenever the
routing table of the domain changes, so that it always matches the address specified
there. When the routing is no longer needed, you must unregister the net_route_info
again.
</p>
<hr>
<small>
<a name="foot1">1</a> You can find the definition of the driver interface
in <a href="https://git.haiku-os.org/haiku/tree/headers/private/net">headers/private/net/net_stack_interface.h</a>, as well as
the driver itself at <a href="https://git.haiku-os.org/haiku/tree/src/add-ons/kernel/drivers/network">src/add-ons/kernel/drivers/network</a><br>
<a name="foot2">2</a> <a href="https://git.haiku-os.org/haiku/tree/src/add-ons/kernel/network/stack">src/add-ons/kernel/network/stack/</a>
</small>
</body>
</html>