haiku/docs/develop/net/Network Stack Overview.html

<html>
<body bgcolor=white>
	<h1>Haiku Network Stack Architecture</h1>
	<p>
	The Haiku Network Stack is a modular and layered networking stack, very
	similar to what you may know as BONE.
	</p>
	<p>
	The entry point when talking to the stack is through a dedicated device
	driver that publishes itself in /dev/net. The userland library libnetwork.so
	(which combines libsocket.so and libbind.so) talks directly to this
	driver, mostly via ioctl()<sup><a href="#foot1">1</a></sup>.
	</p>
	<p>
	The driver either creates sockets, or passes on every command to the socket
	module<sup><a href="#foot2">2</a></sup>. Depending on the address family and
	type of the sockets, the lower layers will be loaded and connected.
	</p>
	<p>
	For example, with a TCP/IP socket, the stack could look like this:
	<table cellspacing=1 cellpadding=5 border=0>
		<tr><td colspan=2 bgcolor="#aaaadd">Socket</td></tr>
		<tr><td bgcolor="#ccccff">TCP</td>
			<td rowspan=2 bgcolor="#ddddff"><p>Protocols<br>
				<font size="-2">defined by the socket (address family, type)</font></p>
				<font size="-2">(session, transport, network layers)</font></td>
		</tr>
		<tr><td bgcolor="#ccccff">IPv4</td></tr>
		<tr><td colspan=2 bgcolor="#ddcc88">Datalink</td></tr>
		<tr><td bgcolor="#ffee88">ARP</td>
			<td rowspan=2 bgcolor="#ffee99"><p>Datalink Protocols<br>
				<font size="-2">defined by the interface (IP address, device)</font></p>
				<font size="-2">(datalink layer)</font></td>
		</tr>
		<tr><td bgcolor="#ffee88">Ethernet framing</td></tr>
		<tr><td bgcolor="#ffdd00">Ethernet device</td><td bgcolor="#ffdd55"><font size="-2">(physical layer)</font></td></tr>
	</table>
	Here, TCP and IPv4 are net_protocol modules, while ARP and the Ethernet framing are net_datalink_protocol modules.
	All modules are connected in a chain, even though the datalink layer introduces more than
	one path (one for each interface).
	</p>
	<p>
	When sending data through a socket, a net_buffer is created in the socket module, and passed
	on to the lower levels where each protocol processes it, before passing it on to the next
	protocol in the chain. When the buffer reaches the datalink level, an accompanying net_route
	object will determine for which interface (which determines the datalink protocols in the
	chain) the buffer is destined. The route has to be specified by the upper protocols before
	the buffer gets into the datalink level - if a buffer comes in without a valid route, it
	is discarded.
	</p>
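	<p>
	The send path described above can be pictured with a minimal C sketch. All
	<code>sketch_</code> names, the structure layouts, and the header sizes are
	hypothetical stand-ins chosen for illustration; they are not the stack's
	actual net_buffer, net_protocol, or net_route definitions.
	</p>

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the stack's structures. */
struct sketch_route {
	int interface_index;               /* interface the route points to */
};

struct sketch_buffer {
	size_t size;                       /* payload plus headers added so far */
	const struct sketch_route *route;  /* must be set before the datalink level */
};

struct sketch_protocol {
	const char *name;
	size_t header_size;                /* bytes this layer prepends (assumed) */
	struct sketch_protocol *next;      /* next lower protocol, NULL at the end */
};

enum { SKETCH_OK = 0, SKETCH_NO_ROUTE = -1 };

/*
 * Walk the chain from the top-most protocol downwards. Each protocol
 * "processes" the buffer (here: only accounts for its header) and passes
 * it on to the next one.
 */
static int
sketch_send(struct sketch_protocol *top, struct sketch_buffer *buffer)
{
	for (struct sketch_protocol *p = top; p != NULL; p = p->next)
		buffer->size += p->header_size;

	/* Datalink level: a buffer without a valid route is discarded. */
	if (buffer->route == NULL)
		return SKETCH_NO_ROUTE;

	return SKETCH_OK;
}
```

	<p>
	With a hypothetical TCP entry chained to an IPv4 entry, a buffer that carries
	a route comes back with both header sizes added, while a buffer whose route
	is NULL is rejected at the datalink step.
	</p>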
	<p>
	The protocol modules are loaded and unloaded as needed. The stack itself stays loaded
	as long as there are interfaces defined - as soon as the last interface is removed,
	the stack gets unloaded (which is, of course, not yet implemented).
	</p>
	<h3>The Structures and Classes</h3>
	<h4>net_domain</h4>
	<p>
	Every supported address family gets its own domain. A domain comprises such a family,
	a net_protocol module that handles this domain, and a list of interfaces and routes.
	It also gets a name: for example, the IPv4 module registers the "internet" domain
	(AF_INET).
	</p>
	<p>
	The domain protocol module is responsible for managing the domain; it has to register
	it when it's loaded, and it has to unregister it when it is unloaded by the networking
	stack.
	</p>
	<h4>net_interface</h4>
	<p>
	An interface makes an underlying net_device accessible by the stack. When creating
	a new interface, you have to specify a domain, and a device to be used. The stack
	will then look through the registered datalink protocols, and build a chain of
	them for that interface.
	</p>
	<p>
	The interface usually gets a network address, and a route that directs buffers to
	be sent to it. If there is no route to an interface, it will never be used for
	outgoing data, but may well receive data from other hosts.
	</p>
	<p>
	An interface can be "up" (when <code>IFF_UP</code> is set in its <code>flags</code>
	member) in which case it accepts data - when that flag is not set, it will discard
	all data it gets. The interface also specifies the maximum buffer size that can be
	sent over this interface (the <code>mtu</code> member, a.k.a. maximum transmission
	unit).
	</p>
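	<p>
	As a rough illustration of the two checks mentioned above, the following C
	sketch tests the "up" flag and the <code>mtu</code> limit. The structure, the
	flag value, and the function name are assumptions made for this example; what
	the real stack does with a buffer that exceeds the mtu is not covered here.
	</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value, standing in for IFF_UP. */
#define SKETCH_IFF_UP 0x0001

/* Hypothetical, reduced view of an interface: only the members discussed. */
struct sketch_interface {
	uint32_t flags;
	size_t mtu;        /* maximum transmission unit, in bytes */
};

/*
 * Returns true if the interface is up and the buffer fits into its mtu.
 * An interface that is not up discards all data it gets.
 */
static bool
sketch_interface_accepts(const struct sketch_interface *interface,
	size_t buffer_size)
{
	if ((interface->flags & SKETCH_IFF_UP) == 0)
		return false;

	return buffer_size <= interface->mtu;
}
```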
	<p>
	Interfaces are configured via ioctl()s (SIOCAIFADDR, ...). You can use the command
	line tool "ifconfig" to do this for you.
	</p>
	<h4>net_device</h4>
	<p>
	A networking device is used to actually send and receive the buffers. It either points
	to an actual hardware device (in case of ethernet), or to a virtual device (in case of
	loopback). Every device has a unique name that identifies it. When creating a device,
	the name also decides which net_device module will be chosen; for example, everything
	that starts with "loop" will end up in the loopback device, while the ethernet device
	accepts names that start with "/dev/net/".
	</p>
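	<p>
	The name-based selection could be sketched as a simple prefix test. The enum
	and function names below are hypothetical; only the two prefixes ("loop" and
	"/dev/net/") come from the description above.
	</p>

```c
#include <string.h>

/* Hypothetical identifiers for the device modules mentioned in the text. */
enum sketch_device_kind {
	SKETCH_DEVICE_NONE,
	SKETCH_DEVICE_LOOPBACK,
	SKETCH_DEVICE_ETHERNET
};

/* Returns non-zero if name starts with prefix. */
static int
sketch_has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Pick the device module that claims the given device name. */
static enum sketch_device_kind
sketch_pick_device_module(const char *name)
{
	if (sketch_has_prefix(name, "loop"))
		return SKETCH_DEVICE_LOOPBACK;
	if (sketch_has_prefix(name, "/dev/net/"))
		return SKETCH_DEVICE_ETHERNET;

	/* No module accepts this name. */
	return SKETCH_DEVICE_NONE;
}
```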
	<p>
	A device can be shared by many interfaces at the same time. The device to be used by
	an interface is specified at the time an interface is created.
	It also has an <code>mtu</code> member that determines the upper limit of an interface's
	<code>mtu</code> as well.
	</p>
	<h4>net_buffer</h4>
	<p>
	A buffer holds exactly one packet, and has a source as well as a destination address.
	The addresses may be changed in every layer the buffer passes through. For example,
	the datalink protocols usually use sockaddr_dl structures with family AF_DLI, while
	the upper levels may use sockaddr_in structures with family AF_INET. Every protocol
	only supports a small number of address types, and it is the responsibility of the upper
	protocols to prepare the address for use in the lower protocols (and that's also a
	reason why it wouldn't work to arbitrarily stack protocols onto each other).
	</p>
	<p>
	The net_buffer module can be used to access the data within the buffer, append new
	data to the buffer, or remove chunks of data from it. Internally, the buffer consists
	of buffers of a usually fixed size (2048 bytes) that can be shared or connected as needed.
	</p>
	<h4>net_socket</h4>
	<p>
	The socket is only of interest for the net_protocol modules, as it stores options
	that may have an effect on the protocol's performance. It's the direct counterpart
	to a socket file descriptor in userland, but it has very little logic bound to it.
	</p>
	<p>
	When a socket is created, the networking stack creates a chain of net_protocol
	modules for the socket that will then do the real work. When the socket is closed,
	the net_protocol chain is freed, and the modules are eventually unloaded (if they
	are no longer in use).
	</p>
	<h4>net_protocol</h4>
	<p>
	The protocols are bound to a specific socket, process the outgoing buffers as needed
	(e.g. add or remove headers, compute checksums, ...), and pass them on to the next
	protocol. The last protocol in the chain is always a special datalink bridge protocol
	that will redirect the calls to the net_datalink_protocol modules.
	</p>
	<p>
	The domain protocol (the net_protocol module that registered the domain, e.g. IPv4,
	but also everything that would come after it) needs to be able to execute send_data()
	and get_route() without a pointer to its net_protocol object, as those may be called
	outside of the socket context.
	</p>
	<p>
	This is similar for incoming data, which is always handled outside of the socket
	context, as the actual target socket is unknown during processing. Only the top-most
	protocol will be able to forward the packet to the target socket(s). To receive
	incoming data, a protocol must register itself as a receiving protocol with the
	networking stack. The domain protocol is usually registered automatically by a
	net_datalink_protocol module that knows about both ends (for example, the ARP
	module is both IPv4 and ethernet specific, and therefore registers the AF_INET
	domain to receive ethernet packets of type IP).
	</p>
	<h4>net_datalink_protocol</h4>
	<p>
	The datalink protocols are bound to a specific net_interface, and therefore to a
	specific net_device as well. Outgoing data is processed so that it can be sent
	via the net_device. For example, the ARP protocol will replace sockaddr_in structures
	in the buffer with sockaddr_dl structures describing the ethernet MAC address of
	the source and destination hosts, the ethernet_frame protocol will add the usual
	ethernet header, etc.
	</p>
	<p>
	The last protocol in the chain is also a device interface bridge protocol that
	redirects the calls to the underlying net_device.
	</p>
	<p>
	Incoming data is handled differently again; when you want to receive data directly
	coming from a device, you can either register a deframing function for it, or a
	handler that will be called depending on what data type the deframing module reported.
	For example, the ethernet_frame module registers an ethernet deframing function, while
	the ARP module registers a handler for ethernet ARP packets with the device. When the
	deframing function reports an <code>ETHER_TYPE_ARP</code> packet, the ARP receiving
	function will be called.
	</p>
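	<p>
	The interplay of deframing function and type handlers might look roughly like
	the following C sketch. Every <code>sketch_</code> name, the fixed-size handler
	table, and the return conventions are assumptions for illustration and do not
	mirror the stack's real registration calls. The Ethernet details (14-byte
	header, big-endian type field at offset 12, type 0x0806 for ARP) are standard.
	</p>

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETHER_TYPE_IP   0x0800
#define SKETCH_ETHER_TYPE_ARP  0x0806
#define SKETCH_MAX_HANDLERS    4

typedef int (*sketch_handler)(const uint8_t *payload, size_t length);
typedef int (*sketch_deframe)(const uint8_t *frame, size_t length,
	uint16_t *type, size_t *offset);

/* Hypothetical device: one deframing function plus per-type handlers. */
struct sketch_device {
	sketch_deframe deframe;
	struct {
		uint16_t type;
		sketch_handler handler;
	} handlers[SKETCH_MAX_HANDLERS];
	size_t handler_count;
};

/* Ethernet deframing: report the frame type and where the payload starts. */
static int
sketch_ethernet_deframe(const uint8_t *frame, size_t length, uint16_t *type,
	size_t *offset)
{
	if (length < 14)
		return -1;

	*type = (uint16_t)((frame[12] << 8) | frame[13]);
	*offset = 14;
	return 0;
}

/* Example handler: reports how many payload bytes it was given. */
static int
sketch_arp_receive(const uint8_t *payload, size_t length)
{
	(void)payload;
	return (int)length;
}

/* Register a handler for one frame type with the device. */
static int
sketch_register_handler(struct sketch_device *device, uint16_t type,
	sketch_handler handler)
{
	if (device->handler_count >= SKETCH_MAX_HANDLERS)
		return -1;

	device->handlers[device->handler_count].type = type;
	device->handlers[device->handler_count].handler = handler;
	device->handler_count++;
	return 0;
}

/* Deframe an incoming frame and call the handler registered for its type. */
static int
sketch_receive(struct sketch_device *device, const uint8_t *frame,
	size_t length)
{
	uint16_t type;
	size_t offset;

	if (device->deframe == NULL
		|| device->deframe(frame, length, &type, &offset) != 0)
		return -1;

	for (size_t i = 0; i < device->handler_count; i++) {
		if (device->handlers[i].type == type)
			return device->handlers[i].handler(frame + offset, length - offset);
	}

	return -1;	/* no handler registered for this type: drop the frame */
}
```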
	<h4>net_route</h4>
	<p>
	A route determines the target interface of an outgoing packet. A route is always
	owned by a specific domain, and the route is chosen by comparing the networking
	address of the outgoing buffer with the mask and address of the route.
	</p>
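	<p>
	For IPv4, the comparison of address and mask can be sketched in C as follows.
	The structure and function names are hypothetical, addresses are assumed to be
	in host byte order, and returning the first matching entry in table order is
	an assumption of this sketch, not a statement about how the stack ranks
	routes.
	</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced route entry for the AF_INET domain. */
struct sketch_route {
	uint32_t address;     /* network address, host byte order */
	uint32_t mask;        /* network mask, host byte order */
	int interface_index;  /* target interface of this route */
};

/* A route matches if destination and route agree on all masked bits. */
static bool
sketch_route_matches(const struct sketch_route *route, uint32_t destination)
{
	return (destination & route->mask) == (route->address & route->mask);
}

/* Return the first route that matches the destination, or NULL. */
static const struct sketch_route *
sketch_find_route(const struct sketch_route *routes, size_t count,
	uint32_t destination)
{
	for (size_t i = 0; i < count; i++) {
		if (sketch_route_matches(&routes[i], destination))
			return &routes[i];
	}

	return NULL;
}
```

	<p>
	A route with mask 0 matches every destination, which is how a default route
	can be expressed in such a table.
	</p>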
	<p>
	A protocol will usually not use the routes directly, but use a net_route_info
	object (see below), that will make sure that the route is updated automatically
	whenever the routing table is changed.
	</p>
	<h4>net_route_info</h4>
	<p>
	A routing helper for protocol usage: it stores the target address as well as the
	route to be used, and has to be registered with the networking stack via
	<code>register_route_info()</code>.
	</p>
	<p>
	Then, the stack will automatically update the route as needed, whenever the
	routing table of the domain changes; it will always match the address specified
	there. When the routing is no longer needed, you must unregister the net_route_info
	again.
	</p>
	<hr>
	<small>
	<a name="foot1">1</a> You can find the definition of the driver interface
		in <a href="http://svn.berlios.de/viewcvs/haiku/haiku/branches/team/network/new_stack/headers_private_net/">headers/private/net/net_stack_driver.h</a>, as well as
		the driver itself at <a href="http://svn.berlios.de/viewcvs/haiku/haiku/branches/team/network/new_stack/add-ons_kernel_drivers_network_stack/">src/add-ons/kernel/drivers/network/stack/</a><br>
	<a name="foot2">2</a> <a href="http://svn.berlios.de/viewcvs/haiku/haiku/branches/team/network/new_stack/add-ons_kernel_network/stack/">src/add-ons/kernel/network/stack/</a>
	</small>
</body>
</html>
Added the basic overview for the stack that I posted to the list. git-svn-id: file:///srv/svn/repos/haiku/haiku/trunk@18238 a95241bf-73f2-0310-859d-f6bbb57e9c96 2006-07-21 13:34:45 +04:00