toaruos

Table of Contents

Why ELF Relocatable Objects?
Symbol Tables
Module Loading
Followup
Modules depending on Modules

This page is incomplete; please come back later when it's done.

ToaruOS uses ELF relocatable object files (the format you get with cc -c) to provide loadable kernel modules.

Why ELF Relocatable Objects?

When most people think about dynamically loading code, they turn to shared objects. In a kernel, however, the benefits of shared objects largely do not apply, and the complexity of loading them only makes matters worse. Shared objects were designed to support userspace applications, particularly cases where many applications use the same library at the same time. A key feature is that a shared object can be loaded into memory once and "relocated" for many different users through an offset table, rather than by modifying the code directly. The cost is additional memory operations at runtime, because every symbol reference must first go through the offset table.

In a kernel, we have no need for multiple instances of a module; each one is loaded once. Relocatable objects are relocated by modifying the code itself, which means the loaded image can't be shared across multiple virtual address spaces. It also means that symbol references are direct, because the addresses are patched into the code instead of being looked up in a table. The trade-off is that there are more relocations to perform at load time (one for every reference to a symbol, rather than one per symbol in the offset table), but runtime speed is improved.

Symbol Tables

In order to link modules into the kernel at runtime, we need access to a symbol table. ELF binaries normally include one, and some Multiboot implementations let you access the kernel's ELF symbol table at runtime, but others (e.g. QEMU) do not. Because ToaruOS was developed with QEMU as its primary testing tool, it does not rely on the built-in ELF table. Instead, a symbol table is generated through a two-pass process when building the kernel. A stub kernel without the table is built first to obtain a list of all of the symbol names in the kernel. That list is dumped into a generated assembly file as an array of name-pointer pairs, which is built into an object and linked into the final kernel; the final link resolves all of the addresses and produces a complete symbol table. This approach also has the advantage that the kernel binary can easily be converted to other formats, which may not have a symbol table. There are also ways to get at the kernel's ELF table under restricted Multiboot implementations, such as presenting the ELF binary as a raw binary, which you can try as well.
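As a rough illustration of the name-pointer table described above, here is a minimal C sketch. The struct layout, the `ksym_lookup` helper, and the two example kernel objects are all hypothetical stand-ins; the real table is emitted as generated assembly and its exact layout is not described on this page.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical entry layout: one name-pointer pair per kernel symbol. */
struct ksym {
    const char *name;
    const void *addr;
};

/* Example kernel objects standing in for real exports (assumed names).
 * Functions would be recorded the same way, as raw addresses. */
static int kernel_tick_count;
static const char kernel_banner[] = "example kernel";

/* In a real build this array would come from the generated file
 * produced after the first (stub) link pass. */
static const struct ksym kernel_symbols[] = {
    { "kernel_tick_count", &kernel_tick_count },
    { "kernel_banner",     kernel_banner },
    { NULL, NULL }, /* terminator */
};

/* Linear search by name; returns NULL when the symbol is not present. */
static const void *ksym_lookup(const char *name) {
    for (const struct ksym *s = kernel_symbols; s->name != NULL; s++) {
        if (strcmp(s->name, name) == 0) {
            return s->addr;
        }
    }
    return NULL;
}
```

A loader could walk such a table once at boot and copy the entries into whatever lookup structure it prefers.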

Module Loading

ToaruOS supports loading modules from memory (Multiboot modules) or from the file system (in which case they are copied into memory). Due to the way relocatable modules work, we can generally load them anywhere (page-aligned) in memory and they'll be ready to go. A more mature loader would only load the relevant sections into memory, but ToaruOS's loader was designed with Multiboot in mind and thus uses the module files in place.

If you have implemented an ELF loader for userspace applications, the process for relocatable objects should look similar. Be sure to provide a BSS (and zero it), especially if you're using the file in place, since the BSS occupies no space in the file. Keep track of the symbol and string tables, as you need them to perform the actual relocations and to provide symbols for other modules.
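The in-place approach and the BSS requirement can be sketched roughly as follows. This is an assumption-laden illustration, not the ToaruOS loader: `place_sections` is a hypothetical helper, `calloc` stands in for a kernel allocator, and run-time addresses are kept in a side array so that the sketch also works on a 64-bit host.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SHT_NOBITS 8 /* section occupies no space in the file (e.g. .bss) */

/* Minimal copy of the ELF32 section header, to keep the sketch self-contained. */
typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset;
    uint32_t sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} Elf32_Shdr;

/* Fill bases[i] with the run-time address of section i.
 * Returns 0 on success, -1 if a BSS allocation fails. */
static int place_sections(uint8_t *image, const Elf32_Shdr *shdrs,
                          size_t count, uintptr_t *bases) {
    for (size_t i = 0; i < count; i++) {
        if (shdrs[i].sh_type == SHT_NOBITS) {
            /* No bytes in the file: allocate zeroed storage instead. */
            size_t size = shdrs[i].sh_size ? shdrs[i].sh_size : 1;
            void *bss = calloc(1, size);
            if (bss == NULL) {
                return -1;
            }
            bases[i] = (uintptr_t)bss;
        } else {
            /* Use the section's bytes where they already sit in the image. */
            bases[i] = (uintptr_t)(image + shdrs[i].sh_offset);
        }
    }
    return 0;
}
```

A fuller loader would also honor `sh_addralign` and skip sections that are not needed at run time.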

ToaruOS also includes support for module dependencies. A section is used to store the string names of modules that a new module expects to have already been loaded. ToaruOS does not resolve dependencies on its own - modules must be manually loaded in the correct order - but it will identify when a module has unresolved dependencies.
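A dependency check along these lines might look like the sketch below. The section format assumed here (NUL-terminated names stored back to back) and the `first_missing_dep` helper are illustrative guesses; the page does not specify how the names are encoded.

```c
#include <stddef.h>
#include <string.h>

/* Return the first dependency name not found in `loaded`, or NULL if
 * every dependency is satisfied. `deps` is the raw section contents,
 * assumed to be NUL-terminated names stored back to back. */
static const char *first_missing_dep(const char *deps, size_t size,
                                     const char *const *loaded,
                                     size_t nloaded) {
    size_t off = 0;
    while (off < size) {
        const char *name = deps + off;
        const char *end = memchr(name, '\0', size - off);
        if (end == NULL) {
            /* Malformed section: the last name has no terminator. */
            return "(unterminated dependency name)";
        }
        off += (size_t)(end - name) + 1;
        if (*name == '\0') {
            continue; /* skip padding bytes */
        }
        int found = 0;
        for (size_t i = 0; i < nloaded; i++) {
            if (strcmp(loaded[i], name) == 0) {
                found = 1;
                break;
            }
        }
        if (!found) {
            return name;
        }
    }
    return NULL;
}
```

Returning the missing name, rather than a plain yes/no, lets the loader report which module needs to be loaded first.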

Go through each symbol in the module's symbol table and determine whether it defines a new symbol or refers to an external one; for external references, verify that the symbol exists. ToaruOS uses a hashmap from symbol names to addresses, but a more mature implementation would also want to track symbol types and export flags.
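The symbol pass could be sketched as below. A small fixed array stands in for the hashmap, and `sym_add`, `sym_find`, and `resolve_symbols` are hypothetical names. Exporting only global symbols and treating reserved section indices as absolute values are simplifying assumptions of this sketch, not a description of what ToaruOS does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHN_UNDEF     0      /* symbol is defined elsewhere */
#define SHN_LORESERVE 0xff00 /* start of reserved section indices */
#define STB_GLOBAL    1
#define MAX_SYMS      64

typedef struct {
    uint32_t st_name;  /* offset of the name in the string table */
    uint32_t st_value; /* in a relocatable object: offset in its section */
    uint32_t st_size;
    uint8_t  st_info;  /* binding in the high 4 bits, type in the low 4 */
    uint8_t  st_other;
    uint16_t st_shndx; /* defining section, or SHN_UNDEF */
} Elf32_Sym;

/* Toy stand-in for the loader's name-to-address hashmap. */
static struct { const char *name; uintptr_t addr; } g_syms[MAX_SYMS];
static size_t g_nsyms;

static int sym_add(const char *name, uintptr_t addr) {
    if (g_nsyms == MAX_SYMS) return -1;
    g_syms[g_nsyms].name = name;
    g_syms[g_nsyms].addr = addr;
    g_nsyms++;
    return 0;
}

static int sym_find(const char *name, uintptr_t *out) {
    for (size_t i = 0; i < g_nsyms; i++) {
        if (strcmp(g_syms[i].name, name) == 0) {
            *out = g_syms[i].addr;
            return 1;
        }
    }
    return 0;
}

/* Compute resolved[i] for every symbol. Returns -1 on success, or the
 * index of the first external symbol that could not be found. */
static long resolve_symbols(const Elf32_Sym *syms, size_t count,
                            const char *strtab,
                            const uintptr_t *section_bases,
                            uintptr_t *resolved) {
    for (size_t i = 1; i < count; i++) { /* entry 0 is the null symbol */
        const char *name = strtab + syms[i].st_name;
        uint16_t shndx = syms[i].st_shndx;
        if (shndx == SHN_UNDEF) {
            /* External reference: it must already be known. */
            if (!sym_find(name, &resolved[i])) return (long)i;
        } else if (shndx >= SHN_LORESERVE) {
            /* Reserved index (e.g. absolute): take the value as-is. */
            resolved[i] = syms[i].st_value;
        } else {
            /* Defined here: section base plus offset. */
            resolved[i] = section_bases[shndx] + syms[i].st_value;
            if ((syms[i].st_info >> 4) == STB_GLOBAL) {
                sym_add(name, resolved[i]); /* visible to later modules */
            }
        }
    }
    return -1;
}
```

Because defined global symbols are added to the same table that external references are looked up in, a module loaded later can refer to symbols from one loaded earlier.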

The final step in loading is to perform relocations. This involves going through a relocation table, determining which relocation function to run (there are only two important ones for 32-bit x86), performing the requested operation, and storing the result in the binary image.
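The two relocation types in question are presumably `R_386_32` (absolute, S + A) and `R_386_PC32` (PC-relative, S + A - P), which are the ones a compiler typically emits for non-PIC 32-bit x86 objects. A minimal sketch of applying one REL-style entry follows; `apply_rel` is a hypothetical helper, and the section's run-time address is passed separately as a 32-bit value so the sketch can also be exercised on a 64-bit host.

```c
#include <stdint.h>
#include <string.h>

#define R_386_32   1 /* S + A     : absolute address */
#define R_386_PC32 2 /* S + A - P : relative to the patched location */

typedef struct {
    uint32_t r_offset; /* offset of the field to patch within the section */
    uint32_t r_info;   /* (symbol index << 8) | relocation type */
} Elf32_Rel;

#define ELF32_R_SYM(i)  ((i) >> 8)
#define ELF32_R_TYPE(i) ((uint8_t)(i))

/* Apply one relocation to `section`.
 *   section_addr: run-time address of the section (used to compute P)
 *   sym_addr:     resolved address of the referenced symbol (S)
 * Returns 0 on success, -1 for an unsupported relocation type. */
static int apply_rel(uint8_t *section, uint32_t section_addr,
                     const Elf32_Rel *rel, uint32_t sym_addr) {
    uint8_t *where = section + rel->r_offset;
    uint32_t place = section_addr + rel->r_offset; /* P */
    uint32_t addend;                               /* A */
    uint32_t value;

    /* REL entries keep the addend in the field being patched. */
    memcpy(&addend, where, sizeof addend);

    switch (ELF32_R_TYPE(rel->r_info)) {
    case R_386_32:
        value = sym_addr + addend;
        break;
    case R_386_PC32:
        value = sym_addr + addend - place;
        break;
    default:
        return -1;
    }

    /* Store the result back into the binary image. */
    memcpy(where, &value, sizeof value);
    return 0;
}
```

In a loader, `ELF32_R_SYM(rel->r_info)` selects the symbol whose resolved address is passed as `sym_addr`, and the section being patched is the one named by the relocation section's `sh_info` field.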

Followup

After loading a module, you probably want to execute some code, similar to a main function. ToaruOS uses a specially named symbol pointing to a struct with a module name, an init function (run when the module is loaded), and a fini ("finish") function (run when the module is unloaded, in theory; ToaruOS doesn't really support unloading yet).
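Such a struct might look roughly like the sketch below. The struct name, field names, the `module_info_hello` symbol, and the `module_start` helper are all hypothetical; only the three pieces of information (name, init, fini) come from the description above.

```c
#include <stddef.h>

/* Hypothetical layout of the per-module descriptor. */
struct module_info {
    const char *name;
    int (*init)(void); /* run when the module is loaded */
    int (*fini)(void); /* run when the module is unloaded */
};

/* --- Module side: an example module describing itself. --- */
static int hello_loaded;

static int hello_init(void) {
    hello_loaded = 1;
    return 0;
}

static int hello_fini(void) {
    hello_loaded = 0;
    return 0;
}

/* The loader would find this through its specially named symbol. */
struct module_info module_info_hello = {
    "hello", hello_init, hello_fini
};

/* --- Loader side: run init once relocations are done. ---
 * info_addr is the address the special symbol resolved to.
 * Returns init's result, or -1 if there is nothing to run. */
static int module_start(void *info_addr) {
    struct module_info *info = info_addr;
    if (info == NULL || info->init == NULL) {
        return -1;
    }
    return info->init();
}
```

Keeping the entry points in a data structure, rather than relying on fixed function names, means several modules can coexist without their init functions colliding in the shared symbol table.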

Modules depending on Modules

As ToaruOS uses a hashmap internally to track symbols, it can add symbols from modules, and thus other modules can depend on those new symbols.