What Happens When
Kevin Lange edited this page 2015-10-31 11:50:03 -07:00

This article is a personal take on the "What Happens When..." question, answered from the perspective of ToaruOS. As of 2 October 2015, ToaruOS does not have a graphical web browser ported, so we will answer the question "What happens when I run fetch http://www.toaruos.org/ in my terminal?" instead, substituting the challenges of HTML, CSS, DOMs, etc. with the added task of loading a binary (fetch). Each step of the process will include a reference to a source file in ToaruOS (kernel, userspace, C library glue...). It should be noted before we begin that ToaruOS's network stack is very new and still missing many critical features, and we'll be skipping over anything that ToaruOS is incapable of doing at the moment. Note also that many of the steps in this process are repeated several times throughout - we will only explain them in detail the first time.

You can watch a presentation of this on YouTube.

Initial State

Before we begin, let's establish the current state of things:

We're at a terminal.

Our shell is in the middle of a read system call instigated by a call to fgetc(stdin) in the rline library that provides line editing. The file descriptor for stdin points to a pty slave device, which was created when the terminal application called syscall_openpty. Thus, sys_read will call the read method of the pty slave. Normally, a PTY device would be in canonical input mode by default, but before we called rline in the shell, we disabled canonical input and input echoing. VMIN is still set to 1, its default value, so we'll read precisely one character from the PTY's ring buffer. As there is nothing available to read in the ring buffer, we will sleep until a writer wakes us up.
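To make the PTY mode concrete, here is a minimal sketch of how a shell might derive that mode using the standard termios interface. The helper name make_line_edit_mode is hypothetical and this is not ToaruOS's actual code; setting VMIN and VTIME explicitly is also an assumption, since the text above only says VMIN is left at its default of 1.

```c
#include <termios.h>

/* Hypothetical helper: start from the current PTY settings and return a copy
 * with canonical input and echo turned off, as the shell does before calling
 * into its line editor. */
static struct termios make_line_edit_mode(struct termios current) {
    struct termios t = current;
    t.c_lflag &= ~(tcflag_t)(ICANON | ECHO); /* no line buffering, no echo */
    t.c_cc[VMIN]  = 1;  /* assumption: read returns once one byte is available */
    t.c_cc[VTIME] = 0;  /* assumption: no inter-byte timeout */
    return t;
}
```

A caller would typically fetch the current settings with tcgetattr on the stdin file descriptor, pass them through this helper, and apply the result with tcsetattr.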

Meanwhile, our terminal application is doing several things: It is blinking a cursor in a thread that sleeps at regular intervals. Another thread is reading from the PTY master waiting for new terminal output to process and render. A third thread is waiting for events from the window compositor. We don't quite care about the first two threads at the moment, but we'll get to both of them later. The third thread, however, will be critical to the first step in our walkthrough, so we should talk about its current state: Waiting for events from the window compositor means a call to yutani_poll which ends up as a read system call on a "PEX" (Packet EXchange, a ToaruOS-specific packet datagram IPC mechanism) endpoint. PEX is implemented internally through an older form of pipes, and that read boils down to another sleeping ring buffer.

Next, our compositor's keyboard driver is also blocked on a read, this time from the keyboard device, which is actually the same kind of pipe PEX uses, and thus also a sleeping ring buffer. The compositor itself is also processing commands from other running applications (the terminal's blinking cursor, the clock that updates every second...) as the server side of a PEX endpoint which - you guessed it - also involves a ring buffer.
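Since sleeping ring buffers show up at every layer here, a rough userspace analogy may help. The sketch below uses pthreads to get the "reader sleeps until a writer wakes it" behaviour; the kernel has its own sleep and wakeup primitives, so ring_t, ring_write, and ring_read are illustrative names only, not the ToaruOS API.

```c
#include <pthread.h>
#include <stddef.h>

#define RING_SIZE 64

typedef struct {
    unsigned char   data[RING_SIZE];
    size_t          head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t  readable;   /* signalled whenever data is appended */
} ring_t;

static void ring_init(ring_t *r) {
    r->head = r->tail = r->count = 0;
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->readable, NULL);
}

/* Append one byte and wake any sleeping readers.
 * Returns 1 on success, 0 if the buffer was full and the byte was dropped. */
static int ring_write(ring_t *r, unsigned char c) {
    int ok = 0;
    pthread_mutex_lock(&r->lock);
    if (r->count < RING_SIZE) {
        r->data[r->head] = c;
        r->head = (r->head + 1) % RING_SIZE;
        r->count++;
        ok = 1;
        pthread_cond_broadcast(&r->readable);
    }
    pthread_mutex_unlock(&r->lock);
    return ok;
}

/* Remove one byte, sleeping until a writer provides one. */
static unsigned char ring_read(ring_t *r) {
    pthread_mutex_lock(&r->lock);
    while (r->count == 0) {
        pthread_cond_wait(&r->readable, &r->lock);
    }
    unsigned char c = r->data[r->tail];
    r->tail = (r->tail + 1) % RING_SIZE;
    r->count--;
    pthread_mutex_unlock(&r->lock);
    return c;
}
```

Dropping bytes when the buffer is full is a simplification chosen for this sketch; a real implementation could just as well make the writer sleep.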

So that's a bunch of ring buffers waiting for writers and some background noise. Now what happens when we...

Press the f key

We go to type fetch, and as soon as we hit the first key, our PS/2 keyboard fires off an interrupt. When we first booted, our PS/2 keyboard driver installed an interrupt handler for this interrupt, and our base interrupt handler will locate and call that handler. As ToaruOS leaves a lot of the finer details of keyboard mappings to a userspace library, not much more happens in the kernel driver; we read in the scancode from the keyboard and then append it to the ring buffer. Writing to a ring buffer will wake up the sleeping readers, which in this case means the compositor's keyboard thread, which will finish its ring buffer read of one character, return to userspace, and then add that scancode to a state machine which will convert it into a generic scancode and modifier status. The encoded scancode and modifier status are then turned into a message for the window compositor and sent to the event queue where they will be processed like any other request to the window compositor. This will awaken the event handling thread in the compositor, which will then process the key event, determining whether it should perform some general window management task, pass the key event to a globally bound application or send it to the focused window. Since we've typed f with no modifiers and have nothing bound to steal our f key presses, we send an event to the owner of the focused window: the terminal.
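The scancode state machine mentioned above can be sketched roughly as follows. This is a deliberately tiny version that assumes PS/2 scancode set 1, tracks only the shift modifier, and ignores extended (0xE0-prefixed) keys; kbd_state_t, key_event_t, and kbd_feed are hypothetical names, not the ones the ToaruOS keyboard library uses.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool shift;          /* is either shift key currently held? */
} kbd_state_t;

typedef struct {
    bool    valid;       /* false when the scancode only changed modifier state */
    bool    pressed;     /* true for a key press, false for a release */
    uint8_t keycode;     /* scancode with the release bit stripped */
    bool    shift;       /* modifier status at the time of the event */
} key_event_t;

enum {
    SC_LSHIFT  = 0x2A,
    SC_RSHIFT  = 0x36,
    SC_RELEASE = 0x80    /* set 1 marks a key release by setting the high bit */
};

/* Feed one raw scancode into the state machine and get back an event. */
static key_event_t kbd_feed(kbd_state_t *s, uint8_t scancode) {
    key_event_t ev = {0};
    uint8_t code    = scancode & 0x7F;
    bool    pressed = !(scancode & SC_RELEASE);

    if (code == SC_LSHIFT || code == SC_RSHIFT) {
        s->shift = pressed;   /* modifier keys only update state */
        return ev;
    }

    ev.valid   = true;
    ev.pressed = pressed;
    ev.keycode = code;
    ev.shift   = s->shift;
    return ev;
}
```

In set 1 the f key arrives as 0x21, so feeding 0x21 into a fresh state yields a valid press event with no modifiers, which is the case our walkthrough follows.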

The terminal receives the event from the window compositor, recognizes it as a key event and then converts the key press into the appropriate string to send to the PTY. Many keys require special sequences, such as the arrow keys or function keys, but our f requires nothing special, and we can simply send it on its way by writing it to the PTY master. Writing to a PTY master may involve extra processing depending on the mode of the PTY - especially in the case of ICANON or ECHO, but neither of those flags is set in our mode, so we simply append the f to our ring buffer, which will wake up the sleeping PTY slave.

Reading one character, the PTY slave will immediately return, putting us back into userspace in the shell - or more specifically, in the rline library. rline needs to process our input to check it for special actions like arrow keys and function keys, which we would represent as special escape sequences. To do that, it uses a state machine which converts terminal escape sequences into the same scancodes and modifier statuses that the compositor uses. Since f requires no special treatment, we can skip over the details of the state machine as well as the handlers for various control key combinations and add the character to our input buffer. Since rline handles complex editing tasks such as insertion, it needs to be able to handle cases where the "cursor" isn't at the end of the input. It also needs to perform its own input echoing, as we have disabled this on the PTY, so we'll then print f and flush stdout, which will write to the PTY slave, awakening the PTY master, which in turn awakens the terminal thread that was handling terminal output.

The terminal will process the output it has read from the PTY (the character f) through a state machine of its own, which processes terminal escape sequences. That state machine lives in the termemu library and supports coloring output, moving the cursor, and several other things. The dialect of escape sequences that ToaruOS's terminal implements is based on Xterm, with a few additions from Konsole. The state machine will eventually call a callback in the terminal to write the character f, where additional processing will happen to handle line feeds, bells, UTF-8 codepoint reconstruction, and more.

Eventually, the terminal sets a cell to the character f, with appropriate foreground, background, emboldening, and so on, and draws that cell. Actually drawing a cell involves calling into Freetype to render the glyph for the character in the cell as pixels in a canvas (a mapped region of memory shared with the compositor containing the texture of the window). Once the character is drawn, a partial display flip event is sent to the window compositor to repaint the modified (or "damaged") display regions.
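As a rough illustration of the kind of state machine the terminal runs over its output, here is a heavily simplified sketch. It only distinguishes plain characters from "ESC [ ... final byte" control sequences and throws the sequence contents away; the real termemu library does far more, and term_t and term_put are hypothetical names.

```c
#include <stddef.h>

typedef enum { ST_NORMAL, ST_ESC, ST_CSI } term_state_t;

typedef struct {
    term_state_t state;
    char   out[64];      /* characters that reached the "write a cell" step */
    size_t out_len;
    int    csi_count;    /* number of completed ESC [ ... sequences seen */
} term_t;

/* Feed one byte of PTY output into the state machine. */
static void term_put(term_t *t, char c) {
    switch (t->state) {
    case ST_NORMAL:
        if (c == '\033') {
            t->state = ST_ESC;                 /* start of an escape sequence */
        } else if (t->out_len + 1 < sizeof t->out) {
            t->out[t->out_len++] = c;          /* ordinary character: draw it */
            t->out[t->out_len] = '\0';
        }
        break;
    case ST_ESC:
        /* Simplification: anything other than '[' just ends the escape. */
        t->state = (c == '[') ? ST_CSI : ST_NORMAL;
        break;
    case ST_CSI:
        /* Parameter bytes are skipped; a byte in 0x40..0x7E ends the sequence. */
        if (c >= 0x40 && c <= 0x7E) {
            t->csi_count++;
            t->state = ST_NORMAL;
        }
        break;
    }
}
```

Feeding it a bold-red sequence followed by f, for example, consumes the sequence as one control event and passes only the f through to the cell-drawing step.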

The compositor's event handler thread receives the display flip event and appends the damaged region to a list of damaged regions for processing in the next pass of the rendering thread. The render thread attempts to run 60 times each second, and only performs work when necessary - such as when display regions have been marked as damaged. The compositor uses Cairo to render window canvases, restricting rendering to affected regions to avoid unnecessary work. The compositor can perform various transformations to the window when drawing it, such as rotation or animation effects. When compositing is done, the results will be copied to the framebuffer, writing 32-bit RGB values directly into graphics memory that was mapped in the compositor, updating the display.
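The damage-tracking idea can be sketched in a few lines. This is an assumption-laden toy, not the compositor's code: rect_t, damage_list_t, damage_add, and render_pass are made-up names, and the "repaint" is just a counter where a real compositor would redraw the clipped region.

```c
typedef struct { int x, y, w, h; } rect_t;

#define MAX_DAMAGE 32

typedef struct {
    rect_t rects[MAX_DAMAGE];
    int    count;
} damage_list_t;

/* Queue a damaged region for the next render pass. Returns 0 if the list is full. */
static int damage_add(damage_list_t *d, rect_t r) {
    if (d->count >= MAX_DAMAGE) return 0;
    d->rects[d->count++] = r;
    return 1;
}

/* Clip r against bounds. Returns 0 if nothing is left to draw. */
static int rect_clip(rect_t r, rect_t bounds, rect_t *out) {
    int x0 = r.x > bounds.x ? r.x : bounds.x;
    int y0 = r.y > bounds.y ? r.y : bounds.y;
    int x1 = (r.x + r.w) < (bounds.x + bounds.w) ? (r.x + r.w) : (bounds.x + bounds.w);
    int y1 = (r.y + r.h) < (bounds.y + bounds.h) ? (r.y + r.h) : (bounds.y + bounds.h);
    if (x1 <= x0 || y1 <= y0) return 0;
    out->x = x0; out->y = y0; out->w = x1 - x0; out->h = y1 - y0;
    return 1;
}

/* One pass of the render loop: visit only damaged, on-screen regions,
 * then clear the list. Returns how many regions needed repainting. */
static int render_pass(damage_list_t *d, rect_t screen) {
    int repainted = 0;
    for (int i = 0; i < d->count; i++) {
        rect_t clipped;
        if (rect_clip(d->rects[i], screen, &clipped)) {
            repainted++;   /* a real compositor would redraw `clipped` here */
        }
    }
    d->count = 0;
    return repainted;
}
```

When nothing has been damaged since the last pass, render_pass returns 0 without touching anything, which is the "only performs work when necessary" behaviour described above.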

fetch http://www.toaruos.org

The process for the f key repeats for all the other keys until we hit Enter and the shell's call to rline() returns, our buffer now filled with "fetch http://www.toaruos.org". At this point, the shell needs to parse the buffer to figure out what we want to do. The shell supports complex quoting, variables, and pipes, the last of which means forking off multiple child processes and linking their inputs and outputs appropriately. Luckily, we've entered a fairly simple command and all the shell needs to do is split it into two tokens (fetch and http://www.toaruos.org), fork a child and run the command.
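For a command this simple, the splitting step could look something like the sketch below. It assumes plain whitespace separation and ignores the quoting, variables, and pipes the real shell handles; split_tokens is a hypothetical helper, not a function from the ToaruOS shell.

```c
#include <string.h>

/* Split `line` in place on whitespace, filling argv with pointers into it.
 * At most max - 1 tokens are stored, and argv is always NULL-terminated,
 * which is the layout the exec family expects. Returns the token count. */
static int split_tokens(char *line, char *argv[], int max) {
    int argc = 0;
    for (char *tok = strtok(line, " \t\n");
         tok != NULL && argc < max - 1;
         tok = strtok(NULL, " \t\n")) {
        argv[argc++] = tok;
    }
    argv[argc] = NULL;
    return argc;
}
```

Note that strtok modifies its input, so the buffer passed in must be writable rather than a string literal.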

Forking and execing

TODO: This section could use more details.

Forking is the traditional Unix method of starting a new process. Forking involves building a new copy of the current process, including its file descriptors, memory, CPU state, and so on. Our entry point into this process is the fork system call, which will do all of those things and then add this new process to the scheduler's "ready" queue. Our old process will receive the PID of the new process as its return value from the system call, while the new process will receive 0. This lets each process tell whether it is the parent or the child, and lets the parent track the new child process by its PID.
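From userspace, the two return values of fork look like this. The sketch uses only standard POSIX calls; run_in_child is a hypothetical helper written for illustration, and the child simply exits with a chosen code where the shell would go on to exec a program.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `child_exit_code`, then wait for it in the
 * parent. Returns the child's exit status, or -1 on error. */
static int run_in_child(int child_exit_code) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                    /* fork failed: no child was created */
    }
    if (pid == 0) {
        /* Child: fork returned 0. A shell would call execvp here instead. */
        _exit(child_exit_code);
    }
    /* Parent: fork returned the child's PID, which we use to wait on it. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit rather than exit so that stdio buffers copied from the parent are not flushed a second time.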

The original shell process will wait on "all" of its "children" (one child process in this case), and - finding none of them to be finished - will sleep on an event to be raised when a child finishes later. Meanwhile, the new shell process will call execvp to replace itself with the fetch binary. execvp evaluates the PATH environment variable, searching the filesystem with open, readdir, and fstat to find a binary named fetch. All of those filesystem operations call down into the EXT2 filesystem module, which will read inodes, which means reading blocks, which in turn means reading sectors from the disk or memory in a ramdisk. Once execvp has found an appropriate binary, it then calls the execve system call, which will load and parse the executable to determine whether it's an ELF binary, a shell script with a #! line, or something else it can execute. Finding fetch to be an ELF binary, it will then continue loading the rest of the binary and copy it into memory. Arguments, environment variables, and ELF auxiliary variables will also be copied into place. Eventually we will jump back to userspace at the binary's requested start address, which will bring us to the crt0 startup code, which then jumps to pre_main, which reads the environment variables, calls init functions, and then calls main.
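The "what kind of executable is this?" check comes down to looking at the first few bytes of the file. A minimal sketch, assuming only the two formats named above; exec_kind_t and classify_executable are hypothetical names rather than the kernel's own, and a real loader goes on to validate the rest of the ELF header.

```c
#include <stddef.h>
#include <string.h>

typedef enum { EXEC_UNKNOWN, EXEC_ELF, EXEC_SCRIPT } exec_kind_t;

/* Inspect the leading bytes of a file to decide how it should be executed. */
static exec_kind_t classify_executable(const unsigned char *buf, size_t len) {
    static const unsigned char elf_magic[4] = { 0x7F, 'E', 'L', 'F' };

    if (len >= 4 && memcmp(buf, elf_magic, 4) == 0) {
        return EXEC_ELF;      /* ELF files start with 0x7F 'E' 'L' 'F' */
    }
    if (len >= 2 && buf[0] == '#' && buf[1] == '!') {
        return EXEC_SCRIPT;   /* interpreter line, e.g. #!/bin/sh */
    }
    return EXEC_UNKNOWN;
}
```

For fetch, the first branch matches and loading continues down the ELF path.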

fetch

Now we're at the point where we can look at the code for fetch itself. We're at main().

TODO: This is still a work in progress