Add some notes about the sparc architecture.

Change-Id: I2fd042981d2771abdedcd3648e2eeb6e06db4253
Reviewed-on: https://review.haiku-os.org/c/1142
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
2019-03-03 20:41:38 +01:00
commit 6823ced505
parent 8bdd63fe5d
3 changed files with 81 additions and 0 deletions
--- a/docs/develop/arch/sparc/ABI.txt
+++ b/docs/develop/arch/sparc/ABI.txt
@ -0,0 +1,32 @@
+The SPARC architecture has 32 integer registers, divided as follows:
+
+- global registers (g0-g7)
+- input (i0-i7)
+- local (l0-l7)
+- output (o0-o7)
+
+Parameters are passed and values returned in the output registers, which are
+treated as scratch registers: the callee may clobber them, so the caller must
+preserve any values it still needs.
+
+The input and local registers are callee-saved, but the hardware helps in the
+form of register windows. The save instruction shifts the window (and restore
+shifts it back) so that:
+- the caller's o registers become the callee's i registers
+- local and output registers are replaced with fresh sets, for use by the
+  current function
+- global registers are not affected
+
+Note that as a side effect, o6 is moved to i6. This is convenient because these
+are the stack and frame pointers, respectively, so the frame pointer is set for
+free. Similarly, the return address stored in o7 by a call ends up in i7.
+
+Simple enough (leaf) functions may get by with just the o registers. In that
+case, no window shift is necessary.
+
+When the register window is shifted, the extra registers come from a register
+stack inside the CPU. This stack is finite: most SPARC implementations have only
+8 windows. When the internal stack is full, an overflow trap is raised, and the
+handler must free up old windows by spilling them to the memory stack. Likewise,
+when the internal stack is empty, an underflow trap handler must refill it from
+the data saved on the stack.
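The window overlap described in ABI.txt can be sketched as a toy model in C. This is not real SPARC code: the names (regfile, cwp, window_save, window_restore and the register accessors) are hypothetical, the direction in which the window pointer moves is an arbitrary choice, and overflow/underflow traps are not modeled.

```c
#include <assert.h>

#define NWINDOWS        8   /* typical window count, per the text above */
#define REGS_PER_WINDOW 16  /* 8 in + 8 local; out registers are shared */

/* Toy register file: each window owns its in and local registers. */
static long regfile[NWINDOWS * REGS_PER_WINDOW];
static int cwp = 0;         /* current window pointer (hypothetical name) */

static long *in_reg(int n)
{
    assert(n >= 0 && n < 8);
    return &regfile[cwp * REGS_PER_WINDOW + n];
}

static long *local_reg(int n)
{
    assert(n >= 0 && n < 8);
    return &regfile[cwp * REGS_PER_WINDOW + 8 + n];
}

/* The out registers of a window are the in registers of the next one. */
static long *out_reg(int n)
{
    assert(n >= 0 && n < 8);
    int next = (cwp + 1) % NWINDOWS;
    return &regfile[next * REGS_PER_WINDOW + n];
}

/* Shift the window on function entry and shift it back on return. */
static void window_save(void)    { cwp = (cwp + 1) % NWINDOWS; }
static void window_restore(void) { cwp = (cwp + NWINDOWS - 1) % NWINDOWS; }
```

In this model, a caller that writes o0 and o6 and then calls window_save() finds the same values in i0 and i6, while its own local registers stay untouched until window_restore() brings them back.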
--- a/docs/develop/arch/sparc/misaligned
+++ b/docs/develop/arch/sparc/misaligned
@ -0,0 +1,37 @@
+The SPARC CPU is not designed to gracefully handle misaligned accesses.
+You can access a single byte at any address, but 16-bit accesses only at even
+addresses, 32-bit accesses only at addresses that are multiples of 4, etc.
+
+On x86, by contrast, such accesses are not a problem: they are allowed and
+handled directly by the load and store instructions, at little or no cost.
+
+On SPARC, however, such accesses trap and result in a SIGBUS. To let the
+application continue, a handler has to catch the misaligned access, perform it
+in software, byte by byte, then give control back to the application. This is,
+of course, very slow, so we should avoid it when possible.
+
+Fortunately, gcc knows about this, and will normally do the right thing:
+- For usual variables and structures, it will make sure to lay them out so that
+  they are aligned. It relies on stack alignment, as well as malloc returning
+  sufficiently aligned memory (as required by the C standard).
+- For packed structures, gcc knows the data may be misaligned, and automatically
+  uses an appropriate way to access it (most likely, byte by byte).
+
+This leaves us with two undesirable cases:
+- Pointer arithmetic and casting. When computing addresses manually, it's
+  possible to generate a misaligned address and cast it to a type with a
+  stricter alignment requirement. In this case, gcc may access the pointer using
+  a multi-byte instruction and cause a SIGBUS. Solution: make sure the struct
+  is aligned, or declare it as packed so that unaligned accesses are used.
+- Access to hardware: it is a common pattern to declare a struct as packed,
+  and map it to hardware registers. If the alignment isn't known, gcc will use
+  byte-by-byte access. It seems that volatile causes gcc to access the struct
+  the proper way, on the assumption that a volatile value is necessarily
+  aligned as it should be.
+
+In the end, we just need to be careful about pointer math resulting in
+unaligned accesses. -Wcast-align helps with that, but it also raises a lot of
+false positives (where the alignment is preserved even when casting to other
+types), so we enable it only as a warning for now. We will need to check the
+SIGBUS handler to identify places where we do a lot of misaligned accesses that
+trigger it, and rework the code as needed. Other than these cases, we're fine.
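The pointer-casting case above can be illustrated with a short C sketch. The struct and function names are hypothetical. The packed struct follows the solution given in the text; the memcpy helper is a common alternative that the text does not mention, shown here only as an assumption about what would also work.

```c
#include <stdint.h>
#include <string.h>

/*
 * Problematic pattern (shown only as a comment): casting a possibly
 * misaligned address to a wider type lets gcc emit a 32-bit load,
 * which would raise SIGBUS on SPARC:
 *
 *     uint32_t v = *(const uint32_t *)(buffer + 1);
 */

/* Packed struct: gcc knows 'value' may be misaligned and accesses it safely. */
struct packed_record {
    uint8_t  tag;
    uint32_t value;
} __attribute__((packed));

static uint32_t read_packed_value(const struct packed_record *record)
{
    return record->value;
}

/* Alternative not taken from the text: copy the bytes into an aligned local. */
static uint32_t read_u32_unaligned(const void *address)
{
    uint32_t value;
    memcpy(&value, address, sizeof(value));
    return value;
}
```

Both helpers return the stored value regardless of the address alignment, at the price of a slower access than a single aligned load.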
--- a/docs/develop/arch/sparc/softfloat.txt
+++ b/docs/develop/arch/sparc/softfloat.txt
@ -0,0 +1,12 @@
+The SPARC instruction set specifies instructions for handling long double
+values. However, no hardware implementation actually provides them: they
+generate a trap, which is expected to be handled by the softfloat library.
+
+Since traps are slow, and gcc knows better, it never generates those
+instructions. Instead, it directly calls functions in the C library that are
+specified in the ABI and perform long double math using softfloat.
+
+In our case, the support code for this is compiled into both the kernel and
+libroot. It lives in src/system/libroot/os/arch/sparc/softfloat.c (and other
+support files). This code was extracted from FreeBSD rather than glibc,
+because that made it much easier to get it building in the kernel.
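As a small illustration of the softfloat note, here is ordinary C code doing long double math. The function name is hypothetical. On SPARC, gcc would be expected to turn the multiplication and addition into calls to the ABI support functions instead of emitting quad-precision instructions. To my knowledge the 64-bit SPARC ABI names these _Qp_mul and _Qp_add, but the text above does not name them, so treat that as an assumption.

```c
/*
 * Plain long double arithmetic. Nothing here is SPARC-specific in the
 * source; the difference is only in what the compiler generates
 * (library calls on SPARC, assumed to be _Qp_mul and _Qp_add on 64-bit).
 */
static long double scale_and_add(long double a, long double b)
{
    return a * 2.0L + b;
}
```

Because the calls are resolved at link time, any code using long double this way depends on the support functions being present, which is why they are built into both the kernel and libroot.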