Add some notes about the sparc architecture.

Change-Id: I2fd042981d2771abdedcd3648e2eeb6e06db4253
Reviewed-on: https://review.haiku-os.org/c/1142
Reviewed-by: waddlesplash <waddlesplash@gmail.com>
PulkoMandy 2019-03-03 20:41:38 +01:00 committed by waddlesplash
parent 8bdd63fe5d
commit 6823ced505
3 changed files with 81 additions and 0 deletions


@ -0,0 +1,32 @@
The SPARC architecture has 32 integer registers, divided as follows:
- global registers (g0-g7)
- input (i0-i7)
- local (l0-l7)
- output (o0-o7)
Parameters and return values are passed in the output registers, which are
generally considered scratch registers and can be clobbered by the callee. The
caller must take care of preserving them.
The input and local registers are callee-saved, but we have hardware assistance
in the form of a register window. There is an instruction to shift the registers
so that:
- o registers become i registers
- local and output registers are replaced with fresh sets, for use by the
current function
- global registers are not affected
Note that as a side effect, o6 is moved to i6. This is convenient because these
are used as the stack and frame pointers, respectively, so shifting the window
sets the frame pointer for free.
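To make the renaming concrete, here is a minimal sketch in C of a windowed
register file. It is only a model under stated assumptions: the struct and
function names are made up for illustration, the window count of 8 is assumed,
and overflow handling is left out. It uses the fact that %sp is an alias for
%o6 and %fp for %i6.

```c
#include <assert.h>
#include <stdint.h>

#define NWINDOWS        8   /* assumed window count */
#define REGS_PER_WINDOW 16  /* 8 out registers + 8 local registers */

/* Hypothetical model of the register file, not real kernel code. */
struct regfile {
    uint32_t global[8];                             /* g0-g7, never shifted */
    uint32_t windowed[NWINDOWS * REGS_PER_WINDOW];  /* circular window storage */
    unsigned cwp;                                   /* current window pointer */
};

/* Output register n of the current window. */
static uint32_t *out_reg(struct regfile *rf, unsigned n)
{
    assert(n < 8);
    return &rf->windowed[rf->cwp * REGS_PER_WINDOW + n];
}

/* Local register n of the current window. */
static uint32_t *local_reg(struct regfile *rf, unsigned n)
{
    assert(n < 8);
    return &rf->windowed[rf->cwp * REGS_PER_WINDOW + 8 + n];
}

/* Input register n: the same storage as the caller's output register n. */
static uint32_t *in_reg(struct regfile *rf, unsigned n)
{
    assert(n < 8);
    unsigned caller = (rf->cwp + 1) % NWINDOWS;
    return &rf->windowed[caller * REGS_PER_WINDOW + n];
}

/* Shift the window on function entry (no overflow check in this model). */
static void window_save(struct regfile *rf)
{
    rf->cwp = (rf->cwp + NWINDOWS - 1) % NWINDOWS;
}

/* Shift the window back on function return (no underflow check). */
static void window_restore(struct regfile *rf)
{
    rf->cwp = (rf->cwp + 1) % NWINDOWS;
}
```

In this model, a value written through out_reg(rf, 6) before window_save() is
read back through in_reg(rf, 6) afterwards without any copy, and the caller's
local registers are untouched when window_restore() returns to it.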
Simple enough functions may end up using just the o registers. In that case
nothing special is necessary.
When shifting the register window, the extra registers come from the register
stack in the CPU. This stack is not infinite, however: most implementations of
SPARC only have 8 windows available. When the internal stack is full, an
overflow trap is raised, and the handler must free up old windows by storing
them on the stack. Likewise, when the internal stack is empty, an underflow trap
is raised, and the handler must fill it back from the stack-saved data.
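The overflow and underflow behaviour can be sketched with a simple counting
model. This is an illustration only: the names are hypothetical, the window
count of 8 is assumed, and real hardware details (such as how the trap handler
finds the window to save) are ignored.

```c
#include <assert.h>

#define NWINDOWS 8  /* assumed number of on-chip windows */

/* Hypothetical bookkeeping for the window stack, for illustration only. */
struct window_stack {
    unsigned resident;         /* windows currently held in the CPU */
    unsigned spilled;          /* windows saved on the memory stack */
    unsigned overflow_traps;   /* number of overflow traps taken */
    unsigned underflow_traps;  /* number of underflow traps taken */
};

/* Function entry: claim a fresh window, spilling the oldest one if needed. */
static void enter_function(struct window_stack *ws)
{
    if (ws->resident == NWINDOWS) {
        /* Overflow trap: the handler stores the oldest window to memory. */
        ws->overflow_traps++;
        ws->resident--;
        ws->spilled++;
    }
    ws->resident++;
}

/* Function return: release the window, refilling the caller's if needed. */
static void leave_function(struct window_stack *ws)
{
    assert(ws->resident > 0);
    ws->resident--;
    if (ws->resident == 0 && ws->spilled > 0) {
        /* Underflow trap: the handler reloads one window from memory. */
        ws->underflow_traps++;
        ws->spilled--;
        ws->resident++;
    }
}
```

Starting with one resident window, ten nested calls take three overflow traps in
this model, and unwinding them takes three underflow traps. Shallow call chains
never trap at all, which is why the mechanism is cheap in the common case.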


@ -0,0 +1,37 @@
The SPARC CPU is not designed to gracefully handle misaligned accesses.
You can access a single byte at any address, but 16-bit accesses are only
allowed at even addresses, 32-bit accesses at addresses that are multiples of 4,
etc.
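The rule can be written down as a small check. This is a sketch only, and the
helper name is made up for illustration. It assumes power-of-two access sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if an access of `size` bytes at `addr` is naturally aligned.
 * `size` must be a power of two (1, 2, 4, 8, ...). */
static bool is_naturally_aligned(const void *addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return ((uintptr_t)addr & (size - 1)) == 0;
}
```

A check like this can be useful in debug assertions around code that computes
addresses by hand.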
On x86, for example, such accesses are not a problem: they are allowed and
handled directly by the instructions doing the access, so there is little to no
performance cost.
On SPARC, however, such accesses will cause a SIGBUS. This means a trap handler
has to catch the misaligned access and do it in software, byte by byte, then
give back control to the application. This is, of course, very slow, so we
should avoid it when possible.
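As a rough idea of what "byte by byte" means, here is a sketch of a 32-bit load
done with byte accesses only. The function name is hypothetical, and this is not
the actual trap handler code. It assumes the big-endian byte order that SPARC
uses by default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load a 32-bit big-endian value from any address using only byte loads,
 * which are always aligned. The byte at the lowest address is the most
 * significant one. */
static uint32_t load_u32_bytewise(const void *addr)
{
    const uint8_t *p = addr;
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
        | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

That is four loads plus shifts and ORs instead of a single load, and in the trap
case the cost of entering and leaving the handler comes on top of it.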
Fortunately, gcc knows about this, and will normally do the right thing:
- For usual variables and structures, it will make sure to lay them out so that
they are aligned. It relies on stack alignment, as well as malloc returning
sufficiently aligned memory (as required by the C standard).
- For packed structures, gcc knows the data may be misaligned, and will
automatically use the appropriate way to access it (most likely, byte-by-byte).
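The two cases can be compared with a small sketch. The struct names are made up
for illustration, and __attribute__((packed)) is the gcc extension for packing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: gcc pads after `tag` so that `value` is aligned. */
struct natural_rec {
    uint8_t  tag;
    uint32_t value;
};

/* Packed layout: no padding, so `value` sits at the odd offset 1. */
struct packed_rec {
    uint8_t  tag;
    uint32_t value;
} __attribute__((packed));

static_assert(offsetof(struct natural_rec, value) % _Alignof(uint32_t) == 0,
    "natural layout keeps value aligned");
static_assert(offsetof(struct packed_rec, value) == 1,
    "packed layout removes the padding");

/* gcc knows `value` may be misaligned here and picks a safe access. */
static uint32_t packed_value(const struct packed_rec *rec)
{
    return rec->value;
}
```

Both structs are safe to use through their own type. The trouble only starts
when the address of a member or of a raw buffer is cast to a different pointer
type, because gcc then no longer knows about the reduced alignment.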
This leaves us with two undesirable cases:
- Pointer arithmetic and casting. When computing addresses manually, it's
possible to generate a misaligned address and cast it to a type with a wider
alignment requirement. In this case, gcc may access the pointer using a
multi-byte instruction and cause a SIGBUS. Solution: make sure the struct
is aligned, or declare it as packed so unaligned accesses are used instead.
- Access to hardware: it is a common pattern to declare a struct as packed,
and map it to hardware registers. If the alignment isn't known, gcc will use
byte-by-byte access. It seems that volatile causes gcc to use the proper way
to access the struct, on the assumption that a volatile value is necessarily
aligned as it should be.
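The pointer casting case can be sketched as follows. The function name is
hypothetical. Note that the safe version uses memcpy, which is a different
technique from the two solutions listed above (aligning or packing the struct).
It is shown here only as one more common way to avoid the problem.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a uint32_t stored at an arbitrary byte offset inside a buffer.
 *
 * The tempting version is:
 *     return *(const uint32_t *)(buf + offset);
 * gcc may compile that to a single 32-bit load, which faults on SPARC when
 * buf + offset is not a multiple of 4. memcpy makes no alignment assumption
 * about its arguments, so the compiler has to pick a safe access. */
static uint32_t read_u32_at(const uint8_t *buf, size_t offset)
{
    uint32_t value;
    assert(buf != NULL);
    memcpy(&value, buf + offset, sizeof(value));
    return value;
}
```

The value is returned in host byte order, exactly as the cast would have
produced it on an aligned address.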
In the end, we just need to be careful about pointer math resulting in unaligned
accesses. -Wcast-align helps with that, but it also raises a lot of false
positives (where the alignment is preserved even when casting to other types).
So we enable it only as a warning for now. We will need to check the SIGBUS
handler to identify places where we do a lot of misaligned accesses that trigger
it, and rework the code as needed. But in general, except for these cases, we're
fine.


@ -0,0 +1,12 @@
The SPARC instruction set specifies instructions for handling long double
values. However, no hardware implementation actually provides them. They
generate a trap, which is expected to be handled by the softfloat library.
Since traps are slow, and gcc knows better, it will never generate those
instructions. Instead it directly calls into the C library, to functions
specified in the ABI and used to do long double math using softfloats.
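From the C side nothing special is visible. The sketch below is ordinary long
double code, with a made-up function name. On SPARC we would expect gcc to turn
each operation into a call to one of those support functions. The exact names
are an assumption here (along the lines of _Q_add on 32-bit and _Qp_add on
64-bit) and should be checked against the ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Average of `count` long double values. On SPARC, the addition, the
 * integer conversion and the division below are each expected to become a
 * call into the softfloat support code rather than an FPU instruction. */
static long double mean(const long double *values, size_t count)
{
    long double sum = 0.0L;
    assert(count > 0);
    for (size_t i = 0; i < count; i++)
        sum += values[i];
    return sum / (long double)count;
}
```

Disassembling the compiled function is a quick way to see which support
functions a given toolchain actually calls.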
The support code for this is, in our case, compiled into both the kernel and
libroot. It lives in src/system/libroot/os/arch/sparc/softfloat.c (and other
support files). This code was extracted from FreeBSD rather than glibc,
because that made it much easier to get it building in the kernel.