sparc: boot mmu support

Get enough of the mmu working to be able to allocate memory.

Unlike on PowerPC, we get both the address and the size as 64-bit values, so
adjust of_region to allow this.

Also unlike the PPC port, we do not drive the hardware directly; instead we
rely on the OpenBoot primitives to manage the translation table. This lets us
stay independent of the hardware, which is a good idea at least for the
bootloader (we can do actual hardware things in the kernel).

Change-Id: Ifa57619d3a09b8f707e1f8640d8b4f71bb717e2a
Reviewed-on: https://review.haiku-os.org/c/haiku/+/1482
Reviewed-by: Alex von Gluck IV <kallisti5@unixzen.com>
PulkoMandy 2019-05-24 23:12:08 +02:00 committed by waddlesplash
parent 011b188df0
commit 56f9c76088
9 changed files with 440 additions and 79 deletions

View File

@@ -0,0 +1,116 @@
Notes on the Ultrasparc MMUs
============================
First, a word of warning: the MMU was different in SPARCv8 (32-bit)
implementations, and it was changed again on newer CPUs.
The Ultrasparc-II we are supporting for now is documented in the Ultrasparc
user manual. There were some minor changes in the Ultrasparc-III to accommodate
larger physical addresses. This was then standardized as JPS1, and Fujitsu
also implemented it.
Later on, the design was changed again; for example, the Ultrasparc T2 (UA2005
architecture) uses a different data structure format to enlarge, again, the
physical and virtual address tags.
For now the implementation is focused on the Ultrasparc-II because that's what
I have at hand; later on we will need support for the more recent systems.
Ultrasparc-II MMU
=================
There are actually two separate units for the instruction and data address
spaces, known as I-MMU and D-MMU. They each implement a TLB (translation
lookaside buffer) for the recently accessed pages.
This is pretty much all there is to the MMU hardware: no hardware page table
walk is provided. However, there is some support for implementing a TSB
(Translation Storage Buffer), in the form of hardware that computes the
address in that buffer where the entry for a missing page should be found.
It is up to software to manage the TSB (globally or per-process) and in general
keep track of the mappings. This means we are relatively free to manage things
however we want, as long as eventually we can feed the iTLB and dTLB with the
relevant data from the MMU trap handler.
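To illustrate, a TLB miss handler conceptually does something like the
following (a C-style sketch, ours, not code from this commit; a real handler
is hand-written assembly using the MMU ASIs, and the tsb_index()/tsb_tag()
helpers are sketched further below)::

    // What a D-TLB miss handler does, in C instead of assembly.
    // tsbBase, write_tlb_data_in() and handle_tsb_miss() are hypothetical.
    void dtlb_miss(uint64_t faultingVa)
    {
        TsbEntry* entry = &tsbBase[tsb_index(faultingVa)];
        if (entry->IsValid() && entry->fTag == tsb_tag(faultingVa))
            write_tlb_data_in(entry->fData);  // load the entry into the D-TLB
        else
            handle_tsb_miss(faultingVa);      // fall back to software lookup
    }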
To make sure we can handle the fault without recursing, we need to pin a few
items in place:
In the TLB:
- TLB miss handler code
- TSB and any linked data that the TLB miss handler may need
- asynchronous trap handlers and data
In the TSB:
- TSB-miss handling code
- Interrupt handlers code and data
So, from a given virtual address (assuming we are using only 8K pages and a
512-entry TSB to keep things simple):
- VA63-44 are unused and must be a sign extension of bit 43
- VA43-22 are the 'tag' used to match a TSB entry with a virtual address
- VA21-13 are the offset in the TSB at which to find a candidate entry
- VA12-0 are the offset in the 8K page, and used to form PA12-0 for the access
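In code, that split looks like this (a minimal sketch, ours; the helper names
are hypothetical)::

    #include <stdint.h>

    // VA43-22: the tag compared against a TSB entry's tag field.
    static inline uint64_t tsb_tag(uint64_t va)
    {
        return (va >> 22) & ((1ULL << 22) - 1);
    }

    // VA21-13: index of the candidate entry in a 512-entry TSB.
    static inline unsigned tsb_index(uint64_t va)
    {
        return (va >> 13) & 0x1FF;
    }

    // VA12-0: offset inside the 8K page, passed through as PA12-0.
    static inline uint64_t page_offset(uint64_t va)
    {
        return va & 0x1FFF;
    }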
Inside the TLBs, VA63-13 is stored, so there can be multiple entries matching
the same tag active at the same time, even when there is only one in the TSB.
The entries are rotated using a simple LRU scheme, unless they are locked, of
course. Be careful not to fill a TLB with only locked entries! Also, one must
take care not to insert a new mapping for a given VA without first removing
any possible previous one (there is no need to worry about this when handling
a TLB miss, however, as in that case we obviously know that there was no
previous entry).
Entries also have a "context". This could for example be mapped to the process
ID, allowing all entries related to a specific context to be cleared easily.
TSB entries format
==================
Each entry is composed of two 64-bit values: "Tag" and "Data". The data uses
the same format as the TLB entries; however, the tag is different.
They are as follows:
Tag
---
Bit 63: 'G' indicating a global entry; the context should be ignored.
Bits 60-48: context ID (13 bits)
Bits 41-0: VA63-22 as the 'tag' to identify this entry
Data
----
Bit 63: 'V' indicating a valid entry; if it's 0 the entry is unused.
Bits 62-61: size: 8K, 64K, 512K, 4MB
Bit 60: NFO, indicating No Fault Only
Bit 59: Invert Endianness of accesses to this page
Bits 58-50: reserved for use by software
Bits 49-41: reserved for diagnostics
Bits 40-13: Physical Address<40-13>
Bits 12-7: reserved for use by software
Bit 6: Lock in TLB
Bit 5: Cachable physical
Bit 4: Cachable virtual
Bit 3: Access has side effects (HW is mapped here, or DMA shared RAM)
Bit 2: Privileged
Bit 1: Writable
Bit 0: Global
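For example, a Data word for a valid, cachable, privileged 8K mapping could be
composed like this (a sketch, ours; the constants follow the bit list above)::

    #include <stdint.h>

    // Pack a TSB/TLB "Data" word for an 8K page (size bits 62-61 left at 0).
    static inline uint64_t make_data_8k(uint64_t physicalAddress, bool writable)
    {
        uint64_t data = 1ULL << 63;                         // V: valid
        data |= physicalAddress & 0x000001FFFFFFE000ULL;    // PA<40-13>
        data |= 1ULL << 5;                                  // cachable physical
        data |= 1ULL << 4;                                  // cachable virtual
        data |= 1ULL << 2;                                  // privileged
        if (writable)
            data |= 1ULL << 1;                              // writable
        return data;
    }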
TLB internal tag
----------------
Bits 63-13: VA<63-13>
Bits 12-0: context ID
Conveniently, a 512-entry TSB fits exactly in an 8K page, so it can be locked
in the TLB with a single entry there. However, it may be wise to instead
map 64K (or more) of RAM locked as a single entry for all the things that need
to be accessed by the TLB miss trap handler, so that we minimize the use of
TLB entries.
Likewise, it may be useful to use 64K pages instead of 8K whenever possible.
The hardware provides some support for mixing the two sizes but it makes things
a bit more complex. Let's start out with simpler things.

View File

@@ -12,4 +12,19 @@
#include <arch_cpu.h>
struct TsbEntry {
public:
bool IsValid();
void SetTo(int64_t tag, void* physicalAddress, uint64 mode);
public:
uint64_t fTag;
uint64_t fData;
};
extern void sparc_get_instruction_tsb(TsbEntry **_pageTable, size_t *_size);
extern void sparc_get_data_tsb(TsbEntry **_pageTable, size_t *_size);
#endif /* _KERNEL_ARCH_SPARC_MMU_H */
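For illustration, these methods could be implemented along the following lines
(a sketch, ours, based on the Ultrasparc-II entry layout documented in the
notes above; not the code from this commit):

    bool
    TsbEntry::IsValid()
    {
        // Bit 63 of the data word is the 'V' (valid) bit.
        return (fData & (1ULL << 63)) != 0;
    }

    void
    TsbEntry::SetTo(int64_t tag, void* physicalAddress, uint64 mode)
    {
        fTag = tag;
        // Valid bit, PA<40-13>, plus the caller-provided mode bits
        // (cachability, privilege, writability, ...).
        fData = (1ULL << 63)
            | ((uint64)(addr_t)physicalAddress & 0x000001FFFFFFE000ULL)
            | mode;
    }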

View File

@@ -17,10 +17,10 @@
extern intptr_t gChosen;
-template<typename AddressType>
+template<typename AddressType, typename SizeType>
struct of_region {
AddressType base;
-uint32 size;
+SizeType size;
} _PACKED;
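The instantiations mirror the OpenFirmware #address-cells/#size-cells pairs; a
quick compile-time sanity check (ours, not part of the commit):

    #include <cstdint>

    template<typename AddressType, typename SizeType>
    struct of_region {
        AddressType base;
        SizeType    size;
    } __attribute__((packed));

    // 2 address cells + 1 size cell: 64-bit PowerPC (G5)
    static_assert(sizeof(of_region<uint64_t, uint32_t>) == 12, "G5 layout");
    // 1 address cell + 1 size cell: 32-bit PowerPC (G3/G4)
    static_assert(sizeof(of_region<uint32_t, uint32_t>) == 8, "G3/G4 layout");
    // 2 address cells + 2 size cells: SPARC
    static_assert(sizeof(of_region<uint64_t, uint64_t>) == 16, "SPARC layout");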
struct of_arguments {

View File

@@ -106,7 +106,7 @@ find_physical_memory_ranges(size_t &total)
// On 64-bit PowerPC systems (G5), our mem base range address is larger
if (regAddressCells == 2) {
-struct of_region<uint64> regions[64];
+struct of_region<uint64, uint32> regions[64];
int count = of_getprop(package, "reg", regions, sizeof(regions));
if (count == OF_FAILED)
count = of_getprop(memory, "reg", regions, sizeof(regions));
@@ -136,7 +136,7 @@ find_physical_memory_ranges(size_t &total)
}
// Otherwise, normal 32-bit PowerPC G3 or G4 have a smaller 32-bit one
-struct of_region<uint32> regions[64];
+struct of_region<uint32, uint32> regions[64];
int count = of_getprop(package, "reg", regions, sizeof(regions));
if (count == OF_FAILED)
count = of_getprop(memory, "reg", regions, sizeof(regions));

View File

@@ -13,6 +13,7 @@ for platform in [ MultiBootSubDirSetup openfirmware ] {
on $(platform) {
BootMergeObject boot_platform_openfirmware_sparc.o :
arch_mmu.cpp
arch_start_kernel.S
cpu.cpp
mmu.cpp

View File

@@ -25,13 +25,8 @@
#include "support.h"
-// set protection to WIMGNPP: -----PP
-// PP: 00 - no access
-// 01 - read only
-// 10 - read/write
-// 11 - read only
-#define PAGE_READ_ONLY 0x01
-#define PAGE_READ_WRITE 0x02
+#define PAGE_READ_ONLY 0x0002
+#define PAGE_READ_WRITE 0x0001
// NULL is actually a possible physical address...
//#define PHYSINVAL ((void *)-1)
@@ -45,7 +40,8 @@
#endif
-uint32 sPageTableHashMask;
+unsigned int sMmuInstance;
+unsigned int sMemoryInstance;
// begin and end of the boot loader
@@ -53,21 +49,36 @@ extern "C" uint8 __text_begin;
extern "C" uint8 _end;
static status_t
insert_virtual_range_to_keep(void *start, uint32 size)
{
return insert_address_range(gKernelArgs.arch_args.virtual_ranges_to_keep,
&gKernelArgs.arch_args.num_virtual_ranges_to_keep,
MAX_VIRTUAL_RANGES_TO_KEEP, (addr_t)start, size);
}
static status_t
remove_virtual_range_to_keep(void *start, uint32 size)
{
return remove_address_range(gKernelArgs.arch_args.virtual_ranges_to_keep,
&gKernelArgs.arch_args.num_virtual_ranges_to_keep,
MAX_VIRTUAL_RANGES_TO_KEEP, (addr_t)start, size);
}
static status_t
find_physical_memory_ranges(size_t &total)
{
-int memory;
dprintf("checking for memory...\n");
-if (of_getprop(gChosen, "memory", &memory, sizeof(int)) == OF_FAILED)
-return B_ERROR;
-int package = of_instance_to_package(memory);
+intptr_t package = of_instance_to_package(sMemoryInstance);
total = 0;
// Memory base addresses are provided in 32 or 64 bit flavors
// #address-cells and #size-cells matches the number of 32-bit 'cells'
// representing the length of the base address and size fields
-int root = of_finddevice("/");
+intptr_t root = of_finddevice("/");
int32 regAddressCells = of_address_cells(root);
int32 regSizeCells = of_size_cells(root);
if (regAddressCells == OF_FAILED || regSizeCells == OF_FAILED) {
@@ -76,50 +87,17 @@ find_physical_memory_ranges(size_t &total)
regSizeCells = 1;
}
-// NOTE : Size Cells of 2 is possible in theory... but I haven't seen it yet.
-if (regAddressCells > 2 || regSizeCells > 1) {
+if (regAddressCells != 2 || regSizeCells != 2) {
panic("%s: Unsupported OpenFirmware cell count detected.\n"
"Address Cells: %" B_PRId32 "; Size Cells: %" B_PRId32
" (CPU > 64bit?).\n", __func__, regAddressCells, regSizeCells);
return B_ERROR;
}
-// On 64-bit PowerPC systems (G5), our mem base range address is larger
-if (regAddressCells == 2) {
-struct of_region<uint64> regions[64];
-int count = of_getprop(package, "reg", regions, sizeof(regions));
-if (count == OF_FAILED)
-count = of_getprop(memory, "reg", regions, sizeof(regions));
-if (count == OF_FAILED)
-return B_ERROR;
-count /= sizeof(regions[0]);
-for (int32 i = 0; i < count; i++) {
-if (regions[i].size <= 0) {
-dprintf("%d: empty region\n", i);
-continue;
-}
-dprintf("%" B_PRIu32 ": base = %" B_PRIu64 ","
-"size = %" B_PRIu32 "\n", i, regions[i].base, regions[i].size);
-total += regions[i].size;
-if (insert_physical_memory_range((addr_t)regions[i].base,
-regions[i].size) != B_OK) {
-dprintf("cannot map physical memory range "
-"(num ranges = %" B_PRIu32 ")!\n",
-gKernelArgs.num_physical_memory_ranges);
-return B_ERROR;
-}
-}
-return B_OK;
-}
-// Otherwise, normal 32-bit PowerPC G3 or G4 have a smaller 32-bit one
-struct of_region<uint32> regions[64];
+struct of_region<uint64, uint64> regions[64];
int count = of_getprop(package, "reg", regions, sizeof(regions));
if (count == OF_FAILED)
-count = of_getprop(memory, "reg", regions, sizeof(regions));
+count = of_getprop(sMemoryInstance, "reg", regions, sizeof(regions));
if (count == OF_FAILED)
return B_ERROR;
count /= sizeof(regions[0]);
@@ -129,8 +107,8 @@ find_physical_memory_ranges(size_t &total)
dprintf("%d: empty region\n", i);
continue;
}
-dprintf("%" B_PRIu32 ": base = %" B_PRIu32 ","
-"size = %" B_PRIu32 "\n", i, regions[i].base, regions[i].size);
+dprintf("%" B_PRIu32 ": base = %" B_PRIx64 ","
+"size = %" B_PRIx64 "\n", i, regions[i].base, regions[i].size);
total += regions[i].size;
@@ -168,7 +146,7 @@ is_physical_allocated(void *address, size_t size)
static bool
-is_physical_memory(void *address, size_t size)
+is_physical_memory(void *address, size_t size = 1)
{
return is_address_range_covered(gKernelArgs.physical_memory_range,
gKernelArgs.num_physical_memory_ranges, (addr_t)address, size);
@@ -176,26 +154,122 @@ is_physical_memory(void *address, size_t size)
static bool
-is_physical_memory(void *address)
+map_range(void *virtualAddress, void *physicalAddress, size_t size, uint16 mode)
{
-return is_physical_memory(address, 1);
-}
+// everything went fine, so lets mark the space as used.
+int status = of_call_method(sMmuInstance, "map", 4, 0, mode, size,
+virtualAddress, physicalAddress);
-static void
-map_page(void *virtualAddress, void *physicalAddress, uint8 mode)
-{
-panic("%s: out of page table entries!\n", __func__);
-}
-static void
-map_range(void *virtualAddress, void *physicalAddress, size_t size, uint8 mode)
-{
-for (uint32 offset = 0; offset < size; offset += B_PAGE_SIZE) {
-map_page((void *)(intptr_t(virtualAddress) + offset),
-(void *)(intptr_t(physicalAddress) + offset), mode);
+if (status != 0) {
+dprintf("map_range(base: %p, size: %" B_PRIuSIZE ") "
+"mapping failed\n", virtualAddress, size);
+return false;
}
+return true;
}
static status_t
find_allocated_ranges(void **_exceptionHandlers)
{
// we have to preserve the OpenFirmware established mappings
// if we want to continue to use its service after we've
// taken over (we will probably need less translations once
// we have proper driver support for the target hardware).
intptr_t mmu = of_instance_to_package(sMmuInstance);
struct translation_map {
void *PhysicalAddress() {
int64_t p = data;
// Sign extend
p <<= 23;
p >>= 23;
// Remove low bits
p &= 0xFFFFFFFFFFFFE000ll;
return (void*)p;
}
int16_t Mode() {
int16_t mode;
if (data & 2)
mode = PAGE_READ_WRITE;
else
mode = PAGE_READ_ONLY;
return mode;
}
void *virtual_address;
intptr_t length;
intptr_t data;
} translations[64];
int length = of_getprop(mmu, "translations", &translations,
sizeof(translations));
if (length == OF_FAILED) {
dprintf("Error: no OF translations.\n");
return B_ERROR;
}
length = length / sizeof(struct translation_map);
uint32 total = 0;
dprintf("found %d translations\n", length);
for (int i = 0; i < length; i++) {
struct translation_map *map = &translations[i];
bool keepRange = true;
TRACE("%i: map: %p, length %ld -> phy %p mode %d\n", i,
map->virtual_address, map->length,
map->PhysicalAddress(), map->Mode());
// insert range in physical allocated, if it points to physical memory
if (is_physical_memory(map->PhysicalAddress())
&& insert_physical_allocated_range((addr_t)map->PhysicalAddress(),
map->length) != B_OK) {
dprintf("cannot map physical allocated range "
"(num ranges = %" B_PRIu32 ")!\n",
gKernelArgs.num_physical_allocated_ranges);
return B_ERROR;
}
// insert range in virtual allocated
if (insert_virtual_allocated_range((addr_t)map->virtual_address,
map->length) != B_OK) {
dprintf("cannot map virtual allocated range "
"(num ranges = %" B_PRIu32 ")!\n",
gKernelArgs.num_virtual_allocated_ranges);
}
// insert range in virtual ranges to keep
if (keepRange) {
TRACE("%i: keeping free range starting at va %p\n", i,
map->virtual_address);
if (insert_virtual_range_to_keep(map->virtual_address,
map->length) != B_OK) {
dprintf("cannot map virtual range to keep "
"(num ranges = %" B_PRIu32 ")\n",
gKernelArgs.num_virtual_allocated_ranges);
}
}
total += map->length;
}
dprintf("total size kept: %" B_PRIu32 "\n", total);
// remove the boot loader code from the virtual ranges to keep in the
// kernel
if (remove_virtual_range_to_keep(&__text_begin, &_end - &__text_begin)
!= B_OK) {
dprintf("%s: Failed to remove boot loader range "
"from virtual ranges to keep.\n", __func__);
}
return B_OK;
}
@@ -227,8 +301,9 @@ find_free_physical_range(size_t size)
= (void *)(addr_t)(gKernelArgs.physical_allocated_range[i].start
+ gKernelArgs.physical_allocated_range[i].size);
if (!is_physical_allocated(address, size)
-&& is_physical_memory(address, size))
+&& is_physical_memory(address, size)) {
return address;
+}
}
return PHYSINVAL;
}
@@ -278,8 +353,10 @@ arch_mmu_allocate(void *_virtualAddress, size_t size, uint8 _protection,
// If no address is given, use the KERNEL_BASE as base address, since
// that avoids trouble in the kernel, when we decide to keep the region.
void *virtualAddress = _virtualAddress;
+#if 0
if (!virtualAddress)
virtualAddress = (void*)KERNEL_BASE;
+#endif
// find free address large enough to hold "size"
virtualAddress = find_free_virtual_range(virtualAddress, size);
@@ -294,6 +371,19 @@ arch_mmu_allocate(void *_virtualAddress, size_t size, uint8 _protection,
return NULL;
}
#if 0
intptr_t status;
/* claim the address */
status = of_call_method(sMmuInstance, "claim", 3, 1, 0, size,
virtualAddress, &_virtualAddress);
if (status != 0) {
dprintf("arch_mmu_allocate(base: %p, size: %" B_PRIuSIZE ") "
"failed to claim virtual address\n", virtualAddress, size);
return NULL;
}
#endif
// we have a free virtual range for the allocation, now
// have a look for free physical memory as well (we assume
// that a) there is enough memory, and b) failing is fatal
@@ -308,12 +398,23 @@ arch_mmu_allocate(void *_virtualAddress, size_t size, uint8 _protection,
// everything went fine, so lets mark the space as used.
dprintf("mmu_alloc: va %p, pa %p, size %" B_PRIuSIZE "\n", virtualAddress,
physicalAddress, size);
+#if 0
+void* _physicalAddress;
+status = of_call_method(sMemoryInstance, "claim", 3, 1, physicalAddress,
+1, size, &_physicalAddress);
+if (status != 0) {
+dprintf("arch_mmu_allocate(base: %p, size: %" B_PRIuSIZE ") "
+"failed to claim physical address\n", physicalAddress, size);
+return NULL;
+}
+#endif
insert_virtual_allocated_range((addr_t)virtualAddress, size);
insert_physical_allocated_range((addr_t)physicalAddress, size);
-map_range(virtualAddress, physicalAddress, size, protection);
+if (!map_range(virtualAddress, physicalAddress, size, protection))
+return NULL;
return virtualAddress;
}
@@ -330,6 +431,7 @@ arch_mmu_free(void *address, size_t size)
// #pragma mark - OpenFirmware callbacks and public API
+#if 0
static int
map_callback(struct of_arguments *args)
{
@@ -420,11 +522,13 @@ callback(struct of_arguments *args)
return OF_FAILED;
}
+#endif
extern "C" status_t
arch_set_callback(void)
{
#if 0
// set OpenFirmware callbacks - it will ask us for memory after that
// instead of maintaining it itself
@ -435,6 +539,7 @@ arch_set_callback(void)
return B_ERROR;
}
TRACE("old callback = %p; new callback = %p\n", oldCallback, callback);
#endif
return B_OK;
}
@@ -443,6 +548,15 @@ arch_set_callback(void)
extern "C" status_t
arch_mmu_init(void)
{
if (of_getprop(gChosen, "mmu", &sMmuInstance, sizeof(int)) == OF_FAILED) {
dprintf("%s: Error: no OpenFirmware mmu\n", __func__);
return B_ERROR;
}
if (of_getprop(gChosen, "memory", &sMemoryInstance, sizeof(int)) == OF_FAILED) {
dprintf("%s: Error: no OpenFirmware memory\n", __func__);
return B_ERROR;
}
// get map of physical memory (fill in kernel_args structure)
size_t total;
@@ -452,6 +566,44 @@ arch_mmu_init(void)
}
dprintf("total physical memory = %luMB\n", total / (1024 * 1024));
void *exceptionHandlers = (void *)-1;
if (find_allocated_ranges(&exceptionHandlers) != B_OK) {
dprintf("Error: find_allocated_ranges() failed\n");
return B_ERROR;
}
#if 0
if (exceptionHandlers == (void *)-1) {
// TODO: create mapping for the exception handlers
dprintf("Error: no mapping for the exception handlers!\n");
}
// Set the Open Firmware memory callback. From now on the Open Firmware
// will ask us for memory.
arch_set_callback();
// set up new page table and turn on translation again
// TODO "set up new page table and turn on translation again" (see PPC)
#endif
// set kernel args
dprintf("virt_allocated: %" B_PRIu32 "\n",
gKernelArgs.num_virtual_allocated_ranges);
dprintf("phys_allocated: %" B_PRIu32 "\n",
gKernelArgs.num_physical_allocated_ranges);
dprintf("phys_memory: %" B_PRIu32 "\n",
gKernelArgs.num_physical_memory_ranges);
#if 0
// TODO set gKernelArgs.arch_args content if we have something to put in there
gKernelArgs.arch_args.page_table.start = (addr_t)sPageTable;
gKernelArgs.arch_args.page_table.size = tableSize;
gKernelArgs.arch_args.exception_handlers.start = (addr_t)exceptionHandlers;
gKernelArgs.arch_args.exception_handlers.size = B_PAGE_SIZE;
#endif
return B_OK;
}

View File

@@ -221,7 +221,7 @@ find_physical_memory_ranges(phys_addr_t &total)
// On 64-bit PowerPC systems (G5), our mem base range address is larger
if (regAddressCells == 2) {
-struct of_region<uint64> regions[64];
+struct of_region<uint64, uint32> regions[64];
int count = of_getprop(package, "reg", regions, sizeof(regions));
if (count == OF_FAILED)
count = of_getprop(memory, "reg", regions, sizeof(regions));
@@ -251,7 +251,7 @@ find_physical_memory_ranges(phys_addr_t &total)
}
// Otherwise, normal 32-bit PowerPC G3 or G4 have a smaller 32-bit one
-struct of_region<uint32> regions[64];
+struct of_region<uint32, uint32> regions[64];
int count = of_getprop(package, "reg", regions, sizeof(regions));
if (count == OF_FAILED)
count = of_getprop(memory, "reg", regions, sizeof(regions));

View File

@@ -8,6 +8,7 @@ KernelMergeObject kernel_arch_sparc.o :
arch_debug_console.cpp
arch_elf.cpp
arch_int.cpp
arch_mmu.cpp
arch_platform.cpp
arch_real_time_clock.cpp
arch_smp.cpp

View File

@@ -0,0 +1,76 @@
/*
** Copyright 2019, Adrien Destugues, pulkomandy@pulkomandy.tk. All rights reserved.
** Distributed under the terms of the MIT License.
*/
#include <arch_mmu.h>
#include <arch_cpu.h>
#include <debug.h>
// Address space identifiers for the MMUs
// Ultrasparc User Manual, Table 6-10
enum {
instruction_control_asi = 0x50,
data_control_asi = 0x58,
instruction_8k_tsb_asi = 0x51,
data_8k_tsb_asi = 0x59,
instruction_64k_tsb_asi = 0x52,
data_64k_tsb_asi = 0x5A,
data_direct_tsb_asi = 0x5B,
instruction_tlb_in_asi = 0x54,
data_tlb_in_asi = 0x5C,
instruction_tlb_access_asi = 0x55,
data_tlb_access_asi = 0x5D,
instruction_tlb_read_asi = 0x56,
data_tlb_read_asi = 0x5E,
instruction_tlb_demap_asi = 0x57,
data_tlb_demap_asi = 0x5F,
};
// MMU register addresses
// Ultrasparc User Manual, Table 6-10
enum {
tsb_tag_target = 0x00, // I/D, RO
primary_context = 0x08, // D, RW
secondary_context = 0x10, // D, RW
synchronous_fault_status = 0x18, // I/D, RW
synchronous_fault_address = 0x20, // D, RO
tsb = 0x28, // I/D, RW
tlb_tag_access = 0x30, // I/D, RW
virtual_watchpoint = 0x38, // D, RW
physical_watchpoint = 0x40 // D, RW
};
extern void sparc_get_instruction_tsb(TsbEntry **_pageTable, size_t *_size)
{
uint64_t tsbEntry;
// Read the I-MMU TSB register (offset 0x28, "tsb" above) through
// ASI 0x50 (instruction_control_asi).
asm("ldxa [%[mmuRegister]] 0x50, %[destination]"
: [destination] "=r"(tsbEntry)
: [mmuRegister] "r"(tsb));
// Bits 63-13 hold the TSB base address; the low bits encode the number
// of entries (512 shifted left by the size field), and bit 12 is the
// "split" flag, which doubles the buffer.
*_pageTable = (TsbEntry*)(tsbEntry & ~((1ll << 13) - 1));
*_size = 512 * (1 << (tsbEntry & 3)) * sizeof(TsbEntry);
if (tsbEntry & (1 << 12))
*_size *= 2;
}
extern void sparc_get_data_tsb(TsbEntry **_pageTable, size_t *_size)
{
uint64_t tsbEntry;
// Same register (offset 0x28), but read through ASI 0x58
// (data_control_asi) to get the D-MMU TSB instead.
asm("ldxa [%[mmuRegister]] 0x58, %[destination]"
: [destination] "=r"(tsbEntry)
: [mmuRegister] "r"(tsb));
*_pageTable = (TsbEntry*)(tsbEntry & ~((1ll << 13) - 1));
*_size = 512 * (1 << (tsbEntry & 3)) * sizeof(TsbEntry);
if (tsbEntry & (1 << 12))
*_size *= 2;
}