- Use the PVO_CACHEABLE flag in the pvo as the One True Indicator of
the cacheable status of a mapping, instead of peeking at the PTEH
(a sketch follows this list).
- Don't inline some of the larger routines, in an effort to appease
the somewhat buggy compiler.
- Fix some comments.
- Fix some casts.
- Add a bunch more debugging instrumentation.
- Move usr, sr, pc, and the branch-target registers to the top of
the listing so that it is no longer necessary to scroll through
64 integer registers to see them.
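A minimal sketch of the PVO_CACHEABLE rule, assuming the flag lives in
the low bits of pvo_vaddr as in the mpc6xx pmap this code was based on
(names and bit value illustrative):

    struct pvo_entry {
        unsigned long pvo_vaddr;    /* VA, with flag bits in the low bits */
        /* ... other fields elided ... */
    };

    #define PVO_CACHEABLE   0x10UL  /* assumed flag bit */

    /* The pvo, not the PTEH, is now the One True Indicator. */
    static inline int
    pvo_cacheable(const struct pvo_entry *pvo)
    {
        return (pvo->pvo_vaddr & PVO_CACHEABLE) != 0;
    }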
The main bug fixes are:
- pmap_pvo_remove() must calculate the kipt index if the idx param is -1
(a sketch follows this list).
- Don't assume that if a pmap's ASID generation is out of date that we
can skip purging/invalidating the cache for any of its constituent
mappings. At this time, the ASID generation just indicates that none
of its mappings are in the TLB. However, there may still be some valid
cache entries for them.
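A sketch of the idx == -1 case, with a hypothetical kva_to_iptidx()
helper; the real kernel IPT layout and base address may differ:

    #include <stdint.h>

    #define KIPT_BASE   0xc0000000UL    /* assumed base of the kIPT-mapped VA range */
    #define NBPG        4096            /* page size */

    /* Derive the kernel IPT slot from the mapping's own KVA. */
    static int
    kva_to_iptidx(uintptr_t kva)
    {
        return (int)((kva - KIPT_BASE) / NBPG);
    }

    /* pmap_pvo_remove(pvo, -1) would then do: idx = kva_to_iptidx(va); */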
Finally, the subtle NFS and buffer cache corruption problems disappear.
useful should one occur.
- Manually poke some config values into the sh5pci host bridge's
config registers since it doesn't appear in configuration space.
- Reserve the first 256 bytes of i/o space to avoid assigning i/o
address 0 to any cards.
- Slight tweak to the initialisation code after consultation with
SuperH and the linux driver.
map accurately tracks the same flag in the segments belonging to it.
The map's copy can be set only if all the segments are coherent.
This finally gets NFS writes fully working on my PCI ex(4) card.
- Track unmanaged mappings of RAM more closely by allocating a pvo
for them. This allows us to check more accurately for multiple
cache-mode-incompatible mappings.
- As part of the above, implement pmap_steal_memory(). This has the
beneficial side-effect of moving a fair chunk of kernel data
structures into KSEG0 (a sketch follows below).
pretty much working, at least for non-NFS use.
With NFS, it fails under pressure probably due to operand cache aliases
between KSEG0 and regular 4KB mappings elsewhere. Sigh.
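A rough sketch of the pmap_steal_memory() idea: pages taken from the
bottom of the avail range before UVM is up can be handed back as KSEG0
addresses, so they need no page tables. SH5_PHYS_TO_KSEG0() and
avail_start are assumptions here, not necessarily the port's real names:

    vaddr_t
    pmap_steal_memory(vsize_t size, vaddr_t *vstartp, vaddr_t *vendp)
    {
        paddr_t pa;

        /* (vstartp/vendp handling elided) */
        size = round_page(size);
        pa = avail_start;               /* first free physical page */
        avail_start += size;
        memset((void *)SH5_PHYS_TO_KSEG0(pa), 0, size);
        return SH5_PHYS_TO_KSEG0(pa);
    }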
operand cache synonyms and paradoxes for shared mappings (the policy
is sketched in code after the list):
- Writable mappings are cache-inhibited if the underlying physical
page is mapped at two or more *different* VAs.
This means that read-only mappings at different VAs are still
cacheable. While this could lead to operand cache synonyms, it
won't cause data loss. At worst, we'd have the same read-only
data in several cache-lines.
- If a new cache-inhibited mapping is added for a page which has
existing cacheable mappings, all the existing mappings must be
made cache-inhibited.
- Conversely, if a new cacheable mapping is added for a page which
has existing cache-inhibited mappings, the new mapping must also
be made cache-inhibited.
- When a mapping is removed, see if we can upgrade any of the
underlying physical page's remaining mappings to cacheable.
TODO: Deal with operand cache aliases (if necessary).
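The policy boils down to something like the following hypothetical
helper (the real pmap derives these predicates by walking the page's
pvo list rather than taking pre-computed flags):

    /* Decide whether a new mapping of a physical page may be cached. */
    static int
    new_mapping_cacheable(int page_has_nc_mapping, int writable,
        int mapped_at_other_va)
    {
        /* Follow any existing cache-inhibited mapping. */
        if (page_has_nc_mapping)
            return 0;
        /* Writable and mapped at two or more distinct VAs: inhibit. */
        if (writable && mapped_at_other_va)
            return 0;
        /* Read-only synonyms are harmless: at worst, duplicate lines. */
        return 1;
    }

The converse case, where a new cache-inhibited mapping forces existing
cacheable mappings to be downgraded, is the pvo-list walk the sketch omits.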
of the loadable sections to correspond to the physical address of
RAM in the Cayman. This is so sh5gdb uploads the image to the correct
place. (Should've done this ages ago instead of manually running a
script...)
This can be removed when I get a native bootloader written.
cacheable attribute of a mapping.
- Honour PMAP_NC in pmap_enter() using NOCACHE, instead of DEVICE.
- No longer need to re-fetch the ptel in pmap_pa_unmap_kva() as
syncing the cache no longer risks causing a TLB miss.
- Re-define bus_size_t and bus_addr_t to be u_int32_t.
While this may well lose for future silicon with NEFFBITS > 32, the
original u_long was a waste on current designs (especially for _LP64).
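The change, in header terms:

    typedef u_int32_t   bus_addr_t;     /* was u_long */
    typedef u_int32_t   bus_size_t;     /* was u_long */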
Allocate/Prefetch one cache-line ahead of the one we're about to deal with.
This reduces the chances of the cpu stalling while waiting for the cache
to flush a dirty line in order to satisfy the Allocate/Prefetch request.
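The technique, sketched in C with GCC's builtin (the real routines are
presumably SH5 assembly using alloco/prefetch, and the 32-byte line
size is an assumption):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define CLINE   32      /* assumed operand cache line size */

    /* Touch line n+1 before working on line n (tail handling elided). */
    static void
    copy_ahead(uint8_t *dst, const uint8_t *src, size_t len)
    {
        size_t off;

        for (off = 0; off + CLINE <= len; off += CLINE) {
            __builtin_prefetch(src + off + CLINE, 0);  /* next src line */
            __builtin_prefetch(dst + off + CLINE, 1);  /* next dst line */
            memcpy(dst + off, src + off, CLINE);
        }
    }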
registers in any trap/interrupt exception frame found.
- Slight tweak to more accurately detect the correct call-site when
looking for a function's prologue.
This ensures we start from the actual call site, not the return address.
The latter may actually be in the next consecutive function if the current
function has the __noreturn__ attribute and the alignment is Just Right.
pointer, in case the caller grew its stack dynamically.
Also beef up the checks to catch cases where the call stack passes
through the exception handling code in locore. In this case, the
frame pointer and program counter are in the trapframe/intrframe.
problem, such that a TLB miss no longer occurs.
With the above, it is now safe to enable write-back caching for userland
mappings.
TODO: Deal with cache issues for shared mappings with different VAs.
- Add event counters for some key pmap events (similar to mpc6xx pmap;
a sketch follows this list).
- Use the cache-friendly, optimised copy/zero page functions.
- Add the necessary cache management code to enable WriteBack caching
of KSEG1 mappings. Seems to work fine so far.
- Use the PMAP_ASID_* constants from pmap.h
- Track pmap_pvo_{enter,remove}() depth in the same way as mpc6xx's pmap
(on which this pmap was originally based).
- Some misc. tidying up and added commentary.
- Use the VA/KVA to select whether to use the IPT or PTEG instead of
checking which pmap is being operated on.
- Add a handy DDB-callable function which will scan the kernel IPT
looking for inconsistencies.
- Finally, when unmapping a pool page, purge the data cache for the
page. This permits write-back caching to be enabled for kernel
text/data.
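The counters use the standard evcnt(9) machinery; a minimal sketch
(counter and hook names illustrative):

    #include <sys/param.h>
    #include <sys/evcnt.h>

    static struct evcnt pmap_pvo_enter_ev =
        EVCNT_INITIALIZER(EVCNT_TYPE_MISC, NULL, "pmap", "pvo enter");

    void
    pmap_attach_evcnt(void)     /* hypothetical init hook */
    {
        evcnt_attach_static(&pmap_pvo_enter_ev);
    }

    /* ...and in pmap_pvo_enter(): pmap_pvo_enter_ev.ev_count++; */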
machine-specific code.
- Re-work the code which detects a nested critical section event.
We can now determine who is the owner of the critical section, and
what event occurred while it was owned.
- Work-around a silicon bug which can cause a nested critical event.
In the _EXCEPTION_ENTRY() macro (which sets up the critical section),
if there is a pending hardware interrupt which has a higher priority
than the current IMASK, then the "putcon" which supposedly clears SR.BL
and sets SR.IMASK to 0xf is not atomic. The pending hardware interrupt
will be taken, causing a nested critical section event. The work-around
is to update SR.BL and SR.IMASK separately, using two "putcon" insns
(sketched after this list).
- Make it possible to at least *try* to resume execution if we
get an NMI.
- Major clean-up of the panic/critical section trap handlers.
The dumped state is now much more accurate.
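In pseudo-C, the broken single update versus the split one. The bit
positions and getcon/putcon accessors are assumptions, as is the exact
ordering; the real fix is two "putcon" instructions in
_EXCEPTION_ENTRY(), presumably masking interrupts before dropping BL:

    #define SR_BL       (1UL << 28)     /* assumed bit position */
    #define SR_IMASK    (0xfUL << 4)    /* assumed bit positions */

    extern unsigned long getcon_sr(void);   /* hypothetical accessors */
    extern void putcon_sr(unsigned long);

    static void
    enter_critical_section(void)
    {
        unsigned long sr = getcon_sr();

        /* Buggy: clearing BL and raising IMASK in one write is not
         * atomic if a higher-priority interrupt is pending:
         *     putcon_sr((sr & ~SR_BL) | SR_IMASK);
         */

        /* Work-around: mask interrupts first, then drop BL. */
        putcon_sr(sr | SR_IMASK);
        putcon_sr((sr | SR_IMASK) & ~SR_BL);
    }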
so the register is correctly sign-extended.
- Some comment fixes.
- Restore the FP state from the sigcontext if FP regs were saved.
- Fix up r0 for the benefit of the syscall stub.
This merge changes the device switch tables from static arrays to
tables generated dynamically by config(8).
- All device switches are defined as constant structures in the
device drivers.
- The new grammar ``device-major'' is introduced to ``files''.
device-major <prefix> char <num> [block <num>] [<rules>]
- All device major numbers must be listed in the port-dependent
majors.<arch> using this grammar (an example follows this list).
- A new naming convention is added.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.
- Backward compatibility for loading a block/character device switch
via the LKM framework is broken. This is necessary in order to
convert between block/character device majors and device names at
runtime.
- The restriction on assigning device majors via LKM is completely
removed; we no longer need to reserve entries for dynamically
loaded device switches.
- At compile time, the list of device major numbers is packed into
the kernel, and the LKM framework refers to it when assigning
device majors dynamically.
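For example (numbers purely illustrative), a driver whose switches are
named mydev_bdevsw and mydev_cdevsw would be wired up in majors.<arch>
with a line such as:

    device-major    mydev   char 99 block 38    mydev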
- In _EXCEPTION_EXIT, copy the current ASID to the pre-exception
context before we switch.
- Fix the pteg hash generation code and EPN masking in the tlb
miss handler.
Sprinkle some DIAGNOSTIC checks.
to ensure the callee-saved set will be restored when we switch to it.
(It doesn't actually matter to the new process; it just inherits some
crud in those registers from the kernel if we don't set the bit).
Also ensure the strings pointer in r7 is sign-extended.
section of another exception. This is likely to happen if the kernel
stack is misaligned, has dropped off the bottom of the PCB, or has
otherwise gone into orbit.
In this case, switch to a safe stack, save as much of the machine
state as possible and dump it to the console.
accessing the kernel stack, since a TLB miss on the kernel stack
will result in r24 being trashed.
Also clear the ES_CRITICAL flag just before returning to the
previous context.
as intrframe and trapframe are concerned.
According to the ABI, only the low 32-bits of these registers are
guaranteed to be preserved by the callee. Therefore, we need to
preserve all 64-bits of them in the interrupt trampoline.
ToDo:
- Symbol support (can't test as yet, due to lack of symbols),
- Take notice of adjacent "movi/shori" instructions in order to display
the resulting 32/64-bit value, with symbol lookup if possible.
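Folding such a pair is simple: "movi" starts with a sign-extended
16-bit immediate, and each subsequent "shori" shifts the partial value
left 16 bits and ORs in another 16. A sketch:

    #include <stdint.h>

    /* movi imm16: start with a sign-extended immediate. */
    static int64_t
    fold_movi(uint16_t imm)
    {
        return (int64_t)(int16_t)imm;
    }

    /* shori imm16: widen the partial value by another 16 bits. */
    static int64_t
    fold_shori(int64_t val, uint16_t imm)
    {
        return (val << 16) | imm;
    }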
At the very least, this will dump the machine state. At best,
we get into ddb().
This provides a useful way to regain control using an NMI button
if the cpu decides to spin at a high ipl.
Make sure to zero-extend PTEH/PTEL values before comparing with TLB entries.
Don't use the two LSBs of CTC when choosing a "random" TLB entry to replace;
seems like these bits are always zero on this CPU.
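Something along these lines (the entry count and accessor are
assumptions):

    #include <stdint.h>

    #define TLB_NENTRIES    64          /* illustrative */

    extern uint64_t getcon_ctc(void);   /* hypothetical CTC accessor */

    /* Skip CTC's two always-zero LSBs when choosing a victim. */
    static unsigned int
    tlb_random_victim(void)
    {
        return (unsigned int)((getcon_ctc() >> 2) % TLB_NENTRIES);
    }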
waiting for ACKs from the DTF host, otherwise the simulator waits
way too long for the initial open-ACK (which never seems to arrive,
even though things work fine afterwards).
- selecting Simulator/ST50 Debugger targets,
- hard-coding the cpu speed instead of using the speed detection code,
- changing the default kernel IPT size,
- selecting the IRL[0-3] mode to configure in the interrupt controller.
previous behaviour of storing them with SR.BL clear was in breach
of the SH5 documentation.
Make an effort to catch PANIC traps and dump machine state to the console.
pcb contains valid state before copying it to p2's pcb.
Previously, we just lazily synced the FPU state. This wasn't quite
good enough if p1 had not previously slept.
way for bus_space(9) to efficiently map device memory. (Although at
the moment, it doesn't quite work as efficiently as it will down
the line ...)
Fix a pool_init() botch.
Add a debug aid: dump_kipt(). This can be called from ddb(4) in order
to (partially) dump the contents of the kernel IPT.
- Clear SR.FD to enable the FPU. Seems like it starts up disabled. If the
core has no FPU, this is a nop.
- Preserve the debug bits (step/watch) in an attempt to appease the debugger.
counters. These counters do not exist on all CPUs, but where they
do exist, they can be used for counting events such as dcache misses
that would otherwise be difficult or impossible to instrument by
code inspection or hardware simulation.
pmc(9) is meant to be a general interface. Initially, the Intel XScale
counters are the only ones supported.
through the net the first time around. Here's the relevant snippet
of the original commit message:
Rename cdev_systrace_init() to cdev_clonemisc_init(), so it can
be properly used by any misc. cloning device.
simple config file option.
Also, don't hard code the endian setting in a header file. Rely instead
on the compiler defining __LITTLE_ENDIAN__ and DTRT as appropriate.
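A sketch of the header-side selection, keyed off the toolchain-defined
macro (the exact NetBSD glue may differ):

    #if defined(__LITTLE_ENDIAN__)
    #define _BYTE_ORDER _LITTLE_ENDIAN
    #else
    #define _BYTE_ORDER _BIG_ENDIAN
    #endif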