process context ('reaper').
From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediatelly marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit
uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.
MD process/lwp exit code now always calls lwp_exit2() immediatelly after
switching away from the exiting lwp.
g/c now unneeded routines and variables, including the reaper kernel thread
instead.
With this change, we no longer need to save the current interrupt level
in the switchframe. This is no great loss since both cpu_switch and
cpu_switchto are always called at splsched, so the process' spl is
effectively saved somewhere in the callstack.
This fixes an evbarm problem reported by Allen Briggs:
lwp gets into sa_switch -> mi_switch with newl != NULL
when it's the last element on the runqueue, so it
hits the second bit of:
if (newl == NULL) {
retval = cpu_switch(l, NULL);
} else {
remrunqueue(newl);
cpu_switchto(l, newl);
retval = 0;
}
mi_switch calls remrunqueue() and cpu_switchto()
cpu_switchto unlocks the sched lock
cpu_switchto drops CPU priority
softclock is received
schedcpu is called from softclock
schedcpu hits the first if () {} block here:
if (l->l_priority >= PUSER) {
if (l->l_stat == LSRUN &&
(l->l_flag & L_INMEM) &&
(l->l_priority / PPQ) != (l->l_usrpri / PPQ)) {
remrunqueue(l);
l->l_priority = l->l_usrpri;
setrunqueue(l);
} else
l->l_priority = l->l_usrpri;
}
Since mi_switch has already run remrunqueue, the LWP has been
removed, but it's not been put back on any queue, so the
remrunqueue panics.
directly. The old code was totally bogus for the new pmap. New code
lifted from SH5 port.
Fixes panics in ffs_balloc_ufs2() seen while stress-testing a file
system on an XScale-based server platform.
* Define a new "MMU type", ARM_MMU_SA1. While the SA-1's MMU is basically
compatible with the generic, the SA-1 cache does not have a write-through
mode, and it is useful to know have an indication of this.
* Add a new PMAP_NEEDS_PTE_SYNC indicator, and try to evaluate it at
compile time. We evaluate it like so:
- If SA-1-style MMU is the only type configured -> 1
- If SA-1-style MMU is not configured -> 0
- Otherwise, defer to a run-time variable.
If PMAP_NEEDS_PTE_SYNC might evaluate to true (SA-1 only or run-time
check), then we also define PMAP_INCLUDE_PTE_SYNC so that e.g. assembly
code can include the necessary run-time support. PMAP_INCLUDE_PTE_SYNC
largely replaces the ARM32_PMAP_NEEDS_PTE_SYNC manual setting Steve
included with the original new pmap.
* In the new pmap, make pmap_pte_init_generic() check to see if the CPU
has a write-back cache. If so, init the PT cache mode to C=1,B=0 to get
write-through mode. Otherwise, init the PT cache mode to C=1,B=1.
* Add a new pmap_pte_init_arm8(). Old pmap, same as generic. New pmap,
sets page table cacheability to 0 (ARM8 has a write-back cache, but
flushing it is quite expensive).
* In the new pmap, make pmap_pte_init_arm9() reset the PT cache mode to
C=1,B=0, since the write-back check in generic gets it wrong for ARM9,
since we use write-through mode all the time on ARM9 right now. (What
this really tells me is that the test for write-through cache is less
than perfect, but we can fix that later.)
* Add a new pmap_pte_init_sa1(). Old pmap, same as generic. New pmap,
does generic initialization, then resets page table cache mode to
C=1,B=1, since C=1,B=0 does not produce write-through on the SA-1.
Some features of the new pmap are:
- It allows L1 descriptor tables to be shared efficiently between
multiple processes. A typical "maxusers 32" kernel, where NPROC is set
to 532, requires 35 L1s. A "maxusers 2" kernel runs quite happily
with just 4 L1s. This completely solves the problem of running out
of contiguous physical memory for allocating new L1s at runtime on a
busy system.
- Much improved cache/TLB management "smarts". This change ripples
out to encompass the low-level context switch code, which is also
much smarter about when to flush the cache/TLB, and when not to.
- Faster allocation of L2 page tables and associated metadata thanks,
in part, to the pool_cache enhancements recently contributed to
NetBSD by Wasabi Systems.
- Faster VM space teardown due to accurate referenced tracking of L2
page tables.
- Better/faster cache-alias tracking.
The new pmap is enabled by adding options ARM32_PMAP_NEW to the kernel
config file, and making the necessary changes to the port-specific
initarm() function. Several ports have already been converted and will
be committed shortly.
and hence save fewer into the PCB. This should give me enough free
registers in cpu_switch to tidy things up and support MULTIPROCESSOR
properly. While we're here, make the stacked registers into an
APCS stack frame, so that DDB backtraces through cpu_switch() will
work.
This also affects cpu_fork(), which has to fabricate a switchframe and
PCB for the new process.
counters. These counters do not exist on all CPUs, but where they
do exist, can be used for counting events such as dcache misses that
would otherwise be difficult or impossible to instrument by code
inspection or hardware simulation.
pmc(9) is meant to be a general interface. Initially, the Intel XScale
counters are the only ones supported.
nathanw_sa branch.
* In switch_exit(), set the outgoing-proc register to NULL (rather than
proc0) so that we actually use the "exiting process" optimization in
cpu_switch().
* Don't refer to VA 0, instead refer to a new variable: vector_page
* Delete the old zero_page_*() functions, replacing them with a new
one: vector_page_setprot().
* When manipulating vector page mappings in user pmaps, only do so if
the vector page is below KERNEL_BASE (if it's above KERNEL_BASE, the
vector page is mapped by the kernel pmap).
* Add a new function, arm32_vector_init(), which takes the virtual
address of the vector page (which MUST be valid when the function
is called) and a bitmask of vectors the kernel is going to take
over, and performs all vector page initialization, including setting
the V bit in the CPU Control register ("relocate vectors to high
address"), if necessary.
Be consistant in the way that MSIZE, MCLSHIFT, MCLBYTES and NMBCLUSTERS
are defined.
Remove old VM constants from cesfic port.
Bump MSIZE to 256 on mipsco (the only one that wasn't already 256).
pass. Rather than providing a whole slew of cache operations that
aren't ever used, distill them down to some useful primitives:
icache_sync_all Synchronize I-cache
icache_sync_range Synchronize I-cache range
dcache_wbinv_all Write-back and Invalidate D-cache
dcache_wbinv_range Write-back and Invalidate D-cache range
dcache_inv_range Invalidate D-cache range
dcache_wb_range Write-back D-cache range
idcache_wbinv_all Write-back and Invalidate D-cache,
Invalidate I-cache
idcache_wbinv_range Write-back and Invalidate D-cache,
Invalidate I-cache range
Note: This does not yet include an overhaul of the actual asm files
that implement the primitives. Instead, we've provided a safe default
for each CPU type, and the individual CPU types can now be optimized
one at a time.
switch_exit only needs to take 1 parameter, it loads the value of proc0 into R1 itself
Fixup some comments to reflect the real state of things.
Tweak a couple of bits of asm to avoid a load delay.
remove excess code for setting curpcb and curproc.
is shared with another process (as can happen if vfork is being used),
then that other process will end up not having a page 0, which is bad
news indeed, since then there is no way back into the kernel.
Found this using a multi-ice box, so they are useful after all!
This seems to fix pr port-arm32/11921 and (possibly) kern/9859.
This will allow improvements to the pmaps so that they can more easily defer expensive operations, eg tlb/cache flush, til the last possible moment.
Currently this is a no-op on most platforms, so they should see no difference.
Reviewed by Jason.