Commit Graph

31 Commits

Author SHA1 Message Date
jdolecek 089abdad44 Rearrange the process exit path to avoid the need to free resources from a
different process context (the 'reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
  as FPU state), and is the last potentially blocking operation;
  all of cpu_wait(), and most of cpu_exit(), are now folded into cpu_lwp_free()
* the process is now immediately marked as a zombie and made available for
  pickup by its parent; the last remaining lwp continues the exit fully detached
* MI (rather than MD) code bumps uvmexp.swtch; cpu_exit() is now the same
  for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce a (blocking)
uvm_uarea_drain(), which is called to release excess u-area memory; it is
called by the parent within wait4(), or by the pagedaemon on memory shortage.
uvm_uarea_free() is now a private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now-unneeded routines and variables, including the reaper kernel thread.
2004-01-04 11:33:29 +00:00
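
A minimal sketch of the u-area freelist scheme described above, with simplified bookkeeping (UAREA_HIWAT and the chaining-through-the-first-word trick are illustrative; locking is elided, and the real uvm_glue.c code differs in detail):

        /* Exiting-lwp side: link the u-area, never block. */
        static vaddr_t uarea_freelist = 0;      /* chained via first word */
        static int uarea_nfree = 0;

        static void
        uvm_uarea_free(vaddr_t uaddr)           /* now private to uvm_glue.c */
        {
                *(vaddr_t *)uaddr = uarea_freelist;
                uarea_freelist = uaddr;
                uarea_nfree++;
        }

        /* Called by the parent in wait4() or by the pagedaemon; may block. */
        void
        uvm_uarea_drain(boolean_t empty)
        {
                int limit = empty ? 0 : UAREA_HIWAT;    /* hypothetical high-water mark */

                while (uarea_nfree > limit) {
                        vaddr_t uaddr = uarea_freelist;

                        uarea_freelist = *(vaddr_t *)uaddr;
                        uarea_nfree--;
                        uvm_km_free(kernel_map, uaddr, USPACE);
                }
        }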
scw 52c15bbd20 Don't drop to spl0 in cpu_switch/cpu_switchto. Do it in the idle loop
instead.

With this change, we no longer need to save the current interrupt level
in the switchframe. This is no great loss since both cpu_switch and
cpu_switchto are always called at splsched, so the process' spl is
effectively saved somewhere in the callstack.

This fixes an evbarm problem reported by Allen Briggs:

        lwp gets into sa_switch -> mi_switch with newl != NULL
            when it's the last element on the runqueue, so it
            hits the else branch of:
                if (newl == NULL) {
                        retval = cpu_switch(l, NULL);
                } else {
                        remrunqueue(newl);
                        cpu_switchto(l, newl);
                        retval = 0;
                }

        mi_switch calls remrunqueue() and cpu_switchto()

        cpu_switchto unlocks the sched lock
        cpu_switchto drops CPU priority
        softclock is received
        schedcpu is called from softclock
        schedcpu hits the first if () {} block here:
                if (l->l_priority >= PUSER) {
                        if (l->l_stat == LSRUN &&
                            (l->l_flag & L_INMEM) &&
                            (l->l_priority / PPQ) != (l->l_usrpri / PPQ)) {
                                remrunqueue(l);
                                l->l_priority = l->l_usrpri;
                                setrunqueue(l);
                        } else
                                l->l_priority = l->l_usrpri;
                }

        Since mi_switch has already run remrunqueue, the LWP has been
            removed, but it has not been put back on any queue, so
            remrunqueue panics.
2003-10-23 08:59:10 +00:00
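
A hypothetical C rendering of the new arrangement (the real idle loop is assembly in cpuswitch.S); the point is that the spl0()/splsched() transition now belongs to the idle loop, so cpu_switch()/cpu_switchto() run at splsched throughout:

        for (;;) {
                spl0();                 /* idle: allow all interrupts */
                while (sched_whichqs == 0)
                        continue;       /* wait for a runnable lwp */
                (void)splsched();       /* raise the IPL again before the
                                           run queues are touched and we
                                           switch to the chosen lwp */
        }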
lukem 08716eae82 __KERNEL_RCSID() 2003-07-15 00:24:37 +00:00
thorpej df7c83c4a0 Rewrite pagemove() to use pmap functions, rather than frobbing PTEs
directly.  The old code was totally bogus for the new pmap.  New code
lifted from the SH5 port.

Fixes panics in ffs_balloc_ufs2() seen while stress-testing a file
system on an XScale-based server platform.
2003-05-17 00:41:36 +00:00
thorpej bbef46a7e9 Some ARM32_PMAP_NEW-related cleanup:
* Define a new "MMU type", ARM_MMU_SA1.  While the SA-1's MMU is basically
  compatible with the generic, the SA-1 cache does not have a write-through
  mode, and it is useful to have an indication of this.
* Add a new PMAP_NEEDS_PTE_SYNC indicator, and try to evaluate it at
  compile time.  We evaluate it like so:
  - If SA-1-style MMU is the only type configured -> 1
  - If SA-1-style MMU is not configured -> 0
  - Otherwise, defer to a run-time variable.
  If PMAP_NEEDS_PTE_SYNC might evaluate to true (SA-1 only or run-time
  check), then we also define PMAP_INCLUDE_PTE_SYNC so that e.g. assembly
  code can include the necessary run-time support.  PMAP_INCLUDE_PTE_SYNC
  largely replaces the ARM32_PMAP_NEEDS_PTE_SYNC manual setting Steve
  included with the original new pmap.
* In the new pmap, make pmap_pte_init_generic() check to see if the CPU
  has a write-back cache.  If so, init the PT cache mode to C=1,B=0 to get
  write-through mode.  Otherwise, init the PT cache mode to C=1,B=1.
* Add a new pmap_pte_init_arm8().  Old pmap, same as generic.  New pmap,
  sets page table cacheability to 0 (ARM8 has a write-back cache, but
  flushing it is quite expensive).
* In the new pmap, make pmap_pte_init_arm9() reset the PT cache mode to
  C=1,B=0, since the write-back check in generic gets it wrong for ARM9:
  we use write-through mode all the time on ARM9 right now.  (What this
  really tells me is that the test for write-through cache is less than
  perfect, but we can fix that later.)
* Add a new pmap_pte_init_sa1().  Old pmap, same as generic.  New pmap,
  does generic initialization, then resets page table cache mode to
  C=1,B=1, since C=1,B=0 does not produce write-through on the SA-1.
2003-04-22 00:24:48 +00:00
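
The compile-time evaluation might look like the following sketch (keyed off the ARM_MMU_* configuration counts; the real logic lives in the arm32 pmap header and may differ in detail):

        #if (ARM_MMU_SA1 == 1) && (ARM_MMU_GENERIC + ARM_MMU_XSCALE == 0)
        /* SA-1-style MMU is the only type configured. */
        #define PMAP_NEEDS_PTE_SYNC     1
        #define PMAP_INCLUDE_PTE_SYNC
        #elif (ARM_MMU_SA1 == 0)
        /* No SA-1-style MMU configured. */
        #define PMAP_NEEDS_PTE_SYNC     0
        #else
        /* Mixed configuration: defer to a run-time variable. */
        extern int pmap_needs_pte_sync;
        #define PMAP_NEEDS_PTE_SYNC     pmap_needs_pte_sync
        #define PMAP_INCLUDE_PTE_SYNC
        #endif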
scw 41a1932e58 Add the generic arm32 bits of the new pmap, contributed by Wasabi Systems.
Some features of the new pmap are:

 - It allows L1 descriptor tables to be shared efficiently between
   multiple processes. A typical "maxusers 32" kernel, where NPROC is set
   to 532, requires 35 L1s. A "maxusers 2" kernel runs quite happily
   with just 4 L1s. This completely solves the problem of running out
   of contiguous physical memory for allocating new L1s at runtime on a
   busy system.

 - Much improved cache/TLB management "smarts". This change ripples
   out to encompass the low-level context switch code, which is also
   much smarter about when to flush the cache/TLB, and when not to.

 - Faster allocation of L2 page tables and associated metadata thanks,
   in part, to the pool_cache enhancements recently contributed to
   NetBSD by Wasabi Systems.

 - Faster VM space teardown due to accurate reference tracking of L2
   page tables.

 - Better/faster cache-alias tracking.

The new pmap is enabled by adding options ARM32_PMAP_NEW to the kernel
config file, and making the necessary changes to the port-specific
initarm() function. Several ports have already been converted and will
be committed shortly.
2003-04-18 11:08:24 +00:00
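
A sketch of what shared L1 tables imply for the pmap structures (names and fields are illustrative, not the committed layout): each pmap claims an MMU domain within some shared L1 rather than owning a whole 16KB table of its own.

        /* One hardware L1 translation table, shared by many pmaps. */
        struct l1_ttable {
                pd_entry_t *l1_kva;             /* kernel VA of the L1 */
                paddr_t l1_physaddr;            /* PA loaded into the TTB */
                u_int l1_refcnt;                /* pmaps sharing this L1 */
        };

        struct pmap {
                struct l1_ttable *pm_l1;        /* shared, not per-process */
                u_int pm_domain;                /* this pmap's MMU domain */
                /* ... per-pmap L2 metadata, statistics, etc. ... */
        };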
thorpej 95281cabad Use PAGE_SIZE rather than NBPG. 2003-04-01 23:19:08 +00:00
thorpej 23bc250391 Merge the nathanw_sa branch. 2003-01-17 21:55:23 +00:00
chris 3dd552c1b2 Fix DEBUG kernels not making it into multiuser on cats (as spotted by
nick).
When wiring a page with pmap_enter, you must supply the protection in the
flags as well as in the prot (see the example below).
2002-11-24 01:07:47 +00:00
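
A minimal example of the calling convention the fix relies on (illustrative values):

        /* Wiring a page: the protection goes in 'flags' as well as 'prot'. */
        pmap_enter(pmap, va, pa, VM_PROT_READ | VM_PROT_WRITE,
            VM_PROT_READ | VM_PROT_WRITE | PMAP_WIRED);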
bjh21 a531a4ae8e Undo recent cpu_switch register usage changes in order to decrease nathanw_sa
merge pain.
2002-10-19 00:10:53 +00:00
bjh21 3d1b6867f0 In cpu_switch(), stack more registers at the start of the function,
and hence save fewer into the PCB.  This should give me enough free
registers in cpu_switch to tidy things up and support MULTIPROCESSOR
properly.  While we're here, make the stacked registers into an
APCS stack frame, so that DDB backtraces through cpu_switch() will
work.

This also affects cpu_fork(), which has to fabricate a switchframe and
PCB for the new process.
2002-10-18 21:32:57 +00:00
thorpej 6cc7c1c1ff * Add PTE_SYNC() and PTE_SYNC_RANGE() macros. These don't actually do
anything yet.
* Use PTE_SYNC() and PTE_SYNC_RANGE() in some obvious places, i.e.
  where vtopte() is used.
2002-08-22 01:13:53 +00:00
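
When they do grow an implementation, the macros will presumably reduce to a D-cache write-back of the PTEs so the hardware table walker sees them; a hedged sketch:

        /* Push freshly written PTEs out of a write-back D-cache. */
        #define PTE_SYNC(pte)                                            \
                cpu_dcache_wb_range((vaddr_t)(pte), sizeof(pt_entry_t))

        #define PTE_SYNC_RANGE(pte, cnt)                                 \
                cpu_dcache_wb_range((vaddr_t)(pte),                      \
                    (cnt) * sizeof(pt_entry_t))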
thorpej 15a5e8f238 cpu_fork(): If PMCs are not enabled in the parent, clear the machine-
dependent PMC state in the child.
2002-08-09 23:44:17 +00:00
briggs 0b956d0b8b Implement pmc(9) -- An interface to hardware performance monitoring
counters.  These counters do not exist on all CPUs, but where they
do exist, can be used for counting events such as dcache misses that
would otherwise be difficult or impossible to instrument by code
inspection or hardware simulation.

pmc(9) is meant to be a general interface.  Initially, the Intel XScale
counters are the only ones supported.
2002-08-07 05:14:47 +00:00
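
On XScale the counters live in coprocessor 14; a hedged sketch of the kind of primitive pmc(9) is built on, here reading the CCNT clock counter (GCC inline asm; register encoding per the XScale core documentation):

        static inline uint32_t
        xscale_ccnt_read(void)
        {
                uint32_t ccnt;

                /* CCNT is CP14 register c1 on XScale. */
                __asm volatile("mrc p14, 0, %0, c1, c0, 0" : "=r" (ccnt));
                return ccnt;
        }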
thorpej 62d83d05b1 * Pass proc0 to switch_exit(), to make this a little more like the
nathanw_sa branch.
* In switch_exit(), set the outgoing-proc register to NULL (rather than
  proc0) so that we actually use the "exiting process" optimization in
  cpu_switch().
2002-08-06 17:44:35 +00:00
thorpej 20b1bb2655 Clean up handling of the vector page on 32-bit ARM systems:
* Don't refer to VA 0, instead refer to a new variable: vector_page
* Delete the old zero_page_*() functions, replacing them with a new
  one: vector_page_setprot().
* When manipulating vector page mappings in user pmaps, only do so if
  the vector page is below KERNEL_BASE (if it's above KERNEL_BASE, the
  vector page is mapped by the kernel pmap).
* Add a new function, arm32_vector_init(), which takes the virtual
  address of the vector page (which MUST be valid when the function
  is called) and a bitmask of vectors the kernel is going to take
  over, and performs all vector page initialization, including setting
  the V bit in the CPU Control register ("relocate vectors to high
  address"), if necessary.
2002-04-03 23:33:26 +00:00
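
Representative use from a port's initarm(), assuming the high-vectors configuration (not a quote from any specific port):

        /* Vector page must already be mapped at 0xffff0000 at this point. */
        arm32_vector_init(ARM_VECTORS_HIGH, ARM_VEC_ALL);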
simonb 6f0fb25121 Don't need to declare phys_map - it is declared in <uvm/uvm_extern.h>. 2002-03-04 02:43:22 +00:00
simonb d9ab16ba2f Purge CLSIZE, CLSIZELOG2 and MCLOFSET.
Be consistent in the way that MSIZE, MCLSHIFT, MCLBYTES and NMBCLUSTERS
  are defined.
Remove old VM constants from cesfic port.
Bump MSIZE to 256 on mipsco (the only one that wasn't already 256).
2002-02-26 15:13:19 +00:00
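
The consistent pattern referred to is roughly this per-port param.h fragment (values illustrative):

        #define MSIZE           256             /* size of an mbuf */

        #ifndef MCLSHIFT
        #define MCLSHIFT        11              /* 2KB mbuf clusters */
        #endif
        #define MCLBYTES        (1 << MCLSHIFT)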
thorpej cb51977892 When initializing sf->sf_spl, simply always assume that 0 is
equivalent to spl0().
2002-01-29 23:02:48 +00:00
thorpej 4e990d9ccb Overhaul of the ARM cache code. This is mostly a simplification
pass.  Rather than providing a whole slew of cache operations that
aren't ever used, distill them down to some useful primitives:

	icache_sync_all         Synchronize I-cache
	icache_sync_range       Synchronize I-cache range

	dcache_wbinv_all        Write-back and Invalidate D-cache
	dcache_wbinv_range      Write-back and Invalidate D-cache range
	dcache_inv_range        Invalidate D-cache range
	dcache_wb_range         Write-back D-cache range

	idcache_wbinv_all       Write-back and Invalidate D-cache,
				Invalidate I-cache
	idcache_wbinv_range     Write-back and Invalidate D-cache,
				Invalidate I-cache range

Note: This does not yet include an overhaul of the actual asm files
that implement the primitives.  Instead, we've provided a safe default
for each CPU type, and the individual CPU types can now be optimized
one at a time.
2002-01-25 19:19:22 +00:00
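
A typical consumer of the new primitives, via the cpu_*() wrappers (hedged example: e.g. after writing instructions into memory):

        /* Make newly written code visible to instruction fetch. */
        cpu_dcache_wb_range(va, len);   /* push the code to memory */
        cpu_icache_sync_range(va, len); /* then resync the I-cache */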
thorpej a93f7ef419 Provide a hook for platform-specific interrupt code to specify
the "spl" cookie in the switch frame.
2001-11-29 17:12:22 +00:00
thorpej 87fe867c21 Move the ARM, Ltd. floating point emulator to arch/arm. 2001-11-24 01:26:23 +00:00
chris 165b023373 Give the idle loop a non-profiled entry; this means it appears in profiling info correctly (rather than all its time being attributed to remrunqueue).
switch_exit only needs to take one parameter; it loads the value of proc0 into R1 itself.
Fix up some comments to reflect the real state of things.
Tweak a couple of bits of asm to avoid a load delay.
Remove excess code for setting curpcb and curproc.
2001-11-19 20:38:58 +00:00
rearnsha 2c48187673 Don't unmap page 0 when preparing to swap out a process. If the pmap
is shared with another process (as can happen if vfork is being used),
then that other process will end up not having a page 0, which is bad
news indeed, since then there is no way back into the kernel.

Found this using a Multi-ICE box, so they are useful after all!

This seems to fix PR port-arm32/11921 and (possibly) kern/9859.
2001-10-18 09:26:08 +00:00
chris 8fd1ceb7bf Fix bug in vmapbuf: it was using len before it had been adjusted. Found by Frank while Luke was tracking down a bug. 2001-09-20 23:32:23 +00:00
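
The standard vmapbuf prologue shows why the order matters; the page offset has to be folded into len before len is used (sketch):

        faddr = trunc_page((vaddr_t)bp->b_data);
        off = (vaddr_t)bp->b_data - faddr;  /* offset within first page */
        len = round_page(off + len);        /* adjust len BEFORE using it */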
chris 0e7661f023 Update pmap_update() to take the updated pmap as an argument.
This will allow improvements to the pmaps so that they can more easily defer expensive operations, e.g. TLB/cache flushes, until the last possible moment.

Currently this is a no-op on most platforms, so they should see no difference.

Reviewed by Jason.
2001-09-10 21:19:08 +00:00
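
Callers now name the pmap they modified, so deferred TLB/cache work can be batched per pmap; the resulting usage pattern (illustrative):

        /* Batch several mapping changes, then flush once for this pmap. */
        pmap_enter(pmap, va1, pa1, prot, flags);
        pmap_enter(pmap, va2, pa2, prot, flags);
        pmap_update(pmap);      /* was previously pmap_update() with no argument */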
toshii 7c9e82d6e3 Don't define pcb_* register macros.
The pcb_sp macro conflicts with sys/netinet6/ipsec.c.
2001-09-09 10:33:42 +00:00
chris 140252be2b ARM has a VAC (virtually addressed cache), so we must use pmap_enter/remove for vmapbuf rather than the k* versions; otherwise we may not be doing the right caching thing. 2001-08-20 21:52:09 +00:00
chris 6cca1c3f58 Update to make use of a proper kenter implementation for vmapbuf and vunmapbuf. 2001-08-11 12:57:25 +00:00
matt 9509e734a3 Force size_t formats/args to be (u_long). I'd use 'z' for this, but gcc
2.95.3 doesn't support it.
2001-08-05 05:07:27 +00:00
chris 27f96e8440 Move the generic arm32 files into arm/arm32 from arm32/arm32, tested kernel builds on cats and riscpc. 2001-07-28 13:28:03 +00:00