instead.
With this change, we no longer need to save the current interrupt level
in the switchframe. This is no great loss since both cpu_switch and
cpu_switchto are always called at splsched, so the process' spl is
effectively saved somewhere in the callstack.
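For illustration, a hypothetical switchframe after this change would carry
only callee-saved state (the real layout lives in the port's frame.h; the
field names here are invented):

    /* Hypothetical sketch only, not the actual arm32 switchframe. */
    struct switchframe {
            unsigned int    sf_r4;  /* callee-saved registers */
            unsigned int    sf_r5;
            unsigned int    sf_r6;
            unsigned int    sf_r7;
            unsigned int    sf_pc;  /* address at which to resume */
    };                              /* note: no sf_spl member */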
This fixes an evbarm problem reported by Allen Briggs:
an lwp gets into sa_switch -> mi_switch with newl != NULL
when it's the last element on the runqueue, so it
hits the second branch of:
    if (newl == NULL) {
            retval = cpu_switch(l, NULL);
    } else {
            remrunqueue(newl);
            cpu_switchto(l, newl);
            retval = 0;
    }
mi_switch calls remrunqueue() and cpu_switchto()
cpu_switchto unlocks the sched lock
cpu_switchto drops CPU priority
softclock is received
schedcpu is called from softclock
schedcpu hits the first if () {} block here:
    if (l->l_priority >= PUSER) {
            if (l->l_stat == LSRUN &&
                (l->l_flag & L_INMEM) &&
                (l->l_priority / PPQ) != (l->l_usrpri / PPQ)) {
                    remrunqueue(l);
                    l->l_priority = l->l_usrpri;
                    setrunqueue(l);
            } else
                    l->l_priority = l->l_usrpri;
    }
Since mi_switch has already run remrunqueue(), the LWP has been
removed from the run queue but not yet put back on any queue, so
this remrunqueue() panics.
- Use the "clz" instruction to pick a run-queue, instead of using the
ffs-by-table-lookup method.
- Use strd instead of stmia where possible.
- Use multiple ldr instructions instead of ldmia where possible.
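A minimal C sketch of the clz-based pick (illustrative only; the real code
is assembly, and the function name here is invented). __builtin_clz()
compiles to a single clz on CPUs that have it, and the x & -x trick
isolates the lowest set bit, so the lookup table goes away:

    #include <stdint.h>

    /*
     * Pick the lowest set bit (the highest-priority run-queue) of the
     * run-queue bitmask.  clz on the isolated lowest bit replaces the
     * ffs-by-table-lookup method.  Caller guarantees whichqs != 0,
     * since __builtin_clz(0) is undefined.
     */
    static inline int
    runqueue_pick(uint32_t whichqs)
    {
            return 31 - __builtin_clz(whichqs & -whichqs);
    }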
the test, as spl0 is actually a macro for splx(0). The code now calls
splx(0).
(Note: building with the #ifdef fixed caused the build to fail on a
GENERIC acorn32 kernel.)
* Define a new "MMU type", ARM_MMU_SA1. While the SA-1's MMU is basically
compatible with the generic, the SA-1 cache does not have a write-through
mode, and it is useful to know have an indication of this.
* Add a new PMAP_NEEDS_PTE_SYNC indicator, and try to evaluate it at
compile time. We evaluate it like so:
- If SA-1-style MMU is the only type configured -> 1
- If SA-1-style MMU is not configured -> 0
- Otherwise, defer to a run-time variable.
If PMAP_NEEDS_PTE_SYNC might evaluate to true (SA-1 only or run-time
check), then we also define PMAP_INCLUDE_PTE_SYNC so that e.g. assembly
code can include the necessary run-time support. PMAP_INCLUDE_PTE_SYNC
largely replaces the ARM32_PMAP_NEEDS_PTE_SYNC manual setting Steve
included with the original new pmap.
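A sketch of how that evaluation might be expressed (only ARM_MMU_SA1,
PMAP_NEEDS_PTE_SYNC, and PMAP_INCLUDE_PTE_SYNC are from the text;
ARM_MMU_OTHERS stands in for the count of non-SA-1 MMU classes configured):

    #if ARM_MMU_SA1 == 1 && ARM_MMU_OTHERS == 0
    #define PMAP_NEEDS_PTE_SYNC     1       /* SA-1 style MMU is the only type */
    #define PMAP_INCLUDE_PTE_SYNC
    #elif ARM_MMU_SA1 == 0
    #define PMAP_NEEDS_PTE_SYNC     0       /* no SA-1 style MMU configured */
    #else
    #define PMAP_NEEDS_PTE_SYNC     pmap_needs_pte_sync     /* run-time check */
    #define PMAP_INCLUDE_PTE_SYNC
    #endif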
* In the new pmap, make pmap_pte_init_generic() check to see if the CPU
has a write-back cache. If so, init the PT cache mode to C=1,B=0 to get
write-through mode. Otherwise, init the PT cache mode to C=1,B=1. (The
overall pattern is sketched after this list.)
* Add a new pmap_pte_init_arm8(). Old pmap, same as generic. New pmap,
sets page table cacheability to 0 (ARM8 has a write-back cache, but
flushing it is quite expensive).
* In the new pmap, make pmap_pte_init_arm9() reset the PT cache mode to
C=1,B=0, because the write-back check in generic gets it wrong for ARM9:
we use write-through mode all the time on ARM9 right now. (What this
really tells me is that the test for write-through cache is less than
perfect, but we can fix that later.)
* Add a new pmap_pte_init_sa1(). Old pmap, same as generic. New pmap,
does generic initialization, then resets page table cache mode to
C=1,B=1, since C=1,B=0 does not produce write-through on the SA-1.
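The pattern across these pmap_pte_init_*() variants might be sketched as
follows (a hypothetical rendering; names and signatures are invented, and
C/B are the cacheable/bufferable bits of an L2 descriptor):

    #define PT_C    0x2             /* C bit: cacheable */
    #define PT_B    0x1             /* B bit: bufferable */

    static int pt_cache_mode;       /* cache mode used for page tables */

    static void
    pte_init_generic_sketch(int cache_is_writeback)
    {
            if (cache_is_writeback)
                    pt_cache_mode = PT_C;           /* C=1,B=0: write-through */
            else
                    pt_cache_mode = PT_C | PT_B;    /* C=1,B=1 */
    }

    static void
    pte_init_sa1_sketch(void)
    {
            pte_init_generic_sketch(1);
            /* C=1,B=0 is not write-through on SA-1, so override it. */
            pt_cache_mode = PT_C | PT_B;
    }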
Some features of the new pmap are:
- It allows L1 descriptor tables to be shared efficiently between
multiple processes. A typical "maxusers 32" kernel, where NPROC is set
to 532, requires 35 L1s. A "maxusers 2" kernel runs quite happily
with just 4 L1s. This completely solves the problem of running out
of contiguous physical memory for allocating new L1s at runtime on a
busy system.
- Much improved cache/TLB management "smarts". This change ripples
out to encompass the low-level context switch code, which is also
much smarter about when to flush the cache/TLB, and when not to.
- Faster allocation of L2 page tables and associated metadata thanks,
in part, to the pool_cache enhancements recently contributed to
NetBSD by Wasabi Systems.
- Faster VM space teardown due to accurate reference tracking of L2
page tables.
- Better/faster cache-alias tracking.
The new pmap is enabled by adding options ARM32_PMAP_NEW to the kernel
config file, and making the necessary changes to the port-specific
initarm() function. Several ports have already been converted and will
be committed shortly.
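For example, on a converted port the option is a one-line kernel config
addition (the matching initarm() changes are port-specific and not shown):

    options         ARM32_PMAP_NEW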
In particular, use r8 to hold the old process, and r7 for medium-term
scratch, saving r0-r3 for things we don't need saved over function
calls. This gets rid of five register-to-register MOVs.
and hence save fewer into the PCB. This should give me enough free
registers in cpu_switch to tidy things up and support MULTIPROCESSOR
properly. While we're here, make the stacked registers into an
APCS stack frame, so that DDB backtraces through cpu_switch() will
work.
This also affects cpu_fork(), which has to fabricate a switchframe and
PCB for the new process.
GCC produces almost exactly the same instructions as the hand-assembled
versions, albeit in a different order. It even found one place where it
could shave an instruction off. Its insistence on creating a stack frame
might slow things down marginally, but not, I think, enough to matter.
add rd, pc, #foo - . - 8 -> adr rd, foo
ldr rd, [pc, #foo - . - 8] -> ldr rd, foo
Also, when saving the return address for a function pointer call, use
"mov lr, pc" just before the call unless the return address is somewhere
other than just after the call site.
Finally, a few obvious little micro-optimisations like using LDR directly
rather than ADR followed by LDR, and loading directly into PC rather than
bouncing via R0.
and boot multi-user on a single-processor machine. Many of these changes
are wildly inappropriate for actual multi-processor operation, and correcting
this will be my next task.
* Save an instruction in the transition from idle to have-process-to-
switch-to, and eliminate two instructions that cause datadep-stalls
on StrongARM and XScale (one in each idle block).
* Rearrange some other instructions to avoid datadep-stalls on StrongARM
and XScale.
* Since cpu_do_powersave == 0 is by far the common case, avoid a
pipeline flush by reordering the two idle blocks.
the CPU's "sleep" function in the idle loop.
* Default all CPUs to not use powersave, except for the PDA processors
(SA11x0 and PXA2x0).
This significantly reduces interrupt latency in high-performance
applications (and was good for squeezing another ~10% out of an XScale
IOP on a Gig-E benchmark).
nathanw_sa branch.
* In switch_exit(), set the outgoing-proc register to NULL (rather than
proc0) so that we actually use the "exiting process" optimization in
cpu_switch().
Also add correct locking when freeing pages in pmap_destroy() (fix from potr).
This now means that arm32 kernels can be built with LOCKDEBUG enabled (only
tested on cats, though).
pass. Rather than providing a whole slew of cache operations that
aren't ever used, distill them down to some useful primitives:
    icache_sync_all         Synchronize I-cache
    icache_sync_range       Synchronize I-cache range
    dcache_wbinv_all        Write-back and Invalidate D-cache
    dcache_wbinv_range      Write-back and Invalidate D-cache range
    dcache_inv_range        Invalidate D-cache range
    dcache_wb_range         Write-back D-cache range
    idcache_wbinv_all       Write-back and Invalidate D-cache,
                            Invalidate I-cache
    idcache_wbinv_range     Write-back and Invalidate D-cache,
                            Invalidate I-cache range
Note: This does not yet include an overhaul of the actual asm files
that implement the primitives. Instead, we've provided a safe default
for each CPU type, and the individual CPU types can now be optimized
one at a time.
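As a rough illustration of the shape this takes, the primitives above can
be dispatched through a per-CPU-type table of function pointers. This C
sketch is hypothetical; the structure and member names are invented, not
the actual NetBSD interface:

    typedef unsigned long vaddr_t;  /* kernel virtual address */
    typedef unsigned long vsize_t;  /* size of a kernel VA range */

    /*
     * One instance of this table is filled in per CPU type, so each
     * primitive can be optimized for its cache architecture.
     */
    struct cache_ops {
            void    (*co_icache_sync_all)(void);
            void    (*co_icache_sync_range)(vaddr_t, vsize_t);
            void    (*co_dcache_wbinv_all)(void);
            void    (*co_dcache_wbinv_range)(vaddr_t, vsize_t);
            void    (*co_dcache_inv_range)(vaddr_t, vsize_t);
            void    (*co_dcache_wb_range)(vaddr_t, vsize_t);
            void    (*co_idcache_wbinv_all)(void);
            void    (*co_idcache_wbinv_range)(vaddr_t, vsize_t);
    };

    extern struct cache_ops cache_ops;      /* filled in at attach time */

    #define icache_sync_range(va, sz) \
            cache_ops.co_icache_sync_range((va), (sz))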
switch_exit() only needs to take one parameter; it loads the value of proc0
into R1 itself.
Fix up some comments to reflect the real state of things.
Tweak a couple of bits of asm to avoid a load delay.
Remove excess code for setting curpcb and curproc.