Commit Graph

307 Commits

nathanw 2cab03d64a In the fault handler, record growth of the stack, so that core dumps
actually contain the entire stack.
2002-09-21 00:29:04 +00:00
gehenna 77a6b82b27 Merge the gehenna-devsw branch into the trunk.
This merge changes the device switch tables from static arrays to
tables generated dynamically by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

	device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed in the port-dependent majors.<arch>
  using this grammar.

- A new naming convention is added.
  The name of the device switch must be <prefix>_[bc]devsw for auto-generation
  of device switch tables.

- Backward compatibility for loading block/character device switches via the
  LKM framework is broken.  This is necessary for converting between
  block/character device majors and device names at runtime.

- The restriction on assigning device majors via LKM is completely removed.
  We no longer need to reserve LKM entries for dynamically loaded device
  switches.

- At compile time, the list of device major numbers is packed into the
  kernel, and the LKM framework refers to it when assigning device majors
  dynamically.
2002-09-06 13:18:43 +00:00
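
As an illustration of the grammar above, a hypothetical majors.<arch> entry
(the prefix and major numbers here are made up, not taken from any real port)
might read:

	device-major	foo	char 42 block 17	foo

which would have config(8) expect constant foo_cdevsw and foo_bdevsw
structures in the driver, per the <prefix>_[bc]devsw naming convention in
this commit message.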
jdolecek 8839507f5b whitespace fix past __KERNEL_RCSID() 2002-09-05 18:34:00 +00:00
thorpej 212cb9f78d Add machine-dependent bits of RAS for arm32. 2002-08-31 03:07:32 +00:00
thorpej 139cdc3125 Make nbuf, nswbuf, and bufpages unsigned. Make all operations on these
variables unsigned, and update places where their values are printed.
2002-08-25 20:21:33 +00:00
thorpej ffdedb6d80 In pmap_map_in_l1() and pmap_unmap_in_l1(), make sure that the VA
that is passed in is already aligned to a 4M super-section.
2002-08-24 03:10:40 +00:00
thorpej d158b3a37a When we allocate a PTP, make sure the offset we specify is for
the 4M super-section that the PTP will map, not some random 1M
chunk of it.  This gives the PTP hint code a much better chance
of working properly, and allows us to tidy up the code that
flushes a PTP from the cache in pmap_destroy().
2002-08-24 02:50:53 +00:00
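
A minimal sketch of the offset fix described above, assuming a 4M
super-section (the macro and variable names are illustrative, not the
actual pmap code):

	#define L1_SUP_SIZE	0x00400000UL	/* 4M super-section */

	/* Use the base of the whole 4M super-section, not a 1M chunk of it. */
	offset = va & ~(L1_SUP_SIZE - 1);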
thorpej 77a6866508 Enable caching on kernel and user page tables. This saves having
to do uncached memory access during VM operations (which can be
quite expensive on some CPUs).

We currently write back PTEs as soon as they're modified; there is
some room for optimization (to write them back in larger chunks).
For PTEs in the APTE space (i.e. PTEs for pmaps that describe another
process's address space), PTEs must also be evicted from the cache
completely (PTEs in PTE space will be evicted during a context switch).
2002-08-24 02:16:30 +00:00
thorpej 6cc7c1c1ff * Add PTE_SYNC() and PTE_SYNC_RANGE() macros. These don't actually do
anything yet.
* Use PTE_SYNC() and PTE_SYNC_RANGE() in some obvious places, i.e.
  where vtopte() is used.
2002-08-22 01:13:53 +00:00
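
A plausible eventual shape for these macros, hedged since they do nothing
at this point (cpu_dcache_wb_range() is the usual ARM cpufunc for writing
back a range of the data cache):

	/* Write back the cache line(s) covering one PTE, or a run of PTEs. */
	#define	PTE_SYNC(pte)						\
		cpu_dcache_wb_range((vaddr_t)(pte), sizeof(pt_entry_t))
	#define	PTE_SYNC_RANGE(pte, cnt)				\
		cpu_dcache_wb_range((vaddr_t)(pte), (cnt) * sizeof(pt_entry_t))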
thorpej 574a9cc019 Use a pool cache for PT-PTs. 2002-08-21 21:22:52 +00:00
thorpej 5fddbbe3d5 Do cached memory access to L1 tables, making sure to write-back the
cache after any L1 table modifications.
2002-08-21 18:34:31 +00:00
thorpej 003b8e8bca More local label fixups. 2002-08-17 16:36:31 +00:00
briggs 20267a208f Do not trim 'offset' from 'len' in _bus_dmamap_sync_linear(). 2002-08-17 05:14:10 +00:00
briggs d86c947b8c Inline bus_dma_inrange() and bus_dmamap_sync_*(). 2002-08-17 01:15:15 +00:00
thorpej 50fe583069 Must ... micro ... optimize!
* Save an instruction in the transition from idle to have-process-to-
  switch-to, and eliminate two instructions that cause datadep-stalls
on StrongARM and XScale (one in each idle block).
* Rearrange some other instructions to avoid datadep-stalls on StrongARM
  and XScale.
* Since cpu_do_powersave == 0 is by far the common case, avoid a
  pipeline flush by reordering the two idle blocks.
2002-08-17 01:08:21 +00:00
thorpej ebff575bc3 * Add a new machdep.powersave sysctl, which controls the use of
the CPU's "sleep" function in the idle loop.
* Default all CPUs to not use powersave, except for the PDA processors
  (SA11x0 and PXA2x0).

This significantly reduces interrupt latency in high-performance
applications (and was good to squeeze another ~10% out of an XScale
IOP on a Gig-E benchmark).
2002-08-16 15:25:53 +00:00
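
The idle-loop decision this adds is small; a hedged sketch, with
cpu_sleep() standing in for the CPU's sleep cpufunc:

	/* Idle loop: only enter the CPU's sleep state when permitted. */
	if (cpu_do_powersave)
		cpu_sleep(0);		/* wake on the next interrupt */

The machdep.powersave sysctl simply toggles cpu_do_powersave at runtime.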
briggs fa81e3d75e * Use local label names (.Lfoo vs. (Lfoo or foo))
* When moving from cpsr, use "cpsr" instead of "cpsr_all" (which is
   provided, but doesn't make sense since mrs doesn't support fields
   like msr does).
2002-08-15 01:37:01 +00:00
thorpej ad73349331 We only need to modify the CPSR's control field, so use cpsr_c rather
than cpsr_all.
2002-08-14 23:23:06 +00:00
chris f4c605201d Tweak asm to avoid a couple of stalls. 2002-08-14 23:07:36 +00:00
thorpej b45159bad0 When doing PREREAD sync operations, if the start and end addresses
of the range are aligned to a cacheline boundary, then do a dcache-inv
operation, rather than a dcache-wbinv operation.

XXX It could be a little smarter (align using wbinv, inv, then finish
up using wbinv), but even this simple change is good for a nearly 40%
improvement in my test case on XScale.
2002-08-14 22:56:55 +00:00
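
A sketch of the new PREREAD policy (arm_dcache_align_mask is assumed to be
the cache-line-size-minus-one mask; the cpufunc names follow the NetBSD ARM
convention):

	/* Invalidate only if both ends sit on a cache-line boundary. */
	if (((va | len) & arm_dcache_align_mask) == 0)
		cpu_dcache_inv_range(va, len);
	else
		cpu_dcache_wbinv_range(va, len);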
briggs a957deca48 G/c cowfault. 2002-08-14 21:52:36 +00:00
thorpej 203dd6b325 * Add an ARM32_DMAMAP_COHERENT flag to indicate that a loaded DMA
map contains "coherent" (non-cached in ARM-land) mappings.
* Set ARM32_DMAMAP_COHERENT in the map at the start of a load operation,
  and clear it in _bus_dmamap_load_buffer() if we encounter any cacheable
  mappings.
* In _bus_dmamap_sync(), if the map is marked COHERENT, skip any cache
  flushing.
2002-08-14 20:50:37 +00:00
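
In outline (the _dm_flags field name is an assumption; the flag name is
from the commit message):

	/* _bus_dmamap_sync(): coherent mappings need no cache work. */
	if (map->_dm_flags & ARM32_DMAMAP_COHERENT)
		return;

The flag is set optimistically when the load starts and cleared the moment
_bus_dmamap_load_buffer() sees a cacheable mapping.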
thorpej d00a4a068d Whe making a mapping "coherent", clear *ALL* the cache bits, not
just L2_B and L2_C.
2002-08-14 19:21:50 +00:00
thorpej 98d6ec0b89 Add the brutal hack that allows us to limp along using the read/write
cache line allocation policy on XScale CPUs: in pmap_enter(), if the
pmap is the kernel pmap, clear the X-bit in the PTE, thus disabling
read/write-allocate for managed kernel mappings.

Yes, this is ugly.  But it makes userland code run with r/w-allocate,
which is a huge improvement on systems with low core memory performance.
2002-08-13 03:36:30 +00:00
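
A hedged sketch of the hack (L2_XSCALE_X is a made-up name for the XScale
extension bit that selects read/write-allocate; npte is the PTE being built
in pmap_enter()):

	/* XXX Brutal hack: no r/w-allocate for managed kernel mappings. */
	if (pmap == pmap_kernel())
		npte &= ~L2_XSCALE_X;	/* clear the X (extension) bit */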
thorpej d7be866fc8 Rearrange the beginning of cpu_switch() slightly to reduce data-dep
stalls on StrongARM and XScale.
2002-08-12 21:00:12 +00:00
bjh21 664bea62e3 __KERNEL_RCSID 2002-08-12 20:19:04 +00:00
bjh21 ca86069053 When pcb_onfault is set, pass the error code we get from uvm_fault()
(or EFAULT if we never called uvm_fault) to the onfault handler in R0,
in case it wants to use it.
2002-08-12 20:17:37 +00:00
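
In outline, at the end of the data-abort path (a sketch; tf_r0/tf_pc are
the usual ARM trapframe fields, and the surrounding code is elided):

	error = uvm_fault(map, va, 0, ftype);
	if (error != 0 && pcb->pcb_onfault != NULL) {
		tf->tf_r0 = error;	/* hand the errno to the handler */
		tf->tf_pc = (register_t)pcb->pcb_onfault;
		return;
	}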
thorpej 3d6f9f69ab Make a slight tweak to register usage to save an instruction. 2002-08-12 19:33:01 +00:00
bjh21 657216ff0f Remove a file which was accidentally resurrected. 2002-08-11 23:20:11 +00:00
bjh21 206c97ccc2 Move the arm32 copystr.S from arch/arm/arm32 to arch/arm/arm and add support
for 26-bit modes (basically saving R14 when we might get a page fault).
Use it on all ARM architectures now.
2002-08-11 23:17:24 +00:00
bjh21 b6228a7d06 New, improved version of copyin(), copyout(), and kcopy() by Allen Briggs.
This version works on both 26-bit and 32-bit machines.  For large copies,
it's up to three times as fast as the old arm32 version and five times as
fast as the old arm26 version.  For small copies it seems to be even faster
(getrusage() is apparently over ten times faster on an ARM610).

Hooray for Allen!
2002-08-11 21:19:12 +00:00
thorpej 76730bd0cc Tidy up pmap_clean_page() a little, and reenable some code that was
disabled previously: Skip cleaning mappings which are read-only, because
the pmap (now) does clean pages on a r/w -> r/o transition.
2002-08-10 00:48:35 +00:00
thorpej 006a578742 Clean up some warts in pmap_protect(). 2002-08-10 00:11:51 +00:00
thorpej 15a5e8f238 cpu_fork(): If PMCs are not enabled in the parent, clear the machine-
dependent PMC state in the child.
2002-08-09 23:44:17 +00:00
thorpej 6ce0a206cc Add an XSCALE_CACHE_READ_WRITE_ALLOCATE option for people who
want to play fast-and-loose.
2002-08-09 21:49:09 +00:00
thorpej 884bc64586 Add some code, conditional on PMAP_ALIAS_DEBUG, that can be used to
hunt for virtual aliases between managed (pmap_enter) and non-managed
(pmap_kenter_pa) mappings.
2002-08-09 18:22:59 +00:00
thorpej c979315325 Reduce stalls on StrongARM and XScale by waiting one insn before using
the result of a load.
2002-08-09 06:18:24 +00:00
thorpej afe3274eed Use ldrbt/strbt. Some other random cleanup. 2002-08-09 06:03:02 +00:00
thorpej 410785d6f0 Use ldrt/strt. 2002-08-09 04:13:20 +00:00
thorpej fdcc8560e4 Speed up bcopy_page() on the XScale slightly by using the "pld"
insn (prefetch) to look-ahead to the next chunk while we copy the
current chunk.

This could probably use a bit more tuning.
2002-08-07 16:21:29 +00:00
briggs 0b956d0b8b Implement pmc(9) -- An interface to hardware performance monitoring
counters.  These counters do not exist on all CPUs, but where they
do exist, can be used for counting events such as dcache misses that
would otherwise be difficult or impossible to instrument by code
inspection or hardware simulation.

pmc(9) is meant to be a general interface.  Initially, the Intel XScale
counters are the only ones supported.
2002-08-07 05:14:47 +00:00
thorpej 26bc8b27f4 - pmap_remove(): unmap the PTEs *after* we have finished with the
page tables.
- pmap_enter(): if making a mapping for the same PA rw->ro, write back
  the cache before doing so.
- pmap_clearbit(): if revoking REF on a page, make sure to wbinv the
  cache if the page has write permission, else inv the cache if the page's
  PTE is valid (XXX we actually wbinv in this case, as well, due to lack
  of idcache_inv_range()).  Only flush the TLB if the PTE changed.
2002-08-06 21:43:51 +00:00
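
The pmap_clearbit() REF case reduces to roughly this (a sketch;
pte_writable()/pte_valid() are illustrative predicates, not real functions,
and NBPG is the page size):

	if (pte_writable(pte))
		cpu_idcache_wbinv_range(va, NBPG);
	else if (pte_valid(pte))
		/* XXX want idcache_inv_range(); wbinv until it exists */
		cpu_idcache_wbinv_range(va, NBPG);
	if (npte != opte)
		cpu_tlb_flushID_SE(va);	/* only if the PTE changed */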
thorpej 0886c8cc0f Rearrange the exit path so that we don't do an idcache_wbinv_all *twice*
when a process exits.
2002-08-06 19:20:29 +00:00
thorpej 62d83d05b1 * Pass proc0 to switch_exit(), to make this a little more like the
nathanw_sa branch.
* In switch_exit(), set the outgoing-proc register to NULL (rather than
  proc0) so that we actually use the "exiting process" optimization in
  cpu_switch().
2002-08-06 17:44:35 +00:00
thorpej f7328ddbe7 Add dmoverio. 2002-08-02 00:50:25 +00:00
thorpej dce4476374 Overhaul how DMA ranges work in the ARM bus_dma implementation.
A new "arm32_dma_range" structure now describes a DMA window, with
a system address base, bus address base, and length.  In addition to
providing info about which memory regions are legal for DMA, the new
structure provides address translation support, as well.

As before, if a tag does not list any ranges, then all addresses are
considered valid, and no DMA address translation is performed.

This allows us to remove a large chunk of code which was duplicated and
tweaked slightly (to do the address translation) from the stock ARM
bus_dma in the XScale IOP and ARM Integrator ports.

Test compiled on all ARM platforms, test booted on Intel IQ80321 and Shark.
2002-07-31 17:34:23 +00:00
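
The window description is essentially three fields (sketched from the text
above; the real structure may carry more members):

	struct arm32_dma_range {
		bus_addr_t	dr_sysbase;	/* system (CPU) base address */
		bus_addr_t	dr_busbase;	/* bus base address */
		bus_size_t	dr_len;		/* length of the window */
	};

so translating a system address that falls inside a window is just
pa - dr_sysbase + dr_busbase.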
thorpej 79af00bddb Move the calls to uvm_page_physload() out of pmap_bootstrap() and
into platform-specific initialization code, giving platform-specific
code control over which free list a given chunk of memory gets put
onto.

Changes are essentially mechanical.  Test compiled for all ARM
platforms, test booted on Intel IQ80321 and Shark.

Discussed some time ago on port-arm.
2002-07-31 00:20:51 +00:00
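
Each platform's initialization code now makes calls of roughly this shape
(a sketch; the free-list argument is whatever the platform chooses):

	uvm_page_physload(atop(start), atop(end),
	    atop(avail_start), atop(avail_end), VM_FREELIST_DEFAULT);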
thorpej d3aa5664b7 Move the uvm_setpagesize() call to platform-dependent code in preparation
for other changes to pmap_bootstrap().
2002-07-30 16:16:38 +00:00
thorpej 3dcad9ac9e Don't use pmap_kenter_pa() in pmap_map(); doing so causes an assertion
failure in pmap_kenter_pa().
2002-07-30 16:07:23 +00:00
thorpej 3ab4598cc0 Add sysmon at cdev 101. 2002-07-29 18:26:58 +00:00