- alpha_rpcc(), alpha_mb(), alpha_wmb() -- these are instructions, and
we win by inlining them: rpcc is generally used for profiling, and
the memory barriers really should execute as quickly as possible with
minimal side-effects (like additional loads/stores required to call the
functions!)
- alpha_pal_imb(), alpha_pal_rdps(), alpha_pal_swpipl(), alpha_pal_tbi(),
alpha_pal_whami() -- these are PALcode ops. We must specify some register
clobbers for these.
We have a very decent size savings as a result. My test system:
text data bss dec hex filename
2671724 235848 377016 3284588 321e6c /netbsd.bak
2617708 235736 377016 3230460 314afc /netbsd
Most of this comes from fewer register saves/restores around spl*() calls
(now that alpha_pal_rdps() and alpha_pal_swpipl() are inlined).
Note that alpha_pal_rdps() and alpha_pal_swpipl() remain in pal.s to
maintain binary compatibility with LKMs that may use spl*() functions.
add an IPI which causes the target CPU to perform AST processing when
it returns to userspace.
- Add a way to get/set a private pointer in the shared interrupt header.
longer need to lock the kernel pmap in pmap_kenter_pa() and pmap_kremove().
- Since locking the kernel pmap in interrupt context is no longer required,
don't go to splimp() when locking the kernel pmap.
- Implement a new pmap_remove() function, not yet enabled by default. It
is structured like pmap_protect, and should be *much* faster. This was
actually written quite some time ago, but never committed because it
didn't work properly. Given the recent bugfix to pmap_protect(), "duh,
of course it didn't work properly before...". It seems to work fine now
(have done several builds and run the UVM regression tests using the new
code), but it is currently run-time optional so that some performance
measurements can be easily run against the old and new code.
big-endian. i386, pc532 and vax still include <machine/byte_swap.h>
and define macros for the {n,h}to{h,n}*() functions. mips also
defines some endian-independent assembly-code aliases for unaligned
memory accesses.
that is priority is rasied. Add a new spllowersoftclock() to provide the
atomic drop-to-softclock semantics that the old splsoftclock() provided,
and update calls accordingly.
This fixes a problem with using the "rnd" pseudo-device from within
interrupt context to extract random data (e.g. from within the softnet
interrupt) where doing so would incorrectly unblock interrupts (causing
all sorts of lossage).
XXX 4 platforms do not have priority-raising capability: newsmips, sparc,
XXX sparc64, and VAX. This platforms still have this bug until their
XXX spl*() functions are fixed.
context, so we must block interrupts which may cause memory allocation
before asserting the kernel pmap's lock. Put this all in PMAP_LOCK()
and PMAP_UNLOCK() macros to make it easier.
* Implement fpgetsticky() for alpha.
* Direct fpsetsticky() and fp{get,set}mask() into alpha kernel via sysarch(2).
* Define new sysarch(2) stub for above and install and distribute sysarch.h
for alpha. (The fpcr IS user mode r/w, but for reasons beyond the scope
of a commit message kernel calls are needed.) And much kernel Magick is
required before these do anything, but this way programs compiled under
1.4 will DTRT on future snapshots and releases.
allocated from a pool, and the MIPS and Alpha use KSEG to map pool
pages. So, mb_map wasn't actually being used. Saves around 4MB of
kernel virtual address space in a typical configuration.
Garbage-collect the related VM_MBUF_SIZE constant.
"BUS_SPACE_ALIGNED_POINTER()".
Equal to the param.h "ALIGNED_POINTER()" normally, but obeys additional
requirements of the bus_space_xxx_n() macros. (BUS_SPACE_DEBUG)
additional processors are spun up on multiprocessor Alpha systems.
Now, each processor gets its own idle thread (the primary processor
uses proc0). This idle thread is used in switch_exit(), rather than
explicitly referencing proc0.
Also, make `curproc', `fpcurproc', and `curpcb' per-cpu values. This
required some data structure rearrangement; cpu info is now statically
allocated in the BSS, rather than via malloc(), and cpu_softc is gone.
(Modeled somewhat after NetBSD/sparc's multiprocessor info structures.)
minor of libc and the major of libutil). For little-endian architectures
merge the bnswap() assembly versions with nto* and hton* using symbols
aliasing. Use symbol renaming for the bswap function in this case to avoid
namespace pollution.
Declare bswap* in machine/bswap.h, not machine/endian.h. For little-endian
machines, common code for inline macros go in machine/byte_swap.h
Sync libkern with libc.
Adjust #include in kernel sources for machine/bswap.h.
exported to the MI kernel. Almost everything here was formerly in cpu.h.
Optionally, this module could in the future be used to #include anything
that is always needed by arch/alpha modules.
They should not be visible to the MI kernel and the MI kernel shouldn't
depend on this junk. Most of it moves to new module <machine/alpha.h>.
Leave badaddr() here, though, because it's used so widely.
I just cannot add one more platform without getting sick.
Instead, we do just one table for all platforms. More-or-less,
it was only the A12 that even named it's individual interrupts anyway.
Also, prototype set_iointr() here. It's a slightly odd place, but 10*
better than the old place it was, and this file is included by exactly
the perfect set of .c files for set_iointr() visibility.
- cpu_set_kpc() now takes void *arg third argument, passed to the
entry point.
- cpu_fork() allows parent to be non-curproc iff parent is proc0.
When forking non-curproc, assume its state has already been saved.
- Adjust various pieces of machine-dependent code to account of all of this.
Alpha system, conditional on MULTIPROCESSOR.
NOTE: This does not yet work completely. The secondary CPU begins the
boot process, but never makes it into the cpu spinup trampoline. This
is merely a snapshot of a work-in-progress.
bus_space(9), if drivers want it (they shouldn't; easy to convert) they
can define it right before including bus.h. There's been a release since
the interfaces were (slightly) changed, and no code in the source tree
uses the old interfaces as far as I can tell.
tty structures, and on some machines (namely the DraCo internal lpt, and some
multi-i/o boards for Amigas and DraCos), tying spltty to the pretty high printer
interupt level would hurt serial performance.
On all affected ports but Amiga, spllpt() has been defined in machine/intr.h
to be spltty(), thus preserving old behaviour. Portmasters are encouraged to
change is, if they feel something else is better (e.g., one of its own were
possible).
New macros:
LOCATE_PCS(struct rpb *hwrpb, int cpu_number)
PCS_PROC_MAJORTYPE(struct pcs *)
PCS_PROC_MINORTYPE(struct pcs *)
Define LOCATE_PCS() to map (hwrpb, cpu_number) -> Per-Cpu-Slot structure.
Replace the PCS_PROC_{MAJOR,MINOR}{,SHIFT} stuff with macros that simply
return the major and minor cpu type codes.
address on 2 architectures anyhow. Also, move the definition of the `label_t'
type inside _KERNEL protection, since it is specific to the in-kernel
setjmp()/longjmp() implementations.
as with user-land programs, include files are installed by each directory
in the tree that has includes to install. (This allows more flexibility
as to what gets installed, makes 'partial installs' easier, and gives us
more options as to which machines' includes get installed at any given
time.) The old SYS_INCLUDES={symlinks,copies} behaviours are _both_
still supported, though at least one bug in the 'symlinks' case is
fixed by this change. Include files can't be build before installation,
so directories that have includes as targets (e.g. dev/pci) have to move
those targets into a different Makefile.
is defined, the bus_space macros will check to ensure that the bus address
and the target buffer (if applicable) are aligned properly for the size
of the type being used. If they are not, a message will be displayed on
the console.
Strict alignment is required by the Alpha architecture, and a trap will
occur of unaligned access is performed. These changes will aid debugging
of broken device drivers.
UVM pmap's locking protocol, written by Chuck Cranor. Not all of the
support for multiple processors is here yet, but the kernel does run
under moderate loads with LOCKDEBUG (all locking operations are no-ops
unless LOCKDEBUG is turned on).
This is by no means complete... there are still some possible snares
to take a look at.
member to the DMA tag, and calling the direct-mapped back-ends directly,
rather than through chipset-specific front-ends which pass the window
base as an additional argument.
caller to pass an optional 3rd argument, which is the previous level
PTE corresponding the virtual address. If this argument is non-NULL,
the table walk otherwise necessary will be bypassed.