that is priority is rasied. Add a new spllowersoftclock() to provide the
atomic drop-to-softclock semantics that the old splsoftclock() provided,
and update calls accordingly.
This fixes a problem with using the "rnd" pseudo-device from within
interrupt context to extract random data (e.g. from within the softnet
interrupt) where doing so would incorrectly unblock interrupts (causing
all sorts of lossage).
XXX 4 platforms do not have priority-raising capability: newsmips, sparc,
XXX sparc64, and VAX. This platforms still have this bug until their
XXX spl*() functions are fixed.
* New features:
+ traceback for threads (i.e., pids): db>trace/t 0t<pid>
+ traceback over console restart (halt and SRM continue)
+ print ipl in trapframes when it's known and it changed
+ print emulation and system call entry name (!) if proc is known
--- syscall (240, netbsd.sys_nanosleep) ---
- fix emitrules() like emitfiles() to deal with the prefix (otherwise it
would attempt to find the file in the normal base for the NORMAL_C rule).
- add emitincludes() which adds include directives for each prefix to the
$INCLUDES variable in the makefile.
- add %INCLUDES to each Makefile.arch to deal with the above.
this makes "prefix" actually work in a usable manner, and now i can move
on to fixing compiler warnings (errors) in the ESP code. :)
- remove "need-flag" for mac68k esp driver, as it is not used in anywhere
and conflicts with IPsec ESP header.
This should be the only MD change in IPv6 support, except kernel config file.
Very sorry if you have any compilation problem with it (I believe it is okay).
If your favorite arch is not included in here, please add a
call to ip6intr() from softintr handle.
has PAGEABLE and INTRSAFE flags. PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that. INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now). This will eventually
change now these maps are locked, as well.
managed pages, into KVA space. Since the pages are managed, we should
use pmap_enter(), not pmap_kenter_pa().
Also, when entering the mappings, enter with an access_type of
VM_PROT_READ | VM_PROT_WRITE. We do this for a couple of reasons:
(1) On systems that have H/W mod/ref attributes, the hardware
may not be able to track mod/ref done by a bus master.
(2) On systems that have to do mod/ref emulation, this prevents
a mod/ref page fault from potentially happening while in an
interrupt context, which can be problematic.
This latter change is fairly important if we ever want to be able to
transfer DMA-safe memory pages to anonymous memory objects; we will need
to know that the pages are modified, or else data could be lost!
Note that while the pages are unowned (i.e. "just DMA-safe memory pages"),
they won't consume any swap resources, as the mappings are wired, and
the pages aren't on the active or inactive queues.
context, so we must block interrupts which may cause memory allocation
before asserting the kernel pmap's lock. Put this all in PMAP_LOCK()
and PMAP_UNLOCK() macros to make it easier.
directly, call the function pointer (*if_input)(ifp, m). The input routine
expects the packet header to be at the head of the packet, and will adjust
as necessary. Privatize the layer 2 input and output routines, allowing
*_ifattach() to set them up as appropriate.
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is secified as
a low address and a size; machine-dependent code accounts for stack
direction.
This is required for clone(2).
unaligned access handler and clean it up some. Add support for emulating
the BWX instructions (ld{b,w}u, st{b,w}, sext{b,w}), which user software
can expect to be emulated. (Thanks, Alpha Architecture!)
register names was confusing, and could not _be_ correct in some cases.
Also, add a couple of 'generic' instruction formats which should be used
when decoding instructions before the specific format is known.
* Implement fpgetsticky() for alpha.
* Direct fpsetsticky() and fp{get,set}mask() into alpha kernel via sysarch(2).
* Define new sysarch(2) stub for above and install and distribute sysarch.h
for alpha. (The fpcr IS user mode r/w, but for reasons beyond the scope
of a commit message kernel calls are needed.) And much kernel Magick is
required before these do anything, but this way programs compiled under
1.4 will DTRT on future snapshots and releases.
in DDB (e.g. if a bad pointer was dereferenced; the debugger will recover).
- Change a comment to indicate that we are on the debugger stack when we get
to ddb_trap().
- Fix possible buglet in computation of the branch target in db_branch_taken().
happen. If the debugger doesn't handle the trap, arrange things so the
debugger won't be called again before we panic.
- Before panic'ing, give the debugger a chance to field the trap, and
if the debugger has handled things, allow the kernel to continue running,
like the i386 port does.
debugger differently.
- Pull in debugger glue if DDB is configured.
And one unrelated change, while I was here: Don't create a fake trapframe
for main(); it hasn't been used by main() for quite some time, and panic
if main() returns, because that's not supposed to happen now.
- Actually display the kn300 irq, not the MCPCIA irq, in the interrupt
string. Also, don't bother displaying device/pin on strays, since
it doesn't play will with shared interrupts that would happen due to
a PCI-PCI bridge.
- Shave a few more cycles out of the interrupt dispatch routine.
supposed to be Window 1, but a cut'n'paste error made it stomp over
Window 0, thus breaking ISA DMA. Fix this. (Confirmed to work with
floppy driver.)
While I'm here, do something I've been meaning to do for a while: change
Window 1 from a 1G at 2G to a 2G at 2G direct-mapped window, and add
a Window 2 of 1G at 1G SGMAP-mapped. Chain Window 2 to Window 1, and
use it as a fall-back for PCI DMA if the system has more than 2G of RAM.
The access is more efficient this way (and this was done in the interrupt
dispatch code, so some cycles are actually shaved), and gcc gets annoyed
when chars are used as array subscripts.
- Adjust for the fixed Rawhide console initialization.
- When mapping a PCI interrupt, don't always map device 1 to IRQ 16. Device
1 is only the internal 53c810 on MID 5, and is an invalid device number
on any other MID.
- Adjust for change mcpcia_config/mcpcia_softc structures.
- Nuke the kludgy linked list of mcpcia_softc structures. Instead, just
use savunit[v] to index into mcpcia_cd.cd_devs[] to find the MCPCIA
which has the stray interrupt.
- Some other minor cosmetic cleanup.
which holds state of the MCPCIA to which the console is attached.
- All MCPCIA info is now stored in the mcpcia_config structure; the
mcpcia_softc only contains a struct device and a pointer to one of these.
- If attaching the console MCPCIA, use the static configuration, else allocate
the substructure.
- Rename mcpcia_init() to mcpcia_init0(), and make it take a "mallocsafe"
argument.
- Implement a new mcpcia_init(), which looks for the MCPCIA which has the
EISA bridge attached. Initialize this MCPCIA as the console MCPCIA (the
console on the Rawhide is only allowed on this MCPCIA; firmware rule).
- Eliminate the kludgy linked listed of mcpcia_softcs. Just use mcpcia_cd
to find all configured instances.
Separate bug fix: Actually clear the MCPCIA error mask after probing for
PCI (and ISA) devices, don't just clear it twice in mcpcia_init0().
Some other slight cleanup.
MID order.
- Export the shuffled MID order; other files now need it.
- Don't derive the GID from the unit number of the mcbus. A user could
render his kernel non-bootable by using a different unit number in the
kernel config file. We (and the hardware) only support one MCBUS, so
simply use instance 0. Note that this will need to be adjusted if there
are even any multiple-MCBUS systems.
Instead of using the PROM console until autoconfiguration is complete (at
which time we called dec_kn300_cons_init() directly!), make this work like
basically all of the other systems which have PCI attached consoles. That
is, initialize the PCI chipset which holds the console early, and perform
console initialization at the correct time.
This should make both PCI and ISA display consoles with PC keyboards work
(i.e. the deskside workstation version of the Rawhide).
define a flag UVM_PGA_USERESERVE to allow non-kernel object
allocations to use pages from the reserve.
use the new flag for allocations in pmap modules.
are all the same, so eliminate the redundancy. also, use mrg's
"Version:" trick to find the version rather than using the RCS ID.
(I must have been having a ... bad day.) Also, bump boot and netboot
versions for all the changes that have been happening lately.
provides the correct functions for primary, secondary, and unified
boot blocks. actually behave correctly (e.g. expect correct arguments,
perform correct operations) depending on which you are. also
some minor cleanup.
guts were actually functionally equivalent to the current guts, but were
much larger, filled with bugs, and indeed poked around at the disklabel
when some of those bugs prevented them from ever using the disklabel!
had a few bugs fixed that let the problem slip in, and since bootxx's
Makefile now goes out of its way to satisfy installboot's undocumented
and totally unreasonable assumptions about the bootxx file it's operating
on. No point in fixing the assumptions, because sooner rather than later
this incarnation of installboot is going to die.
assuming that there's always going to be space for the whole boot
block info struct. (the assumption would cause a malloc'd region
to be overrun, if it proved false.)
call to __main(), and therefore saves the size of the call and the
size of a stub implementation of __main().
in the primary boot block, don't bother saving/restoring the argument
passed in from the caller. There is no such argument (that we care
about, at least) to the primary. (for secondary, it's the firmware
FD being used.)
Clean up the "Region 1" related definitions, and define load addresses,
max load size, and max total size for as many boot block types as we can.
(types = unified, primary, secondary). We can't always define all
values for all boot blocks, though.
Make CPP flags selection less gross.
Use objcopy rather than headersize (yay, evil gets a stake to the heart!).
Use a little shell script to verify that the sizes of the boot blocks are OK.
Do not compile too much more of libsa than we actually have to.
* Map the message buffer with access_type = VM_PROT_READ|VM_PROT_WRITE `just
because'.
* Map the file system buffers with access_type = VM_PROT_READ|VM_PROT_WRITE to
avoid possible problems with pagemove().
* Do not use VM_PROT_EXEC with either of the above.
* Map pages for /dev/mem with access_type = prot. Also, DO NOT use
pmap_kenter() for this, as we DO NOT want to lose modification information.
* Map pages in dumpsys() with VM_PROT_READ.
* Map pages in m68k mappedcopyin()/mappedcopyout() and writeback() with
access_type = prot.
* For now, bus_dma*(), pmap_map(), vmapbuf(), and similar functions still use
access_type = 0. This should probably be revisited.
allocated from a pool, and the MIPS and Alpha use KSEG to map pool
pages. So, mb_map wasn't actually being used. Saves around 4MB of
kernel virtual address space in a typical configuration.
Garbage-collect the related VM_MBUF_SIZE constant.
"BUS_SPACE_ALIGNED_POINTER()".
Equal to the param.h "ALIGNED_POINTER()" normally, but obeys additional
requirements of the bus_space_xxx_n() macros. (BUS_SPACE_DEBUG)
by two processors concurrently. This means that we cannot modify the
lev1map in use by the processor which wishes to use the PROM.
Fix this by creating a separate lev1map for PROM users. This lev1map
is a copy of the kernel_lev1map, with the exception of the necessary
PROM mapping. When a processor wishes to use the PROM, it switches
its PTBR to point at the prom_lev1map, performs the PROM operation,
and switches back to its previous lev1map.
Note that kernels without multiprocessor support use the old method
of modifying the current lev1map.
Also, serialize access to the PROM via a spin lock.
additional processors are spun up on multiprocessor Alpha systems.
Now, each processor gets its own idle thread (the primary processor
uses proc0). This idle thread is used in switch_exit(), rather than
explicitly referencing proc0.
Also, make `curproc', `fpcurproc', and `curpcb' per-cpu values. This
required some data structure rearrangement; cpu info is now statically
allocated in the BSS, rather than via malloc(), and cpu_softc is gone.
(Modeled somewhat after NetBSD/sparc's multiprocessor info structures.)
* fix the ancient nice(1) bug, where nice +20 processes incorrectly
steal 10 - 20% of the CPU, (or even more depending on load average)
* provide a new schedclk() mechanism at a new clock at schedhz, so high
platform hz values don't cause nice +0 processes to look like they are
niced
* change the algorithm slightly, and reorganize the code a lot
* fix percent-CPU calculation bugs, and eliminate some no-op code
=== nice bug === Correctly divide the scheduler queues between niced and
compute-bound processes. The current nice weight of two (sort of, see
`algorithm change' below) neatly divides the USRPRI queues in half; this
should have been used to clip p_estcpu, instead of UCHAR_MAX. Besides
being the wrong amount, clipping an unsigned char to UCHAR_MAX is a no-op,
and it was done after decay_cpu() which can only _reduce_ the value. It
has to be kept <= NICE_WEIGHT * PRIO_MAX - PPQ or processes can
scheduler-penalize themselves onto the same queue as nice +20 processes.
(Or even a higher one.)
=== New schedclk() mechansism === Some platforms should be cutting down
stathz before hitting the scheduler, since the scheduler algorithm only
works right in the vicinity of 64 Hz. Rather than prescale hz, then scale
back and forth by 4 every time p_estcpu is touched (each occurance an
abstraction violation), use p_estcpu without scaling and require schedhz
to be generated directly at the right frequency. Use a default stathz (well,
actually, profhz) / 4, so nothing changes unless a platform defines schedhz
and a new clock. Define these for alpha, where hz==1024, and nice was
totally broke.
=== Algorithm change === The nice value used to be added to the
exponentially-decayed scheduler history value p_estcpu, in _addition_ to
be incorporated directly (with greater wieght) into the priority calculation.
At first glance, it appears to be a pointless increase of 1/8 the nice
effect (pri = p_estcpu/4 + nice*2), but it's actually at least 3x that
because it will ramp up linearly but be decayed only exponentially, thus
converging to an additional .75 nice for a loadaverage of one. I killed
this, it makes the behavior hard to control, almost impossible to analyze,
and the effect (~~nothing at for the first second, then somewhat increased
niceness after three seconds or more, depending on load average) pointless.
=== Other bugs === hz -> profhz in the p_pctcpu = f(p_cpticks) calcuation.
Collect scheduler functionality. Try to put each abstraction in just one
place.