process context ('reaper').
From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediatelly marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit
uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.
MD process/lwp exit code now always calls lwp_exit2() immediatelly after
switching away from the exiting lwp.
g/c now unneeded routines and variables, including the reaper kernel thread
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.
This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.
On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.
Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.
All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.
PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)
containing signal posting, kernel-exit handling and sa_upcall processing.
XXX the pc532, sparc, sparc64 and vax ports should have their
XXX userret() code rearranged to use this.
Right now the only flag is used to indicate if a ksiginfo_t is a
result of a trap. Add a predicate macro to test for this flag.
* Add initialization macros for ksiginfo_t's.
* Add accssor macro for ksi_trap. Expands to 0 if the ksiginfo_t was
not the result of a trap. This matches the sigcontext trapcode semantics.
* In kpsendsig(), use KSI_TRAP_P() to select the lwp that gets the signal.
Inspired by Matthias Drochner's fix to kpsendsig(), but correctly handles
the case of non-trap-generated signals that have a > 0 si_code.
This patch fixes a signal delivery problem with threaded programs noted by
Matthias Drochner on tech-kern.
As discussed on tech-kern. Reviewed and OK's by Christos.
which is automatically included during kernel config, and add comments
to individual machine-dependant majors.* files to assign new MI majors
in MI file.
Range 0-191 is reserved for machine-specific assignments, range
192+ are MI assignments.
Follows recent discussion on tech-kern@
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.
Bump the kernel rev up to 1.6V
http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html
There were some side-effects that I didn't anticipate, and fixing them
is proving to be more difficult than I thought, do just eject for now.
Maybe one day we can look at this again.
Fixes PR kern/21517.
space is advertised to UVM by making virtual_avail and virtual_end
first-class exported variables by UVM. Machine-dependent code is
responsible for initializing them before main() is called. Anything
that steals KVA must adjust these variables accordingly.
This reduces the number of instances of this info from 3 to 1, and
simplifies the pmap(9) interface by removing the pmap_virtual_space()
function call, and removing two arguments from pmap_steal_memory().
This also eliminates some kludges such as having to burn kernel_map
entries on space used by the kernel and stolen KVA.
This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code,
this giving MD code greater flexibility over the bounds of the managed
kernel virtual address space if a given port's specific platforms can
vary in this regard (this is especially true of the evb* ports).
cd ${KERNSRCDIR}/${KERNARCHDIR}/compile && ${PRINTOBJDIR}
This is far simpler than the previous system, and more robust with
objdirs built via BSDOBJDIR.
The previous method of finding KERNOBJDIR when using BSDOBJDIR by
referencing _SRC_TOP_OBJ_ from another directory was extremely
fragile due to the depth first tree walk by <bsd.subdir.mk>, and
the caching of _SRC_TOP_OBJ_ (with MAKEOVERRIDES) which would be
empty on the *first* pass to create fresh objdirs.
This change requires adding sys/arch/*/compile/Makefile to create
the objdir in that directory, and descending into arch/*/compile
from arch/*/Makefile. Remove the now-unnecessary .keep_me files
whilst here.
Per lengthy discussion with Andrew Brown.
kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals
kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)
based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
clean up some other stuff along the way, including:
- use m68k/cacheops.*, remove duplicates from cpu.h.
- centralize a few declarations in (all the copies of) cpu.h.
- define M68K_VAC on platforms which have a VAC.
- switch the sun platforms to the (now common) proc_trampoline().
- do the phys_map thang on the sun platforms too, no reason not to.
This merge changes the device switch tables from static array to
dynamically generated by config(8).
- All device switches is defined as a constant structure in device drivers.
- The new grammer ``device-major'' is introduced to ``files''.
device-major <prefix> char <num> [block <num>] [<rules>]
- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammer.
- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.
- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.
- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.
- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.
counters. These counters do not exist on all CPUs, but where they
do exist, can be used for counting events such as dcache misses that
would otherwise be difficult or impossible to instrument by code
inspection or hardware simulation.
pmc(9) is meant to be a general interface. Initially, the Intel XScale
counters are the only ones supported.
be properly used by any misc. cloning device. While here, correct
a comment to indicate that "open" is the only entry point and that
everything else is handled with fileops.
- Switch all m68k-based ports over to __HAVE_SYSCALL_INTERN.
- Add systrace glue.
- Define struct mdproc in <m68k/proc.h> instead of <machine/proc.h>.
(They were all defined exactly the same anyway, other than a couple
of the MDP_* flags.)
into kernel_object where this was missing.
(important here because VM_MIN_KERNEL_ADDRESS != 0)
-add some diagnostics
-eliminate some differences to other Utah derived pmaps
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:
* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.
From art@openbsd.org.
Be consistant in the way that MSIZE, MCLSHIFT, MCLBYTES and NMBCLUSTERS
are defined.
Remove old VM constants from cesfic port.
Bump MSIZE to 256 on mipsco (the only one that wasn't already 256).
one, so that we don't mess up the global count of wired pages by having
the page's wire_count be non-zero when we free the page.
pointed out by Michael Hitch.
Any problems reported by testers have been fixed, and massive
cross-compiling of kernels has shown that any problems that remain
with actually building kernels are not related to this.
uvm_fault_wire(). this allows us to make pt_map non-pageable,
but we need to be careful in pmap_remove() not to attempt to
reference PTEs after the PTP has been freed.
the etc Makefile override that by putting USETOOLS into $.MAKEOVERRIDES
This way the default for kernel compiles is still to use the installed
toolchain instead of depending on $TOOLDIR. $TOOLDIR can be used by
simply adding USETOOLS=yes to the command line as usual.
Adjust each ports template to set the default no setting and also pull in
bsd.own.mk if they weren't already to ensure they'll build correctly
with the new toolchain setup.
This will allow improvements to the pmaps so that they can more easily defer expensive operations, eg tlb/cache flush, til the last possible moment.
Currently this is a no-op on most platforms, so they should see no difference.
Reviewed by Jason.
and with the comment '4.2BSD TCP/IP bug compat. Not recommended'
Add commented out 'TCP_DEBUG # Record last TCP_NDEBUG packets with SO_DEBUG'
(All hail amiga and atari which make some attempt to automate the
multiplicity of config files...)
option for System V semaphores. It appears that there are no overrides
in the code and each file has the following added.
options SYSVSEM # System V semaphores
+#options SEMMNI=10 # number of semaphore identifiers
+#options SEMMNS=60 # number of semaphores in system
+#options SEMUME=10 # max number of undo entries per process
+#options SEMMNU=30 # number of undo structures in system
options SYSVSHM # System V shared memory
If anyone thinks that this is incorrect for any of these files, please
correct it.
Note - the i386 port was not forgotten. It was done separately.
port. cesfic is a VME board with one or two mc68040 processors. See
the README file for details.
The port is working well with a.out userland, there are some problems
with ELF still, like applications running out of memory where it is not
expected. Some parts, in particular the pmap (which was taken from hp300
four years ago), need updating, but this is easier done within the NetBSD
CVS tree.