4105 Commits

Author SHA1 Message Date
atatat 5b22e79ada Remaining sysctl descriptions under kern subtree 2004-05-25 04:30:32 +00:00
jonathan 230fb9b8ab Eliminate several uses of `curproc' from the socket-layer code and from NFS.
Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded  by Matt Thomas.

Changes to soreceive() and (*dom->dom_externalize)() from Matt Thomas:
pass soreceive()'s struct uio *uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize().  Also, now that soreceive() uses
its uio->uio_procp value, pass that same value downward to
(*pr->pru_usrreq)() calls for consistency, instead of (struct proc *)0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req->r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
2004-05-22 22:52:13 +00:00
atatat dcf1a79f46 Add a DIAGNOSTIC check to detect un-initialized pools. 2004-05-20 05:08:29 +00:00
nathanw 78c16ce8ed Adjust code that tries to prevent cc_microtime() from going backwards
so that it doesn't fire when called twice in the same microsecond,
which can lead to large error accumulation.

Appears to fix "repeated gettimeofday() goes backwards" on a fast
alpha and i386 box.
2004-05-18 16:09:07 +00:00
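Illustration for 78c16ce8ed: a minimal sketch of the idea, not the actual cc_microtime() code. It assumes the "going backwards" guard compares the new reading against the last value handed out; the names tv_before() and clamp_monotonic() are hypothetical. The point is the strict comparison: an equal timestamp (two calls in the same microsecond) passes through unchanged, so the guard does not add one microsecond of error on every such call.

```c
#include <stdbool.h>
#include <sys/time.h>

/* Hypothetical helper: true only if a is strictly earlier than b. */
static bool
tv_before(const struct timeval *a, const struct timeval *b)
{
	if (a->tv_sec != b->tv_sec)
		return a->tv_sec < b->tv_sec;
	return a->tv_usec < b->tv_usec;
}

/*
 * Hypothetical guard: if the new reading is strictly earlier than the
 * last one returned, hand out "last + 1us" instead.  A reading equal to
 * the last one is returned as-is, so repeated calls within the same
 * microsecond do not accumulate error.
 */
static struct timeval
clamp_monotonic(struct timeval now, struct timeval *last)
{
	if (tv_before(&now, last)) {
		now = *last;
		if (++now.tv_usec >= 1000000) {
			now.tv_usec -= 1000000;
			now.tv_sec++;
		}
	}
	*last = now;
	return now;
}
```

With a "less than or equal" test in place of tv_before(), every same-microsecond call would be pushed forward by a microsecond, which is the error accumulation the commit message describes.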
yamt efc80878d1 use lockstatus() instead of L_BIGLOCK to check if we're holding a biglock.
fix PR/25595.
2004-05-18 11:59:11 +00:00
yamt b4831906b2 introduce LK_EXCLOTHER for lockstatus().
from FreeBSD, but a little differently.  instead of letting lockstatus()
take an additional thread argument, always use curlwp/curcpu.
2004-05-18 11:55:59 +00:00
ragge ac1e5c0888 Fix connect() "bug": If connect() is interrupted by a signal, the connection
attempt is terminated, so if a process needs frequent timer interrupts
it can't ever connect() to a machine far away.

Bug found by Erik Lundgren; the bugfix is similar to the way FreeBSD solved
the same problem.

As a side effect, the new connect() behaviour conforms to POSIX.
2004-05-18 11:31:49 +00:00
christos d3f7c2a23c Check for bad offsets at the beginning of the functions to save processing.
Idea from OpenBSD.
2004-05-14 16:36:33 +00:00
kleink 71b3883248 KNF previous. 2004-05-13 17:56:14 +00:00
christos 6033f15f86 Disable chgsbsize. It is not MPSAFE 2004-05-13 17:43:11 +00:00
matt 617ba1df60 In proc_representative_lwp, if there is an outstanding trap signal, return
the lwp that had the trap.
2004-05-12 21:10:09 +00:00
yamt 054ed3afcb use callout_schedule() for schedcpu(). 2004-05-12 20:13:58 +00:00
cube 8a0e3b4be1 In sysctl_destroyv, the newly created dnode structure must have its
version set to the correct value to prevent later failure of
sysctl_cvt_in.
2004-05-12 12:21:39 +00:00
kleink 90c0c343b0 Regen from syscalls.master rev. 1.142:
POSIX-2001: Change readlink(2)'s return type from int to ssize_t.
2004-05-10 22:30:41 +00:00
kleink 43b7ae77fa POSIX-2001: Change readlink(2)'s return type from int to ssize_t. 2004-05-10 22:28:23 +00:00
yamt 68b4772ef6 redo the previous (rev.1.58; overwrite a duplicate entry rather than leave it)
differently so that entries entered while we're doing pool_get() are
checked as well.  pointed out by Paul Kranenburg on source-changes@.
2004-05-07 12:05:41 +00:00
pk fba1aa540d Provide a mutex for the process limits data structure. 2004-05-06 22:20:30 +00:00
yamt 8d615f3e18 cache_enter: when we found a duplicate entry,
simply overwrite it rather than leaving a stale entry.
2004-05-06 22:02:02 +00:00
yamt f573d83f7a no need to cache_purge() in getnewvnode().
it should be already done by vclean().
2004-05-06 22:01:14 +00:00
atatat 778eadaf46 Add a printf() to the other case in sysctl_createv() where a node did
not get attached for what should be an extremely unusual case.
2004-05-06 07:06:46 +00:00
pk b2260877bf proc_reparent() must be called with proclist write lock held. Make it so. 2004-05-04 21:58:47 +00:00
pk 2fb3dac280 Since a `vmspace' always includes a `vm_map' we can re-use vm_map's
reference count lock to also protect the vmspace's reference count.
2004-05-04 21:33:40 +00:00
pk 3ec3f724be crcopy: no need to lock if we're only reading the structure's reference count. 2004-05-04 21:27:28 +00:00
pk f3f1104ed8 Change sigactsfree() to take a `struct sigacts' pointer, to fit the needs
of exit1 (its only client).
2004-05-04 21:25:47 +00:00
pk d190ac352d exit1: if !BIGLOCK, once the exiting process has been placed on the zombie
list and the proclist lock is released, we shouldn't touch the process
structure anymore, since it may be collected immediately by a waiting
parent.
2004-05-04 21:23:39 +00:00
martin efe61cce0d Fix a comment.
Approved by Andrew Brown.
2004-05-03 13:39:50 +00:00
pk 7d0afa7f41 Add mutex to protect the ucred reference counter. 2004-05-02 12:36:55 +00:00
pk 2834786715 Add a mutex for mount point I/O and wait counters (i.e. the `mnt_wcnt',
`mnt_writeopcountupper' and `mnt_writeopcountlower' members).
2004-05-02 12:21:02 +00:00
pk 5c36071518 cache_enter: concurrent lookups in the same directory may race for a
cache entry. Upon detection, free our tentative entry and return.
2004-05-02 12:00:34 +00:00
pk 1bc2407362 sys_access: use crdup(). 2004-05-02 11:13:29 +00:00
matt d1fcd75db0 Define link_sets start/stop as ptype * const [] since they are in a
readonly section.
2004-05-01 07:16:55 +00:00
matt a029630354 Commons are not allowed in header files. extern them and declare them in
the appropriate .c file.
2004-05-01 06:17:26 +00:00
matt a035030007 Use EVCNT_ATTACH_STATIC 2004-05-01 02:24:38 +00:00
enami a874187808 ANSI'fy the rest of functions. 2004-04-30 07:51:59 +00:00
simonb 01837603b0 Fix "comments within comments" problem pointed out by Geoff Wing on
source-changes.
2004-04-27 05:25:33 +00:00
kleink 3925dc263a Regen from syscalls.master 1.141: [gs]ettimeofday(2) argument declaration
change.
2004-04-27 01:15:38 +00:00
kleink 681b62c2ce POSIX-2001: Add restrict keywords to gettimeofday(2) and setitimer(2);
further deprecate struct timezone usage by changing `tzp' argument to
gettimeofday() to void *; align utimes(2) declaration by changing `times`
argument from struct timeval * to struct timeval[2].  From Murray
Armfield in PR standards/25331.

In due course, reflect these changes in futimes(2), lutimes(2), and
settimeofday(2).
2004-04-27 01:12:44 +00:00
kleink 679cb3e5a5 Regen from rev. 1.140:
POSIX-2001: Change the `who' argument to [gs]etpriority(2) from int
to id_t.  Partially addressing PR standards/25216 from Murray Armfield.
2004-04-25 22:21:17 +00:00
kleink 3e7f30c118 POSIX-2001: Change the `who' argument to [gs]etpriority(2) from int
to id_t.  Partially addressing PR standards/25216 from Murray Armfield.
2004-04-25 22:18:08 +00:00
simonb b5d0e6bf06 Initialise (most) pools from a link set instead of explicit calls
to pool_init.  Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.
2004-04-25 16:42:40 +00:00
yamt ab195ed32f bio_doread: vp is always non-NULL here. 2004-04-25 12:41:12 +00:00
matt f86644a808 Constify the table argument to ttspeedtab. 2004-04-25 06:13:38 +00:00
atatat 3f800573aa Be consistent about using sysc_init_field() 2004-04-25 05:54:38 +00:00
atatat 990f278f7a Remove dynamic sysctl node version 0 from the tree. It seemed okay at
first, but quickly showed its shortcomings.  The version 1 node we're
now using should be good for a while.
2004-04-25 05:47:52 +00:00
simonb 9bc855a931 s/the the/the/ (only in sources that aren't regularly imported from
elsewhere).
2004-04-23 02:58:27 +00:00
yamt 05076bfbb9 chgsbsize: correct limit check and ui_sbsize calculation.
ok'ed by Christos Zoulas.
2004-04-23 02:13:29 +00:00
enami 45a4841ce9 Copy fsidx so as not to break binary compatibility of mountd etc. 2004-04-22 03:47:58 +00:00
matt e50668c7fa Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols are loaded).
2004-04-22 01:01:40 +00:00
matt fde909e1a1 Add prototype for uiomove_frombuf. Change uiomove_frombuf to use size_t
for its length argument (to be the same as uiomove).  Remove code that
dealt with length being negative.
2004-04-21 20:31:50 +00:00
itojun d2f1c029b9 kill sprintf, use snprintf 2004-04-21 18:40:37 +00:00
christos 6bd1d6d4db Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
2004-04-21 01:05:31 +00:00
christos ed95f3e980 Charge root for socket buffers without a socket pointer. 2004-04-19 03:44:46 +00:00
lukem fad77af2ac Add "vfs.generic.fstypes" sysctl, which contains a space-separated
list of file system types currently supported by the kernel.
Previously there wasn't an easy way to determine this.
(Code shamelessly cribbed from subr_disk.c::sysctl_hw_disknames().)

Use LIST_FOREACH() appropriately.
2004-04-19 00:15:55 +00:00
matt ac57eb9d5b Constify sun_noname. 2004-04-18 22:20:32 +00:00
matt 70e1f0d3ac ANSI'fy. 2004-04-18 21:48:15 +00:00
matt 91bb3497f5 Constify the addr parameter to sbappendaddr. 2004-04-18 21:47:11 +00:00
matt 8f23e3baa1 sbreserve can be called with a NULL socket, deal with it. 2004-04-18 16:38:42 +00:00
christos f13a3d0852 PR/9347: Eric E. Fair: socket buffer pool exhaustion leads to system deadlock
and unkillable processes.
1. Introduce new SBSIZE resource limit from FreeBSD to limit socket buffer
   size resource.
2. make sokvareserve interruptible, so processes ltsleeping on it can be
   killed.
2004-04-17 15:15:29 +00:00
atatat 904ca21614 Prefer that kern.hostid is printed in hex, not as a signed decimal,
and avoid accidental sign-extension when setting it.
2004-04-16 13:25:40 +00:00
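Illustration for 904ca21614: a small sketch of what "accidental sign-extension" means when a 32-bit value is widened. The log does not show how kern.hostid is stored, so the types and the helper names widen_signed() and widen_unsigned() are assumptions made for the example only.

```c
#include <stdint.h>

/*
 * Widening a negative 32-bit value directly copies its sign bit into
 * the upper 32 bits of the result.
 */
static uint64_t
widen_signed(int32_t id)
{
	return (uint64_t)id;
}

/*
 * Converting to an unsigned 32-bit type first keeps the upper 32 bits
 * zero, so only the original 32 bits of the id survive.
 */
static uint64_t
widen_unsigned(int32_t id)
{
	return (uint64_t)(uint32_t)id;
}
```

For an id whose top bit is set, widen_signed() yields a value with all upper bits set, while widen_unsigned() yields the same 32-bit pattern with a zero upper half; printing the latter in hex shows the id as the administrator typed it.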
pk f663e2397e checkalias: pass LK_NOWAIT to vget() while holding the spechash spinlock. 2004-04-16 09:59:32 +00:00
provos 2e7c8ca97f check process flags, noted by Stefan Esser 2004-04-09 16:49:33 +00:00
atatat 3a5915c0ae Lots of sysctl descriptions (if someone wants to help out here, that
would be good) mostly copied from sysctl(3).  This takes care of the
top-level, most of kern.* and hw.* (modulo the ath and bge stuff), and
all of proc.*.

If you don't want the added rodata in your kernel, use "options
SYSCTL_NO_DESCR" in your kernel config.
2004-04-08 06:20:29 +00:00
atatat a70c39ff35 Clear out the struct kinfo_drivers before stuffing things into it.
Avoids leaking garbage from the stack (left over from the earlier
call to sysctl_locate()).
2004-04-08 03:35:10 +00:00
atatat 275e2ae7f3 First caller to set a description on a node sets it. This allows one
setup function to set the description, even if the node has been
instantiated elsewhere.  Or not, depending on the order in which the setup
functions are called.
2004-04-06 18:52:35 +00:00
yamt 4972de4cca make cache_purge more controllable.
namely, allow the following operations.
	- purge only an entry specified by a component name.
	- purge only child entries.
	- purge only parent entries.
no objections on tech-kern@.
2004-04-05 10:20:52 +00:00
yamt 9aa8d354bd add assertions related to file descriptor allocation. 2004-04-05 10:10:29 +00:00
pk b3efee4b3b We use maxdmap and maxsmap, so remove comment questioning that. 2004-04-04 18:22:44 +00:00
matt 4dfb2be423 When a process is being traced (debugged) and a catchable signal arrives,
make sure to save its ksiginfo_t for eventual delivery.  This makes debugging
SA_SIGINFO signal handlers work.
2004-04-03 19:46:10 +00:00
matt dfe066e777 Add the notion of an "empty" ksiginfo_t (one where only signo is filled in).
Add an initializer for them: KSI_INIT_EMPTY
Add a predicate for them: KSI_EMPTY_P
Don't bother storing empty ksiginfo_t's since they have no information.
Change uses of KSI_INIT to KSI_INIT_EMPTY where no information other
than the signo is being filled in.
2004-04-03 19:43:08 +00:00
matt 11d24b6c29 Replace memset's of ksiginfo_t with KSI_INIT (which is the proper way to
initialize ksiginfo_t structures).
2004-04-03 19:38:04 +00:00
matt 44708bbf1a If a signal is the result of a trap, only invoke a supplied handler if it's
not blocked.  Otherwise (if it is blocked or the handler is set to SIG_IGN)
reset the signal back to its default settings so that a coredump can be
generated.
2004-04-01 16:56:44 +00:00
atatat f06d00c1a8 Add the standard "is this tree writeable" check to sysctl_describe()
and a comment to sysctl_destroy() about why the check is slightly
different there.
2004-04-01 04:50:06 +00:00
yamt f74afe6463 ras_fork: don't do PR_WAITOK holding a spinlock. 2004-04-01 02:37:42 +00:00
yamt b27349c286 ras_install: don't do pool_get(PR_WAITOK) while we're holding a spinlock. 2004-04-01 01:49:04 +00:00
matt b173c9d332 Make kernel continuations optional for now. 2004-03-28 22:43:56 +00:00
atatat d97889de23 Fix sysctl_createv() so that rnode and cnode can refer to the same
pointer.  Fix sysctl_create() so that nodes cannot be added to an
alias node.
2004-03-27 04:26:23 +00:00
petrov 2edf945c23 sys_sa_yield returns EJUSTRETURN. 2004-03-27 00:49:47 +00:00
jonathan 63fe9ef057 Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically.  (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)
2004-03-27 00:42:38 +00:00
drochner 945c30f4ab all ports define __HAVE_SIGINFO now, so remove the CPP conditionals 2004-03-26 17:13:37 +00:00
drochner 4f4ec7e627 regen after __HAVE_SIGINFO removal 2004-03-26 15:29:28 +00:00
drochner 9fd8e8983b all ports define __HAVE_SIGINFO now, so remove the CPP conditionals 2004-03-26 15:18:54 +00:00
simonb 1c13fd358f Give buf_lotsfree() a bit of a service:
- Fix a 32-bit overflow that could erroneously return true even if the
  currently allocated buffer memory was greater than the high water mark.
- Add an early check for bufmem > hiwater to avoid a needless call to
  random().
- Sprinkle some comments.

Add a vm.bufmem sysctl so the current bufmem value can be easily queried
from userland.

Reviewed by Thor Simon.
2004-03-26 00:31:55 +00:00
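Illustration for 1c13fd358f: a hedged sketch of the kind of 32-bit overflow described, not the actual buf_lotsfree() source. It assumes the function compares a random draw in the range 0..15 against how full the cache is, measured in sixteenths; the names fill_sixteenths_wrapping() and may_grow(), the low water mark parameter, and the draw being passed in by the caller are all assumptions made so the example is deterministic.

```c
#include <stdint.h>

/*
 * Overflow-prone form: 16 * bufmem is computed in 32 bits and wraps
 * once bufmem reaches 256 MiB, so a very full cache can look empty.
 */
static uint32_t
fill_sixteenths_wrapping(uint32_t bufmem, uint32_t hiwater)
{
	return ((uint32_t)16 * bufmem) / hiwater;
}

/*
 * Safer form: check the bounds first, then divide before scaling so no
 * intermediate value exceeds 32 bits.  `draw` stands in for a random
 * number in 0..15 supplied by the caller.
 * Returns 1 if the cache may grow, 0 otherwise.
 */
static int
may_grow(uint32_t bufmem, uint32_t lowater, uint32_t hiwater, uint32_t draw)
{
	uint32_t step, thresh;

	if (bufmem < lowater)
		return 1;		/* below the floor: always grow */
	if (bufmem > hiwater)
		return 0;		/* early exit, no random draw needed */
	step = (hiwater - lowater) / 16;
	if (step == 0)
		return 0;		/* degenerate configuration */
	thresh = (bufmem - lowater) / step;
	return draw >= thresh;
}
```

With bufmem at 512 MiB and a 256 MiB high water mark, the wrapping form yields 0 sixteenths, so any draw would pass the test; the bounds check in may_grow() rejects the same input before a draw is even consulted.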
simonb 07056cd3d1 More white space nits. 2004-03-25 23:17:16 +00:00
enami c4a655ef80 Misc. style fix; white-space usage, comment style, ansify, multiple
include protection, rcsid, typo, ...
2004-03-25 23:02:58 +00:00
atatat 76f167c40b Set version in node destroy request 2004-03-25 22:16:04 +00:00
pooka 845a217e15 Convert pool_get()'s from nowait to waitok. We're allowed to block,
and this is more acceptable since the code assumes success.

gmcgarry ok
2004-03-25 22:08:33 +00:00
atatat 44afe14cb6 Unwind the nested designators for fields within structs within structs
(or unions).  This should really be put back once we're all using gcc3
for everything, since that makes it look a *lot* cleaner.
2004-03-25 18:36:49 +00:00
drochner bcb7a96b95 In exec_sigcode_map(), do nothing if the sigcode is of
size 0.
This way, individual ports can circumvent sigcode mapping
by setting sigcode/esigcode.
(would be better to clean up the __HAVE_SIGINFO/COMPAT_XX
stuff, but it is not a good moment now)
2004-03-25 18:29:24 +00:00
simonb c67d420cbf White-space nit. 2004-03-25 08:22:31 +00:00
pooka 8a7ed44002 * replace incorrect M_WAITOK flag from pool_get() by proper PR_WAITOK
and remove redundant check for NULL return value
* switch pool page allocator to nointr allocator

jdolecek sayeth ok
2004-03-24 20:25:28 +00:00
atatat 38c4183b04 Implement sysctl descriptions. Now all that remains is actually to
write them.
2004-03-24 18:11:09 +00:00
atatat 5aab77f087 Framework for sysctl descriptions. Implementation to follow shortly. 2004-03-24 17:40:02 +00:00
atatat c6abd47f96 New node version and layout. This should take care of the netbsd32
emulation problem, formalizes the versioning (should it ever be needed
again), and provides a slot for descriptions.
2004-03-24 17:21:02 +00:00
atatat 289b641ef9 Implement sysctllog and sysctl_teardown(), which unwinds the log. 2004-03-24 16:55:49 +00:00
atatat d42aae36c0 The new sysctl query interface returns the same information as the old
one, but you must pass in an empty node that indicates the version
you're using.
2004-03-24 16:34:34 +00:00
atatat 19af35fd0d Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
2004-03-24 15:34:46 +00:00
atatat 70057f1d4e That copystr() should be copyinstr(), and fix a couple of places where
aliasing needs to be avoided.
2004-03-24 15:25:43 +00:00
pooka cca4b8c09d tyop in comment 2004-03-24 10:01:46 +00:00
matt 3549b49655 Don't use malloc/free for fixed sized items, use a pool instead. 2004-03-24 01:27:57 +00:00
junyoung fdc32973e7 - Nuke __P().
- Drop trailing spaces.
2004-03-23 13:22:32 +00:00
junyoung a222c81884 Nuke __P(). 2004-03-23 13:22:03 +00:00
cl d636a8b9cb On MP, exit postsig() when another LWP has already handled the signal while
this LWP was waiting for the kernel lock.

Fixes PR kern/24829
2004-03-21 18:41:38 +00:00
mycroft 9f9d44127e Remove part of a very old change that caused NFS to not enforce socket buffer
limits.  No idea why it was done in the first place.

Don't remember who reported this, but I think it was yamt.
2004-03-21 00:54:46 +00:00
he cbeffeb007 Make this compile on platforms which do not define
__HAVE_GENERIC_SOFT_INTERRUPTS, such as sun3.
2004-03-20 18:34:57 +00:00
martin c72dda16a9 Include <lib/libkern/libkern.h> for KASSERT. 2004-03-20 10:39:21 +00:00
snj 089c0cfc10 Fix typos. 2004-03-20 02:57:34 +00:00
jonathan ec8cb83cd3 Initial import of kcont(9), as posted to tech-kern for discussion in
January 2004. This version also incorporates fixes (several typos and
other detailed improvements) commented upon by Nathan J Williams.
2004-03-20 02:22:49 +00:00
enami 55c19744c4 - remove unnecessary code.
- factor out common code.
- don't stop searching before the target.
- touch the correct object.
- validate the argument before the loop otherwise we need to roll back.
2004-03-18 22:57:38 +00:00
enami a67d24818d Whitespace nits and wrap some lines. 2004-03-18 22:53:16 +00:00
christos b4d69b5716 PR/24814: Colin Percival: sysv_sem waiter counting problem 2004-03-18 01:16:44 +00:00
yamt 639cdf812b sokvaalloc: unreserve kva if uvm_km_valloc_wait failed. 2004-03-17 10:30:18 +00:00
yamt 82b343cc81 - move kern.somaxkva sysctl stuff from init_sysctl.c to uipc_socket.c.
- when changing its value, wakeup sokva waiters.
2004-03-17 10:21:59 +00:00
yamt 097a3aea2e - fix locking of sosend kva allocation.
- some comments.
2004-03-17 10:03:26 +00:00
yamt 2429c10607 remove per-socket pendfree list. 2004-03-17 09:58:15 +00:00
cl ea5ec0212d add kernel part of concurrency support for SA on MP systems
- move per VP data into struct sadata_vp referenced from l->l_savp
  * VP id
  * lock on VP data
  * LWP on VP
  * recently blocked LWP on VP
  * queue of LWPs woken which ran on this VP before sleep
  * faultaddr
  * LWP cache for upcalls
  * upcall queue
- add current concurrency and requested concurrency variables
- make process exit run LWP on all VPs
- make signal delivery consider all VPs
- make timer events consider all VPs
- add sa_newsavp to allocate new sadata_vp structure
- add sa_increaseconcurrency to prepare new VP
- make sys_sa_setconcurrency request new VP or wakeup idle VP
- make sa_yield lower current concurrency
- set sa_cpu = VP id in upcalls
- maintain cached LWPs per VP
2004-03-14 01:08:47 +00:00
cl f1bacc8b38 disable SA upcalls during "systrmsg" sleep
-> improves problem from PR bin/23429
2004-03-14 00:48:58 +00:00
cl 63fe298156 regen after:
g/c sys_sa_unblockyield which has been unused since 2004/01/02
2004-03-14 00:47:25 +00:00
cl 919b9e33c4 g/c sys_sa_unblockyield which has been unused since 2004/01/02 2004-03-14 00:45:21 +00:00
matt 879040549d Only do the pmap_procwr if the uvm_io succeeded. 2004-03-13 18:43:18 +00:00
christos 7bd0e983e2 PR/24750: Frank Kardel: panic when process is signalled during
proc initialization.
2004-03-11 22:34:26 +00:00
christos cde926b610 PR/24745: Jared Momose: kernel prompts for a root device when using md_root 2004-03-11 15:17:55 +00:00
yamt f75335b469 - add a function prototype.
- constify.
2004-03-09 12:23:07 +00:00
yamt cd9b5b72f5 m_cat: assert mbuf types only when coalescing them by copying.
mbuf n often have 0-sized "headers" and their types don't matter much.

PR/24713 from Darrin B. Jewell.
2004-03-09 06:37:59 +00:00
dbj 7a30c4a987 add more spltty() calls around TTY_LOCK/UNLOCK where needed 2004-03-09 05:30:24 +00:00
junyoung 70706199eb Whitespaces. 2004-03-09 02:35:45 +00:00
dbj 436daafe7e add splvm() around a few pa_slock and psppool calls since they
may be shared with pools that can be used in interrupt context.
2004-03-08 22:48:09 +00:00
atatat 73c41a46cc Some optimization for sysctl_locate() 2004-03-08 03:31:26 +00:00
junyoung 0f89803028 Drop trailing spaces. 2004-03-05 11:30:50 +00:00
junyoung 103afd6ebf lwp_exit2(): set lwp state to SZOMB at more appropriate point. 2004-03-05 11:17:41 +00:00
dbj f8e0478668 add some spltty() calls around TTY_LOCK() calls that didn't have them 2004-03-05 07:27:22 +00:00
matt 1b4f540b78 Look at _UC_STACK to decide whether the process' SS_ONSTACK state needs to
be updated.  (This is needed to be compatible with how pre-SIGINFO signals
operated.  If you siglongjmp out of a signal handler, the SS_ONSTACK state
needs to be cleared.  This commit restores that functionality).
2004-03-04 00:05:58 +00:00
dsl 1288fac2ba No need to initialise [rw]pipe twice.
Initialise locks before trying to allocate pipe buffer, so that when allocation
fails we'll not explode trying to acquire the locks when tidying up.
2004-03-03 22:00:34 +00:00
christos 08230af71c initialize rpipe and wpipe to NULL, so that they are initialized in the
error path.
2004-03-03 21:35:52 +00:00
yamt 471ef5f249 once exit1() releases big kernel lock, the struct proc can be freed and
re-used by another cpu immediately.  in that case, lwp_exit2() will
access freed memory.  to fix this:

- remove curlwp from p_lwps in exit1() rather than letting lwp_exit2() do so.
- add assertions to ensure freed proc has no lwps.

kern/24329 from me and kern/24574 from Havard Eidnes.
2004-03-02 09:15:26 +00:00
yamt 395e9958f2 change the way to handle NEW_BUFQ_STRATEGY option.
instead of putting #ifdefs into each driver,
use a global variable to indicate default strategy.

XXX should have a way to specify other strategies.
2004-02-28 06:28:47 +00:00
junyoung d177d4c744 More typos in comments. 2004-02-27 02:43:25 +00:00
junyoung c5a0b24bb5 pgrpdump() is gone. 2004-02-26 11:29:41 +00:00
junyoung 213495299b - Fix typos.
- De-__P().
- Remove trailing spaces.
2004-02-26 11:20:08 +00:00
jdolecek 52197d307a pipelock() must release the pipe simplelock during tsleep()
fixes PR kern/24551 by Havard Eidnes
2004-02-26 08:15:31 +00:00
itojun efcd57f822 m_cat() - if it is safe, copy data portion into 1st mbuf even if 1st mbuf
is M_EXT mbuf.
2004-02-26 02:30:04 +00:00
enami dab2cb5bb0 Whitespace nits. 2004-02-25 21:40:40 +00:00
enami f7b4bb80a5 Make ktrwrite() and ktrinitheader() private again. ktrsyscall32() no longer
exists.
2004-02-25 21:34:18 +00:00
dbj 5fd36718ae fix typo in comment s/MNT_LAXY/MNT_LAZY/ 2004-02-25 04:10:28 +00:00
christos 7088db9a48 remove error(1) comment. 2004-02-24 20:57:26 +00:00
wiz f05e6f1a3a occured -> occurred. From Peter Postma. 2004-02-24 15:12:51 +00:00
jdolecek 4d49760268 use the new NOTE_SUBMIT to flag if the locking is necessary
for EVFILT_READ/EVFILT_WRITE knotes

fixes PR kern/23915 by Martin Husemann (pipes), and similar locking problem
in tty code
2004-02-22 17:51:25 +00:00
jdolecek 2ae728e7ef mount(2): if vinvalbuf() fails, we must also vput() the mountpoint vnode
fixes stale vnode lock after attempt to mount something on an NTFS directory
2004-02-22 09:56:26 +00:00
dan 5819919614 micro-optimisation - if we're going to return 0, do so before doing
other unnecessary work
2004-02-22 01:00:41 +00:00
enami 06107df871 Modify pool page header allocation strategy as follows:
In addition to current one (i.e., don't waste too large a part of the page),
- if the header fits in the page without wasting any items, put it there.
- don't put the header in the page if it may consume a rather big item.

For example, on i386, header is now allocated in the page for the pools
like fdescpl or sigapl, and allocated off the page for the pools like
buf1k or buf2k.
2004-02-22 00:19:48 +00:00
atatat 56392ab40b Use KERN_PROCSLOP for struct kinfo_proc and KERN_LWPSLOP for
struct kinfo_lwp, and not vice versa.

Should solve the issue with top dying because it's unable to "allocate
memory".
2004-02-21 03:27:57 +00:00
atatat 42d379d041 Use new PTRTOUINT64() macro instead of local PTRTOINT64() macro. 2004-02-19 03:57:56 +00:00
atatat caea20e952 Add PTRTOUINT64() and UINT64TOPTR() macros to sys/sysctl.h for use by
kern.proc, kern.proc2, kern.lwp, and kern.buf.

Define more MIB for kern.buf so that specific buffers can be selected
(only all/all is supported right now), and use a 32/64 bit agnostic
structure for communicating buffer information to userland.

Convert systat to the new kern.buf method.

Clean up the vm.buf* handling a little.  There's no actual need to
record the dynamically assigned OIDs, since sysctl_data can tell us
what we're looking at.

Oh, and fix a typo in a comment.
2004-02-19 03:56:30 +00:00
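Illustration for caea20e952: one plausible shape for pointer/64-bit conversion macros of this kind. The macro names come from the commit message, but the definitions below are an assumption for illustration, not a copy of sys/sysctl.h. Going through uintptr_t gives the same 64-bit representation whether the kernel pointer is 32 or 64 bits wide, with the upper half zero on 32-bit platforms.

```c
#include <stdint.h>

/* Assumed definitions: widen via uintptr_t so no sign bit is propagated. */
#define PTRTOUINT64(p)	((uint64_t)(uintptr_t)(p))
#define UINT64TOPTR(u)	((void *)(uintptr_t)(u))
```

A structure exported to userland can then carry addresses in uint64_t fields, and a 32-bit program reading data from a 64-bit kernel (or the reverse) sees the same layout.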
matt cb57c6f8e9 Move detection of a special symbol into a separate function. Add some more
special symbols.
2004-02-19 03:42:01 +00:00
matt f01501b2c6 Support really large LKMs. Find out how much space is needed for symbols
and then allocate it on demand.  Rename some common symbols (__bss_start,
_edata, _end, __start_link_set_*, __stop_link_set_*) so that ".<module>"
is appended to them.  This shrinks an amd64 kernel by 20KB of BSS.
2004-02-18 23:44:49 +00:00
matt 004f0d503a s/sumbols/symbols/ 2004-02-18 20:41:09 +00:00
hannken c59d4851b8 Run pmap_deactivate() earlier in exit1(). Prevents a panic on sparc MP
where p->p_vmspace was 0xdeadbeef in pmap_deactivate().

Approved by: YAMAMOTO Takashi <yamt@netbsd.org>
2004-02-18 14:42:20 +00:00
simonb d7ee872c5f Don't shadow a function name with a parameter. 2004-02-17 11:36:01 +00:00
tron 7008209ace Include "sys/systm.h" to get the prototype for panic() which is required
for diagnostic kernels.
2004-02-17 08:22:12 +00:00
rtr 8845b1e975 split off the evcnt code (which is unrelated to autoconfiguration)
into a separate file

approved by simonb@
2004-02-17 05:03:15 +00:00
enami 456851e71a Some whitespace fix. 2004-02-17 01:45:34 +00:00
enami d59c88c291 The vnode capability id is gone. 2004-02-17 01:35:33 +00:00
enami 6a268a570b Rewind the `bp' advanced backward by cache_revlookup() if getcwd_getcache()
finally returns cache miss.

# Slightly modified from posted version so that it is cleanly patchable
# at least on 1.6 branch.
2004-02-17 01:29:39 +00:00
yamt 0e9e078e22 - raise ipl when calling buf_canrelease() because it traverses buffer queue.
- correct/add comments on buf_canrelease().
2004-02-16 09:34:15 +00:00
jdolecek 159f41eca4 allocate wired memory for the marker kevent in kqueue_scan() instead
of using on-stack memory, so that this wouldn't eventually cause kernel
panic if the process gets swapped out and another process runs kqueue_scan().
problem pointed out in kern/24220 by Stephan Uphoff
2004-02-14 11:56:28 +00:00
hannken 142e9d5deb Add a generic copy-on-write hook to add/remove functions that will be
called with every buffer written through spec_strategy().

Used by fss(4). Future file-system-internal snapshots will need them too.

Welcome to 1.6ZK

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
2004-02-14 00:00:56 +00:00
wiz d20841bb64 Uppercase CPU, plural is CPUs. 2004-02-13 11:36:08 +00:00
enami 7ff66821f4 Also defer the writing of KTR_EMUL entry. Otherwise, the parent process
may sleep with setting KTRFAC_ACTIVE of child process and the child will
run without emitting any ktrace entry.
2004-02-12 23:47:21 +00:00
tls eb9b96577c Fix bug noted by yamt@netbsd.org: the UVM free target is in *pages*,
so the last change has us comparing pages to bytes instead of pages
to buffers!  The consequence was to try to free radically less memory
than UVM wanted us to -- though always at least one buffer, which is
probably why the results weren't dire.

This does suggest that buf_canrelease() could be a *lot* more
conservative about how much to release than "2 * page deficit".  In
fact, serious trouble seems to ensue if it's not -- when anything
else on the system demands enough pages, we slam down to the low
water mark and stay there.  I've adjusted it to use min(page deficit,
buffer memory / 16), which still isn't quite right but seems better.

Another change: consider the case of an infinite loop that does
"tar xzf pkgsrc.tar.gz ; rm -rf pkgsrc".  Each time the rm runs,
all the dead metadata will go on the AGE list -- and, until we hit
the high-water mark, stay there, at which point it may be slowly
recycled.  Two adjustments seem to solve this:  1) whack buf_lotsfree()
to return 0 if there's anything on the AGE list; 2) whack buf_canrelease()
to count the memory used by the AGE list and always return at least
that much.

This basically turns the AGE list into a "delayed free" list, since we
can't entirely eliminate it as we can't free pool items from interrupt
context (e.g. from biodone()).

To consider: with the bookkeeping corrected, should buf_drain() move
back to the _end_ of the pagedaemon, and should the calculation then
try to give back at least the current deficit?
2004-02-11 17:36:31 +00:00
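Illustration for eb9b96577c: a sketch of the release calculation as the message describes it, assuming everything is expressed in bytes once the page deficit has been converted. The function name release_target() and its parameters are hypothetical; the real buf_canrelease() may use different units and bookkeeping.

```c
#include <stdint.h>

/*
 * Hypothetical release target, in bytes:
 *   - convert the page deficit to bytes so units match,
 *   - cap the request at 1/16 of current buffer memory,
 *   - never return less than the memory held on the AGE list.
 */
static uint64_t
release_target(uint64_t page_deficit, uint64_t page_size,
    uint64_t bufmem, uint64_t age_bytes)
{
	uint64_t want = page_deficit * page_size;	/* pages -> bytes */
	uint64_t cap = bufmem / 16;
	uint64_t target = want < cap ? want : cap;

	return target > age_bytes ? target : age_bytes;
}
```

The unit conversion on the first line is the step whose absence the message calls "comparing pages to bytes"; the final comparison is what makes the AGE list behave like a delayed-free list.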
yamt 1e18e59746 - borrow vmspace0 in uvm_proc_exit instead of uvmspace_free.
the latter is not an appropriate place to do so and it broke vfork.
- deactivate pmap before calling cpu_exit() to keep a balance of
  pmap_activate/deactivate.
2004-02-09 13:11:21 +00:00
yamt fa47baddee lwp_exit2: grab kernel_lock to preserve locking order. 2004-02-09 13:02:48 +00:00
yamt a45adbd9c7 don't deactivate pmap in exit1 because we'll touch the pmap later.
instead, borrow vmspace0 immediately before destroying the pmap
in uvmspace_free.
2004-02-07 10:05:52 +00:00
christos 13057976a6 include <uvm/uvm_object.h> for the benefit of ports that don't include
it in <machine/pmap.h>
2004-02-06 13:46:27 +00:00
junyoung 48d5030e12 ANSIfy & zap some blank lines. 2004-02-06 08:08:46 +00:00
junyoung 9a410f9ed0 Rename es_check in struct execsw to es_makecmds. 2004-02-06 08:02:58 +00:00
pk f092315b50 pg_delete: re-arrange SESSRELE() calls to allow for better code generation. 2004-02-06 06:59:33 +00:00
pk 7026ce08c8 ioctl TIOCSCTTY: re-arrange SESSHOLD() calls to allow for better code generation. 2004-02-06 06:58:21 +00:00
christos 0283dcd8a6 - Don't use uao_ functions directly; use them through the pgops methods.
- Fix missing reference leak in the error path of shmat() mentioned in
  Full-Disclosure.
2004-02-05 22:28:33 +00:00
christos 6b1b54b981 Don't use uao_reference, directly use the pgops instead. XXX: we should
prolly make all the uao_ functions used in pgops static.
2004-02-05 22:26:52 +00:00
tls aeaf748ff2 Buffer cache fixes to avoid thrashing between high and low water marks
and uncontrolled growth.

The key fix is from Dan Carasone, who noticed that buf_canfree() was
counting in _bytes_ but freeing in _buffers_, which caused the instant
drop to lowater observed by some users.

We now control the rate of growth; the probability of getting a new
allocation is inversely proportional to the current size of the
cache.  This idea is from a long-ago conversation with Kirk McKusick
and, if memory serves, was used for the file-system cache in some
other BSD variant at some point in history.

With growth and shrinkage more or less dealt with, we return the
default maximum cache size to 15%.  The default _minimum_ cache size
is raised from 1/16 of the maximum cache size to 1/8, since 1/16 was
chosen when the maximum size was 30% of memory.

Finally, after observing the behaviour of the pagedaemon and the
buffer cache drainer under pathological workloads (e.g. a benchmark
that steps through 75% of available memory backwards) I have moved
the call to buf_drain() to the beginning of the pagedaemon from the
end; if the pagedaemon bogs down, it still won't get run as often
as it should, but at least this way it will see the state of the
free count and free target _before_ the scan step does its thing.
2004-01-30 11:32:16 +00:00
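The growth control described in aeaf748ff2 can be sketched roughly as follows. This is a minimal illustration, not the committed code: the function name, the parameters, and the 16-step linear ramp between a low and a high water mark are all assumptions chosen to show one way a larger cache can be made less likely to get a new allocation. The random draw is passed in by the caller so the behaviour is deterministic.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch: decide whether a cache of `cur` bytes may grow.
 * Below `lowater` it always may; at or above `hiwater` it never may.
 * In between, the chance of growing falls in 16 steps as `cur` rises.
 * `draw` is a caller-supplied random number; only its low 4 bits are used.
 */
bool
cache_may_grow(size_t cur, size_t lowater, size_t hiwater, unsigned draw)
{
	size_t step, thresh;

	if (cur < lowater)
		return true;
	if (cur >= hiwater)
		return false;

	/* Width of one of the 16 steps; refuse growth if the range is tiny. */
	step = (hiwater - lowater) / 16;
	if (step == 0)
		return false;

	/* 0 when just above lowater, approaching 15 or more near hiwater. */
	thresh = (cur - lowater) / step;

	return (size_t)(draw & 0xf) >= thresh;
}
```

With lowater 100 and hiwater 1700, a cache of 900 bytes sits on step 8, so a draw of 8 or more allows growth and a draw of 7 or less refuses it.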
tsarna 72489e1ea0 uuidgen(2) syscall. Originally from FreeBSD, ported by John Franklin in
PR#23470, with minor updates by me. This is only the syscall support
from that PR, for now.

Changes: port over fix from FreeBSD for multicast address generation.
Changed bcopy to memcpy.  For now, #ifdef notyet the portions of
kern_uuid.c that are meant to be used by (currently nonexistent) other
things in the kernel.  Added syscall to COMPAT_FREEBSD as well, though
that's currently not useful, as any program new enough to use this call
also uses other syscalls we don't (yet) emulate.
2004-01-29 02:00:02 +00:00
dan c6ba3edf9d Reduce the default BUFCACHE to 10% for now. Too many users are
tripping over this getting too large, and suffering other performance
problems due to the lack of good backpressure shrinking the bufcache
when other memory is required.  Again, this tunable should be
revisited when the backpressure mechanism has been improved.

sysctl vm.bufcache can be used to manually tune those rare machines
that might need more than this.

See comments in rev 1.106 for more detail.
2004-01-27 11:35:23 +00:00
hannken 3db4e2acd8 Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.
VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp)  Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp)      Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
2004-01-25 18:06:48 +00:00
hannken d7f6cbf8bc Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern. 2004-01-25 18:02:04 +00:00
hannken b1cb363c11 Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.
VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp)  Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp)      Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
2004-01-25 18:02:03 +00:00
wiz f62661e104 Add semicolons after variable declarations; closes PR 24201. 2004-01-24 01:40:57 +00:00
simonb 2763a4b916 Fix NTP PPSAPI support (enabled with "options PPS_SYNC"):
From PR kern/13702 from Charles Carvalho.  Tested on alpha and
i386 with a Laipac TF10 PPS-capable GPS.  The com.c change was
copied wholesale from Charles' z8530tty.c patch.
2004-01-23 05:01:19 +00:00
atatat 4fe5b245f9 Fix the kern.mbuf tunables. 2004-01-21 02:11:20 +00:00
yamt ce0a402d3c bufpool_page_alloc: for no-wait allocations, specify UVM_KMF_TRYLOCK as well. 2004-01-19 11:57:42 +00:00
atatat 47768f08f2 In sysctl_locate(), use "rnode" like everywhere else, don't call it
"rv".

In sysctl_destroyv(), deal with deleting alias nodes, and pass a token
size_t to sysctl_destroy().

In sysctl_free(), check that "node" has not reached "rnode", not that
"pnode" has.

In sysctl_realloc(), don't bother setting sysctl_clen...the value is
unchanged.
2004-01-17 04:01:14 +00:00
atatat 5d3b89e2f4 Avoid dereferencing l...it might be NULL 2004-01-17 03:33:24 +00:00
yamt 047fc0b378 - fix locking order problem. (pa_slock -> pr_slock)
- protect pr_phtree with pr_slock.
- add some LOCK_ASSERTs.
2004-01-16 12:47:37 +00:00
mrg 23884e8622 clean up a little:
- delete ktrsyscall32()
- add a check #ifdef _LP64 to do the conversion if P_32 is set to the
standard ktrsyscall()
- add a couple of similar _LP64/P_32 checks to the systrace code.

this should get systrace working for 32 bit apps as well as complete
ktrace support for "trace_enter/trace_exit" using platforms such as amd64.

XXX: systrace isn't supported on sparc64 currently... (it doesn't use
trace_enter/trace_exit, or have its own calls to systrace_xxx()...)
2004-01-16 05:03:02 +00:00
mrg 4c2a3c644a export ktrinitheader() and ktrwrite() for ktrsyscall32(), which is used
to write 32 bit syscall arguments in a 64 bit format.
2004-01-15 14:29:20 +00:00
enami 9e2ac76ac4 Obviously, sizeof(u_int) is not enough to copy struct buf.
Prevents ``sysctl -a'' from dumping core.
2004-01-15 09:03:26 +00:00
yamt 7f20b0c529 bump vnode hold count for page cache as well
to resolve unfairness between page cache and traditional buffer cache.
pointed out by enami tsugutomo on current-users@.
2004-01-14 11:28:04 +00:00
jdolecek 475a5858bf g/c process state SDEAD - it's not used anymore after 'reaper' removal 2004-01-11 19:39:48 +00:00
jdolecek a1090edbd2 fix assertion - non-alive processes are in SZOMB state now
fixes PR kern/24033 by Martin Husemann
2004-01-11 18:51:15 +00:00
hannken ed68c4e34c Allow vfs_write_suspend() to wait if the file system is already
suspending.

Move vfs_write_suspend() and vfs_write_resume() from kern/vfs_vnops.c
to kern/vfs_subr.c.

Change vnode write gating in ufs/ffs/ffs_softdep.c (from FreeBSD).

When vnodes are throttled in softdep_trackbufs() check for
file system suspension every 10 msecs to avoid a deadlock.
2004-01-10 17:16:38 +00:00
yamt a3b2d1879c add a new bufq strategy, BUFQ_PRIOCSCAN (per-priority CSCAN).
discussed on tech-kern@
2004-01-10 14:49:44 +00:00
yamt 8c55727694 reset i/o priority in geteblk() as well. 2004-01-10 14:43:05 +00:00
yamt 7266a95907 store an i/o priority hint in struct buf for buffer queue discipline. 2004-01-10 14:39:50 +00:00
thorpej 4aeba6790d Initialize buffer pools with PR_IMMEDRELEASE. Don't use pool_reclaim()
on those pools; it is no longer necessary.
2004-01-09 19:01:01 +00:00
thorpej 7f125220f4 Add a new pool initialization flag, PR_IMMEDRELEASE. This flag causes
idle pool pages to be returned to the system immediately upon becoming
de-fragmented.

Also, in pool_do_put(), don't free back an idle page unless we are over
our minimum page claim.
2004-01-09 19:00:16 +00:00
tls e4758a97ae Change BUFCACHE (default hard limit on physmem consumption by metadata
cache) from 30% to 20%.  This seems to significantly smooth the oscillation
between "almost no memory available" and "UVM free target available" caused
by the current sudden, heavy backpressure on the metadata cache.  We should
revisit this again once the backpressure mechanism is better tuned; ideally,
the hard limit should almost never come into play, because the metadata
cache should gradually give back pages as buffers hit the AGE list and as
the page cache demands them, rather than giving back a big slug of pages
all at once when UVM decides it's in a hurry and fires off the page daemon.

Just how well this adjustment works is likely to vary significantly from
machine to machine depending on I/O mix, filesystem frag size, and total
memory.  However, 20% seems to be quite a bit better than 30% on several
systems I've tested and is, coincidentally, more than enough to cache
the entire metadata working set of the AnonCVS server with 100 clients,
which is a useful worst-case stake in the ground...
2004-01-09 06:26:15 +00:00
tls 0d6723b09f Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels).  This
brings the hit rate on my machines from below 70% to above 90%.  We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes.  Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
2004-01-09 00:04:53 +00:00
tls 28364b01be Add pool_reclaim() on pool to which we just pool_put() a buffer in
buf_mrelease().  Without this, though the pages are returned to the
relevant *pool*, they are never available for any other use in the
system.

Now the backpressure on the physical size of the buffer cache through
the buf_drain() call in the pagedaemon works correctly.  If anything,
it may be a bit more aggressive than intended.  On my 256MB system,
with vm.bufcache set to the default 30% of physmem, a kernel with this
fix can do 5 simultaneous config/makedep/builds of different NetBSD
kernels in 1313 seconds; with the "traditional" buffer cache code it
requires 1320 seconds.  Running "find / -type d -exec ls -l {}" while
the build is going demonstrates that the backpressure is working
correctly: free memory oscillates slowly between close to none and
the UVM target free, and vmstat -m shows a large number of releases
for the buffer pools.

For future work: how is "bufpl" memory returned to the system?  This
is not obvious to me (I must be looking in the wrong place).  Also,
buf_mrelease() is also called from brelse() in some cases.  Would it
be better to add a pool flag causing automatic release of full pages
as they become available (not fragmented)?  Jason Thorpe proposed this
and it seems more elegant than cleaning the _entire_ pool only upon
memory pressure.

Greg Oster did a lot of the work of figuring this out.  Jason proposed
the use of pool_reclaim as a way to fix it.
2004-01-08 23:41:14 +00:00
cube 3bf5e4c13b If ksyms have not been initialized, return ENXIO in ksymsopen instead of
ksymsread, because ksyms clients test availability with open() and not
read().
2004-01-08 22:48:26 +00:00
thorpej d76fa360ef Back out >2 PT_LOAD changes from rev 1.96. They cause older GCC3-compiled
PowerPC binaries to fail.  The compiler has since been fixed, but
compatibility with older binaries needs to be maintained.

PR kern/23758.
2004-01-07 16:42:53 +00:00
jdolecek 26767eb2ae fix F_MAXFD fcntl - it returned the value as errno instead
of return value from the syscall
from mouss <usebsd at free dot fr>
2004-01-07 09:26:29 +00:00
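The class of bug fixed in 26767eb2ae can be illustrated with a small sketch. It assumes the usual BSD kernel convention that a syscall handler returns an errno value (0 on success) and hands its result back through a separate retval pointer. The function names and the lastfile parameter are hypothetical and are not taken from the committed code.

```c
/*
 * Hypothetical handlers following the convention "return value is an
 * errno, result goes through *retval".
 */

/* Buggy shape: the descriptor number is returned where an errno belongs. */
int
demo_maxfd_buggy(int lastfile, int *retval)
{
	(void)retval;		/* result is never stored */
	return lastfile;	/* caller would treat this as an error code */
}

/* Fixed shape: store the result, report success. */
int
demo_maxfd_fixed(int lastfile, int *retval)
{
	*retval = lastfile;
	return 0;
}
```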
atatat 5efc584023 Expose the buf_map symbol so that pmap(1) can find it.
Split the sysctl setup routine into two routines, one for each
"subtree".  Perhaps it's a little pedantic, but it's cleaner.  Also,
assert that the "kern" and "vm" nodes exist.
2004-01-06 13:51:09 +00:00
lukem 7bb9d6c875 Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate  const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.
2004-01-05 03:33:06 +00:00
christos b76a454b90 Add F_CLOSEM, F_MAXFD from Matt Thomas. 2004-01-05 00:36:49 +00:00
pk 90cc172b86 bufpool_page_free: pass `buf_map' to uvm_km_free(). 2004-01-04 16:17:13 +00:00
kleink 1b16e3f0a3 ; may be a comment character in assembly, use \n as a separator instead. 2004-01-04 13:27:53 +00:00
jdolecek 089abdad44 Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
  as FPU state), and is the last potentially blocking operation;
  all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
  by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
  for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread
2004-01-04 11:33:29 +00:00
jdolecek 6ea538748e constify a bit 2004-01-03 20:10:01 +00:00
jdolecek 1d86b1f39f fix some comments, use NULL instead of 0 for pointer comparison 2004-01-03 19:43:55 +00:00
cl d5645dec8e regen 2004-01-02 18:53:45 +00:00
cl e7045955c7 kernel part of no-syscall upcall stack return: libpthread registers
an offset between ss_sp and struct sa_stackinfo_t (located in struct
__pthread_st) when calling sa_register.  The kernel increments the
sast_gen counter in struct sastack when an upcall stack is used.
libpthread increments the sasi_stackgen counter in struct
sa_stackinfo_t when an upcall stack is freed.  The kernel compares the
two counters to decide if a stack is free or in use.

- add struct sa_stackinfo_t with sasi_stackgen to count stack use in
  userland
- add sast_gen to struct sastack to count stack use in kernel
- add SA_FLAG_STACKINFO to enable the stackinfo_offset argument in the
  sa_register syscall
- add sa_stackinfo_offset to struct sadata for offset between ss_sp
  and struct sa_stackinfo_t
- add ssize_t stackinfo_offset argument to sa_register, initialize
  struct sadata's sa_stackinfo_offset from it if SA_FLAG_STACKINFO is
  set
- add sa_getstack, sa_getstack0, sa_stackused and sa_setstackfree
  functions to find/use/free upcall stacks and use these where
  appropriate
- don't record stack for upcall in sa_upcall0
- pass sau to sa_switchcall instead of l2 (l2 = curlwp in sa_switchcall)
- add sa_vp_blocker to struct sadata to pass recently blocked lwp to
  sa_switchcall
- delay finding a stack for blocked upcalls to sa_switchcall
- add sa_stacknext to struct sadata pointing to next most likely free
  upcall stack; also g/c sa_stackslist in struct sadata and sast_list
  in struct sastack
- add L_SA_WOKEN flag: LWP is on sa_woken queue
- add L_SA_RECYCLE flag: LWP should be recycled in sa_setwoken
- replace l_upcallstack with L_SA_WOKEN/L_SA_RECYCLE/L_SA_BLOCKING
  flags
- g/c now unused sast_blocker in struct sastack
- make sa_switchcall, sa_upcall0 and sa_upcall_getstate static in
  kern_sa.c
- call sa_upcall_userret only once in userret
- split sa_makeupcalls out of sa_upcall_userret and use to process
  the sa_upcalls queue
- on process exit: mark LWPs sleeping in saunblock interruptible; also
  there are no LWPs sleeping on l->l_upcallstack anymore; also clear
  sa_wokenq_head to prevent unblocked upcalls

additional changes:
- cleanup timerupcall sa_vp == curlwp check
- add check in sa_yield if we didn't block on our way here and we
  wouldn't any longer be the LWP on the VP
- invalidate sa_vp_ofaultaddr after resolving pagefault
2004-01-02 18:52:17 +00:00
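The two-counter scheme described in e7045955c7 can be sketched as below. This is a simplified, single-address-space illustration with hypothetical names: kern_gen stands in for sast_gen and user_gen for sasi_stackgen. It assumes a stack counts as free exactly when the two counters are equal, and it leaves out the copyin of the userland counter that a real kernel would need.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one upcall stack's bookkeeping. */
struct demo_stack {
	unsigned kern_gen;	/* bumped by the "kernel" when the stack is used */
	unsigned user_gen;	/* bumped by "userland" when the stack is freed */
};

/* A stack is in use while the kernel side is ahead of the user side. */
bool
demo_stack_used(const struct demo_stack *s)
{
	return s->kern_gen != s->user_gen;
}

/* Kernel side: claim the stack if it is free; returns false if busy. */
bool
demo_stack_take(struct demo_stack *s)
{
	if (demo_stack_used(s))
		return false;
	s->kern_gen++;
	return true;
}

/* Userland side: hand the stack back without entering the kernel. */
void
demo_stack_release(struct demo_stack *s)
{
	s->user_gen++;
}
```

The point of the design, as the commit message puts it, is that freeing a stack needs no syscall: userland only increments its own counter, and the kernel notices on its next comparison.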
mycroft a9866938b5 Welcome to 2004! 2004-01-01 00:00:05 +00:00
pk dc6d5d0dd1 getnewbuf: return buffer locked. 2003-12-31 14:37:17 +00:00
thorpej 7e958083b1 Consistently use ANSI-style function decls. 2003-12-30 20:40:39 +00:00
thorpej 6a833751e0 Remove allocsys(); nothing uses it anymore. 2003-12-30 18:29:43 +00:00
pk 70f20a1217 Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes).  It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms.  Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.
2003-12-30 12:33:13 +00:00
martin 44b17951f2 Avoid using m_clget() on a mbuf already in use, especially when we
need the data in the mbuf later and m_clget() changes some fields
overlaid to regular mbuf data. Instead, rearrange code a bit, create
data into a newly allocated buffer and use MEXTADD to attach it to
the mbuf, if the mbuf internal space is not sufficient.

This fixes a crash on sparc64 (and probably all other archs where
sizeof(int) != sizeof(struct file *)) when running
regress/sys/kern/unfdpass.

Idea for solution from Matt Thomas, with additional input from YAMAMOTO
Takashi.
2003-12-29 22:08:02 +00:00
yamt 192843ffc2 pool_prime_page: initialize ph_time to mono_time instead of zero
as it's a mono_time relative value.
2003-12-29 16:04:58 +00:00
atatat 1cab3635c2 Avoid dereferencing l in sysctl_lookup(), because it can be NULL.
Note one point where a possibility of a fault exists.
2003-12-29 04:19:28 +00:00
atatat 74dad84b6e Remove two uses of uvm_kernacc(), which wasn't quite getting the job
done anyway.  On a related change, use kcopy() instead of memcpy() for
kernel-to-kernel copying so that the same service warranty can be
given.
2003-12-29 04:16:25 +00:00
atatat b1c111a62a Sysctl functions called for "generic" nodes should forward "query"
requests (where possible), rather than returning errors.
2003-12-28 22:36:37 +00:00
atatat 0f7550bbf8 Adjust error returns in kern.cp_time when a specific processor is
being requested so that (1) the uniprocessor case and the
multiprocessor case are more similar and (2) so that we return ENOENT
when a non-existent processor is requested (which is both more
sensible and follows the general order of things anyway).
2003-12-28 22:24:12 +00:00
atatat c703d9821f Rename sysctl_kern_hostname() to sysctl_setlen() and use it also for
domainname.  Note that there's no need to copy rnode since we're not
changing any of it, nor protecting anything from change.

Thanks to martin for initial work.
2003-12-28 22:19:59 +00:00
atatat 8e0c1f1594 RCSid police 2003-12-28 22:12:00 +00:00
martin c22fd25c47 After changing hostname, adjust hostnamelen.
This closes PR kern/23907.
2003-12-28 14:39:36 +00:00
martin be59b63fe2 Make kern.rtc_offset writable at securelevel <= 0.
This allows boot-time adjustment when a machine runs other OSes with
RTC == localtime.
2003-12-26 23:49:39 +00:00
manu ffb3de5522 Move the sigfilter hook to a more adequate location, and rename it to better
fit what it does.

The softsignal feature is used in Darwin to trace processes. When the
traced process gets a signal, this raises an exception. The debugger will
receive the exception message, use ptrace with PT_THUPDATE to pass the
signal to the child or discard it, and then it will send a reply to the
exception message, to resume the child.

With the hook at the beginning of kpsignal2, we are in the context of the
signal sender, which can be the kill(1) command, for instance. We cannot
afford to sleep until the debugger tells us if the signal should be
delivered or not.

Therefore, the hook to generate the Mach exception must be in the traced
process context. That way we can sleep awaiting the debugger's opinion
about the signal without any problem. The hook is hence located in
issignal, at the place where SIGCHLD is normally sent to the debugger
while the traced process is stopped. If the hook returns 0, we bypass
those operations: the Mach exception mechanism will take care of notifying
the debugger (through a Mach exception) and stopping the faulting thread.
2003-12-24 22:53:59 +00:00
manu 54db0e51ad Split sys_lwp_suspend, just like sys_lwp_unsuspend is split. We get
sys_lwp_suspend, with the sanity checks, and lwp_suspend, with the
actual implementation.
2003-12-24 22:42:11 +00:00
simonb 16846040d7 Remove trailing blank line. 2003-12-21 11:54:16 +00:00
fvdl d99705e941 Put back Emmanuel's sigfilter hooks, as decided by Core. 2003-12-20 19:01:29 +00:00
manu b23b73b953 Introduce lwp_emuldata and the associated hooks. No hook is provided for the
exec case, as the emulation already has the ability to intercept that
with the e_proc_exec hook. It is the responsibility of the emulation to
take appropriate action about lwp_emuldata in e_proc_exec.

Patch reviewed by Christos.
2003-12-20 18:22:16 +00:00
yamt 8b9614a490 update a comment to match with the previous change (rev.1.12). 2003-12-20 07:33:03 +00:00
yamt 4dd4230680 restore functionality to decrease kern.maxvnodes which
has been backed out during sysctl rework.
2003-12-20 07:26:27 +00:00
dsl 97356a5fa8 Defer writing of KTR_EMUL entry until first trace done by target process.
Stops ktrops sleeping with the pid table locked.
2003-12-14 22:56:45 +00:00
simonb 701a167dd3 In sysctl_kern_lwp adjust offsets into the mib entries so that
they are now correct.  Fixes problems with "ps -s" not working.
Also use KERN_LWPSLOP instead of KERN_PROCSLOP.

Both changes from Andrew Brown.
2003-12-12 23:21:44 +00:00
atatat e3796202c5 Make kern.dump_on_panic writeable again, too 2003-12-10 14:16:12 +00:00
agc 7db1d33cba Modify the licences of code written by Theo de Raadt from a 4-clause
to a 2-clause licence (retaining UCB clauses (1) and (2)), per PR
22409 from Joel Baker, approved by Theo de Raadt, and ratified by
myself - the only discrepancy being the handling of the original
clause 3 in src/usr.sbin/yppoll/yppoll.c.
2003-12-10 12:06:25 +00:00
hannken fbae381aaa The file system snapshot pseudo driver.
Uses a hook in spec_strategy() to save data written from a mounted
file system to its block device and a hook in dounmount().

Not enabled by default in any kernel config.

Approved by: Frank van der Linden <fvdl@netbsd.org>
2003-12-10 11:40:11 +00:00
atatat 38f213672c Make kern.sbmax writeable again as well.
From a follow-on to PR kern/23695 by a Mr. Davis, which I missed at a
quick glance.
2003-12-09 01:52:07 +00:00
atatat a5d6d5ebfd Make kern.logsigexit writeable again.
Fixes PR kern/23695.
2003-12-09 01:25:33 +00:00
hannken 37efcf9045 Fix the last commit(s). On machines with sizeof(long) != sizeof(int)
the hash compare would fail.
2003-12-08 14:23:33 +00:00
hannken 10654a5c0a Fix last commit. The current spl was an implicit argument to the ACQUIRE
macro.  With help and approval from YAMAMOTO Takashi <yamt@netbsd.org>
2003-12-08 14:21:25 +00:00