Commit Graph

5544 Commits

Author SHA1 Message Date
dsl e4a2671dea Put the RCSID before any other headers 2007-09-16 15:17:36 +00:00
skrll 9fdaf800d9 Merge nick-csl-alignment. 2007-09-10 11:34:05 +00:00
rmind 4492a08ad7 Regen syscalls. 2007-09-07 18:58:46 +00:00
rmind 2cecf9bbe9 Implementation of POSIX message queues.
Reviewed by: <ad>, <tech-kern>
2007-09-07 18:56:02 +00:00
ad 513227e941 - Fix sleepq_block() to return EINTR if the LWP is cancelled. Pointed out
  by yamt@.

- Introduce SOBJ_SLEEPQ_LIFO, and use for LWPs sleeping via _lwp_park.
  libpthread enqueues most waiters in LIFO order to try and wake LWPs that
  ran recently, since their working set is more likely to be in cache.
  Matching the order of insertion reduces the time spent searching queues
  in the kernel.

- Do not boost the priority of LWPs sleeping in _lwp_park, just let them
  sleep at their user priority level. LWPs waiting for some I/O event in
  the kernel still wait with kernel priority and get woken more quickly.
  This needs more evaluation and is to be revisited, but the effect on a
  variety of benchmarks is positive.

- When waking LWPs, do not send an IPI to remote CPUs or arrange for the
  current LWP to be preempted unless (a) the thread being awoken has kernel
  priority and has higher priority than the currently running thread or (b)
  the remote CPU is idle.
2007-09-06 23:58:56 +00:00
rmind 94fb9a4b80 Fix various possible dereferences via uvmspace_free() of non-initialized *vm.
Also, error case might happen before proc_vmspace_getref() (hi <ad>!).
Thanks CID 4551 and 4552. This is serious, pullup will be requested.

OK by <wrstuden>.
2007-09-06 04:00:44 +00:00
rmind 7b2bfeb941 uid_find: Destroy mutex before free.
From CID: 4555
2007-09-06 02:03:06 +00:00
rmind 93f0cb5cdf do_sys_sendmsg: Plug a possible leak.
From CID: 4535
2007-09-06 01:21:00 +00:00
xtraeme c371d1d093 Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.
2007-09-02 00:41:24 +00:00
pooka 3f3cac88a3 Make bioops a pointer and point it to the softdeps struct in softdep
init.  Decouples "options SOFTDEP" from the main kernel and ffs code.
2007-09-01 23:40:21 +00:00
dsl 7a90b5e6bc Don't fail calls that copy socket addresses to userspace when the
application has provided a non-null buffer pointer and a zero length.
2007-09-01 17:04:58 +00:00
dyoung b3870c371a In sockaddr_copy(), stop caring about the destination sockaddr's
family and length, it doesn't matter in the post-pool(9) sockaddr
regime.
2007-09-01 06:50:44 +00:00
yamt 5ea51f80da pull the following change from vmlocking branch.
revision 1.7.2.10
	date: 2007/08/27 12:51:13;  author: yamt;  state: Exp;  lines: +6 -7
	sleepq_block: don't call lwp_unsleep twice.
	(fix an assertion failure in lwp_unsleep.)
2007-08-31 15:27:18 +00:00
dyoung b3fc296326 Use malloc(9) for sockaddrs instead of pool(9), and remove dom_sa_pool
and dom_sa_len members from struct domain.  Pools of fixed-size
objects are too rigid for sockaddr_dls, whose size can vary over
a wide range.

Return sockaddr_dl to its "historical" size.  Now that I'm using
malloc(9) instead of pool(9) to allocate sockaddr_dl, I can create
a sockaddr_dl of any size in the kernel, so expanding sockaddr_dl
is useless.

Avoid using sizeof(struct sockaddr_dl) in the kernel.

Introduce sockaddr_dl_alloc() for allocating & initializing an
arbitrary sockaddr_dl on the heap.

Add an argument, the sockaddr length, to sockaddr_alloc(),
sockaddr_copy(), and sockaddr_dl_setaddr().

Constify: LLADDR() -> CLLADDR().

Where the kernel overwrites LLADDR(), use sockaddr_dl_setaddr(),
instead.  Used properly, sockaddr_dl_setaddr() will not overrun
the end of the sockaddr.
2007-08-30 02:17:34 +00:00
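The overrun hazard this commit addresses can be illustrated with a bounds-checked setter. The sketch below is not the kernel's sockaddr_dl_setaddr(); `struct toy_sdl`, `toy_sdl_setaddr()` and the field names are hypothetical, chosen only to show the caller passing the allocation length so the copy can be refused when it would not fit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a variable-length link-layer sockaddr. */
struct toy_sdl {
	uint8_t len;      /* total bytes allocated for this object */
	uint8_t alen;     /* bytes of link-layer address stored in data[] */
	uint8_t data[];   /* address bytes follow the fixed header */
};

/*
 * Copy addrlen bytes into sdl->data only if they fit inside the
 * socklen bytes the caller says were allocated.  Returns sdl on
 * success and NULL if the address would run past the end.
 */
static struct toy_sdl *
toy_sdl_setaddr(struct toy_sdl *sdl, size_t socklen,
    const void *addr, uint8_t addrlen)
{
	if (socklen < offsetof(struct toy_sdl, data) ||
	    addrlen > socklen - offsetof(struct toy_sdl, data))
		return NULL;
	memcpy(sdl->data, addr, addrlen);
	sdl->alen = addrlen;
	return sdl;
}
```

A bare memcpy() into the data area has no way to perform this check, which is presumably why the message steers callers toward a setter that takes the length.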
pooka 6ae9cab127 In quotactl, move vrele() to after the VFS call: protects the
mountpoint from being wiped under us better.

from David Holland
2007-08-28 09:28:10 +00:00
dsl c232133678 ktrace socket control structures (ie msghdr, address etc) using ktrkuser(). 2007-08-27 20:09:44 +00:00
dsl 22c0ab6d47 Only ktrace the part of the buffer actually read/written. 2007-08-27 16:23:16 +00:00
dsl 31c3c56394 Fix inverted test in ktrpoint(), NAMI traces weren't being generated.
Also inline the 'ktrace_on' part of the test.
2007-08-27 13:33:45 +00:00
dyoung 5204966a96 Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.
2007-08-26 22:59:08 +00:00
ad 6b79143ab0 pool_drain: add a comment. 2007-08-18 00:37:14 +00:00
ad e492e656cc pool_do_cache_invalidate_grouplist: drop locks while calling the destructor.
XXX Expensive - to be revisited.
2007-08-18 00:33:38 +00:00
ad b5866eb299 Make the uarea cache per-CPU and drain in batches of 4. 2007-08-18 00:21:10 +00:00
ad c3ffc50d7a Remove obsolete comments. 2007-08-18 00:11:00 +00:00
ad 215f99bf1f Remove obsolete comment. 2007-08-17 23:46:34 +00:00
skd 617b9b58ef Don't put the condvars in the *middle* of the undo structures.
(semu + seminfo.semmnu) is wrong, because the type of semu is int*.
You could fix the offset ((char *)semu + seminfo.semusz), but simply
putting the condvars first is more clear.
2007-08-17 23:05:06 +00:00
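The pointer-arithmetic pitfall in this message can be shown in isolation. In the sketch below the helper names are hypothetical and nothing is taken from the SysV semaphore code; it only demonstrates that adding n to an `int *` advances n * sizeof(int) bytes, while adding n to a `char *` advances n bytes.

```c
#include <stddef.h>

/* Bytes advanced when n is added to an int pointer. */
static size_t
int_ptr_step_bytes(int *base, size_t n)
{
	return (size_t)((char *)(base + n) - (char *)base);
}

/* Bytes advanced when n is added to the same address viewed as char *. */
static size_t
char_ptr_step_bytes(int *base, size_t n)
{
	return (size_t)(((char *)base + n) - (char *)base);
}
```

An offset meant as a byte count therefore lands in the wrong place unless the base is cast to `char *` first.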
ad 26c3495f7d Timecounters are lockless. Add conservative memory barriers to ensure that
loads and stores occur in the correct order.
2007-08-17 21:20:24 +00:00
ad c0fd052388 Fix a couple of comments. 2007-08-17 17:25:14 +00:00
ad 399122feeb subr_prf_bitmask.c -> subr_prf2.c 2007-08-15 20:34:48 +00:00
ad e59f9f3e20 proc_free: don't destroy locks until the last LWP is confirmed off the CPU.
This is an ideal candidate for pool_cache.
2007-08-15 12:20:28 +00:00
ad d3675885a8 Regen. 2007-08-15 12:09:12 +00:00
ad df7945cf28 - Update for ktrace changes.
- Mark a few more syscalls MPSAFE.
2007-08-15 12:08:38 +00:00
ad 63c4506184 Changes to make ktrace LKM friendly and reduce ifdef KTRACE. Proposed
on tech-kern.
2007-08-15 12:07:23 +00:00
rmind d2142b3188 sys__lwp_suspend: Handle the possible problem when target LWP might exit via
lwp_exit() before suspending.  In such case, LWP might be already freed after
cv_wait_sig() and checking the list of LWPs via lwp_find() is necessary.

Possible problem caught by Andrew Doran.
2007-08-15 02:50:40 +00:00
pooka 3340382aec more vfs_subr -> vfs_subr2 dance for rump:
vwakeup, vinvalbuf, vtruncbuf, vflushbuf, bgetvp, brelvp, reassignbuf
2007-08-14 13:51:31 +00:00
pooka e7c5957392 Revert code part of rev 1.95, yamt pointed out it changes NFS semantics. 2007-08-12 23:40:40 +00:00
pooka 67c57c75e0 CREATE is a write operation in my book, so check for that also when
checking for a readonly lookup.  This shouldn't make a difference
now, though, as the only RDONLY lookup is done by getcwd(), and
that a) doesn't create files b) calls LOOKUP directly anyway.

Also, fix comment I managed to miss in the previous commit (I didn't
expect the same comment to be there twice).
2007-08-12 19:42:09 +00:00
pooka 5450d0d87b cn_flags RDONLY brilliantly has nothing to do with the file system
itself being r/o, so fix a couple of misguided comments.
2007-08-12 19:31:12 +00:00
pooka 5a92d448e1 POOL_INIT -> pool_init, we need to call bufinit() anyway 2007-08-11 19:56:53 +00:00
dyoung 596a16c16b Don't run ctags(1) on sys/altq/altq.h, it redefines useful NetBSD
tag targets.
2007-08-10 22:50:12 +00:00
dyoung ffbbd3ac2d Fix kernel compilation with 'options KSTACK_CHECK_MAGIC': change
'void *' to 'char *' so pointer arithmetic will work.
2007-08-10 21:50:48 +00:00
pooka b4b6be82da do the shuffle: move mount_specificdata stuff from vfs_subr to vfs_subr2 2007-08-09 20:55:30 +00:00
he fd961c4429 Add a new socket option for unix domain sockets: LOCAL_PEEREID, to make
it possible to get the pid, euid and egid of the process at the remote
end at the time it did bind() or connect().

Add a new libc function, getpeereid() to easily get at the euid and egid.
As a consequence, bump libc's minor number.

Document the LOCAL_PEEREID socket option in unix(4).

Based on contribution by Arne H. Juul, minor modifications by myself.
2007-08-09 15:23:01 +00:00
pooka 3de9f5d391 Instead of having lfs muck directly about with vnode free lists,
introduce vrele2(), which allows releasing vnodes the way lfs
sometimes wants it:
  + without calling inactive
  + inserting the vnode at the head of the freelist (this is a very
    questionable optimization that isn't even enabled by default,
    but I went along with the same semantics for now)
2007-08-09 08:51:21 +00:00
pooka 5fbd525b19 Shuffle routines which just roll values around from kern_clock.c
and kern_time.c to subr_time.c.
2007-08-09 07:36:18 +00:00
ad 41368c8e7e Grab locks in getrusage/getrlimit. 2007-08-08 14:07:11 +00:00
ad 06f7ccf01d Regen. 2007-08-07 19:01:23 +00:00
ad 830ab6bb3c - Fix a bug with _lwp_park() where if the computed wakeup time was under
  1 microsecond into the future, the thread could enter an untimed sleep.
- Change the signature of _lwp_park() to accept an lwpid_t and second
  hint pointer, but do so in a way that remains compatible with older
  pthread libraries. This can be used to wake another thread before the
  calling thread goes to sleep, saving at least one syscall + involuntary
  context switch. This turns out to be a fairly large win on the condvar
  benchmarks that I have tried.
- Mark some more syscalls MP safe.
2007-08-07 19:00:42 +00:00
yamt 69aa06cd40 don't bother to set thread's priority by ourselves,
as kthread_create does it for us now.  from Andrew Doran.
2007-08-07 12:50:26 +00:00
ad eef90c7197 Regen. 2007-08-07 12:48:52 +00:00
ad b9d8ad095d wait() can't yet be MPSAFE since it's impractical to hold proclist_mutex
across exit(), and so there is a short race against cv_wait_sig(). This
can be reverted when proclist_mutex/proclist_lock merge.
2007-08-07 12:48:30 +00:00
ad 5005559992 Do cv_broadcast() on proc::p_waitcv to be on the safe side (the parent
could be multithreaded).
2007-08-07 12:45:54 +00:00
ad c1bc924601 No reason not to make itimespecfix() generally available.. 2007-08-07 11:43:35 +00:00
ad 4a8903393a Export itimespecfix() until itimerfix() dies. 2007-08-07 11:39:18 +00:00
yamt e3fe8e011e - don't assume the order of cpus in a CPU_INFO_FOREACH loop.
- remove unused structure members.
- simplify.
2007-08-07 10:42:22 +00:00
ad 23cf810fc7 Regen. 2007-08-07 09:46:39 +00:00
ad 9dab7d5077 gettimeofday() doesn't need locks, and MySQL seems to make heavy use of it. 2007-08-07 09:46:24 +00:00
dyoung 04d14f227e Lengthen sockaddr_dl so that a 16-byte FireWire address will fit
into sdl_data[].

Move the macro satocsdl() to net/if_dl.h, and introduce satosdl().

Add some helpers for initializing sockaddr_dl (sockaddr_dl_init),
for finding out the length to put in a sockaddr_dl's sdl_len member
(sockaddr_dl_measure), and for setting the link-layer address in
a sockaddr_dl to a new value (sockaddr_dl_setaddr).

Make sockaddr_copy() panic if the caller tries to copy a sockaddr
to a destination where it will not fit.
2007-08-07 04:06:20 +00:00
pooka 3fbf89c648 Initialize size of outside-of-fs device vnodes also, since they
can migrate to file systems due to checkalias() and cause KASSERT
panics.

fixes nfsroot panic reported by martin
2007-08-06 17:09:11 +00:00
yamt 261e7c1e79 remove a homegrown definition of CPU_INFO_FOREACH. 2007-08-06 11:51:46 +00:00
yamt 954298e279 suspendsched: reduce #ifdef. 2007-08-06 11:48:23 +00:00
yamt e42cf10955 sosetopt: clear SB_AUTOSIZE when setting buffer size explicitly. 2007-08-06 11:41:52 +00:00
ad 5b2aca96b9 Current convention is to name/number objects after ci->ci_cpuid, so do
that when creating the kthreads. We may want to change this.
2007-08-05 13:47:25 +00:00
rmind c8c024369c Improve per-CPU support for the workqueue(9):
- Make structures CPU-cache friendly, as suggested and explained
  by Andrew Doran.  CACHE_LINE_SIZE definition is invented.
- Use current CPU if NULL is passed to the workqueue_enqueue().
- Implemented MI CPU index, which could be used as an index of array.
  Removed linked-lists usage for work queues.

The roundup2() function avoids division, but works only with power of 2.

Reviewed by: <ad>, <yamt>, <tech-kern>
2007-08-05 01:19:17 +00:00
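The remark about roundup2() in this commit can be made concrete. The sketch below is an illustration under a hypothetical name, `roundup2_sketch()`, not a copy of the kernel macro; it shows how masking replaces division when the alignment is a power of two.

```c
#include <stddef.h>

/*
 * Round x up to the next multiple of m without dividing.
 * Valid only when m is a power of two, because the mask ~(m - 1)
 * clears exactly the low bits in that case.
 */
static size_t
roundup2_sketch(size_t x, size_t m)
{
	return (x + m - 1) & ~(m - 1);
}
```

With an alignment that is not a power of two the mask is no longer a contiguous run of high bits: roundup2_sketch(7, 6) yields 8 rather than 12, which is the restriction the message mentions.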
ad c3085c5fd6 A quick hack to get things building again. Don't refer to curlwp
if !MULTIPROCESSOR.
2007-08-04 11:57:54 +00:00
ad 18af8ee9bd Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.
2007-08-04 11:02:56 +00:00
ad e4f6da7b0c Mark the SysV semaphore syscalls MP safe. 2007-08-04 10:52:57 +00:00
martin d5d0a7225c PR kern/32842:
do not leak file descriptors when sending a datagram with SCM_RIGHTS
fails. Patch from Gary Thorpe, based on changes in FreeBSD and work
from Christian Biere.
2007-08-03 20:49:45 +00:00
ad a363c5f5b2 cv_wakeup: the entire queue has to be searched, as we can't know how many
waiters there are.
2007-08-02 22:01:40 +00:00
rmind 4175f8693b TCP socket buffers automatic sizing - ported from FreeBSD.
http://mail-index.netbsd.org/tech-net/2007/02/04/0006.html

! Disabled by default, marked as experimental. Testers are badly needed.
! Someone should thoroughly test this, and improve if possible.

Discussed on <tech-net>:
http://mail-index.netbsd.org/tech-net/2007/07/12/0002.html
Thanks Greg Troxel for comments.

OK by the long silence on <tech-net>.
2007-08-02 02:42:40 +00:00
rmind 00cdc8df70 sys__lwp_suspend: implement waiting for target LWP status changes (or
process exiting). Removes XXXLWP.

Reviewed by <ad> some time ago..
2007-08-02 01:48:44 +00:00
ad 45e2aff386 sleepq_block: if a pending signal is detected but has already been taken
by the time the calling thread tries to take it, don't return EINTR.
Instead return zero leading to a spurious wakeup.
2007-08-01 23:30:54 +00:00
ad d028f9dec2 KNF 2007-08-01 23:24:26 +00:00
ad 7fecd4ded9 callout_softclock: add a couple of assertions. 2007-08-01 23:23:41 +00:00
ad fe1b7cd1f7 Resurrect cv_wakeup() and use it on lbolt. Should fix PR kern/36714.
(background/foreground signal lossage in -current with various programs).
2007-08-01 23:21:14 +00:00
ad b5dd2da738 Improve assertions slightly. When awakening assert that the CV has not
been destroyed.
2007-08-01 20:30:38 +00:00
degroote 0a057fefdd Fix compilation in the POWERHOOK_DEBUG case 2007-08-01 19:50:24 +00:00
christos fbce1923d7 improve on powerhooks debugging. from Anon Ymous 2007-08-01 10:57:07 +00:00
pooka 8d1f899239 * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
  knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
  use VFS_PROTOS() instead of manually prototyping the methods
2007-07-31 21:14:15 +00:00
tnn 95342a9670 Fix previous; lwp status are states, not flags. 2007-07-31 09:56:31 +00:00
tnn 7a9d8e5613 proc_representative_lwp:
- Correct expression for checking if the lwp is running.
- Remove dead code. Ok'd by Andrew Doran.
2007-07-31 00:52:04 +00:00
ad 00348ca253 callout_barrier: drop kernel_lock before blocking. 2007-07-30 21:36:54 +00:00
pooka a06e97c8ef move setrootfstime() from init_main.c to vfs_subr2.c 2007-07-30 08:45:26 +00:00
pooka 59f0f4532f Split vfs_subr.c into routines which need much of the kernel
infrastructure (vfs_subr.c) and routines which need little or none
of the kernel infra (vfs_subr2.c).
2007-07-29 14:44:08 +00:00
ad 10b11b97b0 B_ERROR is gone. 2007-07-29 13:53:46 +00:00
pooka abf11c212d Define a new lockmgr flag LK_RESURRECT which can be used in
conjunction with LK_DRAIN.  This has the same effect as LK_DRAIN
except it atomically does NOT mark the lock as drained.  This
guarantees that when we got the lock, we were the last one currently
waiting for the lock.

Use LK_DRAIN|LK_RESURRECT in vclean() to make sure there are no
waiters for the lock.  This should fix behaviour theorized to be
caused by vfs_subr.c 1.289 which caused vclean() to run into
completion and free the vnode before all lock-waiters had been
processed.  Should therefore fix the "simple_lock: uninitialized lock"
problems seen recently.

thanks to Juergen Hannken-Illjes for some analysis of the problem
and Erik Bertelsen for testing
2007-07-29 12:40:37 +00:00
ad 66fefd117b It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
2007-07-29 12:15:35 +00:00
ad c52c14050e Be more forgiving if panicstr != NULL. 2007-07-29 11:45:21 +00:00
pooka dff8581037 Print also the topmost flag hex in vprint().
fun fact: this bug was introduced over 10 years ago, so I don't
think anyone has really keenly missed it.
2007-07-29 10:00:15 +00:00
pooka e97d926940 Move bitmask_snprintf() from subr_prf.c to subr_prf_bitmask.c to permit
standalone compilation.  No functional change.
2007-07-29 09:38:01 +00:00
pooka 0a0815b77e Move hashinit() & hashdone() from kern_subr.c to subr_hash.c to
permit standalone compilation.  No functional change.
2007-07-28 12:53:52 +00:00
pooka 33f2f6779a minor header cleanup 2007-07-28 08:19:36 +00:00
ad 46022e56e5 Update the blurb to match reality. 2007-07-28 00:12:26 +00:00
pooka af927546de Move vfs_attach(), vfs_detach() and vfs_reinit() from vfs_subr.c
to vfs_init.c.  This permits easier standalone compilation of these
routines.
2007-07-27 14:25:21 +00:00
pooka 90f58074b5 regen: VOP_MMAP fflags -> prot 2007-07-27 08:27:38 +00:00
pooka 1ce406a846 Change unused fflags parameter in VOP_MMAP to prot and pass in
desired vm protection.
2007-07-27 08:26:38 +00:00
pooka d9970c8066 Use eopnotsupp() instead of vfs_stdsuspendctl() and retire the latter. 2007-07-26 22:57:36 +00:00
pooka 4aa4a0d0ae regen: assert that vnode creating operations set the size 2007-07-22 21:27:49 +00:00
pooka f5ba107ccd Introduce WILLMAKE for vnode operations which create a new vnode.
Insert a KASSERT along the return path of such operations to check
that the operation set the vnode size.
2007-07-22 21:26:53 +00:00
pooka 05ce20f4a0 Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden
2007-07-22 19:16:04 +00:00
xtraeme 5623c9a1de Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.
2007-07-21 23:15:16 +00:00