Commit Graph

6464 Commits

Author SHA1 Message Date
ad 1cc9a3ae7e If converting a process/thread from SCHED_OTHER to a realtime thread,
ignore the existing priority. If no priority is specified, give threads
the minimum user RT priority.
2008-05-25 23:34:24 +00:00
ad 90035a10c9 sched_tick:
- Do timeslicing for SCHED_RR threads. At ~16Hz it's too slow but better
  than nothing. XXX

- If a SCHED_OTHER thread has hogged the CPU for 1/8s without taking a
  trip through mi_switch(), try to force a kernel preemption to give other
  threads a chance.
2008-05-25 22:04:50 +00:00
christos 6e0baf783e don't forget to fill in the emulation. 2008-05-25 20:18:33 +00:00
ad 5e4b324300 Properly fix the "hanging in tty" bug that was worked around with cv_wakeup()
some time ago.
2008-05-25 19:22:21 +00:00
jmcneill 1e2888bbbd Export device-driver and device-unit properties via drvctl 2008-05-25 15:03:01 +00:00
jmcneill 3a8a32076d Add DRVGETEVENT support for /dev/drvctl, based on devmon support by
Jachym Holecek for Google Summer of Code. DRVGETEVENT plist is currently
limited to event type, device name, and device parent name.
2008-05-25 12:30:40 +00:00
christos 934b677fde Coverity CID 5015: Remove unnecessary test; if l was null we would have
crashed before when p = l->l_proc.
2008-05-24 18:43:02 +00:00
christos 2847938186 Coverity CID 5019: Check before deref. 2008-05-24 16:49:30 +00:00
christos a2c63c0004 Coverity CID 5025: sbreserve is never called with a null socket. 2008-05-24 16:35:28 +00:00
ad 25866fbff7 Set cpu_onproc on entry to the idle loop. 2008-05-24 12:59:06 +00:00
njoly 12da67c77e Make msgsnd return EINVAL instead of 0, when the value of mtype is
less than 1.
2008-05-22 11:25:54 +00:00
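A minimal userland sketch of the behaviour this change enforces, using only the standard SysV message queue API; the message layout is illustrative.

    /* Sketch: msgsnd() must now fail with EINVAL when mtype < 1. */
    #include <sys/ipc.h>
    #include <sys/msg.h>
    #include <errno.h>
    #include <stdio.h>

    struct mymsg {                          /* illustrative message layout */
            long mtype;
            char mtext[16];
    };

    int
    main(void)
    {
            int id = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
            struct mymsg m = { .mtype = 0, .mtext = "x" };  /* invalid type */

            if (msgsnd(id, &m, sizeof(m.mtext), IPC_NOWAIT) == -1 &&
                errno == EINVAL)
                    printf("EINVAL, as expected\n");
            msgctl(id, IPC_RMID, NULL);
            return 0;
    }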
ad 697d5e2cd4 PR kern/38663 Kernel preemption can't be enabled on x86 because of amd64
FPU handling

Ugly hack until the amd64 fpu handling is working (which should be soon):
enable kernel preemption on i386.
2008-05-21 15:41:03 +00:00
ad d88444761b Ignore return from module_load() and just try vfsop lookup again. 2008-05-20 19:30:03 +00:00
ad ce7cbbfb63 Back out unintentional change. 2008-05-20 19:21:23 +00:00
ad 61270d54f1 If autoloading a module, don't consider the current working directory. 2008-05-20 19:20:38 +00:00
ad 88435c0e48 Remove stale comment. 2008-05-20 19:16:07 +00:00
ad a72f5a57fb Don't try to load a module while holding a vnode lock. 2008-05-20 17:28:59 +00:00
ad 7bf8432671 If mount fails because the needed file system code isn't in kernel, try
to autoload with the needed vfsops.
2008-05-20 17:25:49 +00:00
ad 67280de1f2 Allow module class to be passed to module_load(), as a basic sanity check
that we are loading the right kind of module.
2008-05-20 17:24:56 +00:00
martin bd3d112a87 fix !MODULAR compiles 2008-05-20 16:18:51 +00:00
ad e69aa3297c Take $MACHINE into account when looking for modules. 2008-05-20 16:04:08 +00:00
ad ef621e3353 Remove pointless COMPAT ifdef. 2008-05-20 16:03:31 +00:00
ad 7a3561a8dc PR kern/38694 module dependencies do not work as expected
Autoload modules from the correct path based on kernel version.
2008-05-20 14:11:55 +00:00
ad d0bd9aa452 - Do local relocs before loading requisite modules, and all others only
after requisite modules have been loaded. For PR kern/38697.
- Simplify kobj interface slightly to make error handling easier.
2008-05-20 13:34:44 +00:00
jmcneill 3ea8229871 If we see a non-loadable BSS section in a pre-loaded module, make sure we
don't return success from kobj_load or nasty things will happen.
2008-05-19 17:33:42 +00:00
ad b0f6f924e3 Give devsw_detach() a dummy error return. 2008-05-19 17:15:00 +00:00
ad 245f0726ac Reduce ifdefs due to MULTIPROCESSOR slightly. 2008-05-19 17:06:02 +00:00
rmind 5f701aa0a3 - Make periodical balancing mandatory.
- Fix priority raising in M2 (broken after making runqueues mandatory).
2008-05-19 12:48:54 +00:00
hannken 940992e2c0 Remove a bad assertion from last commit.
Non bufcache buffers may have BC_BUSY unset.
2008-05-16 14:08:56 +00:00
hannken 5d2bff060a Make sure all cached buffers with valid, not yet written data have been
run through copy-on-write.  Call fscow_run() with valid data where possible.

The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against
endless recursion.

- Add a flag B_MODIFY to bread(), breada() and breadn().  If set the caller
  intends to modify the buffer returned.

- Always run copy-on-write on buffers returned from ffs_balloc().

- Add new function ffs_getblk() that gets a buffer, assigns a new blkno,
  may clear the buffer and runs copy-on-write.  Process possible errors
  from getblk() or fscow_run().  Part of PR kern/38664.

Welcome to 4.99.63

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2008-05-16 09:21:59 +00:00
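A hedged sketch of the intended calling pattern for the new B_MODIFY flag; the exact bread() prototype at this revision (credential plus a flags argument) is assumed, and the helper name is hypothetical.

    #include <sys/param.h>
    #include <sys/buf.h>
    #include <sys/vnode.h>
    #include <sys/kauth.h>

    /*
     * Sketch: read a block that the caller intends to modify.  Passing
     * B_MODIFY lets the buffer cache run copy-on-write before the buffer
     * is handed back, instead of relying on the old LP_UFSCOW hack.
     */
    static int
    read_for_update(struct vnode *vp, daddr_t lbn, int size, kauth_cred_t cred)
    {
            struct buf *bp;
            int error;

            error = bread(vp, lbn, size, cred, B_MODIFY, &bp); /* prototype assumed */
            if (error != 0)
                    return error;
            /* ... modify bp->b_data ... */
            return bwrite(bp);
    }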
hannken 961a5d4bcb Fscow_run() may recurse into itself.
Take care by adding a per-lwp recursion counter.
2008-05-16 09:01:56 +00:00
ad cefdd6012a In panic, we busy wait if another CPU is already panicking. Don't spl0(),
because we could recurse and run off the end of the stack. Pointed out by
chs@.
2008-05-13 11:54:45 +00:00
yamt c27d8958e0 sys_ptrace: fix a locking botch. PR/38649 from Martin Husemann. 2008-05-13 09:16:11 +00:00
ad a9ee17c54d Use cpu_index(), not ci_cpuid. 2008-05-12 14:28:22 +00:00
rmind 76db3ec4cd sys_shmget: fix an object leak in case of error. 2008-05-11 18:48:00 +00:00
ad 32c5d76875 Fix locking botch. Pointed out by kardel@. 2008-05-11 14:42:18 +00:00
wrstuden 97003b024b Oops. These are supposed to come alive on the branch, not the head. 2008-05-11 00:18:09 +00:00
wrstuden dbbab92bc9 Initial checkin of re-adding SA. Everything except kern_sa.c
compiles in GENERIC for i386. This is still a work-in-progress, but
this checkin covers most of the mechanical work (changing signalling
to be able to accommodate SA's process-wide signalling and re-adding
includes of sys/sa.h and savar.h). Subsequent changes will be much
more interesting.

Also, kern_sa.c has received partial cleanup. There's still more
to do, though.
2008-05-10 23:48:44 +00:00
rumble a1221b6d4a Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
2008-05-10 02:26:09 +00:00
ad 541e4662f9 - Add tc_gonebad(): allows timecounter to be flagged as bad and removed at
the next clock tick.
- Remove time_lock, which is no longer required.
2008-05-08 18:56:58 +00:00
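A sketch of how a timecounter provider might use the new hook; the driver-side counter and detach routine are hypothetical.

    #include <sys/timetc.h>

    /* Hypothetical counter registered earlier with tc_init(). */
    static struct timecounter mydev_timecounter;

    /*
     * Sketch: when the hardware behind the counter disappears or turns out
     * to be unreliable, flag it as bad; the timecounter code then drops it
     * at the next clock tick rather than requiring synchronous removal.
     */
    static void
    mydev_counter_detach(void)
    {
            tc_gonebad(&mydev_timecounter);
    }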
njoly 24cbc2830b - Make semctl SETVAL/SETALL commands validate the semaphore value to
be set, which needs to be in the range [0,SEMVMX].
- Adjust the man page.
2008-05-06 20:25:09 +00:00
ad 82f138617e sys_unmount: drop ref to root dir before dounmount(), otherwise we'll
always get EBUSY.
2008-05-06 19:14:32 +00:00
ad 42d0626726 PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
  sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
  and is only ever write locked in dounmount(). A write hold can't be taken
  on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
  example when going r/o -> r/w, and is only present to serialize updates.
  In order to take this lock, a read hold must first be taken on
  mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
2008-05-06 18:43:44 +00:00
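A sketch of the update-side protocol just described, using the mnt_unmounting/mnt_updating names from the commit message; the field layout and error handling are simplified assumptions.

    #include <sys/param.h>
    #include <sys/errno.h>
    #include <sys/mount.h>
    #include <sys/mutex.h>
    #include <sys/rwlock.h>

    /*
     * Sketch: update a mount (e.g. r/o -> r/w).  Take a read hold on
     * mnt_unmounting with rw_tryenter() so an unmount cannot start under
     * us, then serialize against other updaters with mnt_updating.  Both
     * holds stay across the operation, as described above.
     */
    static int
    update_mount(struct mount *mp)
    {
            if (!rw_tryenter(&mp->mnt_unmounting, RW_READER))
                    return EBUSY;           /* unmount in progress */
            mutex_enter(&mp->mnt_updating);

            /* ... modify mount flags / export arguments here ... */

            mutex_exit(&mp->mnt_updating);
            rw_exit(&mp->mnt_unmounting);
            return 0;
    }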
ad a4e0004be3 LOCKDEBUG: try to speed it up a bit by not using so much global state.
This will break the build briefly but will be followed by another commit
to fix that.
2008-05-06 18:40:57 +00:00
ad 81194e34f1 Allow rw_tryenter(&lock, RW_READER) to recurse, for vfs_busy(). 2008-05-06 17:11:45 +00:00
ad d63f54a366 lookup: Do a vfs_trybusy(). If the file system is being unmounted, then
just fail the operation.
2008-05-06 15:04:00 +00:00
xtraeme f5b5967c0e Make this build again. 2008-05-06 13:31:02 +00:00
ad 39d40db63f PR kern/38141 lookup/vfs_busy acquire rwlock recursively
- sys_sync: acquire a write lock on the mount since the operation modifies
  the mount structure.
- sys_fchdir: use vfs_trybusy(). If an unmount is in progress, just fail it.
2008-05-06 12:54:25 +00:00
ad 8384d709ae Fix a couple of problems with checkdirs():
- vnode and cwd locks were being taken with proc_lock held, which is bad
  because proc_lock can only be held for a short period of time.

- Processes could have continually forked and escaped notice, keeping
  a reference to the old directory on top of which a new mount exists.
2008-05-06 12:51:22 +00:00
ad bdaf7ef5fc PR kern/38141 lookup/vfs_busy acquire rwlock recursively
vfs_busy: don't deadlock if curlwp is unmounting.
2008-05-06 12:39:32 +00:00
ad f4d5a72c7b PR kern/38141 lookup/vfs_busy acquire rwlock recursively
getvnode: Use vfs_trybusy, not vfs_busy. If unmount is in progress we
could deadlock, because vnode locks can be held during getnewvnode().
dounmount() locks in the reverse order (vfs_busy -> vnode).
2008-05-06 12:37:04 +00:00
ad e071d39c84 - Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
2008-05-05 17:11:16 +00:00
ad e7d7e96395 vfs_dobusy: add assertions. 2008-05-05 17:08:54 +00:00
ad 2bbb14eaa4 Back out previous. It broke the build. 2008-05-05 13:41:29 +00:00
jmcneill 729313d52c Use 2-clause license. 2008-05-05 00:12:49 +00:00
ad e724ef3c86 Provide zcalloc()/zcfree() so that uncompress() works. 2008-05-04 23:48:05 +00:00
ad b407147f14 Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.
2008-05-04 23:07:09 +00:00
rumble 7846a18697 Fix an error path that previously panicked when module_fetch_info failed. 2008-05-04 21:35:12 +00:00
ad 517f9684fe Make it compile as part of librump. 2008-05-04 12:51:44 +00:00
ad 0915c596fe Ensure that there is always a link_set_vfsops, until we kill VFS_ATTACH(). 2008-05-04 12:43:58 +00:00
ad 5982e60c2b Broken assertions. 2008-05-03 15:57:17 +00:00
yamt 839080f755 lockdebug: try to detect recursive acquirements of read-write locks. 2008-05-03 06:24:55 +00:00
yamt 59547f3b9f enterpgrp: 0 -> NULL for pointers. no functional changes. 2008-05-03 05:36:02 +00:00
yamt 63c2ef6a53 use sigismasked. no functional change. 2008-05-03 05:34:23 +00:00
yamt 6d3b5bc3c9 - encrypt/decrypt offsets if DIAGNOSTIC.
- add an assertion.
these changes allow detecting the use of an uninitialized percpu_t *.
2008-05-03 05:31:56 +00:00
yamt cf71e3c4fb add a comment. 2008-05-03 05:18:36 +00:00
ad 04feebca3b PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Until the code paths are fixed properly, put in place an ugly workaround
to make it safe to recursively acquire a read lock on a mount.
2008-05-02 17:40:30 +00:00
ad 1253c2cad4 Allow md_root_setconf() to set in a miniroot as the root file system
even if MEMORY_DISK_IS_ROOT is not defined (a runtime override).
2008-05-02 13:02:31 +00:00
ad 3f1b4f1759 Keep the program table and section strings around after loading the object,
since module_find_section() needs them.
2008-05-02 13:00:01 +00:00
ad 5d413581c7 Re-do yesterday's build fix to hook in the MD stuff if available. 2008-05-02 12:59:34 +00:00
rmind 0fe9197c91 lwp_suspend: check for LW_* flags in l_flag, not l_stat. 2008-05-01 21:25:23 +00:00
ad 1bb1fee762 - Add module_find_section(), allows a module to look up data in its object.
- Work around build failure.
2008-05-01 17:23:16 +00:00
ad 416e98a01e Another fix for pre-loaded modules. 2008-05-01 17:07:10 +00:00
ad 8ef40c772a Get the pre-loaded module code working. 2008-05-01 14:44:48 +00:00
drochner d41cbd880a fix soabort(): sofree() wants to be called with the lock held
approved by ad
2008-05-01 09:21:56 +00:00
ad ca230e9919 #error if __HAVE_PREEMPTION && !MULTIPROCESSOR. 2008-05-01 00:20:12 +00:00
ad 95369fc30f vfs_destroy: fix a broken assertion. 2008-04-30 21:06:28 +00:00
ad ed84275d91 Disable the freecheck stuff atomically so we only get one warning about
being out of slots.
2008-04-30 20:20:53 +00:00
ad 35d5de0433 KERN_FILE_BYPID: fix locking botch. 2008-04-30 17:18:53 +00:00
ad 928a6b2096 PR kern/38135 vfs_busy/vfs_trybusy confusion
The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
2008-04-30 12:49:16 +00:00
ad 83bf8c56bb PR kern/38547 select/poll do not set l_kpriority
Among other things this could have made X11 seem sluggish.
2008-04-30 12:45:21 +00:00
ad e1df701f0d Avoid unneeded AST faults. 2008-04-30 12:44:27 +00:00
reinoud 5fc434dc18 Add a BUFQ_CANCEL() next to BUFQ_PUT() and BUFQ_GET().
BUFQ_CANCEL(queue, element) removes the specified element previously queued
on the queue. It returns NULL if it was not found on the queue and the
element if it was successfully removed.

Run through tech-kern and changed name from BUFQ_REVOKE() by suggestion of
Jason Thorpe.
2008-04-30 12:09:02 +00:00
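A sketch of the semantics described above; the cancel helper and the error chosen for the revoked request are hypothetical.

    #include <sys/param.h>
    #include <sys/errno.h>
    #include <sys/buf.h>
    #include <sys/bufq.h>

    /*
     * Sketch: revoke a request that was queued earlier with BUFQ_PUT().
     * BUFQ_CANCEL() returns the element if it was still queued and NULL
     * if it was not found (e.g. already handed out via BUFQ_GET()).
     */
    static bool
    mydrv_cancel(struct bufq_state *bufq, struct buf *bp)
    {
            if (BUFQ_CANCEL(bufq, bp) == NULL)
                    return false;           /* too late, already dequeued */
            bp->b_error = EINTR;            /* hypothetical error for the caller */
            biodone(bp);
            return true;
    }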
rmind 5d285c31ff Set minimal count of LWPs for catching to 1, and cache-hotness time to ~3ms 2008-04-30 09:17:12 +00:00
ad 10f791f083 kpreempt: fix a block that should only have compiled as C++... I guess
there is a parsing bug in gcc that let it through.
2008-04-30 00:52:22 +00:00
yamt d1de8f5e7f mutex_vector_enter: fix a typo in a comment. 2008-04-30 00:40:13 +00:00
ad 0329609eb4 Reapply 1.235 which was lost with a subsequent merge. 2008-04-30 00:30:56 +00:00
ad e3610f1886 kern/38135 vfs_busy/vfs_trybusy confusion
The symptom was that sometimes file systems would occasionally not appear
in output from 'df' or 'mount' if the system was busy. Resolution:

- Make mount locks work somewhat like vm_map locks.
- vfs_trybusy() now only fails if the mount is gone, or if someone is
  unmounting the file system. Simple contention on mnt_lock doesn't
  cause it to fail.
- vfs_busy() will wait even if the file system is being unmounted.
2008-04-29 23:51:04 +00:00
ad 68c83ab9c7 kern/38502 ifconfig wi0 hangs
Don't acquire the socket lock for PRU_CONTROL.
2008-04-29 18:35:14 +00:00
ad bf797086e6 Don't try grabbing a zombie's p_reflock. 2008-04-29 18:13:24 +00:00
ad 322906d197 solisten: don't leak lock if the socket is busy. 2008-04-29 17:35:31 +00:00
ad b872c0e53d PR kern/37917 /bin/ps no longer shows zombies 2008-04-29 16:21:27 +00:00
ad a4c98bcccd Ignore processes with PK_MARKER set. 2008-04-29 16:21:01 +00:00
ad 1074fa7182 Ignore processes with PK_MARKER set. 2008-04-29 15:55:24 +00:00
ad ddeba2439c Ignore processes with PK_MARKER set. 2008-04-29 15:51:23 +00:00
rmind 1942fc2548 Split the runqueue management code into a separate file.
OK by <ad>.
2008-04-29 14:35:20 +00:00
ad 254bed5bd3 Fix a race condition that could cause a deadlock between two threads in
the same process simultaneously trying to dump core. Fixes PR kern/37704.
2008-04-29 14:04:06 +00:00
ad 0910800372 Suspended LWPs are no longer created with l_mutex == spc_mutex. Remove
workaround in setrunnable. Fixes PR kern/38222.
2008-04-29 13:56:14 +00:00
martin 3028e483e4 Convert to new 2 clause license 2008-04-29 06:53:00 +00:00
ad ffc4969f6e Don't count many items as EVCNT_TYPE_INTR because they clutter up the
systat vmstat display.
2008-04-28 23:00:22 +00:00
ad ca24210d8c EVCNT_TYPE_INTR -> EVCNT_TYPE_MISC 2008-04-28 22:15:47 +00:00
ad b96eb5aec9 Make the preemption switch a __HAVE instead of an option. 2008-04-28 21:17:16 +00:00
martin ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad 499f0dfad6 Even if PREEMPTION is defined, disable it by default until any preemption
safety issues have been ironed out. Can be enabled at runtime with sysctl.
2008-04-28 15:38:03 +00:00
ad 4c7ba24481 Add MI code to support in-kernel preemption. Preemption is deferred by
one of the following:

- Holding kernel_lock (indicating that the code is not MT safe).
- Bracketing critical sections with kpreempt_disable/kpreempt_enable.
- Holding the interrupt priority level above IPL_NONE.

Statistics on kernel preemption are reported via event counters, and
where preemption is deferred for some reason, it's also reported via
lockstat. The LWP priority at which preemption is triggered is tuneable
via sysctl.
2008-04-28 15:36:01 +00:00
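A sketch of the second deferral mechanism in the list above; the per-CPU statistic touched here is only an illustration of state that must not migrate mid-update.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/cpu.h>

    /*
     * Sketch: bracket a short per-CPU critical section so the LWP cannot
     * be preempted (and migrated) while it manipulates curcpu()'s state.
     */
    static void
    bump_local_stat(void)
    {
            kpreempt_disable();
            curcpu()->ci_data.cpu_nsyscall++;       /* illustrative per-CPU field */
            kpreempt_enable();
    }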
ad 5668836100 MUTEX_SPIN_SPLRAISE: add another __insn_barrier() for safety. 2008-04-28 13:18:50 +00:00
ad 5be927ae20 Make preemption safe. 2008-04-27 22:43:08 +00:00
ad 02691198b8 Minor fix for preemption safety. 2008-04-27 14:29:09 +00:00
ad 12e587a1a3 Fix a use-after-free in soabort(). It would be better to kill SS_NOFDREF
and maintain a per-socket reference count, but SS_NOFDREF is slightly
more than a simple reference count and I don't want to break anything.
2008-04-27 14:26:58 +00:00
ad 92cbb1b2af Extend spl protection to keep all kernel_lock state in sync. There could
have been problems before. This might help with the assertion failures
seen on sparc64.
2008-04-27 14:13:05 +00:00
ad 1f8aca087d Disable preemption during the final stages of LWP exit. 2008-04-27 11:39:20 +00:00
ad 27168d9d58 - Rename crit_enter/crit_exit to kpreempt_disable/kpreempt_enable.
DragonflyBSD uses the crit names for something quite different.
- Add a kpreempt_disabled function for diagnostic assertions.
- Add inline versions of kpreempt_enable/kpreempt_disable for primitives.
- Make some more changes for preemption safety to the x86 pmap.
2008-04-27 11:37:48 +00:00
ad 2759896048 Add a comment. 2008-04-27 11:29:12 +00:00
ad 8f9c8f5ea5 lockdebug_barrier: disable preemption using the interrupt priority level,
not crit_enter/crit_exit. Since this is called from mi_switch(), crit_exit
could recurse and skew statistics.
2008-04-27 11:28:49 +00:00
ad 65af92c2d9 Adjust previous: orphanpg() shouldn't have been playing about with tty_lock.
It was a bit of code that I accidentally left in.
2008-04-27 10:56:28 +00:00
christos 4c10a03972 orphanpg wants the tty lock held. 2008-04-27 01:12:27 +00:00
yamt 0e18a54641 fix a comment. 2008-04-26 08:09:30 +00:00
yamt 52c2e613a9 idle_loop: unsigned -> uint32_t to be consistent with the rest of the code.
no functional change.
2008-04-26 08:08:27 +00:00
yamt 582ad655c2 fix a comment. 2008-04-26 08:06:11 +00:00
ad c925598aae lwp_startup: spl0 after pmap_activate, otherwise we could be preempted
without a pmap active.
2008-04-25 14:34:41 +00:00
joerg d4752c626d Before allowing rmdir to progress into the netherhells called VFS,
check if no filesystem is mounted on this node. This can happen
for null mounts on top of null mounts.
2008-04-25 13:40:55 +00:00
ad 4079c2dd69 Use pool_cache+atomics for sigacts. 2008-04-25 11:24:11 +00:00
ad 607f7941b7 Remove unneeded playing about with kernel_lock. 2008-04-25 11:23:42 +00:00
ad 1ae2046c17 semexit: do nothing if the process has not used semaphores. 2008-04-25 11:21:18 +00:00
ad 8c71a574b0 Remove unneeded kernel_lock/splvm stuff. 2008-04-25 00:07:24 +00:00
alc 8c99fcbe3b fix typo in comment 2008-04-24 23:26:00 +00:00
ad 3cef738139 lwp_userret: don't drop p_lock while holding a scheduler lock. 2008-04-24 21:47:11 +00:00
ad 284c2b9aef Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
2008-04-24 18:39:20 +00:00
ad 6d70f903e6 Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
  be sent from a hardware interrupt handler. Signal activity must be
  deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
  and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
2008-04-24 15:35:27 +00:00
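A sketch of what process-list inspection looks like after this merge; the statistic being gathered is illustrative, and the PK_MARKER test matches the marker-skipping fixes committed nearby.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/proc.h>

    /*
     * Sketch: walk the process list under the unified proc_lock.  Because
     * proc_lock is now adaptive, this must run in thread or soft interrupt
     * context, never from a hardware interrupt handler.
     */
    static int
    count_stopped_procs(void)
    {
            struct proc *p;
            int n = 0;

            mutex_enter(proc_lock);
            PROCLIST_FOREACH(p, &allproc) {
                    if (p->p_flag & PK_MARKER)      /* skip iteration markers */
                            continue;
                    if (p->p_stat == SSTOP)
                            n++;
            }
            mutex_exit(proc_lock);
            return n;
    }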
ad c2deaa264e xc_broadcast: don't try to run cross calls on CPUs that are not yet running. 2008-04-24 13:56:30 +00:00
ad 026542bb25 Regen. 2008-04-24 11:51:47 +00:00
ad 30abe39468 - Retire SYCALL_MPSAFE. With the exceptions of darwin and irix emulations,
all system calls are now MPSAFE.
- Remove unneeded acquire/release of kernel_lock.
2008-04-24 11:51:18 +00:00
ad 15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
sborrill cbfd0202cd It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)
2008-04-24 08:51:06 +00:00
ad c8ff5c0c50 kmutex_t * -> void *, to avoid MD header fallout. 2008-04-23 13:19:44 +00:00
ad ebca8ee832 mutex_owned, rw_read_held, rw_write_held, rw_lock_held: check for a NULL
pointer.
2008-04-22 14:46:35 +00:00
njoly 8668268571 Fix semaphore permissions returned by semctl+IPC_STAT, by masking
anything other than the expected lower 9 bits.
2008-04-22 12:14:12 +00:00
ad 43d8bae932 Give callout_halt() an additional 'kmutex_t *interlock' argument. If there
is a need to block and wait for the callout to complete, and there is an
interlock, it will be dropped while waiting and reacquired before return.
2008-04-22 12:04:22 +00:00
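A sketch of the new interlock argument, using a hypothetical driver softc; callout_halt() drops sc_lock while it waits for a running handler and reacquires it before returning.

    #include <sys/param.h>
    #include <sys/callout.h>
    #include <sys/mutex.h>

    struct mydrv_softc {                    /* hypothetical softc */
            kmutex_t        sc_lock;
            callout_t       sc_tick;
            bool            sc_dying;
    };

    /*
     * Sketch: stop a periodic callout whose handler takes sc_lock itself.
     * Passing sc_lock as the interlock lets callout_halt() sleep without
     * deadlocking against a handler that is blocked on the same mutex.
     */
    static void
    mydrv_stop(struct mydrv_softc *sc)
    {
            mutex_enter(&sc->sc_lock);
            sc->sc_dying = true;
            callout_halt(&sc->sc_tick, &sc->sc_lock);
            mutex_exit(&sc->sc_lock);
    }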
ad ecebc8b473 Implement MP callouts as discussed on tech-kern. The CPU binding code is
disabled for the moment until we figure out what we want to do with CPUs
being offlined.
2008-04-22 11:45:28 +00:00
ad 3fbed79bb8 Mark the callout MPSAFE and use callout_halt(). 2008-04-22 11:44:24 +00:00
reinoud 0971ac9234 When using nested buffers, allow one erroring-out nested buffer to
error-out the master buffer.

The old setup was nondeterministic since a later scheduled nested buffer
could clear the error again since there is no B_ERROR flag anymore. It also
would discard the error the nested buffer returned.
2008-04-22 11:05:06 +00:00
ad 6bd54792e3 Regen. 2008-04-21 12:57:00 +00:00
ad a2249ef75c Make ntp, pmc, reboot, sysarch, time syscalls MPSAFE. 2008-04-21 12:56:30 +00:00
ad 4d9ff4744f Fix TIOCSIG handling for SIGINFO. 2008-04-21 12:49:20 +00:00
yamt a2c6efe92f ttygetinfo: fix a locking error in rev.1.215. 2008-04-21 11:56:01 +00:00
ad d9bace2a92 Acquire kernel_lock directly in LFS syscalls. 2008-04-21 11:45:34 +00:00
ad e655812d5c Regen. 2008-04-21 00:14:22 +00:00
ad 08b44dd8b9 timer fixes for PR 37093:
- Fix serious concurrency problems, making the code MT and MP safe in
  the process.
- Don't allocate memory or inspect process state from hardclock().
2008-04-21 00:13:46 +00:00
ad 573e08da0c ttys are allocated/freed infrequently enough that there is no point having
a separate pool for them.
2008-04-20 19:30:13 +00:00
ad 664f91e474 Improve ^T / SIGINFO handling:
- Restore code removed during LWPification.
- Don't touch proc state from a hardware interrupt handler.
- Fix the locking.
2008-04-20 19:22:44 +00:00
mlelstv 77f5b73003 When unp_internalize fails (due to the sanity check or an out-of-memory
condition), it leaves the control message with file descriptors. Calling
unp_dispose() will interpret the message as containing file pointers
and crash the system.
This change removes unp_dispose() from this failure path and avoids
using goto to jump into switch statements...
The previous workaround to ignore such messages in unp_scan() is removed.
2008-04-20 07:47:18 +00:00
mjf ede732e020 If cm->cmsg_len is not valid for unp_internalize do not use it to work out
where the data is in unp_scan.

Fixes PR/38391
2008-04-19 22:26:52 +00:00
plunky 7c3f385475 correct cut and paste error in uuid_dec_be(); le16dec -> be16dec 2008-04-19 18:21:38 +00:00
yamt d8d1533c48 pidtbl_dump: use queue.h macros. no functional change. 2008-04-17 14:16:22 +00:00
yamt 91c77f1c78 enterpgrp: update a comment. 2008-04-17 14:14:20 +00:00
yamt 69bbf68c6e acquire proclist_lock for SESSHOLD/SESSRELE. 2008-04-17 14:07:31 +00:00
yamt bc397338d9 sched_tick: don't expire timeslices for SCHED_FIFO lwps. 2008-04-17 14:03:42 +00:00
yamt 70f8f58cac s/selwakeup/selnotify/ in a comment. 2008-04-17 14:02:24 +00:00
rmind 5c0e3318e2 Adjust comments: spc_mutex is now always a per-CPU lock, L_INMEM -> LW_INMEM,
L_WSUSPEND -> LW_WSUSPEND, and remove white-spaces, while here.
2008-04-15 18:54:30 +00:00
ad db0173b9a6 SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.
2008-04-14 18:07:51 +00:00
ad acc82f8808 soreceive: dom_externalize/dom_dispose can block. If new messages are
appended while the receiver is blocked, the sockbuf will be corrupted.
Dequeue control messages from the sockbuf and sync its state in one
pass. Only then process the control messages. From FreeBSD.
2008-04-14 15:42:20 +00:00
yamt 7bf20daff6 remove unnecessary __MUTEX_PRIVATE. 2008-04-14 09:40:43 +00:00
yamt e5c1081112 make decay_cpu static. 2008-04-14 09:39:31 +00:00
ad fa71518fbc Fix comments. 2008-04-14 00:18:43 +00:00
yamt 7ab55e0ff2 sched_print_runqueue: add __printf__ attribute to the 'pr' argument. 2008-04-13 22:54:19 +00:00
yamt 3cd40e9f41 sched_print_runqueue: fix printf formats. 2008-04-13 22:53:31 +00:00
dogcow 7bcb554c5f Since nobody else has fixed it yet: fix case of GDB && !MULTIPROCESSOR. 2008-04-13 16:22:14 +00:00
rmind 8d700f664c Fix shared memory code so that it can handle > 4GB addresses correctly.
PR/38109, patch (a little bit modified) from Chris Brand.
2008-04-12 20:49:22 +00:00
ad d87d01d660 Fix typo. Spotted by kardel@. 2008-04-12 18:22:03 +00:00
ad a78ad62cfb cache_enter: inline LIST_INSERT_HEAD so that the membar_producer() can be
put in the right spot. The 'next' link in the new entry must become globally
visible before the list head is updated. This could have affected systems
with weak memory ordering like the alpha.
2008-04-12 17:34:26 +00:00
ad da60beabf5 softint_overlay: bind the stolen LWP to the current CPU while processing,
to prevent it blocking and migrating to another CPU.
2008-04-12 17:17:28 +00:00
ad b60416c0e2 Move the LW_BOUND flag into the thread-private flag word. It can be tested
by other threads/CPUs but that is only done when the LWP is known to be in a
quiescent state (for example, on a run queue).
2008-04-12 17:16:09 +00:00
ad 06e0894e76 Take the run queue management code from the M2 scheduler, and make it
mandatory. Remove the 4BSD run queue code. Effects:

- Pluggable scheduler is only responsible for co-ordinating timeshared jobs.
- All systems run with per-CPU run queues.
- 4BSD scheduler gets processor sets / affinity.
- 4BSD scheduler gets a significant performance boost on some workloads.

Discussed on tech-kern@.
2008-04-12 17:02:08 +00:00
ad 3f5f5fa2a4 Maintain a circular queue of cpu_info's. 2008-04-11 15:31:34 +00:00
ad baba274422 mutex_vector_enter: reduce reads of mtx_owner slightly. 2008-04-11 15:28:34 +00:00
ad 1e11b07bfa Restructure the name cache code to eliminate most lock contention
resulting from forward lookups. Discussed on tech-kern@.
2008-04-11 15:25:24 +00:00
ad 4598f15028 rwlock changes, discussed on tech-kern:
- Use atomic ops directly, since rwlocks work the same way on all platforms.
- Try to make it a bit more cache efficient, and use branch hints.
- Fix a bug in rw_downgrade() where the turnstile lock was not released.
- Remove a couple of redundant assertions.
- Use atomic_swap instead of atomic_cas where it's safe to do so.
- After acquiring the turnstile lock in rw_vector_enter, check if the
  owner is running again and spin if so.
- Introduce and use rw_onproc() instead of abusing mutex_onproc().
- Change the handoff/release algorithm to reduce the window when a rwlock
  can be held, but the owner not on a CPU.
2008-04-11 14:55:51 +00:00
wiz e47f3f6ebe Commit fix for the fdfile leak described in PR 38374.
Patch provided by YAMAMOTO Takashi.

Ok ad@
2008-04-09 19:36:59 +00:00
thorpej 0cfa6e7487 Make the percpu API a little more friendly:
- percpu_getptr() is now called percpu_getref() and implicitly disables
  preemption (via crit_enter()) when it is called.
- Added percpu_putref() which implicitly reenables preemption (via
  crit_exit()).
2008-04-09 05:11:20 +00:00
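A sketch of the renamed pair; the counter layout is an assumption, allocated elsewhere with percpu_alloc(sizeof(uint64_t)).

    #include <sys/param.h>
    #include <sys/percpu.h>

    /*
     * Sketch: bump a per-CPU counter.  percpu_getref() hands back the
     * calling CPU's instance and disables preemption; percpu_putref()
     * reenables it once we are done with the pointer.
     */
    static void
    stat_inc(percpu_t *pc)
    {
            uint64_t *cnt;

            cnt = percpu_getref(pc);
            (*cnt)++;
            percpu_putref(pc);
    }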
ad d49b125626 When accessing a block/char device, cache the D_MPSAFE flag on initial
access, in case the devsw record is modified.
2008-04-06 17:27:39 +00:00
tsutsui cde344d3be Allow MD cycle counter routines to pass their own optimized
tc_get_timecount function to MI cc_init().
2008-04-05 18:17:36 +00:00
yamt 3426b80b5e - l_wmesg is not always valid. check l_wchan when using l_wmesg.
should fix a crash reported by Juan RP on current-users@.
- ttyinfo: lock lwp when accessing l_wmesg.
- fill_lwp: add an assertion.
2008-04-05 14:03:16 +00:00
yamt c6f589405e assertions. 2008-04-05 13:58:12 +00:00
cegger 224670ae98 use device_xname() where appropriate
OK martin
2008-04-04 20:13:18 +00:00
ad f07d372316 When a timeshared LWP blocks on a turnstile, elevate its priority into the
PRI_KTHREAD range. This is kind of ugly, but needed because of direct handoff
with rwlocks, and because threads that block holding a mutex regularly hold
other locks/resources.

Problem addressed: priority lending works well where a thread blocking on a
turnstile has a high priority level (eg realtime). For timeshared threads
(low priority) it's unlikely to have much effect. In the latter case threads
awoken from a turnstile can and do compete for CPU time with regular waits
like disk I/O. On MP systems this can result in a feedback loop where
threads cannot quickly get access to a resource held by a thread waking from
a turnstile. The waking thread eventually runs when enough of the other
threads block waiting for it, freeing up the CPU. The end result is a lot of
idle time during builds.
2008-04-04 19:16:24 +00:00
ad 15efd9ad99 Do adaptive spinning for rwlocks, but only if the lock is write held and
there are no waiters. This gives a major boost to build.sh on larger
systems as directory vnode locks are exclusive for lookup, but are often
only held for a very short period of time.

This change has the potential to more readily expose lock order reversals
and other types of deadlock.
2008-04-04 17:25:09 +00:00
ad 61a0a96054 Maintain a bitmap of idle CPUs and add idle_pick() to find an idle CPU
and remove it from the bitmap.
2008-04-04 17:21:22 +00:00
ad 2940b88b72 sched_tick: only cause a preemption if the current thread is hogging the CPU,
or if we are idle and should look for new work (matters with per-CPU queues).
2008-04-02 17:40:15 +00:00
ad 42bc09155e yield: don't drop priority to zero. libpthread doesn't make much use of
this any more but applications do and it now pessimizes benchmarks.
2008-04-02 17:38:16 +00:00
xtraeme 247cd610f6 Revert rev 1.126-1.128. The original code was correct; rmind and I
didn't look at them correctly.
2008-04-02 10:53:23 +00:00
xtraeme dcf3ee7d3b When copying l_name and l_wmesg use KI_LNAMELEN and KI_WMESGLEN
respectively, so that we don't care if l_name/wmesg is longer
than kl_name/wmesg, and the KASSERTs added previously can go away.
2008-04-01 21:05:37 +00:00
drochner 76ad1614e9 remove useless passing of the lwp from the KERNEL_LOCK() ABI
(not the API; this would be easy as well)
agreed (a while ago) by ad
2008-04-01 19:49:31 +00:00
xtraeme 3189c49560 Fix previous: use the length of l->l_foo not kl->l_foo and add
two KASSERTs to check for max length limits before copying.

As suggested by rmind@.
2008-04-01 18:06:06 +00:00
xtraeme 03c6a6aa65 fill_lwp: when copying l_wmesg and l_name, use the size of the string
not of the variable.

Found and ok by rmind@.
2008-04-01 17:39:58 +00:00
ad 520b46da7e Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.
2008-04-01 10:37:42 +00:00
xtraeme bc77ebea72 Remove useless returns at the end of void functions. 2008-03-31 15:28:47 +00:00
ad 164f30df1b Don't report kernel lock spinouts if init has not yet started.
XXX This should be backed out when we are sure that the drivers
are good citizens and configure nicely with interrupts enabled /
the system running.
2008-03-30 15:39:46 +00:00
ad bd9b59aafe selwakeup: convert a while() loop into a do/while() since the first test
isn't needed.
2008-03-29 14:08:35 +00:00
ad 96a6231c10 callout_halt: remove unneeded extern decl. 2008-03-29 14:07:23 +00:00
ad c7e03d2b58 callout_destroy: fix assertion to not fire when a callout is destroying
its own handle. PR kern/38324.
2008-03-29 14:00:55 +00:00
ad 58420c122f mutex_vector_exit: add another panicstr check. 2008-03-28 22:19:39 +00:00
ad 6c180fd421 Pull in sys/cpu.h for cpu_intr_p(). 2008-03-28 21:58:43 +00:00
ad 03489e636c sleepq_block: use callout_halt, as we have to wait for the callout to
stop (it might be running on another CPU). Otherwise, 'curlwp' could
exit before it completes.
2008-03-28 20:48:36 +00:00
ad c3338aabf1 Enable blocking synchronization for callouts as discussed at length on
tech-kern last year. Instead of modifying callout_stop, add a new
routine (callout_halt) which will sleep if the callout is already in
flight. Note that if a callout can take locks, the caller of callout_halt
must not hold any of those locks - otherwise the two could deadlock.
2008-03-28 20:44:38 +00:00
ad 13c3856c4c Remove dead code from previous. 2008-03-28 16:23:39 +00:00
ad 4bd84ff96a Prevent overlapping calls to bind() and/or connect() on a Unix socket. 2008-03-28 12:14:22 +00:00
ad c2f3592995 Prevent listen() on a socket that is already connected - we already prevent
connect() on a listening socket.
2008-03-28 12:12:20 +00:00
dholland d868f7242f Yet another rename workaround - this time check for . and .. early because
relookup() objects to being asked to handle them.
2008-03-28 05:02:08 +00:00
ad bb61e73cd5 Add code for dynamically allocated mutexes, as posted on tech-kern. 2008-03-27 19:11:05 +00:00
ad be04ac4896 Make rusage collection per-LWP and collate in the appropriate places.
cloned threads need a little bit more work but the locking needs to
be fixed first.
2008-03-27 19:06:51 +00:00
ad feb4783fdf Replace use of CACHE_LINE_SIZE in some obvious places. 2008-03-27 18:30:15 +00:00
ad 2ffc44f47b Regen. 2008-03-27 17:14:21 +00:00
ad a19a515177 Put kqueue/kevent back as MPSAFE. 2008-03-27 17:13:25 +00:00
ad 78656b1e91 - kqueue_scan: work around problem noted by yamt@: if an event fires while
we have unlocked the kqueue to check its state, leave it queued and
  re-check later.
- knote_dequeue: fold into knote_detach since nothing else uses it.
- Note a couple more problems.
2008-03-26 13:32:32 +00:00
yamt 91ae756395 - for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well.  fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map.  XXX a bit arbitrary.
- add a comment.
2008-03-25 23:21:42 +00:00
ad 8910b668ba mount_domount: hold an additional reference to the mountpoint across the
call to VFS_START. The file system can be unmounted before VFS_START
returns. Partially addresses PR kern/38291.
2008-03-25 22:13:32 +00:00
yamt ef630703b1 regen. 2008-03-24 23:47:06 +00:00
yamt f725bfefcf after yamt-lazymbuf merge, mark send/recv syscalls MPSAFE.
pointed out by Andrew Doran.
2008-03-24 23:46:43 +00:00
yamt 2acbe71698 regen. 2008-03-24 12:25:42 +00:00
yamt 9a4b7dd279 merge yamt-lazymbuf branch. 2008-03-24 12:24:37 +00:00
yamt 8df658c8ca add some DEBUG checks. 2008-03-24 09:09:55 +00:00
yamt d3d0ab9bb2 kqueue_scan: skip markers correctly. 2008-03-23 22:39:48 +00:00
ad 36cd74d4d8 Undo 1.150 (Don't make root an exception when enforcing rlimits). No other
Unix behaves this way and it breaks too many things, e.g. web servers.
2008-03-23 17:40:25 +00:00
ad 28a2c8b191 Reorder a code block slightly, to allow proclist_mutex to be an adaptive
mutex (purely for testing).
2008-03-23 16:53:45 +00:00
ad 25b10dbb15 lwp_ctl_alloc: initialize lcp_kaddr to vm_map_min(kernel_map), in order to
prevent uvm_map() from spuriously failing.
2008-03-23 16:39:34 +00:00
ad 3acbed8e48 Split select/poll into their own file. 2008-03-23 14:02:49 +00:00
yamt 93109cd91b when calculating some cache sizes, consider the amount of available kva.
PR/33185.
2008-03-23 10:39:52 +00:00
yamt d3bb744af3 make buf_map static. 2008-03-23 10:33:15 +00:00
rmind 579caa1e17 - Support for select/poll.
- Convert pool to pool-cache.
- Wrap long lines, adjust the license.
2008-03-23 00:44:15 +00:00
ad 40379c8716 Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.
2008-03-22 18:04:42 +00:00
ad 5214147407 LWP_CACHE_CREDS: instead of testing (l_cred != p_cred), use a per-LWP
flag bit to indicate a pending cred update. Avoids touching one item of
shared state in the syscall path.
2008-03-22 17:53:34 +00:00
christos 4897e6c085 bring some stuff from time_t=64...
- add sysalign parameter to syscalls.conf
- add compat_50
2008-03-22 15:11:01 +00:00
ad 527a0b7dab Regen. 2008-03-22 14:20:30 +00:00
ad f5405b27b2 Unmark kevent/kqueue as MPSAFE. There seems to be some kind of deadlock
involving kernel_lock.
2008-03-22 14:20:09 +00:00
yamt 9adad93037 wrap a long line. 2008-03-22 10:24:17 +00:00
rmind cbb7f92857 unp_gc: unlock filelist_lock in a case of restart. 2008-03-21 23:38:40 +00:00
ad 40607e5bba Regen. 2008-03-21 21:59:27 +00:00
ad bc370563bc Mark kqueue/kevent MPSAFE. 2008-03-21 21:58:57 +00:00
ad a9ca7a3734 Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
2008-03-21 21:54:58 +00:00
ad c743ad7159 File descriptor changes, discussed on tech-kern:
- Redo reference counting to be sane.  LWPs accessing files take a short
  term reference on the local file descriptor.  This is the most common
  case.  While a file is in a process descriptor table, a reference is
  held to the file.  The file reference count only changes during control
  operations like open() or close().  Code that comes at files from an
  unusual direction (i.e. foreign to the process) like procfs or sysctl
  takes a reference on the file (f_count), and not on a descriptor.

- Remove knowledge of reference counting and locking from most code that
  deals with files.

- Make the usual case of file descriptor lookup lockless.

- Make kqueue MP and MT safe. PR kern/38098, PR kern/38137.

- Fix numerous file handling bugs, and bugs in the descriptor code that
  affected multithreaded processes.

- Split descriptor system calls out into sys_descrip.c.

- A few stylistic changes: KNF, remove unused casts now that caddr_t is
  gone. Replace dumb gotos with loop control in a few places.

- Don't do redundant pointer passing (struct proc, lwp, filedesc *) unless
  the routine is likely to be inlined.  Most of the time it's about the
  current process.
2008-03-21 21:53:35 +00:00
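A sketch of the short-term descriptor reference pattern described above, assuming the fd_getfile()/fd_putfile() pair this interface exposes; the body of the check is illustrative only.

    #include <sys/param.h>
    #include <sys/errno.h>
    #include <sys/file.h>
    #include <sys/filedesc.h>

    /*
     * Sketch: the common, lockless path.  fd_getfile() takes a short-term
     * reference on the calling LWP's descriptor; fd_putfile() releases it.
     * Code approaching a file from outside the process (procfs, sysctl)
     * would instead take a reference on the file itself (f_count).
     */
    static int
    is_vnode_backed(int fd)
    {
            file_t *fp;
            int error;

            if ((fp = fd_getfile(fd)) == NULL)
                    return EBADF;
            error = (fp->f_type == DTYPE_VNODE) ? 0 : ENOTTY;   /* illustrative */
            fd_putfile(fd);
            return error;
    }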
plunky 606c30fdca add devsw_name2chr() function to look up character devices 2008-03-21 19:32:07 +00:00
ad 5e7d890908 - Extract the guts of soo_poll() into sopoll(), which takes a struct socket *.
This is for netsmb which wants to poll sockets directly.
- When polling a socket, first check for pending I/O without acquiring any
  locks. If no I/O seems to be pending, acquire locks/spl and check again
  doing selrecord() if necessary.
2008-03-20 19:23:15 +00:00
ad a0bce9fd99 softint_execute: add more assertions. 2008-03-20 19:12:23 +00:00
njoly d5bb355fe9 Handle rumpcalls/rumpcallshdr differently by always defining a default
value, which can be overwritten with syscalls.conf defines (just like
sys_nosys).

This fixes a problem where rump awk variables are set to a 0 value,
leading to the creation of an unexpected file with that name.

ok by pooka.
2008-03-18 12:36:15 +00:00
mrg 5e0e0b0a59 need <sys/atomic.h> now. 2008-03-18 02:49:15 +00:00
ad 1b558d1305 uid_find:
- Issue membar_producer() before inserting the new uidinfo.
- Optimize slightly and fix a couple of KNF nits.
- Need sys/atomic.h.
2008-03-18 02:35:29 +00:00
rmind 33928e0f83 - Replace uihashtbl_lock and struct uidinfo::ui_lock with atomic operations.
This makes uid_find(), chgproccnt(), chgsbsize() and lf_alloc(), lf_free()
  functions lock-less.
- Increase the size of uihashtbl in case of MP system, as suggested by <ad>.
- Add HASH_SLIST type for hashinit().

Reviewed by <ad>.
2008-03-17 21:16:03 +00:00
ad c796b0c73a Add a boolean parameter to syncobj_t::sobj_unsleep. If true we want the
existing behaviour: the unsleep method unlocks and wakes the swapper if
needs be. If false, the caller is doing a batch operation and will take
care of that later. This is kind of ugly, but it's difficult for the caller
to know which lock to release in some situations.
2008-03-17 18:01:44 +00:00
ad ad4f28d1e9 Make them compile again. 2008-03-17 17:05:54 +00:00
ad c42a4d1422 Add a boolean parameter to syncobj_t::sobj_unsleep. If true we want the
existing behaviour: the unsleep method unlocks and wakes the swapper if
needs be. If false, the caller is doing a batch operation and will take
care of that later. This is kind of ugly, but it's difficult for the caller
to know which lock to release in some situations.
2008-03-17 16:54:51 +00:00