7582 Commits

Author SHA1 Message Date
jruoho 1cbdcd8dc6 Use CTLTYPE_BOOL. 2010-04-19 11:20:56 +00:00
pooka d8c5395931 Don't loop eternally if init of a builtin module fails. 2010-04-16 11:51:23 +00:00
rmind 5f0ac9a4fa - Merge sched_pstats() and uvm_meter()/uvm_loadav(). Avoids double loop
through all LWPs and duplicate locking overhead.

- Move sched_pstats() from soft-interrupt context to process 0 main loop.
  Avoids blocking effect on real-time threads.  Mostly fixes PR/38792.

Note: it might be worth moving the loop above PRI_PGDAEMON.  Also,
sched_pstats() might be cleaned-up slightly.
2010-04-16 03:21:49 +00:00
rmind 93deacb9f4 Remove mclpool_allocator, which is unnecessary since mb_map removal. 2010-04-16 02:57:15 +00:00
pooka cc69e4568b will it include, that is the question
(to everyone's disappointment on some archs it didn't)
2010-04-15 20:46:08 +00:00
pooka 0d8b367e2d Need a few funny #defines in kern_syscall.c too. 2010-04-14 15:15:37 +00:00
pooka 64d6a27dc2 need opt_modular.h in kern_syscall 2010-04-14 15:12:44 +00:00
pooka 7ea24651a7 Move routines related to syscall establishment from kern_subr.c and
kern_stub.c to kern_syscall.c.
2010-04-14 14:46:59 +00:00
pooka 00dd646066 regen: rump vnodeif went on a diet 2010-04-14 14:00:04 +00:00
pooka fcde1e9ca0 Make rump vnode interface lightweight: the only things we
really need are:

  0) provide VOP_OP in the alternate RUMP_VOP_OP namespace

  and for each op:
  1) schedule rump cpu
  2) call VOP_OP
  3) unschedule rump cpu

While here, take the opportunity to get rid of _t lossage in the
rump-exported interfaces.
2010-04-14 13:58:51 +00:00
pooka 592f1701c5 regenefactor for comment and whitespace changes 2010-04-14 12:21:04 +00:00
pooka 7b0c00ddf9 Print "end of special cases" only where special cases end and not
a second time at the end of the file.  Adjust whitespace for the
sheer functional joy of it.

(i hope i didn't ruin someone's joke by missing a humorous implication
that all vnode operations are considered a little special)
2010-04-14 12:19:50 +00:00
pooka 794d5a7111 _KERNEL_OPT 2010-04-13 22:46:10 +00:00
pooka 2c4f731dd6 tyop 2010-04-12 23:20:18 +00:00
christos 302c1e42ae void police! 2010-04-12 23:09:28 +00:00
pooka 290fe400e0 Separate lwp specificdata data structure management from lwp cpu/vm
management.

No functional change.

(specificdata routines went from kern_lwp.c to subr_lwp_specificdata.c)
2010-04-12 22:15:31 +00:00
mrg 18d175fa3c reject attempts to write CTLTYPE_BOOL nodes with a value other than 0 or 1. 2010-04-11 01:50:25 +00:00
pooka cfff0936d1 regen: remove unused vdesc_transports 2010-04-10 19:44:02 +00:00
pooka 790df5ab8d "Not yet" since 4.4BSD is quite a lot of "not yet", so remove
vdesc_transports from vnodeop_desc until we have a "not not yet"
situation.

Ride 5.99.27 bump (full build still in progress.  i wanted to get
this in as soon as possible to most effectively ride the bump.)
2010-04-10 19:41:54 +00:00
njoly cb925a949e Make lwp_ctl_alloc() return 0 instead of EINVAL, when lwpctl user
address already exists. This allows calling _lwp_ctl(2) more than once
on the same LWP.
2010-04-09 11:47:17 +00:00
njoly 4f2ea8f3c9 Add a new clock_gettime1() function that holds most of the
clock_gettime syscall code (except for the copyout). Adjust all
corresponding syscalls to make use of it.
2010-04-08 11:51:13 +00:00
christos 2909eda13b fix build for ports that don't have PT_STEP (Havard Eidnes) 2010-04-07 13:10:46 +00:00
christos ca843a73b0 PR/43128: Paul Koning: Threads support in ptrace() is insufficient for gdb to
debug threaded live apps: Add an optional lwpid in PT_STEP and PT_CONTINUE to
indicate which lwp to operate on, and implement the glue required to make it
work.
2010-04-06 13:50:22 +00:00
he d399695a05 Follow christos' suggestions, and make ks_active a u_short, and
also only use 16 u_shorts instead of 32 ints.  Also add panic()
calls for under- and overflow of the ks_active members under
DIAGNOSTIC.  The MAXBUCKET constant ended up in sys/mallocvar.h
and not sys/param.h, as the latter caused build problems.

Ride the kernel revision bump of my previous change.
2010-04-05 08:03:41 +00:00
he bb89b7208d Extend struct malloc_type to count the number of active allocations
per size, and make vmstat report this information under the "Memory
statistics by type" display, which is only printed when the kernel
has been compiled with KMEMSTATS defined, like this:

Memory statistics by type                                Type  Kern
           Type InUse  MemUse HighUse   Limit   Requests Limit Limit Size(s)
          wapbl    15   4192K   4192K  78644K     376426     0     0 32:0,256:3,512:6,131072:1,262144:2,524288:3

Since struct malloc_type is user-visible and is changed, bump kernel
revision to 5.99.26.

While it is true that malloc(9) is in general on the path of slowly
being replaced by kmem(9) (kmem_alloc/kmem_free), there remain a
lot of points of usage of malloc/free, and this could aid in finding
any leaks.  (It helped find the leak fixed in PR#42661.)

This was discussed with and somewhat hesitantly OKed by rmind@
2010-04-05 07:16:12 +00:00
jnemeth bc239ea58a don't leak a vnode and don't call namei (implicitly) twice 2010-04-04 17:18:04 +00:00
njoly 0876f873dd Move most clock_getres syscall code, except for copyout call, to a new
clock_getres1() function which can be used by emulations. Adjust all
clock_getres syscalls to now make use of it.
2010-04-03 17:20:05 +00:00
tsutsui 55bc1f1a41 Use time_t (not long) to save time_second value. 2010-04-02 23:31:42 +00:00
christos 8c20e0e884 fix debugging printf. 2010-04-02 14:11:18 +00:00
ad 78f9946c6b Fix copyrights. 2010-03-31 19:59:39 +00:00
pooka 242bf1c3e7 Stop exposing fifofs internals and leave only fifo_vnodeop_p visible. 2010-03-29 13:11:32 +00:00
pooka 8b70574df1 Add init/fini for components (modules etc.). These eat the standard
driver/attach/data typically present and once some locking is grown
in here, these routines can be made to fail or succeed a component
attachment/detachment atomically.
2010-03-25 19:23:18 +00:00
drochner 713b10dc38 When choosing the start address of a dynamic (ie relocatable) executable,
respect the alignment in the ELF phdr.
Also, for correctness, use the maximum alignment of the PT_LOAD
sections rather than just the first one found.
Also, use more meaningful types.
2010-03-22 22:10:10 +00:00
christos b691db097d more debugging compilation fixes. 2010-03-20 01:52:16 +00:00
christos 7fa75c35d6 fix debugging code. 2010-03-20 01:47:12 +00:00
christos 6d16572ef4 minimize ifdefs and avoid duplicated code. 2010-03-20 01:45:30 +00:00
christos 3e2a63c711 - Make maximum memory limits for various things #define constants and use them
consistently across the code.
- Re-do note parsing code to read the section headers instead of the program
  headers because the new binutils merge all the note sections in one program
  header. This fixes all the pax note parsing which has been broken for all
  binaries built with the new binutils.
- Add diagnostics to the note parsing code to detect malformed binaries.
- Allocate and free note scratch space only once, not once per note.
2010-03-19 22:08:13 +00:00
pooka 1c55854229 Print builtin "use -f" message only if not autoloading. Otherwise
it'll get spammy.

XXX: this should probably be printed iff the toplevel module is
not being autoloaded (i.e. there is a human to interpret the error).
Otherwise disabled dependencies give a misleading EPERM.
2010-03-18 18:25:45 +00:00
pooka d76b630321 Never autounload builtin modules (they will never be autoloaded if disabled). 2010-03-18 17:33:18 +00:00
christos 724aa20200 rename DEBUG_ASLR -> PAX_ASLR_DEBUG 2010-03-15 20:35:19 +00:00
darran 0ede06284a DTrace: Make the CTF handling conditional on KDTRACE_HOOKS for now since
it breaks the boot of the atari kernel (and possibly others).
2010-03-14 21:27:49 +00:00
christos 05cb23b544 call accept_filter_init in setopt so that we don't use an uninitialized lock
from the setsockopt path.
2010-03-13 23:03:39 +00:00
christos e8cb686278 make this compile. 2010-03-13 16:27:06 +00:00
christos 04140a33ca make this compile. 2010-03-13 01:41:14 +00:00
darran 38c72d335c DTrace: Add support for CTF sections in the netbsd elf image, load these
at boot.
Add a ksyms_mod_foreach() function to iterate a callback function over the
set of elf symbols for a specific module (netbsd included).
Add kern_ctf.c and mod_ctf_get() to allow the retrieval and decompression
of CTF sections for a specific module.
2010-03-12 21:43:10 +00:00
pooka effc302a58 Make module_{lookup,enqueue}() static now that it's possible again
(effectively reverts my kern_module rev. 1.53 from some months ago)
2010-03-05 20:10:05 +00:00
pooka ee7bfacd73 Move builtin modules to a list in init and load them from there
instead of using linksets directly.  This has two implications:

1) It is now possible to "unload" a builtin module provided it is
   not busy.  This is useful e.g. to disable a kernel feature as
   an immediate workaround to a security problem.  To re-initialize
   the module, modload -f <name> is required.
2) It is possible to use builtin modules which were linked at
   runtime with an external linker (dlopen + rump).
2010-03-05 18:35:01 +00:00
pooka 7363f77230 Replace unsafe use of TAILQ_FOREACH: as the comment says, the
structures are pulled off the list in the loop and it's anyone's
guess where they go after that.
2010-03-03 17:58:36 +00:00
pooka 4a21cd9096 Remove fs_lfs now that the syscall is always defined. 2010-03-03 00:49:39 +00:00
yamt b1521a3612 remove redundant checks of PK_MARKER. 2010-03-03 00:47:30 +00:00
pooka 7f245c149b regen: lfs megamaid syscalls -> MODULAR 2010-03-02 19:37:02 +00:00
pooka 773120876e Make lfs syscalls loadable. This nukes fs_lfs.h & #ifdef LFS.
(I don't mind if someone wants to go further and OBSOL them).
2010-03-02 19:34:26 +00:00
pooka e867c34ab2 Make it possible to add extra output at the top of syscallargs.h.
Use this feature to stick sys/mount.h in there.
2010-03-02 19:33:12 +00:00
pooka 3f57313fc5 fs_ffs.h is no longer required (since the death of bufops / softdep) 2010-03-02 14:22:44 +00:00
darran c9e0343fff Revert accidental commit of CTF work-in-progress changes. 2010-03-01 22:27:07 +00:00
darran 6a9056a926 DTrace: Add an SDT (Statically Defined Tracing) provider framework, and
implement most of the proc provider.  Adds proc:::create, exec,
exec_success, exec_failure, signal_send, signal_discard, signal_handle,
lwp_create, lwp_start, lwp_exit.
2010-03-01 21:10:13 +00:00
mlelstv 7ad5c184b5 Move block number computations to callers of wapbl_read/wapbl_write and
conditionally build DEV_BSIZE adjustments for kernel. fsck_ffs shares
the same code but accesses physical blocks.

Also compute correct block numbers for each physical sector.
2010-02-27 16:51:03 +00:00
mlelstv ef95b640b0 Store physical block numbers in superblock that point to the journal.
Calculate position of both commit headers correctly for disks with
large sectors.
Correct calculation of circular buffer size.
2010-02-27 12:04:19 +00:00
mlelstv c30b0f26b2 mnt_fs_bshift is the filesystem block size, not the fragment size.
Revert to physical block size. This is fine as long as filesystem
and log stay on a similar physical medium.
2010-02-26 22:24:07 +00:00
jym cbdb1f8831 Change RSS (resident set size) limit. Instead of setting it arbitrarily
to the total free memory available to the system, use the smallest value
between VM_MAXUSER_ADDRESS and total free memory (having a RSS limit
bigger than VM_MAXUSER_ADDRESS has no real meaning).

Fix a possible int overflow when ptoa(uvmexp.free) is bigger than 4GB
with a 32-bit vaddr_t.

This change is similar to the one made in rev 1.144 of uvm/uvm_glue.c.
2010-02-26 18:47:13 +00:00
dyoung c1b390d493 A pointer typedef entails trading too much flexibility to declare const
and non-const types, and the kernel uses both const and non-const
PMF qualifiers and device suspensors, so change the pmf_qual_t and
device_suspensor_t typedefs from "pointers to const" to non-pointer,
non-const types.
2010-02-24 22:37:54 +00:00
darran eda764aaf5 DTrace: remove kern_dtrace.c since it is no longer used. (Its functions
are inlined in dtrace_bsd.h).
2010-02-23 22:22:29 +00:00
darran 383b7f700b DTrace: Get rid of the KDTRACE_HOOKS ifdefs in the kernel. Replace the
functions with inline functions that are empty when KDTRACE_HOOKS is not
defined.
2010-02-23 22:19:27 +00:00
mlelstv b4d69db7b5 Use correct offset to block number calculations.
Also change access to filesystem blocks to be done by fragment instead
of by physical block. Fragments are the fundamental blocks of the
filesystem.

For a theoretical filesystem that accesses the disk in smaller units
than stored in mp->mnt_fs_bshift, the assumption might be wrong. But
this will also break other subsystems. The value mp->mnt_dev_bshift
which formerly represented the physical sector size is currently only
virtual in NetBSD (always DEV_BSIZE).
2010-02-23 20:51:25 +00:00
drochner ec0c8f12ca Run binaries with ELF_TYPE==DYN at virtual address PAGE_SIZE rather
than 0. This is still not the intent of PIE, but it allows them to
run with VA 0 disabled.
(The PAX_ASLR stuff which should deal with this needs work.)
2010-02-22 19:46:18 +00:00
darran 2c398f9ea9 DTrace: Add __predict_false() to the DTrace hooks per rmind's suggestion. 2010-02-21 07:39:18 +00:00
darran 8d0c2f9cd9 DTrace: missed kern_dtrace.c (thanks rmind!) 2010-02-21 07:28:51 +00:00
darran 37422f86b0 Added a defflag option for KDTRACE_HOOKS and included opt_dtrace.h in the
relevant files. (Per Quentin Garnier - thanks!).
2010-02-21 07:01:57 +00:00
darran 1bc28ea1e9 Add the DTrace hooks to the kernel (KDTRACE_HOOKS config option).
DTrace adds a pointer to the lwp and proc structures which it uses to
manage its state.  These are opaque from the kernel perspective to keep
the kernel free of CDDL code. The state arenas are kmem_alloced and freed
as processes and threads are created and destroyed.

Also add a check for trap06 (privileged/illegal instruction) so that
DTrace can check for D scripts that may have triggered the trap so it
can clean up after them and resume normal operation.

Ok with core@.
2010-02-21 02:11:39 +00:00
dyoung 754590e092 Avoid a potential crash: get more struct device initialization
out of the way before trying to get a unit number.  If we cannot
get a unit number, we call config_devfree(), which expects
fields such as dv_flags, dv_cfattach, and dv_private to be initialized.
2010-02-19 22:28:47 +00:00
skrll f3c7b2c4cd Fix comment(s).
OK'ed by rmind
2010-02-18 20:58:23 +00:00
dyoung c26d0a3ad4 Initialize the temporary pmf_qual_t in pmf_device_subtree_release()
to avoid a failed ds != NULL assertion, later.
2010-02-17 00:15:24 +00:00
dholland 1b722ce8d0 Don't inspect vn_stat() results until after checking that it succeeded.
If anyone's been seeing random "File too large" results from module loading,
this should fix it.
2010-02-16 05:47:52 +00:00
dyoung ff79c75809 Extract a subroutine, const char *cfdata_ifattr(cfdata_t cf), that
returns the name of the interface attribute that associates cf with
its parent.  Use cfdata_ifattr() at several sites in the autoconf
code.
2010-02-15 20:20:34 +00:00
yamt ca9d84bc07 sysctl_doeproc: don't follow a possibly stale pointer. 2010-02-13 11:22:21 +00:00
haad aa8090778a Add vrele_async routine which asynchronously releases vnodes in a different context
and at some time in the future.

Ok: ad@.
2010-02-11 23:16:35 +00:00
haad a23681588b Add kmem_asprintf routine which allocates a string according to the format
string from kmem pool. Allocated string is string length + 1 char for ending
zero.

Ok: ad@.
2010-02-11 23:13:46 +00:00
wiz 8e35c759e7 Fix typo in comment. 2010-02-09 23:05:16 +00:00
joerg 1476e7a45a Handle rump like the direct mapping case. 2010-02-08 22:55:36 +00:00
joerg d621e29eca Remove separate mb_map. The nmbclusters is computed at boot time based
on the amount of physical memory and limited by NMBCLUSTERS if present.
Architectures without direct mapping also limit it based on the kmem_map
size, which is used as backing store. On i386 and ARM, the maximum KVA
used for mbuf clusters is limited to 64MB by default.

The old default limits and limits based on GATEWAY have been removed.
key_registered_sb_max is hard-wired to a value derived from 2048
clusters.
2010-02-08 19:02:25 +00:00
skrll c53ddcfff2 Re-apply:
Invert the sense of the bit to mark if LOCKDEBUG is enabled to
	disabled.

	This will help my fellow developers spot "use before initialised"
	problems that hppa picks up very well.

but fix the !LOCKDEBUG case by defining the "no debug" bits to zero so
they have no effect on lock stubs.
2010-02-08 09:54:27 +00:00
uebayasi 2903e6d834 __inline -> inline 2010-02-06 12:10:59 +00:00
cube 5ba423200b Revert commit from Fri Feb 5 06:43:17 UTC 2010 by skrll:
Invert the sense of the bit to mark if LOCKDEBUG is enabled to disabled.

      This will help my fellow developers spot "use before initialised" problems
      that hppa picks up very well.

It has to be done differently, because the semantics of mtx_owner in the non-
LOCKDEBUG case can vary significantly between archs, and thus it is not
possible to simply flip a bit to 1.

Ok core@, as at least i386 is unbootable right now.
2010-02-06 04:50:19 +00:00
cegger 8e585686fb fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().
2010-02-05 11:06:36 +00:00
skrll 60b795dc0a Invert the sense of the bit to mark if LOCKDEBUG is enabled to disabled.
This will help my fellow developers spot "use before initialised" problems
that hppa picks up very well.
2010-02-05 06:43:16 +00:00
njoly 0da168aed4 Switch SSP init output to aprint_debug() instead of aprint_normal()
under DIAGNOSTIC ifdefs.
2010-02-01 16:14:58 +00:00
njoly 69c8ab9322 Aprintify. 2010-02-01 12:58:04 +00:00
pooka 0f9bb09e12 Device accessors are only marginally related to autoconf, so put them
into subr_device.c instead of having them in subr_autoconf.c.

Since none of the copyrights in subr_autoconf.c really match the
history of device accessors, I took the liberty of slapping (c)
2006 TNF onto subr_device.c.
2010-01-31 15:10:11 +00:00
skrll e975ed8767 1 CTASSERT(foo) is enough for anyone. 2010-01-31 11:54:32 +00:00
martin 476c17bc5a This is using device_t, so it needs to include <sys/device.h>. 2010-01-31 09:27:40 +00:00
pooka 8288b650e5 uncommit part which wasn't supposed to get committed yet 2010-01-31 03:57:01 +00:00
pooka 3cba324816 Pass root device as a parameter to domountroothook(). 2010-01-31 02:04:43 +00:00
pooka 04b824ef52 Place *hook implementations in kern_hook.c instead of them floating
around in the kern_subr.c gruel.  Arrrrr.
2010-01-31 01:38:48 +00:00
pooka 3b780690a4 Use proper static initializers for *hooklist (currently they happened
to work accidentally anyway since the initializer is 0).
2010-01-31 00:48:07 +00:00
hubertf af120bb199 Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.
2010-01-31 00:43:37 +00:00
pooka 92394bf5f3 Trade 200-something bytes for the death of an ifdef. 2010-01-30 23:19:55 +00:00
rmind b8ea6ca48b aio_suspend1: remove wrong comment, add one new.
Tidy up a little, while here.
2010-01-30 21:23:46 +00:00
mlelstv 49be4d025a Add helper function that determines the size and block size of a disk device.
For now we query
- the disk label
- the wedge info and data from disk(9)
2010-01-30 11:57:17 +00:00
he ce1061323d On a recursive panic(), don't try to take a dump, as that may very
well have triggered the recursive panic.
Fix the comment for panic() to reflect now-current reality: the code
was already changed never to sync() on panic(), now we avoid dumping
core on a recursive panic.
2010-01-26 12:59:50 +00:00
dholland 3c82208a56 Amplify comment about ultrix bits. 2010-01-24 19:56:26 +00:00
hubertf 739e259054 Let kernel build when MALLOCLOG is defined but DIAGNOSTIC is not.
Else, hitmlog() is defined but not used, which triggers a warning.
2010-01-22 08:32:05 +00:00
pgoyette 17d5113226 Remove unnecessary call to kauth_cred_free().
This resolves an occasional crash I'd been experiencing as reported on
current-users@

Fix suggested by and OK elad@
2010-01-21 04:40:22 +00:00
rmind f6d80c92e0 pool_cache_invalidate: comment out invalidation of per-CPU caches (nobody depends
on it, at the moment) until we decide how to fix it (xcall(9) cannot be used from
interrupt context).  XXX: Perhaps implement XC_HIGHPRI.
2010-01-20 23:40:42 +00:00
pooka 654415b2b7 Get rid of last "easy" kernel symbols starting with __:
__assert -> kern_assert
__sigtimedwait1 -> sigtimedwait1
__wdstart -> wdstart1

The rest are MD and/or shared with userspace, so they will require
a little more involvement than what is available for this quick
"ride the 5.99.24 bump" action.
2010-01-19 22:28:30 +00:00
pooka f32c83c1bd Rename a few routines from _file() to _vfs() for consistency.
Ride 5.99.24 bump.
2010-01-19 22:17:44 +00:00
pooka 10fe49d72c Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client.  This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached.  However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff.  ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
2010-01-19 22:06:18 +00:00
dyoung 71080992ef A new survey of the code indicates that the very highest interrupt
priority level where the kernel accesses alldevs is IPL_VM, where
some hardware interrupt handlers call config_deactivate(9).  Lower
the IPL of alldevs_mtx from IPL_HIGH to IPL_VM, accordingly.
2010-01-19 21:54:53 +00:00
dyoung 2905b5fc8d Refactor: as suggested by rmind@, extract duplicate code into
subroutines config_alldevs_enter() and config_alldevs_exit().  This
change amounts to textual substitution.  No functional change intended.

We do not collect garbage in device_lookup(), so there is no use dumping
it: get rid of the garbage list.  Do not call config_dump_garbage().

In device_lookup_private(), call device_lookup() instead of duplicating
the code from device_lookup().
2010-01-19 21:24:36 +00:00
pooka 27d8901688 Update comment: unloaded modules which were pumped up by the
bootloader are not freed at the end of bootstrap (there should be
none, although this is not asserted.  maybe it should be?).
2010-01-19 15:23:14 +00:00
bouyer 85e9e8e2b4 Revert previous. The KASSERT() is right and my analysis is wrong,
as pointed out by pooka@.
2010-01-15 19:28:26 +00:00
pooka 07df6e2689 Fix reference counting for vfsops in mount. Otherwise it's possible
(for an unprivileged user) to force vfs modules to remain loaded
forever.  Also, it's possible for an admin with fat fingers to have
to curse out loud (a lot) and reboot.

.. or at least fix things as much as seems to be possible without
involving 1000 zorkmids.  do_sys_mount() takes either struct vfsops
(which hopefully came properly referenced) or a userspace string
for file system type.  The standard in-kernel calling convention
of "do_sys_mount(l, vfs_getopsbyname("nfs"), NULL," is not to be
considered healthy, kosher, or even tasty (although if vfs_getopsbyname()
fails the whole thing *currently* fails without the program counter
pointing to hyperspace).
2010-01-15 01:00:46 +00:00
bouyer 7ffaf66ccb Remove KASSERT(vp->v_usecount == 1) in getnewvnode() and ungetnewvnode().
Another process could be vget()ing the vnode and bump v_usecount while
getcleanvnode() is vclean()ing it (as vclean drops the interlock).
vget() will then wait for VI_XLOCK or VI_FREEING to clear; and we could test
this assertion while the other process is still sleeping. We could even
end up in ungetnewvnode() before this other process got a chance to run.
2010-01-14 22:41:52 +00:00
mrg efc854cf68 introduce a new function that returns a unique string for each cpu:
char *cpu_name(struct cpu_info *);

and use it when setting up the runq event counters, avoiding an 8 byte
kmem(4) allocation for each cpu.  there are more places the cpuname is
used that can be converted to using this new interface, but that can
and will be done as future work.

as discussed with rmind.
2010-01-13 01:57:17 +00:00
pooka 065afcb61a Minimize unnecessary differences in rump. 2010-01-13 01:53:38 +00:00
rmind 17990e0041 Revert 1.194 rev. 2010-01-12 22:11:13 +00:00
martin 693845d2c3 Add a new optional function device_register_post_config(), symmetric to
device register, called after config is done with a device.
Only used if an arch defines  __HAVE_DEVICE_REGISTER_POSTCONFIG.
2010-01-10 13:42:34 +00:00
rmind a4c32a06f6 softint_overlay: disable kernel preemption before curlwp->l_cpu use. 2010-01-09 19:02:17 +00:00
dyoung cd6e1fbf91 Expand PMF_FN_* macros. 2010-01-08 19:53:10 +00:00
pooka 113544b039 vcount() lost its purpose when opening multiple block devices was
made impossible, oh, two years ago.  nuke it (yes, the interface
name is overgeneric).
2010-01-08 13:07:26 +00:00
rmind 8431ea0b5e softint_execute: release/re-acquire kernel-lock depending on SOFTINT_MPSAFE
flag.  Keeping it held for MP-safe cases breaks the lock order assumptions.
Per discussion with <martin>.
2010-01-08 12:10:46 +00:00
rmind 97bb57c79f Simplify device G/C: use global list and config_alldevs_unlock_gc(). 2010-01-08 12:07:08 +00:00
pooka c3183f3251 The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase).  Plenty of mix'n match upper/lowercase has crept
into the tree since then.  Nuke the macros and convert all callsites
to lowercase.

no functional change
2010-01-08 11:35:07 +00:00
dyoung e94f23b742 Move all copies of ifattr_match() to sys/kern/subr_autoconf.c. 2010-01-08 00:09:44 +00:00
dyoung bb333bec6f Add a do-nothing child-detachment hook, null_childdetached(device_t,
device_t).
2010-01-07 22:39:52 +00:00
pooka 8797d86fd0 Make sure struct vattr contains no random bits of kernel memory
after vattr_null().  This is especially nice considering things
like puffs, where the contents are copied to userspace.
2010-01-07 19:54:40 +00:00
dyoung 3fec0e6fa3 Call device_lookup() from device_lookup_private() instead of
duplicating code.

Per suggestions by rmind@:

Simplify some code that used "empty statements," ";".

Don't collect garbage in device_lookup{,_private}(), since they
are called in interrupt context from certain drivers.

Make config_collect_garbage() KASSERT() that it does not run in
interrupt or software-interrupt context.
2010-01-05 22:42:16 +00:00
skrll 7fe4e16803 Regen. 2010-01-05 15:25:32 +00:00
skrll e359b65038 Check for dev_t and time_t arguments and mark them as 64bit. 2010-01-05 15:23:32 +00:00
uebayasi 80d41370e7 Use CTASSERT() for constant only assertions. 2010-01-04 16:01:42 +00:00
mlelstv c0a2fae3f5 drop __predict micro optimization in pool_init for cleaner code. 2010-01-03 09:42:22 +00:00
mlelstv 0ca557be77 Pools are created way before the pool subsystem mutexes are
initialized.

Ignore also pool_allocator_lock while the system is in cold state.

When the system has left cold state, uvm_init() should have
also initialized the pool subsystem and the mutexes are
ready to use.
2010-01-03 01:07:19 +00:00
mlelstv d5c1a554d8 Move initialization of pool_allocator_lock before its first use.
This failed on archs where a mutex isn't initialized to a zero
value.

Defer allocation of pool log to the logging action, if allocation
fails, it will be retried the next time something is logged.

Clear pool log on allocation so that ddb doesn't crash when showing
so far unused log entries.
2010-01-02 15:20:39 +00:00
tsutsui 9d6449710b Update default TOD value to 2010/01/01 12:00:00. 2010-01-02 10:57:35 +00:00
dholland a4ce70f1ad typo in comment 2010-01-01 03:22:13 +00:00
elad a0c694197e Tiny cosmetics... 2009-12-31 02:20:36 +00:00
rmind ac4dea4ab5 - nextlwp: do not set l_cpu, it should be returned correct (add assert).
- resched_cpu: avoid double set of ci.
2009-12-30 23:54:30 +00:00
rmind 65265dedb7 sched_catchlwp: fix the case when other CPU might see curlwp->l_cpu != curcpu()
while LWP is finishing context switch.  Should fix PR/42539, tested by martin@.
2009-12-30 23:49:59 +00:00
rmind ffb9a7ee3c sigactsunshare(): set reference count in a case of new sigacts allocation.
Bug (e.g. memory leak) can happen when using clone(2) call.
2009-12-30 23:31:56 +00:00
elad 097059fb23 Don't bother caching egid. It'll be removed soon. 2009-12-30 22:12:12 +00:00
elad d4b368687f Turn PA_INITIALIZED to a reference count for the pool allocator, and once
it drops to zero destroy the mutex we initialize. This fixes the problem
mentioned in

	http://mail-index.netbsd.org/tech-kern/2009/12/28/msg006727.html

Also remove pa_flags now that it's no longer needed.

Idea from matt@, okay matt@.
2009-12-30 18:57:16 +00:00
elad 149888f85d Always use resource limits from the process, as proposed in
http://mail-index.netbsd.org/tech-kern/2009/12/30/msg006756.html

okay christos@.
2009-12-30 18:33:53 +00:00
elad 7bbc644a97 Use credentials from the socket. 2009-12-30 06:58:50 +00:00
elad 4046eb056d Move the listener plugging to module_init(), as it runs after kauth_init()
now. (Leaving only the module kthread creation in module_init2().)
2009-12-29 17:49:21 +00:00
elad 841ec82ba2 Add credentials to sockets.
We don't need any deferred free etc. because we no longer free the
credentials in interrupt context.

Tons of help from matt@, thanks!
2009-12-29 04:23:43 +00:00
elad 34ce871d58 Remove commented-out code that should not have gone in. 2009-12-29 03:48:18 +00:00
elad ac90530da8 In veriexec_file_verify(), always check 'lockstate' before unlocking
'veriexec_op_lock'. Triggering a panic is possible in the path from
veriexec_openchk() (easily repeatable). The two switch cases at the
bottom of the function are going to panic anyway, but they might as well
panic as they're intended to as opposed to tripping over a locking
violation...
2009-12-28 07:16:41 +00:00
elad fa8206aeb0 Our error paths can call veriexec_file_free(), which in turn will try to
rw_destroy() the vfe lock. The easiest way to fix it for now is simply to
initialize the lock right after allocating the vfe...
2009-12-28 02:35:20 +00:00
elad 3bd7842cba Put a space after ':'... 2009-12-26 21:41:14 +00:00
elad d67e78d45a Only kmem_free() the filename if we have one. 2009-12-25 22:57:54 +00:00
elad 066035a515 Oops - unintentional locking bit that's not yet ready. 2009-12-25 20:07:18 +00:00
elad 471d0b3079 This subsystem had leftovers from the time it was part of Veriexec, and then
from when I first implemented it as "metahook."

Clean up a lot of the mess by unifying variable names, adding struct member
prefixes, adjusting comments, etc.

No functional change intended.
2009-12-25 20:05:43 +00:00
elad c2d2f61cc2 No need for these prototypes here. 2009-12-25 18:51:41 +00:00
elad 36ec4b320c When reporting open files using sysctl, don't use 'filehead' to fetch files,
as we don't have a process context to authorize on. Instead, traverse the
file descriptor table of each process -- as we already do in one case.

Introduce a "marker" we can use to mark files we've seen in an iteration, as
the same file can be referenced more than once.

Hopefully this availability of filtering by process also makes life easier
for those who are interested in implementing process "containers" etc.
2009-12-24 19:01:12 +00:00
mbalmer 1ce3f76abb Fix typo, no code change. 2009-12-23 09:23:53 +00:00
pooka 3142d3ac31 Define namei flag INRENAME and set it if a lookup operation is part
of rename.  This helps with building better asserts for rename in
the DELETE lookup ... the RENAME lookup is quite obviously a part
of rename.
2009-12-23 01:09:24 +00:00
elad 4f2529fdb9 Including sysctl.h once is enough. 2009-12-23 00:21:38 +00:00
dsl 668acfeeca Use sizeof correct type, not pointer to wrong type.
Fixes PR/42498.
This has been wrong since the initial import!
2009-12-22 20:50:46 +00:00
rmind 4fff15550a Add comment about locking. 2009-12-20 23:00:59 +00:00
mrg 9a7ae38999 remove dated and wrong comments about curlwp being NULL.
_kernel_{,un}lock() always assume it is valid now.
2009-12-20 20:42:23 +00:00
pooka f015d3c5a1 Add a pointer to an explanation of why we have #ifdef pmax stuff in here. 2009-12-20 19:06:44 +00:00
dsl 2a54322c7b If a multithreaded app closes an fd while another thread is blocked in
read/write/accept, then the expectation is that the blocked thread will
exit and the close complete.
Since only one fd is affected, but many fd can refer to the same file,
the close code can only request the fs code unblock with ERESTART.
Fixed for pipes and sockets, ERESTART will only be generated after such
a close - so there should be no change for other programs.
Also rename fo_abort() to fo_restart() (this used to be fo_drain()).
Fixes PR/26567
2009-12-20 09:36:05 +00:00
rmind 3c74cdf150 signal(9) code: add some comments, improve/fix wrong ones. While here, kill
trailing whitespaces, wrap long lines, etc.  No functional changes intended.
2009-12-20 04:49:09 +00:00
martin cecef5e6d5 Use the kernel space version of the vfs name, not the original userspace
pointer. Avoids crashes on archs with completely separate userspace VA.
2009-12-19 20:28:27 +00:00
rmind ebd0ab14ab sigtimedwait: fix a memory leak (which happens since newlock2 times).
Allocate ksiginfo on stack since it is safe and sigget() assumes that it is
not allocated from pool (pending signals via sigput()/sigget() "mill" should
be dynamically allocated, however).  Might be useful to revisit later.

Likely the cause of PR/40750 and indirect cause of PR/39283.
2009-12-19 18:25:54 +00:00
rmind 1069745866 Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions. 2009-12-17 01:25:10 +00:00
dsl bc86c9b425 Don't ERESTART write() calls for now.
I suspect some programs don't allow for the partial transfer.
2009-12-15 18:35:18 +00:00
dyoung 62f43df82a Per rmind@'s suggestion, avoid an acquire/release-mutex dance by
collecting garbage in two phases:  in the first stage, with
alldevs_mtx held, gather all of the objects to be freed onto a
list.  Drop alldevs_mtx, and in the second stage, free all the
collected objects.

Also per rmind@'s suggestion, remove KASSERT(!mutex_owned(&alldevs_mtx))
throughout, it is not useful.

Find a free unit number and allocate it for a new device_t atomically.
Before, two threads would sometimes find the same free unit number
and race to allocate it.  The loser panicked.  Now there is no
race.

In support of the changes above, extract some new subroutines that
are private to this module: config_unit_nextfree(), config_unit_alloc(),
config_devfree(), config_dump_garbage().

Delete all of the #ifdef __BROKEN_CONFIG_UNIT_USAGE code.  Only
the sun3 port still depends on __BROKEN_CONFIG_UNIT_USAGE, it's
not hard for the port to do without, and port-sun3@ had fair warning
that it was going away (>1 week, or a few years' warning, depending
how far back you look!).
2009-12-15 03:02:24 +00:00
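The two-phase collection described in 62f43df82a can be sketched roughly as follows. This is a userland illustration only, not the subr_autoconf.c code: `struct obj`, `list_mtx`, `all_objs`, `add_obj()` and `collect_garbage()` are hypothetical names, and a pthread mutex stands in for alldevs_mtx.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical list element; `dead` marks it as garbage. */
struct obj {
	struct obj *next;
	int dead;
};

static pthread_mutex_t list_mtx = PTHREAD_MUTEX_INITIALIZER;
static struct obj *all_objs;	/* shared list, protected by list_mtx */

/* Push a new object onto the shared list. */
static struct obj *
add_obj(int dead)
{
	struct obj *o = calloc(1, sizeof(*o));

	assert(o != NULL);
	o->dead = dead;
	pthread_mutex_lock(&list_mtx);
	o->next = all_objs;
	all_objs = o;
	pthread_mutex_unlock(&list_mtx);
	return o;
}

/* Returns the number of objects freed. */
static int
collect_garbage(void)
{
	struct obj *garbage = NULL, **pp, *o;
	int freed = 0;

	/* Phase 1: with the lock held, unlink dead objects onto a private list. */
	pthread_mutex_lock(&list_mtx);
	for (pp = &all_objs; (o = *pp) != NULL; ) {
		if (o->dead) {
			*pp = o->next;
			o->next = garbage;
			garbage = o;
		} else {
			pp = &o->next;
		}
	}
	pthread_mutex_unlock(&list_mtx);

	/* Phase 2: the lock is dropped; free the private list. */
	while ((o = garbage) != NULL) {
		garbage = o->next;
		free(o);
		freed++;
	}
	return freed;
}
```

The lock is taken once per collection rather than once per freed object, which is the acquire/release dance the commit message says it avoids.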
matt 15aa4c53c9 Regen (new makesyscalls.sh) 2009-12-14 00:53:32 +00:00
matt e110dba586 Merge from matt-nb5-mips64 2009-12-14 00:47:10 +00:00
dsl 723a159171 Another, better, fix for PR/26567.
Only sleep once within each pipe_read/pipe_write call.
If there is no data/space available after we wake up, return ERESTART so
that the 'fd' number is validated again.
A simple broadcast of the cvs is then enough to evict the correct threads
when close() is called from an active thread.
2009-12-13 20:02:23 +00:00
dsl e19cad8fcc Revert most of the previous change.
Only one fd needs clobbering, not all fds that reference the pipe.
This may be what ad@ realised when he tried to add the same code to
sockets. Unfixes part of PR/26567.
2009-12-13 18:27:02 +00:00
matt dfa7467a6e Pullup from matt-nb5-mips64.
For each syscall, add a flag for the return value or an argument indicating
that it is a 64-bit argument.  Also include the number of 64-bit arguments.
In theory this could get most of the code in compat/netbsd32/netbsd32_netbsd.c
but not at the moment due to multiply defined structures.
2009-12-13 04:47:45 +00:00
dsl c7517e0921 Add support for unblocking read/write when close called.
Fixes PR/26567 for pipes.
(NB ad backed out the fix for sockets)
2009-12-12 21:28:04 +00:00
dsl 9987412565 Fix comment for arg types of sys_profil(). 2009-12-12 17:48:54 +00:00
dsl ef379fcb95 Bounding the 'nfds' arg to poll() at the current process limit for actual
open files is rather gross - the poll map isn't required to be dense.
Instead limit to a much larger value (1000 + dt_nfiles) so that user
programs cannot allocate indefinite sized blocks of kvm.
If the limit is exceeded, then return EINVAL instead of silently truncating
the list.
(The silent truncation in select isn't quite as bad - although even there
any high bits that are set ought to generate an EBADF response.)
Move the code that converts ERESTART and EWOULDBLOCK into common code.
Effectively fixes PR/17507 since the new limit is unlikely to be detected.
2009-12-12 17:47:05 +00:00
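The limit described in ef379fcb95 amounts to a check along these lines. A minimal sketch, assuming a hypothetical helper `check_nfds()`; only the 1000 + dt_nfiles bound and the EINVAL result come from the commit message.

```c
#include <errno.h>

/*
 * Hypothetical helper: reject an oversized poll() list outright
 * instead of silently truncating it.  `open_files` plays the role
 * of dt_nfiles in the commit message.
 */
static int
check_nfds(unsigned long nfds, unsigned long open_files)
{
	if (nfds > 1000UL + open_files)
		return EINVAL;
	return 0;
}
```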
dsl 17a42f25f1 Report L_INMEM in the lwp info as well. 2009-12-12 17:29:34 +00:00
dsl f537a9ce5f Always set L_INMEM to maintain binary compatibility. 2009-12-12 17:03:19 +00:00
tsutsui 428585a7d8 Remove `volatile' qualifier from argument types of
struct timeval passed to todr_gettime(9) and todr_settime(9).
We no longer have an ancient and volatile struct timeval `time'
global since we have switched to MI timecounter(9) on all ports.

XXX1: some of these RTC drivers still assume 32bit time_t
XXX2: some of these should be rewritten to use todr_[gs]ettime_ymdhms()
XXX3: todr(9) man page doesn't mention todr_[gs]ettime_ymdhms()
2009-12-12 15:10:34 +00:00
tsutsui a49264523b Use bool where appropriate. 2009-12-12 11:35:16 +00:00
tsutsui efd28fda6a Don't use int to get delta of time_t values. 2009-12-12 11:28:40 +00:00
dsl eff3e2124a Avoid leaking a mutex_obj when pipe_create() fails for the read pipe.
Remove the unused argument from pipeclose().
2009-12-10 20:55:17 +00:00
matt 6a9e4e8eeb Change u_long to vaddr_t/vsize_t in exec code where appropriate (mostly
involves setregs and vmcmds).  Should result in no code differences.
2009-12-10 14:13:48 +00:00
drochner a1a04dd1be If a struct sigevent with SIGEV_SIGNAL is passed to timer_create(2),
check the signal number to be in the allowed range. An invalid
signal number could crash the kernel by overflowing the sigset_t
array.
More checks would be good, and SIGEV_THREAD shouldn't be dropped
silently, but this fixes at least the local DOS vulnerability.
2009-12-10 12:39:12 +00:00
drochner fe1db36da9 fix some security critical bugs:
-an invalid signal number passed to mq_notify(2) could crash the kernel
 on delivery -- add a boundary check
-mq_receive(2) from an empty queue crashed the kernel by NULL dereference
 in timeout calculation -- handle the NULL case
-likewise for mq_send(2) to a full queue
-a user could set mq_maxmsg (the maximal number of messages in a queue)
 to a huge value on mq_open(O_CREAT) and later use up all kernel
 memory by mq_send(2) -- add a sysctl'able limit which defaults
 to 16*mq_def_maxmsg

(mq_notify(2) should get some more checks, and SIGEV_* values other
than SIGEV_SIGNAL should be handled somehow, but this doesn't look
security critical)
2009-12-10 12:22:48 +00:00
dsl 7a42c833db Rename fo_drain() to fo_abort(), 'drain' is used to mean 'wait for output
to drain' in many places, whereas fo_drain() was called in order to force
blocking read()/write() etc calls to return to userspace so that a close()
call from a different thread can complete.
In the sockets code comment out the broken code in the inner function,
it was being called from compat code.
2009-12-09 21:32:58 +00:00
dsl 43bac9730d Correct comment, pipelock() no longer releases the mutex. 2009-12-06 20:26:55 +00:00
pooka d2445bdd09 tsleep() on lbolt is now illegal. Convert cv_wakeup(&lbolt) to
cv_broadcast(&lbolt) and get rid of the prior.
2009-12-05 22:38:19 +00:00
pooka faa8e1b3e3 Convert tsleep(&lbolt) to kpause(). Make ltsleep/mtsleep on lbolt
illegal.  I examined all places where lbolt is referenced to make
sure there were no pointer aliases of it passed to tsleep, but put a
KASSERT in m/ltsleep() just to be sure.
2009-12-05 22:34:43 +00:00
pooka debaf78619 explicitly initialize static boolean 2009-11-30 15:37:56 +00:00
pooka 051b421f3f Create CTL_HW before creating nodes on top of it (sysctl constructors
run in "random" order).
2009-11-30 11:28:35 +00:00
pooka 0fb0ab1101 Fix kernel build on platforms which define __BROKEN_CONFIG_UNIT_USAGE
and therefore don't take config_alldevs_lock() in config_devalloc().
2009-11-29 15:17:30 +00:00
dsl 454df0687b When truncating a request in bounds_check_with_mediasize() multiply
by the provided sector size instead of 512.
Fixes last bit of PR/31565
2009-11-28 22:38:07 +00:00
bouyer 8c392da154 Previous did cause a deadlock with layered FS: the vrele thread
can sleep on the vnode lock, while vget is sleeping on the
VI_INACTNOW flag (or the vget caller is looping on vget returning failure
because of the VI_INACTNOW flag). With layered FSes, the upper and lower
vnodes share the same lock, so the vget() caller above can be already
holding the vnode lock.

Fix by dropping VI_INACTNOW before sleeping on the vnode lock in
vrelel(), and check the ref count again once we have the lock. If the
vnode has more than one reference, don't VOP_INACTIVE it.
Fix PR kern/42318 and PR kern/42377
patch tested by Hisashi T Fujinaka, Joachim König, Stephen Borrill and
Matthias Scheler.
2009-11-28 10:10:17 +00:00
pooka bbc50ef41d Due to the schizophrenic nature of kobj (mem + vfs source),
split the module in twain to subr_kobj.c (master + mem) and
subr_kobj_vfs.c (vfs).
2009-11-27 17:54:11 +00:00
pooka 8102fe7341 Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.
2009-11-27 16:43:51 +00:00
pooka 8257134a74 Make this work on some m68k ports which like putting the disklabel
in the third sector (or have copypasted disklabel.h from a port
which likes doing that ;).
2009-11-27 13:29:33 +00:00
tsutsui c48b085654 u_short -> uint16_t, some KNF. 2009-11-27 11:23:50 +00:00
pooka 1798957738 Add DV_VIRTUAL for non-backed virtual devices and allow to mount
root from a DV_VIRTUAL device.
2009-11-26 20:52:19 +00:00
pooka baffc0cbae typo in comment (it actually breaks the script totally. i wish
more typos in comments were as effective)
2009-11-26 17:23:48 +00:00
pooka 91ac00ac3a pipe +RUMP 2009-11-26 17:20:20 +00:00
pooka 67ff6315cd Add rump support for the special handling required by pipe(2). 2009-11-26 17:19:54 +00:00
pooka a91020162b Instead of a single register_t as the retval of rump syscalls,
use an array of two.  No functional change ... yet.
2009-11-26 16:34:24 +00:00
pooka 024c040316 modctl +RUMP 2009-11-26 09:00:45 +00:00
matt 11af2f9cfa Kill proc0paddr. Use lwp0.l_addr instead. 2009-11-26 00:19:11 +00:00
pooka 64ab232858 make WAPBL_DEBUG_PRINT compile 2009-11-25 14:43:31 +00:00
pooka 5fc3d70195 Remove highly questionable assert which demands that the kernel symbol
table is in memory at a lower address than the string table.
2009-11-25 13:16:55 +00:00
rmind 606b1d9782 Add assert that ce->ce_func is not NULL. 2009-11-24 20:11:50 +00:00
dyoung c8fed843e1 Address some of the concerns that SPLDEBUG is not machine-independent,
Part 1 of N:

        There is not an MI ordering of interrupt priority levels,
        so use == IPL_HIGH and != IPL_HIGH instead of >= IPL_HIGH
        and < IPL_HIGH.  Ignore 'cold' and always use curcpu(),
        since cpu_info_primary is MD.

Other changes:

        There is no need to create symbols named _spldebug_* and
        strong aliases to them.  Just use symbols spldebug_*,
        instead.  Use a temporary variable instead of repeat
        cpu_index(9) calls.  KASSERT() that cpu_index(9) is <
        MAXCPUS.
2009-11-24 17:28:32 +00:00
pooka 09dbb89b44 If cpu_disklabel includes struct dkbad, define __HAVE_DISKLABEL_DKBAD.
This allows use of subr_disk_mbr on all archs.  Default to it for
the rump disk component.  No functional change for regular kernels.
(The other option would've been to include dkbad in disklabels
everywhere, but arguably this approach has less possible side-effects,
especially given that wedges and related magic will take over the
world any second now).
2009-11-23 13:40:08 +00:00
mbalmer 0ae57f90dd more s/the the/the/ 2009-11-22 19:09:15 +00:00
enami 07ab814664 Fix indentation, wrap long line and remove unused variable. 2009-11-19 03:01:05 +00:00
enami 9f91c09ebc Add missing vfs_unbusy() call in error path of sysctl_kern_vnode().
This allows us to reboot machine successfully even if pstat -v fails once.
2009-11-19 02:59:33 +00:00
pooka a8ed404de6 * make it possible to include kern_module in a kernel without vfs
support, i.e. move vfs functionality to a separate module
  (kern_module_vfs.c)
* make module proplist size an MI constant (now 8k) instead of PAGE_SIZE
* change some error values to something else than the karmic EINVAL
2009-11-18 17:40:45 +00:00
yamt d8b340409c turnstile_block: reduce code duplication. 2009-11-18 12:26:22 +00:00
yamt e8ed984955 turnstile_block: turn a comment into KASSERTs. 2009-11-18 12:25:15 +00:00
bouyer e3c6fd050a Fix getcleanvnode() in previous: in the if (vp->v_usecount != 0)
case we didn't bump the refcount, so don't decrease it through vrelel().
call mutex_exit() on v_interlock directly instead.
2009-11-17 22:20:14 +00:00
pooka 1d8a950195 Add a comment saying "name" to pool_init() is never freed (fixing
requires touching pool implementation).  No biggie, though, since
the pools themselves are never freed.
2009-11-17 14:38:31 +00:00
elad 903af42390 Include miscfs/specfs/specdev.h for spec_init(). 2009-11-15 02:37:13 +00:00
rmind 16347a5be7 kpsignal2: do not make the signal pending twice when tracing the process,
also update a comment and add an assert.  Fixes PR/42309 by Nicolas Joly.
2009-11-14 19:06:54 +00:00
elad 1570e68c40 - Move kauth_init() a little bit higher.
- Add spec_init() to authorize special device actions (and passthru too for
  the time being). Move policy out of secmodel_suser.
2009-11-14 18:36:56 +00:00
dsl e6a11930a4 Christos was worried about clrbits() being called with a length of zero.
This can't happen, but rework so it doesn't matter.
Remove 'optimisation' for length 1, that doesn't happen often enough.
2009-11-14 13:18:41 +00:00
dsl f3583ee6ce Fix clrbits() so that it doesn't mask no bits out of the byte after the
range (when the last bit to be cleared is the msb of a byte).
Fixes PR/42312 in a slightly better way than proposed.
2009-11-13 19:15:24 +00:00
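The boundary case named in f3583ee6ce (the last cleared bit being the msb of a byte) can be illustrated with a standalone bit-range clear. This is a sketch under assumptions, not the kernel's clrbits(): the name `clear_bit_range()` and the LSB-first bit numbering are chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear `len` bits starting at bit `start`; bit 0 is the LSB of map[0]. */
static void
clear_bit_range(uint8_t *map, size_t start, size_t len)
{
	assert(map != NULL);
	if (len == 0)
		return;				/* nothing to clear */

	size_t end = start + len - 1;		/* index of the last bit */
	size_t first = start / 8;		/* byte holding the first bit */
	size_t last = end / 8;			/* byte holding the last bit */
	/* Bits at or above the first bit's position within its byte. */
	uint8_t head = (uint8_t)(0xffu << (start % 8));
	/* Bits at or below the last bit's position within its byte. */
	uint8_t tail = (uint8_t)(0xffu >> (7 - end % 8));

	if (first == last) {
		map[first] &= (uint8_t)~(head & tail);
		return;
	}
	map[first] &= (uint8_t)~head;
	for (size_t i = first + 1; i < last; i++)
		map[i] = 0;
	map[last] &= (uint8_t)~tail;
}
```

Because the last byte is computed from the index of the final bit rather than from one past it, a range ending on a byte's msb never touches the following byte, and a zero length returns before any mask is built.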
dsl be258d919e Change args to clrbits() to be unsigned for efficiency. 2009-11-13 19:00:15 +00:00
dyoung 3ea78c91dc Use TAILQ_FOREACH() instead of open-coding it.
I applied this patch with Coccinelle's semantic patch tool, spatch(1).
I installed Coccinelle from pkgsrc: devel/coccinelle/.  I wrote
tailq.spatch and kdefs.h (see below) and ran this command,

spatch -debug -macro_file_builtins ./kdefs.h -outplace \
    -sp_file sys/kern/tailq.spatch sys/kern/subr_autoconf.c

which wrote the transformed source file to /tmp/subr_autoconf.c.  Then I
used indent(1) to fix the indentation.

::::::::::::::::::::
::: tailq.spatch :::
::::::::::::::::::::

@@
identifier I, N;
expression H;
statement S;
iterator name TAILQ_FOREACH;
@@

- for (I = TAILQ_FIRST(H); I != NULL; I = TAILQ_NEXT(I, N)) S
+ TAILQ_FOREACH(I, H, N) S

:::::::::::::::
::: kdefs.h :::
:::::::::::::::

#define MAXUSERS 64
#define _KERNEL
#define _KERNEL_OPT
#define i386

/*
 * Tail queue definitions.
 */
#define	_TAILQ_HEAD(name, type, qual)					\
struct name {								\
	qual type *tqh_first;		/* first element */		\
	qual type *qual *tqh_last;	/* addr of last next element */	\
}
#define TAILQ_HEAD(name, type)	_TAILQ_HEAD(name, struct type,)

#define	TAILQ_HEAD_INITIALIZER(head)					\
	{ NULL, &(head).tqh_first }

#define	_TAILQ_ENTRY(type, qual)					\
struct {								\
	qual type *tqe_next;		/* next element */		\
	qual type *qual *tqe_prev;	/* address of previous next element */\
}
#define TAILQ_ENTRY(type)	_TAILQ_ENTRY(struct type,)

#define	PMF_FN_PROTO1	pmf_qual_t
#define	PMF_FN_ARGS1	pmf_qual_t qual
#define	PMF_FN_CALL1	qual

#define	PMF_FN_PROTO	, pmf_qual_t
#define	PMF_FN_ARGS	, pmf_qual_t qual
#define	PMF_FN_CALL	, qual

#define __KERNEL_RCSID(a, b)
2009-11-12 23:16:28 +00:00
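The rewrite performed by the semantic patch in 3ea78c91dc replaces one loop form with an equivalent macro. A small userland sketch of the two forms, assuming a `<sys/queue.h>` that provides the TAILQ macros; `struct item`, `sum_open_coded()` and `sum_foreach()` are hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct item {
	int value;
	TAILQ_ENTRY(item) link;
};
TAILQ_HEAD(item_list, item);

/* Open-coded traversal: the form the semantic patch matches. */
static int
sum_open_coded(struct item_list *head)
{
	struct item *it;
	int sum = 0;

	assert(head != NULL);
	for (it = TAILQ_FIRST(head); it != NULL; it = TAILQ_NEXT(it, link))
		sum += it->value;
	return sum;
}

/* The same traversal after the rewrite. */
static int
sum_foreach(struct item_list *head)
{
	struct item *it;
	int sum = 0;

	assert(head != NULL);
	TAILQ_FOREACH(it, head, link)
		sum += it->value;
	return sum;
}
```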
dyoung 972989f5e3 Move a device-deactivation pattern that is replicated throughout
the system into config_deactivate(dev): deactivate dev and all of
its descendants.  Block all interrupts while calling each device's
activation hook, ca_activate.  Now it is possible to simplify or
to delete several device-activation hooks throughout the system.

Do not deactivate a driver while detaching it!  If the driver was
already deactivated (because of accidental/emergency removal), let
the driver cope with the knowledge that DVF_ACTIVE has been cleared.
Otherwise, let the driver access the underlying hardware (so that
it can flush caches, restore original register settings, et cetera)
until it exits its device-detachment hook.

Let multiple readers and writers simultaneously access the system's
device_t list, alldevs, from either interrupt or thread context:
postpone changing alldevs linkages and freeing autoconf device
structures until a garbage-collection phase that runs after all
readers & writers have left the list.

Give device iterators (deviter(9)) a consistent view of alldevs no
matter whether device_t's are added and deleted during iteration:
keep a global alldevs generation number.  When an iterator enters
alldevs, record the current generation number in the iterator and
increase the global number.  When a device_t is created, label it
with the current global generation number.  When a device_t is
deleted, add a second label, the current global generation number.
During iteration, compare a device_t's added- and deleted-generation
with the iterator's generation and skip a device_t that was deleted
before the iterator entered the list or added after the iterator
entered the list.

The alldevs generation number is never 0.  The garbage collector
reaps device_t's whose delete-generation number is non-zero.

Make alldevs private to sys/kern/subr_autoconf.c.  Use deviter(9)
to access it.
2009-11-12 19:10:30 +00:00
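The generation-number scheme described in 972989f5e3 can be sketched as below. This is an illustration of the rule in the commit message, not the subr_autoconf.c implementation: `struct dev`, `global_gen`, `iter_enter()`, `dev_create()`, `dev_delete()` and `dev_visible()` are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long gen_t;

struct dev {
	gen_t add_gen;	/* generation when the device was created */
	gen_t del_gen;	/* generation when deleted; 0 means still live */
};

/* Never 0, so that a del_gen of 0 can mean "not deleted". */
static gen_t global_gen = 1;

/* An iterator records the current generation, then bumps the global one. */
static gen_t
iter_enter(void)
{
	assert(global_gen != 0);
	return global_gen++;
}

static void
dev_create(struct dev *d)
{
	d->add_gen = global_gen;
	d->del_gen = 0;
}

static void
dev_delete(struct dev *d)
{
	d->del_gen = global_gen;	/* a collector would reap it later */
}

/* Should an iterator that entered at `iter_gen` see this device? */
static bool
dev_visible(const struct dev *d, gen_t iter_gen)
{
	if (d->add_gen > iter_gen)
		return false;	/* added after the iterator entered */
	if (d->del_gen != 0 && d->del_gen <= iter_gen)
		return false;	/* deleted before the iterator entered */
	return true;
}
```

A device deleted while an iterator is active stays visible to that iterator, which is why the actual unlinking has to wait for the garbage-collection phase.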
rmind ad4f42d499 workqueue_finiqueue: remove unused variable. 2009-11-11 14:54:40 +00:00
rmind 1283950019 - selcommon/pollcommon: drop redundant l argument.
- Use cached curlwp->l_fd, instead of p->p_fd.
- Inline selscan/pollscan.
2009-11-11 09:48:50 +00:00
rmind e6f025f1da Add a small comment on buffer cache locking, fix mark letter b_objlock. 2009-11-11 09:15:42 +00:00
rmind 484f70316c G/C unused breada() and bdirty(). 2009-11-11 07:22:33 +00:00
cegger 9480c51b04 Add a flags argument to pmap_kenter_pa(9).
Patch shown on tech-kern@ http://mail-index.netbsd.org/tech-kern/2009/11/04/msg006434.html
No objections.
2009-11-07 07:27:40 +00:00
pooka 1dac1a8cbc g/c M_SOFTINTR 2009-11-06 13:32:41 +00:00
dyoung fbe2bb0ace Use deviter(9) instead of accessing alldevs directly. 2009-11-05 18:07:19 +00:00
pooka 11b02a2b55 Excommunicate comment not abiding to the 80col dogma.
(well, turns out it was no longer valid either)
2009-11-05 16:15:51 +00:00
pooka 35a75982e4 expose module_{lookup,enqueue}() 2009-11-05 14:09:14 +00:00
bouyer 6b8161200e getcleanvnode(): don't vclean() the vnode if it has gained another
reference while we were getting the v_interlock.
vget(): attempt to prevent it from returning a clean vnode:
  if the vnode is being inactivated (by vrelel()), wait for
  vrelel() to complete (or return EBUSY if we can't wait), and return
  ENOENT if the vnode has been vclean'ed by vrelel()
Fix kern/41147 in a better way, hopefully fix other related race conditions.
2009-11-05 08:18:02 +00:00
rmind 4c1098f541 do_sys_wait(): fix previous by checking for ru != NULL. Noticed by
Onno van der Linden.  Also, remove redundant arguments (seems that
was_zombie was not used since rev 1.177 ?).
2009-11-04 21:23:02 +00:00
pooka fcc20a4ba1 Split uiomove() and high-level copy routines out of the crowded
kern_subr and into their own cozy home in subr_copy.
2009-11-04 16:54:00 +00:00
pooka ab72032a6c nuke unused local variable 2009-11-04 15:35:09 +00:00
pooka 83685e650c Heave-ho mutex/rwlock object routines into separate modules -- they
don't have anything to do with the lock internals.
2009-11-04 13:29:45 +00:00
dyoung e48f8429d1 Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs.  SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().
2009-11-03 05:23:27 +00:00
dyoung 648f423c6f Make lockdebug_lock_print(NULL, ...) dump all locks. Now, in ddb,
'show lock 0x0' dumps all of the locks.

XXX I still need to fix 'show all lock'.
2009-11-03 00:29:11 +00:00
rmind b9a294cf04 - Move inittimeleft() and gettimeleft() to subr_time.c, where they belong.
- Move abstimeout2timo() there too and export.  Use it in lwp_park().
2009-11-01 21:46:09 +00:00
rmind 1ceff942e5 Move common logic in selcommon() and pollcommon() into sel_do_scan().
Avoids code duplication.  XXX: pollsock() should be converted too, except
it's a bit ugly.
2009-11-01 21:14:21 +00:00
rmind 1ff7612225 do_sys_wait: clear rusage, instead of returning garbage. Patch from
dholland@ via PR/40717, with minor change by me.
2009-11-01 21:05:30 +00:00
rmind 5ccbe1e208 orphanpg: remove no longer used variable. 2009-11-01 20:59:24 +00:00
njoly b83467c466 Make flock(2) more robust to invalid operation, such as
(LOCK_EX|LOCK_SH).
2009-10-28 18:24:44 +00:00
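One way to reject an operation such as (LOCK_EX|LOCK_SH), as b83467c466 describes, is to accept only a single lock type once LOCK_NB is masked off. A sketch under that assumption; `check_flock_op()` is a hypothetical helper and may not match how the kernel validates the argument.

```c
#include <errno.h>
#include <sys/file.h>

/*
 * Hypothetical validation: after masking off LOCK_NB, exactly one of
 * LOCK_SH, LOCK_EX or LOCK_UN must remain.
 */
static int
check_flock_op(int how)
{
	switch (how & ~LOCK_NB) {
	case LOCK_SH:
	case LOCK_EX:
	case LOCK_UN:
		return 0;
	default:
		return EINVAL;
	}
}
```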
rmind e4be2748a3 - Amend fd_hold() to take an argument and add assert (reflects two cases,
fork1() and the rest, e.g. kthread_create(), when creating from lwp0).

- lwp_create(): do not touch filedesc internals, use fd_hold().
2009-10-27 02:58:28 +00:00
rmind 0ca6708c13 - Use pool(9) for pmf_event_workitem_t, instead of pool_cache(9). Still,
meta-data of this pool takes more space than the actual data..

- Reduce lowat/hiwat to 1..8, since intensity is very low.

- Remove unused pew_next_free from pmf_event_workitem_t.
2009-10-27 02:55:07 +00:00
rmind c32b625d4c Update comment about proc0_init(). 2009-10-26 19:03:17 +00:00
rmind 554a0142dc Initialise struct emul members by name (it is readable now and one can search
them in the tree).
2009-10-25 01:14:03 +00:00