Commit Graph

440 Commits

Author SHA1 Message Date
rmind f76667381c - Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().
2012-01-29 22:55:40 +00:00
christos dd23e71080 Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid definining them in 10 different places if not needed.
2012-01-24 20:03:36 +00:00
jym 926571dfa7 Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
            result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.
2011-12-04 19:24:58 +00:00
tls 3afd44cf08 First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>.  This change includes
the following:

	An initial cleanup and minor reorganization of the entropy pool
	code in sys/dev/rnd.c and sys/dev/rndpool.c.  Several bugs are
	fixed.  Some effort is made to accumulate entropy more quickly at
	boot time.

	A generic interface, "rndsink", is added, for stream generators to
	request that they be re-keyed with good quality entropy from the pool
	as soon as it is available.

	The arc4random()/arc4randbytes() implementation in libkern is
	adjusted to use the rndsink interface for rekeying, which helps
	address the problem of low-quality keys at boot time.

	An implementation of the FIPS 140-2 statistical tests for random
	number generator quality is provided (libkern/rngtest.c).  This
	is based on Greg Rose's implementation from Qualcomm.

	A new random stream generator, nist_ctr_drbg, is provided.  It is
	based on an implementation of the NIST SP800-90 CTR_DRBG by
	Henric Jungheim.  This generator users AES in a modified counter
	mode to generate a backtracking-resistant random stream.

	An abstraction layer, "cprng", is provided for in-kernel consumers
	of randomness.  The arc4random/arc4randbytes API is deprecated for
	in-kernel use.  It is replaced by "cprng_strong".  The current
	cprng_fast implementation wraps the existing arc4random
	implementation.  The current cprng_strong implementation wraps the
	new CTR_DRBG implementation.  Both interfaces are rekeyed from
	the entropy pool automatically at intervals justifiable from best
	current cryptographic practice.

	In some quick tests, cprng_fast() is about the same speed as
	the old arc4randbytes(), and cprng_strong() is about 20% faster
	than rnd_extract_data().  Performance is expected to improve.

	The AES code in src/crypto/rijndael is no longer an optional
	kernel component, as it is required by cprng_strong, which is
	not an optional kernel component.

	The entropy pool output is subjected to the rngtest tests at
	startup time; if it fails, the system will reboot.  There is
	approximately a 3/10000 chance of a false positive from these
	tests.  Entropy pool _input_ from hardware random numbers is
	subjected to the rngtest tests at attach time, as well as the
	FIPS continuous-output test, to detect bad or stuck hardware
	RNGs; if any are detected, they are detached, but the system
	continues to run.

	A problem with rndctl(8) is fixed -- datastructures with
	pointers in arrays are no longer passed to userspace (this
	was not a security problem, but rather a major issue for
	compat32).  A new kernel will require a new rndctl.

	The sysctl kern.arandom() and kern.urandom() nodes are hooked
	up to the new generators, but the /dev/*random pseudodevices
	are not, yet.

	Manual pages for the new kernel interfaces are forthcoming.
2011-11-19 22:51:18 +00:00
jruoho 8f43552059 Initialize cpufreq(9) normally from main(). 2011-09-28 15:52:47 +00:00
rmind 52b220e91d Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot.  MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@
2011-08-07 13:33:01 +00:00
christos 44968cba76 Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.
2011-07-30 17:01:04 +00:00
bouyer 1f1772a97c Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately cause a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where he socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.
2011-07-02 17:53:50 +00:00
rmind e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
dyoung 0840f9ccfc Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.
2011-05-31 23:28:52 +00:00
uebayasi dcf649145a Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.
From jmmv@, no objections seen in the proposed thread:

	http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html
2011-05-26 04:25:26 +00:00
rmind 95ea6d26ab Re-implement kthread_join(9), so that it actually works (hi haad@). 2011-05-19 03:07:29 +00:00
yamt 6809106b9e comment 2011-04-14 16:20:52 +00:00
pooka dd7a40671a Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes).  This makes them
usable in a rump kernel, in case somebody was wondering.
2011-01-28 18:44:44 +00:00
matt 962f7d2435 Move up evcnt_init to before cpu_startup() 2011-01-18 08:18:43 +00:00
uebayasi 9d567f003d Include internal definitions (uvm/uvm.h) only where necessary. 2011-01-17 07:13:31 +00:00
eeh a94c765810 ubc_init needs to run while we're still cold or uvmhist breaks. 2010-12-16 00:42:22 +00:00
pgoyette 4a743ad47d Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex.  Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex.  Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !
2010-08-21 13:17:31 +00:00
pgoyette 224f73d8d9 1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
   MODFLG_MUST_FORCE.  If this flag is set and the entry is on the list
   of builtins, it means that the module has been explicitly unloaded
   and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
   module_require_force() method to set this flag;  once set, it should
   never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
   more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
   MODFLG_MUST_FORCE flag for any module that has not yet successfully
   been initialized.  Call it after module_init_class(MODULE_CLASS_ANY)
   to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented.  No
additional discussion for last week.  Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.
2010-06-26 07:23:57 +00:00
tsutsui 9aa0a261b5 Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.
2010-06-25 15:10:42 +00:00
pooka 51542301d2 lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c.  This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).
2010-06-10 20:54:53 +00:00
pooka 34244e1069 Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)
2010-04-21 16:51:24 +00:00
cegger 8e585686fb fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialzed by exec_init(1).
Call exec_init(1) before seminit().
2010-02-05 11:06:36 +00:00
pooka 8288b650e5 uncommit part which wasn't supposed to get committed yet 2010-01-31 03:57:01 +00:00
pooka 3cba324816 Pass root device as a parameter to domountroothook(). 2010-01-31 02:04:43 +00:00
hubertf af120bb199 Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.
2010-01-31 00:43:37 +00:00
pooka 10fe49d72c Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client.  This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached.  However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff.  ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
2010-01-19 22:06:18 +00:00
elad 4f2529fdb9 Including sysctl.h once is enough. 2009-12-23 00:21:38 +00:00
rmind 1069745866 Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions. 2009-12-17 01:25:10 +00:00
pooka 8102fe7341 Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.
2009-11-27 16:43:51 +00:00
elad 903af42390 Include miscfs/specfs/specdev.h for spec_init(). 2009-11-15 02:37:13 +00:00
elad 1570e68c40 - Move kauth_init() a little bit higher.
- Add spec_init() to authorize special device actions (and passthru too for
  the time being). Move policy out of secmodel_suser.
2009-11-14 18:36:56 +00:00
dyoung e48f8429d1 Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs.  SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().
2009-11-03 05:23:27 +00:00
rmind c32b625d4c Update comment about proc0_init(). 2009-10-26 19:03:17 +00:00
elad 2cb56be586 Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.
2009-10-06 21:07:05 +00:00
elad b2f3768346 - Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
  in sys_sched.c where we (a) initialize the sysctl node (no more
  link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.
2009-10-03 22:32:56 +00:00
elad 2ae3a70827 Move ptrace's security policy back to the subsystem itself.
Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().
2009-10-02 22:18:56 +00:00
elad 53ca19a3b3 First part of secmodel cleanup and other misc. changes:
- Separate the suser part of the bsd44 secmodel into its own secmodel
    and directory, pending even more cleanups. For revision history
    purposes, the original location of the files was

        src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
        src/sys/secmodel/bsd44/suser.h

  - Add a man-page for secmodel_suser(9) and update the one for
    secmodel_bsd44(9).

  - Add a "secmodel" module class and use it. Userland program and
    documentation updated.

  - Manage secmodel count (nsecmodels) through the module framework.
    This eliminates the need for secmodel_{,de}register() calls in
    secmodel code.

  - Prepare for secmodel modularization by adding relevant module bits.
    The secmodels don't allow auto unload. The bsd44 secmodel depends
    on the suser and securelevel secmodels. The overlay secmodel depends
    on the bsd44 secmodel. As the module class is only cosmetic, and to
    prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
    "secmodel_".

  - Adapt the overlay secmodel to recent changes (mainly vnode scope).

  - Stop using link-sets for the sysctl node(s) creation.

  - Keep sysctl variables under nodes of their relevant secmodels. In
    other words, don't create duplicates for the suser/securelevel
    secmodels under the bsd44 secmodel, as the latter is merely used
    for "grouping".

  - For the suser and securelevel secmodels, "advertise presence" in
    relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

  - Get rid of the LKM preprocessor stuff.

  - As secmodels are now modules, there's no need for an explicit call
    to secmodel_start(); it's handled by the module framework. That
    said, the module framework was adjusted to properly load secmodels
    early during system startup.

  - Adapt rump to changes: Instead of using empty stubs for securelevel,
    simply use the suser secmodel. Also replace secmodel_start() with a
    call to secmodel_suser_start().

  - 5.99.20.

Testing was done on i386 ("release" build). Spearated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

	http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html
2009-10-02 18:50:12 +00:00
dyoung e533051d0f #include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.
2009-09-29 22:40:15 +00:00
pooka 9b040bc3a9 Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.
2009-09-21 12:14:46 +00:00
pooka 11281f01a0 Replace a large number of link set based sysctl node creations with
calls from subsystem constructors.  Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL
2009-09-16 15:23:04 +00:00
pooka fbd53556dc Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor.  For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation.  This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL
2009-09-13 18:45:10 +00:00
pooka 5e46a7c29a Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherially related to the autoconf subsystem
and more related to boot initialization.  Also, apply _KERNEL_OPT
to autoconf where necessary.
2009-09-03 15:20:08 +00:00
pooka 5523d7f5c9 Initialize devsw (lock) early so that subsystems may play with it. 2009-09-02 08:07:05 +00:00
mbalmer 9d8b69b23a Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!
2009-07-27 17:40:57 +00:00
mbalmer 953ebaaf3d Allow gpiosim(4) to attach if configured in the kernel configuration. 2009-07-25 16:23:39 +00:00
yamt 0436400c70 set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.
2009-07-19 10:11:55 +00:00
rmind 7512d1e720 Make POSIX message queues a kernel module. 2009-07-19 02:50:44 +00:00
ad 5c5bb856e1 Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.
2009-07-17 23:31:51 +00:00
dholland effcf1af5c Convert 67 namei call sites to use namei_simple, in these functions:
check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.
2009-06-29 05:08:15 +00:00