1935 Commits

Author SHA1 Message Date
martin
13a218af47 Recently uvm_page_locked_p() leaked outside of uvm/MD code, so rump
needs to provide one.
2012-05-23 14:59:21 +00:00
dholland
b4e2a66cb4 Revert previous. It seems that some or all makefiles in tests/ do not
bother to set DPADD and thereby fail silently on library changes.
2012-05-13 09:42:36 +00:00
dholland
c0f1048093 quota1_subr.c and vfs_quotactl.c are not needed here any more. 2012-05-13 06:12:43 +00:00
riastradh
aeadee1d6d Adapt ffs, lfs, and ext2fs to use genfs_rename.
ok dholland, rmind
2012-05-09 00:21:17 +00:00
riastradh
aff071a220 Adapt tmpfs_rename to use genfs_rename. 2012-05-09 00:16:07 +00:00
riastradh
5ecfdf8dea Implement a genfs_rename abstraction.
First major step in incrementally adapting all the file systems to a
saner rename VOP protocol.
2012-05-08 23:53:26 +00:00
martin
a8e0448e41 Revert previous and add a comment - I misunderstood what this code is
emulating.
2012-05-06 16:58:31 +00:00
martin
6dfd42c24f If we are not delivering a host iso file (USE_TOSI_ISO is undefined), use
-1 as file descriptor initially. The -2 value confused a few other checks
later and led to inconsistent "media present" reports.
2012-05-06 16:33:02 +00:00
dsl
e05eb71de5 Remove everything to do with 'struct malloc_type' and the malloc link_set.
To make code in 'external' (etc) still compile, MALLOC_DECLARE() still
  has to generate something of type 'struct malloc_type *', with
  normal optimisation gcc generates a compile-time 0.
MALLOC_DEFINE() and friends have no effect.
Fix one or two places where the code would no longer compile.
2012-04-29 20:27:31 +00:00
dsl
dbd0815551 Remove the unused 'struct malloc_type' args to kern_malloc/realloc/free
The M_xxx arg is left on the calls to malloc() and free(),
  maybe they could be converted to an enumeration and just saved in
  the malloc header (for deep diag use).
Remove the malloc_type from mbuf extension.
Fixes rump build as well.
Welcome to 6.99.6
2012-04-29 16:36:53 +00:00
rmind
e206fc57e3 Fix RUMP build. 2012-04-29 14:00:15 +00:00
rmind
911cbc2790 G/C kern_malloc_stdtype.c 2012-04-29 02:29:41 +00:00
stacktic
645f62c493 Fixed build with locks_up.c 2012-04-28 18:04:02 +00:00
rmind
16bec229c7 Update rumpdev_npf; use WARNS=4. 2012-04-14 19:01:21 +00:00
rmind
9cd8b15e05 rumpnet_net: add pfil.c 2012-04-14 18:26:31 +00:00
gson
92d7381de1 Fix cut-and-paste-os in panic messages 2012-04-10 13:45:07 +00:00
njoly
1976a9c7db Do not ignore kauth errors when setting file flags. 2012-03-30 18:09:12 +00:00
njoly
d147c93527 Use the appropriate vop_*_args structures. 2012-03-22 22:48:56 +00:00
hannken
f9e0cf816f Don't take a mutex we already took 6 lines above. 2012-03-17 17:58:38 +00:00
njoly
a01e8f8ae6 Use VOP va_vaflags attribute for genfs_can_chtimes(), not rumpfs node
one.
2012-03-15 12:42:28 +00:00
joerg
66dd2755f5 Add __printflike attribution to use vprintf and friends with an argument
as format string.
2012-03-15 02:02:20 +00:00
elad
0c9d8d15c9 Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

    http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
    http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
    http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.
2012-03-13 18:40:26 +00:00
gson
d26e197d58 Fix obvious cut-and-paste-o in error message string 2012-03-11 13:14:04 +00:00
joerg
99c3eea80c P1003_1B_SEMAPHORE is no longer optional. 2012-03-10 21:51:48 +00:00
joerg
4acff4c01b Implement sem_timedwait. 2012-03-08 21:59:24 +00:00
para
051f37f320 adjust rump for static pool_cache count
should have gone in with subr_vmem 1.73
2012-03-05 13:43:56 +00:00
mrg
c1d02dab7b add a _kernel_locked_p(). 2012-02-20 22:35:14 +00:00
rmind
ad12c77015 Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.
2012-02-19 21:05:51 +00:00
martin
a94c36f0b3 Adapt to constification in sys/uvm/uvm_export.h 2012-02-19 09:19:41 +00:00
martin
8cd221e226 Regen for posix_spawn 2012-02-11 23:18:13 +00:00
para
fa6083dc6c make acorn26 compile by fixing up subpage pool allocations
ok: riz@
2012-02-04 22:11:42 +00:00
njoly
54f5b36db5 Now that rnd is not optional anymore, add needed rnd_init() for rump.
Fix dev/{scsipi,sysmon} testcases.
2012-02-04 10:02:25 +00:00
tls
7b0b7dedd9 Entropy-pool implementation move and cleanup.
1) Move core entropy-pool code and source/sink/sample management code
   to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
   source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
   avoid expensive operations on disabled entropy sources; make the
   rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
   have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
   system events, and skew between clocks, with a sample implementation
   for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files).  Tested with release
builds on amd64 and evbarm and live testing on amd64.
2012-02-02 19:42:57 +00:00
dholland
56cf2d9a90 Regen syscalls with proper id info. 2012-02-01 05:42:17 +00:00
dholland
59b296daa7 Change the syscall API for quotas over to the new non-proplib one.
- struct vfs_quotactl_args -> struct quotactl_args
   - add sys/stdint.h to sys/quotactl.h for clean userland build
   - install sys/quotactl.h in /usr/include
   - update set lists for same
   - add new marshalling code in libquota
   - add new unmarshalling code in vfs_syscalls.c
   - discard proplib interpreter code in vfs_quotactl.c
   - add dispatching code for the 14 quotactl ops in vfs_quotactl.c
   - mark the proplib quotactl syscall obsolete
   - add a new syscall number for the new quotactl syscall
   - change the name of the syscall to __quotactl()
   - remove the decl of the old quotactl from quota/quotaprop.h
   - add a decl of the new quotactl to sys/quotactl.h
   - update the libc build
   - update ktruss
   - remove proplib marshalling code from libquota
   - update copy of syscall table in gdb ppc sources
   - hack rumphijack to accommodate new quotactl name (as I recall,
     pooka wanted such a name change to simplify something, but I
     don't really see what/how)

This change appears to require a kernel version bump for rumpish
reasons.
2012-02-01 05:34:38 +00:00
njoly
d04e1e754f Check credentials when setting uid, gid or mode attributes. 2012-01-31 19:00:03 +00:00
njoly
b702a1d739 Add permissions support to rump_vop_access(), to be used by
rump_vop_lookup().
2012-01-30 16:17:14 +00:00
njoly
e334a4ccda Move pool subsystem init from rump__init() to uvm_init(), following
kernel code. Fix RUMP_LOCKDEBUG early panic.
2012-01-29 14:57:31 +00:00
dholland
749c2c6e19 Add vfs_quotactl.c. This is where filesystem-independent quota
handling will go.
2012-01-29 06:26:54 +00:00
rmind
9a80847a9f Remove obsolete ltsleep(9) and wakeup_one(9). 2012-01-28 12:22:33 +00:00
para
e62ee4d475 extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged
2012-01-27 19:48:38 +00:00
apb
4309071985 Put the path to the compat/common directory in a .PATH line, not in
an element of the SRCS list.  This should fix a problem in which build
products were created in the source tree.

Also add a comment about where COMPAT_50 is defined.
2011-12-20 17:09:04 +00:00
apb
e48fd3a0e7 SRCS += ${.CURDIR}/../../../../compat/common/rndpseudo_50.c
to fix build errors like this:

DESTDIR/usr/lib/librumpdev_rnd.so: undefined reference to
`rumpns_compat_50_rnd_ioctl'
2011-12-19 21:56:18 +00:00
tls
6e1dd068e9 Separate /dev/random pseudodevice implementation from kernel entropy pool
implementation.  Rewrite pseudodevice code to use cprng_strong(9).

The new pseudodevice is cloning, so each caller gets bits from a stream
generated with its own key.  Users of /dev/urandom get their generators
keyed on a "best effort" basis -- the kernel will rekey generators
whenever the entropy pool hits the high water mark -- while users of
/dev/random get their generators rekeyed every time key-length bits
are output.

The underlying cprng_strong API can use AES-256 or AES-128, but we use
AES-128 because of concerns about related-key attacks on AES-256.  This
improves performance (and reduces entropy pool depletion) significantly
for users of /dev/urandom but does cause users of /dev/random to rekey
twice as often.

Also fixes various bugs (including some missing locking and a reseed-counter
overflow in the CTR_DRBG code) found while testing this.

For long reads, this generator is approximately 20 times as fast as the
old generator (dd with bs=64K yields 53MB/sec on 2GHz Core2 instead of
2.5MB/sec) and also uses a separate mutex per instance so concurrency
is greatly improved.  For reads of typical key sizes for modern
cryptosystems (16-32 bytes) performance is about the same as the old
code: a little better for 32 bytes, a little worse for 16 bytes.
2011-12-17 20:05:38 +00:00
njoly
973e485533 Start making fs read(2) fail with EISDIR if the implementation does
not allow read on directories (kernfs, rumpfs, ptyfs and sysvbfs).
Adjust man page accordingly, and add a small corresponding vfs
testcase.
2011-12-12 19:11:21 +00:00
njoly
e01a9cd0c9 Remove the unneeded rump component; the library already includes the
module code that will be initialised by rump.

Fix PR/44708, t_zpool:create test failure for RUMP_LOCKDEBUG=yes
builds.
2011-12-06 18:12:25 +00:00
njoly
eb308a41e8 Do not protect wrong KASSERT by LOCKDEBUG ifdef/endif, the latter uses
its own mechanism. Kill them both.
From discussion with pooka@.
2011-12-06 18:04:31 +00:00
jym
926571dfa7 Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
            result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.
2011-12-04 19:24:58 +00:00
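
The securelevel restrictions listed in commit 926571dfa7 can be sketched
as two small checks. This is only an illustration of the stated rules,
not the secmodel_extensions(9) code: the function names, the boolean
"securelevel_above_0" argument (standing in for the result of the
"is-securelevel-above" evaluation) and the choice of EPERM as the error
are all assumptions.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical check: curtain can be enabled at any time, but cannot
 * be disabled once securelevel is above 0.
 */
static int
check_curtain_change(bool oldval, bool newval, bool securelevel_above_0)
{
	if (oldval && !newval && securelevel_above_0)
		return EPERM;	/* assumed error code */
	return 0;
}

/*
 * Hypothetical check: usermount (and user_set_cpu_affinity) can be
 * disabled at any time, but cannot be enabled once securelevel is
 * above 0.
 */
static int
check_usermount_change(bool oldval, bool newval, bool securelevel_above_0)
{
	if (!oldval && newval && securelevel_above_0)
		return EPERM;	/* assumed error code */
	return 0;
}
```

The two checks are mirror images: in both cases only the transition
towards the less restrictive setting is refused.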
tls
42653868f7 Initialize the kern_cprng in rump startup. Oops.
Should fix some "mysterious" rump test failures.  Thanks to Nicholas Joly
for pointing out exactly what was wrong.
2011-12-01 19:15:15 +00:00
tls
f27d6532f5 Remove arc4random() and arc4randbytes() from the kernel API. Replace
arc4random() hacks in rump with stubs that call the host arc4random() to
get numbers that are hopefully actually random (arc4random() keyed with
stack junk is not).  This should fix some of the currently failing anita
tests -- we should no longer generate duplicate "random" MAC addresses in
the test environment.
2011-11-28 08:05:05 +00:00
tsutsui
1895e14ada Revert "stopcap fix" for rump by christos, which causes build failure on
most non-x86 ports and seems unnecessary. (caused by wrong rump_namei.h?)
2011-11-27 00:38:12 +00:00
njoly
b9701adc0f Do not call cprng_fast32() before locks init. Makes rump build with
RUMP_LOCKDEBUG=yes work again.
2011-11-26 21:41:02 +00:00
christos
f537702e26 Add subr_open_disk.c for getdiskinfo(). Once we get rid of getdiskinfo,
this will not be needed.
2011-11-25 17:54:15 +00:00
dholland
49ef74b8de Regen. 2011-11-25 16:52:47 +00:00
tsutsui
73f3cca235 No need to include MD <machine/cpu_counter.h> here. 2011-11-21 13:42:37 +00:00
tls
3afd44cf08 First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>.  This change includes
the following:

	An initial cleanup and minor reorganization of the entropy pool
	code in sys/dev/rnd.c and sys/dev/rndpool.c.  Several bugs are
	fixed.  Some effort is made to accumulate entropy more quickly at
	boot time.

	A generic interface, "rndsink", is added, for stream generators to
	request that they be re-keyed with good quality entropy from the pool
	as soon as it is available.

	The arc4random()/arc4randbytes() implementation in libkern is
	adjusted to use the rndsink interface for rekeying, which helps
	address the problem of low-quality keys at boot time.

	An implementation of the FIPS 140-2 statistical tests for random
	number generator quality is provided (libkern/rngtest.c).  This
	is based on Greg Rose's implementation from Qualcomm.

	A new random stream generator, nist_ctr_drbg, is provided.  It is
	based on an implementation of the NIST SP800-90 CTR_DRBG by
	Henric Jungheim.  This generator uses AES in a modified counter
	mode to generate a backtracking-resistant random stream.

	An abstraction layer, "cprng", is provided for in-kernel consumers
	of randomness.  The arc4random/arc4randbytes API is deprecated for
	in-kernel use.  It is replaced by "cprng_strong".  The current
	cprng_fast implementation wraps the existing arc4random
	implementation.  The current cprng_strong implementation wraps the
	new CTR_DRBG implementation.  Both interfaces are rekeyed from
	the entropy pool automatically at intervals justifiable from best
	current cryptographic practice.

	In some quick tests, cprng_fast() is about the same speed as
	the old arc4randbytes(), and cprng_strong() is about 20% faster
	than rnd_extract_data().  Performance is expected to improve.

	The AES code in src/crypto/rijndael is no longer an optional
	kernel component, as it is required by cprng_strong, which is
	not an optional kernel component.

	The entropy pool output is subjected to the rngtest tests at
	startup time; if it fails, the system will reboot.  There is
	approximately a 3/10000 chance of a false positive from these
	tests.  Entropy pool _input_ from hardware random numbers is
	subjected to the rngtest tests at attach time, as well as the
	FIPS continuous-output test, to detect bad or stuck hardware
	RNGs; if any are detected, they are detached, but the system
	continues to run.

	A problem with rndctl(8) is fixed -- datastructures with
	pointers in arrays are no longer passed to userspace (this
	was not a security problem, but rather a major issue for
	compat32).  A new kernel will require a new rndctl.

	The sysctl kern.arandom() and kern.urandom() nodes are hooked
	up to the new generators, but the /dev/*random pseudodevices
	are not, yet.

	Manual pages for the new kernel interfaces are forthcoming.
2011-11-19 22:51:18 +00:00
yamt
1cc677b523 fix a type in a printf message 2011-10-31 13:25:21 +00:00
yamt
8b0a34508b replace a non us-ascii character in a comment 2011-10-31 13:23:55 +00:00
yamt
432b5ec94f comment 2011-10-31 13:17:22 +00:00
mbalmer
54ac94cda5 Underscores are sometimes overrated. 2011-09-27 14:24:52 +00:00
christos
1094105849 fix confusion between MAXPATHLEN and MAXNAMLEN 2011-09-27 13:53:26 +00:00
christos
a6015585d7 use RUMPFS_MAXNAMLEN consistently. 2011-09-27 01:45:04 +00:00
christos
44bf8904fe define RUMPFS_MAXNAMLEN and use it. 2011-09-27 01:25:32 +00:00
christos
4f78ec2aec add rfc6056.c 2011-09-24 21:11:23 +00:00
dyoung
78b0e18345 Report vmem(9) errors out-of-band so that we can use vmem(9) to manage
ranges that include the least and the greatest vmem_addr_t.  Update
vmem(9) uses throughout the kernel.  Slightly expand on the tests in
subr_vmem.c, which still pass.  I've been running a kernel with this
patch without any trouble.
2011-09-02 22:25:08 +00:00
christos
f05fc604ec trylockowner is not needed anymore. 2011-09-02 10:18:38 +00:00
christos
158bd4bab3 fix the build for rumpserver. 2011-09-01 21:09:07 +00:00
joerg
9eba1e423c Use __dead 2011-08-29 20:41:06 +00:00
dyoung
6c6cb72d7f Use VMEM_ADDR_MIN and VMEM_ADDR_MAX. 2011-08-25 15:14:19 +00:00
dyoung
64311e1f9d Introduce a couple of new constants, VMEM_ADDR_MIN (the least possible
address in a vmem(9) arena, 0) and VMEM_ADDR_MAX (the maximum possible
address, currently 0xFFFFFFFF).  Modify several boundary conditions so
that a vmem(9) arena can allocate ranges including VMEM_ADDR_MAX.
Update documentation and tests.

These changes pass the tests in sys/kern/subr_vmem.c.  To compile and
run the test program, run "cd sys/kern/ && gcc -DVMEM_SANITY -o
subr_vmem ./subr_vmem.c && ./subr_vmem".
2011-08-23 22:00:57 +00:00
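
The kind of boundary condition commit 64311e1f9d refers to can be
illustrated as follows. This is a minimal sketch, not code from
subr_vmem.c: "range_fits" is a hypothetical helper, and the 32-bit
address type is chosen only to match the 0xFFFFFFFF value quoted above.
The point is that a range ending exactly at the maximum address must be
tested without computing start + size, which would wrap around to 0.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does the range [start, start + size - 1] lie
 * entirely inside the inclusive interval [min, max]?
 */
static bool
range_fits(uint32_t start, uint32_t size, uint32_t min, uint32_t max)
{
	if (size == 0 || start < min || start > max)
		return false;
	/*
	 * Compare lengths rather than end addresses: max - start cannot
	 * underflow here because start <= max, and size - 1 cannot
	 * underflow because size != 0.
	 */
	return size - 1 <= max - start;
}
```

A naive test such as "start + size <= max + 1" overflows on both sides
when max is 0xFFFFFFFF and would reject every range.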
hannken
fc2e6c60c4 When consuming only part of a path in rump_vop_lookup():
- Make sure to consume complete path components.
- Consume trailing slashes too.
- Do not clear REQUIREDIR.

Test rump/modautoload/t_modautoload now passes.
2011-08-23 07:40:32 +00:00
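
The first two rules in commit fc2e6c60c4 (consume whole components, and
swallow trailing slashes) can be sketched in isolation. This is not the
rump_vop_lookup() code: "consume_components" is a hypothetical helper
that takes the number of bytes a partial match covered and returns how
many bytes should actually be consumed, or 0 if the match stops in the
middle of a component. The REQUIREDIR handling is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: 'matched' bytes at the start of 'path' were
 * matched.  Accept the match only if it ends on a component boundary,
 * then also consume any slashes that follow it.
 */
static size_t
consume_components(const char *path, size_t matched)
{
	size_t len;

	assert(path != NULL);
	len = strlen(path);
	if (matched == 0 || matched > len)
		return 0;
	/* Boundary: end of string, a following '/', or a trailing '/'. */
	if (path[matched] != '\0' && path[matched] != '/' &&
	    path[matched - 1] != '/')
		return 0;
	/* Consume the slashes after the component as well. */
	while (path[matched] == '/')
		matched++;
	return matched;
}
```

For example, matching "etc" against "etc/hosts" consumes four bytes
(the component plus its slash), while matching only "et" consumes
nothing.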
manu
c817bc5d19 regen 2011-08-08 12:17:27 +00:00
rmind
a0ffc02ab8 Rename slightly misleading KTHREAD_JOINABLE to KTHREAD_MUSTJOIN. 2011-08-07 14:03:15 +00:00
hannken
2a24cc6572 Allow removal of a directory containing only whiteouts and free them first. 2011-08-07 05:56:32 +00:00
hannken
40cf7e4cfa Make whiteouts work on rumpfs:
- On lookup it is ok to create if the name exists and is a whiteout
- When replacing a whiteout directory entry remove the whiteout first.
- Set UF_OPAQUE when creating a node in place of a whiteout.
2011-08-05 08:13:59 +00:00
uch
7ce939b3e2 v7fs rump support 2011-07-24 08:55:28 +00:00
drochner
3c39863810 regen after *setxattr constification 2011-07-18 11:43:53 +00:00
joerg
3eb244d801 Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.
2011-07-17 20:54:30 +00:00
dyoung
9c14481bd4 Use <sys/bus.h> not <machine/bus.h>. 2011-07-15 23:40:56 +00:00
hannken
49511bba25 Change VOP_BWRITE() to take a vnode as its first argument like all other
VOPs do.  Layered file systems no longer have to modify bp->b_vp and run
into trouble when an async VOP_BWRITE() uses the wrong vnode.

- change all occurrences of VOP_BWRITE(bp) to VOP_BWRITE(bp->b_vp, bp).
- remove layer_bwrite().
- welcome to 5.99.55

Addresses PR kern/38762 panic: vwakeup: neg numoutput

No objections from tech-kern@.
2011-07-11 08:27:37 +00:00
mrg
5fb8a5b39d don't define multiple cwdi0's, mark this one as extern.
fixes various mips build issues i've seen with both GCC 4.1 and 4.5.
2011-07-04 11:31:37 +00:00
manu
be95d60797 Add a flag to VOP_LISTEXTATTR(9) so that the vnode interface can tell the
filesystem in which format extended attribute shall be listed.

There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-prefixed, non NUL-terminated strings, used for
  extattr_list_file(2), which is obtained by setting the
  EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)

This approach avoids the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
2011-07-04 08:07:29 +00:00
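
The two list formats named in commit be95d60797 differ only in how each
name is delimited, which a small conversion routine makes concrete. The
sketch below is an assumption-laden illustration, not code from the
kernel or libperfuse: "nul_list_to_prefixlen" is a hypothetical name,
and the error convention (-1) is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert a list of NUL-terminated names
 * ("a\0bb\0") into one-byte length-prefixed names ("\1a\2bb").
 * Returns the number of bytes written, or -1 if the input is not
 * properly terminated, a name exceeds 255 bytes, or 'out' is too small.
 */
static long
nul_list_to_prefixlen(const char *in, size_t inlen,
    unsigned char *out, size_t outlen)
{
	size_t ipos = 0, opos = 0;

	assert(in != NULL && out != NULL);
	while (ipos < inlen) {
		const char *end = memchr(in + ipos, '\0', inlen - ipos);
		size_t namelen;

		if (end == NULL)
			return -1;	/* last name lacks its NUL */
		namelen = (size_t)(end - (in + ipos));
		if (namelen > 255 || outlen - opos < namelen + 1)
			return -1;
		out[opos++] = (unsigned char)namelen;	/* length prefix */
		memcpy(out + opos, in + ipos, namelen);	/* name, no NUL */
		opos += namelen;
		ipos += namelen + 1;			/* skip the NUL */
	}
	return (long)opos;
}
```

Each name occupies its length plus one byte in either format, so the
converted list is always the same size as the original.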
mrg
f50d565b1c define ARCH_ELFSIZE=32 and add kobj_stubs.c rumpcpu_generic.c. 2011-07-03 08:53:23 +00:00
christos
bc12521d3b regen 2011-06-26 17:05:55 +00:00
mrg
331c95a1e1 fix an operator precedence error picked up by GCC 4.5.3. real bug. 2011-06-22 04:01:08 +00:00
hannken
b632c2dabd Make ubc_purge() a noop. 2011-06-19 18:29:25 +00:00
hannken
86dc5d9ce0 Revert previous. ubc_purge() is already defined in rumpkern/vm.c 2011-06-19 18:28:24 +00:00
hannken
035869f9d4 Add a noop wrapper for ubc_purge() to make file system tests work again.
Not really sure if this is the right way -- Antti?
2011-06-19 11:22:42 +00:00
rmind
7083a919fc - Fix a silly bug: remove umap from uobj in ubc_release() UBC_UNMAP case.
- Use UBC_WANT_UNMAP() consistently.

ARM (PMAP_CACHE_VIVT case) works again.
2011-06-19 02:42:53 +00:00
hannken
d296304e60 Rename uvm_vnp_zerorange(struct vnode *, off_t, size_t) to
ubc_zerorange(struct uvm_object *, off_t, size_t, int) changing
the first argument to an uvm_object and adding a flags argument.

Modify tmpfs_reg_resize() to zero the backing store (aobj) instead
of the vnode.  Ubc_purge() no longer panics when unmounting tmpfs.

Keep uvm_vnp_zerorange() until the next kernel version bump.
2011-06-16 09:21:02 +00:00
mrg
a14dae9853 include uvm_object.c in the rump kernel for the new uvm_obj* functions.
don't build the uvm_object.c uvm_object_printit() for _RUMPKERNEL. (XXX)
add empty panic() stubs for uvm_loanbreak() and ubc_purge().

fixes some more 5.99.53 rump build issues.
2011-06-12 06:36:38 +00:00
rmind
e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
matt
d8b60e6f0b Appease rump. 2011-06-10 00:32:52 +00:00
tron
8bcd25a1a3 Fix rump build which got broken by the fix for PR kern/44986. 2011-05-28 16:07:43 +00:00
joerg
a216da57a6 Default to -Wno-sign-compare -Wno-pointer-sign for clang.
Push -Wno-array-bounds down to the cases that depend on it.
Selectively disable warnings for 3rd party software or non-trivial
issues to be reviewed later to get clang -Werror to build most of the
tree.
2011-05-26 12:56:24 +00:00
joerg
cfb300c780 Mark rumpuser_exit and rumpuser_thread_exit as dead. 2011-05-23 20:49:35 +00:00
joerg
188ae306aa Spell --fatal-warnings with two hyphens 2011-05-19 21:24:55 +00:00
christos
6009929c48 add a hacky version of sigsuspendsetup() to satisfy link requirements. 2011-05-18 15:57:14 +00:00
matt
12a3861acb Make rump compile things with -std=gnu99 like the kernel and modules. 2011-05-10 00:33:58 +00:00
dyoung
c2e43be1c5 Reduces the resources demanded by TCP sessions in TIME_WAIT-state using
methods called Vestigial Time-Wait (VTW) and Maximum Segment Lifetime
Truncation (MSLT).

MSLT and VTW were contributed by Coyote Point Systems, Inc.

Even after a TCP session enters the TIME_WAIT state, its corresponding
socket and protocol control blocks (PCBs) stick around until the TCP
Maximum Segment Lifetime (MSL) expires.  On a host whose workload
necessarily creates and closes down many TCP sockets, the sockets & PCBs
for TCP sessions in TIME_WAIT state amount to many megabytes of dead
weight in RAM.

Maximum Segment Lifetime Truncation (MSLT) assigns each TCP session to
a class based on the nearness of the peer.  Corresponding to each class
is an MSL, and a session uses the MSL of its class.  The classes are
loopback (local host equals remote host), local (local host and remote
host are on the same link/subnet), and remote (local host and remote
host communicate via one or more gateways).  Classes corresponding to
nearer peers have lower MSLs by default: 2 seconds for loopback, 10
seconds for local, 60 seconds for remote.  Loopback and local sessions
expire more quickly when MSLT is used.

Vestigial Time-Wait (VTW) replaces a TIME_WAIT session's PCB/socket
dead weight with a compact representation of the session, called a
"vestigial PCB".  VTW data structures are designed to be very fast and
memory-efficient: for fast insertion and lookup of vestigial PCBs,
the PCBs are stored in a hash table that is designed to minimize the
number of cacheline visits per lookup/insertion.  The memory both
for vestigial PCBs and for elements of the PCB hashtable comes from
fixed-size pools, and linked data structures exploit this to conserve
memory by representing references with a narrow index/offset from the
start of a pool instead of a pointer.  When space for new vestigial PCBs
runs out, VTW makes room by discarding old vestigial PCBs, oldest first.
VTW cooperates with MSLT.

It may help to think of VTW as a "FIN cache" by analogy to the SYN
cache.

A 2.8-GHz Pentium 4 running a test workload that creates TIME_WAIT
sessions as fast as it can is approximately 17% idle when VTW is active
versus 0% idle when VTW is inactive.  It has 103 megabytes more free RAM
when VTW is active (approximately 64k vestigial PCBs are created) than
when it is inactive.
2011-05-03 18:28:44 +00:00
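
The MSLT classification described in commit c2e43be1c5 amounts to a
three-way decision followed by a table lookup. The sketch below uses
the default values quoted in the message (2, 10 and 60 seconds); the
enum, the function names and the two boolean inputs are hypothetical
and do not reflect how the TCP code actually determines nearness.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three classes in the commit message. */
enum mslt_class { MSLT_LOOPBACK, MSLT_LOCAL, MSLT_REMOTE };

/* Classify a session by the nearness of its peer. */
static enum mslt_class
mslt_classify(bool same_host, bool same_link)
{
	if (same_host)
		return MSLT_LOOPBACK;	/* local host equals remote host */
	if (same_link)
		return MSLT_LOCAL;	/* same link/subnet */
	return MSLT_REMOTE;		/* via one or more gateways */
}

/* Default MSL per class, in seconds, as quoted in the message. */
static int
mslt_default_msl(enum mslt_class c)
{
	switch (c) {
	case MSLT_LOOPBACK:
		return 2;
	case MSLT_LOCAL:
		return 10;
	case MSLT_REMOTE:
		return 60;
	}
	assert(!"unknown MSLT class");
	return 60;	/* conservative fallback if asserts are disabled */
}
```

A loopback session therefore leaves TIME_WAIT far sooner than a remote
one, which is where the memory savings for nearby peers come from.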
pgoyette
18fecdaed4 More lim_free() fallout 2011-05-01 02:52:42 +00:00
dholland
82f660e639 Regen for ISSYMLINK removal. 2011-04-18 00:43:56 +00:00
rmind
fbc8beae75 Split off parts of vfs_subr.c into vfs_vnode.c and vfs_mount.c modules.
No functional change.  Discussed on tech-kern@.
2011-04-02 04:28:56 +00:00
dyoung
060522dec8 Hide the radix-trie implementation of the forwarding table so that we
will have an easier time replacing it with something different, even if
it is a second radix-trie implementation.

sys/net/route.c and sys/net/rtsock.c no longer operate directly on
radix_nodes or radix_node_heads.

Hopefully this will reduce the temptation to implement multipath or
source-based routing using grotty hacks to the grotty old radix-trie
code, too. :-)
2011-03-31 19:40:51 +00:00
pooka
bf89b7ec3f actually add libpud and revert damage to libputter.
pax -rw and forgetting to rm -rf CVS has some nasty side-effects ....
2011-03-31 08:36:25 +00:00
pooka
fe98957153 add pud as a rump component 2011-03-31 08:22:54 +00:00
dyoung
149dd44b66 __HAVE_DEVICE_REGISTER_POSTCONFIG and __HAVE_DEVICE_REGISTER
are no more, so don't use them here.
2011-03-28 22:23:39 +00:00
riz
d16ddb6294 Don't try to kmem_alloc() 0 bytes. Without this change, some trivial
kernel modules were not loadable by rump_server.
2011-03-27 21:16:52 +00:00
bouyer
d9210c2405 Add a new libquota library, which contains some blocks to build and/or
parse quota plists; as well as a getfsquota() function to retrieve quotas
for a single id from a single filesystem (whatever filesystem this is:
a local quota-enabled fs or NFS). This is built on functions getufsquota()
(for local filesystems with UFS-like quotas) and getnfsquota();
which are also available to userland programs.
move functions from quota2_subr.c to libquota or libprop as appropriate,
and adjust in-tree quota tools.
move some declarations from kernel headers to either sys/quota.h or
quota/quota.h as appropriate. ufs/ufs/quota.h still installed because
it's needed by other installed ufs headers.
ufs/ufs/quota1.h still installed as a quick&dirty way to get a code
using the old quotactl() to compile (just include ufs/ufs/quota1.h instead of
ufs/ufs/quota.h - old code won't compile without this change and this is
on purpose).
Discussed on tech-kern@ and tech-net@ (long thread, but not much about
libquota itself ...)
2011-03-24 17:05:39 +00:00
pooka
a3a20972d9 pnbuf_cache is used all over the place outside of vfs, so put it
in one place to avoid many definitions.
2011-03-22 15:16:23 +00:00
pooka
23bbd0e078 Update copyright statements.
no functional change.
2011-03-21 16:41:08 +00:00
pooka
056c4b30fa remove historic test 2011-03-21 15:51:34 +00:00
pooka
20c88ef126 this was moved to usr.bin ages ago 2011-03-21 15:47:53 +00:00
joerg
ad65a463d1 Include bsd.own.mk before making decisions based on mk.conf. 2011-03-21 05:15:18 +00:00
pooka
2750f1b5f9 make the if-else logic more obvious 2011-03-11 12:11:00 +00:00
pooka
86a95d8e4b After my change to the "interface accepts this packet" logic
yesterday the CARP test stopped working, since CARP depends on
IFF_PROMISC (which was previously always accidentally enabled).
While making the interface honor IFF_PROMISC, also make it compare
the received frame's address against ifp->if_sadl instead of a
local enaddr value we cached when the interface was created.
2011-03-11 12:10:15 +00:00
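
The "interface accepts this packet" decision discussed in commits
d377d1cc83 and 86a95d8e4b can be sketched as a destination-address
check. This is not the shmif code: "frame_for_us" is a hypothetical
function, the promiscuous flag is passed as a plain boolean rather than
read from IFF_PROMISC, and accepting group (multicast/broadcast)
destinations is an assumption the commit messages do not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETHER_ADDR_LEN 6	/* Ethernet address length in bytes */

/*
 * Hypothetical check: pass a frame up if it is *for* us, judged by its
 * destination address, rather than merely "not from us".
 */
static bool
frame_for_us(const uint8_t dst[TOY_ETHER_ADDR_LEN],
    const uint8_t own[TOY_ETHER_ADDR_LEN], bool promisc)
{
	assert(dst != NULL && own != NULL);
	if (promisc)
		return true;		/* promiscuous: accept everything */
	if (dst[0] & 0x01)
		return true;		/* group bit set (assumed policy) */
	return memcmp(dst, own, TOY_ETHER_ADDR_LEN) == 0;
}
```

Testing the destination instead of the source is what stops every
kernel on a shared bus from picking up, and possibly forwarding,
frames addressed to its neighbours.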
pooka
a6893ed075 Don't assume rump kernel PAGE_SIZE and host page size are the same. 2011-03-11 09:25:59 +00:00
wiz
fd1ad431e8 When panicking, at least tell the _real_ reason. 2011-03-10 22:11:05 +00:00
pooka
cd97edf46b autocreate /dev/zfs. requested by riz 2011-03-10 19:24:37 +00:00
pooka
8fa2364979 Support bpf. shmif_dumpbus(1) can be used for much the same effect,
but sometimes it's just more convenient to run tcpdump live.
2011-03-10 13:27:03 +00:00
pooka
d377d1cc83 Pass packet up if it's *for* us, not if it's from someone else.
This fixes a rather curious forwarding/redirect/etc. storm which
happened when there were >2 shmif kernels on the same shmbus with
ip forwarding set on. (at least it stress-tested other code ;)
2011-03-10 13:20:54 +00:00
pooka
df23472915 track lockdebug data even in the special path 2011-03-09 23:41:24 +00:00
pooka
49bc93eb11 Mark cv_wait mutex as locked before doing any further dances.
Fixes a LOCKDEBUG panic in case the uncommon condition is hit.
2011-03-09 18:15:39 +00:00
pooka
d469e02a3b Create cgd block device files in the right directory.
hi pooka!
2011-03-09 11:56:17 +00:00
pooka
08f26b12e5 Duh, the nfsd hacks in tests still used RUMP_SYS_NETWORKING. It
appears that using nxr to search for users wasn't a very good idea.
Put networking back and make the test of the defines give out
#errors.

me be fixink this
2011-03-09 10:10:19 +00:00
pooka
53b769ebd0 g/c old-style syscall selection method 2011-03-08 18:35:10 +00:00
pooka
ffad644563 regen: include rumpclient syscall headers from source tree instead of host 2011-03-08 18:31:11 +00:00
pooka
91240244df Nuke all threads belonging to a process calling exec before allowing
the exec handshake to return.

In addition to being The Right Thing To Do, fixes some nasty
conditions for CLOEXEC fd's (or at least does so in theory, I
couldn't create any problems although I tried).
2011-03-08 12:39:28 +00:00
pooka
9d382a98c5 Fill in a functional struct lwp (especially l_mutex) before exposing
it on p_lwps.
2011-03-07 21:04:47 +00:00
bouyer
063f96f3c2 merge the bouyer-quota2 branch. This adds a new on-disk format
to store disk quota usage and limits, integrated with ffs
metadata. Usage is checked by fsck_ffs (no more quotacheck)
and is covered by the WAPBL journal. Enabled with kernel
option QUOTA2 (added where QUOTA was enabled in kernel config files),
turned on with tunefs(8) on a per-filesystem
basis. mount_mfs(8) can also turn quotas on.

See http://mail-index.netbsd.org/tech-kern/2011/02/19/msg010025.html
for details.
2011-03-06 17:08:10 +00:00
joerg
8871ccf0f3 Fix spelling of MKZFS 2011-03-05 03:15:25 +00:00
pooka
67365afb80 We track page modified info with PG_CLEAN, so make clear_modify
return false.  This makes rump lfs unmount work on platforms which
use the pmap stub (i.e. non-x86, which already returned false here).
Otherwise, lfs would hang itself trying to flush some buffers but
couldn't fill a segment and therefore wouldn't actually write
anything.
2011-03-02 13:11:52 +00:00
pooka
da0742f9b8 Reset node's parent pointer when it's removed. Technically the
parent still exists, but this allows us to avoid complicated g/c algorithms
if the parent *is* removed.
2011-03-01 15:14:35 +00:00
pooka
405dec72d6 Pass accurate protection info from ubc_uiomove() to the pager.
Fixes nfs{,ro}_fileio tests on at least sparc64 (and probably macppc
and other fat endian machines).

The problem was that nfs was fooled into thinking read() caused a
write fault because of VM_PROT_WRITE being unconditionally set and
therefore set NMODIFIED on a r/o file system.  It is absolutely
beyond me why the test worked on i386/amd64.  Incidentally, I seem
to have "misplaced" a few goats.
2011-03-01 10:02:11 +00:00
pooka
9bd36d4186 tmpfs has two layers of uvm objects (vnode->uobj and the anon object
in tmpfs_node), so when playing with pages make sure we lock the
uvm object the pages belong to instead of the vnode's uvm object.

per test from Nicolas Joly (which I'm sure he will commit soon ;)
2011-02-27 13:37:39 +00:00
riz
7559b861a8 Use AUDIO_DEVICE instead of 0 as the minor number for /dev/audio since
0 is incorrect. While I'm here, add /dev/sound, audioctl, and mixer too.

ok pooka@
2011-02-25 19:27:05 +00:00
pooka
4d59265f95 Don't autogenerate a large number of unnecessary device nodes, just
slows bootstrap.
2011-02-25 18:56:20 +00:00
pooka
db3f366798 Shuffle the pagedaemon algorithm a bit to record the number of
pageouts active and give up only if the pagedaemon could not free
memory and there are no outstanding pageouts.

This should fix the "out of memory" pauses reported by Mihai Chelaru
and Taylor R Campbell.  Tested by copying files to and from an ffs
backed by /dev/wd0 (with and without -o log) using a 1MB rump kernel
memory limit.
2011-02-22 20:17:37 +00:00
pooka
30a736a57d complete the incomplete pagesize rototill 2011-02-22 18:43:20 +00:00
pooka
bfe95d83e2 omstart 2011-02-22 14:09:35 +00:00
pooka
b4a00f1f22 regenagain: make returning off_t work (without breaking other return
types on some archs)
2011-02-22 14:06:29 +00:00
pooka
65fa004526 unregen 2011-02-22 13:05:07 +00:00
pooka
6578cf43da regen: cast rval to return type instead of just using rval[0] 2011-02-22 10:34:06 +00:00
pooka
686a05ebe4 regen: NOERR syscalls 2011-02-21 23:31:00 +00:00
pooka
5c02795ed4 regen: preadv/pwritev 2011-02-21 12:49:49 +00:00
pooka
2e56e2896f regen: always explicitly set errno (fixes some apps) 2011-02-21 11:33:36 +00:00
pooka
a21e393066 commit regen for int -> pid_t fix 2011-02-21 11:32:26 +00:00
pooka
dd69b8ebde Change the default sigmodel to "raise", it makes more sense than
causing a panic.
2011-02-20 13:09:57 +00:00
pooka
9b097994c2 Support FD_CLOEXEC in rump kernels. 2011-02-15 15:54:28 +00:00
pooka
abcc13e159 Add an "exec" callback for the proxy code. The client can now
notify the rump kernel of an exec having taken place.
2011-02-15 10:35:05 +00:00