1309 Commits

Author SHA1 Message Date
njoly
e800a040dc Make vers.c depend on sys/param.h too, to ensure that this file is
regenerated on kernel version bump. Avoids the __NetBSD_Version__ and
osrelease out-of-sync problem for mkupdate builds.

ok from pooka@.
2010-06-06 20:09:38 +00:00
njoly
edf3b80fb4 Regen for pathconf/fpathconf rumpification. 2010-06-04 16:33:32 +00:00
pooka
c91a51436f Don't use rumpuser_malloc() directly. 2010-06-03 19:36:21 +00:00
pooka
8b0211dc06 Implement a sort-of pagedaemon: adjust all memory allocators to go
through an in-rumpkernel hypermemory allocator which knows it should
kick the pagedaemon and block in case ``waitok'' memory allocation
fails.

This allows us to recover from some out-of-memory situations.
Realworld'istically speaking (as opposed to whatever "should be"
theory), these OOM situations will happen extremely rarely if ever
when our hypervisor is a regular process.  Speculatively, this
should be useful for other types of hosts.

issues remaining:
 * the hypervisor does not know how to reclaim kernel memory (and
   for the reason I stated above, I'm not sure if it makes sense
   to teach the current implementation about that)
 * vfs memory (buffers, vm object pages etc.) is not reclaimed
2010-06-03 10:56:20 +00:00
pooka
03d9f8436f In aiodone, call uvm_pageout_done() with number of PG_PAGEOUT pages
processed.
2010-06-02 12:07:03 +00:00
pooka
8e9f71e9f5 rumpvm_init -> uvm_init to get rid of local prototype.
no functional change
2010-06-02 10:55:18 +00:00
pooka
e3c273abc1 Don't pass "canfail" down to rumpuser_malloc -- there's quite little
we can do with that info way down there.  Instead, pass alignment.
Implement rumpuser_malloc() with posix_memalign().
2010-06-01 20:11:33 +00:00
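
As an illustration of the change described in e3c273abc1, here is a minimal sketch of an allocator that takes an alignment and is backed by posix_memalign(). The name sketch_malloc and its signature are hypothetical; the real rumpuser_malloc() prototype is not shown in this log and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, NOT the real rumpuser_malloc() signature:
 * allocate len bytes aligned to align bytes.
 */
static void *
sketch_malloc(size_t len, size_t align)
{
	void *mem = NULL;

	/*
	 * posix_memalign() requires the alignment to be a power of two
	 * and a multiple of sizeof(void *), so round small values up.
	 */
	if (align < sizeof(void *))
		align = sizeof(void *);
	assert((align & (align - 1)) == 0);

	/* posix_memalign() returns an error number instead of setting errno. */
	if (posix_memalign(&mem, align, len) != 0)
		return NULL;
	return mem;
}
```

Memory obtained this way is released with free(). A caller that needs page-aligned memory would pass the page size as align.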
pooka
89600f9afb Always use rumpuser_malloc() for allocating both poolpage and
poolpage_cache -- its bootstrap cost is slightly higher than
anonmmap, but it's faster in the long run.
2010-06-01 19:18:20 +00:00
pooka
6a58bb3a83 * remove rumpvm_makepage, just use uvm_pagealloc()
* update copyright to reflect reality a little better
2010-06-01 10:29:21 +00:00
pooka
a590bfb92a Support mtsleep() without a biglocked sleeper (uvm uses this in
UVM_UNLOCK_AND_WAIT())
2010-05-31 23:18:33 +00:00
pooka
977a0ef122 Dump rump kernel bootstrap time. 2010-05-31 23:13:17 +00:00
pooka
9970bb9e64 Support KTHREAD_JOINABLE/kthread_join(). Also fixes earlier bug
where all pthreads were created non-detached.
2010-05-31 23:09:29 +00:00
pooka
7210d18f43 The x86 kernel ABI depends on __cpu_simple_lock stuff being present.
Since they are practically never used (only when prehistoric code
uses simple_lock()), their efficiency doesn't matter that much and
we can simply adapt the versions from x86 lock.h.
2010-05-31 22:31:07 +00:00
pooka
01c45b7fe9 Deal with the "we get a portably arbitrary set of headers on
different archs" problem.
2010-05-28 18:17:24 +00:00
pooka
d11274ecfd Improve the CPU scheduler for a host MP system with multithreaded
access.  The old scheduler had a global freelist which caused a
cache crisis with multiple host threads trying to schedule a virtual
CPU simultaneously.

The rump scheduler is different from a normal thread scheduler, so
it has different requirements.  First, we schedule a CPU for a
thread (which we get from the host scheduler) instead of scheduling
a thread onto a CPU.  Second, scheduling points are at every
entry/exit to/from the rump kernel, including (but not limited to)
syscall entry points and hypercalls.  This means scheduling happens
a lot more frequently than in a normal kernel.

For every lwp, cache the previously used CPU.  When scheduling,
attempt to reuse the same CPU.  If we get it, we can use it directly
without any memory barriers or expensive locks.  If the CPU is
taken, migrate.  Use a lock/wait only in the slowpath.  Be very
wary of walking the entire CPU array because that does not lead to
a happy cacher.

The migration algorithm could probably benefit from improved
heuristics and tuning.  Even as such, with the new scheduler an
application which has two threads making rlimit syscalls in a tight
loop experiences almost 400% speedup.  The exact speedup is difficult
to pinpoint, though, since the old scheduler caused very jittery
results due to cache contention.  Also, the rump version is now
70% faster than the counterpart which calls the host kernel.
2010-05-28 16:44:14 +00:00
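
The fast path described in d11274ecfd (cache the previously used CPU per lwp, try to reuse it, migrate if it is taken) can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the committed code: all sketch_* names are hypothetical, C11 atomics stand in for whatever primitives the rump scheduler actually uses, the migration step naively scans the other CPUs (which the commit message warns against), and the lock/wait slow path is left out and replaced by returning -1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NCPU 2		/* hypothetical number of virtual CPUs */

struct sketch_cpu {
	atomic_bool busy;	/* true while some thread holds this CPU */
};

struct sketch_lwp {
	int prev_cpu;		/* index of the CPU this lwp used last */
};

static struct sketch_cpu sketch_cpus[SKETCH_NCPU];

/* Returns the index of the CPU acquired, or -1 if all are taken. */
static int
sketch_schedule(struct sketch_lwp *l)
{
	int i, idx;

	/* Fast path: try to grab the CPU we used last time. */
	if (!atomic_exchange(&sketch_cpus[l->prev_cpu].busy, true))
		return l->prev_cpu;

	/* Cached CPU is taken: migrate to another free CPU, if any. */
	for (i = 1; i < SKETCH_NCPU; i++) {
		idx = (l->prev_cpu + i) % SKETCH_NCPU;
		if (!atomic_exchange(&sketch_cpus[idx].busy, true)) {
			l->prev_cpu = idx;	/* remember for next time */
			return idx;
		}
	}

	/* A real scheduler would take a lock and wait here (slow path). */
	return -1;
}

/* Release a CPU previously returned by sketch_schedule(). */
static void
sketch_unschedule(int cpu)
{
	atomic_store(&sketch_cpus[cpu].busy, false);
}
```

The point of the cached index is that a thread which keeps entering and leaving the kernel usually touches only its own CPU's state, so the common case avoids contention on shared data.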
pooka
5f75591d60 regen: rump_vfs_mount_print 2010-05-26 21:51:58 +00:00
pooka
7e5ec0880b Add public namespace helper routine for dumping info on mountpoints. 2010-05-26 21:50:56 +00:00
pooka
c468a8e0bf print vm object refcount 2010-05-26 21:48:20 +00:00
pooka
9df433ebd4 include extattr support 2010-05-20 18:23:59 +00:00
pooka
8cfbddcc72 one more file to commit with regen 2010-05-20 15:58:09 +00:00
pooka
1a99fb9d2c regen: rump_vfs_extattrctl 2010-05-20 15:47:45 +00:00
pooka
e1f101dffa open VFS_EXTATTRCTL to user namespace 2010-05-20 15:46:47 +00:00
martin
d806a53fa5 Add missing include 2010-05-18 20:18:18 +00:00
pooka
0f6a90c207 Whoops, default to MP locking. 2010-05-18 16:30:22 +00:00
pooka
d47455e39d Add uniprocessor versions of mutex/rw/cv. They work only on virtual
unicpu configurations (i.e. RUMP_NCPU==1), but are massively faster
than the multiprocessor versions since the fast path does not have
to perform any cache coherent operations.  _Applications_ with
lock-happy kernel paths, i.e. _not_ lock microbenchmarks, measure
up to tens of percent speedup on my Core2 Duo.  Every globally
atomic state required by normal locks/atomic ops implies a hideous
speed penalty even for the fast path.

While this requires a unicpu configuration, it should be noted that
we are talking about a virtual unicpu configuration.  The host can
have as many processors as it desires, and the speed benefit of
virtual unicpu is still there.  It's pretty obvious that in terms
of scalability simple workload partitioning and replication into
multiple kernels wins hands down over complicated locking or
locklessing algorithms which depend on globally atomic state.
2010-05-18 16:29:36 +00:00
pooka
a955550ec3 Namespace rump-only kernel biglock routines appropriately.
No functional change.
2010-05-18 15:16:10 +00:00
pooka
fdeac1d7df Move routines related to kernel locking and scheduling from
locks.c to klock.c.

No functional change.
2010-05-18 15:12:19 +00:00
pooka
b1b7862792 Make it possible to use the scheduler lock as the rumpuser condvar
interlock.  This is applicable in cases where the actual interlock
is the CPU the currently running thread is scheduled on.  Borrowing
the scheduler lock as the mutex mandated by pthread_cond_wait()
does away with the need for an additional mutex.  This both optimizes
runtime execution and simplifies code, as the extra lock typically
led to quite some trickeries to avoid the dungeon collapsing due
to zaps from the wand of deadlock.
2010-05-18 14:58:41 +00:00
njoly
d52f4f14b6 Regen for multiple inclusion protection. 2010-05-17 12:37:20 +00:00
pooka
366a313d12 Pick up after people who find build-testing their changes too difficult. 2010-05-14 13:04:14 +00:00
pooka
dc34f07022 fix inversion: advance clock on cpu0, not the complement of cpu0 2010-05-12 16:48:21 +00:00
pooka
1616addc6a Actually, push defining _RUMPKERNEL down to libkern, since it's
not needed elsewhere.
2010-05-11 22:21:05 +00:00
pooka
d3280f90bc Limit visibility of _RUMPKERNEL to prevent abuse. 2010-05-11 21:08:07 +00:00
pooka
65972a0f32 add __HAVE_CPU_COUNTER stubs where possible (i.e. where the arch
doesn't think inlines are the second compiling)
2010-05-11 21:03:41 +00:00
pooka
17bb799409 adjust comment in previous.
XXX: should make that (and physmem) mean something here
2010-05-11 20:25:14 +00:00
pooka
a96791040e remove unnecessary #ifdef 2010-05-11 20:21:56 +00:00
pooka
7e3cbd3f20 regen: _RUMPKERNEL -> _KERNEL 2010-05-11 20:11:47 +00:00
pooka
14d288df20 _RUMPKERNEL -> _KERNEL 2010-05-11 20:09:11 +00:00
pooka
27d01ae5b3 Cache directory entry name length. This brings kernel bootstrap
time down: 14ms -> 12ms.  Further hashing etc. did not seem to have
any noticeable effect.
(without /dev node creation bootstrap time is 8ms, so it's still
the bottleneck)
2010-05-11 16:59:42 +00:00
pooka
6e2452d938 Initialize p_pgrp when creating a new process structure (and not
only for proc0).  This makes something work.  I just can't remember
what it was anymore.
2010-05-11 14:57:20 +00:00
pooka
7037dbf8d5 Set default number of vnodes to 1k instead of 64k: a large default
reserves a large amount of memory by default and this is not
desirable in a rump kernel where the typical usage is minimal.
Maybe I should write a few lines to autoscale desiredvnodes up to
a hard limit after the soft limit is reached?
2010-05-11 14:49:07 +00:00
pooka
2315e5705a Fix reclaim locking so that we don't attempt lock reentry if making
a new rumpfs vnode triggers a reclaim for a rumpfs vnode.
2010-05-11 14:42:24 +00:00
pooka
484a50b1cb uvm_object_printit() should be wrapped in DEBUGPRINT 2010-05-11 14:06:08 +00:00
pooka
c33b4c9a6b update slightly 2010-05-11 11:58:14 +00:00
pooka
ed541767a0 drop silly backronym. just rump. 2010-05-11 09:45:59 +00:00
pooka
14a8ac5592 Reclaim spec-type vnodes properly. 2010-05-11 09:28:40 +00:00
pooka
e7f4f9320b ABC2010 paper 2010-05-02 11:11:36 +00:00
pooka
811310d0d4 remember to add audio to the list of device components 2010-05-01 23:24:40 +00:00
pooka
3a2dd9aab6 support pad(4) 2010-05-01 23:21:24 +00:00
pooka
f60e2f41a7 add audio(4) support 2010-05-01 23:19:56 +00:00