- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.
first lwp in the list is the last created and in the firefox and gtk-gnash
case this is usually a zombie, so the status in ps was ZLl. This now picks
the lwp in order ONPROC > RUN > SLEEP > STOP > SUSPENDED > IDL > DEAD > ZOMB
and breaks ties using cpticks.
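A minimal userland sketch of that selection rule (the enum, struct and rank
table below are illustrative stand-ins, not the kernel's types):

    #include <stddef.h>

    enum lwp_state { ST_ONPROC, ST_RUN, ST_SLEEP, ST_STOP,
                     ST_SUSPENDED, ST_IDL, ST_DEAD, ST_ZOMB };

    struct fake_lwp {
        enum lwp_state state;       /* stand-in for l_stat */
        unsigned cpticks;           /* recent CPU ticks, breaks ties */
    };

    /* Lower rank = more interesting as the process's representative LWP. */
    static const int rank[] = {
        [ST_ONPROC] = 0, [ST_RUN] = 1, [ST_SLEEP] = 2, [ST_STOP] = 3,
        [ST_SUSPENDED] = 4, [ST_IDL] = 5, [ST_DEAD] = 6, [ST_ZOMB] = 7,
    };

    static struct fake_lwp *
    pick_representative(struct fake_lwp *l, size_t n)
    {
        struct fake_lwp *best = NULL;
        size_t i;

        for (i = 0; i < n; i++) {
            if (best == NULL ||
                rank[l[i].state] < rank[best->state] ||
                (rank[l[i].state] == rank[best->state] &&
                 l[i].cpticks > best->cpticks))
                best = &l[i];
        }
        return best;
    }
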
walk the list, we're looking for a vp to do something with. We do
this in the signal code and in the timer code. The signal code already
runs with proc::p_lock held, so it's a very natural lock to use. The
timer code, however, calls into the sa timer code with a spinlock held.
Since proc::p_lock is an adaptive mutex, we can sleep to get it. Sleeping
with a spinlock held is BAD. So proc::p_lock is _not_ the right lock there,
and something like sadata::sa_mutex would be best.
Address this difficulty by noting that both uses actually just read
the list. Changing the list of VPs is rare - once one's added, it stays
until the process ends. So make the locking protocol as follows: to write
the list, you must hold both proc::p_lock and sadata::sa_mutex (taken
in that order); thus holding either one individually grants read access.
This removes a case where we could sleep with timer_lock, a spinlock at
IPL_SCHED (!!), while trying to get p_lock. If that ever happened, we'd
pretty much be dead. So don't do that!
This fixes a merge botch from how I handled our gaining p_lock - p_lock
should not have simply replaced p_smutex.
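A cut-down sketch of that locking protocol, with the structures reduced to
the fields that matter; everything here is illustrative, and it assumes
sadata::sa_mutex is a lock the timer path is allowed to take:

    #include <sys/queue.h>

    struct kmutex;                      /* stand-in for kmutex_t */
    void mutex_enter(struct kmutex *);
    void mutex_exit(struct kmutex *);

    struct savp {
        SLIST_ENTRY(savp) savp_next;
    };

    struct sadata {
        struct kmutex *sa_mutex;
        SLIST_HEAD(, savp) sa_vps;
    };

    struct proc {
        struct kmutex *p_lock;
        struct sadata *p_sa;
    };

    /* Writer: take both locks, p_lock first. */
    void
    savp_insert(struct proc *p, struct savp *vp)
    {
        mutex_enter(p->p_lock);
        mutex_enter(p->p_sa->sa_mutex);
        SLIST_INSERT_HEAD(&p->p_sa->sa_vps, vp, savp_next);
        mutex_exit(p->p_sa->sa_mutex);
        mutex_exit(p->p_lock);
    }

    /* Reader in the timer path: sa_mutex alone is enough. */
    void
    savp_walk(struct sadata *sa)
    {
        struct savp *vp;

        mutex_enter(sa->sa_mutex);
        SLIST_FOREACH(vp, &sa->sa_vps, savp_next) {
            /* read-only use of vp */
        }
        mutex_exit(sa->sa_mutex);
    }
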
While here, tweak the sa_unblock_userret() code for the case
when the blessed vp is actually running (on another CPU). Make its
resched RESCHED_IMMED so we whack the CPU. Addresses a hang I've
observed on occasion when starting firefox, where I see one thread running
in userland and another thread sitting in lwpublk, which means it's on
the list of threads for which we need an unblocked upcall. This list is
one on which things should NOT linger.
- Remove remaining #ifdef INET.
- Avoid holding locks so we don't need to do KM_NOSLEEP allocations.
- Use a rwlock to protect the accept filter list.
- Make it safe to unload accept filter modules.
- Minor KNF.
mail: http://mail-index.netbsd.org/source-changes/2008/10/10/msg211109.html
* Scary-looking socket locking stubs (changed to KASSERT of locked)
* depends on INET inappropriately (though now you must add new
accept filter names to the uipc_accf.c line in conf/files if
you aren't using dataready or httpready)
* New code uses MALLOC/FREE -- changed to kmem_alloc/kmem_free;
could be pool_cache, these are all fixed-size allocations.
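As an illustration of the last point, roughly what the kmem(9) calls look
like for a fixed-size record, plus the pool_cache(9) alternative; the struct
and names here are made up:

    #include <sys/param.h>
    #include <sys/kmem.h>
    #include <sys/pool.h>
    #include <sys/intr.h>

    struct accf_entry {                 /* made-up fixed-size record */
        char  name[16];
        void *arg;
    };

    /* kmem(9): the size is passed to both the alloc and the free. */
    static struct accf_entry *
    accf_entry_create(void)
    {
        return kmem_alloc(sizeof(struct accf_entry), KM_SLEEP);
    }

    static void
    accf_entry_destroy(struct accf_entry *e)
    {
        kmem_free(e, sizeof(struct accf_entry));
    }

    /* pool_cache(9) alternative for many identical allocations. */
    static pool_cache_t accf_cache;

    static void
    accf_cache_setup(void)
    {
        accf_cache = pool_cache_init(sizeof(struct accf_entry), 0, 0, 0,
            "accfent", NULL, IPL_NONE, NULL, NULL, NULL);
    }
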
We need to verify that this works as expected with protocols with per-socket
locking, like PF_LOCAL. I'm a little concerned about the case where the
lock on the listen socket isn't the same lock as on the eventual connected
socket.
Typecasting quad_t * to long * and using atomic_add_long can't
possibly be expected to work!
Another fine error caught by the gcc type-punning warning. That
really really should be on by default in the kernel.
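The core of the problem, shown as a trivial userland program: on an ILP32
port long is 32 bits while quad_t is 64, so an atomic add through the cast
touches only half of the counter. A 64-bit atomic op (or a lock) is needed
instead.

    #include <stdio.h>
    #include <sys/types.h>

    int
    main(void)
    {
        /* On i386 this prints 8 vs 4: the cast silently drops half of
         * the 64-bit counter from the atomic update. */
        printf("sizeof(quad_t) = %zu, sizeof(long) = %zu\n",
            sizeof(quad_t), sizeof(long));
        return 0;
    }
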
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so this doesn't make a difference in terms of
performance.
Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
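Typical once(9) usage affected by this, with made-up function names: after
the change, a failing init routine makes later RUN_ONCE() calls return the
error again instead of hanging or retrying.

    #include <sys/once.h>

    static int
    example_init(void)
    {
        /* one-time setup; return 0 on success or an errno value */
        return 0;
    }

    int
    example_work(void)
    {
        static ONCE_DECL(example_once);
        int error;

        error = RUN_ONCE(&example_once, example_init);
        if (error)
            return error;
        /* ... use the initialized state ... */
        return 0;
    }
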
- Change minimal time-quantum to ~20 ms.
- Thus remove unneeded pool in M2, and unused sched_lwp_exit().
- Do not increase l_slptime twice for SCHED_4BSD (regression fix).
UDF. Future users could be msdosfs, ufs, nilfs2 (when ready), cd9660 etc.
Note that it's not the same as UFS's DIRHASH support; UFS would need a good
cleanup/splitout of directory operations to adapt to this new directory
hashing support, since most directory operations are interwoven with the
vnops themselves. This is a TODO.
Let's look for threads in the TARGET process, not in the
debugger process (gdb). Noticed when a KASSERT fired while
running gdb on a threaded app.
I will adjust wrstuden-revivesa-base-3 to include this change.
free a buffer. This will fail if the buffer owner has a chance to
modify the BC_DONTFREE flag before putphysbuf() examines it.
Fix by removing get/putphysbuf() and BC_DONTFREE. physio_done() now
has an explicit test for a buffer coming from the physio() call.
Observed by Lars Nordlund when writing a DVD with growisofs, see PR kern/39536.
Reviewed by: Jason Thorpe <thorpej@netbsd.org>
fssvar.h: struct device * -> device_t.
fss.c: establish unmount hook on first attach, remove on last detach.
vfs_syscalls.c: remove the call of fss_umount_hook().
vfs_trans.c: destroy cow handlers on unmount as fstrans_unmount() will be
called before vfs_hooks.
- make sure that sigput always initializes ksi
- initialize si_code with SI_NOINFO instead of lying (SI_USER)
- if a process is being ktraced, make siginfo available
- always pass the available siginfo to ktrace, even if it has SI_NOINFO
without releasing locks.
* Corrected a panic caused by veriexec_file_verify() not setting the
returned struct veriexec_file_entry **vfep in all cases.
Thanks to Stathis Kamperis for finding the issues and testing the fixes.
data which can be read, as expected. Before, the call fell through a
"case" statement and was forwarded to the slave side, returning the
data which can be read by the slave.
The new behaviour also matches Linux and OSF/1.
In pmf_deregister, don't constantly realloc; just shift everything closer
to the front, and then if empty, free. When adding, add space for 4 more
entries.
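A userland sketch of that array management: removal shifts the tail down
instead of reallocating, an empty array is freed, and insertion grows the
array four slots at a time. Names and the use of libc malloc/realloc here
are illustrative only.

    #include <stdlib.h>
    #include <string.h>

    struct handler_list {
        void   **items;
        size_t   nused;
        size_t   nalloc;
    };

    static void
    handler_remove(struct handler_list *hl, size_t idx)
    {
        /* shift everything after idx one slot closer to the front */
        memmove(&hl->items[idx], &hl->items[idx + 1],
            (hl->nused - idx - 1) * sizeof(hl->items[0]));
        if (--hl->nused == 0) {
            free(hl->items);
            hl->items = NULL;
            hl->nalloc = 0;
        }
    }

    static int
    handler_add(struct handler_list *hl, void *item)
    {
        if (hl->nused == hl->nalloc) {
            /* grow in chunks of 4 entries */
            void **tmp = realloc(hl->items,
                (hl->nalloc + 4) * sizeof(hl->items[0]));
            if (tmp == NULL)
                return -1;
            hl->items = tmp;
            hl->nalloc += 4;
        }
        hl->items[hl->nused++] = item;
        return 0;
    }
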
Instead of n * sizeof(type) use C99 sizeof(type [n]).
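For example (a made-up struct; with a run-time n the second form relies on
C99 variably modified types):

    #include <assert.h>
    #include <stddef.h>

    struct foo { int a; char b[12]; };

    int
    main(void)
    {
        size_t n = 8;

        /* the two spellings compute the same size */
        assert(n * sizeof(struct foo) == sizeof(struct foo [n]));
        return 0;
    }
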
Re-init the condvar after memcpy'ing so that the TAILQ is valid.
regress/sys/kern/lockf now completes rather than failing in cv_wait_sig
with KASSERT(cv_has_waiters()).
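Sketch of the pattern, with a made-up structure standing in for the lockf
one: memcpy'ing a struct that embeds a kcondvar_t copies stale wait-queue
state, so the copy's CV has to be initialized afresh before anyone waits on
it.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/condvar.h>

    struct lock_entry {                 /* made-up stand-in */
        int        le_start;
        int        le_end;
        kcondvar_t le_cv;
    };

    static void
    lock_entry_clone(struct lock_entry *dst, const struct lock_entry *src)
    {
        memcpy(dst, src, sizeof(*dst));
        /* The copied CV state is meaningless; start from a fresh one. */
        cv_init(&dst->le_cv, "lockf");
    }
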
support for specifying an accept filter for a service (mostly as a usage
example, but it can be handy for other things). Manual pages to follow
in a day or so.
OK core@.
Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.
OK'd by core@, releng@.
it was not registered before. I assume that a lot of drivers do not do
proper bookkeeping in the case where attach() exits early due to
errors. This is hard to fix and to test, so just be generous here.
lwp_exit_switchaway() to convince gcc that cpu_switchto(NULL, ...) is
really not going to return in that case. Exposed by gcc4.3.
Reported on tech-kern by Alexander Shishkin.
into two parts so that some of the routines could be used by rump.
Now that rump uses both vfs_subr and vfs_subr2, there is no
reason to keep two files lying around, so re-unite them.
sensitive information in memory longer than necessary
(We could make this depend on ~ECHO or so, but this would be an API
change and I don't think it is worth the effort.)
a union that contains either a callout [for CLOCK_REALTIME] or a flag
and a list [for other clock types]. Make sure we perform the right actions
on the right union member depending on the clock type. Otherwise this would
result in crashes.
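The shape of the structure being described, cut down to illustrate the
point (field and function names are made up): which union member is valid
depends on the clock type, so every use has to check it first.

    #include <sys/types.h>
    #include <sys/time.h>
    #include <sys/callout.h>
    #include <sys/queue.h>

    struct fake_ptimer {
        clockid_t pt_type;
        union {
            struct callout pt_ch;               /* CLOCK_REALTIME only */
            struct {
                int active;                     /* armed flag */
                LIST_ENTRY(fake_ptimer) list;   /* virtual timer list */
            } pt_nonreal;                       /* other clock types */
        } pt_u;
    };

    static void
    fake_timer_stop(struct fake_ptimer *pt)
    {
        if (pt->pt_type == CLOCK_REALTIME)
            callout_stop(&pt->pt_u.pt_ch);      /* real-time: stop callout */
        else
            pt->pt_u.pt_nonreal.active = 0;     /* virtual: clear the flag */
    }
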
queue, if it's being flushed. But in this case, we'll also have B_VFLUSH.
While there fix checkfreelist() so that it can be used to check that a
buffer is not in the free lists.
it's not on a free list.
Also change buf_init() to not automatically mark buffers `busy' since this
only makes sense for bufcache buffers.
Mark all buf_init'd buffers 'busy' in the places where they ought to be
flagged as such to not confuse the buffer cache.
Fixes PR 38923.
machine
Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
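That arithmetic as a userland sketch: with an assumed worst case of 2 KiB
per vnode, cap desiredvnodes so that vnodes consume at most 75% of KVA and
at most 75% of physical memory. The input numbers below are made up.

    #include <stdint.h>
    #include <stdio.h>

    #define VNODE_COST  2048            /* assumed worst case, bytes */

    static uint64_t
    max_desiredvnodes(uint64_t kva_bytes, uint64_t phys_bytes)
    {
        uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

        return (limit / 4 * 3) / VNODE_COST;    /* 75% of the smaller one */
    }

    int
    main(void)
    {
        /* e.g. 256 MiB of KVA, 1 GiB of RAM: the KVA limit governs */
        printf("%llu\n", (unsigned long long)
            max_desiredvnodes(256ULL << 20, 1ULL << 30));
        return 0;
    }
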
- Back out the previous workaround now that the sleep queue code has
been changed to never let the queue become empty if there are valid
waiters.
- Use sleepq_hashlock() to improve clarity.
- Sprinkle some assertions.
sleepq_changepri, sleepq_lendpri: don't let an active sleep queue head become
empty. The condvar code inspects the queue head without holding the sleep
queue lock and needs to see a non-empty queue if there are waiters.
PR kern/38761.
The condvar code must access the sleepq with the sleepq lock held; accessing
it without that lock is causing inconsistent sleepq state to be read.
This is because some accesses to the sleepq don't come via the cv code,
but are calls made directly to sleepq_changepri and sleepq_lendpri, which
take the sleepq lock and remove then re-insert lwps into the sleepq.
Running a build.sh with -j8 now completes on my quad-core, also tested by
Simon@ on an 8-core server and matt@ on a quad-core.
I believe there is room to be more efficient with this, as we now take the
sleepq lock for all cv_broadcast and cv_signal calls. I'll look into this
and post a diff to tech-kern.
reference to the PMF private data until the private data has no
more waiters. This protects against a NULL dereference.
In device_pmf_lock1(), test a device_t for PMF registration before
dereferencing its PMF private data.
use both types of list.
- Make page coloring and idle zero state per-CPU.
- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
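A heavily reduced sketch of the allocation preference only; the real
structures, locking and page colouring live in uvm_page.c, and everything
here is illustrative:

    #include <sys/queue.h>
    #include <stddef.h>

    struct fake_page {
        LIST_ENTRY(fake_page) pageq;
    };

    LIST_HEAD(pagelist, fake_page);

    struct cpu_pages {
        struct pagelist freelist;       /* pages freed on this CPU */
    };

    static struct pagelist global_freelist =
        LIST_HEAD_INITIALIZER(global_freelist);

    static struct fake_page *
    page_alloc(struct cpu_pages *mycpu)
    {
        struct fake_page *pg;

        /* Prefer a page freed recently on this CPU (cache-warm). */
        pg = LIST_FIRST(&mycpu->freelist);
        if (pg == NULL)
            pg = LIST_FIRST(&global_freelist);  /* fall back to global */
        if (pg != NULL)
            LIST_REMOVE(pg, pageq);
        return pg;
    }
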
- Don't use goto in critical paths; it can confuse the compiler.
- Sprinkle some branch hints.
- Make namecache stats per-CPU and collate once per second.
- Use vtryget().
to scale more gracefully when there are thousands of active threads.
Proposed on tech-kern@.
- Use LOCKDEBUG to catch some errors in the use of condition variables:
freeing an active CV
re-initializing an active CV
using multiple distinct mutexes during concurrent waits
not holding the interlocking mutex when calling cv_broadcast/cv_signal
waking waiters and destroying the CV before they run and exit it
- Tweak it so it can also catch common errors with condition variables.
The change to kern_condvar.c is not included in this commit and will
come later.
- Don't call kmem_alloc() if operating in interrupt context, just fail
the allocation and disable debugging for the object. Makes it safe
to do mutex_init/rw_init/cv_init in interrupt context, when running
a LOCKDEBUG kernel.
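For instance, one of the errors listed above that a LOCKDEBUG kernel can
now diagnose, written out as deliberately wrong example code: two paths
waiting on the same CV with different interlocking mutexes.

    #include <sys/mutex.h>
    #include <sys/condvar.h>

    static kmutex_t lock_a, lock_b;
    static kcondvar_t cv;
    static int ready;

    static void
    waiter_a(void)
    {
        mutex_enter(&lock_a);
        while (!ready)
            cv_wait(&cv, &lock_a);
        mutex_exit(&lock_a);
    }

    static void
    waiter_b(void)
    {
        mutex_enter(&lock_b);
        while (!ready)
            cv_wait(&cv, &lock_b);      /* BUG: different mutex, same CV */
        mutex_exit(&lock_b);
    }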