* introduce fsetown(), fgetown(), fownsignal() - this sets/retrieves/signals
the owner of descriptor, according to appropriate sematics
of TIOCSPGRP/FIOSETOWN/SIOCSPGRP/TIOCGPGRP/FIOGETOWN/SIOCGPGRP ioctl; use
these routines instead of custom code where appropriate
* make every place handling TIOCSPGRP/TIOCGPGRP handle also FIOSETOWN/FIOGETOWN
properly, and remove the translation of FIO[SG]OWN to TIOC[SG]PGRP
in sys_ioctl() & sys_fcntl()
* also remove the socket-specific hack in sys_ioctl()/sys_fcntl() and
pass the ioctls down to soo_ioctl() as any other ioctl
change discussed on tech-kern@
during a read(v)/write(v). If that happeded, we either passed a NULL
pointer or a pointer to something uninitialized as iov to ktrace,
or we got a memory leak. In read/write (w/o -v) we zero-initialize
the iov now to limit damage, in the -v calls the (dynamically allocated)
pointer is checked after the I/O.
- prevent BLOCKED upcalls on double page faults and during upcalls
- make libpthread handle blocked threads which hold locks
- prevent UNBLOCKED upcalls from overtaking their BLOCKED upcall
this adds a new syscall sa_unblockyield
see also http://mail-index.netbsd.org/tech-kern/2003/09/15/0020.html
- add simple lock for the list
- make get function remove the item from the list
- eliminate superfluous functions
thanks to enami and matt for the feedback.
the information there.
TODO:
1. since timer stuff gets called from an interrupt context, we could
preallocate ksiginfo_t's from the pool, so we don't need a kmem
pool.
2. probably the sa signal delivery syscall can be changed to take
a ksiginfo_t so we can use only one pool.
3. maybe when we add realtime signal support, add a resource limit
on the number of ksiginfo_t's a process can allocate.
and exit(): we have to lookup the entry in the private copy
again, otherwise the wrong list is manipulated.
should fix a panic on postgres shutdown reported by Marc Recht
being here, improve sone debug messages
shminfo.shmseg, in view of the fact that only few processes utilize a
significant fraction of it:
-turn the table into a linked list, elements allocated from a pool(9)
-On fork(), just bump a refcount instead of copying the list; it will
be decremented on exit() and exec(). Only copy if an attach or detach
takes place in between, which is rarely the case.
into compiled LKMs. Check this information when LKM is loaded into kernel
and refuse LKMs not matching currently running kernel. Provide LMFORCE
ioctl to skip this check for those feeling adventurous.
as discussed on tech-kern@, thanks to feedback from Bill Studenmund and others
every field; some need to stay around.
Fixes a bug where by calling shutdown() on a socket with knotes
will cause the kernel to panic when the kernel closes the socket.
Other access, such as calling kevent() may also trigger the panic.
Debugged with help from Jason and Allen. Patch reviewed by same plus
Itojun and Matt Thomas.
This problem seems to be the same one that FreeBSD saw in their PR
number 54331.
Kernel version _not_ bumped as we will piggyback the bump earlier today.
and the subsequent namei(): inform the kernel portion of
valid filenames and then disallow symlink lookups for
those filenames by means of a hook in namei().
with suggestions from provos@
also, add (currently unused) seqnr field to struct
systrace_replace, from provos@
and make the stack and heap non-executable by default. the changes
fall into two basic catagories:
- pmap and trap-handler changes. these are all MD:
= alpha: we already track per-page execute permission with the (software)
PG_EXEC bit, so just have the trap handler pay attention to it.
= i386: use a new GDT segment for %cs for processes that have no
executable mappings above a certain threshold (currently the
bottom of the stack). track per-page execute permission with
the last unused PTE bit.
= powerpc/ibm4xx: just use the hardware exec bit.
= powerpc/oea: we already track per-page exec bits, but the hardware only
implements non-exec mappings at the segment level. so track the
number of executable mappings in each segment and turn on the no-exec
segment bit iff the count is 0. adjust the trap handler to deal.
= sparc (sun4m): fix our use of the hardware protection bits.
fix the trap handler to recognize text faults.
= sparc64: split the existing unified TSB into data and instruction TSBs,
and only load TTEs into the appropriate TSB(s) for the permissions.
fix the trap handler to check for execute permission.
= not yet implemented: amd64, hppa, sh5
- changes in all the emulations that put a signal trampoline on the stack.
instead, we now put the trampoline into a uvm_aobj and map that into
the process separately.
originally from openbsd, adapted for netbsd by me.
- Write label to all netbsd (type 169) mbr partitions (even if they don't
already have a label).
- Update any label found in sector LABELSECTOR and sector 0.
Latter change makes DIOCWDINFO (etc) work on raidframe (fixing bin/22529).
The patch below (hopefully) improves some signaling problems
found by Nathan.
It also contains some cleanup of the sa_upcall_userret() function
removing any sleep calls using PCATCH.
Unblocked threads now only use an upcall stack after they
acquire the virtual CPU.
This prevents unblocked threads from stealing all available
upcall stacks.
Tested by Nick Hudson.
If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.
Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.
Lock setting/clearing of tp->t_oproc to guarantee concurrent opens can't
both suceed and that code in tty.c can't get a NULL t_oproc if the value
is re-read after being checked.
There are still MP issues with pt_flags, pt_send and pt_unctl.
Maybe problems that require TTY_LOCK() to be taken before calling std tty
functions.
to make users of the callout facility able to cooperate to work around the
race caused by the callout code lowering interrupt priority level when
invoking callout handlers, something which allows other code to run before
the callout handler gets to it's spl*() call.
This is to enable the workaround for the TCP code found in PR#20390 to be
applied.
This should be backed out once a more comprehensive fix can be put in
place.
(enabled by defining __HAVE_BIGENDIAN_BITOPS in <machine/types.h>). The
default is still LSB ordering. This change will allow the powerpc MD
implementations of setrunqueue/remrunqueue to be nuked.
used in an ad hoc way by a couple of eval board ports, so might as well
tidy it up a little and add some formality. (And, yes, I need to use it
in another eval board port.)
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.
Bump the kernel rev up to 1.6V
whether there is anything to do - almost as if it were a predicate
test outside of a condition wait. This prevents returning to userland
when tsleep() has woken up spuriously, such as from a signal that was
caught and then removed by a tracing process.
Kills off some double-stops in GDB due to signals as well as a couple
of pthread__idle assertions when detaching from a process.
XXX stopping inside tsleep, via CURSIG(), is evil.
is set to 0, the function will always return 0 (no packets/events
are permitted)." Before this patch, ppsratecheck returned 1 once
a second when maxpps was 0.
1. sa_len was not properly checked.
2. sa_family was not properly checked [even used as an array index!]
3. we only know about inet4 and inet6, so make sure that the corresponding
data is valid before using it.
4. keep reference counts of addresses used (is that necessary?)
out the writing of an lwp's registers to a separate function. XXX Although
not really the correct way to do this, make the thread that caused the
coredump has it's register set written first so GDB is happy. (this is a
bridge until TRT is done).
- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While was there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
and factored out some of the vfsop statfs() code to copy_statfs_info(). This
fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
* m_apply(), which applies a function to each mbuf in chain
starting at a specified offset, for a specified length.
* m_getptr(), which returns a pointer to the mbuf, as well as
the offset into that mbuf, corresponding to an offset from
the beginning of an mbuf chain.
From OpenBSD, cleaned up slightly by me.
in the mbuf header.
* Use the new cached paddr feature of the pool_cache API to record
the physical address of mbuf clusters. (We cannot use a ctor for
clusters, since clusters have no constructed form; they are merely
buffers).
Bus_dma back-ends may use the cached physical addresses to save having to
extract the physical address from virtual.
* Provide space in m_ext recording the vm_page *'s for an SOSEND_LOAN_CHUNK-
sized non-cluster external buffer. Use this in the sosend_loan code to
save having to extract the physical address from virtual and then look
up the vm_page *'s.
* Provide an indication that an external buffer is mapped read-only at
the MMU. Set this flag for the external buffer in the sosend_loan
case, since loaned pages are always mapped read-only. Bus_dma back-ends
may use this information to save cache flushing, since a cache flush of
a read-only mapping is redundant on some architectures (the cache would
have already been flushed when making the mapping read-only).
Part 2 in a series of simple patches contributed by Wasabi Systems
to improve network performance.
objects. Clients of the pool_cache API must consistently use
the "paddr" variants or not, otherwise behavior is undefined.
Enable this on Alpha, ARM, MIPS, and x86. Other platforms must
define POOL_VTOPHYS() in the appropriate manner in order to enable
the feature.
Part 1 of a series of simple patches contributed by Wasabi Systems
to improve network performance.
checking against nprocs is wrong in any case btw - we do allow
maxproc higher than number of current processes, it would just mean
no new process could be started until number of processes would
be lower than the new limit
Avoids a lot of casting and removes the need for some line breaks.
Removed a load of (caddr_t) casts from calls to copyin/copyout as well.
(approved by christos - he has a plan to remove caddr_t...)
- #ifdef out a DIAGNOSTIC printf() that was too annoying (rule of thumb,
don't make DIAGNOSTIC printfs() that print *very* frequently...)
- fix DIAGNOSTIC test that would always get triggered on a new session.
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.
having some #ifdef UNION code in vfs_vnops.c, introduce variable
'vn_union_readdir_hook' which is set to address of appropriate
vn_readdir() hook by union filesystem when it's loaded & mounted
Treat +ve numbers as process group ids and -ve as pids (see tcsetpgrp() in part of current session.
Treat +ve numbers as process group ids and -ve as pids - see tcsetpgrp(3).
(approved by christos)
syscall sys_timer_settime()) to take an absolute value for realtime
timers. This avoids a pair of gratiuitous conversions with the
possibility that the timer's intermediate value would be 0.0, which
would signal timer_settime() to cancel the timer.
Adjust callers of timer_settime() to compensate; catch the case where
sys_timer_settime() with an absolute time value of now and a virtual
timer would also be subtracted down to a timer-cancelling 0.0.
This should fix the bug seen in libpthread's nanosleep() where certain
applications, such as xmms, would wedge with unexpired userlevel
alarms.
people would read that list in a more timely fashion!), change the new
64-bit memory reporting sysctl nodes to report bytes. This should not
be a problem, since it's only a week old, and no applications use the
new nodes yet.
be returning because the code path that calls is will very likely call
setrunnable() again on the returned thread, leading to a panic because
the thread returned is already at LSRUN. This fixes a problem where netbsd
would panic when using gdb (5.3) on a process with multiple lwp's like this:
% gdb program
(gdb) run
^C
(gdb) quit
We cannot store LWP pointers permanently in lock structures, for two reasons:
1) They are somewhat ephemeral. Dangling pointers are bad.
2) A different LWP may issue the unlock, and in this case, we were not actually
doing the unlock at all. This was causing processes to exit without undoing
fcntl(2) locks. Furthermore, the locks are process-specific to begin with,
so the test was just plain wrong.
Instead, we go back to storing a proc pointer for POSIX locks. In addition, we
add an extra pointer to the LWP, which is used in deadlock detection. After
the lock is granted, this pointer is 0ed and there is no reference to the LWP.
Now evolution can inc my mail again.