arguments passed to accept(), bind(), connect(), getpeername(), getsockname(),
getsockopt(), recvfrom(), sendto() and sendmsg() unsigned, which also elimiates
a few casts.
* Reflect the (now) signedness of msg_iovlen, which necessiates the addition
of a few casts.
- no longer conditionalized
- when traced, charge time to real parent, not debugger
- make it clear for future rototillers that p_estcpu should be moved
to the "copy" region of struct proc.
parents for children's p_estcpu. I think this problem has always been there.
It's particularly noticable with X because the server builds up non-trivial
CPU, and hence, non-trivial p_estcpu scheduler penalty. The repeatedly
forked children were always starting from scratch and receiving a scheduler
preference.
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
data after the cmsghdr when accessing internalized SCM_RIGHTS messages
(i.e. array of struct file *s). The historic interface does not align
the externalized SCM_RIGHTS messages (i.e. array of ints).
date: 1998/08/01 01:47:24; author: thorpej; state: Exp; lines: +18 -8
Don't call the protocol drain routines if how == M_NOWAIT, which typically
means we're in interrupt context. Since we can be called from a network
hardware interrupt, we could corrupt the protocol queues we try to drain
them at that time.
The problem has been addressed by letting the drain'able protocols use
a locking scheme to prevent queue corruption.
Initialize pool hash table with PR_HASHTABSIZE (i.e., 8) LIST_INIT()s
instead of one memset().
Only check for page != ph->ph_page if PR_PHINPAGE is set (in pool_chk()).
Print pool base pointer when reporting page inconsistency in pool_chk().
set SS_MORETOCOME as a hint to the lower layer that more data is coming
on the next iteration of the loop. Clear the flag after the PRU_SEND
call.
Suggested by Justin Walker <justin@apple.com> on the freebsd-net
mailing list.
There are two reasons for this:
* We should be able to pass file descriptors without sending any data.
* We could send zero-length iovecs anyway (but we shouldn't have to do this).
Also, msg_iovlen is already a u_int, so delete a bunch of casts.
by Ken Hornstein and myself.
Add flags to struct device, and define one as "active". Devices are
initially active from config_attach(). Their active state may be changed
via config_activate() and config_deactivate().
These new functions assume that the device being manipulated will recursively
perform the action on its children.
Together, config_deactivate() and config_detach() may be used to implement
interrupt-driven device detachment. config_deactivate() will take care of
things that need to be performed at interrupt time, and config_detach()
(which must run in a valid thread context) finishes the job, which may
block.
kthread_create(). Implement kthread_exit() (causes a thrad to exit).
Set P_NOCLDWAIT on kernel threads, which will cause any of their children
to be reparented to init(8) (which is already prepared to wait out orphaned
processes).
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().
DDB_ONPANIC. Lets user ignore breaks but enter DDB on panic. Intended
for machines where debug on panic is useful, but DDB entry is not,
(public-access console, or terminal-servers which send spurious breaks)
Add new ddb hook, console_debugger(), which decides whether or not to
ignore console ddb requests. Console drivers should be updated to call
console_debugger(), not Debugger(), in response to serial-console
break or ddb keyboard sequence.
* Change the usage of B_DONE so that it is only set when a buffer is in sync
with the data on disk.
* If a buffer is being waited for, don't put it on the age queue.
* Make sure to clear B_DONE when pages are stolen from a buffer.
* Make sure to clear B_CACHE after each use.
* If we find a buffer for the block we want with valid data, but it is too
small, panic. (This isn't supposed to happen.)
Fixes potential file corruption problems with clustering.
* Increase the size of sigset_t to accomodate 128 signals -- adding new
versions of sys_setprocmask(), sys_sigaction(), sys_sigpending() and
sys_sigsuspend() to handle the changed arguments.
* Abstract the guts of sys_sigaltstack(), sys_setprocmask(), sys_sigaction(),
sys_sigpending() and sys_sigsuspend() into separate functions, and call them
from all the emulations rather than hard-coding everything. (Avoids uses
the stackgap crap for these system calls.)
* Add a new flag (p_checksig) to indicate that a process may have signals
pending and userret() needs to do the full (slow) check.
* Eliminate SAS_ALTSTACK; it's exactly the inverse of SS_DISABLE.
* Correct emulation bugs with restoring SS_ONSTACK.
* Make the signal mask in the sigcontext always use the emulated mask format.
* Store signals internally in sigaction structures, rather than maintaining a
bunch of little sigsets for each SA_* bit.
* Keep track of where we put the signal trampoline, rather than figuring it out
in *_sendsig().
* Issue a warning when a non-emulated sigaction bit is observed.
* Add missing emulated signals, and a native SIGPWR (currently not used).
* Implement the `not reset when caught' semantics for relevant signals.
Note: Only code touched by the i386 port has been modified. Other ports and
emulations need to be updated.
number modulo the given alignment.
To do this the function extent_alloc_subregion() takes an additional `skew'
parameter. For compatibility's sake, this function has been renamed to
extent_alloc_subregion1().
processes.
- Create a new data structure, the proclist_desc, which contains a
pointer to a proclist, and eventually, a pointer to the lock for that
proclist. Declare a static array of proclist_descs, proclists[],
consisting of allproc, deadproc, and zombproc.
The only benefit this provides is that we don't use kmem_map to map the memory
used for name cache entries (though, this is a 13 virtual page savings on my
PPro) since vnodes are never freed (they have their own freelist).
only benefit this provides is that we don't use kmem_map to map the memory
used for vnodes (though, this is a 30 virtual page savings on my PPro)
since vnodes are never freed (they have their own freelist).
* the first one would cause an unnecessary malloc() of iovec storage for
a msg_iovlen of UIO_SMALLIOV although the required amount of memory has
been allocated on the stack.
* the second one would cause a recvmsg() or sendmsg() with a msg_iovlen of
UIO_MAXIOV to fail with EMSGSIZE, which is also a violation of XNS5.
* availability of POSIX Synchronized I/O (kern.synchronized_io),
* maximum number of iovec structures to be used in readv(2) etc. (kern.iov_max)
via sysctl().
* if synchronized I/O file integrity completion of read operations was
requested, set IO_SYNC in the ioflag passed to the read vnode operator.
* if synchronized I/O data integrity completion of write operations was
requested, set IO_DSYNC in the ioflag passed to the write vnode operator.
means we're in interrupt context. Since we can be called from a network
hardware interrupt, we could corrupt the protocol queues we try to drain
them at that time.
called when devices attach, take two.
Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.
The read/write system calls return ssize_t because -1 is used to indicate
error, therefore the transfer size MUST be limited to SSIZE_MAX, otherwise
garbage can be returned to the user.
There is NO change from existing behavior here, only a more precise
definition of that the semantics are, except in the Alpha case, where
the full SSIZE_MAX transfer size can now be realized (ssize_t is 64-bit
on the Alpha).
The read/write system calls return ssize_t because -1 is used to indicate
error, therefore the transfer size MUST be limited to SSIZE_MAX, otherwise
garbage can be returned to the user.
There is NO change from existing behavior here, only a more precise
definition of that the semantics are, except in the Alpha case, where
the full SSIZE_MAX transfer size can now be realized (ssize_t is 64-bit
on the Alpha).
- If either an alloc or release function is provided, make sure both are
provided, otherwise panic, as this is a fatal error.
- If using the default allocator, default the pool pagesz to PAGE_SIZE,
since that is the granularity of the default allocator's mechanism.
- In the default allocator, use new functions:
uvm_km_alloc_poolpage()/uvm_km_free_poolpage(), or
kmem_alloc_poolpage()/kmem_free_poolpage()
rather than doing it here. These functions may use pmap hooks to
provide alternate methods of mapping pool pages.
- pread() (#173) and pwrite() (#174), which are defined by XPG4.2. System
call numbers match Solaris.
- preadv() (#289) and pwritev() (#290), which are the positional cousins
of readv() and writev(), but not defined by any standard.
* we already have the vnode interlock, so vref() should not ask for it again.
* we call VOP_RECLAIM/VOP_INACTIVE(), which shouldn't be duplicated in vrele().
the new proc structure when performing a fork. This makes it much
easier to abort a fork operation and return an error if we run out
of KVA space.
The U-area pages are still wired down in {,u}vm_fork(), as before.
readlink() from type `int' to type `size_t'. This isn't an ABI change, since
the calling convention of our only LP64 platform (the Alpha) already promotes
this argument to a `long'.
This may not be the final action on this matter; readlink() still returns
an `int', which may change in a future revision of the standard.
the kernel's pmap, since proc0 (and other that share its address space)
are kernel-only processes, and should never contain userspace mappings.
This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.
again - the facility required in this context would be a filesystem-specific
super-user determination, which is not available yet. Also, add some
clarification to a comment.
vectors; defer that to vfs_opv_init().
Change the interface to vfs_opv_init() and export it; it now takes a
pointer to an array of vnodeopv_desc *'s to initialize. Allocate
the vnode operations vectors here. Called by vfs_attach().
Implement vfs_opv_free(), which deallocates the vnode operations
vectors. Called by vfs_detach().
Change vfsinit() to build the initial vfs_list by traversing the
vfs_list_inital[] table, and vfs_attach()'ing those file systems.
Also, initialize special vnodeopv_descs (dead, fifo, spec) which
are not associated with any particular file system.
change_owner().
* Change the semantics of chown(), fchown() and lchown(): when requesting a
change of the owner of a file, clear the set-user-id bit; analogous behaviour
for group changes.
* Since the above is a violation of the semantics specified by POSIX and
X/Open, add corresponding compatibility syscalls: __posix_chown(),
__posix_fchown(), __posix_lchown(). (Neither fchown() nor lchown() is
specified by POSIX; the prefix is intended to reflect the semantics.)
* Rename posix_rename() to __posix_rename() to follow the above convention.
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.
Submitted by Tom Proett <proett@nas.nasa.gov>.