by Ken Hornstein and myself.
Add flags to struct device, and define one as "active". Devices are
initially active from config_attach(). Their active state may be changed
via config_activate() and config_deactivate().
These new functions assume that the device being manipulated will recursively
perform the action on its children.
Together, config_deactivate() and config_detach() may be used to implement
interrupt-driven device detachment. config_deactivate() will take care of
things that need to be performed at interrupt time, and config_detach()
(which must run in a valid thread context) finishes the job, which may
block.
kthread_create(). Implement kthread_exit() (causes a thrad to exit).
Set P_NOCLDWAIT on kernel threads, which will cause any of their children
to be reparented to init(8) (which is already prepared to wait out orphaned
processes).
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().
DDB_ONPANIC. Lets user ignore breaks but enter DDB on panic. Intended
for machines where debug on panic is useful, but DDB entry is not,
(public-access console, or terminal-servers which send spurious breaks)
Add new ddb hook, console_debugger(), which decides whether or not to
ignore console ddb requests. Console drivers should be updated to call
console_debugger(), not Debugger(), in response to serial-console
break or ddb keyboard sequence.
* Change the usage of B_DONE so that it is only set when a buffer is in sync
with the data on disk.
* If a buffer is being waited for, don't put it on the age queue.
* Make sure to clear B_DONE when pages are stolen from a buffer.
* Make sure to clear B_CACHE after each use.
* If we find a buffer for the block we want with valid data, but it is too
small, panic. (This isn't supposed to happen.)
Fixes potential file corruption problems with clustering.
* Increase the size of sigset_t to accomodate 128 signals -- adding new
versions of sys_setprocmask(), sys_sigaction(), sys_sigpending() and
sys_sigsuspend() to handle the changed arguments.
* Abstract the guts of sys_sigaltstack(), sys_setprocmask(), sys_sigaction(),
sys_sigpending() and sys_sigsuspend() into separate functions, and call them
from all the emulations rather than hard-coding everything. (Avoids uses
the stackgap crap for these system calls.)
* Add a new flag (p_checksig) to indicate that a process may have signals
pending and userret() needs to do the full (slow) check.
* Eliminate SAS_ALTSTACK; it's exactly the inverse of SS_DISABLE.
* Correct emulation bugs with restoring SS_ONSTACK.
* Make the signal mask in the sigcontext always use the emulated mask format.
* Store signals internally in sigaction structures, rather than maintaining a
bunch of little sigsets for each SA_* bit.
* Keep track of where we put the signal trampoline, rather than figuring it out
in *_sendsig().
* Issue a warning when a non-emulated sigaction bit is observed.
* Add missing emulated signals, and a native SIGPWR (currently not used).
* Implement the `not reset when caught' semantics for relevant signals.
Note: Only code touched by the i386 port has been modified. Other ports and
emulations need to be updated.
number modulo the given alignment.
To do this the function extent_alloc_subregion() takes an additional `skew'
parameter. For compatibility's sake, this function has been renamed to
extent_alloc_subregion1().
processes.
- Create a new data structure, the proclist_desc, which contains a
pointer to a proclist, and eventually, a pointer to the lock for that
proclist. Declare a static array of proclist_descs, proclists[],
consisting of allproc, deadproc, and zombproc.
The only benefit this provides is that we don't use kmem_map to map the memory
used for name cache entries (though, this is a 13 virtual page savings on my
PPro) since vnodes are never freed (they have their own freelist).
only benefit this provides is that we don't use kmem_map to map the memory
used for vnodes (though, this is a 30 virtual page savings on my PPro)
since vnodes are never freed (they have their own freelist).
* the first one would cause an unnecessary malloc() of iovec storage for
a msg_iovlen of UIO_SMALLIOV although the required amount of memory has
been allocated on the stack.
* the second one would cause a recvmsg() or sendmsg() with a msg_iovlen of
UIO_MAXIOV to fail with EMSGSIZE, which is also a violation of XNS5.
* availability of POSIX Synchronized I/O (kern.synchronized_io),
* maximum number of iovec structures to be used in readv(2) etc. (kern.iov_max)
via sysctl().
* if synchronized I/O file integrity completion of read operations was
requested, set IO_SYNC in the ioflag passed to the read vnode operator.
* if synchronized I/O data integrity completion of write operations was
requested, set IO_DSYNC in the ioflag passed to the write vnode operator.
means we're in interrupt context. Since we can be called from a network
hardware interrupt, we could corrupt the protocol queues we try to drain
them at that time.
called when devices attach, take two.
Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.
The read/write system calls return ssize_t because -1 is used to indicate
error, therefore the transfer size MUST be limited to SSIZE_MAX, otherwise
garbage can be returned to the user.
There is NO change from existing behavior here, only a more precise
definition of that the semantics are, except in the Alpha case, where
the full SSIZE_MAX transfer size can now be realized (ssize_t is 64-bit
on the Alpha).
The read/write system calls return ssize_t because -1 is used to indicate
error, therefore the transfer size MUST be limited to SSIZE_MAX, otherwise
garbage can be returned to the user.
There is NO change from existing behavior here, only a more precise
definition of that the semantics are, except in the Alpha case, where
the full SSIZE_MAX transfer size can now be realized (ssize_t is 64-bit
on the Alpha).
- If either an alloc or release function is provided, make sure both are
provided, otherwise panic, as this is a fatal error.
- If using the default allocator, default the pool pagesz to PAGE_SIZE,
since that is the granularity of the default allocator's mechanism.
- In the default allocator, use new functions:
uvm_km_alloc_poolpage()/uvm_km_free_poolpage(), or
kmem_alloc_poolpage()/kmem_free_poolpage()
rather than doing it here. These functions may use pmap hooks to
provide alternate methods of mapping pool pages.
- pread() (#173) and pwrite() (#174), which are defined by XPG4.2. System
call numbers match Solaris.
- preadv() (#289) and pwritev() (#290), which are the positional cousins
of readv() and writev(), but not defined by any standard.
* we already have the vnode interlock, so vref() should not ask for it again.
* we call VOP_RECLAIM/VOP_INACTIVE(), which shouldn't be duplicated in vrele().
the new proc structure when performing a fork. This makes it much
easier to abort a fork operation and return an error if we run out
of KVA space.
The U-area pages are still wired down in {,u}vm_fork(), as before.
readlink() from type `int' to type `size_t'. This isn't an ABI change, since
the calling convention of our only LP64 platform (the Alpha) already promotes
this argument to a `long'.
This may not be the final action on this matter; readlink() still returns
an `int', which may change in a future revision of the standard.
the kernel's pmap, since proc0 (and other that share its address space)
are kernel-only processes, and should never contain userspace mappings.
This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.
again - the facility required in this context would be a filesystem-specific
super-user determination, which is not available yet. Also, add some
clarification to a comment.
vectors; defer that to vfs_opv_init().
Change the interface to vfs_opv_init() and export it; it now takes a
pointer to an array of vnodeopv_desc *'s to initialize. Allocate
the vnode operations vectors here. Called by vfs_attach().
Implement vfs_opv_free(), which deallocates the vnode operations
vectors. Called by vfs_detach().
Change vfsinit() to build the initial vfs_list by traversing the
vfs_list_inital[] table, and vfs_attach()'ing those file systems.
Also, initialize special vnodeopv_descs (dead, fifo, spec) which
are not associated with any particular file system.
change_owner().
* Change the semantics of chown(), fchown() and lchown(): when requesting a
change of the owner of a file, clear the set-user-id bit; analogous behaviour
for group changes.
* Since the above is a violation of the semantics specified by POSIX and
X/Open, add corresponding compatibility syscalls: __posix_chown(),
__posix_fchown(), __posix_lchown(). (Neither fchown() nor lchown() is
specified by POSIX; the prefix is intended to reflect the semantics.)
* Rename posix_rename() to __posix_rename() to follow the above convention.
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.
Submitted by Tom Proett <proett@nas.nasa.gov>.
UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.
this is the rest of the MI portion changes.
this will be KNF'd shortly. :-)
down "Data modified on freelist" and "muliple free" problems.
The log is activated by the MALLOCLOG option, and the size of the
event ring buffer is controlable via the MALLOGLOGSIZE option (default
is 100000 entries).
From Chris Demetriou, cleaned up a little by me per suggestions in the
e-mail from Chris that contained the code.
called "MACHINE_NEW_NONCONGIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.
this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)
enabled with the LOCAL_CREDS socket option on the listener. Semantics are
similar to BSD/OS's:
- Creds are available with first data on SOCK_STREAM, and with every datagram
on SOCK_DGRAM.
- It is not possible to forge credentials.
Different in that:
- Different credential data structure (ours does not rely on the format
of internal kernel data structures, and does not pass the login name).
- We can pass creds and file descriptors at the same time (this does not
work in BSD/OS).
Luke Mewburn <lukem@netbsd.org> gets credit for inspiring me to implement
this. :-)
so_linger is used as an argument to tsleep(), so was stuffed with
clockticks for the TCP linger time. However, so_linger is set directly from
l_linger if the linger time is specified, and l_linger is seconds (although
this is not currently documented anywhere). Fix this to set the TCP
linger time in seconds, and multiply so_linger by hz when tsleep() is
called to actually perform the linger.
3BSD vfork(2), i.e. share address space w/ parent and block parent.
Keep statistics on the total number of forks, the number of forks that
block the parent, and the number of forks that share the address space
with the parent.