external storage. Highlights:
- additional "void *" argument to (*ext_free)(), an opaque
cookie for use by the free function.
- MCLALLOC() and MCLFREE() calls are gone. They are replaced
by MEXTADD() (add external storage to mbuf), MEXTMALLOC()
(malloc() external storage and attach to mbuf), and
MEXTREMOVE() (remove external storage from mbuf).
- completely new external storage reference counting
mechanism; mclrefcnt[] is gone.
These changes will eventually be used to pass driver DMA buffers up
the network stack, and reduce/eliminate copies in certain code paths
(e.g. NFS writes).
From Matt Thomas <matt@3am-software.com> and myself <thorpej@nas.nasa.gov>,
with some input from Chris Demetriou <cgd@cs.cmu.edu> and review by
Charles Hannum <mycroft@mit.edu>.
Some of the stuff (e.g., rarpd, bootpd, dhcpd etc., libsa) still will
only support Ethernet. Tcpdump itself should be ok, but libpcap needs
lot of work.
For the detailed change history, look at the commit log entries for
the is-newarp branch.
work. Not quite as good as with the Lite2 merges, but it'll do until then.
* dounmount() expects to be called with the mountpoint marked busy
* all callers of dounmount() thus make the call themselves
* if a filesystem was being unmounted, and we're woken up in vfs_busy(),
don't reference the mountpoint struct pointer, as it has very probably
been freed.
in the assembly file genassym.s into the usual assym.h file. The
assym.h file generated this way is identical to the output generated
if I simply compile and run the genassym.s file. "Heh, Kewl!"
Thanks to Matthias Pfaller for the "translate the .s file" idea!
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.
mount the root file system. If the operator specified the root
file system type in the kernel configuration file, attempt to
mount that file system type on the root device. If the root
file system type was wildcarded (or unspecified), try all of
the file systems statically built into the kernel until one
succeeds. If no file systems succeed, return an error. The
system will recover from this condition.
- Implement vfs_getopsbyname(). This function returns the file
system ops vector given a file system name.
from the version used by NetBSD/alpha, with several changes by me.
Support for asking for root device and root file system type on any
kernel, obsoleting "options GENERIC".
- Make my mountroothook implementation used by the sparc and x68k
ports machine-independent, and use it here. Mountroothooks allow
devices to execute special functions before being mounted as the
root device (such as ejecting the floppy and prompting for a new
floppy disk).
- Make swapconf() machine-independent. It was identical on all ports.
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been sucessfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.
is running (and NTP is not enabled), the adjtime()-handling code clobbers
any tickfix that may be necessary for systems with clocks with frequency
greater than 1000Hz.
Eliminate obsolete global kernel variable "struct timezone tz"
Add RTC_OFFSET option
Add global kernel variable rtc_offset, which is initialized by
RTC_OFFSET at kernel compile time.
on i386, x68k, mac68k, pc532 and arm32, RTC_OFFSET indicates how many
minutes west (east) of GMT the hardware RTC runs. Defaults to 0.
Places where tz variable was used to indicate this in the past have
been replaced with rtc_offset.
Add sysctl interface to rtc_offset.
Kill obsolete DST_* macros in sys/time.h
gettimeofday now always returns zeroed timezone if zone is requested.
settimeofday now ignores and logs attempts to set non-existant kernel
timezone.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes
* Change sockargs()'s second argument to be a const void *, to help
with dealing with the syscall argument type fixups/const poisoning.
* change in-kernel syscall prototypes to match user-land prototypes in
the following ways:
+ add 'const' where appropriate.
+ make the following "safe" type changes where appropriate:
caddr_t -> struct msghdr *
caddr_t -> struct sockaddr *
caddr_t -> void *
char * -> void *
int -> uid_t (safe because uid_t not used as index/count)
int -> gid_t (safe because gid_t not used as index/count)
u_int -> size_t
+ change "int" to "u_long" in flags arguments to chflags() and
fchflags(). This is safe because the arguments are used as
flag bits and there's nothing that would cause the top bit
of the int to be set yet, and because the user-land definitions
already specified u_long, so a u_long's worth of argument was
already being passed in.
wrong for a bunch of functions:
void: sys_exit, sys_sync
ssize_t: sys_read, sys_write, sys_recvmsg, sys_sendmsg,
sys_recvfrom, sys_readv, sys_writev, sys_sendto
long: sys_pathconf, sys_fpathconf
void *: sys_shmat
* Note that sys_open, sys_ioctl, and sys_fcntl are defined such that their
last argument is optional.
These changes should not have any real effect, because right now this
information is not actually used for anything.
* Don't output prototypes for INDIR syscalls (since they always show up as
sys_nosys() in the syscall table).
* Add "indir" to the comment for INDIR syscalls in the syscalls table, so
it's more obvious why they call sys_nosys().
* Deal with multi-word system call return types (i.e. foo *, or
struct foo *, or struct foo, etc.).
* Add a new class of system calls "INDIR" (for "indirect"), which
is to be used to represent indirect syscalls like syscall() and
__syscall() which are implemented in MD code and which don't want
args structures defined. (The old way of declaring this type of
syscalls still works.)
* Allow system calls to be marked as having a variable number of
arguments, by inserting "..." (no trailing comma) before the
first hf the optional arguments in the syscall definition. Because
of the way syscall arguments are handled by MI code, _ALL_ syscall
arguments must actually be included in the definition, i.e.
"optional" arguments are either "are there or aren't," i.e. these
aren't really varargs functions. Therefore, for normal syscalls,
there _must_ be arguments listed after the "...". For INDIR
syscalls, which really do have a variable number of arguments and
which aren't handled via the normal mechanism, that requirement is
not in force.
* output primitive (machine-parsable) syscall descriptions as comments
in <sys/syscall.h>. These can be used to easily build real function
prototypes, or to build stub functions for use by lint.
>- Optional systems calls are "UNIMPL" if the support is not being
> compiled into the kernel.
It had implications that didn't occurr to me at the time. *sigh*
>- Optional systems calls are "UNIMPL" if the support is not being
> compiled into the kernel.
It had implications that didn't occur to me at the time. *sigh*
defined:
define match functions to take a struct cfdata * as their second
argument, config_search() to take a struct cfdata * as its second
argument, and config_{root,}search() to return struct cfdata *.
remove 'cd_indirect' cfdriver element.
remove config_scan().
remove config_make_softc() as a seperate function, reintegrating
its functionality into config_attach().
Ports will define __BROKEN_INDIRECT_CONFIG until their drivers prototypes
are updated to work with the new definitions, and until it is sure that
their indirect-config drivers do not assume that they have a softc
in their match routine.
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.
(1) after removing a shutdown hook (in shutdownhook_disestablish()),
free it. We created it, we have to free it. Without this,
shutdownhook_disestablish() leaks memory.
(2) in doshutdownhooks(), before running each hook, remove it from the
shutdown hook list. This makes sure that every hook is tried once
(because doshutdownhooks() is called from before rebooting, and
a fault in a shutdown hook will cause doshutdownhooks() to be called
again), but prevents the hooks from potentially being run infinitely
(as used to be possible, in the above-mentioned situation).
If not compiled with -D_KERNEL, include different includes and
do so macro magic so that this will fit sanely into test harnesses.
When used in user-land, this should be compiled with -D_EXTENT_TESTING.
Bug fixes:
(extent_insert_and_optimize) You can't do things like:
LIST_REMOVE(elem->...le_next, ...);
free(elem->...le_next, ...);
They just don't work (and will corrupt your list and/or malloc free list).
(extent_alloc_region_descriptor) Unless you wait, malloc can fail.
Don't accidentally deref a potentially-NULL pointer.
- The functions that implement them and the argument names are
prepended with "sys_".
- Optional systems calls are "UNIMPL" if the support is not being
compiled into the kernel.
representing the names of those bits, prints them into a buffer
provided by the caller, and returns a pointer to that buffer.
Functionality is identical to that of the (non-standard) `%b' printf()
format, which will be deprecated.
Rename the non-exported function ksprintn() to ksnprintn(), and change
it to use a buffer provided by the caller, rather than at static
buffer.
* handle interpreters with nonzero virtual address of entry-point:
subtract p_vaddr from computed entrypoint, as the mips elf exec did.
* Add #ifdef ELF_INTERP_NON_RELOCATABLE/#endif around the code
that tries to choose a `good' address at which to load an interpreter,
if none was set by the emul probe function.
(the address chosen could be improved to avoid fragmenting the
process virtual address space).
* define ELF_INTERP_NON_RELOCATABLE in machine/elf_machdep.h for mips CPUs,
which currently use a GNU-derived ld.so.
ELF_INTERP_NON_RELOCATABLE is not necessary for native NetBSD/alpha ELF
binaries. It may be required for GNU-derived ELF dynamic loaders (Linux/i386?)
Keep queue of pending sockets in a double linked list. Previously,
a singly linked list was used, giving O(N) insertion/deletion times,
and was a major time consumer for sockets with large pending queues.
The double linked list give O(C) insertion/deletion times with only
a small cost in complexity.
Since a socket can be on, at most, one queue at a time, both so_q and
so_q0 can safely be used as (forward and backward, respectively) queue
pointers.
Submitted my Matt Thomas <matt@3am-software.com>, a long time ago.
(Geez, I've been running with this patch for _months_, and had completely
forgotten about it!)
struct member cn_nameptr 'const', since they should never be used to
modify the path name. (Only the pathname buffer, cn_pnbuf, should be
modified.) Propagate the const poisoning to code that uses the namei
and componentname structs.
not used by anything, for now), and implement MNT_NOCOREDUMP by checking
whether or not MNT_NOCOREDUMP is set on the file system where the dump
would land (i.e. the file system of the process's current working
directory), and disallowing the core dump if it's set.
- Rename EX_NOBLOB to EX_NOCOALESCE; it's much more descriptive of
what's going on.
- In extent_free_region_descriptor(), if we're a fixed extent,
freeing a dynamically allocated region descriptor, and someone
is waiting on the freelist, let the waiter have it, rather than
free'ing it back to the system.
- Use ALIGN(), rather than our homegrown EXTENT_ALIGN(), when dealing
with map overhead. Privatize the EXTENT_ALIGN() macro; there's no need
to export it.
- Implement EX_BOUNDZERO flag. This changes the boundary line policy in
extent_alloc() and extent_alloc_subregion(); boundary lines are
computed relative to 0, rather then the start of the extent.
- Fix a nasty race between multiple participants doing region and
descriptor allocation.
- Add a new flag to specify that it's ok to wait for space in the
extent: EX_WAITSPACE.
- Blow away an unnecessary splhigh()/splx().
- Put a bunch of sanity code inside #ifdef DIAGNOSTIC/#endif.
of using it directly, use a local, and set that local to be curproc
if curproc is not NULL else a pointer to process 0's proc struct.
If syncing disks while handling a panic that occurred while 'curproc'
was NULL, the old code would dereference NULL and die.
ktrace context switch checking. If syncing disks while handling a panic
that occurred while 'curproc' was NULL, the old code would dereference
NULL and die. The (slight) reorganization was done so that space (one extra
splhigh()), rather than time (one extra comparison), would be wasted.
section. Patch come up with by Bob Baron <rvb+@cs.cmu.edu> and myself.
This entire bit of code (the code which sets daddr/dsize and taddr/tsize)
is very bogus, but it's not clear what the 'right' way to fix it is
and this patch fixes a problem preventing some ELF executables from
being run.
ELF_ROUND (round to higher alignment boundary), and use them properly.
Also, change a bit of code in elf_load_psection to use the next ELF_ROUND
macro. This fixes a bug found by Robert Baron <rvb+@cs.cmu.edu> where
elf_load_psection, if given a properly aligned address at which to load
the section, would round actually load it at the next highest alignment
boundary.
for NOEXEC and NOSUID, and make sure the interpreter file is executable.
The mount point checks are done because, even though the interpreter
is not the program being 'executed', code from the interpreter is being
executed, and so the mount point's flags should be respected.
and shell script support to be optional (conditioned on EXEC_SCRIPT).
Remove the implicit inclusion of EXEC_ECOFF when COMPAT_OSF1 and/or
COMPAT_ULTRIX is included, and of EXEC_ELF32 when COMPAT_LINUX and/or
COMPAT_SVR4 is included.
queue.h list/queue head initializer macros. mountlist was converted so
that panics (or other reboots) early on in kernel startup don't cause
sys_sync() to croak. vnode_free_list was converted because it was nearby.