* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.
mount the root file system. If the operator specified the root
file system type in the kernel configuration file, attempt to
mount that file system type on the root device. If the root
file system type was wildcarded (or unspecified), try all of
the file systems statically built into the kernel until one
succeeds. If no file systems succeed, return an error. The
system will recover from this condition.
- Implement vfs_getopsbyname(). This function returns the file
system ops vector given a file system name.
from the version used by NetBSD/alpha, with several changes by me.
Support for asking for root device and root file system type on any
kernel, obsoleting "options GENERIC".
- Make my mountroothook implementation used by the sparc and x68k
ports machine-independent, and use it here. Mountroothooks allow
devices to execute special functions before being mounted as the
root device (such as ejecting the floppy and prompting for a new
floppy disk).
- Make swapconf() machine-independent. It was identical on all ports.
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been sucessfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.
is running (and NTP is not enabled), the adjtime()-handling code clobbers
any tickfix that may be necessary for systems with clocks with frequency
greater than 1000Hz.
Eliminate obsolete global kernel variable "struct timezone tz"
Add RTC_OFFSET option
Add global kernel variable rtc_offset, which is initialized by
RTC_OFFSET at kernel compile time.
on i386, x68k, mac68k, pc532 and arm32, RTC_OFFSET indicates how many
minutes west (east) of GMT the hardware RTC runs. Defaults to 0.
Places where tz variable was used to indicate this in the past have
been replaced with rtc_offset.
Add sysctl interface to rtc_offset.
Kill obsolete DST_* macros in sys/time.h
gettimeofday now always returns zeroed timezone if zone is requested.
settimeofday now ignores and logs attempts to set non-existant kernel
timezone.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes
* Change sockargs()'s second argument to be a const void *, to help
with dealing with the syscall argument type fixups/const poisoning.
* change in-kernel syscall prototypes to match user-land prototypes in
the following ways:
+ add 'const' where appropriate.
+ make the following "safe" type changes where appropriate:
caddr_t -> struct msghdr *
caddr_t -> struct sockaddr *
caddr_t -> void *
char * -> void *
int -> uid_t (safe because uid_t not used as index/count)
int -> gid_t (safe because gid_t not used as index/count)
u_int -> size_t
+ change "int" to "u_long" in flags arguments to chflags() and
fchflags(). This is safe because the arguments are used as
flag bits and there's nothing that would cause the top bit
of the int to be set yet, and because the user-land definitions
already specified u_long, so a u_long's worth of argument was
already being passed in.
wrong for a bunch of functions:
void: sys_exit, sys_sync
ssize_t: sys_read, sys_write, sys_recvmsg, sys_sendmsg,
sys_recvfrom, sys_readv, sys_writev, sys_sendto
long: sys_pathconf, sys_fpathconf
void *: sys_shmat
* Note that sys_open, sys_ioctl, and sys_fcntl are defined such that their
last argument is optional.
These changes should not have any real effect, because right now this
information is not actually used for anything.
* Don't output prototypes for INDIR syscalls (since they always show up as
sys_nosys() in the syscall table).
* Add "indir" to the comment for INDIR syscalls in the syscalls table, so
it's more obvious why they call sys_nosys().
* Deal with multi-word system call return types (i.e. foo *, or
struct foo *, or struct foo, etc.).
* Add a new class of system calls "INDIR" (for "indirect"), which
is to be used to represent indirect syscalls like syscall() and
__syscall() which are implemented in MD code and which don't want
args structures defined. (The old way of declaring this type of
syscalls still works.)
* Allow system calls to be marked as having a variable number of
arguments, by inserting "..." (no trailing comma) before the
first hf the optional arguments in the syscall definition. Because
of the way syscall arguments are handled by MI code, _ALL_ syscall
arguments must actually be included in the definition, i.e.
"optional" arguments are either "are there or aren't," i.e. these
aren't really varargs functions. Therefore, for normal syscalls,
there _must_ be arguments listed after the "...". For INDIR
syscalls, which really do have a variable number of arguments and
which aren't handled via the normal mechanism, that requirement is
not in force.
* output primitive (machine-parsable) syscall descriptions as comments
in <sys/syscall.h>. These can be used to easily build real function
prototypes, or to build stub functions for use by lint.
>- Optional systems calls are "UNIMPL" if the support is not being
> compiled into the kernel.
It had implications that didn't occurr to me at the time. *sigh*
>- Optional systems calls are "UNIMPL" if the support is not being
> compiled into the kernel.
It had implications that didn't occur to me at the time. *sigh*
defined:
define match functions to take a struct cfdata * as their second
argument, config_search() to take a struct cfdata * as its second
argument, and config_{root,}search() to return struct cfdata *.
remove 'cd_indirect' cfdriver element.
remove config_scan().
remove config_make_softc() as a seperate function, reintegrating
its functionality into config_attach().
Ports will define __BROKEN_INDIRECT_CONFIG until their drivers prototypes
are updated to work with the new definitions, and until it is sure that
their indirect-config drivers do not assume that they have a softc
in their match routine.
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.
(1) after removing a shutdown hook (in shutdownhook_disestablish()),
free it. We created it, we have to free it. Without this,
shutdownhook_disestablish() leaks memory.
(2) in doshutdownhooks(), before running each hook, remove it from the
shutdown hook list. This makes sure that every hook is tried once
(because doshutdownhooks() is called from before rebooting, and
a fault in a shutdown hook will cause doshutdownhooks() to be called
again), but prevents the hooks from potentially being run infinitely
(as used to be possible, in the above-mentioned situation).
If not compiled with -D_KERNEL, include different includes and
do so macro magic so that this will fit sanely into test harnesses.
When used in user-land, this should be compiled with -D_EXTENT_TESTING.
Bug fixes:
(extent_insert_and_optimize) You can't do things like:
LIST_REMOVE(elem->...le_next, ...);
free(elem->...le_next, ...);
They just don't work (and will corrupt your list and/or malloc free list).
(extent_alloc_region_descriptor) Unless you wait, malloc can fail.
Don't accidentally deref a potentially-NULL pointer.
- The functions that implement them and the argument names are
prepended with "sys_".
- Optional systems calls are "UNIMPL" if the support is not being
compiled into the kernel.
representing the names of those bits, prints them into a buffer
provided by the caller, and returns a pointer to that buffer.
Functionality is identical to that of the (non-standard) `%b' printf()
format, which will be deprecated.
Rename the non-exported function ksprintn() to ksnprintn(), and change
it to use a buffer provided by the caller, rather than at static
buffer.
* handle interpreters with nonzero virtual address of entry-point:
subtract p_vaddr from computed entrypoint, as the mips elf exec did.
* Add #ifdef ELF_INTERP_NON_RELOCATABLE/#endif around the code
that tries to choose a `good' address at which to load an interpreter,
if none was set by the emul probe function.
(the address chosen could be improved to avoid fragmenting the
process virtual address space).
* define ELF_INTERP_NON_RELOCATABLE in machine/elf_machdep.h for mips CPUs,
which currently use a GNU-derived ld.so.
ELF_INTERP_NON_RELOCATABLE is not necessary for native NetBSD/alpha ELF
binaries. It may be required for GNU-derived ELF dynamic loaders (Linux/i386?)
Keep queue of pending sockets in a double linked list. Previously,
a singly linked list was used, giving O(N) insertion/deletion times,
and was a major time consumer for sockets with large pending queues.
The double linked list give O(C) insertion/deletion times with only
a small cost in complexity.
Since a socket can be on, at most, one queue at a time, both so_q and
so_q0 can safely be used as (forward and backward, respectively) queue
pointers.
Submitted my Matt Thomas <matt@3am-software.com>, a long time ago.
(Geez, I've been running with this patch for _months_, and had completely
forgotten about it!)
struct member cn_nameptr 'const', since they should never be used to
modify the path name. (Only the pathname buffer, cn_pnbuf, should be
modified.) Propagate the const poisoning to code that uses the namei
and componentname structs.
not used by anything, for now), and implement MNT_NOCOREDUMP by checking
whether or not MNT_NOCOREDUMP is set on the file system where the dump
would land (i.e. the file system of the process's current working
directory), and disallowing the core dump if it's set.
- Rename EX_NOBLOB to EX_NOCOALESCE; it's much more descriptive of
what's going on.
- In extent_free_region_descriptor(), if we're a fixed extent,
freeing a dynamically allocated region descriptor, and someone
is waiting on the freelist, let the waiter have it, rather than
free'ing it back to the system.
- Use ALIGN(), rather than our homegrown EXTENT_ALIGN(), when dealing
with map overhead. Privatize the EXTENT_ALIGN() macro; there's no need
to export it.
- Implement EX_BOUNDZERO flag. This changes the boundary line policy in
extent_alloc() and extent_alloc_subregion(); boundary lines are
computed relative to 0, rather then the start of the extent.
- Fix a nasty race between multiple participants doing region and
descriptor allocation.
- Add a new flag to specify that it's ok to wait for space in the
extent: EX_WAITSPACE.
- Blow away an unnecessary splhigh()/splx().
- Put a bunch of sanity code inside #ifdef DIAGNOSTIC/#endif.
of using it directly, use a local, and set that local to be curproc
if curproc is not NULL else a pointer to process 0's proc struct.
If syncing disks while handling a panic that occurred while 'curproc'
was NULL, the old code would dereference NULL and die.
ktrace context switch checking. If syncing disks while handling a panic
that occurred while 'curproc' was NULL, the old code would dereference
NULL and die. The (slight) reorganization was done so that space (one extra
splhigh()), rather than time (one extra comparison), would be wasted.
section. Patch come up with by Bob Baron <rvb+@cs.cmu.edu> and myself.
This entire bit of code (the code which sets daddr/dsize and taddr/tsize)
is very bogus, but it's not clear what the 'right' way to fix it is
and this patch fixes a problem preventing some ELF executables from
being run.
ELF_ROUND (round to higher alignment boundary), and use them properly.
Also, change a bit of code in elf_load_psection to use the next ELF_ROUND
macro. This fixes a bug found by Robert Baron <rvb+@cs.cmu.edu> where
elf_load_psection, if given a properly aligned address at which to load
the section, would round actually load it at the next highest alignment
boundary.
for NOEXEC and NOSUID, and make sure the interpreter file is executable.
The mount point checks are done because, even though the interpreter
is not the program being 'executed', code from the interpreter is being
executed, and so the mount point's flags should be respected.
and shell script support to be optional (conditioned on EXEC_SCRIPT).
Remove the implicit inclusion of EXEC_ECOFF when COMPAT_OSF1 and/or
COMPAT_ULTRIX is included, and of EXEC_ELF32 when COMPAT_LINUX and/or
COMPAT_SVR4 is included.
queue.h list/queue head initializer macros. mountlist was converted so
that panics (or other reboots) early on in kernel startup don't cause
sys_sync() to croak. vnode_free_list was converted because it was nearby.
macros to use to remove #ifdefs from the machine ID case check.
Eventually, these headers will contain other information, e.g.
machine-dependent relocation information, etc.
p->p_vmspace->vm_shm would be NULL. Protected the rest of the cases where
that might happen too. This was the reason why sunxdoom would panic the
system in SVR4 emulation.
programs which attach their own header) can crash the machine. The problem
in this case was:
a variable "space" was set to the total data to copy,
len was used to remember how much to copy in this chunk (mbuf),
in one case, len = min(MCLBYTES - max_hdr, resid) but
size -= MCLBYTES;
instead of
size -= len;
Note that userland programs can still crash the machine by providing
bogus data in the ip->ip_len field I suspect. I haven't verified this,
but will soon be doing so and applying a fix of some sort. Probably
clamping the ip->ip_len value to the true packet size will be ok.
a boot string for firmware that can do this, such as the SPARC and
the sun3 models. It is currently silently ignored on all other
hardware now, however. The MD function "boot()" has been changed to
also take a char *.
Understands allocation aligment and boundary restrictions, "specific region"
allocations, and suballocations. Capable of statically or dynamically
allocating map overhead.
Many thanks to Matthias Drochner for running the code for me, and sending
me bug fixes, optimizations, and suggestions. Also, many thanks to
Chris Demetriou for his extremely helpful suggestions.
XXX No manual page yet. One is forthcoming, as soon as I can scare up
the time to write one. This has been sitting on my plate for quite a
while, and several projects are waiting for it. Time to move on.
delayed write is logically converted to a sync write, mirroring the async case.
In bdwrite(), move the tape case earlier to avoid needless reassignbuf()s.
and free some space by calling m_reclaim(). Also, log the "mb_map full"
error message (at most) every 60-seconds. The old code would log it
once over the lifetime of the system, but that's not a useful diagnostic.
(More useful is the new behaviour, which roughly indicates how often
periods of heavy load occur, without spamming the console and system
logs with messages.)
Right now, this code just panic()s (same as kmem_malloc() used to do
before, but different message), but in the future it should be modified
to try to reclaim wasted memory.
automatic array rather than an array allocated with alloca().
(This was the only use of alloca() in the kernel, and it wasn't
necessary or consistent with the way other functions in this file
work.)
structure and 'aux', right before ca_attach is called for the
newly-attached device. This allows the alpha port to do root device
autodetection without modifying every bus and device driver which could
be in the 'boot path.' In the long run, it may make sense to make
this machine-independent.
* Make 2nd and 3rd args timespecs, not timevals.
* Consistently pass a Boolean as the 4th arg (except in LFS).
Also, fix ffs_update() and lfs_update() to actually change the nsec fields.
documented behavior. sysctl(3) is documented to return 0 on success,
-1 on failure. The previous behavior was to return -1 on failure
and the number of bytes copied back down to user space.
Fixes part of PR #1999, from Kevin M. Lahey <kml@nas.nasa.gov>
return a struct device * of attached device, or NULL if device attach failed,
rather than 1/0 for success/failure, so that code that bus code which needs
to know what the child device is doesn't have to open-code a hacked variant
of config_found(). Make config_attach() return struct device *, rather than
void, to facilitate that.
- split softc size and match/attach out from cfdriver into
a new struct cfattach.
- new "attach" directive for files.*. May specify the name of
the cfattach structure, so that devices may be easily attached
to parents with different autoconfiguration semantics.
argument list. This allows easy 'submatching', which will eliminate a fair
bit of slightly tricky duplicated code from various busses. config_found()
is now a #define in sys/device.h, which invokes config_found_sm().
and the "kernel.tar.Z" distribution on louie.udel.edu, which is older than
xntp 3.4y or 3.5a, but contains newer kernel source fragments.
This commit adds support for a new kernel configuration option, NTP.
If NTP is selected, then the system clock should be run at "HZ", which
must be defined at compile time to be one value from:
60, 64, 100, 128, 256, 512, 1024.
Powers of 2 are ideal; 60 and 100 are supported but are marginally less
accurate.
If NTP is not configured, there should be no change in behavior relative
to pre-NTP kernels.
These changes have been tested extensively with xntpd 3.4y on a decstation;
almost identical kernel mods work on an i386. No pulse-per-second (PPS)
line discipline support is included, due to unavailability of hardware
to test it.
With this in-kernel PLL support for NetBSD, both xntp 3.4y and xntp
3.5a user-level code need minor changes. xntp's prototype for
syscall() is correct for FreeBSD, but not for NetBSD.
(1) do not cast it to (void *), and
(2) print it as 0x%x, rather than %p.
This is not perfect (because the data being printed is "int32_t"-sized), but
is more correct than printing it as a pointer because the data is _not_ a
pointer, it is data to be printed in hex, and on some systems, pointers are
wider than the data items being printed, which leads to excess and misleading
output. The only 'right' solution to this is to have a printf specifier
that prints the fixed-sized types the right way, and that's not really
practical.
* Change the argument names to vop_link so they actually make sense.
* Implement vop_link and vop_symlink for all file systems, so they do proper
cleanup.
* Require the file system to decide whether or not linking and unlinking of
directories is allowed, and disable it for all current file systems.
specific probe function did not specify it. It picks the same address
as mmap() does for a non-fixed map at address 0. See also the comment
around a similar line of code in vm/vm_mmap.c.
it's text or data; use the entry point instead (this solves some trouble
with ELF executables with strange permissions)
* Incorporate some fixes from r_friedl@informatik.uni-kl.de sent to
netbsd-bugs a while ago
- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.
- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.
- Several new functions for attaching and detaching disks, and
handling metrics calculation.
Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.
For usage and architectural details, see the forthcoming disk(9) manual
page.
they're already defined for some reason (this can happen
on the alpha, for example, which needs to define EXEC_ECOFF
in the std.alpha config file).
(2) minor spacing consistency.
no practical consequence on 32-bit systems. old prototype was
int profil(char *, int, int, int), and new one is int profile(char *,
size_t, u_long, u_int). the size_t is the size of the buffer,
and the u_long is the 'starting offset'. (I changed the last int
to u_int, because it's treated as a u_int everywhere, and isn't
logically a signed value.)
bawrite(). it's logically more correct (doesn't return an error code,
because it's async; bdwrite is also async), it still writes things
in-order, it makes sure the proper accountins is done (see the
wasdelayed cases in bwrite()), and it allows writes to vnodes on volumes
mountd with the MNT_ASYNC to be converted into delayed writes the way
God, err, Kirk intended. Convert synchronous bwrite()s on MNT_ASYNC
file systems to delayed writes.
explained in comments), which can cause a race condition. amazingly,
the _only_ time i've ever seen or heard of this problem was in some
comments and sources by Rick Macklem, and when running against the
a DEC OSF/1 NFS server running on an Alpha.
* If we got a stopping signal while already stopped with the same signal,
the second signal would sometimes (but not always) be ignored.
* Signals delivered by the debugger always pretended to be stopping
signals.
* PT_ATTACH still didn't quite work right.
sysv_shm.c: make shm_find_segment_by_shmid global so it can be used by
COMPAT_HPUX. There should be a better way...
rest: Add #ifdef COMPAT_HPUX where needed
error checking in the DIAGNOSTIC case. These changes might be backed out,
if it's decided that MINBUCKET should be 5 (rather than 4) on the alpha.
However, doing that has its own set of nasty consequences.