means we're in interrupt context. Since we can be called from a network
hardware interrupt, we could corrupt the protocol queues we try to drain
them at that time.
called when devices attach, take two.
Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.
The read/write system calls return ssize_t because -1 is used to indicate
error, therefore the transfer size MUST be limited to SSIZE_MAX, otherwise
garbage can be returned to the user.
There is NO change from existing behavior here, only a more precise
definition of that the semantics are, except in the Alpha case, where
the full SSIZE_MAX transfer size can now be realized (ssize_t is 64-bit
on the Alpha).
The read/write system calls return ssize_t because -1 is used to indicate
error, therefore the transfer size MUST be limited to SSIZE_MAX, otherwise
garbage can be returned to the user.
There is NO change from existing behavior here, only a more precise
definition of that the semantics are, except in the Alpha case, where
the full SSIZE_MAX transfer size can now be realized (ssize_t is 64-bit
on the Alpha).
- If either an alloc or release function is provided, make sure both are
provided, otherwise panic, as this is a fatal error.
- If using the default allocator, default the pool pagesz to PAGE_SIZE,
since that is the granularity of the default allocator's mechanism.
- In the default allocator, use new functions:
uvm_km_alloc_poolpage()/uvm_km_free_poolpage(), or
kmem_alloc_poolpage()/kmem_free_poolpage()
rather than doing it here. These functions may use pmap hooks to
provide alternate methods of mapping pool pages.
- pread() (#173) and pwrite() (#174), which are defined by XPG4.2. System
call numbers match Solaris.
- preadv() (#289) and pwritev() (#290), which are the positional cousins
of readv() and writev(), but not defined by any standard.
* we already have the vnode interlock, so vref() should not ask for it again.
* we call VOP_RECLAIM/VOP_INACTIVE(), which shouldn't be duplicated in vrele().
the new proc structure when performing a fork. This makes it much
easier to abort a fork operation and return an error if we run out
of KVA space.
The U-area pages are still wired down in {,u}vm_fork(), as before.
readlink() from type `int' to type `size_t'. This isn't an ABI change, since
the calling convention of our only LP64 platform (the Alpha) already promotes
this argument to a `long'.
This may not be the final action on this matter; readlink() still returns
an `int', which may change in a future revision of the standard.
the kernel's pmap, since proc0 (and other that share its address space)
are kernel-only processes, and should never contain userspace mappings.
This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.
again - the facility required in this context would be a filesystem-specific
super-user determination, which is not available yet. Also, add some
clarification to a comment.
vectors; defer that to vfs_opv_init().
Change the interface to vfs_opv_init() and export it; it now takes a
pointer to an array of vnodeopv_desc *'s to initialize. Allocate
the vnode operations vectors here. Called by vfs_attach().
Implement vfs_opv_free(), which deallocates the vnode operations
vectors. Called by vfs_detach().
Change vfsinit() to build the initial vfs_list by traversing the
vfs_list_inital[] table, and vfs_attach()'ing those file systems.
Also, initialize special vnodeopv_descs (dead, fifo, spec) which
are not associated with any particular file system.
change_owner().
* Change the semantics of chown(), fchown() and lchown(): when requesting a
change of the owner of a file, clear the set-user-id bit; analogous behaviour
for group changes.
* Since the above is a violation of the semantics specified by POSIX and
X/Open, add corresponding compatibility syscalls: __posix_chown(),
__posix_fchown(), __posix_lchown(). (Neither fchown() nor lchown() is
specified by POSIX; the prefix is intended to reflect the semantics.)
* Rename posix_rename() to __posix_rename() to follow the above convention.
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.
Submitted by Tom Proett <proett@nas.nasa.gov>.
UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.
this is the rest of the MI portion changes.
this will be KNF'd shortly. :-)
down "Data modified on freelist" and "muliple free" problems.
The log is activated by the MALLOCLOG option, and the size of the
event ring buffer is controlable via the MALLOGLOGSIZE option (default
is 100000 entries).
From Chris Demetriou, cleaned up a little by me per suggestions in the
e-mail from Chris that contained the code.
called "MACHINE_NEW_NONCONGIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.
this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)
enabled with the LOCAL_CREDS socket option on the listener. Semantics are
similar to BSD/OS's:
- Creds are available with first data on SOCK_STREAM, and with every datagram
on SOCK_DGRAM.
- It is not possible to forge credentials.
Different in that:
- Different credential data structure (ours does not rely on the format
of internal kernel data structures, and does not pass the login name).
- We can pass creds and file descriptors at the same time (this does not
work in BSD/OS).
Luke Mewburn <lukem@netbsd.org> gets credit for inspiring me to implement
this. :-)
so_linger is used as an argument to tsleep(), so was stuffed with
clockticks for the TCP linger time. However, so_linger is set directly from
l_linger if the linger time is specified, and l_linger is seconds (although
this is not currently documented anywhere). Fix this to set the TCP
linger time in seconds, and multiply so_linger by hz when tsleep() is
called to actually perform the linger.
3BSD vfork(2), i.e. share address space w/ parent and block parent.
Keep statistics on the total number of forks, the number of forks that
block the parent, and the number of forks that share the address space
with the parent.
to be of type size_t; since this imposes an interface change on the Alpha
(sizeof(int) != sizeof(size_t)), allocate a new system call number and make
the previous version a compatibility system call.
whenever the %: format is used on NetBSD/Alpha. Disable %: for __alpha__.
Note: the "correct" (but untested on other architectures) fix is to
change the wrong: kprintf(cp, oflags, tp, NULL, va_arg(ap, va_list));
to the right: kprintf(cp, oflags, tp, NULL, ap);
and swapctl(). For the former three, they use an 'int' in their user-land
prototype which was a 'u_int' in the kernel, which screwed up automatic
generation/checking of lint syscall stubs. For the latter, the user-land
prototype uses a "const char *", but the syscall just used "char *".
From Chris Demetriou <cgd@pa.dec.com>.
floating point stuff removed].
the new kprintf replaces the 3 different (and buggy) versions of
printf that were in the kernel before (kprintf, sprintf, and db_printf),
thus reducing duplicated code by 2/3's. this fixes (or adds) several
printf formats. examples:
%#x - previously only supported by db_printf [not printf/sprintf]
%8.8s - printf would print "000chuck" for "chuck" before
%5p - printf would print "0x 1" for value 1 before
XXX: new kprintf still supports several non-standard '%' formats that
are supposed to eventually be removed:
%: - passes an additional format string and argument list recursively
%b - used to decode error registers
%r - int, but print in radix "db_radix" [DDB only]
%z - 'signed hex' [DDB only]
%n - unsigned int, but print in radix "db_radix" [DDB only]
note that DDB's "%n" conflicts with standard "%n" which takes the
number of characters written so far and stores it into the integer
indicated by the "int *" pointer arg. yuck!
while here, add comments for each function explaining what it is
supposed to do.
to the stat(2) family and msync(2). This uses a primitive function
versioning scheme.
This reverts the libc shared library major version from 13 to 12, and
adds a few new interfaces to bring us to libc version 12.20.
From Frank van der Linden <fvdl@NetBSD.ORG>.
pseudo-device rnd # /dev/random and in-kernel generator
in config files.
o Add declaration to all architectures.
o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Tyso's linux code.
by config(8):
- Make the vfs ops and vnodeop_opv symbols match the name of the
file-system option used to configure the file system into the kernel.
- Now that sys_mount() has mountcompatnames[], remove the holes previously
used to preserve ordering for COMPAT_09 and COMPAT_43 mount system calls.
Also, add a comment that describes how I feel about the existence of this
file.
which maps the old file system index numbers to the new (well, since after
NetBSD 0.9) string-based method of finding a file system ops vector. Use
this table rather than assuming the ordering of the vfssw[] array when
emulating the old mount system call.
- New function change_utimes() to set access and modification times
given a vnode.
- In the function sys_chmod() and sys_fchmod(), call change_mode().
- In the function sys_utimes() and sys_futimes(), call
change_utimes().
unmounting all of the file systems. If we encounter a condition where
all of the dirty buffers could not flush, then don't unmount file systems,
since it might be likely to wedge.
msgbuf. Note that old 'dmesg' and 'syslogd' binaries will continue running,
though old 'dmesg' binaries will output a few bytes of junk at the start of
the buffer, and will miss a few bytes at the end of the buffer.
anon cred are the same. Should probably be handled better in the mountd,
but this will do for now. Fixes PR 469, submitted Sept 1994 by
a certain "Jason R. Thorpe".. ;-)
`b_dev' value of NODEV happens and is normal if the buffer is on its way
to the underlying device strategy function for the first time.
Also, MFS sillily uses a major device number (255) which cannot be used
to index bdevsw[]. Check marked with XXXs.
call biowait() but return `success' immediately. We can return `success'
because buffers with recorded errors are not returned by getblk().
(Takes care of PR#3694).
possibly unused (with __attribute__ ((unused))), to avoid generating
warnings when compiling without optimization but with most ports' default
warning flags.
socket names:
- In unp_setsockaddr() and unp_setpeeraddr(), if the socket name can't
fit into a single mbuf, allocate enough external storage space to
hold it.
- In unp_bind() and unp_connect(), perform a similar operation, but allocate
one extra byte, and ensure that the pathname is nul-terminated.
Many thanks to enami tsugutomo <enami@cv.sony.co.jp> for the sanity
checking.
- Add a comment describing my feelings about this interface, in general.
- Remove the COMPAT_OLDSOCK length hack. Instead, if the socket argument
is too long to fit in an mbuf, allocate enough external storage to
hold it.
- If the socket argument is a sockaddr, don't allow the length to be
greater than 255, as that would overflow sa_len.
Many thanks to enami tsugutomo <enami@cv.sony.co.jp> for his sanity checking.
the format modifer. Reported by and suggested fix from Daniel G. Pouzzner
in PR #2633. Final fix is slightly different now that we support the %q
modifier. This fix also includes the equivalent fix for sprintf().
- Disallow < 1 values for SO_SNDBUF, SO_RCVBUF, SO_SNDLOWAT, and
SO_RCVLOWAT; return EINVAL if the user attempts to set <= 0.
Inspired by PR #3770, from Havard Eidnes <he@vader.runit.sintef.no>.
- For SO_SNDLOWAT and SO_RCVLOWAT, don't let the low-water mark get
set above the high-water mark. Behavior is now consistent with
BSD/OS: If such an attempt is made, silently truncate to the high-water
value.
- If RB_ASKNAME, prompt for the dump device, defaulting to
partition 'b' of the root device, if the root device is a disk.
- Else, if dumpspec is set to "none", do not configure a dump device.
- Else, if dumpspec is set by config(8), attempt to use that device.
- Else, dumpspec is wildcarded or unspecified; if the root device is
a disk, select partition b. (which was the previous default dump
partition)
Note, dumps to a local disk now work even if root is on nfs.
so that if the drop to spl0() causes another panic (e.g. because there's
still some fatal hardware interrupt that's pending) we'll know that we
dropped IPL to sync the disks.
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.
fill the socket's creation time into the stat structure's st_[acm]time fields:
POSIX requires this behavior for pipe(2). N.B.: updating the st_[am]time fields
when reading/writing the pipe is neither required nor implemented, though.
(1): "substart == ex->ex_end" and "subend == ex->ex_start"
are completely legal parameters for extent_alloc_subregion()
(2): "(subend - substart) + 1" can cause an overflow if the whole
numeric range is covered by the extent.
Submitted by Matthias Drochner <drochner@zelz26.zel.kfa-juelich.de>
in PR #3119.