The change ensures that ifnet objects in the ifnet list aren't freed during
list iterations by using pserialize(9) and psref(9).
Note that the change adds a pslist(9) for ifnet but doesn't remove the
original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We
shouldn't use the original list in the kernel anymore.
while here change use to typed pointer sockaddr * instead of void * which
also lets us get rid of sasize variable used to track length (since we
can now use sa_len easily)
nam parameter type from buf * to sockaddr *.
final commit for parameter type changes to protocol user requests
* bump kernel version to 7.99.15 for parameter type changes to pr_{send,connect}
pr_{accept,sockname,peername} nam parameter type from mbuf * to sockaddr *.
* retained use of mbuftypes[MT_SONAME] for now.
* bump to netbsd version 7.99.12 for parameter type change.
patch posted to tech-net@ 2015/04/19
instead of using the original sockaddr_{in,un} structures for storage
use the single sockaddr_big structure instead.
while here ditch superfluous assignment of sockaddr sb_len since the
assignment is already performed in netaddr_to_sockaddr_{in,un}
* update protocol bind implementations to use/expect sockaddr *
instead of mbuf *
* introduce sockaddr_big struct for storage of addr data passed via
sys_bind; sockaddr_big is of sufficient size and alignment to
accommodate all addr data sizes received.
* modify sys_bind to allocate sockaddr_big instead of using an mbuf.
* bump kernel version to 7.99.9 for change to pr_bind() parameter type.
Patch posted to tech-net@
http://mail-index.netbsd.org/tech-net/2015/03/15/msg005004.html
The choice to use a new structure sockaddr_big has been retained since
changing sockaddr_storage size would lead to unnecessary ABI change. The
use of the new structure does not preclude future work that increases
the size of sockaddr_storage and at that time sockaddr_big may be
trivially replaced.
Tested by mrg@ and myself, discussed with rmind@, posted to tech-net@
usrreq switches and put into separate functions
xxx_{peer,sock}addr(struct socket *, struct mbuf *).
- KASSERT(solocked(so)) always in new functions even if request
is not implemented
- KASSERT(pcb != NULL) and KASSERT(nam) if the request is
implemented and not for tcp.
* for tcp roll #ifdef KPROF and #ifdef DEBUG code from tcp_usrreq() into
easier to cut & paste functions tcp_debug_capture() and
tcp_debug_trace()
- functions provided by rmind
- remaining use of PRU_{PEER,SOCK}ADDR #define to be removed in a
future commit.
* rename netbt functions to permit consistency of pru function names
(as has been done with other requests already split out).
- l2cap_{peer,sock}addr() -> l2cap_{peer,sock}_addr_pcb()
- rfcomm_{peer,sock}addr() -> rfcomm_{peer,sock}_addr_pcb()
- sco_{peer,sock}addr() -> sco_{peer,sock}_addr_pcb()
* split/refactor do_sys_getsockname(lwp, fd, which, nam) into
two functions do_sys_get{peer,sock}name(fd, nam).
- move PRU_PEERADDR handling into do_sys_getpeername() from
do_sys_getsockname()
- have svr4_stream directly call do_sys_get{sock,peer}name()
respectively instead of providing `which' & fix a DPRINTF string
that incorrectly wrote "getpeername" when it meant "getsockname"
- fix sys_getpeername() and sys_getsockname() to call
do_sys_get{sock,peer}name() without `which' and `lwp' & adjust
comments
- bump kernel version for removal of lwp & which parameters from
do_sys_getsockname()
note: future cleanup to remove struct mbuf * abuse in
xxx_{peer,sock}name()
still to come, not done in this commit since it is easier to do post
split.
patch reviewed by rmind
welcome to 6.99.47
designated initializers.
I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
sync with native errno list.
Remove extra entries (linux) which resulted in bad translated values,
and add missing ones (ibcs2, osf1 and svr4) which made some out of
bounds accesses.
set close-on-exec on creation (http://udrepper.livejournal.com/20407.html).
- Add F_DUPFD_CLOEXEC to fcntl(2).
- Add MSG_CMSG_CLOEXEC to recvmsg(2) for unix file descriptor passing.
- Add dup3(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add pipe2(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add flags SOCK_CLOEXEC, SOCK_NONBLOCK to the socket type parameter
for socket(2) and socketpair(2).
- Add new paccept(2) syscall that takes an additional sigset_t to alter
the sigmask temporarily and a flags argument to set SOCK_CLOEXEC,
SOCK_NONBLOCK.
- Add new mode character 'e' to fopen(3) and popen(3) to open pipes
and file descriptors for close on exec.
- Add new kqueue1(2) syscall with a new flags argument to open the
kqueue file descriptor with O_CLOEXEC, O_NONBLOCK.
* Fix the system calls that take socklen_t arguments to actually do so.
* Don't include userland header files (signal.h) from system header files
(rump_syscallargs.h).
* Bump libc version for the new syscalls.
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)
This removes the need for the SAVENAME and HASBUF namei flags.
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.
Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).
The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.
- update the linux syscall table for each platform.
- support new-style (NPTL) linux pthreads on all platforms.
clone() with CLONE_THREAD uses 1 process with many LWPs
instead of separate processes.
- move the contents of sys__lwp_setprivate() into a new
lwp_setprivate() and use that everywhere.
- update linux_release[] and linux32_release[] to "2.6.18".
- adjust placement of emul fork/exec/exit hooks as needed
and adjust other emul code to match.
- convert all struct emul definitions to use named initializers.
- change the pid allocator to allow multiple pids to refer to the same proc.
- remove a few fields from struct proc that are no longer needed.
- disable the non-functional "vdso" code in linux32/amd64,
glibc works fine without it.
- fix a race in the futex code where we could miss a wakeup after
a requeue operation.
- redo futex locking to be a little more efficient.
Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags
and have consistent behaviour. Provide proc_find_raw() for special cases.
Fix memory leak in sysctl_proc_corename().
COMPAT_LINUX: rework ptrace() locking, minimise differences between
different versions per-arch.
Note: while this change adds some formal cosmetics for COMPAT_DARWIN and
COMPAT_IRIX - locking there is utterly broken (for ages).
Fixes PR/43176.
in a row, and we need to try to read the next block, and have passed a
non-NULL cookie pointer to VOP_READDIR, ensure that we free the cookie
buffer before re-doing VOP_READDIR, so that we don't leak memory.
This fix is similar to nfs_serv.c revisions 1.115 + 1.124.
This should fix the long-standing problem observed by e.g. using Linux-
emulated programs to take backup of servers, which is one of the problems
which were reported in PR#42661.
Thanks to pooka@ for the hints for traversing the VOP* layer.
read/write/accept, then the expectation is that the blocked thread will
exit and the close complete.
Since only one fd is affected, but many fd can refer to the same file,
the close code can only request the fs code unblock with ERESTART.
Fixed for pipes and sockets, ERESTART will only be generated after such
a close - so there should be no change for other programs.
Also rename fo_abort() to fo_restart() (this used to be fo_drain()).
Fixes PR/26567
do drain' in many places, whereas fo_drain() was called in order to force
blocking read()/write() etc calls to return to userspace so that a close()
call from a different thread can complete.
In the sockets code comment out the broken code in the inner function,
it was being called from compat code.
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
It was only being used to synchronize close, and in any case we needed
to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
that we can eliminate the membar_consumer() call in fd_getfile(). This is
mostly syntactic sugar; the main functional change is that fd_nfiles now
lives alongside the open file array.
Some measurements with libmicro:
- simple file syscalls are like close() are between 1 to 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
than one active reference to a file descriptor. It should dislodge threads
sleeping while holding a reference to the descriptor. Implemented only for
sockets but should be extended to pipes, fifos, etc.
Fixes the case of a multithreaded process doing something like the
following, which would have hung until the process got a signal.
thr0 accept(fd, ...)
thr1 close(fd)