Print a diagnostic message if we ever get ERESTART out of fd_close
and convert it to EINTR instead.
Even if fd_close fails, it has already closed the file descriptor, so
restarting the system call is a mistake, with dangerous consequences
for multithreaded programs.
Should probably turn the message into a kassert eventually, and maybe
add one deeper in fd_close in order to more easily debug it before
all the data structures are destroyed.
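
A minimal userland sketch of what this means for callers, assuming (as the message above states for NetBSD) that the descriptor is already released when close(2) fails with EINTR. The helper name close_once is hypothetical; POSIX itself leaves the descriptor state unspecified in this case, so treat this as an illustration, not a portable rule.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Close fd exactly once and never retry.
 * Assumption: on EINTR the descriptor is already gone, so looping on
 * close() could close an unrelated descriptor that another thread
 * has just opened under the same number.
 */
int
close_once(int fd)
{
	if (close(fd) == 0)
		return 0;
	if (errno == EINTR)
		return 0;	/* interrupted, but the slot was released */
	return -1;		/* real error, e.g. EBADF */
}
```

The caller treats a zero return as "descriptor no longer valid" and must not reuse the number afterwards.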
on EPIPE for all file descriptor types:
- provide O_NOSIGPIPE for open,kqueue1,pipe2,dup3,fcntl(F_{G,S}ETFL) [NetBSD]
- provide SOCK_NOSIGPIPE for socket,socketpair [NetBSD]
- provide SO_NOSIGPIPE for {g,s}etsockopt [NetBSD/FreeBSD/MacOSX]
- provide F_{G,S}ETNOSIGPIPE for fcntl [MacOSX]
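
A hedged sketch of how a program might use the per-descriptor flag from the list above. The helper names make_quiet_pipe and write_hits_epipe are hypothetical. Where O_NOSIGPIPE is not defined, the sketch falls back to ignoring SIGPIPE process-wide, which is a coarser substitute and not equivalent to the per-descriptor flag.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/*
 * Create a pipe whose writer gets EPIPE instead of SIGPIPE.
 * With O_NOSIGPIPE (NetBSD) the setting is per descriptor; the
 * fallback changes the signal disposition for the whole process.
 */
int
make_quiet_pipe(int fds[2])
{
#ifdef O_NOSIGPIPE
	return pipe2(fds, O_NOSIGPIPE);
#else
	if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
		return -1;
	return pipe(fds);
#endif
}

/* Return 1 if a one-byte write to fd fails with EPIPE, else 0. */
int
write_hits_epipe(int fd)
{
	char byte = 'x';

	return write(fd, &byte, 1) == -1 && errno == EPIPE;
}
```

After closing the read end, a write on the other end should report EPIPE as an ordinary error rather than killing the process.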
set close-on-exec on creation (http://udrepper.livejournal.com/20407.html).
- Add F_DUPFD_CLOEXEC to fcntl(2).
- Add MSG_CMSG_CLOEXEC to recvmsg(2) for unix file descriptor passing.
- Add dup3(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add pipe2(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add flags SOCK_CLOEXEC, SOCK_NONBLOCK to the socket type parameter
for socket(2) and socketpair(2).
- Add new paccept(2) syscall that takes an additional sigset_t to alter
the sigmask temporarily and a flags argument to set SOCK_CLOEXEC,
SOCK_NONBLOCK.
- Add new mode character 'e' to fopen(3) and popen(3) to open pipes
and file descriptors for close on exec.
- Add new kqueue1(2) syscall with a new flags argument to open the
kqueue file descriptor with O_CLOEXEC, O_NONBLOCK.
* Fix the system calls that take socklen_t arguments to actually do so.
* Don't include userland header files (signal.h) from system header files
(rump_syscallargs.h).
* Bump libc version for the new syscalls.
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
- Add fd_set_exclose() to encapsulate uses of FIO{,N}CLEX, O_CLOEXEC, F{G,S}ETFD
- Add a pipe1() function to allow passing flags to the fds that pipe(2)
opens, to ease implementation of Linux pipe2(2)
- Factor out fp handling code from open(2) and fhopen(2)
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
It was only being used to synchronize close, and in any case we needed
to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
that we can eliminate the membar_consumer() call in fd_getfile(). This is
mostly syntactic sugar; the main functional change is that fd_nfiles now
lives alongside the open file array.
Some measurements with libmicro:
- simple file syscalls like close() are between 1 and 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
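
A toy userland model of the "count lives alongside the array" idea from the restructuring item above. All names (toy_fdtab, toy_filedesc, and the functions) are hypothetical and this is not the kernel's code. It uses C11 acquire/release for portability, whereas the message describes relying on the kernel removing a consumer barrier.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* The slot count and the slot array share one allocation. */
struct toy_fdtab {
	unsigned nfiles;
	void *files[];		/* flexible array member */
};

struct toy_filedesc {
	_Atomic(struct toy_fdtab *) tab;
};

/* Allocate a zeroed table with n slots; returns NULL on failure. */
struct toy_fdtab *
toy_fdtab_alloc(unsigned n)
{
	struct toy_fdtab *t = calloc(1, sizeof(*t) + n * sizeof(t->files[0]));

	if (t != NULL)
		t->nfiles = n;
	return t;
}

/* Publish a fully initialised table with a single pointer store. */
void
toy_publish(struct toy_filedesc *fdp, struct toy_fdtab *t)
{
	atomic_store_explicit(&fdp->tab, t, memory_order_release);
}

/*
 * Lockless lookup. The bound and the array are both reached through
 * the same pointer, so a reader can never pair a new count with an
 * old, shorter array.
 */
void *
toy_lookup(struct toy_filedesc *fdp, unsigned fd)
{
	struct toy_fdtab *t =
	    atomic_load_explicit(&fdp->tab, memory_order_acquire);

	if (t == NULL || fd >= t->nfiles)
		return NULL;
	return t->files[fd];
}
```

Growing the table in this model means building a larger toy_fdtab, copying the slots, and publishing the new pointer; reclaiming the old table safely is outside the scope of the sketch.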
we no longer need to guard against access from hardware interrupt handlers.
Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
- Redo reference counting to be sane. LWPs accessing files take a short
term reference on the local file descriptor. This is the most common
case. While a file is in a process descriptor table, a reference is
held to the file. The file reference count only changes during control
operations like open() or close(). Code that comes at files from an
unusual direction (i.e. foreign to the process) like procfs or sysctl
takes a reference on the file (f_count), and not on a descriptor.
- Remove knowledge of reference counting and locking from most code that
deals with files.
- Make the usual case of file descriptor lookup lockless.
- Make kqueue MP and MT safe. PR kern/38098, PR kern/38137.
- Fix numerous file handling bugs, and bugs in the descriptor code that
affected multithreaded processes.
- Split descriptor system calls out into sys_descrip.c.
- A few stylistic changes: KNF, remove unused casts now that caddr_t is
gone. Replace dumb gotos with loop control in a few places.
- Don't do redundant pointer passing (struct proc, lwp, filedesc *) unless
the routine is likely to be inlined. Most of the time it's about the
current process.
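
The two-level reference counting described in the first item of the list above can be modelled roughly like this. The toy_ names are hypothetical and the model is deliberately simplified: it ignores the race between lookup and close that the real code has to handle, and it is not the kernel's implementation.

```c
#include <stdatomic.h>
#include <stddef.h>

struct toy_file {
	atomic_uint f_count;	/* long-term references to the file */
};

struct toy_slot {
	struct toy_file *file;	/* NULL when the slot is free */
	atomic_uint refcnt;	/* short-term references by LWPs */
};

void
toy_file_init(struct toy_file *fp)
{
	atomic_init(&fp->f_count, 0);
}

void
toy_slot_init(struct toy_slot *s)
{
	s->file = NULL;
	atomic_init(&s->refcnt, 0);
}

/* open()-style control path: the descriptor table holds one file ref. */
void
toy_install(struct toy_slot *s, struct toy_file *fp)
{
	atomic_fetch_add(&fp->f_count, 1);
	s->file = fp;
}

/* Common path: reference the descriptor only; f_count is untouched. */
struct toy_file *
toy_getfile(struct toy_slot *s)
{
	if (s->file == NULL)
		return NULL;
	atomic_fetch_add(&s->refcnt, 1);
	return s->file;
}

void
toy_putfile(struct toy_slot *s)
{
	atomic_fetch_sub(&s->refcnt, 1);
}

/* Foreign access (procfs/sysctl style): reference the file itself. */
void
toy_file_hold(struct toy_file *fp)
{
	atomic_fetch_add(&fp->f_count, 1);
}

/* Drop one file reference; returns the count remaining afterwards. */
unsigned
toy_file_release(struct toy_file *fp)
{
	return atomic_fetch_sub(&fp->f_count, 1) - 1;
}

/* close()-style control path: free the slot, drop the table's ref. */
unsigned
toy_close(struct toy_slot *s)
{
	struct toy_file *fp = s->file;

	s->file = NULL;
	return toy_file_release(fp);
}
```

In this model the hot path (toy_getfile and toy_putfile) never touches f_count, which only changes in toy_install, toy_close, and the foreign-access helpers.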