Detected with Kernel Undefined Behavior Sanitizer.
At least one place was reported; for consistency, fix all the
left bit shift operations.
sys/kern/kern_descrip.c:1492:3, left shift of 1 by 31 places cannot be represented in type 'int'
sys/kern/kern_descrip.c:1493:28, left shift of 1 by 31 places cannot be represented in type 'int'
Reported by <Harry Pantazis>
Detected with Kernel Undefined Behavior Sanitizer.
sys/kern/kern_descrip.c:188:34, left shift of 1 by 31 places cannot be represented in type 'int'
Reported by <Harry Pantazis>
Detected with Kernel Undefined Behavior Sanitizer.
At least one place was reported; for consistency, fix all the
left bit shift operations.
sys/kern/kern_descrip.c:302:26, left shift of 1 by 31 places cannot be represented in type 'int'
Reported by <Harry Pantazis>
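For illustration only, a minimal userland sketch of the kind of change these
reports call for. bit_mask32() is a hypothetical helper, not a function from
kern_descrip.c; the point is that the shifted constant must be unsigned so
that bit 31 is representable.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the kernel sources): return a mask with
 * only bit `bit` set, for 0 <= bit <= 31.
 */
static inline uint32_t
bit_mask32(unsigned int bit)
{
	assert(bit < 32);
	/*
	 * (1 << 31) shifts a signed int into its sign bit, which is
	 * undefined behaviour. An unsigned 32-bit constant avoids that.
	 */
	return UINT32_C(1) << bit;
}
```

With the unsigned constant, bit_mask32(31) yields 0x80000000 without tripping
the sanitizer.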
kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()
All of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.
2. honor O_CLOEXEC, so the children of daemons that use cloning devices don't
end up with the parent's descriptors
fd_clone, and in general the fd approach of 'allocate' > 'play with guts' >
'attach', should be converted to be more constructor-like.
XXX: pullup-{6,7}
designated initializers.
I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
fds. XXX: We should really fix the fd's to be supported in the future.
Unsupported fd's have a NULL f_event, so registering crashes the kernel with
a NULL function dereference of f_event.
on EPIPE for all file descriptor types:
- provide O_NOSIGPIPE for open,kqueue1,pipe2,dup3,fcntl(F_{G,S}ETFL) [NetBSD]
- provide SOCK_NOSIGPIPE for socket,socketpair [NetBSD]
- provide SO_NOSIGPIPE for {g,s}etsockopt [NetBSD/FreeBSD/MacOSX]
- provide F_{G,S}ETNOSIGPIPE for fcntl [MacOSX]
set close-on-exec on creation (http://udrepper.livejournal.com/20407.html).
- Add F_DUPFD_CLOEXEC to fcntl(2).
- Add MSG_CMSG_CLOEXEC to recvmsg(2) for unix file descriptor passing.
- Add dup3(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add pipe2(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
- Add flags SOCK_CLOEXEC, SOCK_NONBLOCK to the socket type parameter
for socket(2) and socketpair(2).
- Add new paccept(2) syscall that takes an additional sigset_t to alter
the sigmask temporarily and a flags argument to set SOCK_CLOEXEC,
SOCK_NONBLOCK.
- Add new mode character 'e' to fopen(3) and popen(3) to open pipes
and file descriptors for close on exec.
- Add new kqueue1(2) syscall with a new flags argument to open the
kqueue file descriptor with O_CLOEXEC, O_NONBLOCK.
* Fix the system calls that take socklen_t arguments to actually do so.
* Don't include userland header files (signal.h) from system header files
(rump_syscallargs.h).
* Bump libc version for the new syscalls.
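As a hedged userland sketch of one of the interfaces listed above,
F_DUPFD_CLOEXEC duplicates a descriptor with close-on-exec already set, so
there is no window between dup and a separate F_SETFD. The helper names
dup_cloexec() and fd_is_cloexec() are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L	/* assumption: needed on some systems for F_DUPFD_CLOEXEC */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd with FD_CLOEXEC set atomically. */
static int
dup_cloexec(int fd)
{
	/* Lowest free descriptor >= 0, with close-on-exec set. */
	return fcntl(fd, F_DUPFD_CLOEXEC, 0);
}

/* Hypothetical helper: 1 if FD_CLOEXEC is set, 0 if not, -1 on error. */
static int
fd_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags == -1)
		return -1;
	return (flags & FD_CLOEXEC) != 0;
}
```

A descriptor obtained from a plain open(2) without O_CLOEXEC reports 0 from
fd_is_cloexec(), while its dup_cloexec() copy reports 1.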
- Add fd_set_exclose() to encapsulate uses of FIO{,N}CLEX, O_CLOEXEC, F{G,S}ETFD
- Add a pipe1() function to allow passing flags to the fd's that pipe(2)
opens to ease implementation of Linux pipe2(2)
- Factor out fp handling code from open(2) and fhopen(2)
Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags
and have consistent behaviour. Provide proc_find_raw() for special cases.
Fix memory leak in sysctl_proc_corename().
COMPAT_LINUX: rework ptrace() locking, minimise differences between
different versions per-arch.
Note: while this change adds some formal cosmetics for COMPAT_DARWIN and
COMPAT_IRIX - locking there is utterly broken (for ages).
Fixes PR/43176.
read/write/accept, then the expectation is that the blocked thread will
exit and the close complete.
Since only one fd is affected, but many fds can refer to the same file,
the close code can only request the fs code unblock with ERESTART.
Fixed for pipes and sockets, ERESTART will only be generated after such
a close - so there should be no change for other programs.
Also rename fo_abort() to fo_restart() (this used to be fo_drain()).
Fixes PR/26567
do drain' in many places, whereas fo_drain() was called in order to force
blocking read()/write() etc calls to return to userspace so that a close()
call from a different thread can complete.
In the sockets code comment out the broken code in the inner function,
it was being called from compat code.
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
It was only being used to synchronize close, and in any case we needed
to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
that we can eliminate the membar_consumer() call in fd_getfile(). This is
mostly syntactic sugar; the main functional change is that fd_nfiles now
lives alongside the open file array.
Some measurements with libmicro:
- simple file syscalls like close() are between 1 and 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
than one active reference to a file descriptor. It should dislodge threads
sleeping while holding a reference to the descriptor. Implemented only for
sockets but should be extended to pipes, fifos, etc.
Fixes the case of a multithreaded process doing something like the
following, which would have hung until the process got a signal.
thr0 accept(fd, ...)
thr1 close(fd)
via SCM_RIGHTS messages are dealt with:
1. unp_gc: make this a kthread.
2. unp_detach: do not call unp_gc directly. instead, wake up unp_gc kthread.
3. unp_scan: do not close files here. instead, put them on a global list
for unp_gc to close, along with a per-file "deferred close count". if
file is already enqueued for close, just increment deferred close count.
this eliminates the recursive calls.
4. unp_gc: scan files on global deferred close list. close each file N
times, as specified by deferred close count in file. continue processing
list until it becomes empty (closing may cause additional files to be
queued for close).
5. unp_gc: add additional bit to mark files we are scanning. set during
initial scan of global file list that currently clears FMARK/FDEFER.
during later scans, never examine / garbage collect descriptors that
we have not marked during the earlier scan. do not proceed with this
initial scan until all deferred closes have been processed. be careful
with locking to ensure no races are introduced between deferred close
and file scan.
6. unp_gc: use dummy file_t to mark position in list when scanning. allows
us to drop filelist_lock. in turn allows us to eliminate kmem_alloc()
and safely close files, etc.
7. prohibit transfer of descriptors within SCM_RIGHTS messages if
(num_files_in_transit > maxfiles / unp_rights_ratio)
8. fd_allocfile: ensure recycled files don't get scanned.
this is 97% work done by andrew doran, with a couple of minor bug fixes
and a lot of testing by yours truly.
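The deferred close list described above can be sketched roughly as follows.
This is a simplified single-threaded userland model, not the kernel code:
struct xfile, struct defer_list, defer_close() and drain_deferred() are
hypothetical names, locking is omitted, and a "close" is modelled as a
counter increment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for file_t with only the fields the sketch needs. */
struct xfile {
	struct xfile *next;	/* link on the deferred close list */
	unsigned defer_count;	/* pending closes; 0 means not enqueued */
	unsigned closed;	/* closes performed so far (illustration) */
};

struct defer_list {
	struct xfile *head;
	struct xfile *tail;
};

/* Queue one close for fp; if already queued, only bump its count. */
static void
defer_close(struct defer_list *dl, struct xfile *fp)
{
	if (fp->defer_count++ > 0)
		return;		/* already on the list */
	fp->next = NULL;
	if (dl->tail != NULL)
		dl->tail->next = fp;
	else
		dl->head = fp;
	dl->tail = fp;
}

/*
 * Process the list until it is empty, closing each file as many times
 * as its deferred close count says. Returns the total number of closes.
 */
static unsigned
drain_deferred(struct defer_list *dl)
{
	unsigned total = 0;
	struct xfile *fp;

	while ((fp = dl->head) != NULL) {
		unsigned n = fp->defer_count;

		/* Unlink first, so a close may safely re-queue files. */
		dl->head = fp->next;
		if (dl->head == NULL)
			dl->tail = NULL;
		fp->next = NULL;
		fp->defer_count = 0;
		while (n-- > 0) {
			fp->closed++;	/* stands in for the real close */
			total++;
		}
	}
	return total;
}
```

Because the file is unlinked before it is closed and the loop re-reads the
list head each time, files queued as a side effect of a close are picked up
in the same drain, without recursion.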
while ironically trying to preserve the same during copy. Would only have
occurred if a multithreaded program expanded the descriptor table and,
within a tiny window of exposure, another thread in the program tried to
access descriptor zero.
- Convert to use kmem_alloc/kmem_free.