Commit Graph

246 Commits

Author SHA1 Message Date
ad
0eaaa024ea Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
2020-05-23 23:42:41 +00:00
riastradh
17201b1c03 Load struct fdfile::ff_file with atomic_load_consume.
Exceptions: when we're only testing whether it's there, not about to
dereference it.

Note: We do not use atomic_store_release to set it because the
preceding mutex_exit should be enough.

(That said, it's not clear the mutex_enter/exit is needed unless
refcnt > 0 already, in which case maybe it would be a win to switch
from the membar implied by mutex_enter to the membar implied by
atomic_store_release -- which I would generally expect to be much
cheaper.  And a little clearer without a long comment.)
2020-02-01 02:23:23 +00:00
riastradh
8e6cd4ce57 Load struct filedesc::fd_dt with atomic_load_consume.
Exceptions: when fd_refcnt <= 1, or when holding fd_lock.

While here:

- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
  => This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
  => atomic_load_consume serves the same purpose now.
  => Was needed only on alpha anyway.
2020-02-01 02:23:03 +00:00
christos
b3b9361cff handle O_NOSIGPIPE too. 2019-02-20 19:42:14 +00:00
maxv
51f36ead8f Add KASSERT. 2019-01-03 10:16:43 +00:00
maxv
cc13a96134 Fix kernel pointer leaks in the kern.file sysctl, same as kern.file2. 2018-11-24 16:41:48 +00:00
maxv
572947b34e Rename fill_file -> fill_file2, since that's the KERN_FILE2 sysctl. 2018-11-24 16:25:20 +00:00
maxv
905aba58c9 Add LIST_INIT for filehead. 2018-11-02 12:27:47 +00:00
christos
6015b4b38b Provide a sysctl kern.expose_address to expose kernel addresses in
sysctl structure returns for non-root. Defaults to off. Turning it
on will restore sockstat/fstat and friends for regular users.
2018-10-05 22:12:37 +00:00
maxv
c11c7199c0 Don't leak kernel pointers to userland in kern.file2, same as kern.proc2. 2018-09-13 14:44:09 +00:00
riastradh
d1579b2d70 Rename min/max -> uimin/uimax for better honesty.
These functions are defined on unsigned int.  The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER!  Some subsystems have

	#define min(a, b)	((a) < (b) ? (a) : (b))
	#define max(a, b)	((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX.  Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate.  But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all.  (Who knows, maybe in some cases integer
truncation is actually intended!)
2018-09-03 16:29:22 +00:00
kamil
bf95c9b8f0 Avoid unportable signed integer left shift in fd_unused()
Detected with Kernel Undefined Behavior Sanitizer.

There were at least a single place reported, for consistency fix all the
left bit shift operations.
sys/kern/kern_descrip.c:345:2, left shift of 1 by 31 places cannot be represented in type 'int'
sys/kern/kern_descrip.c:346:28, left shift of 1 by 31 places cannot be represented in type 'int'

Reported by <Harry Pantazis>
2018-07-03 23:14:57 +00:00
kamil
11d85a2701 Avoid unportable signed integer left shift in fd_copy()
Detected with Kernel Undefined Behavior Sanitizer.

There were at least a single place reported, for consistency fix all the
left bit shift operations.
sys/kern/kern_descrip.c:1492:3, left shift of 1 by 31 places cannot be represented in type 'int'
sys/kern/kern_descrip.c:1493:28, left shift of 1 by 31 places cannot be represented in type 'int'

Reported by <Harry Pantazis>
2018-07-03 23:11:06 +00:00
kamil
4c40a51b4d Avoid unportable signed integer left shift in fd_isused()
Detected with Kernel Undefined Behavior Sanitizer.

sys/kern/kern_descrip.c:188:34, left shift of 1 by 31 places cannot be represented in type 'int'

Reported by <Harry Pantazis>
2018-07-03 22:49:51 +00:00
kamil
af399e1ecf Avoid unportable signed integer left shift in fd_used()
Detected with Kernel Undefined Behavior Sanitizer.

There were at least a single place reported, for consistency fix all the
left bit shift operations.
sys/kern/kern_descrip.c:302:26, left shift of 1 by 31 places cannot be represented in type 'int'

Reported by <Harry Pantazis>
2018-07-03 12:17:54 +00:00
chs
fd34ea77eb remove checks for failure after memory allocation calls that cannot fail:
kmem_alloc() with KM_SLEEP
  kmem_zalloc() with KM_SLEEP
  percpu_alloc()
  pserialize_create()
  psref_class_create()

all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.
2017-06-01 02:45:05 +00:00
nat
5e34165f16 Explicitly set the flags instead of masking set values in.
This fixes FNONBLOCK weirdness seen in audio.c

OK christos@ and martin@.
2017-05-11 22:38:56 +00:00
christos
6fe583ecd6 1. mask fflags so we don't tack on whateve oflags were passed from userland
2. honor O_CLOEXEC, so the children of daemons that use cloning devices, don't
   end up with the parents descriptors
fd_clone and in general the fd approach of 'allocate' > 'play with guts' >
'attach' should be converted to be more constructor like.
XXX: pullup-{6,7}
2015-08-03 04:55:15 +00:00
christos
7c397b34ce remove casts to the same type. 2014-09-21 17:17:15 +00:00
matt
45b1ec740d Try not to use f_data, use f_{vnode,socket,pipe,mqueue,kqueue,ksem} to get
a correctly typed pointer.
2014-09-05 09:20:59 +00:00
matt
a35d1a8c7c Don't next structure and enum definitions.
Don't use C++ keywords new, try, class, private, etc.
2014-09-05 05:57:21 +00:00
dholland
f9228f4225 Add d_discard to all struct cdevsw instances I could find.
All have been set to "nodiscard"; some should get a real implementation.
2014-07-25 08:10:31 +00:00
dholland
a68f9396b6 Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
2014-03-16 05:20:22 +00:00
pooka
4f6fb3bf35 Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
2014-02-25 18:30:08 +00:00
martin
fd52abfdd4 Remove __CT_LOCAL_.. hack 2013-09-15 13:03:59 +00:00
martin
97374bea1a Avoid warnings for a local CTASSERT 2013-09-14 13:46:52 +00:00
pooka
5d36abf618 In fd_abort(), reset ff_exclose to preserve invariants expected by fd_free() 2013-09-05 12:23:07 +00:00
christos
7e72d438b2 Return EOPNOTSUPP for fnullop_kqfilter to prevent registration of unsupported
fds. XXX: We should really fix the fd's to be supported in the future.
Unsupported fd's have a NULL f_event, so registering crashes the kernel with
a NULL function dereference of f_event.
2012-11-24 15:07:44 +00:00
christos
5418d2a724 As discussed in tech-kern, provide the means to prevent delivery of SIGPIPE
on EPIPE for all file descriptor types:

- provide O_NOSIGPIPE for open,kqueue1,pipe2,dup3,fcntl(F_{G,S}ETFL) [NetBSD]
- provide SOCK_NOSIGPIPE for socket,socketpair [NetBSD]
- provide SO_NOSIGPIPE for {g,s}seckopt [NetBSD/FreeBSD/MacOSX]
- provide F_{G,S}ETNOSIGPIPE for fcntl [MacOSX]
2012-01-25 00:28:35 +00:00
chs
42b05e0e61 in fd_allocfile(), free the fd if we fail to allocate a file. 2011-09-25 13:40:37 +00:00
christos
dc72548dbe fail with EINVAL if flags not are not O_CLOEXEC|O_NONBLOCK in pipe2(2) and
dup3(2)
2011-07-15 14:50:19 +00:00
christos
e2bebf7172 * Arrange for interfaces that create new file descriptors to be able to
set close-on-exec on creation (http://udrepper.livejournal.com/20407.html).

    - Add F_DUPFD_CLOEXEC to fcntl(2).
    - Add MSG_CMSG_CLOEXEC to recvmsg(2) for unix file descriptor passing.
    - Add dup3(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
    - Add pipe2(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
    - Add flags SOCK_CLOEXEC, SOCK_NONBLOCK to the socket type parameter
      for socket(2) and socketpair(2).
    - Add new paccept(2) syscall that takes an additional sigset_t to alter
      the sigmask temporarily and a flags argument to set SOCK_CLOEXEC,
      SOCK_NONBLOCK.
    - Add new mode character 'e' to fopen(3) and popen(3) to open pipes
      and file descriptors for close on exec.
    - Add new kqueue1(2) syscall with a new flags argument to open the
      kqueue file descriptor with O_CLOEXEC, O_NONBLOCK.

* Fix the system calls that take socklen_t arguments to actually do so.

* Don't include userland header files (signal.h) from system header files
  (rump_syscallargs.h).

* Bump libc version for the new syscalls.
2011-06-26 16:42:39 +00:00
rmind
507c30fb60 Drop extern inline for fd_getfile(). Apparently, GCC already ignores it. 2011-04-24 20:30:38 +00:00
rmind
6a7d2e04ad - Sprinkle __cacheline_aligned and __read_mostly in file descriptor code.
- While here, remove trailing whitespaces, KNF.
2011-04-23 18:57:27 +00:00
christos
a73f7b01d5 - Add O_CLOEXEC to open(2)
- Add fd_set_exclose() to encapsulate uses of FIO{,N}CLEX, O_CLOEXEC, F{G,S}ETFD
- Add a pipe1() function to allow passing flags to the fd's that pipe(2)
  opens to ease implementation of linux pipe2(2)
- Factor out fp handling code from open(2) and fhopen(2)
2011-04-10 15:45:33 +00:00
pooka
9b097994c2 Support FD_CLOEXEC in rump kernels. 2011-02-15 15:54:28 +00:00
pooka
dd7a40671a Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes).  This makes them
usable in a rump kernel, in case somebody was wondering.
2011-01-28 18:44:44 +00:00
pooka
08421f3eea Update comment and inspired by that update variable naming too.
no functional change.
2011-01-01 22:05:11 +00:00
yamt
112d262cd3 update some comments 2010-12-17 22:06:31 +00:00
pooka
41a10084d4 Attach implicit threads to initproc instead of proc0. This way
applications which alter, by purpose or by accident, the uid in an
implicit thread are don't affect kernel threads.

from discussion with njoly
2010-10-29 15:32:23 +00:00
pooka
0af65acdc5 Actually, the comment probably meant "would be nice to KASSERT here,
but can't".  So turn it into a KASSERT now that it's possible.
2010-09-01 15:15:18 +00:00
pooka
8411fe4cea Remove XXX comment. I'm not sure what it precisely means, but I'm
guessing it's from a time when rump used filedesc0 for everything
(and that isn't true anymore).
2010-09-01 15:12:16 +00:00
pooka
5777f63fd9 Remove overzealous KASSERT: the refcount can be non-zero if another
thread attempts to use a non-open file descriptor.  from ad

fixes PR kern/43694
2010-08-04 14:25:16 +00:00
rmind
3c507045e2 Remove pfind() and pgfind(), fix locking in various broken uses of these.
Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags
and have consistent behaviour.  Provide proc_find_raw() for special cases.
Fix memory leak in sysctl_proc_corename().

COMPAT_LINUX: rework ptrace() locking, minimise differences between
different versions per-arch.

Note: while this change adds some formal cosmetics for COMPAT_DARWIN and
COMPAT_IRIX - locking there is utterly broken (for ages).

Fixes PR/43176.
2010-07-01 02:38:26 +00:00
dsl
2a54322c7b If a multithreaded app closes an fd while another thread is blocked in
read/write/accept, then the expectation is that the blocked thread will
exit and the close complete.
Since only one fd is affected, but many fd can refer to the same file,
the close code can only request the fs code unblock with ERESTART.
Fixed for pipes and sockets, ERESTART will only be generated after such
a close - so there should be no change for other programs.
Also rename fo_abort() to fo_restart() (this used to be fo_drain()).
Fixes PR/26567
2009-12-20 09:36:05 +00:00
dsl
7a42c833db Rename fo_drain() to fo_abort(), 'drain' is used to mean 'wait for output
do drain' in many places, whereas fo_drain() was called in order to force
blocking read()/write() etc calls to return to userspace so that a close()
call from a different thread can complete.
In the sockets code comment out the broken code in the inner function,
it was being called from compat code.
2009-12-09 21:32:58 +00:00
rmind
e4be2748a3 - Amend fd_hold() to take an argument and add assert (reflects two cases,
fork1() and the rest, e.g. kthread_create(), when creating from lwp0).

- lwp_create(): do not touch filedesc internals, use fd_hold().
2009-10-27 02:58:28 +00:00
yamt
77d977dcbc assertion 2009-08-16 11:00:20 +00:00
martin
53822d1e78 Update fd_freefile when kqueue descriptors are not copied from
parent to child. From Wolfgang Solfrank in PR kern/41651.
Approved by Andrew Doran.
2009-06-30 20:32:49 +00:00
yamt
5c0faad4bd fd_free: fix posix advisory locks. PR/41549 from HITOSHI OSADA. 2009-06-08 00:19:56 +00:00