210 Commits

Author SHA1 Message Date
pooka
dd7a40671a Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes).  This makes them
usable in a rump kernel, in case somebody was wondering.
2011-01-28 18:44:44 +00:00
pooka
08421f3eea Update comment and, inspired by that, update variable naming too.
No functional change.
2011-01-01 22:05:11 +00:00
yamt
112d262cd3 update some comments 2010-12-17 22:06:31 +00:00
pooka
41a10084d4 Attach implicit threads to initproc instead of proc0. This way
applications which alter, on purpose or by accident, the uid in an
implicit thread don't affect kernel threads.

from discussion with njoly
2010-10-29 15:32:23 +00:00
pooka
0af65acdc5 Actually, the comment probably meant "would be nice to KASSERT here,
but can't".  So turn it into a KASSERT now that it's possible.
2010-09-01 15:15:18 +00:00
pooka
8411fe4cea Remove XXX comment. I'm not sure what it precisely means, but I'm
guessing it's from a time when rump used filedesc0 for everything
(and that isn't true anymore).
2010-09-01 15:12:16 +00:00
pooka
5777f63fd9 Remove overzealous KASSERT: the refcount can be non-zero if another
thread attempts to use a non-open file descriptor.  from ad

fixes PR kern/43694
2010-08-04 14:25:16 +00:00
rmind
3c507045e2 Remove pfind() and pgfind(), fix locking in various broken uses of these.
Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags
and have consistent behaviour.  Provide proc_find_raw() for special cases.
Fix memory leak in sysctl_proc_corename().

COMPAT_LINUX: rework ptrace() locking, minimise differences between
different versions per-arch.

Note: while this change adds some formal cosmetics for COMPAT_DARWIN and
COMPAT_IRIX - locking there is utterly broken (for ages).

Fixes PR/43176.
2010-07-01 02:38:26 +00:00
dsl
2a54322c7b If a multithreaded app closes an fd while another thread is blocked in
read/write/accept, then the expectation is that the blocked thread will
exit and the close complete.
Since only one fd is affected, but many fd can refer to the same file,
the close code can only request the fs code unblock with ERESTART.
Fixed for pipes and sockets, ERESTART will only be generated after such
a close - so there should be no change for other programs.
Also rename fo_abort() to fo_restart() (this used to be fo_drain()).
Fixes PR/26567
2009-12-20 09:36:05 +00:00
dsl
7a42c833db Rename fo_drain() to fo_abort(), 'drain' is used to mean 'wait for output
to drain' in many places, whereas fo_drain() was called in order to force
blocking read()/write() etc calls to return to userspace so that a close()
call from a different thread can complete.
In the sockets code comment out the broken code in the inner function,
it was being called from compat code.
2009-12-09 21:32:58 +00:00
rmind
e4be2748a3 - Amend fd_hold() to take an argument and add assert (reflects two cases,
fork1() and the rest, e.g. kthread_create(), when creating from lwp0).

- lwp_create(): do not touch filedesc internals, use fd_hold().
2009-10-27 02:58:28 +00:00
yamt
77d977dcbc assertion 2009-08-16 11:00:20 +00:00
martin
53822d1e78 Update fd_freefile when kqueue descriptors are not copied from
parent to child. From Wolfgang Solfrank in PR kern/41651.
Approved by Andrew Doran.
2009-06-30 20:32:49 +00:00
yamt
5c0faad4bd fd_free: fix posix advisory locks. PR/41549 from HITOSHI OSADA. 2009-06-08 00:19:56 +00:00
yamt
6f174f1311 shut up the following assertion failure and add a comment.
panic: kernel diagnostic assertion "!fd_isused(fdp, fd)" failed: file "/siro/nbsd/src/sys/kern/kern_descrip.c", line 175
2009-06-07 09:39:02 +00:00
yamt
75c4e4fde7 fd_free: reset fd_himap/lomap to make fd_checkmaps comfortable. PR/41487. 2009-05-29 00:10:52 +00:00
yamt
4f22237449 wrap a long line. 2009-05-28 22:17:04 +00:00
ad
0913d2e2f5 PR kern/41487: kern_descrip.c assertion failure
Remove bogus assertion.
2009-05-26 00:42:33 +00:00
ad
d991fcb3b6 More changes to improve kern_descrip.c.
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
  It was only being used to synchronize close, and in any case we needed
  to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
  that we can eliminate the membar_consumer() call in fd_getfile().  This is
  mostly syntactic sugar; the main functional change is that fd_nfiles now
  lives alongside the open file array.

Some measurements with libmicro:

- simple file syscalls like close() are between 1 and 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
2009-05-24 21:41:25 +00:00
ad
3cb7a24bec Make descriptor access and file allocation cheaper in many cases,
mostly by avoiding a bunch of atomic operations.
2009-05-23 18:28:05 +00:00
ad
c6367674d6 Add fileops::fo_drain(), to be called from fd_close() when there is more
than one active reference to a file descriptor. It should dislodge threads
sleeping while holding a reference to the descriptor. Implemented only for
sockets but should be extended to pipes, fifos, etc.

Fixes the case of a multithreaded process doing something like the
following, which would have hung until the process got a signal.

thr0	accept(fd, ...)
thr1	close(fd)
2009-04-04 10:12:51 +00:00
rmind
6b0e9f0301 fownsignal: pre-check for zero pgid, avoids locking of proc_lock. 2009-03-29 04:40:01 +00:00
mrg
9ba87b8cc3 completely rework the way that orphaned sockets that are being fdpassed
via SCM_RIGHTS messages are dealt with:

1. unp_gc: make this a kthread.

2. unp_detach: do not call unp_gc directly. instead, wake up unp_gc kthread.

3. unp_scan: do not close files here. instead, put them on a global list
   for unp_gc to close, along with a per-file "deferred close count". if
   file is already enqueued for close, just increment deferred close count.
   this eliminates the recursive calls.

3. unp_gc: scan files on global deferred close list. close each file N
   times, as specified by deferred close count in file. continue processing
   list until it becomes empty (closing may cause additional files to be
   queued for close).

4. unp_gc: add additional bit to mark files we are scanning. set during
   initial scan of global file list that currently clears FMARK/FDEFER.
   during later scans, never examine / garbage collect descriptors that
   we have not marked during the earlier scan. do not proceed with this
   initial scan until all deferred closes have been processed. be careful
   with locking to ensure no races are introduced between deferred close
   and file scan.

5. unp_gc: use dummy file_t to mark position in list when scanning. allow
   us to drop filelist_lock. in turn allows us to eliminate kmem_alloc()
   and safely close files, etc.

6. prohibit transfer of descriptors within SCM_RIGHTS messages if
   (num_files_in_transit > maxfiles / unp_rights_ratio)

7. fd_allocfile: ensure recycled files don't get scanned.


this is 97% work done by andrew doran, with a couple of minor bug fixes
and a lot of testing by yours truly.
2009-03-11 06:05:29 +00:00
ad
69f9e17075 Don't bother with file_t::f_iflags any more, as it's not used.
Noted by mrg@.
2009-03-08 12:52:08 +00:00
rmind
4bd0e7cebc fd_copy: fix off-by-one bug in a race condition path and assert.
Should fix PR/40625.  OK by <ad>.
2009-03-02 19:28:08 +00:00
ad
6d599f4e1f - Fix a bug where we trashed descriptor zero in the old open files array
  while ironically trying to preserve the same during copy. Would only have
  occurred if a multithreaded program expanded the descriptor table and,
  within a tiny window of exposure, another thread in the program tried to
  access descriptor zero.

- Convert to use kmem_alloc/kmem_free.
2008-12-21 09:58:22 +00:00
pooka
9e46e516a7 Move fd_closeexec() and fd_checkstd() from kern_descrip to their
own file, subr_exec_fd.c (they're used only by exec).

After this change, the kernel source modules are in a partitioned
enough state to allow building a system without vfs at all.
2008-11-18 13:01:41 +00:00
pooka
48d146fba6 cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd.  No functional change.
2008-11-18 11:36:58 +00:00
matt
7408df1239 Change {ff,fd}_exclose and ff_allocated to bool. Change exclose arg to
fd_dup to bool.  Switch assignments from 1/0 to true/false.

This makes alpha kernels compile.  Bump kern to 4.99.69 since structure
changed.
2008-07-02 16:45:19 +00:00
matt
1906aa3e59 Switch from KASSERT to CTASSERT for those asserts testing sizes of types. 2008-07-02 14:47:34 +00:00
gmcgarry
8b957c9d45 ioctl commands are unsigned long. Changes ABI for fsetown() and fgetown() on 64-bit architectures. 2008-06-24 10:26:26 +00:00
ad
e071d39c84 - Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
2008-05-05 17:11:16 +00:00
martin
ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad
284c2b9aef Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
2008-04-24 18:39:20 +00:00
ad
6d70f903e6 Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
  be sent from a hardware interrupt handler. Signal activity must be
  deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
  and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
2008-04-24 15:35:27 +00:00
wiz
e47f3f6ebe Commit fix for the fdfile leak described in PR 38374.
Patch provided by YAMAMOTO Takashi.

Ok ad@
2008-04-09 19:36:59 +00:00
ad
feb4783fdf Replace use of CACHE_LINE_SIZE in some obvious places. 2008-03-27 18:30:15 +00:00
ad
c743ad7159 File descriptor changes, discussed on tech-kern:
- Redo reference counting to be sane.  LWPs accessing files take a short
  term reference on the local file descriptor.  This is the most common
  case.  While a file is in a process descriptor table, a reference is
  held to the file.  The file reference count only changes during control
  operations like open() or close().  Code that comes at files from an
  unusual direction (i.e. foreign to the process) like procfs or sysctl
  takes a reference on the file (f_count), and not on a descriptor.

- Remove knowledge of reference counting and locking from most code that
  deals with files.

- Make the usual case of file descriptor lookup lockless.

- Make kqueue MP and MT safe. PR kern/38098, PR kern/38137.

- Fix numerous file handling bugs, and bugs in the descriptor code that
  affected multithreaded processes.

- Split descriptor system calls out into sys_descrip.c.

- A few stylistic changes: KNF, remove unused casts now that caddr_t is
  gone. Replace dumb gotos with loop control in a few places.

- Don't do redundant pointer passing (struct proc, lwp, filedesc *) unless
  the routine is likely to be inlined.  Most of the time it's about the
  current process.
2008-03-21 21:53:35 +00:00
ad
fb4dec8738 - Shrink 'struct file' to 60 bytes on 32-bit platforms.
- Align 'struct file' and 'struct filedesc' to CACHE_LINE_SIZE.
2008-02-06 21:51:36 +00:00
dsl
460b556c90 Move the prototype for do_posix_fadvise() somewhere useful. 2008-01-27 19:48:52 +00:00
martin
2e87d89112 Implement new version of posix_fadvise as a stub calling the real
worker function, and compatibility stub doing the same with old argument
structure.
2008-01-27 16:16:50 +00:00
ad
58dc3540b0 Add fgetdummy/fputdummy: allocate and free dummy 'struct file' entries
to be used when traversing filehead.
2008-01-05 23:53:21 +00:00
dsl
8a62c0f2a5 Use FILE_LOCK() and FILE_UNLOCK() 2008-01-05 19:08:48 +00:00
ad
ea3f10f7e0 Merge more changes from vmlocking2, mainly:
- Locking improvements.
- Use pool_cache for more items.
2007-12-26 16:01:34 +00:00
dsl
7e2790cf6f Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
    int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
2007-12-20 23:02:38 +00:00
pooka
db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from now on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
ad
1cb3506898 Use atomics to adjust filedesc::fd_refcnt. 2007-11-29 18:17:47 +00:00
ad
6182ac0595 Use atomics to adjust cwdi_refcnt. 2007-11-29 18:15:14 +00:00
ad
d18c6ca4de Merge from vmlocking:
- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
2007-11-07 00:23:13 +00:00
ad
451aacda90 Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.
2007-10-08 15:12:05 +00:00