Commit Graph

126 Commits

Author SHA1 Message Date
ad
d991fcb3b6 More changes to improve kern_descrip.c.
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
  It was only being used to synchronize close, and in any case we needed
  to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
  that we can eliminate the membar_consumer() call in fd_getfile().  This is
  mostly syntactic sugar; the main functional change is that fd_nfiles now
  lives alongside the open file array.

Some measurements with libmicro:

- simple file syscalls are like close() are between 1 to 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
2009-05-24 21:41:25 +00:00
yamt
706e6928e0 tweak some assertions on so_head to make them more meaningful. 2009-05-04 06:02:40 +00:00
yamt
bdeb01233f 0 -> NULL 2009-04-09 00:57:15 +00:00
yamt
e29a551d79 remove an unnecessary cast. 2009-04-09 00:44:32 +00:00
yamt
2c68552273 0 -> NULL where appropriate 2009-04-09 00:37:32 +00:00
mrg
9ba87b8cc3 completely rework the way that orphaned sockets that are being fdpassed
via SCM_RIGHTS messages are dealt with:

1. unp_gc: make this a kthread.

2. unp_detach: go not call unp_gc directly. instead, wake up unp_gc kthread.

3. unp_scan: do not close files here. instead, put them on a global list
   for unp_gc to close, along with a per-file "deferred close count". if
   file is already enqueued for close, just increment deferred close count.
   this eliminates the recursive calls.

3. unp_gc: scan files on global deferred close list. close each file N
   times, as specified by deferred close count in file. continue processing
   list until it becomes empty (closing may cause additional files to be
   queued for close).

4. unp_gc: add additional bit to mark files we are scanning. set during
   initial scan of global file list that currently clears FMARK/FDEFER.
   during later scans, never examine / garbage collect descriptors that
   we have not marked during the earlier scan. do not proceed with this
   initial scan until all deferred closes have been processed. be careful
   with locking to ensure no races are introduced between deferred close
   and file scan.

5. unp_gc: use dummy file_t to mark position in list when scanning. allow
   us to drop filelist_lock. in turn allows us to eliminate kmem_alloc()
   and safely close files, etc.

6. prohibit transfer of descriptors within SCM_RIGHTS messages if
   (num_files_in_transit > maxfiles / unp_rights_ratio)

7. fd_allocfile: ensure recycled filse don't get scanned.


this is 97% work done by andrew doran, with a couple of minor bug fixes
and a lot of testing by yours truly.
2009-03-11 06:05:29 +00:00
pooka
8e52b87ec3 Don't try to fd_putfile() descriptors we didn't manage to fd_getfile().
Fixes local DoS panic described in kern/40570.
2009-02-08 16:38:12 +00:00
pooka
7e5aba5af0 Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.
2008-10-11 13:40:57 +00:00
plunky
fd7356a917 Convert socket options code to use a sockopt structure
instead of laying everything into an mbuf.

approved by core
2008-08-06 15:01:23 +00:00
christos
5e865a2f1b Also enforce that cm->cmsg_len >= CMSG_ALIGN(sizeof cmsghdr), from
Michael van Elst
2008-06-20 15:27:50 +00:00
christos
db982eb35b Don't require cm->cmsg_len == control->m_len, just that the cm->cmsg_len
<= control->m_len, like FreeBSD does. Idea from Taylor R Campbell.
2008-06-20 15:18:38 +00:00
ad
06da5288fa There can be existing waiters on a socket's condition variables when we
change socket::so_lock, and they rely on the old lock to synchronize.
Wake them up whenever we change so_lock so they can restart their waits.
2008-06-10 11:49:11 +00:00
martin
ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad
2759896048 Add a comment. 2008-04-27 11:29:12 +00:00
ad
15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
mlelstv
77f5b73003 When unp_internalize fails (due to the sanity check or an out-of-memory
condition), it leaves the control message with file descriptors. Calling
unp_dispose() will interpret the message as containing file pointers
and crash the system.
This change removes unp_dispose() from this failure path and avoids
using goto to jump into switch statements...
The previous workaround to ignore such messages in unp_scan() is removed.
2008-04-20 07:47:18 +00:00
mjf
ede732e020 If cm->cmsg_len is not valid for unp_internalize do not use it to work out
where the data is in unp_scan.

Fixes PR/38391
2008-04-19 22:26:52 +00:00
ad
4bd84ff96a Prevent overlapping calls to bind() and/or connect() on a Unix socket. 2008-03-28 12:14:22 +00:00
yamt
9a4b7dd279 merge yamt-lazymbuf branch. 2008-03-24 12:24:37 +00:00
rmind
cbb7f92857 unp_gc: unlock filelist_lock in a case of restart. 2008-03-21 23:38:40 +00:00
ad
a9ca7a3734 Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
2008-03-21 21:54:58 +00:00
ad
1997a1e1f4 Remove VOP_LEASE. Discussed on tech-kern. 2008-01-25 14:32:11 +00:00
dsl
8a62c0f2a5 Use FILE_LOCK() and FILE_UNLOCK() 2008-01-05 19:08:48 +00:00
pooka
db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
pooka
61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
ad
451aacda90 Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.
2007-10-08 15:12:05 +00:00
dyoung
8cbfeac89a Make uipc_ctloutput() return ENOPROTOOPT instead of EINVAL when it
is passed a handle socket-option level that it does not care about.
2007-09-19 06:23:53 +00:00
he
fd961c4429 Add a new socket option for unix domain sockets: LOCAL_PEEREID, to make
it possible to get the pid, euid and egid of the process at the remote
end at the time it did bind() or connect().

Add a new libc function, getpeereid() to easily get at the euid and egid.
As a consequence, bump libc's minor number.

Document the LOCAL_PEEREID socket option in unix(4).

Based on contribution by Arne H. Juul, minor modifications by myself.
2007-08-09 15:23:01 +00:00
martin
d5d0a7225c PR kern/32842:
do not leak file descriptors when sending a datagram with SCM_RIGHTS
fails. Patch from Gary Thorpe, based on changes in FreeBSD and work
from Christian Biere.
2007-08-03 20:49:45 +00:00
dsl
b8fbaf8c4b Change the way that emulations locate files within the emulation root to
avoid having to allocate space in the 'stackgap'
  - which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
  the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
  during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
  search for absolute pathnames in the emulation root, if that fails it will
  retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
  of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
  relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
  inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
  the real root is returned instead (matching the behaviour of emul_lookup,
  but being a cheap comparison here) so that programs that scan "../.."
  looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
  CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
  TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
2007-04-22 08:29:55 +00:00
hannken
0adf7298aa Remove calls to now obsolete vn_start_write() and vn_finished_write(). 2007-04-03 16:11:31 +00:00
christos
53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
cbiere
7aa8c7d98c Pointing one element past an array is fine, pointing before it isn't. 2006-11-01 11:37:59 +00:00
christos
842f306745 use c99 initializers 2006-09-03 21:12:14 +00:00
ad
f474dceb13 Use the LWP cached credentials where sane. 2006-07-23 22:06:03 +00:00
elad
215bd95ba4 integrate kauth. 2006-05-14 21:15:11 +00:00
christos
ce9b807645 Coverity CID 1089: Add more KASSERTs to prevent NULL deref. 2006-04-14 23:15:21 +00:00
christos
ff57dc92a8 Coverity CID 1088: Add KASSERT to prevent NULL pointer deref. 2006-04-14 23:12:14 +00:00
matt
cda5c405e0 Add a KASSERT to document a condition for the PRU_ABORT case. 2006-04-13 04:58:31 +00:00
christos
a529c06750 PR/32856: Christian Biere: Don't panic if you send a control message with
SCM_RIGHTS on an unconnected stream socket.
2006-03-01 02:06:11 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
simonb
a21c456e2e Call nanotime() directly, instead of doing the
microtime()/TIMEVAL_TO_TIMESPEC() dance.
2005-11-11 07:07:42 +00:00
jmmv
b077bb7f72 Honor the user's umask while creating local sockets. Several other systems
do already this (such as FreeBSD, OpenBSD and Linux), so it will improve
portability of some third-party programs.  No objections in tech-kern@.
2005-08-30 15:03:04 +00:00
yamt
91fa31b5d2 uipc_usrreq: plug mbuf leak. 2005-06-16 14:36:42 +00:00
christos
efb6943313 - add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.
2005-05-29 22:24:14 +00:00
christos
761bd09636 PR/30154: YAMAMOTO Takashi: tcp_close locking botch
chgsbsize() as mentioned in the PR can be called from an interrupt context
via tcp_close(). Avoid calling uid_find() in chgsbsize().
- Instead of storing so_uid in struct socketvar, store *so_uidinfo
- Add a simple lock to struct uidinfo.
2005-05-07 17:42:09 +00:00
perry
da8abec863 nuke trailing whitespace 2005-02-26 21:34:55 +00:00
darrenr
02c34673a3 add a per-socket counter for dropped UDP packets when the internal buffers
are full.
2004-09-03 18:14:09 +00:00
jonathan
230fb9b8ab Eliminate several uses of `curproc' from the socket-layer code and from NFS.
Add a new explicit `struct proc *p' argument to socreate(), sosend().
Use that argument instead of curproc. Follow-on changes to pass that
argument to socreate(), sosend(), and (*so->so_send)() calls.
These changes reviewed and independently recoded  by Matt Thomas.

Changes to soreceive() and (*dom->dom_exernalize() from Matt Thomas:
pass soreceive()'s struct uio* uio->uio_procp to unp_externalize().
Eliminate curproc from unp_externalize.   Also, now soreceive() uses
its uio->uio_procp value, pass that same value downward to
((pr->pru_usrreq)() calls for consistency, instead of (struct proc * )0.

Similar changes in sys/nfs to eliminate (most) uses of curproc,
either via the req-> r_procp field of a struct nfsreq *req argument,
or by passing down new explicit struct proc * arguments.

Reviewed by: Matt Thomas, posted to tech-kern.
NB: The (*pr->pru_usrreq)() change should be tested on more (all!) protocols.
2004-05-22 22:52:13 +00:00
matt
ac57eb9d5b Constify sun_noname. 2004-04-18 22:20:32 +00:00