Commit Graph

573 Commits

Author SHA1 Message Date
rmind
ad12c77015 Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.
2012-02-19 21:05:51 +00:00
hannken
2cc7a01f10 Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock.  Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.
2011-10-14 09:23:28 +00:00
christos
ac55ba72e3 return the namemax from the bsd statvfs which is filesystem dependent, not
a random value.
2011-09-27 00:52:55 +00:00
christos
e2bebf7172 * Arrange for interfaces that create new file descriptors to be able to
set close-on-exec on creation (http://udrepper.livejournal.com/20407.html).

    - Add F_DUPFD_CLOEXEC to fcntl(2).
    - Add MSG_CMSG_CLOEXEC to recvmsg(2) for unix file descriptor passing.
    - Add dup3(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
    - Add pipe2(2) syscall with a flags argument for O_CLOEXEC, O_NONBLOCK.
    - Add flags SOCK_CLOEXEC, SOCK_NONBLOCK to the socket type parameter
      for socket(2) and socketpair(2).
    - Add new paccept(2) syscall that takes an additional sigset_t to alter
      the sigmask temporarily and a flags argument to set SOCK_CLOEXEC,
      SOCK_NONBLOCK.
    - Add new mode character 'e' to fopen(3) and popen(3) to open pipes
      and file descriptors for close on exec.
    - Add new kqueue1(2) syscall with a new flags argument to open the
      kqueue file descriptor with O_CLOEXEC, O_NONBLOCK.

* Fix the system calls that take socklen_t arguments to actually do so.

* Don't include userland header files (signal.h) from system header files
  (rump_syscallargs.h).

* Bump libc version for the new syscalls.
2011-06-26 16:42:39 +00:00
joerg
13011308e4 Explicitly initialize ucontext before calling getmcontext. 2011-02-03 21:45:31 +00:00
dholland
14402d0ff1 Abolish the SAVENAME and HASBUF flags. There is now always a buffer,
so the path in a struct componentname is now always valid during VOP
calls.
2010-11-30 10:43:01 +00:00
dholland
d4eb05390d Abolish struct componentname's cn_pnbuf. Use the path buffer in the
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)

This removes the need for the SAVENAME and HASBUF namei flags.
2010-11-30 10:29:57 +00:00
dholland
8f6ed30d57 Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.
2010-11-19 06:44:33 +00:00
chs
33fa5ccbbf many changes for COMPAT_LINUX:
- update the linux syscall table for each platform.
 - support new-style (NPTL) linux pthreads on all platforms.
   clone() with CLONE_THREAD uses 1 process with many LWPs
   instead of separate processes.
 - move the contents of sys__lwp_setprivate() into a new
   lwp_setprivate() and use that everywhere.
 - update linux_release[] and linux32_release[] to "2.6.18".
 - adjust placement of emul fork/exec/exit hooks as needed
   and adjust other emul code to match.
 - convert all struct emul definitions to use named initializers.
 - change the pid allocator to allow multiple pids to refer to the same proc.
 - remove a few fields from struct proc that are no longer needed.
 - disable the non-functional "vdso" code in linux32/amd64,
   glibc works fine without it.
 - fix a race in the futex code where we could miss a wakeup after
   a requeue operation.
 - redo futex locking to be a little more efficient.
2010-07-07 01:30:32 +00:00
rmind
3c507045e2 Remove pfind() and pgfind(), fix locking in various broken uses of these.
Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags
and have consistent behaviour.  Provide proc_find_raw() for special cases.
Fix memory leak in sysctl_proc_corename().

COMPAT_LINUX: rework ptrace() locking, minimise differences between
different versions per-arch.

Note: while this change adds some formal cosmetics for COMPAT_DARWIN and
COMPAT_IRIX - locking there is utterly broken (for ages).

Fixes PR/43176.
2010-07-01 02:38:26 +00:00
hannken
1423e65b26 Clean up vnode lock operations pass 2:
VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
2010-06-24 12:58:48 +00:00
he
7a30544200 When implementing "read directory", when there are too many empty entries
in a row, and we need to try to read the next block, and have passed a
non-NULL cookie pointer to VOP_READDIR, ensure that we free the cookie
buffer before re-doing VOP_READDIR, so that we don't leak memory.
This fix is similar to nfs_serv.c revisions 1.115 + 1.124.

This should fix the long-standing problem observed by e.g. using Linux-
emulated programs to take backup of servers, which is one of the problems
which were reported in PR#42661.

Thanks to pooka@ for the hints for traversing the VOP* layer.
2010-03-03 08:20:38 +00:00
drochner
3f900515e4 add missing glue file, otherwise the emulation will not work if
compiled into the kernel (and the module loader will load another
instance, causing symbol duplication)
2010-02-14 11:54:03 +00:00
dsl
2a54322c7b If a multithreaded app closes an fd while another thread is blocked in
read/write/accept, then the expectation is that the blocked thread will
exit and the close complete.
Since only one fd is affected, but many fd can refer to the same file,
the close code can only request the fs code unblock with ERESTART.
Fixed for pipes and sockets, ERESTART will only be generated after such
a close - so there should be no change for other programs.
Also rename fo_abort() to fo_restart() (this used to be fo_drain()).
Fixes PR/26567
2009-12-20 09:36:05 +00:00
matt
15aa4c53c9 Regen (new makesyscalls.sh) 2009-12-14 00:53:32 +00:00
matt
6a9e4e8eeb Change u_long to vaddr_t/vsize_t in exec code where appropriate (mostly
involves setregs and vmcmds).  Should result in no code differences.
2009-12-10 14:13:48 +00:00
dsl
7a42c833db Rename fo_drain() to fo_abort(), 'drain' is used to mean 'wait for output
do drain' in many places, whereas fo_drain() was called in order to force
blocking read()/write() etc calls to return to userspace so that a close()
call from a different thread can complete.
In the sockets code comment out the broken code in the inner function,
it was being called from compat code.
2009-12-09 21:32:58 +00:00
rmind
eaddd78061 Use lwp_getpcb() in compat code, clean from struct user. 2009-11-23 00:46:06 +00:00
rafal
0c8283fe12 Fix fallout from do_sys_wait changes (hi, rmind!) 2009-11-05 18:39:38 +00:00
rmind
4c1098f541 do_sys_wait(): fix previous by checking for ru != NULL. Noticed by
Onno van der Linden.  Also, remove redundant arguments (seems that
was_zombie was not used since rev 1.177 ?).
2009-11-04 21:23:02 +00:00
haad
5200b9b492 Add enum uio_seg argument to do_sys_mknod and do_sys_mkdir so these functions
can be called from kernel, too.

Change needed for zfs device node creation, until we have propoer devfs.

Oked by ad@.
2009-08-09 22:49:00 +00:00
ad
d991fcb3b6 More changes to improve kern_descrip.c.
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
  It was only being used to synchronize close, and in any case we needed
  to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
  that we can eliminate the membar_consumer() call in fd_getfile().  This is
  mostly syntactic sugar; the main functional change is that fd_nfiles now
  lives alongside the open file array.

Some measurements with libmicro:

- simple file syscalls are like close() are between 1 to 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
2009-05-24 21:41:25 +00:00
ad
c6367674d6 Add fileops::fo_drain(), to be called from fd_close() when there is more
than one active reference to a file descriptor. It should dislodge threads
sleeping while holding a reference to the descriptor. Implemented only for
sockets but should be extended to pipes, fifos, etc.

Fixes the case of a multithreaded process doing something like the
following, which would have hung until the process got a signal.

thr0	accept(fd, ...)
thr1	close(fd)
2009-04-04 10:12:51 +00:00
mrg
fcc023545e - add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes.  this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS.  (in some cases, the alias
RLIMIT_VMEM was already present and used if availble.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information.  (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897.  it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most espcially
chuq and yamt for critical suggestions that lead to this patch not
having a special ugliness i wasn't happy with anyway :-)
2009-03-29 01:02:48 +00:00
dsl
82357f6d42 ANSIfy another 1261 function definitions.
The only ones left in sys are beyond by sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
2009-03-14 21:04:01 +00:00
cegger
e6ae80b89d buildfix: re-adapt to minor() returning a 32bit value again. 2009-01-22 16:10:19 +00:00
pooka
09ba2d6689 Regen to prove I didn't screw up the conversion: purely RCSID changes. 2009-01-13 22:33:11 +00:00
pooka
a9a2ce837b Convert the syscalls.master to a format from which it is easier
to parse and generate the compat name and basename (e.g. __stat50
and stat).  Use this to autogenerate __RENAME()'s to the rump_syscalls
header so that they can be called e.g. rump_sys_socket() instead
of rump_sys___socket30().
2009-01-13 22:27:43 +00:00
pooka
ef4e11410c regen 2009-01-13 22:08:59 +00:00
pooka
4180a41378 UNIMPL police 2009-01-13 22:08:30 +00:00
cegger
378bd0b618 make this compile 2009-01-11 10:47:37 +00:00
christos
461a86f9bd merge christos-time_t 2009-01-11 02:45:45 +00:00
ad
e157b99c7c Regen. 2008-11-19 18:39:43 +00:00
ad
92ce8c6a3d Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
2008-11-19 18:35:57 +00:00
wrstuden
fc7511b00e Merge wrstuden-revivesa into HEAD. 2008-10-15 06:51:17 +00:00
matt
7408df1239 Change {ff,fd}_exclose and ff_allocated to bool. Change exclose arg to
fd_dup to bool.  Switch assignments from 1/0 to true/false.

This make alpha kernels compile.  Bump kern to 4.99.69 since structure
changed.
2008-07-02 16:45:19 +00:00
ad
a00bd89dab Replace references to getsock/getvnode. 2008-06-24 11:18:14 +00:00
dogcow
8b20d17adf Regen. 2008-06-18 02:09:19 +00:00
dogcow
f76aab4a72 include sys/sched.h for cpuset-related build lossage 2008-06-18 02:08:36 +00:00
martin
b46765907d Minor typo in license 2008-05-10 11:49:37 +00:00
martin
ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad
284c2b9aef Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
2008-04-24 18:39:20 +00:00
ad
6d70f903e6 Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
  be sent from a hardware interrupt handler. Signal activity must be
  deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
  and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
2008-04-24 15:35:27 +00:00
ad
15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
ad
dd55791531 Regen. 2008-04-23 14:10:03 +00:00
ad
7cb840338e -SYCALL_MPSAFE 2008-04-23 14:07:49 +00:00
ad
f1b58cdc9b Sprinkle locks. 2008-04-23 13:58:06 +00:00
ad
98c2709959 Sprinkle locking. 2008-04-23 13:51:48 +00:00
ad
dc31d7f7e9 sys_kill -> sys__lwp_kill 2008-04-23 13:49:21 +00:00
ad
be04ac4896 Make rusage collection per-LWP and collate in the appropriate places.
cloned threads need a little bit more work but the locking needs to
be fixed first.
2008-03-27 19:06:51 +00:00