Commit Graph

181 Commits

Author SHA1 Message Date
rmind e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
dholland 8f6ed30d57 Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.
2010-11-19 06:44:33 +00:00
pooka ff08ca3b05 Zero entire stat structure before filling in contents to avoid
leaking kernel memory -- the elements are no longer packed now that
dev_t is 64bit.

from pgoyette
2010-10-28 20:32:45 +00:00
chs 38b9dc3505 implement O_DIRECTORY as standardized in POSIX-2008,
for both native and linux emulations.
this fixes the rest of PR 43695.
2010-09-21 19:26:18 +00:00
pooka aeb7e802ec I'm not even going to describe this change. I'll just say that
churn creates interesting code.

Fixes open(O_CREAT|O_TRUNC) on at least tmpfs and nfs to not fail
with ENOENT due to a racy removal of the newly created file.

Caught, as most bugs these days are, by a test run.
2010-08-25 13:51:50 +00:00
hannken 2ac44efee2 Modify vn_lock():
- Take v_interlock before examining v_iflag
- Must always be called without v_interlock taken,
  LK_INTERLOCK flag is no longer allowed.
2010-07-28 09:30:21 +00:00
pooka c99c5d9840 Don't leak kernel stack into userspace. 2010-07-13 15:38:15 +00:00
hannken 1423e65b26 Clean up vnode lock operations pass 2:
VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
2010-06-24 12:58:48 +00:00
hannken 2c090918c7 Remove the concept of recursive vnode locks by eliminating
vn_setrecurse(), vn_restorerecurse() and LK_CANRECURSE.
Welcome to 5.99.31

Discussed on tech-kern.
2010-06-18 16:29:01 +00:00
hannken 62bfdd2b21 Change layered file systems to always pass the locking VOP's down to the
leaf file system.  Remove now unused member v_vnlock from struct vnode.
Welcome to 5.99.30

Discussed on tech-kern.
2010-06-06 08:01:30 +00:00
pooka 0c20c076ce Enforce RLIMIT_FSIZE before VOP_WRITE. This adds support to file
system drivers where it was missing from and fixes one buggy
implementation.  The arguably weird semantics of the check are
maintained (v_size vs. va_bytes, overwrite).
2010-04-23 15:38:46 +00:00
pooka 242bf1c3e7 Stop exposing fifofs internals and leave only fifo_vnodeop_p visible. 2010-03-29 13:11:32 +00:00
pooka c3183f3251 The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase).  Plenty of mix'n match upper/lowercase has creeped
into the tree since then.  Nuke the macros and convert all callsites
to lowercase.

no functional change
2010-01-08 11:35:07 +00:00
dsl 2a54322c7b If a multithreaded app closes an fd while another thread is blocked in
read/write/accept, then the expectation is that the blocked thread will
exit and the close complete.
Since only one fd is affected, but many fd can refer to the same file,
the close code can only request the fs code unblock with ERESTART.
Fixed for pipes and sockets, ERESTART will only be generated after such
a close - so there should be no change for other programs.
Also rename fo_abort() to fo_restart() (this used to be fo_drain()).
Fixes PR/26567
2009-12-20 09:36:05 +00:00
dsl 7a42c833db Rename fo_drain() to fo_abort(), 'drain' is used to mean 'wait for output
do drain' in many places, whereas fo_drain() was called in order to force
blocking read()/write() etc calls to return to userspace so that a close()
call from a different thread can complete.
In the sockets code comment out the broken code in the inner function,
it was being called from compat code.
2009-12-09 21:32:58 +00:00
yamt 7e13bf31c7 remove FILE_LOCK and FILE_UNLOCK. 2009-05-17 05:54:42 +00:00
christos 86ba58fd64 Fix locking as Andy explained. Also fill in uid and gid like sys_pipe did. 2009-04-11 23:05:26 +00:00
ad c6367674d6 Add fileops::fo_drain(), to be called from fd_close() when there is more
than one active reference to a file descriptor. It should dislodge threads
sleeping while holding a reference to the descriptor. Implemented only for
sockets but should be extended to pipes, fifos, etc.

Fixes the case of a multithreaded process doing something like the
following, which would have hung until the process got a signal.

thr0	accept(fd, ...)
thr1	close(fd)
2009-04-04 10:12:51 +00:00
enami 7efd6dbe73 Make module (auto)loading under chroot envrionment actually work:
- NOCHROOT flag must be assigned to different bit from TRYEMULROOT
  since the code expected to be executed is in the else clase of
  if (flags & TRYEMULROOT).
- Necessary variables aren't set.
2009-02-11 00:19:11 +00:00
yamt cea19a4d14 malloc -> kmem_alloc. 2009-01-17 07:02:35 +00:00
ad 0efea177e3 Remove LKMs and switch to the module framework, pass 1.
Proposed on tech-kern@.
2008-11-12 12:35:50 +00:00
christos f4f1ff39a4 Writing 0 bytes on an O_APPEND file should not affect the offset 2008-08-27 06:28:09 +00:00
simonb 36d65f1138 Merge the simonb-wapbl branch. From the original branch commit:
Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
   journaling code.  Originally written by Darrin B. Jewell while
   at Wasabi and updated to -current by Antti Kantee, Andy Doran,
   Greg Oster and Simon Burge.

OK'd by core@, releng@.
2008-07-31 05:38:04 +00:00
ad 0d151db6c9 Don't needlessly acquire v_interlock. 2008-06-02 16:00:33 +00:00
ad 8ad974fae5 vn_marktext, vn_lock: don't needlessly acquire v_interlock. 2008-06-02 15:29:18 +00:00
ad 6d70f903e6 Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
  be sent from a hardware interrupt handler. Signal activity must be
  deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
  and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
2008-04-24 15:35:27 +00:00
ad a9ca7a3734 Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
2008-03-21 21:54:58 +00:00
ad 3490efcc63 Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
2008-01-30 09:50:19 +00:00
ad 38076b60ea vn_setrecurse: if no lock is exported, use v_lock. Works around issue
described in PR kern/37808. The ideal solution here is to kill vnode
lock recursion, which should not be hard once it is understood what
the two remaining callers of vn_setrecurse() are doing.
2008-01-25 14:37:33 +00:00
ad 1997a1e1f4 Remove VOP_LEASE. Discussed on tech-kern. 2008-01-25 14:32:11 +00:00
pooka 625cda7eb5 vn_write: include f_advice in VOP_WRITE 2008-01-25 06:32:20 +00:00
dsl 8a62c0f2a5 Use FILE_LOCK() and FILE_UNLOCK() 2008-01-05 19:08:48 +00:00
ad 4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
pooka db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
hannken d556dc98b0 Fscow_run(): add a flag "bool data_valid" to note still valid data.
Buffers run through copy-on-write are marked B_COWDONE.  This condition
is valid until the buffer has run through bwrite() and gets cleared from
biodone().

Welcome to 4.99.39.

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2007-12-02 13:56:15 +00:00
yamt 8ec86368e9 - reduce the number of VOP_ACCESS calls for O_RDWR. for nfs, it reduces
the number of rpcs.
- reduce code duplication.
2007-11-30 16:52:19 +00:00
ad e9e11b98df Use atomics to maintain uvmexp.{anon,exec,file}pages. 2007-11-29 18:07:11 +00:00
pooka 61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
ad 7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
ad 451aacda90 Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.
2007-10-08 15:12:05 +00:00
hannken 3856acafe2 Update the file system copy-on-write handler.
- Instead of hooking the handler on the specdev of a mounted file system
  hook directly on the `struct mount'.

- Rename from `vn_cow_*' to `fscow_*' and move to `kern/vfs_trans.c'.  Use
  `mount_*specific' instead of clobbering `struct mount' or `struct specinfo'.

- Replace the hand-made reader/writer lock with a krwlock.

- Keep `vn_cow_*' functions and mark as obsolete.

- Welcome to NetBSD 4.99.32 - `struct specinfo' changed size.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
2007-10-07 13:38:53 +00:00
pooka 05ce20f4a0 Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden
2007-07-22 19:16:04 +00:00
christos 09a50be501 - remove pathname_ interface.
- use macros to deal with pathnames in userspace, when veriexec is used.
- reorder the veriexec_ call arguments for consistency.
With help from elad@ finding the last bug.
2007-05-19 22:11:22 +00:00
dsl b8fbaf8c4b Change the way that emulations locate files within the emulation root to
avoid having to allocate space in the 'stackgap'
  - which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
  the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
  during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
  search for absolute pathnames in the emulation root, if that fails it will
  retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
  of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
  relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
  inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
  the real root is returned instead (matching the behaviour of emul_lookup,
  but being a cheap comparison here) so that programs that scan "../.."
  looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
  CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
  TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
2007-04-22 08:29:55 +00:00
hannken fc6776f366 Remove now obsolete vn_start_write() and vn_finished_write() and
corresponding flags.

Revert softdep_trackbufs() to its state before vn_start_write() was added.

Remove from struct mount now unneeded flags IMNT_SUSPEND* and
members mnt_writeopcountupper, mnt_writeopcountlower and mnt_leaf.

Welcome to 4.99.17
2007-04-08 11:20:42 +00:00
hannken 0adf7298aa Remove calls to now obsolete vn_start_write() and vn_finished_write(). 2007-04-03 16:11:31 +00:00
ad c147748d84 - Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
2007-03-09 14:11:22 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
hannken 198beb0314 Make fstrans(9) the default helper for file system suspension.
Replaces the now obsolete vn_start_write()/vn_finished_write().
2007-02-16 17:23:53 +00:00
ad b07ec3fc38 Merge newlock2 to head. 2007-02-09 21:55:00 +00:00