Commit Graph

942 Commits

Author SHA1 Message Date
christos 3bccc0f766 Handle files with a large number of mappings gracefully. Reported by Nicholas
Joly.
2008-07-25 17:40:24 +00:00
rmind 160268aca6 Remove proc_representative_lwp(), use a simple LIST_FIRST() instead.
OK by <ad>.
2008-07-02 19:49:58 +00:00
rumble 28f5ebd853 Create sysctl entries during module initialisation and destroy them
appropriately.

Many of these file systems are now ready for modularisation.
2008-06-28 01:34:05 +00:00
ad a8ced4f0d0 Set up the sysctl tree correctly when loaded as a file system. 2008-06-24 11:25:05 +00:00
ad a00bd89dab Replace references to getsock/getvnode. 2008-06-24 11:18:14 +00:00
ad 06c343ac94 vm_page: put TAILQ_ENTRY into a union with LIST_ENTRY, so we can use both. 2008-06-04 12:41:40 +00:00
ad 736a4d9b78 Kill devsw_lock and just use specfs_lock. The two would need merging
in order to prevent unload of modules when a device that they provide
is still open.
2008-05-31 21:34:42 +00:00
hannken 5d2bff060a Make sure all cached buffers with valid, not yet written data have been
run through copy-on-write.  Call fscow_run() with valid data where possible.

The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against
endless recursion.

- Add a flag B_MODIFY to bread(), breada() and breadn().  If set the caller
  intends to modify the buffer returned.

- Always run copy-on-write on buffers returned from ffs_balloc().

- Add new function ffs_getblk() that gets a buffer, assigns a new blkno,
  may clear the buffer and runs copy-on-write.  Process possible errors
  from getblk() or fscow_run().  Part of PR kern/38664.

Welcome to 4.99.63

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2008-05-16 09:21:59 +00:00
reinoud e979c658c9 Import writing part of the UDF file system making optical media like CD's
and DVD's behave like floppy discs. Writing is supported upto and including
version 2.01; version 2.50 and 2.60 will follow.

Also extending the UDF implementation to support symbolic links and
hardlinks.

Added are the mmcformat(8) tool to format rewritable CD/DVD discs and
newfs_udf(8).

Limitations:
        all operations can be performed on the file system though the
        sheduling is currently optimised for archiving workloads.

        mv(1)/rename(2) is currently only implemented for non-directories.
2008-05-14 16:49:47 +00:00
simonb 2fd5130380 mnt_data is a pointer, set it to NULL not 0 when we're finished with it. 2008-05-13 08:31:12 +00:00
rumble a1221b6d4a Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
2008-05-10 02:26:09 +00:00
ad 42d0626726 PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
  sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
  and is only ever write locked in dounmount(). A write hold can't be taken
  on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
  example when going r/o -> r/w, and is only present to serialize updates.
  In order to take this lock, a read hold must first be taken on
  mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
2008-05-06 18:43:44 +00:00
ad e071d39c84 - Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
2008-05-05 17:11:16 +00:00
ad 928a6b2096 PR kern/38135 vfs_busy/vfs_trybusy confusion
The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
2008-04-30 12:49:16 +00:00
ad e3610f1886 kern/38135 vfs_busy/vfs_trybusy confusion
The symptom was that sometimes file systems would occasionally not appear
in output from 'df' or 'mount' if the system was busy. Resolution:

- Make mount locks work somewhat like vm_map locks.
- vfs_trybusy() now only fails if the mount is gone, or if someone is
  unmounting the file system. Simple contention on mnt_lock doesn't
  cause it to fail.
- vfs_busy() will wait even if the file system is being unmounted.
2008-04-29 23:51:04 +00:00
ad baa3395f8f PR kern/38057 ffs makes assuptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
2008-04-29 18:18:08 +00:00
martin ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad 284c2b9aef Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
2008-04-24 18:39:20 +00:00
ad 6d70f903e6 Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
  be sent from a hardware interrupt handler. Signal activity must be
  deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
  and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
2008-04-24 15:35:27 +00:00
ad ef9411cb09 Fix locking in the fifo kqueue routines. 2008-04-24 15:18:11 +00:00
ad 15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
hannken 0789b071d1 Remove a race when pages are released while waiting for fstrans_start().
Fixes PR #38460
2008-04-19 11:53:13 +00:00
hannken dc04f63f5b Remove stale include <sys/fstrans.h>. 2008-04-19 11:49:54 +00:00
ad a9ca7a3734 Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
2008-03-21 21:54:58 +00:00
yamt 29e0fd1c9e sprinkle KERNEL_LOCK for socket.
a little different version was tested by Matthias Drochner.
2008-02-11 23:53:32 +00:00
ad d7f6ec471c Don't lock the socket to set/clear FNONBLOCK. Just set it atomically. 2008-02-06 21:57:53 +00:00
ad 22c6a20ebd Lock v_knlist with the vnode interlock. PR kern/37881. 2008-02-05 14:19:52 +00:00
ad 25153c3ec9 PR kern/37706 (forced unmount of file systems is unsafe):
- Do reference counting for 'struct mount'. Each vnode associated with a
  mount takes a reference, and in turn the mount takes a reference to the
  vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
  locking inherited from 4.4BSD with a recursable rwlock.
2008-01-30 11:46:59 +00:00
ad 3490efcc63 Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
2008-01-30 09:50:19 +00:00
dholland 764ffd05f0 Part of the rename patches *doh* 2008-01-28 15:17:54 +00:00
dholland 717e1785a5 Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
2008-01-28 14:31:15 +00:00
hannken 5ab6217754 Spec_open(): clear sd_bdevvp if bdev_open() failed.
Ok: Andrew Doran <ad@netbsd.org>
2008-01-25 16:21:04 +00:00
riz 960857eb6d Since VOP_LEASE is gone, remove genfs_lease_check() too. Now my kernel
builds again.  :)
2008-01-25 15:34:59 +00:00
ad 1997a1e1f4 Remove VOP_LEASE. Discussed on tech-kern. 2008-01-25 14:32:11 +00:00
ad f9a31c8cd0 spec_fsync: don't assert that 'vp' holds the block device open. If it's
not open, there shouldn't be dirty buffers so vinvalbuf() is harmless.
2008-01-24 21:05:52 +00:00
ad 703069c0e9 specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
  vnode can describe a block device. Instead, prohibit concurrent opens of
  block devices. As a bonus remove the unreliable code that prevents
  multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
  goes away, instead of abusing vnode::v_usecount to tell if the device is
  open.
2008-01-24 17:32:52 +00:00
ad 27c0e63a2a layer_node_find: if we find a node being cleaned out, then ignore it and
continue.  A thread trying to clean out the extant layer vnode needs to
acquire the shared lock (i.e. the lower vnode's lock), which our caller
already holds. To allow the cleaning to succeed the current thread must make
progress.  So, for a brief time more than one vnode in a layered file system
may refer to a single vnode in the lower file system.
2008-01-23 20:11:32 +00:00
elad c27d5f30b6 Tons of process scope changes.
- Add a KAUTH_PROCESS_SCHEDULER action, to handle scheduler related
    requests, and add specific requests for set/get scheduler policy and
    set/get scheduler parameters.

  - Add a KAUTH_PROCESS_KEVENT_FILTER action, to handle kevent(2) related
    requests.

  - Add a KAUTH_DEVICE_TTY_STI action to handle requests to TIOCSTI.

  - Add requests for the KAUTH_PROCESS_CANSEE action, indicating what
    process information is being looked at (entry itself, args, env,
    open files).

  - Add requests for the KAUTH_PROCESS_RLIMIT action indicating set/get.

  - Add requests for the KAUTH_PROCESS_CORENAME action indicating set/get.

  - Make bsd44 secmodel code handle the newly added rqeuests appropriately.

All of the above make it possible to issue finer-grained kauth(9) calls in
many places, removing some KAUTH_GENERIC_ISSUSER requests.

  - Remove the "CAN" from KAUTH_PROCESS_CAN{KTRACE,PROCFS,PTRACE,SIGNAL}.

Discussed with christos@ and yamt@.
2008-01-23 15:04:38 +00:00
pooka f7455b20d9 portal_advlock: badop -> eopnotsupp. I guess advlock can be called
for the root vnode and badop panics.

fix in PR kern/25393 by Laurent Sartran
2008-01-19 21:54:47 +00:00
yamt 93a915eb7a genfs_do_putpages: DEBUG checks. 2008-01-18 11:01:23 +00:00
yamt 36c701bcd4 genfs_do_putpages: ensure that we clean the vnode in the case of PGO_RECLAIM. 2008-01-18 11:00:53 +00:00
yamt 2b40f35040 push pmap_clear_reference calls into pdpolicy code, where reference bits
actually matter.
2008-01-18 10:48:23 +00:00
ad 4eb2a42ae6 Fix v_freelisthd assertion failure during call to vdevdone(). No calling
VOPs without a vnode reference!
2008-01-17 17:28:54 +00:00
ad 4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
ad ea3f10f7e0 Merge more changes from vmlocking2, mainly:
- Locking improvements.
- Use pool_cache for more items.
2007-12-26 16:01:34 +00:00
yamt 2294b0bcb6 procfs_douptime: simply use microuptime() instead of a mysterious calculation. 2007-12-22 01:06:54 +00:00
yamt 0d13423925 procfs_docpustat: g/c a write-only variable. 2007-12-22 01:04:55 +00:00
dyoung 6528dd9d56 Bug fix: at the top of layer_bypass(), save a pointer to the mount
point for re-use at the bottom, instead of trying to re-read the
mount point from a potentially vrele()'d vnode.
2007-12-22 00:48:46 +00:00
christos 177940c72e use vnode_to_path. 2007-12-15 23:52:00 +00:00
pooka db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00