Commit Graph

1275 Commits

Author SHA1 Message Date
yamt cead2083b6 fix a botch in PRIxVADDR change (rev.1.38) 2011-07-26 13:18:55 +00:00
hannken 68ad0cad04 Layer_fsync(): when syncing a device node call spec_fsync() to clean the
layer node before descending to the lower file system.

Addresses PR kern/38762 panic: vwakeup: neg numoutput
2011-07-11 08:34:01 +00:00
hannken 49511bba25 Change VOP_BWRITE() to take a vnode as its first argument like all other
VOPs do.  Layered file systems no longer have to modify bp->b_vp and run
into trouble when an async VOP_BWRITE() uses the wrong vnode.

- change all occurrences of VOP_BWRITE(bp) to VOP_BWRITE(bp->b_vp, bp).
- remove layer_bwrite().
- welcome to 5.99.55

Addresses PR kern/38762 panic: vwakeup: neg numoutput

No objections from tech-kern@.
2011-07-11 08:27:37 +00:00
christos 281288535e From Aleksey Cheusov: Don't make it easy for compromised systems to bypass
ASLR protections by providing the mapping addresses of programs to everyone.
2011-06-23 17:06:38 +00:00
rmind e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
hannken 64ee4fdf9f Remove no longer needed flag FSYNC_VFS /* fsync: via FSYNC_VFS() */. 2011-04-27 09:46:27 +00:00
hannken 87522af425 Change vflushbuf() to return an error if a synchronous write fails.
Welcome to 5.99.51.
2011-04-26 11:32:38 +00:00
matt 2420b39245 Move some #ifdefs to prevent a code path change when DEBUG vs. !DEBUG.
Solves the problem of an assert firing when using NFS on MIPS.
2011-04-21 06:27:17 +00:00
rmind 23e8e926b4 G/C unused speedup_syncer() mechanism and thus simplify some code.
Update some comments to reflect the reality.  No actual changes to
the (used) syncer logic.

OK ad@
2011-04-18 15:53:04 +00:00
rmind c71a09f0c6 - Use offsetof() in VOPARG_OFFSETOF() instead of re-implementing it.
- Remove VDESC_NOMAP_VPP and VDESC_VPP_WILLRELE.
- Remove VRELEL_NOINACTIVE and VRELEL_ONHEAD.
2011-04-03 01:19:35 +00:00
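
As an illustration of the offsetof() point in the entry above, here is a minimal userland sketch. It assumes a made-up argument structure; `demo_args`, `DEMO_OFFSETOF`, `demo_member_at` and `demo_fetch_vp` are hypothetical names and are not taken from the NetBSD tree. The sketch records a member's byte offset with the standard macro and then fetches the member through that offset, which is roughly the kind of lookup an offset table makes possible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument block; it only stands in for a VOP argument struct. */
struct demo_args {
	int   a_flags;
	void *a_vp;	/* stands in for a vnode pointer */
	long  a_offset;
};

/* Byte offset of a member, delegated to the standard offsetof() macro. */
#define DEMO_OFFSETOF(type, member)	offsetof(type, member)

/* Return a pointer to whatever lives `off` bytes into `base`. */
static void *
demo_member_at(void *base, size_t off)
{
	return (char *)base + off;
}

/* Fetch the a_vp member through its recorded offset. */
static void *
demo_fetch_vp(struct demo_args *ap)
{
	size_t off = DEMO_OFFSETOF(struct demo_args, a_vp);

	/* The member must lie entirely inside the structure. */
	assert(off + sizeof(ap->a_vp) <= sizeof(*ap));
	return *(void **)demo_member_at(ap, off);
}
```

Calling `demo_fetch_vp(&args)` on a filled-in `struct demo_args` returns the stored `a_vp` value. Delegating to offsetof() leaves the layout arithmetic to the compiler instead of to a hand-written pointer expression.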
bouyer 063f96f3c2 merge the bouyer-quota2 branch. This adds a new on-disk format
to store disk quota usage and limits, integrated with ffs
metadata. Usage is checked by fsck_ffs (no more quotacheck)
and is covered by the WAPBL journal. Enabled with kernel
option QUOTA2 (added where QUOTA was enabled in kernel config files),
turned on with tunefs(8) on a per-filesystem
basis. mount_mfs(8) can also turn quotas on.

See http://mail-index.netbsd.org/tech-kern/2011/02/19/msg010025.html
for details.
2011-03-06 17:08:10 +00:00
joerg 48717cfc00 Refactor ps_strings access. Based on PK_32, write either the normal
version or the 32bit compat layout in execve1. Introduce a new function
copyin_psstrings for reading it back from userland and converting it to
the native layout. Refactor procfs to share most of the code with the
kern.proc_args sysctl handler.

This material is based upon work partially supported by
The NetBSD Foundation under a contract with Joerg Sonnenberger.
2011-03-04 22:25:24 +00:00
hannken 11f5c95248 Layer_revoke(): change previous to always take an extra reference on the
lower vnode before passing down the VOP_REVOKE().  This way VOP_REVOKE()
on a layered file system always inactivates and closes the lower vnode.

Should finally fix PR kern/43456.
2011-01-13 10:28:38 +00:00
hannken b89d0815aa Add layer_revoke() that adjusts the lower vnode use count to be at least as
high as the upper vnode count before passing down the VOP_REVOKE().

This way vclean() check for active (vp->v_usecount > 1) vnodes gets it right.

Should fix PR kern/43456.
2011-01-10 11:11:03 +00:00
hannken 111bde084e layer_inactive: With specnodes introduced during vmlocking2
it is safe to cache device nodes.

Tested with nullfs only as unionfs with device nodes panics.
2011-01-02 10:38:02 +00:00
hannken 53b57e3385 Extend the range of fstrans transactions to a sequence of vnode operations
on a locked vnode.  This leaves a suspended file system and therefore a
snapshot with either all or no operations of such a sequence done.
2010-12-27 18:49:42 +00:00
matt 6a66466f0c Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits.  Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.
2010-12-20 00:25:23 +00:00
yamt 5d63916ff3 do minimal locking to make assertions like KASSERT(VOP_ISLOCKED(vp)) happy. 2010-12-17 22:03:00 +00:00
uebayasi 3c4c042ea9 Correct an assertion; pointed out by mrg@ and pooka@, thanks. 2010-12-06 10:22:43 +00:00
hannken c1e6ef0c6b genfs_do_putpages(): When testing an uobject for dirty or modified
pages skip uninitialized (PG_FAKE) pages (DEBUG only).
2010-12-03 08:42:14 +00:00
hannken bd8f6f0b8f Always take the object lock before changing vmpage flags. Fixes a deadlock
where a thread is waiting on "genput" but the page in question is neither
BUSY nor WANTED.

No objections from tech-kern@.
2010-11-30 10:55:25 +00:00
dholland 14402d0ff1 Abolish the SAVENAME and HASBUF flags. There is now always a buffer,
so the path in a struct componentname is now always valid during VOP
calls.
2010-11-30 10:43:01 +00:00
dholland d4eb05390d Abolish struct componentname's cn_pnbuf. Use the path buffer in the
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)

This removes the need for the SAVENAME and HASBUF namei flags.
2010-11-30 10:29:57 +00:00
dholland 8f6ed30d57 Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.
2010-11-19 06:44:33 +00:00
uebayasi 0ce666ede0 Whitespace. 2010-11-19 05:38:10 +00:00
hannken 1c9818e8f4 Genfs_getpages(): Break a deadlock where one thread runs VOP_GETPAGES(),
has busy pages and wants the wapbl lock as reader from wapbl_begin(),
another thread has the wapbl lock as reader and waits for a page from
the first thread.  Now a third thread calls wapbl_flush() and wants the
wapbl lock as writer.

Move the wapbl_begin() up to a point where genfs_getpages() has no busy
pages yet.
2010-11-09 16:31:48 +00:00
uebayasi c87cbe9fca genfs_getpages: restore vm_page array correctly in PGO_LOCKED error
code path.
2010-11-03 04:32:50 +00:00
jym b8a7885350 Use PRIxVADDR to print vaddr_t elements. Wrap lines. 2010-09-15 21:37:35 +00:00
chs fca58884f4 replace the earlier workaround for PR 40389 with a better fix.
the earlier change caused data corruption by freeing pages
without invalidating their mappings.  instead of the trylock/retry,
just take the genfs-node lock before calling VOP_GETPAGES()
and pass a new flag to tell it that we're already holding this lock.
2010-09-01 16:56:19 +00:00
pgoyette 23d5409e7e Update the rest of the kernel to conform to the module subsystem's new
locking protocol.
2010-08-21 13:19:39 +00:00
pooka 6cd7b7a7ca print more info in the "past eof" panic 2010-08-19 02:10:02 +00:00
chs e15697fcb4 in genfs_getpages(), mark the vnode dirty (ie. add to syncer worklist
and set VI_WRMAPDIRTY) after we have busied the pages rather than
before.  this prevents other threads calling genfs_do_putpages() from
marking the vnode clean again while we're in the process of creating
new writable mappings, since such threads will wait for the page(s) to
become unbusy before proceeding.
fixes the problem recently reported by hannken@ on tech-kern.
2010-08-08 18:17:11 +00:00
hannken c84e81cad1 Add vm page flag PG_MARKER and use it to tag dummy marker pages
in genfs_do_putpages() and uao_put().
Use 'v_uobj.uo_npages' to check for an empty memq.
Put some assertions where these marker pages may not appear.

Ok: YAMAMOTO Takashi <yamt@netbsd.org>
2010-07-29 10:54:50 +00:00
hannken fb62bef947 Make holding v_interlock mandatory for callers of vget().
Announced some time ago on tech-kern.
2010-07-21 17:52:09 +00:00
hannken 1664eae7f3 Using vfinddev() leads to vnode races as it returns an unreferenced
vnode that may disappear before the caller has a chance to reference it.

Reference the vnode while the specfs cache is locked.

Welcome to 5.99.37.

No objections on tech-kern.
2010-07-21 09:06:37 +00:00
hannken 3b6c9000bf Use a kmutex to protect the hash chains and always take this mutex
before removing a node from the hash chain.

Release the hash list lock before calling getnewvnode() and check the
hash list again like other file systems do.

Take v_interlock before calling vget().
2010-07-16 10:41:12 +00:00
hannken 7296ba383a Replace vget() with vref()/vn_lock(), this node already has a reference. 2010-07-09 08:10:50 +00:00
hannken 028129a7b8 LK_INTERLOCK is no longer a valid flag for VOP_LOCK(). This makes
layer_*lock*() obsolete.  Remove them and handle lock operations
with the generic bypass function.

Ride 5.99.34.
2010-07-02 08:09:51 +00:00
hannken c2de422c87 LK_INTERLOCK is no longer a valid flag for VOP_LOCK(). 2010-07-02 07:56:46 +00:00
rmind 9727219460 Slightly clean-up layerfs and nullfs: update the big description more to
the reality (remove duplicate one in nullfs, merge some differences from
it), KNF, improve and update some comments, add few KASSERT()s, remove
unused declarations, avoid double inclusion of headers, misc.

No functional changes.
2010-07-02 03:16:00 +00:00
hannken 245651a23d Remove vlockmgr(). Generic vnode lock operations now use a rwlock located
in the vnode.  All LK_* flags move from sys/lock.h to sys/vnode.h.  Calls
to vlockmgr() in file systems get replaced with VOP_LOCK() or VOP_UNLOCK().

Welcome to 5.99.34.

Discussed on tech-kern.
2010-07-01 13:00:54 +00:00
rmind 3c507045e2 Remove pfind() and pgfind(), fix locking in various broken uses of these.
Rename real routines to proc_find() and pgrp_find(), remove PFIND_* flags
and have consistent behaviour.  Provide proc_find_raw() for special cases.
Fix memory leak in sysctl_proc_corename().

COMPAT_LINUX: rework ptrace() locking, minimise differences between
different versions per-arch.

Note: while this change adds some formal cosmetics for COMPAT_DARWIN and
COMPAT_IRIX - locking there is utterly broken (for ages).

Fixes PR/43176.
2010-07-01 02:38:26 +00:00
hannken 1423e65b26 Clean up vnode lock operations pass 2:
VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
2010-06-24 12:58:48 +00:00
hannken e8d576583d genfs_nolock(): LK_INTERLOCK flag no longer possible. 2010-06-24 10:39:35 +00:00
hannken f6c438ba23 Clean up vnode lock operations:
- VOP_LOCK(vp, flags): Limit the set of allowed flags to LK_EXCLUSIVE,
   LK_SHARED and LK_NOWAIT.  LK_INTERLOCK is no longer allowed as it
   makes no sense here.

- VOP_ISLOCKED(vp): Remove the for some time unused return value
  LK_EXCLOTHER.  Mark this operation as "diagnostic only".
  Making a lock decision based on this operation is no longer allowed.

Discussed on tech-kern.
2010-06-24 07:54:46 +00:00
hannken f9768510ca Procfs_lookup() does not lookup directory descriptors in the fd/
subdirectory.  There is no need for recursive vnode locking here.

Ok: Christos Zoulas <christos@netbsd.org>
2010-06-08 08:24:16 +00:00
hannken 62bfdd2b21 Change layered file systems to always pass the locking VOP's down to the
leaf file system.  Remove now unused member v_vnlock from struct vnode.
Welcome to 5.99.30

Discussed on tech-kern.
2010-06-06 08:01:30 +00:00
ahoka d733b7a884 Revert my last change, it's not The Right Thing [tm]. 2010-04-13 11:54:43 +00:00
ahoka 66b5bc59ed Autoload modules with any class.
This fixes autoloading of pf, zfs and possibly others.
2010-04-13 01:15:56 +00:00
mlelstv c243552ba3 The *_modcmd functions use the module name as prefix. 2010-04-11 10:26:25 +00:00
pooka 024c6fe985 Make module name match MOUNT_NAME. Inspired by PR kern/43110. 2010-04-11 06:36:25 +00:00
jld 06c3397342 Change the nullfs module's actual name to "null", to match the name
it's installed under and the name of the filesystem.

Fixes PR kern/43110.
2010-04-10 18:14:54 +00:00
pooka 1a992b2715 Call VOP_ABORTOP in genfs_eopnotsupp. This prevents file system
authors from having to get down on their knees and pray they won't
get POGA'd(*) again.

This plugs componentname leaks in at least smbfs and buggy puffs
servers (buggy servers shouldn't be able to leak kernel memory).

*) principle of greatest astonishment
2010-04-08 15:56:26 +00:00
christos 46a93244b5 starttime needs to be time_t (Izumi Tsutsui) 2010-04-02 19:25:21 +00:00
pooka 19599ea184 If msgbuf is not enabled, do not report the node in readdir. That
way ls -l won't report funny errors because getattr for a readdir
result fails.

XXX: lookup for msgbuf still succeeds even if not enabled
2010-03-31 01:27:05 +00:00
pooka 242bf1c3e7 Stop exposing fifofs internals and leave only fifo_vnodeop_p visible. 2010-03-29 13:11:32 +00:00
pooka 57fb3b92e2 Access fifoinfo only when it's non-NULL. 2010-03-27 02:33:11 +00:00
pooka a8ae7feaa2 You have found a scroll of genocide --More--
What class of monsters do you wish to genocide? --More--
> fs_foo.h
Wiped out all fs_foo.h
2010-03-03 01:26:01 +00:00
uebayasi 1b9d02ce0c Reduce the diff between genfs_getpages() and genfs_do_io(). These should be
merged eventually.
2010-01-30 12:06:20 +00:00
uebayasi 64cb3c884a Slightly more descriptive local variable names. 2010-01-30 05:19:20 +00:00
uebayasi 53000cec23 genfs_getpages: Narrow & clarify the context where I/O happens & vmobjlock is dropped. 2010-01-29 04:36:20 +00:00
uebayasi f4e16ac91b genfs_getpages: Redo previous with a better goto label. 2010-01-29 04:33:37 +00:00
uebayasi 29f5c078cb Revert the part that put variable initializations within interleaved gotos.
again:	if (...) goto err;
	void *ptr = alloc();
	if (...) goto again;
	if (...) goto err1;
	...
err1:	if (ptr) free(ptr);
err:
	return;

This leaks memory if exited with "goto again; -> goto err;".
2010-01-28 14:25:17 +00:00
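
The leak described in the entry above can be avoided by declaring the pointer before the first label and releasing it on every retry. The following is a minimal userland sketch of that shape, not the actual genfs code; `demo_alloc`, `demo_free`, `demo_retry` and the `demo_live` counter are hypothetical helpers introduced only so the behaviour can be checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of live allocations, so a caller can check for leaks. */
static int demo_live;

static void *
demo_alloc(void)
{
	void *p = malloc(16);

	if (p != NULL)
		demo_live++;
	return p;
}

static void
demo_free(void *p)
{
	if (p != NULL) {
		free(p);
		demo_live--;
	}
}

/*
 * retries: how many times to take the "goto again" path.
 * fail_after_retry: nonzero makes the check at the top fail on any pass
 * after the first, i.e. the "goto again; -> goto err;" sequence.
 */
static int
demo_retry(int retries, int fail_after_retry)
{
	void *ptr = NULL;	/* declared before the first label */
	int error = 0;
	int pass = 0;

again:
	/* Release whatever the previous pass allocated before retrying. */
	demo_free(ptr);
	ptr = NULL;

	if (pass > 0 && fail_after_retry) {
		error = -1;
		goto out;
	}
	ptr = demo_alloc();
	if (ptr == NULL) {
		error = -1;
		goto out;
	}
	if (pass++ < retries)
		goto again;
out:
	/* Single exit path; demo_free() ignores NULL. */
	demo_free(ptr);
	return error;
}
```

With a single exit label and the pointer visible across the whole function, the retry-then-fail path no longer skips the cleanup, and `demo_live` returns to zero after every call.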
uebayasi 9fa66d7a3f genfs_getpages: More constification & localization. 2010-01-28 13:43:53 +00:00
uebayasi a0629265f2 genfs_getpages: Constify 2 variables, move one. No functional changes. 2010-01-28 08:20:00 +00:00
uebayasi bb4b25cfbc genfs_getpages: Constify orignpages. Don't override its meaning by the value
re-calculated from GOP_SIZE(GOP_SIZE_MEM), but assign another variable
(orignmempages).
2010-01-28 08:02:12 +00:00
uebayasi b0b6ddc39d Unbreak modules build. 2010-01-28 07:49:08 +00:00
uebayasi 680e7444ba genfs_getpages: Constify & localize more variables. 2010-01-28 07:44:54 +00:00
uebayasi 1a2a3af3da genfs_getpages: Move local variable declarations that are used only for I/O
to where they're used.  This helps to track what's going in this lengthy
function.
2010-01-28 07:38:32 +00:00
uebayasi 64e0246a73 genfs_getpages: Localize a few more variables. 2010-01-28 07:26:25 +00:00
uebayasi 6903a05402 genfs_putpages: Localize a few variables. No functional changes. 2010-01-28 07:24:55 +00:00
uebayasi a75c80a070 Use genfs_node_*lock(). 2010-01-27 15:53:06 +00:00
uebayasi e0f79090b7 Don't forget to tell the result of rw_tryenter(). 2010-01-27 15:52:31 +00:00
uebayasi 2372674c71 Constify some pointers in genfs_getpages() and genfs_do_putpages(). 2010-01-27 15:24:54 +00:00
uebayasi b9bfa07443 Add genfs_node_rdtrylock(). 2010-01-27 15:18:40 +00:00
njoly d343885518 Remove unneeded strlen() call in KFShostname case. 2010-01-22 22:46:00 +00:00
pooka c3183f3251 The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase).  Plenty of mix'n match upper/lowercase has creeped
into the tree since then.  Nuke the macros and convert all callsites
to lowercase.

no functional change
2010-01-08 11:35:07 +00:00
uebayasi 1f7c235131 gimpy invented PRIxVADDR format specifier. 2009-12-14 13:00:07 +00:00
pooka 70d4493c77 Remove the portalfs kernel file system driver. Replace mount_portal(8)
with a version based on puffs.  User functionality remains the same.
2009-12-05 20:11:01 +00:00
pooka 1643f3a7a1 Introduce genfs_statvfs() as pretty much a no-info statvfs and
convert several pseudo file systems to use it.
2009-11-30 10:59:19 +00:00
roy fab5d12590 Allow chown if caller is in the new group. 2009-11-20 13:42:43 +00:00
pooka fb54d5c528 Disallow chown for files the caller does not own. 2009-11-20 13:19:46 +00:00
elad 1570e68c40 - Move kauth_init() a little bit higher.
- Add spec_init() to authorize special device actions (and passthru too for
  the time being). Move policy out of secmodel_suser.
2009-11-14 18:36:56 +00:00
rmind 40cf6f3659 Remove uarea swap-out functionality:
- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code.  Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
2009-10-21 21:11:57 +00:00
dholland a501df5ab8 Avoid leaking pages. Fixes PR 42053 from SHIMIZU Ryo. 2009-10-19 01:25:29 +00:00
elad 756638cf95 Factor out a block of code that appears in three places (Veriexec, keylock,
and securelevel) so that others can use it as well.
2009-10-06 04:28:10 +00:00
tsutsui 445e8226bb Put workaround fix for LOCKDEBUG panic mentioned in PR kern/41078:
Don't try to load a driver module if the driver already exists but is just
 not attached. [bc]dev_open() could return ENXIO even if the driver exists.

XXX: Maybe this should be handled by helper functions for
XXX: module_autoload() calls on demand.
2009-10-04 06:23:58 +00:00
elad 51f0d6a0eb Put procfs policy back in the subsystem. 2009-10-02 23:00:02 +00:00
pooka 11281f01a0 Replace a large number of link set based sysctl node creations with
calls from subsystem constructors.  Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL
2009-09-16 15:23:04 +00:00
pooka 288dd7d670 Get rid of dependency on M_UFSMNT. Since we need storage only for
one pointer, simply hang that off of mnt_data instead of allocating
storage.
2009-07-31 19:47:47 +00:00
pooka 2ebc149961 Do a name-based search for the ctty major instead of requiring an
external symbol.
2009-07-31 18:50:58 +00:00
pooka af1b79236a Instead of reporting some random "files used/free" figures for the
process doing statvfs(!), just report 0.  The code had some kernel
panicking bug after the descriptor code update, the functionality
is more like a bunny rabbit hat than anything useful, and I can't
bother to figure out what the invariants in the new descriptor code
are.

fixes PR kern/41534 and kern/41786
2009-07-31 18:44:58 +00:00
elad 009f5d2f88 Where possible, extract the file-system's access() routine to two internal
functions: the first checking if the operation is possible (regardless of
permissions), the second checking file-system permissions, ACLs, etc.

Mailing list reference:

	http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005311.html
2009-07-03 21:17:40 +00:00
elad 870920260d Move the implementation of vaccess() to genfs_can_access(), in line with
the other routines of the same spirit.

Adjust file-system code to use it.

Keep vaccess() for KPI compatibility and to keep element of least
surprise. A "diagnostic" message warning that vaccess() is deprecated will
be printed when it's used (obviously, only in DIAGNOSTIC kernels).

No objections on tech-kern@:

	http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005310.html
2009-06-23 19:36:38 +00:00
ad d991fcb3b6 More changes to improve kern_descrip.c.
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
  It was only being used to synchronize close, and in any case we needed
  to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
  that we can eliminate the membar_consumer() call in fd_getfile().  This is
  mostly syntactic sugar; the main functional change is that fd_nfiles now
  lives alongside the open file array.

Some measurements with libmicro:

- simple file syscalls like close() are between 1 and 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
2009-05-24 21:41:25 +00:00
elad 863a01b5c1 Extract the open-coded authorization logic for chtimes() from various
file-systems and put it in a single function, genfs_can_chtimes().

This also makes UDF follow the same policy as all other file-systems.

Mailing list reference:

	http://mail-index.netbsd.org/tech-kern/2009/04/27/msg004951.html
2009-05-07 19:30:29 +00:00
elad 54bf8cc67a Add genfs_can_mount() and use it to prevent some more code duplication of
the security checks when mounting a device (VOP_ACCESS() + kauth(9) call).

Proposed with no objections on tech-kern@:

	http://mail-index.netbsd.org/tech-kern/2009/04/20/msg004859.html

The vnode is always expected to be locked, so no locking is done outside
the file-system code.
2009-04-25 18:53:44 +00:00
rmind 440e5485e0 - Rearrange pg_delete() and pg_remove() (renamed pg_free), thus
proc_enterpgrp() with proc_leavepgrp() to free process group and/or
  session without proc_lock held.
- Rename SESSHOLD() and SESSRELE() to proc_sesshold() and
  proc_sessrele().  The latter releases proc_lock now.

Quick OK by <ad>.
2009-04-25 15:06:31 +00:00
elad f68b0219b0 Per discussion on tech-kern@:
- Replace use of label/goto with returns

  - Rename, change prototype of, and move functions from vfs_subr.c to
    genfs_vnops.c
2009-04-22 22:57:08 +00:00
pooka 6d1ff74c7a Move genfs_null_putpages() from genfs_io.c to genfs_vnops.c -- it does
not really do i/o.
2009-04-18 15:40:33 +00:00
cegger b8817e4aed ansify function definitions 2009-03-15 17:14:40 +00:00
dsl 82357f6d42 ANSIfy another 1261 function definitions.
The only ones left in sys are beyond my sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
2009-03-14 21:04:01 +00:00
dsl 454af1c0e8 Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
2009-03-14 15:35:58 +00:00
rmind e52fb16203 genfs_getpages: rework 1.18 revision - move uvm_pagermapout() back.
It is useful to make KVA available ASAP.  Per discussion with <yamt>.
2009-02-23 21:27:51 +00:00
rmind aa58fb8da4 sched_sync: syncer_data_lock is not released now (regression fix). 2009-02-22 22:26:53 +00:00
ad 59fcf21389 PR kern/26878 FFSv2 + softdep = livelock (no free ram)
PR kern/16942 panic with softdep and quotas
PR kern/19565 panic: softdep_write_inodeblock: indirect pointer #1 mismatch
PR kern/26274 softdep panic: allocdirect_merge: ...
PR kern/26374 Long delay before non-root users can write to softdep partitions
PR kern/28621 1.6.x "vp != NULL" panic in ffs_softdep.c:4653 while unmounting a softdep (+quota) filesystem
PR kern/29513 FFS+Softdep panic with unfsck-able file-corruption
PR kern/31544 The ffs softdep code appears to fail to write dirty bits to disk
PR kern/31981 stopping scsi disk can cause panic (softdep)
PR kern/32116 kernel panic in softdep (assertion failure)
PR kern/32532 softdep_trackbufs deadlock
PR kern/37191 softdep: locking against myself
PR kern/40474 Kernel panic after remounting raid root with softdep

Retire softdep, pass 2. As discussed and later formally announced on the
mailing lists.
2009-02-22 20:28:05 +00:00
ad 430f67aa17 PR kern/39564 wapbl performance issues with disk cache flushing
PR kern/40361 WAPBL locking panic in -current
PR kern/40470 WAPBL corrupts ext2fs
PR kern/40562 busy loop in ffs_sync when unmounting a file system
PR kern/40525 panic: ffs_valloc: dup alloc

- A fix for an issue that can lead to "ffs_valloc: dup" due to dirty cg
  buffers being invalidated. Problem discovered and patch by dholland@.

- If the syncer fails to lazily sync a vnode due to lock contention,
  retry 1 second later instead of 30 seconds later.

- Flush inode atime updates every ~10 seconds (this makes most sense with
  logging). Presently they didn't hit the disk for read-only files or
  devices until the file system was unmounted. It would be better to trickle
  the updates out but that would require more extensive changes.

- Fix issues with file system corruption, busy looping and other nasty
  problems when logging and non-logging file systems are intermixed,
  with one being the root file system.

- For logging, do not flush metadata on an inode-at-a-time basis if the sync
  has been requested by ioflush. Previously, we could try hundreds of log
  sync operations a second due to inode update activity, causing the syncer
  to fall behind and metadata updates to be serialized across the entire
  file system. Instead, burst out metadata and log flushes at a minimum
  interval of every 10 seconds on an active file system (happens more often
  if the log becomes full). Note this does not change the operation of
  fsync() etc.

- With the flush issue fixed, re-enable concurrent metadata updates in
  vfs_wapbl.c.
2009-02-22 20:10:25 +00:00
plunky 767dc27ad2 add a comment re the vop (?) flag LAYERFS_MBYPASSDEBUG, that if set
could cause a bad pointer dereference in the debug printing when
credentials with values of NOCRED or FSCRED were passed to kauth.

I don't see any way to set such a flag, I think it's just a debug
thing that could be enabled at compile time by somebody who knew
how, hence the comment rather than a real fix.
2009-02-14 17:29:11 +00:00
plunky cea3e862b4 consistency checks made inside #ifdef SAFETY should really
be #ifdef DIAGNOSTIC
2009-02-14 16:57:05 +00:00
plunky 821f05b0d3 While we remap credentials we should ignore cred == FSCRED as well as
cred == NOCRED.

This fixes a page fault occurring when a union is mounted over a umap,
as FSCRED is passed by union filesystem.
2009-02-13 22:29:00 +00:00
rmind 78a982c8f2 genfs_getpages: move putiobuf() and uvm_pagermapout() outside the glock.
OK by <ad>.
2009-02-04 20:32:19 +00:00
haad 07b62696b9 Add support for loading pseudo-device drivers. Try to autoload modules from
specs_open routine. If devsw_open fails, get driver name with devsw_getname
routine and autoload module.

For now only the dm driver can be loaded; other pseudo drivers need more work.

Ok by ad@.
2009-02-02 14:00:27 +00:00
yamt 812bb0d164 restore the pre socket locking patch signal behaviour.
this fixes a busy-loop in nfs_connect.
2009-01-21 06:59:29 +00:00
yamt cea19a4d14 malloc -> kmem_alloc. 2009-01-17 07:02:35 +00:00
yamt 09ff411cf6 - g/c stale function prototypes.
- rename UVM_PAGE_HASH_PENALTY to UVM_PAGE_TREE_PENALTY.
2009-01-16 02:33:14 +00:00
christos 8f9e04edea this change was somehow missed. 2009-01-11 03:16:33 +00:00
christos 461a86f9bd merge christos-time_t 2009-01-11 02:45:45 +00:00
dholland 2bd5b48033 Clarify a comment 2009-01-03 04:38:07 +00:00
pooka 8583cae233 Rename specfs_lock as device_lock and move it from specfs to devsw.
Relaxes kernel dependency on vfs.
2008-12-29 17:41:18 +00:00
cegger 9b87d582bd kill MALLOC and FREE macros. 2008-12-17 20:51:31 +00:00
ad 49e50a21d6 PR kern/40110: null, overlay and umap modules loading -> panic (layerfs symbols not there)
Add a layerfs module.
2008-12-05 13:05:37 +00:00
joerg f5bbefdb21 Check that the filesystem actually uses WAPBL before initiating a
transaction for the directio case. Fixes PR 39929 and similar issues
seen with PostgreSQL.
2008-12-01 11:22:12 +00:00
pooka 010ce4930e more <sys/buf.h> police 2008-11-16 19:34:29 +00:00
christos 2a274197af - allocate 8 pointers on the stack to avoid stack overflow in nfs.
- make that 8 a constant
- remove bogus panic
2008-10-31 20:42:41 +00:00
hannken ac6b16172a Make genfs_directio() IO_JOURNALLOCKED aware. DirectIO no longer triggers
"locking against myself" panic in wapbl_begin().

Observed and tested by: Frank Kardel <kardel@netbsd.org>
2008-10-19 18:17:13 +00:00
hannken 44f3404f57 Break a deadlock where one thread has a wapbl transaction, calls VOP_GETPAGES
and wants to busy a page while another thread calls VOP_PUTPAGES on the same
vnode, takes pages busy and wants to start a wapbl transaction.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
2008-10-10 09:21:58 +00:00
skrll 81817d63bf PR/39324 kernel diagnostic assertion "l->l_stat != LSZOMB" failed.
Ignore procs with zero or all LSZOMB LWPs. Get a non-LSZOMB LWP to perform
operations against as part of the deal.

procfs really needs to be updated to support multi-threading fully.
Hi Antti!
2008-09-05 14:01:11 +00:00
skrll 006aadc921 ANSIfy 2008-09-05 13:21:12 +00:00
yamt 34272c569e remove always-true conditionals. 2008-08-14 00:47:13 +00:00
yamt 1907407b97 constify 2008-08-11 02:51:01 +00:00
apb 10c7b7cb02 #include <sys/tree.h> to get a definition for SPLAY_ENTRY.
Needed by third party code, such as lsof.
2008-08-01 16:55:48 +00:00
simonb 36d65f1138 Merge the simonb-wapbl branch. From the original branch commit:
Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
   journaling code.  Originally written by Darrin B. Jewell while
   at Wasabi and updated to -current by Antti Kantee, Andy Doran,
   Greg Oster and Simon Burge.

OK'd by core@, releng@.
2008-07-31 05:38:04 +00:00
christos b00559d07e use bufsize instead of BUFFERSIZE 2008-07-25 18:36:50 +00:00
christos 3bccc0f766 Handle files with a large number of mappings gracefully. Reported by Nicholas
Joly.
2008-07-25 17:40:24 +00:00
rmind 160268aca6 Remove proc_representative_lwp(), use a simple LIST_FIRST() instead.
OK by <ad>.
2008-07-02 19:49:58 +00:00
rumble 28f5ebd853 Create sysctl entries during module initialisation and destroy them
appropriately.

Many of these file systems are now ready for modularisation.
2008-06-28 01:34:05 +00:00
ad a8ced4f0d0 Set up the sysctl tree correctly when loaded as a file system. 2008-06-24 11:25:05 +00:00
ad a00bd89dab Replace references to getsock/getvnode. 2008-06-24 11:18:14 +00:00
ad 06c343ac94 vm_page: put TAILQ_ENTRY into a union with LIST_ENTRY, so we can use both. 2008-06-04 12:41:40 +00:00
ad 736a4d9b78 Kill devsw_lock and just use specfs_lock. The two would need merging
in order to prevent unload of modules when a device that they provide
is still open.
2008-05-31 21:34:42 +00:00
hannken 5d2bff060a Make sure all cached buffers with valid, not yet written data have been
run through copy-on-write.  Call fscow_run() with valid data where possible.

The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against
endless recursion.

- Add a flag B_MODIFY to bread(), breada() and breadn().  If set the caller
  intends to modify the buffer returned.

- Always run copy-on-write on buffers returned from ffs_balloc().

- Add new function ffs_getblk() that gets a buffer, assigns a new blkno,
  may clear the buffer and runs copy-on-write.  Process possible errors
  from getblk() or fscow_run().  Part of PR kern/38664.

Welcome to 4.99.63

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2008-05-16 09:21:59 +00:00
reinoud e979c658c9 Import writing part of the UDF file system making optical media like CDs
and DVDs behave like floppy discs. Writing is supported up to and including
version 2.01; version 2.50 and 2.60 will follow.

Also extending the UDF implementation to support symbolic links and
hardlinks.

Added are the mmcformat(8) tool to format rewritable CD/DVD discs and
newfs_udf(8).

Limitations:
        all operations can be performed on the file system though the
        scheduling is currently optimised for archiving workloads.

        mv(1)/rename(2) is currently only implemented for non-directories.
2008-05-14 16:49:47 +00:00
simonb 2fd5130380 mnt_data is a pointer, set it to NULL not 0 when we're finished with it. 2008-05-13 08:31:12 +00:00
rumble a1221b6d4a Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
2008-05-10 02:26:09 +00:00
ad 42d0626726 PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
  sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
  and is only ever write locked in dounmount(). A write hold can't be taken
  on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
  example when going r/o -> r/w, and is only present to serialize updates.
  In order to take this lock, a read hold must first be taken on
  mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
2008-05-06 18:43:44 +00:00
ad e071d39c84 - Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
2008-05-05 17:11:16 +00:00
ad 928a6b2096 PR kern/38135 vfs_busy/vfs_trybusy confusion
The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
2008-04-30 12:49:16 +00:00
ad e3610f1886 kern/38135 vfs_busy/vfs_trybusy confusion
The symptom was that file systems would occasionally not appear
in output from 'df' or 'mount' if the system was busy. Resolution:

- Make mount locks work somewhat like vm_map locks.
- vfs_trybusy() now only fails if the mount is gone, or if someone is
  unmounting the file system. Simple contention on mnt_lock doesn't
  cause it to fail.
- vfs_busy() will wait even if the file system is being unmounted.
2008-04-29 23:51:04 +00:00
ad baa3395f8f PR kern/38057 ffs makes assumptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
2008-04-29 18:18:08 +00:00
martin ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad 284c2b9aef Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
2008-04-24 18:39:20 +00:00
ad 6d70f903e6 Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
  be sent from a hardware interrupt handler. Signal activity must be
  deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
  and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
2008-04-24 15:35:27 +00:00
ad ef9411cb09 Fix locking in the fifo kqueue routines. 2008-04-24 15:18:11 +00:00
ad 15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
hannken 0789b071d1 Remove a race when pages are released while waiting for fstrans_start().
Fixes PR #38460
2008-04-19 11:53:13 +00:00
hannken dc04f63f5b Remove stale include <sys/fstrans.h>. 2008-04-19 11:49:54 +00:00
ad a9ca7a3734 Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
2008-03-21 21:54:58 +00:00
yamt 29e0fd1c9e sprinkle KERNEL_LOCK for socket.
a slightly different version was tested by Matthias Drochner.
2008-02-11 23:53:32 +00:00
ad d7f6ec471c Don't lock the socket to set/clear FNONBLOCK. Just set it atomically. 2008-02-06 21:57:53 +00:00
ad 22c6a20ebd Lock v_knlist with the vnode interlock. PR kern/37881. 2008-02-05 14:19:52 +00:00
ad 25153c3ec9 PR kern/37706 (forced unmount of file systems is unsafe):
- Do reference counting for 'struct mount'. Each vnode associated with a
  mount takes a reference, and in turn the mount takes a reference to the
  vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
  locking inherited from 4.4BSD with a recursable rwlock.
2008-01-30 11:46:59 +00:00
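The reference chain described in the PR kern/37706 commit above can be sketched roughly as follows. This is a userland illustration only, not the kernel code: `fake_mount`, `fake_vfsops`, `mount_create()`, `mount_hold()` and `mount_release()` are hypothetical names, and the sketch assumes the mount list itself holds one reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct vfsops and struct mount. */
struct fake_vfsops {
	int refcnt;              /* one reference per live mount */
};

struct fake_mount {
	int refcnt;              /* mount list + one per associated vnode */
	struct fake_vfsops *ops;
	bool destroyed;          /* set when the last reference goes away */
};

/* The mount takes a reference to its vfsops; the mount list holds the mount. */
static void
mount_create(struct fake_mount *mp, struct fake_vfsops *ops)
{
	ops->refcnt++;
	mp->ops = ops;
	mp->refcnt = 1;
	mp->destroyed = false;
}

/* Called when a vnode becomes associated with the mount. */
static void
mount_hold(struct fake_mount *mp)
{
	mp->refcnt++;
}

/* Called on unmount and whenever a vnode lets go of the mount. */
static void
mount_release(struct fake_mount *mp)
{
	assert(mp->refcnt > 0);
	if (--mp->refcnt == 0) {
		mp->ops->refcnt--;   /* the mount drops its vfsops reference */
		mp->destroyed = true;
	}
}
```

In this sketch a forced unmount only drops the mount list's reference; the structure stays valid until the last vnode releases it, which is the property that makes a forced unmount safe.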
ad 3490efcc63 Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
2008-01-30 09:50:19 +00:00
dholland 764ffd05f0 Part of the rename patches *doh* 2008-01-28 15:17:54 +00:00
dholland 717e1785a5 Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
2008-01-28 14:31:15 +00:00
hannken 5ab6217754 Spec_open(): clear sd_bdevvp if bdev_open() failed.
Ok: Andrew Doran <ad@netbsd.org>
2008-01-25 16:21:04 +00:00
riz 960857eb6d Since VOP_LEASE is gone, remove genfs_lease_check() too. Now my kernel
builds again.  :)
2008-01-25 15:34:59 +00:00
ad 1997a1e1f4 Remove VOP_LEASE. Discussed on tech-kern. 2008-01-25 14:32:11 +00:00
ad f9a31c8cd0 spec_fsync: don't assert that 'vp' holds the block device open. If it's
not open, there shouldn't be dirty buffers so vinvalbuf() is harmless.
2008-01-24 21:05:52 +00:00
ad 703069c0e9 specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
  vnode can describe a block device. Instead, prohibit concurrent opens of
  block devices. As a bonus remove the unreliable code that prevents
  multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
  goes away, instead of abusing vnode::v_usecount to tell if the device is
  open.
2008-01-24 17:32:52 +00:00
ad 27c0e63a2a layer_node_find: if we find a node being cleaned out, then ignore it and
continue.  A thread trying to clean out the extant layer vnode needs to
acquire the shared lock (i.e. the lower vnode's lock), which our caller
already holds. To allow the cleaning to succeed the current thread must make
progress.  So, for a brief time more than one vnode in a layered file system
may refer to a single vnode in the lower file system.
2008-01-23 20:11:32 +00:00
elad c27d5f30b6 Tons of process scope changes.
  - Add a KAUTH_PROCESS_SCHEDULER action, to handle scheduler related
    requests, and add specific requests for set/get scheduler policy and
    set/get scheduler parameters.

  - Add a KAUTH_PROCESS_KEVENT_FILTER action, to handle kevent(2) related
    requests.

  - Add a KAUTH_DEVICE_TTY_STI action to handle requests to TIOCSTI.

  - Add requests for the KAUTH_PROCESS_CANSEE action, indicating what
    process information is being looked at (entry itself, args, env,
    open files).

  - Add requests for the KAUTH_PROCESS_RLIMIT action indicating set/get.

  - Add requests for the KAUTH_PROCESS_CORENAME action indicating set/get.

  - Make bsd44 secmodel code handle the newly added requests appropriately.

All of the above make it possible to issue finer-grained kauth(9) calls in
many places, removing some KAUTH_GENERIC_ISSUSER requests.

  - Remove the "CAN" from KAUTH_PROCESS_CAN{KTRACE,PROCFS,PTRACE,SIGNAL}.

Discussed with christos@ and yamt@.
2008-01-23 15:04:38 +00:00
pooka f7455b20d9 portal_advlock: badop -> eopnotsupp. I guess advlock can be called
for the root vnode and badop panics.

fix in PR kern/25393 by Laurent Sartran
2008-01-19 21:54:47 +00:00
yamt 93a915eb7a genfs_do_putpages: DEBUG checks. 2008-01-18 11:01:23 +00:00
yamt 36c701bcd4 genfs_do_putpages: ensure that we clean the vnode in the case of PGO_RECLAIM. 2008-01-18 11:00:53 +00:00
yamt 2b40f35040 push pmap_clear_reference calls into pdpolicy code, where reference bits
actually matter.
2008-01-18 10:48:23 +00:00
ad 4eb2a42ae6 Fix v_freelisthd assertion failure during call to vdevdone(). No calling
VOPs without a vnode reference!
2008-01-17 17:28:54 +00:00
ad 4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
ad ea3f10f7e0 Merge more changes from vmlocking2, mainly:
- Locking improvements.
- Use pool_cache for more items.
2007-12-26 16:01:34 +00:00
yamt 2294b0bcb6 procfs_douptime: simply use microuptime() instead of a mysterious calculation. 2007-12-22 01:06:54 +00:00
yamt 0d13423925 procfs_docpustat: g/c a write-only variable. 2007-12-22 01:04:55 +00:00
dyoung 6528dd9d56 Bug fix: at the top of layer_bypass(), save a pointer to the mount
point for re-use at the bottom, instead of trying to re-read the
mount point from a potentially vrele()'d vnode.
2007-12-22 00:48:46 +00:00
christos 177940c72e use vnode_to_path. 2007-12-15 23:52:00 +00:00
pooka db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from now on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
ad 6ab26a0fa8 Partially merge syncer changes from vmlocking2. 2007-12-08 15:47:32 +00:00
ad 7c9b007bbc Destroy ovm_hashlock before freeing. 2007-12-08 15:12:15 +00:00
ad 0444cfe507 Use kmem_alloc/free. 2007-12-08 15:10:22 +00:00
pooka 4e38160d4d Do not "return 1" from kqfilter for errors. That value is passed
directly to the userland caller and results in a mysterious EPERM.
Instead, return EINVAL or something else sensible depending on the
case.
2007-12-05 17:19:46 +00:00
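Why "return 1" surfaced as EPERM can be shown with a small userland sketch. It assumes, as is the case on NetBSD and most Unix-like systems, that EPERM has the numeric value 1. The filter constants and both functions are hypothetical illustrations, not the actual kqfilter routines.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical filter identifiers, for illustration only. */
enum { FAKE_FILT_READ = 0, FAKE_FILT_WRITE = 1 };

/*
 * Old pattern: 1 was meant as a generic "failed" flag, but the return
 * value is handed to the caller as an errno, and errno 1 is EPERM.
 */
static int
fake_kqfilter_old(int filter)
{
	switch (filter) {
	case FAKE_FILT_READ:
		return 0;
	default:
		return 1;        /* seen by userland as EPERM */
	}
}

/* New pattern: return a real errno value that describes the failure. */
static int
fake_kqfilter_new(int filter)
{
	switch (filter) {
	case FAKE_FILT_READ:
		return 0;
	default:
		return EINVAL;   /* unsupported filter */
	}
}
```

With the old pattern a caller asking for an unsupported filter would be told it lacked permission; with the new one it gets EINVAL.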
hannken d556dc98b0 Fscow_run(): add a flag "bool data_valid" to note still valid data.
Buffers run through copy-on-write are marked B_COWDONE.  This condition
is valid until the buffer has run through bwrite() and gets cleared from
biodone().

Welcome to 4.99.39.

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2007-12-02 13:56:15 +00:00
pooka 61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
ad ad89ae5a21 Revision 1.42 was lost. Pointed out by Nicolas Joly:
This was using mutex_exit where mutex_enter was required.
2007-11-12 14:11:47 +00:00
christos dfdca25ef7 report the proper stack size on 32 bit emulations. 2007-11-11 18:29:03 +00:00
christos 26515bc536 make the last argument of procfs_dir size_t 2007-11-09 22:45:49 +00:00
ad d18c6ca4de Merge from vmlocking:
- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
2007-11-07 00:23:13 +00:00
pooka 735dd21e07 Split I/O-related routines (getpages, putpages, etc.) which are heavily
tied to uvm out of genfs_vnops into genfs_io.c
2007-10-17 16:45:00 +00:00
ad 6b7322f1ed This was using mutex_exit where mutex_enter was required. 2007-10-11 18:46:19 +00:00
ad 3fa279a5ee umapm_hashlock is a mutex. 2007-10-10 22:07:48 +00:00
ad 7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
ad 36a1712707 Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.
2007-10-08 20:06:17 +00:00
ad 9f56dfa520 Merge brelse() changes from the vmlocking branch. 2007-10-08 18:02:53 +00:00
ad 451aacda90 Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.
2007-10-08 15:12:05 +00:00
hannken 3856acafe2 Update the file system copy-on-write handler.
- Instead of hooking the handler on the specdev of a mounted file system
  hook directly on the `struct mount'.

- Rename from `vn_cow_*' to `fscow_*' and move to `kern/vfs_trans.c'.  Use
  `mount_*specific' instead of clobbering `struct mount' or `struct specinfo'.

- Replace the hand-made reader/writer lock with a krwlock.

- Keep `vn_cow_*' functions and mark as obsolete.

- Welcome to NetBSD 4.99.32 - `struct specinfo' changed size.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
2007-10-07 13:38:53 +00:00
pooka 3f3cac88a3 Make bioops a pointer and point it to the softdeps struct in softdep
init.  Decouples "options SOFTDEP" from the main kernel and ffs code.
2007-09-01 23:40:21 +00:00
pooka ce3dd6b3a6 cleanup unused prototype 2007-08-03 08:50:23 +00:00
pooka 9feac0b35c ANSI-fy 2007-08-03 08:45:36 +00:00
pooka 8d1f899239 * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
  use VFS_PROTOS() instead of manually prototyping the methods
2007-07-31 21:14:15 +00:00
ad a0d1fd8d0c It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
2007-07-29 13:31:07 +00:00
ad 66fefd117b It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
2007-07-29 12:15:35 +00:00
pooka 91f15f1760 whoops, forgot to commit this a while back: initialize new vnode size 2007-07-27 08:38:39 +00:00
pooka c59e414d23 vop_mmap parameter change 2007-07-27 08:32:44 +00:00
pooka d9970c8066 Use eopnotsupp() instead of vfs_stdsuspendctl() and retire the latter. 2007-07-26 22:57:36 +00:00
pooka 606670f3e8 Initialize size and/or writesize when creating a vnode. 2007-07-23 11:27:45 +00:00
pooka 05ce20f4a0 Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden
2007-07-22 19:16:04 +00:00
pooka 0921857772 Don't allow getcwd() on procfs vnodes and provide "/" as the path
instead of the result from getcwd().  This works around locking
panics caused by namei calling VOP_READLINK while holding on to a
directory lock and getcwd() trying to acquire that lock.  The real
fix would be to get rid of getcwd() calls within VOPs (not locking
safe), but that's not a viable option in the netbsd-4 timeframe.

Suggestion for workaround from David Holland.
2007-07-22 13:37:13 +00:00
pooka a97de7b959 nuke homegrown getcwd_common() decl 2007-07-21 22:47:36 +00:00
pooka e24b0872a4 Make set_statvfs_info() take a parameter for the vfs name instead
of always retrieving it from mp->mnt_op->vfs_name

christos ok
2007-07-17 11:19:31 +00:00
dsl 2721ab6c7b Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass through the length of the buffer as well.
Since the length of the userspace buffer isn't (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
2007-07-12 19:35:32 +00:00
ad 88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
pooka b7d4ee5f17 * allow unmount even if rootvp has a usecount > 1 provided that
MNT_FORCE is given
* decrease cargo cult index by getting rid of commented sections
  with mntflushbuf() in them - AFAICT the call was removed from our
  kernel over 13 years ago with the 4.4BSDlite import
2007-07-08 23:58:53 +00:00
pooka dbeb9a3eeb I'm all for redundant and failsafe computing, but ...
vap->va_atime = vap->va_mtime = vap->va_ctime;
        vap->va_atime = vap->va_mtime = vap->va_ctime;

... is missing the point.
2007-07-02 17:55:33 +00:00
pooka 5ac04c46a8 VOP_LOCK() doesn't handle LK_RETRY, call vn_lock() instead 2007-06-30 18:28:15 +00:00
dsl 6319443e37 Updates for changed prototype of kauth_cred_set/getgroups(). 2007-06-30 15:27:02 +00:00
pooka 835b0326c5 Using POOL_INIT here makes no sense, since file systems always have
an init method.  So get rid of it and #ifdef _LKM and just always
init in the init method.  Give malloc types the same treatment.
Makes file systems nicer to work with in linksetless environments
and fixes a few LKM discrepancies.
2007-06-30 09:37:53 +00:00
yamt da51d139a4 improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
2007-06-05 12:31:30 +00:00
agc f1a5908695 In /proc/<pid>/statm, avoid leaking buffer space if the attempt to get
vmspace information fails.

Return the nice value properly to userland via the /proc/<pid>/stat entry.

Use vm sizes from vmspace, rather than rusage structs, for the same
reasons as mentioned previously - see the comment in
kvm_proc.c::kvm_getproc2() about rusage values and zombie processes.
2007-05-26 16:21:04 +00:00
agc 12003e8756 Use a bit more common code for the MULTIPROCESSOR and !MULTIPROCESSOR
cases.

Use the lwp's priority when returning the priority value, rather than
returning the nice value.
2007-05-25 22:26:14 +00:00
agc 15a3a67ede Various changes for better Linux emulation:
+ in /proc/<pid>/statm emulation, use the memory values from vmspace,
rather than struct rusage, since the rusage values appear to be 0 for
all processes except zombies.  cf dsl's comment in
kvm_proc.c::kvm_getproc2()

+ in /proc/<pid>/stat, instead of returning the tv_sec value, return the
number of ticks we've had (roughly equivalent to the Linux jiffies).
Calculate these values from the tv_usec values.

Also:

+ enclose CPU_INFO_ITERATOR and CPU_INFO_FOREACH usage in #ifdef
MULTIPROCESSOR, at the request of Nick Hudson

Together, these changes allow htop to work on NetBSD.
2007-05-25 19:20:06 +00:00
dogcow 905b715a4b use PRIu64, not llu, to unbork on 64-bit platforms. 2007-05-24 05:33:08 +00:00
agc 4dbe5ed7e7 Extend the Linux emulation of /proc to include
	/proc/stat
	/proc/loadavg and
	/proc/<pid>/statm.

These are only present when -o linux is specified as a mount option
to procfs.

Factor out some common code so that it can be used by a number of
functions.

XXX The values returned in the statm emulation need to be verified.
2007-05-24 00:37:40 +00:00
hannken 64b7e5637e Fstrans_start() always returns zero, so change its type to void. 2007-05-17 07:26:21 +00:00
yamt 4d3b7e04c8 use a cached value of v_size. no functional changes. 2007-05-13 13:11:53 +00:00
perseant 0569cad0fd Split the VOP interface part of genfs_putpages() from the code. The new
function that does the work, genfs_do_putpages(), now takes as an argument
a pointer to the page that would be waited on, if PGO_BUSYWAIT were not set.
This allows a consumer, e.g. lfs_putpages(), to perform an action outside
the scope of UVM before sleeping on the page in question.
2007-04-24 22:46:03 +00:00
enami 780e071921 Don't expand RCS id of ancestor file. The id itself is actually copied
from null_vnops.c since the log message of rev. 1.1 implies the copy.
2007-04-16 08:10:58 +00:00
chs aba740b225 define a pager flag PGO_RECLAIM, similar to FSYNC_RECLAIM, and use it
to skip unnecessary flushing when layered file system vnodes are recycled.
this also prevents a deadlock with the dodgy LFS putpages routine.
fixes the non-LFS part of PR 36150.
2007-04-16 05:14:54 +00:00
hannken fc6776f366 Remove now obsolete vn_start_write() and vn_finished_write() and
corresponding flags.

Revert softdep_trackbufs() to its state before vn_start_write() was added.

Remove from struct mount now unneeded flags IMNT_SUSPEND* and
members mnt_writeopcountupper, mnt_writeopcountlower and mnt_leaf.

Welcome to 4.99.17
2007-04-08 11:20:42 +00:00
hannken e956461048 Remove calls to now obsolete vn_start_write() and vn_finished_write(). 2007-04-07 15:06:53 +00:00
rmind 0a747ea89c Unfortunately, missed procfs_proc_unlock() in previous.
Pointed out by pooka@
2007-04-04 10:50:42 +00:00
rmind 199691e947 procfs_readlink: Handle a possible failure of fd_getfile(), also, we
do not need to check for error again.
CID: 4436
2007-04-04 01:27:32 +00:00
christos a7761fd2c5 Instead of reading and writing little by little, allocate memory and
write the whole map in one shot so that we don't have to deal with the
map changing under us. Fixes the linux emulated jdk-1.6 where it was
losing the last map entry and could not find the stack on startup.
2007-04-01 03:18:57 +00:00
christos 6a4825167b return a page less than the actual top of stack so that linux-java works. 2007-04-01 03:16:44 +00:00
ad 0b43c20288 Remove useless cast. 2007-03-11 22:07:32 +00:00
ad c147748d84 - Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
2007-03-09 14:11:22 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
salo 20af5e4fd5 Don't prepend rootvnode to the path in non-NULL case for exe links.
It breaks procfs in chroot.

from <christos>, tested by me.
2007-03-03 01:18:32 +00:00
ad b89010bfa3 Destroy the hash locks on final unmount. 2007-02-27 16:11:51 +00:00
thorpej 7cc07e11dc TRUE -> true, FALSE -> false 2007-02-22 06:16:03 +00:00
thorpej 712239e366 Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
2007-02-21 22:59:35 +00:00
ad 4abc9f506a Add genfs_node_destroy(). Fixes a lock "leak" seen when running LOCKDEBUG
kernels.
2007-02-20 16:19:42 +00:00
pooka 76aba343c2 When checking for file validity under pid/, do proper proc->lwp
lookup (fsvo proper) instead of fiddling directly with the lwp
list.
2007-02-19 00:08:18 +00:00
ad 42a7dff463 procfs_map():
- Drop the target's vm_map lock before calling uiomove(). We could
  deadlock if inspecting /proc/curproc/map.
- If the vm_map might have changed, restart the operation, but give
  up after 250 retries if the map keeps changing.  XXX This is not
  ideal.
2007-02-18 20:03:44 +00:00
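The procfs_map() approach above (copy out with the map unlocked, then restart if the map changed, up to 250 retries) can be sketched in userland like this. Everything here is a hypothetical stand-in: `fake_map`, its `version` stamp, the no-op lock functions and `fake_copyout()` only model the idea and are not the UVM or uiomove() interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAP_MAX_RETRIES 250      /* retry limit named in the commit message */

/* Hypothetical stand-in for a vm_map. */
struct fake_map {
	unsigned version;        /* bumped whenever the map changes */
	char text[64];           /* rendered map contents */
	unsigned churn;          /* changes to inject while unlocked */
};

/* No-op placeholders marking where the map lock would be taken/dropped. */
static void fake_lock(struct fake_map *m)   { (void)m; }
static void fake_unlock(struct fake_map *m) { (void)m; }

/* Models uiomove(): runs unlocked, so the map may change underneath it. */
static void
fake_copyout(struct fake_map *m, char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);
	if (m->churn > 0) {
		m->churn--;
		m->version++;
	}
}

/* Returns 0 on a consistent copy, -1 if the map kept changing. */
static int
snapshot_map(struct fake_map *m, char *out, size_t outlen)
{
	char tmp[sizeof(m->text)];
	size_t len = outlen < sizeof(tmp) ? outlen : sizeof(tmp);

	/* One initial attempt plus up to MAP_MAX_RETRIES retries. */
	for (int attempt = 0; attempt <= MAP_MAX_RETRIES; attempt++) {
		fake_lock(m);
		unsigned seen = m->version;
		memcpy(tmp, m->text, sizeof(tmp));
		fake_unlock(m);          /* never copy out with the lock held */

		fake_copyout(m, out, tmp, len);

		fake_lock(m);
		int stable = (m->version == seen);
		fake_unlock(m);
		if (stable)
			return 0;
	}
	return -1;
}
```

Dropping the lock before the copy avoids the self-deadlock when a process reads its own map; the price is the bounded retry loop, which the commit itself flags as not ideal.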
pooka 7b63f0de5d Don't check for validity of p in lookup for root nodes, since it
will always be NULL.  Rather, just call pt_valid with NULL directly
and let it decide if we're a linux mount or not.
2007-02-18 01:55:26 +00:00