RUMP-visible code. Instead of checking that updateproc (aka ioflush,
aka syncer) will not sleep in PUFFS code, I check for any kernel thread:
after all none of them are designed to hang awaiting for a remote filesystem
operation to complete.
a memory allocation, or a response from the filesystem.
This avoids deadlocks in the following situations:
1) when memory is low: ioflush waits the fileystem, the fielsystem waits
for memory
2) when the filesystem does not respond (e.g.: network outage ona
distributed filesystem)
This is required to avoid data corruption bugs, where a getattr slices
itself within a setattr operation, and sets the size to the stall value
it got from the filesystem. That value is smaller than the one set by
setattr, and the call to uvm_vnp_setsize() trigged a spurious truncate.
The result is a chunk of zeroed data in the file.
Such a situation can easily happen when the ioflush thread issue a
VOP_FSYNC/puffs_vnop_sync/flushvncache/dosetattrn while andother process
do a sys_stat/VOP_GETATTR/puffs_vnop_getattr.
This mutex on size operation can be removed the day we decide VOP_GETATTR
has to operated on a locked vnode, since the other operations that touch
size already require that.
filesystem in which format extended attribute shall be listed.
There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-pprefixed, non NUL-terminated strings, used for
extattr_list_file(2), which is obtanined by setting the
EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)
This approach avoid the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)
This removes the need for the SAVENAME and HASBUF namei flags.
Interrupt server wait only on certain signals (same set at nfs -i)
instead of all signals. According to the PR this helps with
"git clone" run on a puffs file system.
This makes the size the same on 64bit archs. Don't bother bumping
any version, since you'd have explicitly had to jump through some
hoops to use pathconf before.
operations. This prevents kernel memory leaks (one of which happened
every time the file system was unmounted via PUFFSOP_UNMOUNT ...
and incidentally would've been trivially caught with the old
malloc(9) interface. I wonder if the message is to use a ton of
pools instead of regression-attractive kmem interface).
reference the puffs_node before sending the request to the file
server. This diminishes the window where the inode can be reclaimed
and be invalidated before it is accessed (but does not completely
eliminate the race, as that is a caller problem which we cannot
fix here).
to initiate self destruct, i.e. unmount(MNT_FORCE). This, however,
is a semi-controlled self-destruct, since all caches are flushed
before the (possibly) violent unmount takes place.
context. This fixes a long-standing but seldomly seen deadlock,
where the kernel was holding pages busy (due to e.g. readahead
request) while waiting for the server to respond, and the server
made a callback into the kernel asking to invalidate those pages.
... or, well, theoretically fixes, since I didn't have any reliable
way of repeating the deadlock and I think I saw it only twice.
* it depended on the biglock (in a very cruel way)
* it was attached to userspace transactions rather than logical
fs operations
(If someone wants to revisit it some day, most of the stuff can be
reused from cvs history)
a valid optimization, but that's long gone and once VOP_INACTIVE is
called and the file server says that the vnode is going to be recycled,
it really is going to be recycled extra references gained or not.
operations creating a node cannot be considered outgoing operations,
since after return from userspace they modify file system state
by creating a new node. if we do not protect the file system by
holding the directory lock, a lookup operation might race us into
the kernel and create the node earlier.
* remove pnode from hashlish before sending the reclaim faf off to
userspace. also, hold pmp_lock while frobbing the list.