1) Comply with the way buffercache(9) is intended to be used. Now we
read in single blocks of EFS_BB_SIZE, never taking in variable
length extents with a single bread() call.
2) Handle symlinks with more than one extent. There's no reason for
this to ever happen, but it's handled now.
3) Finally, add a hint to our iteration initialiser so we can start
from the desired offset, rather than naively looping through from
the beginning each time. Since we can binary search the correct
location quickly, this improves large sequential reads by about
40% with 128MB files. Improvement should increase with file size.
When reading a file, we would erroneously iterate to the next extent
before having filled the entire uio request. This lead to unnecessary
extent iteration and excessive calls to efs_read.
Sequential read performance has doubled in the uncached case and
quadrupled when data is buffered.
an init method. So get rid of it and #ifdef _LKM and just always
init in the init method. Give malloc types the same treatment.
Makes file systems nicer to work with in linksetless environments
and fixes a few LKM discrepancies.
go through VFS_ROOT() and allow to fetch it without locking it.
This allows us to call the cache flush operations also for the root
vnode and most notably fixes e.g. a "No such file or directory"
for a psshfs root directory ls -l when a file was locally deleted
and remotely re-created.
Also fix some sloppy programming in root node fetch (mostly cosmetic).
Actually, we do need separate "no references in file server" and
"noref + inactive" flags if we wish to correctly support unix open
file semantics and optimize away pre-reclaim cache flushes. So,
add PNODE_DYING which stands for norefs + inactive.
"noref + inactive" flags if we wish to correctly support unix open
file semantics and optimize away pre-reclaim cache flushes. So,
add PNODE_DYING which stands for norefs + inactive.
the kernel it has 0 references to the node in question. In other
words, this can be used to avoid inactive(), or, if the file server
does not implement inactive, prompt reclaim for removed nodes.
as we give a reference to userspace for the puffs_node for the
duration of the poll call. So reference count puffs_node separately
from the parent vnode. vref()/vrele() is not possible due to a possible
surprise visit from VOP_INACTIVE.
and other information instead of always using VDIR. To make this
possible without races, require all root node information already
in puffs_mount() and nuke puffs_start2() and the associated start
operation completely.
requested/inspired by Tobias Nygren
file system value for the size of device special files, as that
comes from specfs instead of the "host" file system. Therefore,
take care that getattr doesn't override the value of vp->v_size.
file server only if the op was still waiting for fetch (as opposed
to waiting for the response). Also, properly flag the possible
following inactive as an op for which we do not want to wait for
the response from the file server.
for nodes upon return from the userspace. Currently it can be used
to indicate that the file server should be notified of "inactive"
in case the file server has opted to not receive inactive every
time the reference count for a vnode drops to zero. (inactive is
a common event, almost never requires any action and must be executed
sychronously, so it is wasteful).
While doing this, cleanup the release-relock nonsense from the
vntouser*() arguments. It was never enabled and the whole LOCKEDVP()
concept was very broken to begin with.
all metadata info cached in the kernel while we're setattr'ing in
any case. Solves problems such as truncate (via extend-by-write)
+ chmod resulting in EPERM because the file was already read-only
when the actual truncate was flushed out of the kernel in fsync.
when unmounting the file system in case of a certain timing (and
possibly some other conditions), a thread would wait on a condition
variable, while another thread broadcast the cv and immediately
proceeded to destroy it. The result was a system frozen completely
solid shorly after the process waiting for the cv woke up. So
introduce reference counting to synchronize destruction of the
resources in unmount.
I was able to repeat the problem only on my laptop in some special
cases, so I do not know how common it was. Ironically, killing
the file server process violently instead of unmount() didn't have
this problem because it never entered the unmount path from two
directions.
it. It might be that the file server is either crashing or just
returning consistent errors. uiomove() would handle the error,
but if the pages weren't faulted in, memset() to the unfaultable
ubc window would cause a kernel page fault.
to skip unnecessary flushing when layered file system vnodes are recycled.
this also prevents a deadlock with the dodgy LFS putpages routine.
fixes the non-LFS part of PR 36150.
break other implementations so lookup the physical number instead of
indexing it. Choosing random numbers here is legal according to the specs,
but not a logical choice and most likely done as a wierd kind of copy
protection.
Rogue implementation found to use this
*Microsoft CDIMAGE UDF
corresponding flags.
Revert softdep_trackbufs() to its state before vn_start_write() was added.
Remove from struct mount now unneeded flags IMNT_SUSPEND* and
members mnt_writeopcountupper, mnt_writeopcountlower and mnt_leaf.
Welcome to 4.99.17
again. This is useful until locking is further developed and basically
any deadlocks can be solved by killing appropriate processes.
Thanks especially to Tommi Kyntola and Antti Louko for sitting down
with me and discussing resource ownership and locking strategies
in implementing this.
workaround for the problem analyzed more deeply in kern/30831. In
short, the problem is keeping the vnode on the mount point vnode
list during reclaim. If reclaim happens to sleep (as is a possibility
with smbfs due to calling vrele() and therefore possibly VOP_INACTIVE),
code going through the entire mountpoint vnode list will hit
half-reclaimed vnodes.
from putop. even though there's only one user currently, makes code
more readable
* move "delta" to a standard parameter in vntouser and get rid of the
specialcase vntouser_delta
bunch of bugs.
* park structures are now always allocated from a pool instead of a
mixed stack/malloc allocation
* get rid of the whole adjbuf concept, always just alloc the maximal
amount of memory to satisfy a request
* little regression: don't allow interrupting wait from file system
to userspace; this had problems already before, but now the problems
really started to shine through. I'll try to make this work again
some day.
* fix bmap to return a sensible value in runp
Remove offset argument, we should now find an HFS+ volume in any
of its standard places.
Based on work from and test image provided by Pelle Johansson.
kernel and flush it out all at once instead of continuous updating
* add support for delivering notifications to the file server about
when a page was written to (but disabled by default for now). the
file server can use this to request flushing or invalidating the
kernel page cache
(hfslib_open_volume) We are only interested in the catalgo and
extents header, so read the first 512 bytes, not the whole first
extent. Also makes mounting a lot faster.