as needed to read a request instead of the maximum message size.
Insanely enough, this makes "cheap backend I/O" file systems such
dtfs and sysctlfs perform 10-30% better (depending on the average
size of incoming requests).
Previously each request was executed on its own callcontext and
switched to every time the request was being processed. Now requests
borrow the mainloop context and switch only if/when they yield.
This takes two context switches away from a file system request
bringing down the typical amounts 2->0 (e.g. dtfs) and 4->2 (e.g.
psshfs).
The interfaces for manually executing requests changed a bit:
puffs_dispatch_create() and puffs_dispatch_exec() must now be used.
They are not tested, as nothing in-tree wants them and I doubt
anyone else is really interested in them either.
Also do some misc code cleanup related to execution contexts. The
"work-in-progress checkpoint" committed over a year ago was starting
to look slightly weed-infested.
them every time. Speeds up pure in-memory file systems such as
sysctlfs or dtfs quite a bit. For actual I/O-workhorses the result
is of course less tasty.
If the env variable PUFFS_COMFD is set, the descriptor value
contained in it is used for communication instead of opening
/dev/puffs and doing mount(2).
This feature is obviously very undocumented and should not be used
without adult supervision.
Get rid of the original puffs_req(3) framework and use puffs_framebuf(3)
instead for file system requests. It has the advantage of being
suitable for transporting a distributed message passing protocol
and therefore us being able to run the file system server on any
host.
Ok, puffs is not quite here yet: libpuffs needs to grow request
routing support and the message contents need to be munged into a
host independent format. Saying which format would be telling,
but it might begin with an X, end in an L and have the 13th character
in the middle. Keep an eye out for the sequels: Parts 3+m/n.
Ok, ok, a few more words about it: stop holding puffs_cc as a holy
value and passing it around to almost every possible place (popquiz:
which kernel variable does this remind you of?). Instead, pass
the natural choice, puffs_usermount, and fetch puffs_cc via
puffs_cc_getcc() only in routines which actually need it. This
not only simplifies code, but (thanks to the introduction of
puffs_cc_getcc()) enables constructs which weren't previously sanely
possible, say layering as a curious example.
There's still a little to do on this front, but this was the major
fs interface blast.
separately
* provide puffs_cc_getcc()
This is in preparation for the removal of you-should-guess-what as
an argument to routines here and there and everywhere.
also synchronizes with puffs_mount() and does not return (exit) in the
parent process until the file system has been mounted. This makes
it possible to reliably run e.g. mount_foo jippi /kai ; cd /kai/ee
servers. Calling daemon() (i.e. fork()ing) inside a library can
cause nice surprises for e.g. threaded programs. As discussed with
Greg Oster & others.
alternative to the (vastly superior ;) continuation model. This
is very preliminary stuff and not compiled by default (which it
even won't do without some other patches I cannot commit yet).
The raison d'commit of the patch is a snippet which ensures proper
in-order dispatching of all operations, including those which don't
require a response. Previously many of them would be dispatched
simultaneosly, e.g. fsync and reclaim on the same node, which
obviously isn't all that nice for correct operation.
userspace, since it doesn't contain any information yet. I should
still rework this more so this is just a quickie to get the read/write
style interface more up to speed with the ioctl version.
interacts with the userspace file server:
* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.
* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.
* start generalizing the transport interface to be independent of puffs
* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".
* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
kernel to the file server for silly things the file server did,
e.g. attempting to create a file with size VSIZENOTSET. The file
server can handle these as it chooses, but the default action is
for it to throw its hands in the air and sing "goodbye, cruel world,
it's over, walk on by".
the maximum request size after mount. Calling mount(MNT_GETARGS)
from the file server is currently not kosher, as it vrele()s the
root vnode, potentially causing an inactive, which the file server
cannot handle while it itself is inactive in the kernel (deadlock).
Add an additional call with MNT_GETARGS to retrieve the modified
information instead of relying on the kernel code abusing the mount
interface during mount.
* in addition add/remove, allow enable/disable, which can be used
to control events for descriptors without having to remove all the
data associated with them
* add directsend/receive, which can be used to pass the same buffer
from the caller to read/writeframe and back again
* add flags to enqueue functions and allow urgent buffers to be
processed as the next PDU
file always after creation to cache the inode number given by the
backend file system. Otherwise we would not find a newly created
node from incore and create another one. In practise this was
pretty well hidden by the kernel name cache.
and destroyed, but not yet reclaimed. This prevents puffs_pn_nodewalk()
from returning stale entries. Make nullfs use this (some file
systems are a bit too happy with recycling inode numbers).
FORTIFY_SOURCE feature of libssp, thus checking the size of arguments to
various string and memory copy and set functions (as well as a few system
calls and other miscellany) where known at function entry. RedHat has
evidently built all "core system packages" with this option for some time.
This option should be used at the top of Makefiles (or Makefile.inc where
this is used for subdirectories) but after any setting of LIB.
This is only useful for userland code, and cannot be used in libc or in
any code which includes the libc internals, because it overrides certain
libc functions with macros. Some effort has been made to make USE_FORT=yes
work correctly for a full-system build by having the bsd.sys.mk logic
disable the feature where it should not be used (libc, libssp iteself,
the kernel) but no attempt has been made to build the entire system with
USE_FORT and doing so will doubtless expose numerous bugs and misfeatures.
Adjust the system build so that all programs and libraries that are setuid,
directly handle network data (including serial comm data), perform
authentication, or appear likely to have (or have a history of having)
data-driven bugs (e.g. file(1)) are built with USE_FORT=yes by default,
with the exception of libc, which cannot use USE_FORT and thus uses
only USE_SSP by default. Tested on i386 with no ill results; USE_FORT=no
per-directory or in a system build will disable if desired.
turn a reference to puffs_framebuf in the file system from a
cb/justsend operation to a cc wait, should the file system find
itself desiring the result.
equal, larger, respectively instead of 0/1 for non/equal. This
will allow sorting the buffers for faster matching in libpuffs.
While here, change the name from respcmp to framecmp, as that better
reflects the purpose.
NOTE! there is no obvious way to make compilation fail for file
systems which may already be using this feature (although I don't
think there are any outside our tree, as the feature is two weeks
old). Nevertheless, non-updated file systems will fail very quickly.