userspace, since it doesn't contain any information yet. I should
still rework this more so this is just a quickie to get the read/write
style interface more up to speed with the ioctl version.
interacts with the userspace file server:
* since the kernel-user communication is not purely request-response
anymore (hasn't been since 2006), try to rename some "request" to
"message". more similar mangling will take place in the future.
* completely rework how messages are allocated. previously most of
them were borrowed from the stack (originally *all* of them),
but now always allocate dynamically. this makes the structure
of the code much cleaner. also makes it possible to fix a
locking order violation. it enables plenty of future enhancements.
* start generalizing the transport interface to be independent of puffs
* move transport interface to read/write instead of ioctl. the
old one had legacy design problems, and besides, ioctl's suck.
implement a very generic version for now; this will be
worked on later hopefully some day reaching "highly optimized".
* implement libpuffs support behind existing library request
interfaces. this will change eventually (I hate those interfaces)
kernel to the file server for silly things the file server did,
e.g. attempting to create a file with size VSIZENOTSET. The file
server can handle these as it chooses, but the default action is
for it to throw its hands in the air and sing "goodbye, cruel world,
it's over, walk on by".
the maximum request size after mount. Calling mount(MNT_GETARGS)
from the file server is currently not kosher, as it vrele()s the
root vnode, potentially causing an inactive, which the file server
cannot handle while it itself is inactive in the kernel (deadlock).
Add an additional call with MNT_GETARGS to retrieve the modified
information instead of relying on the kernel code abusing the mount
interface during mount.
* in addition add/remove, allow enable/disable, which can be used
to control events for descriptors without having to remove all the
data associated with them
* add directsend/receive, which can be used to pass the same buffer
from the caller to read/writeframe and back again
* add flags to enqueue functions and allow urgent buffers to be
processed as the next PDU
file always after creation to cache the inode number given by the
backend file system. Otherwise we would not find a newly created
node from incore and create another one. In practise this was
pretty well hidden by the kernel name cache.
and destroyed, but not yet reclaimed. This prevents puffs_pn_nodewalk()
from returning stale entries. Make nullfs use this (some file
systems are a bit too happy with recycling inode numbers).
FORTIFY_SOURCE feature of libssp, thus checking the size of arguments to
various string and memory copy and set functions (as well as a few system
calls and other miscellany) where known at function entry. RedHat has
evidently built all "core system packages" with this option for some time.
This option should be used at the top of Makefiles (or Makefile.inc where
this is used for subdirectories) but after any setting of LIB.
This is only useful for userland code, and cannot be used in libc or in
any code which includes the libc internals, because it overrides certain
libc functions with macros. Some effort has been made to make USE_FORT=yes
work correctly for a full-system build by having the bsd.sys.mk logic
disable the feature where it should not be used (libc, libssp iteself,
the kernel) but no attempt has been made to build the entire system with
USE_FORT and doing so will doubtless expose numerous bugs and misfeatures.
Adjust the system build so that all programs and libraries that are setuid,
directly handle network data (including serial comm data), perform
authentication, or appear likely to have (or have a history of having)
data-driven bugs (e.g. file(1)) are built with USE_FORT=yes by default,
with the exception of libc, which cannot use USE_FORT and thus uses
only USE_SSP by default. Tested on i386 with no ill results; USE_FORT=no
per-directory or in a system build will disable if desired.
turn a reference to puffs_framebuf in the file system from a
cb/justsend operation to a cc wait, should the file system find
itself desiring the result.
equal, larger, respectively instead of 0/1 for non/equal. This
will allow sorting the buffers for faster matching in libpuffs.
While here, change the name from respcmp to framecmp, as that better
reflects the purpose.
NOTE! there is no obvious way to make compilation fail for file
systems which may already be using this feature (although I don't
think there are any outside our tree, as the feature is two weeks
old). Nevertheless, non-updated file systems will fail very quickly.
reclaiming and the network/server is slow, we might have thousands
of buffers allocated at the same time causing the process to run
out of vm space. Rate limiting the number of outstanding ops would
be a nicer choice, but that requires more complex changes.
loop function might generate some results. and this is still "after"
event handling (except for the first call, but I'm not too keen on
optimizing for that)
* don't be such a baby about EINTR from kevent(). if we get it, suck
it up and continue instead of quitting
instead of puffs_start(). Get completely rid of puffs_start(), as
everything it used to do is now handled by the mount routine.
Introduce an optional pre-mount call puffs_setrootinfo() for setting
non-default root node information. As the old puffs_mount() is
now virtually useless, say byebye to it and rename the old
puffs_domount() to puffs_mount(), but add a root cookie parameter
to compensate for the late puffs_start().
support removal and addition of i/o file descriptors on the fly.
* detect closed file descriptors
* automatically free waiters of a dead file descriptor
* give the file server the possibility to specify a callback which
notifies of a dead file descriptor
* move loop function to be a property of the mainloop instead of
framebuf (doesn't change effective behaviour)
* add the possibility to configure a timespec parameter which
attempts to call the loop function periodically
* move the event loop functions from the puffs_framebuf namespace
to puffs_framev to differential between pure memory management
functions
framebuf event loop to take n i/o fd's as parameters and also allow
dynamic add/remove of fd's. (not tested except for one fd, but more
changes coming soon)
stack instead of the continuation stack. This is for lib/36011, where
pthread gets confused since we aren't running on the regular stack.
I'm not really sure which direction to go to with this quite yet, so
make the hack hard to enable on purpose. The whole request dispatch
code needs cleaning anyway.
and event handling mechanisms required in file servers with blocking
I/O backends. puffs_framebuf is built on the concept of puffs_cc
and uses those to multiplex execution where needed.
File systems are required to implement three methods:
* read frame
* write frame
* compare if frame is a response to the given one
Memory management is provided by puffs_framebuf, but the file
systems must still, of course, interpret the protocol and do e.g.
byte order conversion.
As always, puffs_framebuf is work in progress. Current users are
mount_psshfs and mount_9p.
only take the bare essentials, which currently means removing
"maxreqlen" from the argument list (all current callers I'm aware
of set it as 0 anyway). Introduce puffs_init(), which provides a
context for setting various parameters and puffs_domount(), which
can be used to mount the file system. Keep puffs_mount() as a
shortcut for the above two for simple file systems.
Bump development ABI version to 13. After all, it's Friday the 13th.
Watch out! Bad things can happen on Friday the 13th. --No carrier--
accessors for interesting data in it. Namely, you can now get
pu->pu_privdata with puffs_getspecific(), pu->pu_pn_root with
puffs_set/getroot() and pu->pu_maxreqlen with puffs_getmaxreqlen().
and compares the path of the node against the given pathobject.
Also make comparison method take a flag to indicate if it should
check if the second path is a true prefix of the first.
plus some namespace cleanup
macro which does strcmp against ".." and (the untranslated)
componentname
* make PUFFS_FLAG_BUILDPATH build paths also if dotdot is the case,
and adapt the regular path objects to this
* make nullfs lookup readable because we can now get rid of dotdot
processing there
system backends which operate purely based on paths, push out more
path management into the library and make path management more
abstract: enable a file system to define a bunch of path management
callbacks, which are used by the framework. Management of normal
/this/is/a/path type paths is provided by the library.
the given directory if the file system wants paths (PUFFS_FLAG_BUILDPATH).
Do this by walking the nodelist and adjusting the path prefix of
each matching node.
* truncate only regular files to set size
* do the chmod()-dance for cache flush to now write-protected files
until I can think of a nicer way to solve this
a directory hierarchy to another point, just like with the kernel
nullfs. This is not really a layering scheme yet, but it should
evolve into one. Currently it can just be used to do 1:1 mapping.
server that it needs to mount the file system backend if it wants
to call mount
* provide some options for getmntopts(), assume that callers will parse
command line (or fstab) args
* reorganize the puffs_cc interface just a bit, preparing for a bigger
revamp later
Add support for having multiple outstanding operations. This is done
by exposing enough interfaces so that it is convenient to have the
main event loop in the implementation itself and by providing a
continuation framework for convinient blocking and rescheduling.
works fine, but will undergo further cleanup & development
This work was initially started and completed for Google SoC 2005
and tweaked to work a bit better in the past few weeks. While
being far from complete, it is functional enough to be able and
stable to host a fairly general-purpose in-memory file system in
userspace. Even so, puffs should be considered experimental and
no binary compatibility for interfaces or crash-freedom or zero
security implications should be relied upon just yet.
The GSoC project was mentored by William Studenmund and the final
review for the code was done by Christos.