Commit Graph

36 Commits

Author SHA1 Message Date
manu 8abab6b782 Add PUFFS_KFLAG_NOFLUSH_META to prevent sending metadata flush to FUSE
FUSE filesystems do not expect to get metadata updates for [amc]time
and size, they updates the value on their own after operations.

The PUFFS PUFFS_KFLAG_NOFLUSH_META option prevents regular metadata cache
flushes to the filesystem , and libperfuse uses it to match Linux FUSE
behavior.

While there, fix a bug in SETATTR: do not update kernel metadata cache
from SETATTR reply when the request is asynchronous, as we do not have
the reply yet.
2015-02-15 20:21:29 +00:00
manu 6645d5252e FUSE fallocate support
There seems to be no fdiscard FUSE operation at the moment, hence that one
is left unused.
2014-10-31 15:12:15 +00:00
manu bcfebaff94 Improve POSIX compliance of FUSE filesystems through PERUSE
- access denied is EPERM and not EACCES
- access to file owned by someone else in a sticy-bit directory should
  be allowed for the sticy-bit directory owner
- setting sticky-bit on a non directory should produce EFTYPE
- implement PATHCONF method as much as we can.
2014-09-03 16:01:45 +00:00
manu beafd5bc7c Removed unimplemented mmap and seek method. seek's declaration caused
seek request to be passed backand forth between kernel and userland
while we did nothing about them.
2014-08-16 16:31:15 +00:00
manu 781f78b809 Use just introduced open2 PUFFS method and its PUFFS_OPEN_IO_DIRECT oflag
to implement FUSE's OPEN_IO_DIRECT, by which the filesystem tells the kernel
that read/write to the file should bypass the page cache.

Remove a warning about read beyond EOF which will now normally appear when
page cache is bypassed.
2014-08-16 16:28:43 +00:00
manu 3cd0e66ce4 Turn a fatal error into a warning. 2012-09-10 13:56:18 +00:00
manu 2a9a80bb36 Add PUFFS_KFLAG_CACHE_DOTDOT so that vnodes hold a reference on their
parent, keeping them active, and allowing to lookup .. without sending
a request to the filesystem.

Enable the featuure for perfused, as this is how FUSE works.
2012-08-10 16:49:35 +00:00
manu 075ba0e590 - Fix same vnodes associated with multiple cookies
The scheme used to retreive known nodes on lookup was flawed, as it only
used parent and name. This produced a different cookie for the same file
if it was renamed, when looking up ../ or when dealing with multiple files
associated with the same name through link(2).

We therefore abandon the use of node name and introduce hashed lists of
inodes. This causes a huge rewrite of reclaim code, which do not attempt
to keep parents allocated until all their children are reclaimed

- Fix race conditions in reclaim
There are a few situations where we issue multiple FUSE operations for
a PUFFS operation. On reclaim, we therefore have to wait for all FUSE
operation to complete, not just the current exchanges. We do this by
introducing node reference count with node_ref() and node_rele().

- Detect data loss caused by FAF
VOP_PUTPAGES causes FAF writes where the kernel does not check the
operation result. At least issue a warning on error.

- Enjoy FAF shortcut on setattr
No need to wait for the result if the kernel does not want it. There is
however an exception for setattr that touch the size, we need to wait
for completion because we have other operations queued for after the
resize.

- Fix fchmod() on write-open file
fchmod() on a node open with write privilege will send setattr with both mode and size set. This confuses some FUSE filesystem. Therefore we send two FUSE operations, one for mode, and one for size.

- Remove node TTL handling for netbsd-5 for simplicity sake. The code
still builds on netbsd-5 but does not have the node TTL feature anymore.
It works fine with kernel support on netbsd-6.
2012-07-21 05:49:42 +00:00
manu 70d8192475 - When using PUFFS_KFLAG_CACHE_FS_TTL, do not use puffs_node to carry
attribute and TTL fora newly created node. Instead extend puffs_newinfo
  and add puffs_newinfo_setva() and puffs_newinfo_setttl()
- Remove node_mk_common_final in libperfuse. It used to set uid/gid for
  a newly created vnode but has been made redundant along time ago since
  uid and gid are properly set in FUSE header.
- In libperfuse, check for corner case where opc = 0 on INACTIVE and   RECLAIM (how is it possible? Check for it to avoid a crash anyway)
- In libperfuse, make sure we unlimit RLIMIT_AS and RLIMIT_DATA so that
  we do notrun out of memory because the kernel is lazy at reclaiming vnodes.
- In libperfuse, cleanup style of perfuse_destroy_pn()
2012-04-18 00:57:21 +00:00
manu 6bceb41868 Use new PUFFS_KFLAG_CACHE_FS_TTL option to puffs_init(3) so that
FUSE TTL on name and attributes are used. This save many PUFFS
operations and improves performances.

PUFFS_KFLAG_CACHE_FS_TTL is #ifdef'ed in many places for now so that
libperfuse can still be used on netbsd-5.
2012-04-08 15:13:06 +00:00
matt e1a2f47f12 Use C89 function definition 2012-03-21 10:10:36 +00:00
manu 9724ab82d4 Make sure perfused exit when the filesystem crashed, so that unmount
is done. Failure to do so caused deadlocks, with operation that
held a lock on the root vnode and got stuck in perfused forever.

Approved by releng.
2012-02-03 15:54:15 +00:00
manu 4fba06add5 Add a FUSE trace facility, with statistics collection. This should help
tracking bugs and performance issues
2011-12-28 17:33:52 +00:00
manu 2bc8acd859 - Fix the confusion between fileno (opaque FUSE reference) and inode
numbers. fileno must be used when exchanging FUSE messages.
- Do not use kernel name cache anymore, as it caused modification from
  other machines to be invisible.
- Honour name and attribute cache directive from FUSE filesystem
2011-10-30 05:11:37 +00:00
manu aec8bd3191 perfuse memory usage can grow quite large when using a lot of vnodes,
and the amount of data memory involved is not easy to forcast. We therefore
raise the limit to the maximum.

Patch from Manuel Bouyer. It helps completing a cvs update on a glusterfs
colume.
2011-10-23 05:01:00 +00:00
manu b4362f9202 mlockall is not necessary after all, once we have fixed a kernel bug involving
agedaemon sleeping form memory
2011-10-18 15:47:32 +00:00
christos 92ad06d875 make this build on amd64 and remove redundant and unused code. 2011-09-09 22:51:44 +00:00
manu f19a344abc Make sure perfused remains locked in memory, otherwise we can get
deadlocks in low memory situations, where ioflush waits for perfused
to fsync vnodes, and perfused waits for memory to be freed.
2011-09-09 15:35:22 +00:00
christos dda15b03dd - fix warn/err confusiog
- fix debugging printf
- add func arguments to simple formats
2011-08-13 23:12:15 +00:00
manu 40e8be3f0f Remove PUFFS_KFLAG_WTCACHE, which caused data corruption and slowdown 2011-08-09 06:58:33 +00:00
manu 8ae0a67d6d Add support for extended attributes 2011-06-28 16:19:16 +00:00
manu 26381d518d Use SOCK_SEQPACKET in perfuse if available. This fix file operations hangs
where the FUSE filesyste replied to an operation and got an ENOBUFS it did
not handle.

We now are also able to cleanly unmount
2011-05-30 14:50:08 +00:00
manu 286587ad9c Set buffer size as big in nomal mode as we do in debug mode, when
perfused stays in foreground. The difference is a mistake and was not
intended.

There is still a bug ready to bite here, since SOCK_STREAM is not reliable.
We just hope that buffers are big enough to hold all packets, but if they
are overflown, we loose a packet and a file operation gets stuck.

We really nee SOCk_SEQPACKET here, but unfortunately it is very broken at
that time.
2011-05-18 15:25:19 +00:00
manu 6b36a33563 Mont FUSE filesystem with proprer source and fstype so that df and mount
display something that makes sense
2011-05-12 10:32:41 +00:00
manu c3c545a544 - Implement proper unprivilegied user permission verifications
Verification is now done in the lookup method, as it is the way to
go. Of course there are corner cases, such as the sticky bit which
need special handling in the remove method.

- Set full fsidx in vftstat method

- Do not pass O_APPEND to the filesystem. FUSE always sends the
write offset, so setting O_APPEND is useless. If the filesystem
uses it in an open(2) system call, it will even cause file
corruptions, since offsets given to pwrite(2) will be ignored.
This fix allows glusterfs to host a NetBSD ./build.sh -o build

- Do not use the FUSE access method, use getattr and check for
permission on our own. The problem is that a FUSE filesystem will
typically use the Linux-specific setfsuid() to perform access
control. If that is missing, any chack is likely to occur on
behalf of the user running the filesystem (typically root), causing
access method to return wrong information.

- When possible, avoid performing a getattr method call and use
cached value in puffs_node instead. We still retreive the latest
value by calling getattr when performing append write operation,
to minimize the chances that another writer appended since the
last time we did.

- Update puffs_node cached file size in write method

- Remove unused argument to perfuse_destroy_pn()
2011-04-25 04:54:53 +00:00
manu f4f951a0c1 Remove code supporting SOCK_STREAM, as SOCK_DGRAM works fine 2010-10-11 05:37:58 +00:00
manu 2ff0ea03a7 - Correctly handle rename whith overwritten destination
- Keep track of file name to avoid lookups when we can. This makes sure we
  do not have two cookies for the same inode, a situation that cause wreak
  havoc when we come to remove or rename a node.
- Do not use PUFFS_FLAG_BUILDPATH at all, since we now track file names
- In open, queue requests after checking for access, as there is no merit
  to queue a will-be-denied request while we can deny it immediatly
- request reclaim of removed nodes at inactive stage
2010-10-03 05:46:47 +00:00
manu f7174423c5 = Open files =
- Restore open on our own in fsycn and readdir, as the node may not already
be open, and FUSE really wants it to be. No need to close immediatly, it
can be done at inactive time.

= Write operations =
- fix a nasty bug that corrupted files on write (written added twice)
- Keep track of file size in order to honour PUFFS_IO_APPEND

= many fixes in rename =
- handler overwritten nodes correctly
- wait for all operations on the node to drain before doing rename, as
filesystems may not cope with operations on a moving file.
- setback PUFFS_SETBACK_INACT_N1 cannot be used from rename, we therefore
miss the inactive time for an overwritten node. This bounds us to give up
PUFFS_KFLAG_IAONDEMAND.

= Removed files =
- forbid most operations on a removed node, return ENOENT
- setback PUFFS_SETBACK_NOREF_N1 at inactive stage to cause removed
file reclaim

= Misc =
- Update outdated ARGSUSED for lint
- Fix a memory leak (puffs_pn_remove instead of puffs_pn_put)
- Do not use PUFFS_FLAG_BUILDPATH except for debug output. It makes the
lookup code much simplier.
2010-09-29 08:01:10 +00:00
manu bcf6f2f32a == file close operations ==
- use PUFFS_KFLAG_WTCACHE to puffs_init so that all writes are
immediatly send to the filesystem, and we do not have anymore write
after inactive. As a consequence, we can close files at inactive
stage, and there is not any concern left with files opened at
create time. We also do not have anymore to open ourselves in readdir and
fsync.

- Fsync on close (inactive stage). That makes sure we will not need to
do these operations once the file is closed (FUSE want an open file).
short sircuit the request that come after the close, bu not fsinc'ing
closed files,

- Use PUFFS_KFLAG_IAONDEMAND to get less inactive calls

== Removed nodes ==
- more ENOENT retunred for operations on removed node (but there
are probably some still missing): getattr, ooen, setattr, fsync

- set PND_REMOVE before sending the UNLINK/RMDIR operations so that we avoid
races during UNLINK completion. Also set PND_REMOVED on node we overwirte
in rename

== Filehandle fixes ==
- queue open operation to avoid getting two fh for one file

- set FH in getattr, if the file is open

- Just requires a read FH for fsyncdir, as we always opendir in read
mode. Ok, this is misleading :-)

== Misc ==
- do not set FUSE_FATTR_ATIME_NOW in setattr, as we provide the time

- short circuit nilpotent operations in setattr

- add a filename diagnostic flag to dump file names
2010-09-23 16:02:34 +00:00
manu 3d6861b56c - performance improvement for read, readdir and write. Now we use
SOCK_DGRAM, we can send many pages at once without hitting any bug

- when creating a file, it is open for FUSE, but not for the kernel.
If the kernel does not do a subsequent open, we have a leak. We fight
against this by trying to close such file that the kernel left unopen
for some time.

- some code refactoring to make message exchange debug easier (more to come)
2010-09-20 07:00:21 +00:00
manu e9a8a6acc0 - Use SOCK_DGRAM instead of SOCK_STREAM, as the filesystem seems to
assume datagram semantics: when using SOCK_STREAM, if perfused sends
frames faster than the filesystem consumes them, it will grab multiple
frames at once and discard anything beyond the first one. For now the
code can work both with SOCK_DGRAM and SOCK_STREAM, but SOCK_STREAM
support will probably have to be removed for the sake of readability.

- Remeber to sync parent directories when moving a node

- In debug output, display the requeue type (readdir, write, etc...)
2010-09-15 01:51:43 +00:00
manu 1e672db8d2 - Do not checkfor peer credentials when perfused is autostarted and
therefore runs with filesystem privileges

- shut up warnings and debug messages when perfused is autostarted

- make perfused patch modifiable with CFLAGS for easier pkgsrc integration

- Fix build warnings
2010-09-07 02:11:04 +00:00
manu 5536686b23 More LP64 fixes 2010-09-06 01:40:24 +00:00
manu 374b973db4 - set user/group ownership after object creation.
- enforce permissios checks. This needs to be  reviewed.
2010-08-28 03:46:21 +00:00
manu 1326029781 - if perfused is not already started (cannot connect to /dev/fuse),
FUSE filesystems will attempt to start it on their own, and will
communicate using a socketpair

- do not advertise NULL file handle as being valid when sending themback to the FUSE filesystem.

- unmount if we cannot talk to the FUSE process anymore

- set calling process gid properly

- debug message cleanup
2010-08-27 09:58:17 +00:00
manu 7b1d1ee680 libperfuse(3) is a PUFFS relay to FUSE. In order to use it,
FUSE filesystem must be patched to #include <perfuse.h> in the source
files that open /dev/fuse and perform the mount(2) system call. The
FUSE filesystem must be linked with -lperfuse.

libperfuse(3) implements the FUSE kernel interface, on which libfuse or
any FUSE filesystem that opens /dev/fuse directly can be used.

For now, an external daemon called perfused(8) is used. This may change
in the future.
2010-08-25 07:16:00 +00:00