Commit Graph

54 Commits

Author SHA1 Message Date
tron
aff2bc3b4f Avoid comparison between signed and unsigned integer expressions by
casting the offset to an unsigned type. This fixes the NetBSD/i386
and hopefully the NetBSD/amd64 build.
2011-09-10 10:06:10 +00:00
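A minimal sketch of the kind of cast described above, assuming a signed offset is compared against an unsigned length. The helper name offset_within is hypothetical and is not taken from perfused.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper: report whether a signed offset falls inside a
 * buffer of len bytes. Negative offsets are rejected first, so the cast
 * to an unsigned type cannot turn them into huge positive values.
 */
int
offset_within(off_t off, size_t len)
{
	if (off < 0)
		return 0;
	/* Both operands are unsigned here, so -Wsign-compare has nothing to flag. */
	return (uintmax_t)off < (uintmax_t)len;
}
```

Checking the sign before casting matters: casting a negative offset directly would silently produce a very large unsigned value.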
christos
92ad06d875 make this build on amd64 and remove redundant and unused code. 2011-09-09 22:51:44 +00:00
manu
5a6d3e75bd Serialize access to file size. We already have such a thing in the
kernel, where it fixes race for PUFFS filesystems, but we need it again
in perfused since FUSE filesystems are allowed to reorder requests.

The huge issue is in the asynchronous SETATTR sent by fsync. It is
followed by a synchronous FSYNC, so if the filesystem does not reorder
requests, once the FSYNC returns, we are confident the SETATTR is done.
But since FUSE can reorder, we need to implement sync in perfused.
2011-09-09 15:45:28 +00:00
manu
f19a344abc Make sure perfused remains locked in memory, otherwise we can get
deadlocks in low memory situations, where ioflush waits for perfused
to fsync vnodes, and perfused waits for memory to be freed.
2011-09-09 15:35:22 +00:00
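One common way for a daemon to stay locked in memory is mlockall(2). The sketch below assumes that approach; the message above does not say which call perfused uses, and lock_daemon_memory is a hypothetical name.

```c
#include <sys/mman.h>

/*
 * Hypothetical helper: pin all current and future pages of the calling
 * process so they cannot be paged out. Returns 0 on success, or -1 with
 * errno set (for example when the process lacks the privilege or the
 * locked-memory resource limit is too low).
 */
int
lock_daemon_memory(void)
{
	return mlockall(MCL_CURRENT | MCL_FUTURE);
}
```

A caller would typically invoke this once at startup and treat a failure as a warning rather than a fatal error.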
christos
e3c6f18d56 simplify and eliminate non literal string formats. 2011-08-14 08:19:04 +00:00
christos
dda15b03dd - fix warn/err confusion
- fix debugging printf
- add func arguments to simple formats
2011-08-13 23:12:15 +00:00
manu
4c7933948b Fix uninitialized variable usage (never thought lint would miss that when
used by return statement) that caused unprivileged users to fail on
unlink(2) and rename(2) operations.
2011-08-09 09:06:52 +00:00
manu
40e8be3f0f Remove PUFFS_KFLAG_WTCACHE, which caused data corruption and slowdown 2011-08-09 06:58:33 +00:00
manu
1de6a9d5ab Do not reject reads on directory, it raises a useless EBADFD while the
thing can just fail silently.
2011-08-02 16:57:16 +00:00
manu
fb0fa57f18 Fix creds passed to FUSE when requests are done on behalf of the kernel.
We previously sent uid/gid set to -1, we now set it to 0.
2011-08-02 14:53:38 +00:00
manu
4b1fc9a3f7 Make sure libperfuse still builds on netbsd-5.1 2011-07-19 07:29:39 +00:00
manu
7541af315c ftruncate(2) causes a SETATTR with only va_size set, and some filesystems
(e.g.: glusterfs) will do custom handling in such a situation. This
breaks because libpuffs folds a metadata (va_atime and va_mtime) update
into each SETATTR. We try to identify SETATTR caused by ftruncate(2) and
remove va_atime and va_mtime in such a situation.

This fixes a bug with glusterfs, where parts of a file downloaded by
FTP were filled with zeros because of a ftruncate(2) sent out of order
with write(2) requests. glusterfs behavior depends on the undocumented
FUSE rule that ftruncate(2) will only set va_size in SETATTR.
2011-07-18 02:14:01 +00:00
manu
6e1ab723f7 FUSE struct dirent's off is not the offset in the buffer, it is an opaque
cookie that the filesystem passes us, and that we need to send back on
the next READDIR. Most filesystems just ignore the value and send the
next chunk of buffer, but not all of them. Fixing this allows glusterfs
distributed volumes to work.
2011-07-14 15:37:32 +00:00
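The cookie handling described above can be sketched as follows. The struct is a local stand-in that mirrors the usual FUSE directory entry layout (64-bit inode, 64-bit off, 32-bit name length, 32-bit type, then the name, padded to 8 bytes). The names sketch_dirent and next_readdir_cookie are hypothetical and this is not the libperfuse code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a FUSE directory entry header (assumed layout). */
struct sketch_dirent {
	uint64_t ino;
	uint64_t off;		/* opaque cookie chosen by the filesystem */
	uint32_t namelen;
	uint32_t type;
	/* namelen bytes of name follow, not NUL-terminated */
};

/* Entries are padded so that each one starts on an 8-byte boundary. */
#define SKETCH_ALIGN(x) \
	(((x) + sizeof(uint64_t) - 1) & ~(sizeof(uint64_t) - 1))

/*
 * Walk a READDIR reply and return the cookie to send with the next
 * READDIR: the off field of the last complete entry, not the number of
 * bytes consumed. If the buffer holds no complete entry, the previous
 * cookie is returned unchanged.
 */
uint64_t
next_readdir_cookie(const void *buf, size_t len, uint64_t cookie)
{
	const unsigned char *p = buf;
	size_t pos = 0;

	while (len - pos >= sizeof(struct sketch_dirent)) {
		struct sketch_dirent d;
		size_t reclen;

		memcpy(&d, p + pos, sizeof(d));	/* avoid unaligned access */
		reclen = SKETCH_ALIGN(sizeof(d) + d.namelen);
		if (reclen > len - pos)
			break;			/* truncated entry: stop */
		cookie = d.off;
		pos += reclen;
	}
	return cookie;
}
```

A filesystem that ignores the value works with either interpretation, which is why treating off as a buffer offset can go unnoticed until a stricter filesystem is used.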
manu
be95d60797 Add a flag to VOP_LISTEXTATTR(9) so that the vnode interface can tell the
filesystem in which format extended attributes shall be listed.

There are currently two formats:
- NUL-terminated strings, used for listxattr(2), this is the default.
- one byte length-prefixed, non NUL-terminated strings, used for
  extattr_list_file(2), which is obtained by setting the
  EXTATTR_LIST_PREFIXLEN flag to VOP_LISTEXTATTR(9)

This approach avoids the need for converting the list back and forth, except
in libperfuse, since FUSE uses NUL-terminated strings, and the kernel may
have requested EXTATTR_LIST_PREFIXLEN.
2011-07-04 08:07:29 +00:00
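A minimal sketch of the conversion between the two list formats described above, from NUL-terminated names to one-byte length-prefixed names. The function name nul_list_to_prefixlen and its error convention are assumptions for illustration, not the libperfuse implementation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Convert a list of NUL-terminated names (listxattr(2) style) into
 * one-byte length-prefixed, non NUL-terminated names
 * (extattr_list_file(2) style).
 *
 * Returns the number of bytes written to dst, or (size_t)-1 if the
 * input is not properly terminated, a name is longer than 255 bytes,
 * or dst is too small.
 */
size_t
nul_list_to_prefixlen(const char *src, size_t srclen,
    unsigned char *dst, size_t dstlen)
{
	size_t in = 0, out = 0;

	while (in < srclen) {
		const char *end = memchr(src + in, '\0', srclen - in);
		size_t n;

		if (end == NULL)
			return (size_t)-1;	/* last name not terminated */
		n = (size_t)(end - (src + in));
		if (n > 255 || dstlen - out < n + 1)
			return (size_t)-1;
		dst[out++] = (unsigned char)n;	/* length prefix */
		memcpy(dst + out, src + in, n);	/* name without its NUL */
		out += n;
		in += n + 1;			/* skip name and its NUL */
	}
	return out;
}
```

Both encodings spend exactly one extra byte per name, so the converted list has the same size as the input.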
riz
ad760bfaf3 Don't hardcode the libpuffs path to /usr/src/lib/libpuffs. 2011-06-28 20:28:48 +00:00
manu
8ae0a67d6d Add support for extended attributes 2011-06-28 16:19:16 +00:00
manu
5255616730 Fix race conditions between write and getattr/setattr, which led to
inconsistencies between kernel and filesystem idea of file size during
writes with IO_APPEND.

On my system, this resulted in a configure script producing config.status
with ": clr\n" lines stripped (not 100% reproducible, but always this
specific string. That is of little interest except for my own future
reference).

When a write is in progress, getattr/setattr get/set the maximum size
among kernel idea (grown by write) and filesystem idea (not yet grown).
2011-06-01 15:54:10 +00:00
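The size rule above can be sketched as a small helper. The name reported_size is hypothetical, and returning the filesystem value when no write is in progress is an assumption, since the message only describes the in-progress case.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pick the file size to report or set.
 * While a write is in progress the kernel's idea of the size may already
 * have grown while the filesystem's has not, so use the larger of the two.
 */
uint64_t
reported_size(uint64_t kernel_size, uint64_t fs_size, int write_in_progress)
{
	if (!write_in_progress)
		return fs_size;	/* assumption: filesystem value when idle */
	return kernel_size > fs_size ? kernel_size : fs_size;
}
```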
manu
344a543c33 Remove outdated comment about a fixed bug 2011-06-01 07:57:24 +00:00
manu
26381d518d Use SOCK_SEQPACKET in perfuse if available. This fixes file operation hangs
where the FUSE filesystem replied to an operation and got an ENOBUFS it did
not handle.

We are now also able to cleanly unmount.
2011-05-30 14:50:08 +00:00
joerg
a216da57a6 Default to -Wno-sign-compare -Wno-pointer-sign for clang.
Push -Wno-array-bounds down to the cases that depend on it.
Selectively disable warnings for 3rd party software or non-trivial
issues to be reviewed later to get clang -Werror to build most of the
tree.
2011-05-26 12:56:24 +00:00
manu
e0a6df40c2 - Proper permission checks when doing directory traversal. e.g.: run
rm dir/file while dir was never looked up since the mount. In that
situation, we get lookup with pcn_nameiop NAMEI_DELETE for dir before
we get it for file. But for dir we are just looking for PUFFS_VEXEC.
This is solved by honouring NAMEI_ISLASTCN, which is set for the last
element only

- do not send O_EXCL to FUSE as documentation forbids it.

- fix warning
2011-05-18 15:28:12 +00:00
manu
286587ad9c Set buffer size as big in normal mode as we do in debug mode, when
perfused stays in foreground. The difference is a mistake and was not
intended.

There is still a bug ready to bite here, since SOCK_STREAM is not reliable.
We just hope that buffers are big enough to hold all packets, but if they
are overflowed, we lose a packet and a file operation gets stuck.

We really need SOCK_SEQPACKET here, but unfortunately it is very broken at
this time.
2011-05-18 15:25:19 +00:00
manu
e7a016f266 typos 2011-05-18 15:22:54 +00:00
manu
6b36a33563 Mount FUSE filesystem with proper source and fstype so that df and mount
display something that makes sense
2011-05-12 10:32:41 +00:00
jakllsch
32dec9bd40 Use sysconf(_SC_PAGESIZE) instead of PAGE_SIZE. 2011-05-11 14:52:48 +00:00
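A minimal sketch of querying the page size at run time instead of relying on a compile-time PAGE_SIZE. The helper name runtime_page_size and the 4096-byte fallback are assumptions for illustration.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the system page size in bytes.
 * sysconf(3) returns -1 on error, in which case we fall back to an
 * assumed 4096 bytes.
 */
size_t
runtime_page_size(void)
{
	long n = sysconf(_SC_PAGESIZE);

	return n > 0 ? (size_t)n : 4096;
}
```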
njoly
411ef8d5d2 Small typo in macro (Xd -> Xr). 2011-05-10 12:14:37 +00:00
manu
73963ae9de Enable the build of perfused and libperfuse 2011-05-09 08:51:08 +00:00
manu
38ecbcf429 Fixes for the advlock method. It can now sustain pkgsrc/devel/locktests
with glusterfs as backend
2011-05-03 13:19:50 +00:00
manu
f8934bcc9b Fix build (libperfuse is still not built by default, but time is coming) 2011-05-03 13:14:09 +00:00
manu
c3c545a544 - Implement proper unprivileged user permission verifications
Verification is now done in the lookup method, as it is the way to
go. Of course there are corner cases, such as the sticky bit which
need special handling in the remove method.

- Set full fsidx in vftstat method

- Do not pass O_APPEND to the filesystem. FUSE always sends the
write offset, so setting O_APPEND is useless. If the filesystem
uses it in an open(2) system call, it will even cause file
corruptions, since offsets given to pwrite(2) will be ignored.
This fix allows glusterfs to host a NetBSD ./build.sh -o build

- Do not use the FUSE access method, use getattr and check for
permission on our own. The problem is that a FUSE filesystem will
typically use the Linux-specific setfsuid() to perform access
control. If that is missing, any check is likely to occur on
behalf of the user running the filesystem (typically root), causing
access method to return wrong information.

- When possible, avoid performing a getattr method call and use
cached value in puffs_node instead. We still retrieve the latest
value by calling getattr when performing append write operation,
to minimize the chances that another writer appended since the
last time we did.

- Update puffs_node cached file size in write method

- Remove unused argument to perfuse_destroy_pn()
2011-04-25 04:54:53 +00:00
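The O_APPEND point above can be sketched as a flag filter applied before the open request is forwarded. The helper name open_flags_for_fuse is hypothetical and does not come from libperfuse.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: drop O_APPEND from the flags forwarded to the
 * filesystem. The write offset is always sent explicitly, and a
 * filesystem that opened its backing file with O_APPEND would ignore
 * the offsets it passes to pwrite(2).
 */
int
open_flags_for_fuse(int flags)
{
	return flags & ~O_APPEND;
}
```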
manu
f4f951a0c1 Remove code supporting SOCK_STREAM, as SOCK_DGRAM works fine 2010-10-11 05:37:58 +00:00
manu
f782f0a9e3 FUSE filesystems' readlink returns a resolved link with a trailing NUL
character, and PUFFS does not want it. This fixes a bug where the following
returned the stat information for x instead of reporting ENOENT:
mkdir x && ln x z && stat -x z/whatever/you/want
2010-10-11 01:52:05 +00:00
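A minimal sketch of the trimming described above, assuming the readlink reply arrives as a buffer and a length. The helper name link_len_without_nul is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the length of a readlink reply with any
 * trailing NUL bytes removed, so the link target can be handed on as a
 * counted string rather than a C string.
 */
size_t
link_len_without_nul(const char *buf, size_t len)
{
	while (len > 0 && buf[len - 1] == '\0')
		len--;
	return len;
}
```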
manu
5b646d774b - fix access control: pcn->pcn_cred is not user credentials
- Keep track of file generation
- remove size tracking in pnd_size, we have it in pn_va.va_size
2010-10-11 01:08:26 +00:00
manu
3a9497b97a - delete an obsolete comment about inactive
- remove a test for getattr return field that was never filled
- correctly send filehandle and filehandle flags for getattr
2010-10-04 03:56:24 +00:00
manu
2ff0ea03a7 - Correctly handle rename with overwritten destination
- Keep track of file name to avoid lookups when we can. This makes sure we
  do not have two cookies for the same inode, a situation that wreaks
  havoc when we come to remove or rename a node.
- Do not use PUFFS_FLAG_BUILDPATH at all, since we now track file names
- In open, queue requests after checking for access, as there is no merit
  in queueing a will-be-denied request when we can deny it immediately
- request reclaim of removed nodes at inactive stage
2010-10-03 05:46:47 +00:00
manu
f7174423c5 = Open files =
- Restore open on our own in fsync and readdir, as the node may not already
be open, and FUSE really wants it to be. No need to close immediately, it
can be done at inactive time.

= Write operations =
- fix a nasty bug that corrupted files on write (written added twice)
- Keep track of file size in order to honour PUFFS_IO_APPEND

= many fixes in rename =
- handle overwritten nodes correctly
- wait for all operations on the node to drain before doing rename, as
filesystems may not cope with operations on a moving file.
- setback PUFFS_SETBACK_INACT_N1 cannot be used from rename, we therefore
miss the inactive time for an overwritten node. This forces us to give up
PUFFS_KFLAG_IAONDEMAND.

= Removed files =
- forbid most operations on a removed node, return ENOENT
- setback PUFFS_SETBACK_NOREF_N1 at inactive stage to cause removed
file reclaim

= Misc =
- Update outdated ARGSUSED for lint
- Fix a memory leak (puffs_pn_remove instead of puffs_pn_put)
- Do not use PUFFS_FLAG_BUILDPATH except for debug output. It makes the
lookup code much simpler.
2010-09-29 08:01:10 +00:00
manu
bcf6f2f32a == file close operations ==
- use PUFFS_KFLAG_WTCACHE to puffs_init so that all writes are
immediately sent to the filesystem, and we no longer have writes
after inactive. As a consequence, we can close files at inactive
stage, and there is not any concern left with files opened at
create time. We also no longer have to open ourselves in readdir and
fsync.

- Fsync on close (inactive stage). That makes sure we will not need to
do these operations once the file is closed (FUSE wants an open file).
Short circuit the requests that come after the close, by not fsync'ing
closed files.

- Use PUFFS_KFLAG_IAONDEMAND to get less inactive calls

== Removed nodes ==
- more ENOENT returned for operations on removed node (but there
are probably some still missing): getattr, open, setattr, fsync

- set PND_REMOVED before sending the UNLINK/RMDIR operations so that we avoid
races during UNLINK completion. Also set PND_REMOVED on node we overwrite
in rename

== Filehandle fixes ==
- queue open operation to avoid getting two fh for one file

- set FH in getattr, if the file is open

- Just requires a read FH for fsyncdir, as we always opendir in read
mode. Ok, this is misleading :-)

== Misc ==
- do not set FUSE_FATTR_ATIME_NOW in setattr, as we provide the time

- short circuit nilpotent operations in setattr

- add a filename diagnostic flag to dump file names
2010-09-23 16:02:34 +00:00
manu
3d6861b56c - performance improvement for read, readdir and write. Now that we use
SOCK_DGRAM, we can send many pages at once without hitting any bug

- when creating a file, it is open for FUSE, but not for the kernel.
If the kernel does not do a subsequent open, we have a leak. We fight
against this by trying to close such files that the kernel left unopen
for some time.

- some code refactoring to make message exchange debug easier (more to come)
2010-09-20 07:00:21 +00:00
manu
e9a8a6acc0 - Use SOCK_DGRAM instead of SOCK_STREAM, as the filesystem seems to
assume datagram semantics: when using SOCK_STREAM, if perfused sends
frames faster than the filesystem consumes them, it will grab multiple
frames at once and discard anything beyond the first one. For now the
code can work both with SOCK_DGRAM and SOCK_STREAM, but SOCK_STREAM
support will probably have to be removed for the sake of readability.

- Remember to sync parent directories when moving a node

- In debug output, display the requeue type (readdir, write, etc...)
2010-09-15 01:51:43 +00:00
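The datagram semantics mentioned above can be observed with a local socket pair. The sketch below is illustrative only: first_recv_len is a hypothetical helper and the 5-byte frames stand in for perfused messages.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send two 5-byte frames back to back on a local
 * socket pair of the given type, then do a single recv(2) with a buffer
 * large enough for both. Returns the number of bytes that recv(2)
 * delivered, or -1 on error.
 *
 * With SOCK_DGRAM each recv(2) returns exactly one frame. With
 * SOCK_STREAM there are no message boundaries, so one recv(2) may
 * return both frames at once.
 */
ssize_t
first_recv_len(int socktype)
{
	int sv[2];
	char buf[64];
	ssize_t n;

	if (socketpair(AF_UNIX, socktype, 0, sv) == -1)
		return -1;
	if (send(sv[0], "AAAAA", 5, 0) != 5 ||
	    send(sv[0], "BBBBB", 5, 0) != 5)
		n = -1;
	else
		n = recv(sv[1], buf, sizeof(buf), 0);
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

A reader that assumes one frame per read behaves correctly on the datagram socket but can drop data on a stream socket when frames arrive faster than they are consumed.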
manu
17ce0ff611 - call FSYNCDIR for directories
- directories can be open R/W (for FSYNCDIR)
- do not skip calls to FSYNC or FSYNCDIR if the filesystem returned ENOSYS:
it may change its mind, and it may also actually do something when returning
ENOSYS
- When FSYNC and FSYNCDIR return ENOSYS, do not report it to kernel (silent
failure)
2010-09-09 09:12:35 +00:00
manu
fac2d0c060 Mode argument must contain the file type (S_* items) for create and mknod 2010-09-07 16:58:13 +00:00
manu
1e672db8d2 - Do not check for peer credentials when perfused is autostarted and
therefore runs with filesystem privileges

- shut up warnings and debug messages when perfused is autostarted

- make perfused path modifiable with CFLAGS for easier pkgsrc integration

- Fix build warnings
2010-09-07 02:11:04 +00:00
manu
5536686b23 More LP64 fixes 2010-09-06 01:40:24 +00:00
manu
6f8501feb8 build fixes for LP64 2010-09-06 01:17:05 +00:00
manu
374c4263ae - correctly set flags for CREATE
- after a node is deleted, some operations should return ENOENT, some
should be ignored. Fixed it for ACCESS, SETATTR and GETATTR. Other
operation may also need a fix.

- At reclaim time, there is no need to wait for READDIR and READ
completion, since the caller will never close a file before getting
readdir() and read() replies. Waiting for WRITE completion is still
mandatory, but we must ensure that no queued WRITE is awaiting to
be scheduled. Once the queue is drained, we must check that the
reclaim operation was not canceled by a new file LOOKUP.

- At reclaim time, fixed a mix up between read and write fh to close

- Fixed permission checks for RENAME (it tested the node itself
instead of the source)

- When setting file mode, only MKNOD needs the filetype (S_* fields).
It is probably a bug to set it for other operations.
2010-09-05 06:49:13 +00:00
manu
1eb23a5f2b Fix reference count bug introduced by previous commit 2010-09-03 14:32:50 +00:00
manu
28d5b6408e - Postpone file close at reclaim time, since NetBSD sends fsync and
setattr(mtime, ctime) after close, while FUSE expects the file
to be open for these operations

- remove unused argument to node_mk_common()

- remove requeued requests when they are executed, not when they
are tagged for schedule

- try to make filehandle management simpler, by keeping track of only
one read and one write filehandle (the latter being really read/write).

- when CREATE is not available, we use the MKNOD/OPEN path. Fix a
bug here where we opened the parent directory instead of the node:
add the missing lookup of the mknod'ed node.

- lookup file we just created: glusterfs does not really see them
otherwise.

- open file when doing setattr(mtime, ctime) on non open files, as
some filesystems seem to require it.

- Do not flush pagecache for removed nodes

- Keep track of read/write operations in progress, and at reclaim
time, make sure they are over before closing and forgetting the file.
2010-09-03 07:15:18 +00:00
manu
518513ec60 - only remove queued requests once they are executed, not when they
are set to be scheduled later
- remove an unused argument to make lint happy
2010-09-02 08:58:06 +00:00
manu
ef7e1877ae Build fixes for LP64 2010-09-01 14:57:24 +00:00
wiz
0584bf10ba Some fixes. Comment out ERRORS section until it has content. 2010-09-01 13:04:11 +00:00