While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.
Discussed on tech-kern, with lots of help from yamt (thanks!).
matter how empty they are.
Note that if two blocks have the same inode and block number, they sort
the same (this should never happen, but if it does there's no reason to
have qsort scramble the list).
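
A minimal sketch of such a comparator, assuming a simplified record (struct blk_ref and its field names are hypothetical, not the cleaner's actual structures). The point is only that identical keys return 0 rather than an arbitrary ordering:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the cleaner's per-block record. */
struct blk_ref {
	uint32_t ino;	/* inode number */
	int64_t  lbn;	/* logical block number within the file */
};

/* qsort(3) comparator: order by inode, then by block number. */
int
blk_ref_cmp(const void *a, const void *b)
{
	const struct blk_ref *x = a;
	const struct blk_ref *y = b;

	if (x->ino != y->ino)
		return (x->ino < y->ino) ? -1 : 1;
	if (x->lbn != y->lbn)
		return (x->lbn < y->lbn) ? -1 : 1;
	return 0;	/* same inode and block number: sort the same */
}
```

It would be passed to qsort as qsort(list, n, sizeof(list[0]), blk_ref_cmp).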
Add some diagnostic syslog messages for unusual cases.
In determining when to stop reading segments when counting bytes (-b flag),
total the sizes of the blocks we're actually writing instead of assuming
they are all full blocks: many could be fragments or inode blocks. This
increases the number of segments per Ifile write, markedly improving the
efficiency of the cleaner in the small file case.
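
The accounting change can be sketched roughly as follows, assuming each block to be written carries its own size in bytes (struct blk_desc and bytes_to_write are hypothetical names used only for illustration):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: one entry per block the cleaner will write. */
struct blk_desc {
	uint32_t size;	/* actual bytes: full block, fragment, or inode block */
};

/*
 * Total the real sizes instead of assuming nblks * (full block size),
 * which overestimates whenever fragments or inode blocks are present.
 */
uint64_t
bytes_to_write(const struct blk_desc *blks, size_t nblks)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < nblks; i++)
		total += blks[i].size;
	return total;
}
```

Comparing this total against the -b limit lets more segments be read before the limit is reached when many of the blocks are small.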
indirect block when considering the cleaning of block numbers less
than NDADDR (which do not use indirect blocks).
Also, note the loss of only half a block per segment to fragmentation
when considering the benefit function, rather than a whole block.
cleaner, but with more legible code.
Includes code for reading and writing to the raw disk device (so that an
unmounted fs could be cleaned), for the use of a single daemon to clean
multiple filesystems to save on resources, and for recording the old
contents of cleaned segments to offline storage for regression testing of
the LFS system as a whole, though these new features are not properly
tested at this point.
* Adapt lfs_cleanerd to use the fcntl call to get the Ifile filehandle,
so the Ifile need not be in the namespace.
* Make lfs_cleanerd be more careful when there are very few available
segments.
* Remove the Ifile from the filesystem namespace. The cleaner now uses
an fcntl call on the root inode to find the Ifile filehandle.
* Make lfs_cleanerd less verbose when the filesystem is unmounted.
64-bit block pointers, extended attribute storage, and a few
other things.
This commit does not yet include the code to manipulate the extended
storage (e.g. for ACLs); this will be done later.
Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
(there are still some details to work out) but expect that to go
away soon. To support these basic changes (creation of lfs_putpages,
lfs_gop_write, mods to lfs_balloc) several other changes were made, to
wit:
* Create a writer daemon kernel thread whose purpose is to handle page
writes for the pagedaemon, but which also takes over some of the
functions of lfs_check(). This thread is started the first time an
LFS is mounted.
* Add a "flags" parameter to GOP_SIZE. Current values are
GOP_SIZE_READ, meaning that the call should return the size of the
in-core version of the file, and GOP_SIZE_WRITE, meaning that it
should return the on-disk size. One of GOP_SIZE_READ or
GOP_SIZE_WRITE must be specified.
* Instead of using malloc(...M_WAITOK) for everything, reserve enough
resources to get by and use malloc(...M_NOWAIT), using the reserves if
necessary. Use the pool subsystem for structures small enough that
this is feasible. This also obsoletes LFS_THROTTLE.
And a few that are not strictly necessary:
* Move the LFS inode extensions off onto a separately allocated
structure; getting closer to LFS as an LKM. "Welcome to 1.6O."
* Unified GOP_ALLOC between FFS and LFS.
* Update LFS copyright headers to correct values.
* Actually cast to unsigned in lfs_shellsort, like the comment says.
* Keep track of which segments were empty before the previous
checkpoint; any segments that pass two checkpoints both dirty and
empty can be summarily cleaned. Do this. Right now lfs_segclean
still works, but this should be turned into an effectless
compatibility syscall.
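
The two-checkpoint rule in the last item can be sketched as a small per-segment state update. This is an illustration only: the flag names, struct seg_state, and checkpoint_segment are hypothetical and are not the kernel's actual definitions.

```c
#include <stdint.h>

#define SEG_DIRTY	0x01	/* hypothetical: segment holds written data */
#define SEG_WAS_EMPTY	0x02	/* hypothetical: empty at previous checkpoint */

struct seg_state {
	uint32_t flags;
	uint32_t live_bytes;	/* bytes still referenced in this segment */
};

/*
 * Called once per segment at each checkpoint.  Returns 1 if the
 * segment was dirty and empty at two consecutive checkpoints and
 * has therefore been marked clean, 0 otherwise.
 */
int
checkpoint_segment(struct seg_state *s)
{
	if (!(s->flags & SEG_DIRTY))
		return 0;			/* already clean */
	if (s->live_bytes != 0) {
		s->flags &= ~SEG_WAS_EMPTY;	/* not empty: reset history */
		return 0;
	}
	if (s->flags & SEG_WAS_EMPTY) {
		s->flags &= ~(SEG_DIRTY | SEG_WAS_EMPTY);
		return 1;			/* second empty checkpoint */
	}
	s->flags |= SEG_WAS_EMPTY;		/* first empty checkpoint */
	return 0;
}
```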
failures as well as successes when a run of clean_all_inodes completes.
Explicitly cast to off_t in get_dinode and get_rawblock, to make sure we
read the right block.
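
The kind of cast involved can be illustrated as below. This is a sketch assuming a 32-bit disk address, a 32-bit unit size, and a 64-bit off_t; daddr_to_offset is a hypothetical helper, not the cleaner's actual code:

```c
#include <stdint.h>
#include <sys/types.h>

/*
 * Convert a disk address to a byte offset.  Casting to off_t before
 * multiplying makes the arithmetic 64-bit (assuming a 64-bit off_t);
 * without the cast the product is computed in 32 bits and overflows
 * for offsets at or beyond 2 GB, so the wrong block would be read.
 */
off_t
daddr_to_offset(int32_t daddr, int32_t bytes_per_unit)
{
	return (off_t)daddr * bytes_per_unit;
}
```

The result is suitable as the offset argument to lseek(2) or pread(2).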
a potential problem with cleaning fragments at all.
Better sanity checks when selecting files to coalesce; in particular don't
shift too far left when comparing the number of discontinuities to the log2
of the number of total blocks.
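
One way to guard such a comparison is sketched below. It assumes the test is "number of discontinuities exceeds log2 of the block count", expressed as a shift; the function name and the exact criterion are assumptions, not the committed code:

```c
#include <limits.h>
#include <stdint.h>

/*
 * Return nonzero if ndisc > log2(nblocks), computed as
 * 2^ndisc > nblocks.  Shifting by the full width of the type or
 * more is undefined in C, so that case is handled separately:
 * 2^ndisc would exceed any representable block count anyway.
 */
int
too_fragmented(unsigned int ndisc, uint64_t nblocks)
{
	if (ndisc >= sizeof(nblocks) * CHAR_BIT)
		return 1;
	return ((uint64_t)1 << ndisc) > nblocks;
}
```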
Better log messages: note beginning of coalescing correctly; also take
the log message from add_segment out of "if (debug)" for symmetry with the
"finished segment" message.
Use lfs_bmapv to find the inode, rather than looking it up manually in
the ifile; this should give more up-to-date information, since trolling
through every inode in the fs could take some time.