- in replacement malloc(), if sbrk(2) returns (void *)-1, convert to NULL
before returning
- in replacement calloc(), check return value of malloc() before zeroing result
Kernels and tools understand both v1 and v2 filesystems; newfs_lfs
generates v2 by default. Changes for the v2 layout include:
- Segments of non-PO2 size and arbitrary block offset, so these can be
matched to convenient physical characteristics of the partition (e.g.,
stripe or track size and offset).
- Address by fragment instead of by disk sector, paving the way for
non-512-byte-sector devices. In theory fragments can be as large
as you like, though in reality they must be smaller than MAXBSIZE in size.
- Use serial number and filesystem identifier to ensure that roll-forward
doesn't get old data and think it's new. Roll-forward is enabled for
v2 filesystems, though not for v1 filesystems by default.
- The inode free list is now a tailq, paving the way for undelete (undelete
is not yet implemented, but can be without further non-backwards-compatible
changes to disk structures).
- Inode atime information is kept in the Ifile, instead of on the inode;
that is, the inode is never written *just* because atime was changed.
Because of this the inodes remain near the file data on the disk, rather
than wandering all over as the disk is read repeatedly. This speeds up
repeated reads by a small but noticeable amount.
Other changes of note include:
- The ifile written by newfs_lfs can now be of arbitrary length, it is no
longer restricted to a single indirect block.
- Fixed an old bug where ctime was changed every time a vnode was created.
I need to look more closely to make sure that the times are only updated
during write(2) and friends, not after-the-fact during a segment write,
and certainly not by the cleaner.
same configuration format that -c and -C use.
this is useful if you're using autoconfig and you've misplaced the
/etc/raidXXX.conf files
* "filesystem" -> "file system", and other man page cleanups.
Some hosts and gateways ignore record route, but not "many." Of course,
more are firewalled. But that's not what was meant here.
Expand flood-pinging admonition to include multicast addresses.
Note flags that conflict with ping under Solaris and FreeBSD.
Reorder BUGS in rough order of significance.
Currently, only Aironet ("an") driver/card can be used.
nwkey persist (IEEE 802.11 devices only) Enable WEP encryption for IEEE
802.11-based wireless network interfaces with the persis-
tent key written in the network card.
nwkey persist:key
(IEEE 802.11 devices only) Write the key to the persis-
tent memory of the network card, and enable WEP encryp-
tion for IEEE 802.11-based wireless network interfaces
with the key.
(force) is given. fsck(8) will return with a zero exit status if "fsck -p"
is used in this circumstance, but all other invocations (e.g, "fsck",
"fsck /filesystem", "fsck -p /filesystem") will return with a non-zero exit
status in this circumstance.
Per discussions with various people including Bill Sommerfeld.
- Use "file system" instead of "filesystem"
a little used server daemon which can be controlled with rc.conf in any case.
(xxx: list of files probably should be totally configurable, but that's
another story). from [bin/13061] by matthew green.
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pagable kernel memory instead of mbufs.
Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva
All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.
This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.
network interfaces. This works by pre-computing the pseudo-header
checksum and caching it, delaying the actual checksum to ip_output()
if the hardware cannot perform the sum for us. In-bound checksums
can either be fully-checked by hardware, or summed up for final
verification by software. This method was modeled after how this
is done in FreeBSD, although the code is significantly different in
most places.
We don't delay checksums for IPv6/TCP, but we do take advantage of the
cached pseudo-header checksum.
Note: hardware-assisted checksumming defaults to "off". It is
enabled with ifconfig(8). See the manual page for details.
Implement hardware-assisted checksumming on the DP83820 Gigabit Ethernet,
3c90xB/3c90xC 10/100 Ethernet, and Alteon Tigon/Tigon2 Gigabit Ethernet.
- if it's a path to an unmounted file-system listed in /etc/fstab, use
that instead of assuming the user wanted a subtree dump of the parent
directory. this restores the behaviour of dump before the subtree
dumping code went in.
- if it's a path to a mounted file-system which is not in /etc/fstab,
use the info from getmntinfo(3). previously, dump would choke.
* implement error checked malloc(), calloc(), strdup(), and use
appropriately (some of the calloc()s weren't being checked)
* use 'file-system' instead of 'filesystem' in the man page
- add a function to print only one partition's info.
- print the partition information if it was modified in interactive mode.
- improve on the chaining code. [still assumes that partition offsets increase
monotonically]. We could check for overlap too.
the offset of an extended sub-partition is the offset of the top-level
extended partition, not the partition before it (this is annoying, and
makes `clean' recursive mbr descent difficult). fixes PRs 11829 and 12677.
otherwise an unaligned address gets passed to the linker. (which is
rounded there, so this is harmless)
XXX how about passing "-N" and killing all these hacks?
The most significant [fix] involves so called "remote" interfaces
configured in the kludge file to with what appear to be colliding
networks. Edward Mascarenhas <eddiem@vihar.engr.sgi.com> found
the problem and the fix, and I think has tested it in the SGI
network.
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-setable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
- clean up WARNS=2 problems
- implement getshort()
- use getshort() with MBR_MAGICOFF to test if the magic number is OK, rather
than using hard-coded magic numbers
normal operation (/var can get filled up by flodding bogus packets).
sysctl net.inet6.icmp6.nd6_debug will turn on diagnostic messages.
(#define ND6_DEBUG will turn it on by default)
improve stats in ND6 code.
lots of synchronziation with kame (including comments and cometic ones).
let static routes overwrite cloned routes, as cloned routes can come back again
if necessary. behavior same as freebsd/bsdi, code partially from bsdi42.
(NRL rt->rt_parent was not added)
should fix PR 11916 and maybe some other PRs with ARP behavior.
recompilation of usr.sbin/route6d is suggested.
XXX route show is total duplicate of netstat -r, we need to either remove
route show, or share the same source code, otherwise maintenance cost
bites (and is biting) us
the current in-core master superblock, and fix them up if
they're incorrect. Move the code that writes the alternate
superblocks if (cvtlevel || doswap) into pass 5 for efficiency.
Reviewd by Charles Hannum, and used by me to fix up a curdled
file system.
'int compress' in savecore.c and the function 'compress' in libz.
Gnu ld 2.10 (with BFD 2.10) used on sparc64 warns this conflict
(symbol "compress" changed size).
Some years ago I made it O(n^2).
Someone helpfully made it O(n^4) again.
Today I'm making it O(n).
If that's not good enough, I don't know what else to do. B-)
Technical details:
* The graph traversal in propagate() is modified to be able to start from any
point in the tree. To handle certain exceptional cases, it is also modified
to work in two passes, marking the tree with a special tag and then changing
it to DFOUND.
* The reconnect case now modifies the child/sibling pointers and calls
propagate() to propagate the connection state starting with the reconnected
directory.
Pray that you never encounter a file system trashed enough for this to matter.
too damn small) by setting a minimum (1024) and maximum (maxino + 1). This
prevents certain operations getting REALLY slow when -b is used, and also
avoids overallocating memory if the superblock is hosed.
already. So, don't fail if there appears to be a corrupt label or
no 'fake' label; get the 'default' label (which is generated
from DIOCGDEFLABEL) instead.
this fixes a problem where elf_mod_sizes() would report size which would
be different (smaller) that the actual size of LKM code to be loaded in some
cases
Reviewed by: Johan Danielsson
-w write in-core label if changed
-r update on-disk as well as in-core label (with -w)
-f force update (-w), even if there's been no change
-r behaviour suggested by matt green. what used to be `-f' is now `-wrf'
gotten bitten by mbrlabel trashing my incore disklabel to a point where
the machine wasn't usable, so I reworked it:
* only update the incore (and on-disk) label if `-f' is given. by default,
the proposed disklabel will be printed but no changes will occur
* add -q, to make the default operation a bit more quiet.
* leave existing `used' in-core partition slots alone, and only add entries
to the incore label if:
- there's not an existing partition of the same size and offset
(even of a different type)
- there's a free partition slot (`unused', with size == 0)
* use DIOCWDINFO instead of DIOCSDINFO, to update the incore as well as
the on-disk label
* use showpartitions() from ../disklabel/printlabel.c
this should make mbrlabel a *lot* more useful.
__CONCAT("foo","bar");
actually works to concantate strings, it's because the preprocessor expands
it into "foo""bar" as separate strings, and then ANSI string concatenation
is performed on that. It's more straightforward to just use ANSI string
concatenation directly, and newer GCCs complain (rightly) about misuse
of token pasting.
effectively an MBR with it's own partition table which contains another
4 `slots', each of which can be another extended partition...
This involved reworking some of the internal functions.
* Use off_t appropriately (so we can manipulate sectors past 4GB).
* Tweak to compile with WARNS=2
authentication for Phase 1 per dratf-ietf-ipsec-isakmp-gss-auth-06.txt.
This work was done by Frank van der Linden of Wasabi Systems, Inc.
under contract from Zembu Labs, Inc.
If sysctl supports it, try to get the kernel name with CPU_BOOTED_KERNEL.
Get current kernels version string in all cases.
Adapt some error messages to the correct kernel name.
Reviewed by Simon Burge.
bandwidth and seek time of the disk, using the "4 * bandwidth * seek
time" formula from Neefe-Matthews' 1997 paper. An RZ25 disk with this
option gets 200K segments. Reference the paper in the manual page.
Don't try to subtract the address of "acg.cg_firstfield" from
"acg.cg_nextfreeoff", as it's already relative to the start of "&acg".
This always worked because the result of the subtraction was
always negative, thus could never be > "sblock.fs_cgsize" ...
Let lfs_cleanerd record its pid in /var/run like other daemons. Make
mount_lfs not start another cleaner when updating the mount, unless it is
being upgraded from read-only to read-write; when downgrading to read-only,
kill the cleaner using the recorded pids.
created and used. Some ramdisks compile individual mount_* commands
directly and hence need the obj dirs setup in order to allow shared
source tree.
Noted in private mail by Andrew Brown <atatat@atatdot.net>.
- keep the case consistent between the actual name and what's referenced.
e.g, if it's `foo', don't use '.Nm Foo' at the start of a sentence.
- remove unnecessary `.Nm foo' after the first occurrence (except for
using `.Nm ""' if there's stuff following, or for the 2nd and so on
occurrences in a SYNOPSIS
- use Sx, Ic, Li, Em, Sq, and Xr as appropriate
- don't warn. It's just too verbose when we know there is
no disklabel and want to use the default filesystem type.
- close the file descriptor so that further mount success.
mount_mfs(8); the mount_*(8) are hardlinked to mount (appropriate mount routine
is called depending on program name) - this saves approx. 1.7MB of /sbin
space
mount.c: make all local symbols static
* make static all symbols which do not need to be exported
* rename main() to mount_FOO()
* new main() now just calls mount_FOO(), main() is only compiled in if
MOUNT_NOMAIN is not defined
* a_gid(), a_uid() and a_mask() were put into ../mount/fattr.[ch], local
versions removed
"component1" and "raidctl -A yes"
- add a note about how to build a RAID set with a limited number of disks
(thanks to Simon Burge for suggestions)
- improve layout of 'raidctl -i' discussion (thanks to Hubert Feyrer)
- add a (small) section on Performance Tuning
for the disklabel if the given device fails with EBUSY. Also make disklabel
errors non fatal (just fall back to ffs as per pre-autofilesystem behaviour)
Based on further discussion with Launey Thomas <ljt@alum.mit.edu>
you need KAME tree to compile this (point the top by ${KAMEROOT}
in Makefile.inc).
XXX maybe too big for /sbin...869K with certificate support, and 574K without
certificate support (i386, stripped, static-link)
Also, change the segment size report to include the total size of the disk,
similar to newfs, e.g.
newfs_lfs -N -F -B 65536 /dev/rsd0b
272.7MB in 4363 segments of size 65536
super-block backups (for fsck -b #) at:
16, 55824, 111632, 167440, 223248, 279056, 334864, 390672, 446480, 502288
Kernel:
* Add runtime quantity lfs_ravail, the number of disk-blocks reserved
for writing. Writes to the filesystem first reserve a maximum amount
of blocks before their write is allowed to proceed; after the blocks
are allocated the reserved total is reduced by a corresponding amount.
If the lfs_reserve function cannot immediately reserve the requested
number of blocks, the inode is unlocked, and the thread sleeps until
the cleaner has made enough space available for the blocks to be
reserved. In this way large files can be written to the filesystem
(or, smaller files can be written to a nearly-full but thoroughly
clean filesystem) and the cleaner can still function properly.
* Remove explicit switching on dlfs_minfreeseg from the kernel code; it
is now merely a fs-creation parameter used to compute dlfs_avail and
dlfs_bfree (and used by fsck_lfs(8) to check their accuracy). Its
former role is better assumed by a properly computed dlfs_avail.
* Bounds-check inode numbers submitted through lfs_bmapv and lfs_markv.
This prevents a panic, but, if the cleaner is feeding the filesystem
the wrong data, you are still in a world of hurt.
* Cleanup: remove explicit references of DEV_BSIZE in favor of
btodb()/dbtob().
lfs_cleanerd:
* Make -n mean "send N segments' blocks through a single call to
lfs_markv". Previously it had meant "clean N segments though N calls
to lfs_markv, before looking again to see if more need to be cleaned".
The new behavior gives better packing of direct data on disk with as
little metadata as possible, largely alleviating the problem that the
cleaner can consume more disk through inefficient use of metadata than
it frees by moving dirty data away from clean "holes" to produce
entirely clean segments.
* Make -b mean "read as many segments as necessary to write N segments
of dirty data back to disk", rather than its former meaning of "read
as many segments as necessary to free N segments worth of space". The
new meaning, combined with the new -n behavior described above,
further aids in cleaning storage efficiency as entire segments can be
written at once, using as few blocks as possible for segment summaries
and inode blocks.
* Make the cleaner take note of segments which could not be cleaned due
to error, and not attempt to clean them until they are entirely free
of dirty blocks. This prevents the case in which a cleanerd running
with -n 1 and without -b (formerly the default) would spin trying
repeatedly to clean a corrupt segment, while the remaining space
filled and deadlocked the filesystem.
* Update the lfs_cleanerd manual page to describe all the options,
including the changes mentioned here (in particular, the -b and -n
flags were previously undocumented).
fsck_lfs:
* Check, and optionally fix, lfs_avail (to an exact figure) and
lfs_bfree (within a margin of error) in pass 5.
newfs_lfs:
* Reduce the default dlfs_minfreeseg to 1/20 of the total segments.
* Add a warning if the sgs disklabel field is 16 (the default for FFS'
cpg, but not usually desirable for LFS' sgs: 5--8 is a better range).
* Change the calculation of lfs_avail and lfs_bfree, corresponding to
the kernel changes mentioned above.
mount_lfs:
* Add -N and -b options to pass corresponding -n and -b options to
lfs_cleanerd.
* Default to calling lfs_cleanerd with "-b -n 4".
[All of these changes were largely tested in the 1.5 branch, with the
idea that they (along with previous un-pulled-up work) could be applied
to the branch while it was still in ALPHA2; however my test system has
experienced corruption on another filesystem (/dev/console has gone
missing :^), and, while I believe this unrelated to the LFS changes, I
cannot with good conscience request that the changes be pulled up.]
searchs (amongst others) are case insensitive.
* in interactive mode (-i), when editing entries display supported disk types
and filesystem types when given `?' (when ``[?]'' appears in the prompt
this feature is supported for the question).
* support `m' as a suffix equivalent to `M'
* in interactive mode, be a bit more sensible about handling errors and EOF
* implement dumpnames(), which takes a char ** and size, and displays
as per ls -F (sorted, listed vertically) but indented by one tab
* don't assume d_typename and d_packname are NUL terminated
* fix up some comments and some warning messages (bad cut & pastos :)
* deprecate deffstypename() and getfstypename()
* be consistent when using sizeof()