cylinder groups to work correctly, with minor modifications by me to work
with our FFS_EI code. From the FreeBSD commit message:
The ffs superblock includes a 128-byte region for use by temporary
in-core pointers to summary information. An array in this region
(fs_csp) could overflow on filesystems with a very large number of
cylinder groups (~16000 on i386 with 8k blocks). When this happens,
other fields in the superblock get corrupted, and fsck refuses to
check the filesystem.
Solve this problem by replacing the fs_csp array in 'struct fs'
with a single pointer, and add padding to keep the length of the
128-byte region fixed. Update the kernel and userland utilities
to use just this single pointer.
With this change, the kernel no longer makes use of the superblock
fields 'fs_csshift' and 'fs_csmask'. Add a comment to newfs/mkfs.c
to indicate that these fields must be calculated for compatibility
with older kernels.
Reviewed by: mckusick
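
A minimal sketch of the layout idea described above: a fixed 128-byte
region that once held an array of pointers now holds a single pointer plus
padding, so the region keeps its size. The struct and field names are
hypothetical and the layout is simplified; the real 'struct fs' is not
reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the in-core pointer region, as stated in the commit message. */
#define SUMMARY_REGION_SIZE 128

struct csum_sketch;	/* hypothetical stand-in for the summary type */

/* Old shape (simplified): the region is filled with an array of pointers. */
struct old_region_sketch {
	struct csum_sketch *csp[SUMMARY_REGION_SIZE / sizeof(void *)];
};

/* New shape (simplified): one pointer, then padding to keep 128 bytes. */
struct new_region_sketch {
	struct csum_sketch *csp;
	unsigned char pad[SUMMARY_REGION_SIZE - sizeof(void *)];
};

/* The on-disk layout must not move, so check the size at compile time. */
static_assert(sizeof(struct new_region_sketch) == SUMMARY_REGION_SIZE,
    "summary pointer region must stay 128 bytes");

/* Number of pointer slots the old array shape offers on this platform. */
size_t
old_region_slots(void)
{
	return sizeof(((struct old_region_sketch *)0)->csp) / sizeof(void *);
}
```

With a single pointer the summary area can be allocated as one block of
whatever size the file system needs, so no fixed slot count is left to
overflow.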
- replace the unused fs_headswitch and fs_trkseek with fs_id[2], bringing
our struct fs closer to that in freebsd & openbsd (& solaris FWIW)
- dumpfs: improve warning message when cpc == 0
- fix round-off errors when determining the number of inodes per group,
which often resulted in the total number of inodes in the file system
being less than what the density asked for.
now you might get more inodes than requested for a given density,
rather than less.
- if the new inodes/group is <= 0, ensure that it's at least 1, preventing
a possible division by zero or other wacky problems
- use long long instead of quad_t
- reorder "special" validation to after option parsing
- use getopt(3) instead of homegrown code
- add getnum() to parse and validate a number
- clean up man page
- ansi KNF, WARNS=2
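
The rounding fix and the "at least 1" guard in the list above can be
sketched as follows. The helper name and its parameters are hypothetical;
this is not the code from newfs, only an illustration of rounding up
instead of truncating.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of inodes for a cylinder group of
 * group_bytes bytes at one inode per `density' bytes.  Rounds up, so the
 * result may exceed what the density asks for but is never below it.
 */
long long
inodes_per_group_sketch(long long group_bytes, long long density)
{
	long long ipg;

	assert(density > 0 && group_bytes >= 0);
	ipg = (group_bytes + density - 1) / density;	/* round up */
	if (ipg <= 0)
		ipg = 1;	/* avoid a later division by zero */
	return ipg;
}
```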
determine the endianness of the `struct fs *o' superblock from o->fs_magic
and set needswap as necessary, rather than trusting the caller to get
it right. invariably, almost every caller of ffs_sb_swap() was calling it
with ns set to the wrong value anyway!
ansi KNF ffs_bswap.c declarations whilst here.
this fixes all sorts of problems when trying to use other-endian file systems,
notably the kernel trying to access memory *way* off, possibly corrupting or
panicking, and userland programs SEGVing and/or corrupting things (e.g.,
"fsck_ffs -B" to swap a file system endianness).
whilst the previous rev of ffs_bswap.c (1.10, 2000/12/23) made this problem
worse, i suspect that the problem was always there and previous versions
just happened not to trash things at the wrong time.
FFS_EI should now be a lot more stable.
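
A sketch of deriving the swap flag from the magic number rather than
trusting the caller. The function names are hypothetical, and the magic
constant is assumed to be the classic FFS value; this is not the code in
ffs_bswap.c.

```c
#include <assert.h>
#include <stdint.h>

#define FS_MAGIC_SKETCH	0x011954u	/* assumed FFS magic value */

/* Reverse the byte order of a 32-bit value. */
uint32_t
swap32_sketch(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	    ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Decide from the magic as read from disk whether the superblock is in
 * the other byte order.  Returns 0 (native), 1 (needs swapping) or -1
 * (neither byte order gives a valid magic).
 */
int
detect_needswap_sketch(uint32_t magic_as_read)
{
	if (magic_as_read == FS_MAGIC_SKETCH)
		return 0;
	if (swap32_sketch(magic_as_read) == FS_MAGIC_SKETCH)
		return 1;
	return -1;
}
```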
and exit.
Previously, combinations would produce unintended results, such as
deleting the primary IP on an interface, instead of deleting a specified
alias.
safe (since there's two separate mallocs using sbrk(2) in that case)
XXX: local malloc provided for mfs memory store allocation; need to
investigate if system (phk) malloc can be used instead.
disklabel is created as per mfs on "swap".
* add -Z option: pre-zero the -F image file before use. this is necessary if
the image is to be used with vnd(4) because by default the files created
with -F have "holes" and vnd doesn't cope with that.
* support 'k', 'm', 'g' suffixes for all options which take numeric arguments.
provide strsuftoi() which performs the parsing mechanism.
* improve man page description of various options
* replace "filesystem" with "file system"
* when displaying usage for mfs, only list mfs options
* minor KNF and WARNS=2 cleanups
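
A sketch of suffix parsing in the spirit of strsuftoi(). The function
below is hypothetical (the real strsuftoi() has a different interface),
and the multipliers are assumed to be binary multiples of 1024.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a non-negative decimal number with an optional k, m or g suffix.
 * Returns 0 and stores the result in *out, or -1 on any error.
 */
int
parse_suffixed_sketch(const char *arg, long long *out)
{
	char *end;
	long long val, mult = 1;

	assert(arg != NULL && out != NULL);
	errno = 0;
	val = strtoll(arg, &end, 10);
	if (end == arg || errno == ERANGE || val < 0)
		return -1;
	switch (*end) {
	case 'k': case 'K': mult = 1024LL; end++; break;
	case 'm': case 'M': mult = 1024LL * 1024; end++; break;
	case 'g': case 'G': mult = 1024LL * 1024 * 1024; end++; break;
	default: break;
	}
	if (*end != '\0')	/* trailing junk or unknown suffix */
		return -1;
	if (val > LLONG_MAX / mult)	/* multiplication would overflow */
		return -1;
	*out = val * mult;
	return 0;
}
```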
- in replacement malloc(), if sbrk(2) returns (void *)-1, convert to NULL
before returning
- in replacement calloc(), check return value of malloc() before zeroing result
Kernels and tools understand both v1 and v2 filesystems; newfs_lfs
generates v2 by default. Changes for the v2 layout include:
- Segments of non-PO2 size and arbitrary block offset, so these can be
matched to convenient physical characteristics of the partition (e.g.,
stripe or track size and offset).
- Address by fragment instead of by disk sector, paving the way for
non-512-byte-sector devices. In theory fragments can be as large
as you like, though in reality they must be smaller than MAXBSIZE in size.
- Use serial number and filesystem identifier to ensure that roll-forward
doesn't get old data and think it's new. Roll-forward is enabled for
v2 filesystems, though not for v1 filesystems by default.
- The inode free list is now a tailq, paving the way for undelete (undelete
is not yet implemented, but can be without further non-backwards-compatible
changes to disk structures).
- Inode atime information is kept in the Ifile, instead of on the inode;
that is, the inode is never written *just* because atime was changed.
Because of this the inodes remain near the file data on the disk, rather
than wandering all over as the disk is read repeatedly. This speeds up
repeated reads by a small but noticeable amount.
Other changes of note include:
- The ifile written by newfs_lfs can now be of arbitrary length; it is no
longer restricted to a single indirect block.
- Fixed an old bug where ctime was changed every time a vnode was created.
I need to look more closely to make sure that the times are only updated
during write(2) and friends, not after-the-fact during a segment write,
and certainly not by the cleaner.
same configuration format that -c and -C use.
this is useful if you're using autoconfig and you've misplaced the
/etc/raidXXX.conf files
* "filesystem" -> "file system", and other man page cleanups.
Some hosts and gateways ignore record route, but not "many." Of course,
more are firewalled. But that's not what was meant here.
Expand flood-pinging admonition to include multicast addresses.
Note flags that conflict with ping under Solaris and FreeBSD.
Reorder BUGS in rough order of significance.
Currently, only Aironet ("an") driver/card can be used.
nwkey persist (IEEE 802.11 devices only) Enable WEP encryption for IEEE
802.11-based wireless network interfaces with the persis-
tent key written in the network card.
nwkey persist:key
(IEEE 802.11 devices only) Write the key to the persis-
tent memory of the network card, and enable WEP encryp-
tion for IEEE 802.11-based wireless network interfaces
with the key.
(force) is given. fsck(8) will return with a zero exit status if "fsck -p"
is used in this circumstance, but all other invocations (e.g., "fsck",
"fsck /filesystem", "fsck -p /filesystem") will return with a non-zero exit
status in this circumstance.
Per discussions with various people including Bill Sommerfeld.
- Use "file system" instead of "filesystem"
a little used server daemon which can be controlled with rc.conf in any case.
(xxx: list of files probably should be totally configurable, but that's
another story). from [bin/13061] by matthew green.
for FreeBSD project. Besides a huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.
Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva
All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.
This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.
network interfaces. This works by pre-computing the pseudo-header
checksum and caching it, delaying the actual checksum to ip_output()
if the hardware cannot perform the sum for us. In-bound checksums
can either be fully-checked by hardware, or summed up for final
verification by software. This method was modeled after how this
is done in FreeBSD, although the code is significantly different in
most places.
We don't delay checksums for IPv6/TCP, but we do take advantage of the
cached pseudo-header checksum.
Note: hardware-assisted checksumming defaults to "off". It is
enabled with ifconfig(8). See the manual page for details.
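
A userland sketch of the two-stage idea described above: the
pseudo-header part of the checksum is summed once and cached, and the sum
over the segment is added later. All function names are hypothetical and
none of this is the kernel code; it only shows that the split gives the
same result as summing everything at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add big-endian 16-bit words of buf to a running one's-complement sum. */
uint32_t
sum_words_sketch(uint32_t sum, const uint8_t *buf, size_t len)
{
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd trailing byte, zero padded */
		sum += (uint32_t)buf[i] << 8;
	return sum;
}

/* Fold the carries back in until the sum fits in 16 bits. */
uint16_t
fold_sketch(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Stage 1: partial sum over the IPv4 pseudo-header; this can be cached. */
uint16_t
pseudo_sum_sketch(const uint8_t src[4], const uint8_t dst[4],
    uint8_t proto, uint16_t len)
{
	uint32_t sum = 0;

	sum = sum_words_sketch(sum, src, 4);
	sum = sum_words_sketch(sum, dst, 4);
	sum += proto;
	sum += len;
	return fold_sketch(sum);
}

/* Stage 2: add the segment to the cached sum and complement the result. */
uint16_t
finish_sum_sketch(uint16_t cached, const uint8_t *seg, size_t len)
{
	return (uint16_t)~fold_sketch(sum_words_sketch(cached, seg, len));
}
```

If the hardware can sum the segment itself, only stage 1 has to run in
software; otherwise stage 2 is the part that would be delayed.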
Implement hardware-assisted checksumming on the DP83820 Gigabit Ethernet,
3c90xB/3c90xC 10/100 Ethernet, and Alteon Tigon/Tigon2 Gigabit Ethernet.
- if it's a path to an unmounted file-system listed in /etc/fstab, use
that instead of assuming the user wanted a subtree dump of the parent
directory. this restores the behaviour of dump before the subtree
dumping code went in.
- if it's a path to a mounted file-system which is not in /etc/fstab,
use the info from getmntinfo(3). previously, dump would choke.
* implement error checked malloc(), calloc(), strdup(), and use
appropriately (some of the calloc()s weren't being checked)
* use 'file-system' instead of 'filesystem' in the man page
- add a function to print only one partition's info.
- print the partition information if it was modified in interactive mode.
- improve on the chaining code. [still assumes that partition offsets increase
monotonically]. We could check for overlap too.
the offset of an extended sub-partition is the offset of the top-level
extended partition, not the partition before it (this is annoying, and
makes `clean' recursive mbr descent difficult). fixes PRs 11829 and 12677.
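
The offset rule above can be sketched with two hypothetical helpers. They
assume the usual MBR convention: the link to the next extended
sub-partition is relative to the top-level extended partition, while a
logical partition's start is relative to the table that describes it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Absolute sector of the next partition table in the chain.  ext_base is
 * the start of the top-level extended partition, link_start the start
 * field of the link entry.
 */
uint32_t
next_table_sector_sketch(uint32_t ext_base, uint32_t link_start)
{
	assert(link_start <= UINT32_MAX - ext_base);
	return ext_base + link_start;
}

/*
 * Absolute sector of a logical partition.  this_table is the sector of
 * the table that holds the entry, part_start the entry's start field.
 */
uint32_t
logical_sector_sketch(uint32_t this_table, uint32_t part_start)
{
	assert(part_start <= UINT32_MAX - this_table);
	return this_table + part_start;
}
```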
otherwise an unaligned address gets passed to the linker. (which is
rounded there, so this is harmless)
XXX how about passing "-N" and killing all these hacks?
The most significant [fix] involves so called "remote" interfaces
configured in the kludge file with what appear to be colliding
networks. Edward Mascarenhas <eddiem@vihar.engr.sgi.com> found
the problem and the fix, and I think has tested it in the SGI
network.
each of the basic types (anonymous data, executable image, cached files)
and prevent the pagedaemon from reusing a given page if that would reduce
the count of that type of page below a sysctl-settable minimum threshold.
the thresholds are controlled via three new sysctl tunables:
vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the
percentages of pageable memory reserved for each usage, and we do not allow
the sum of the minimums to be more than 95% so that there's always some
memory that can be reused.
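
A sketch of the two checks described above, with hypothetical helper
names. The exact comparisons used by the pagedaemon are not shown in this
message, so the arithmetic below is an assumption.

```c
#include <assert.h>

/* Accept new minimums only if their sum leaves at least 5% reusable. */
int
mins_valid_sketch(int anonmin, int vnodemin, int vtextmin)
{
	if (anonmin < 0 || vnodemin < 0 || vtextmin < 0)
		return 0;
	return anonmin + vnodemin + vtextmin <= 95;
}

/*
 * May one page of a given type be reused?  Only if the type still holds
 * at least min_pct percent of pageable memory after giving the page up.
 */
int
may_reuse_sketch(long type_pages, long total_pages, int min_pct)
{
	assert(total_pages > 0 && type_pages > 0);
	return (type_pages - 1) * 100 >= (long)min_pct * total_pages;
}
```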
- clean up WARNS=2 problems
- implement getshort()
- use getshort() with MBR_MAGICOFF to test if the magic number is OK, rather
than using hard-coded magic numbers
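
A sketch of the magic number test. The constants are given sketch names
and are assumed to match the conventional MBR values (signature 0xaa55 at
byte offset 510); getshort() here is an illustration, not the code from
the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBR_MAGICOFF_SKETCH	510	/* assumed offset of the signature */
#define MBR_MAGIC_SKETCH	0xaa55	/* assumed signature value */

/* Read a little-endian 16-bit value from a possibly unaligned address. */
uint16_t
getshort_sketch(const void *p)
{
	const uint8_t *b = p;

	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Return non-zero if the 512-byte sector carries the MBR signature. */
int
mbr_magic_ok_sketch(const uint8_t *sector)
{
	assert(sector != NULL);
	return getshort_sketch(sector + MBR_MAGICOFF_SKETCH) ==
	    MBR_MAGIC_SKETCH;
}
```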
normal operation (/var can get filled up by flooding bogus packets).
sysctl net.inet6.icmp6.nd6_debug will turn on diagnostic messages.
(#define ND6_DEBUG will turn it on by default)
improve stats in ND6 code.
lots of synchronization with kame (including comments and cosmetic ones).
let static routes overwrite cloned routes, as cloned routes can come back again
if necessary. behavior same as freebsd/bsdi, code partially from bsdi42.
(NRL rt->rt_parent was not added)
should fix PR 11916 and maybe some other PRs with ARP behavior.
recompilation of usr.sbin/route6d is suggested.
XXX route show is total duplicate of netstat -r, we need to either remove
route show, or share the same source code, otherwise maintenance cost
bites (and is biting) us
the current in-core master superblock, and fix them up if
they're incorrect. Move the code that writes the alternate
superblocks if (cvtlevel || doswap) into pass 5 for efficiency.
Reviewed by Charles Hannum, and used by me to fix up a curdled
file system.
'int compress' in savecore.c and the function 'compress' in libz.
GNU ld 2.10 (with BFD 2.10) used on sparc64 warns about this conflict
(symbol "compress" changed size).
Some years ago I made it O(n^2).
Someone helpfully made it O(n^4) again.
Today I'm making it O(n).
If that's not good enough, I don't know what else to do. B-)
Technical details:
* The graph traversal in propagate() is modified to be able to start from any
point in the tree. To handle certain exceptional cases, it is also modified
to work in two passes, marking the tree with a special tag and then changing
it to DFOUND.
* The reconnect case now modifies the child/sibling pointers and calls
propagate() to propagate the connection state starting with the reconnected
directory.
Pray that you never encounter a file system trashed enough for this to matter.
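
A sketch of a linear-time walk over a tree linked by parent, child and
sibling pointers, starting from any node. The types and names are
hypothetical, and the two-pass tagging mentioned above is left out; the
sketch only shows that each directory in the subtree is visited once,
without recursion.

```c
#include <assert.h>
#include <stddef.h>

enum dstate_sketch { DSTATE_SKETCH, DFOUND_SKETCH };

struct dirnode_sketch {
	struct dirnode_sketch *parent, *child, *sibling;
	enum dstate_sketch state;
};

/* Set the state of root and of everything below it, each node once. */
void
propagate_sketch(struct dirnode_sketch *root, enum dstate_sketch st)
{
	struct dirnode_sketch *n = root;

	assert(root != NULL);
	for (;;) {
		n->state = st;
		if (n->child != NULL) {		/* descend first */
			n = n->child;
			continue;
		}
		/* climb until a sibling exists or we are back at root */
		while (n != root && n->sibling == NULL)
			n = n->parent;
		if (n == root)
			break;		/* root's own siblings are not touched */
		n = n->sibling;
	}
}
```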
too damn small) by setting a minimum (1024) and maximum (maxino + 1). This
prevents certain operations getting REALLY slow when -b is used, and also
avoids overallocating memory if the superblock is hosed.
already. So, don't fail if there appears to be a corrupt label or
no 'fake' label; get the 'default' label (which is generated
from DIOCGDEFLABEL) instead.
this fixes a problem where elf_mod_sizes() would report a size which would
be different (smaller) than the actual size of LKM code to be loaded in some
cases
Reviewed by: Johan Danielsson
-w write in-core label if changed
-r update on-disk as well as in-core label (with -w)
-f force update (-w), even if there's been no change
-r behaviour suggested by matt green. what used to be `-f' is now `-wrf'
gotten bitten by mbrlabel trashing my incore disklabel to a point where
the machine wasn't usable, so I reworked it:
* only update the incore (and on-disk) label if `-f' is given. by default,
the proposed disklabel will be printed but no changes will occur
* add -q, to make the default operation a bit more quiet.
* leave existing `used' in-core partition slots alone, and only add entries
to the incore label if:
- there's not an existing partition of the same size and offset
(even of a different type)
- there's a free partition slot (`unused', with size == 0)
* use DIOCWDINFO instead of DIOCSDINFO, to update the incore as well as
the on-disk label
* use showpartitions() from ../disklabel/printlabel.c
this should make mbrlabel a *lot* more useful.
__CONCAT("foo","bar");
actually works to concatenate strings, it's because the preprocessor expands
it into "foo""bar" as separate strings, and then ANSI string concatenation
is performed on that. It's more straightforward to just use ANSI string
concatenation directly, and newer GCCs complain (rightly) about misuse
of token pasting.
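
A short illustration of the difference, with hypothetical macro names:
adjacent string literals are joined by the compiler itself, while token
pasting is meant for building a single token such as an identifier.

```c
#include <assert.h>
#include <string.h>

/* Adjacent string literals are concatenated by the compiler. */
#define GREETING_SKETCH	"foo" "bar"

/* sizeof counts the six characters plus the terminating NUL. */
static_assert(sizeof(GREETING_SKETCH) == 7, "literals were not joined");

/* Token pasting builds one identifier out of two tokens. */
#define PASTE_SKETCH(a, b)	a ## b

int PASTE_SKETCH(foo, bar) = 42;	/* defines a variable named foobar */

const char *
greeting_sketch(void)
{
	return GREETING_SKETCH;
}
```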
effectively an MBR with its own partition table which contains another
4 `slots', each of which can be another extended partition...
This involved reworking some of the internal functions.
* Use off_t appropriately (so we can manipulate sectors past 4GB).
* Tweak to compile with WARNS=2
authentication for Phase 1 per draft-ietf-ipsec-isakmp-gss-auth-06.txt.
This work was done by Frank van der Linden of Wasabi Systems, Inc.
under contract from Zembu Labs, Inc.
If sysctl supports it, try to get the kernel name with CPU_BOOTED_KERNEL.
Get current kernel's version string in all cases.
Adapt some error messages to the correct kernel name.
Reviewed by Simon Burge.
bandwidth and seek time of the disk, using the "4 * bandwidth * seek
time" formula from Neefe-Matthews' 1997 paper. An RZ25 disk with this
option gets 200K segments. Reference the paper in the manual page.
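
The formula can be sketched as plain integer arithmetic. The helper name
and the units (bytes per second, microseconds) are assumptions, and the
figures in the usage note are made up for easy arithmetic; they are not
measurements of any real disk.

```c
#include <assert.h>

/*
 * Segment size in bytes from "4 * bandwidth * seek time".
 * bandwidth is in bytes per second, seek_us in microseconds.
 */
unsigned long long
segment_bytes_sketch(unsigned long long bandwidth, unsigned long long seek_us)
{
	assert(bandwidth > 0 && seek_us > 0);
	return 4ULL * bandwidth * seek_us / 1000000ULL;
}
```

For example, a made-up disk with 1000000 bytes per second of bandwidth
and a 10000 microsecond seek time gives 4 * 1000000 * 0.01 = 40000 bytes.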
Don't try to subtract the address of "acg.cg_firstfield" from
"acg.cg_nextfreeoff", as it's already relative to the start of "&acg".
This always worked because the result of the subtraction was
always negative, thus could never be > "sblock.fs_cgsize" ...