(and much lower) value. When flushing the log these deallocations will
produce new blocks and that may execeed the journal size resulting in
a "wapbl_flush: current transaction too big to flush" panic.
Seen when removing a large snapshot.
Adresses PR #44568 (WAPBL doens't play nice with snapshots).
protect wl_dealloc* members. Take the mutex here and change the lock
requirements of these fields to "writer lock or mutex".
This error lead to file system corruption and "freeing free block" panics.
-wrong initialization reported in a followup to PR bin/43336
(looks harmless because it applies to zero-initialized memory, so
LIST_INIT() is a no-op)
-wrong loop count in reply misses a hash bucket (PR kern/43827)
(this was introduced by a post-netbsd-5 change, so it isn't related
to the PR above)
conditionally build DEV_BSIZE adjustments for kernel. fsck_ffs shares
the same code but accesses physical blocks.
Also compute correct block numbers for each physical sector.
Also change access to filesystem blocks to be done by fragment instead
of by physical block. Fragments are the fundamental blocks of the
filesystem.
For a theoretical filesystem that accesses the disk in smaller units
than stored in mp->mnt_fs_bshift, the assumption might be wrong. But
this will also break other subsystems. The value mp->mnt_dev_bshift
which formerly represents the physical sector size is currently only
virtual in NetBSD (always DEV_BSIZE).
PR kern/40361 WAPBL locking panic in -current
PR kern/40361 WAPBL locking panic in -current
PR kern/40470 WAPBL corrupts ext2fs
PR kern/40562 busy loop in ffs_sync when unmounting a file system
PR kern/40525 panic: ffs_valloc: dup alloc
- A fix for an issue that can lead to "ffs_valloc: dup" due to dirty cg
buffers being invalidated. Problem discovered and patch by dholland@.
- If the syncer fails to lazily sync a vnode due to lock contention,
retry 1 second later instead of 30 seconds later.
- Flush inode atime updates every ~10 seconds (this makes most sense with
logging). Presently they didn't hit the disk for read-only files or
devices until the file system was unmounted. It would be better to trickle
the updates out but that would require more extensive changes.
- Fix issues with file system corruption, busy looping and other nasty
problems when logging and non-logging file systems are intermixed,
with one being the root file system.
- For logging, do not flush metadata on an inode-at-a-time basis if the sync
has been requested by ioflush. Previously, we could try hundreds of log
sync operations a second due to inode update activity, causing the syncer
to fall behind and metadata updates to be serialized across the entire
file system. Instead, burst out metadata and log flushes at a minimum
interval of every 10 seconds on an active file system (happens more often
if the log becomes full). Note this does not change the operation of
fsync() etc.
- With the flush issue fixed, re-enable concurrent metadata updates in
vfs_wapbl.c.
Merge wapbl_replay_get_inodes into wapbl_replay_prescan. Change the
logic to determine the head: It doesn't make sense to update it if the
last inode record seen was not the beginning of the journal, as the
beginning of the journal might not be 0, so always update inodeshead.
transactions. The initial prescan has already sorted out what blocks are
in the journal and removed any revoced blocks, so the hash table is
authorative.
This changes the order of hook processing as the copy-on-write handlers
are called after the journal processing. This makes more sense as the
journal overwrite is logically part of the disk IO.
Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.
OK'd by core@, releng@.