52 Commits

Author SHA1 Message Date
hannken 0253e4d99a wapbl_biodone: Release the buffer before reclaiming the log.
wapbl_flush() may wait for the log to become empty and
    all buffers should be unbusy before it returns.
2012-11-17 10:10:17 +00:00
chs 67b37d586b mark all wapbl I/O as BPRIO_TIMECRITICAL.
this is the second part of addressing PR 46325.
2012-04-29 22:55:11 +00:00
para 9c9daafc45 replacing malloc(9) with kmem(9)
wapbl_entries get their own pool, they are freed from softint context

ok: rmind@
2012-01-28 18:02:56 +00:00
para e62ee4d475 extending vmem(9) to be able to allocate resources for its own needs.
simplifying uvm_map handling (no special kernel entries anymore, no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged
2012-01-27 19:48:38 +00:00
yamt c237385989 comments 2012-01-11 00:11:32 +00:00
yamt b9c83c9d4b - move disk cache flushing code into a separate function.
- more verbose output if vfs.wapbl.verbose_commit >= 2.
  namely, time taken for each DIOCCACHESYNC call.
	wapbl_flush: 1322826000.785245900 this transaction = 546304 bytes
	wapbl_cache_sync: 1: dev 0x0 0.017572724
	wapbl_cache_sync: 2: dev 0x0 0.007199825
	wapbl_flush: 1322826011.860771302 this transaction = 431104 bytes
	wapbl_cache_sync: 1: dev 0x0 0.019469753
	wapbl_cache_sync: 2: dev 0x0 0.009473410
	wapbl_flush: 1322829266.489154342 this transaction = 187904 bytes
	wapbl_cache_sync: 1: dev 0x4 0.022270180
	wapbl_cache_sync: 2: dev 0x4 0.030749402
- fix a comment.
2011-12-02 12:38:59 +00:00
christos a84933c836 add a couple of asserts 2011-09-01 09:03:43 +00:00
christos b866ba6e1a fix sign-compare warnings 2011-08-14 12:37:09 +00:00
rmind e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
uebayasi c6e687b646 Catch up with B_* flag name changes in debug code. 2011-05-26 04:51:57 +00:00
nakayama 387820aa30 Fix digit number of nanosecond. 2011-02-20 11:21:34 +00:00
hannken 6b3add8f17 Adjust previous: set the dealloc soft limit to half hard limit. 2011-02-18 13:24:40 +00:00
hannken 497b3d48db Set the limit for deallocations in one transaction to a more realistic
(and much lower) value.  When flushing the log these deallocations will
produce new blocks and that may exceed the journal size resulting in
a "wapbl_flush: current transaction too big to flush" panic.
Seen when removing a large snapshot.

Addresses PR #44568 (WAPBL doesn't play nice with snapshots).
2011-02-16 19:43:05 +00:00
bouyer 2df8503191 if DIAGNOSTIC, check the size of the transaction in wapbl_end().
Hopefully this will point us to the place which generated the large
transaction, before an asynchronous panic() in wapbl_end()
2011-02-14 16:05:11 +00:00
christos f79b6ffa09 Add two sysctls one that does verbose transaction logging and a second one
that disables flushing the disk cache (which is fast but dangerous for
data integrity). From simon a long while back.
2011-01-08 20:37:05 +00:00
hannken 897bf1dab0 Wapbl_register_deallocation(): the taken reader lock is not sufficient to
protect wl_dealloc* members.  Take the mutex here and change the lock
requirements of these fields to "writer lock or mutex".

This error led to file system corruption and "freeing free block" panics.
2010-11-09 16:30:26 +00:00
drochner 3ebe07497e fix two bugs reported by Ryo Shimizu:
-wrong initialization reported in a followup to PR bin/43336
 (looks harmless because it applies to zero-initialized memory, so
 LIST_INIT() is a no-op)
-wrong loop count in replay misses a hash bucket (PR kern/43827)
 (this was introduced by a post-netbsd-5 change, so it isn't related
 to the PR above)
2010-09-10 10:14:55 +00:00
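
The commit message above does not show the code involved. As a generic illustration only, the sketch below shows how an off-by-one loop bound can skip the last bucket of a mask-based hash table. It assumes a table whose bucket count is a power of two and whose `mask` field is the count minus one; every name here (`demo_hash`, `demo_count_buggy`, `demo_count_fixed`) is hypothetical and not taken from the WAPBL sources.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical table: the bucket count is a power of two and
 * mask == count - 1, so valid bucket indices are 0..mask inclusive.
 */
struct demo_hash {
	int buckets[8];	/* number of entries held in each bucket */
	size_t mask;	/* 7 for an 8-bucket table */
};

/* Buggy walk: the exclusive bound stops one short of bucket `mask`. */
static int
demo_count_buggy(const struct demo_hash *h)
{
	int total = 0;

	for (size_t i = 0; i < h->mask; i++)
		total += h->buckets[i];
	return total;
}

/* Fixed walk: the inclusive bound visits every bucket. */
static int
demo_count_fixed(const struct demo_hash *h)
{
	int total = 0;

	for (size_t i = 0; i <= h->mask; i++)
		total += h->buckets[i];
	return total;
}
```

With one entry in each of the eight buckets, the buggy walk sees seven entries and the fixed walk sees all eight.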
pooka 33de26e6c6 dumdidumdum, need _KERNEL in previous for fsck. noticed by moof 2010-04-21 19:50:57 +00:00
pooka 34244e1069 Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)
2010-04-21 16:51:24 +00:00
mlelstv 7ad5c184b5 Move block number computations to callers of wapbl_read/wapbl_write and
conditionally build DEV_BSIZE adjustments for kernel. fsck_ffs shares
the same code but accesses physical blocks.

Also compute correct block numbers for each physical sector.
2010-02-27 16:51:03 +00:00
mlelstv ef95b640b0 Store physical block numbers in superblock that point to the journal.
Calculate position of both commit headers correctly for disks with
large sectors.
Correct calculation of circular buffer size.
2010-02-27 12:04:19 +00:00
mlelstv c30b0f26b2 mnt_fs_bshift is the filesystem block size, not the fragment size.
Revert to physical block size. This is fine as long as filesystem
and log stay on a similar physical medium.
2010-02-26 22:24:07 +00:00
mlelstv b4d69db7b5 Use correct offset to block number calculations.
Also change access to filesystem blocks to be done by fragment instead
of by physical block. Fragments are the fundamental blocks of the
filesystem.

For a theoretical filesystem that accesses the disk in smaller units
than stored in mp->mnt_fs_bshift, the assumption might be wrong. But
this will also break other subsystems. The value mp->mnt_dev_bshift
which formerly represents the physical sector size is currently only
virtual in NetBSD (always DEV_BSIZE).
2010-02-23 20:51:25 +00:00
uebayasi 2903e6d834 __inline -> inline 2010-02-06 12:10:59 +00:00
pooka 64ab232858 make WAPBL_DEBUG_PRINT compile 2009-11-25 14:43:31 +00:00
pooka bea18fb702 Add dealloccnt to list of things to be considered in the stetson-harrison
decision making algorithm for flushing a wapbl transaction.
2009-10-01 12:28:34 +00:00
pooka 5b19885537 Turn a KASSERT into a panic. I don't want us to be randomly
overwriting memory on non-DIAGNOSTIC kernels if resource estimation
fails.
2009-10-01 07:42:45 +00:00
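
To illustrate the reasoning in the commit above, here is a minimal sketch of the difference between an assertion that is compiled out of non-diagnostic builds and a check that is always present. It is not the WAPBL code: `DEMO_KASSERT`, `DEMO_DIAGNOSTIC`, `demo_log`, and `demo_log_add` are hypothetical names, and the always-on check returns an error where a kernel would call panic(), so that the example can run in userland.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for KASSERT: vanishes unless DEMO_DIAGNOSTIC is set. */
#ifdef DEMO_DIAGNOSTIC
#define DEMO_KASSERT(e)	assert(e)
#else
#define DEMO_KASSERT(e)	((void)0)
#endif

#define DEMO_SLOTS 4

struct demo_log {
	int slot[DEMO_SLOTS];
	size_t used;
};

/*
 * Always-on bounds check. If this were only a DEMO_KASSERT, a build
 * without DEMO_DIAGNOSTIC would write past slot[] when the resource
 * estimate is wrong. Returns 0 on success, -1 where a kernel would panic.
 */
static int
demo_log_add(struct demo_log *l, int value)
{
	if (l->used >= DEMO_SLOTS)
		return -1;	/* kernel code would panic() here */
	DEMO_KASSERT(l->used < DEMO_SLOTS);	/* redundant, kept for contrast */
	l->slot[l->used++] = value;
	return 0;
}
```

The unconditional check costs one comparison per call and holds in every build configuration, which is the trade-off the commit message argues for.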
apb dfcfba79d8 Convert free text inside #ifdef to a proper comment.
Inspired by PR 41255 from Kurt Lidl.
2009-07-14 20:59:00 +00:00
lukem 2b2f4703f2 fix sign-compare issues 2009-04-05 11:48:02 +00:00
cegger b8817e4aed ansify function definitions 2009-03-15 17:14:40 +00:00
ad 430f67aa17 PR kern/39564 wapbl performance issues with disk cache flushing
PR kern/40361 WAPBL locking panic in -current
PR kern/40470 WAPBL corrupts ext2fs
PR kern/40562 busy loop in ffs_sync when unmounting a file system
PR kern/40525 panic: ffs_valloc: dup alloc

- A fix for an issue that can lead to "ffs_valloc: dup" due to dirty cg
  buffers being invalidated. Problem discovered and patch by dholland@.

- If the syncer fails to lazily sync a vnode due to lock contention,
  retry 1 second later instead of 30 seconds later.

- Flush inode atime updates every ~10 seconds (this makes most sense with
  logging). Previously they didn't hit the disk for read-only files or
  devices until the file system was unmounted. It would be better to trickle
  the updates out but that would require more extensive changes.

- Fix issues with file system corruption, busy looping and other nasty
  problems when logging and non-logging file systems are intermixed,
  with one being the root file system.

- For logging, do not flush metadata on an inode-at-a-time basis if the sync
  has been requested by ioflush. Previously, we could try hundreds of log
  sync operations a second due to inode update activity, causing the syncer
  to fall behind and metadata updates to be serialized across the entire
  file system. Instead, burst out metadata and log flushes at a minimum
  interval of every 10 seconds on an active file system (happens more often
  if the log becomes full). Note this does not change the operation of
  fsync() etc.

- With the flush issue fixed, re-enable concurrent metadata updates in
  vfs_wapbl.c.
2009-02-22 20:10:25 +00:00
yamt 4c5a0bb384 redo rev.1.19 correctly. 2009-02-18 13:22:10 +00:00
yamt a13bb3bef4 whitespace 2009-02-18 13:12:00 +00:00
yamt 5e978bf3b6 remove a non-ascii comment. 2009-02-02 00:10:18 +00:00
yamt e1146d7ee9 back to malloc for now as wapbl_biodone is called by softint. 2009-02-02 00:07:06 +00:00
yamt 740c75e25d - malloc -> kmem_alloc
- kill WAPBL_UVM_ALLOC.
- kill wapbl_blk_pool to reduce #ifdef.
2009-01-31 09:33:36 +00:00
yamt 68b6d8786e remove extra semicolons. 2009-01-03 03:31:23 +00:00
joerg 6c45130eba Move the specification of the on-disk journal format into a separate
header.
2008-11-24 16:05:21 +00:00
joerg 27024ae7a6 Push functionality to deal with existing inode records into a separate
function.
2008-11-20 00:17:08 +00:00
joerg 412427525e Decouple journal operation from replay header by copying the interesting
fields into wapbl_replay as opposed to embedding wapbl_wc_header.
2008-11-18 22:21:48 +00:00
joerg 5658187923 #if 0 wapbl_replay_verify. 2008-11-18 19:31:35 +00:00
joerg 2e2e65b3b8 Check for NULL before calling free as the kernel free doesn't handle it. 2008-11-18 18:54:39 +00:00
joerg a3925622e1 Rename wapbl_replay_prescan to wapbl_replay_process. 2008-11-18 13:29:34 +00:00
joerg 355e64e949 Refactor wapbl_replay_prescan to use a function for each WAPBL record.
Merge wapbl_replay_get_inodes into wapbl_replay_prescan. Change the
logic to determine the head: It doesn't make sense to update it if the
last inode record seen was not the beginning of the journal, as the
beginning of the journal might not be 0, so always update inodeshead.
2008-11-18 11:37:37 +00:00
joerg c42112239b In wapbl_replay_write just iterate over the hash table and not the
transactions. The initial prescan has already sorted out what blocks are
in the journal and removed any revoked blocks, so the hash table is
authoritative.
2008-11-17 22:08:09 +00:00
joerg c42fa4ab26 Remove debug printf. 2008-11-17 19:36:11 +00:00
joerg bea450f881 Ensure that block records are correctly padded. 2008-11-17 19:31:47 +00:00
joerg b9400f6fd4 Move WAPBL replay handling from bread() into ufs_strategy.
This changes the order of hook processing as the copy-on-write handlers
are called after the journal processing. This makes more sense as the
journal overwrite is logically part of the disk IO.
2008-11-11 08:29:58 +00:00
joerg 8800d320f1 Define wapbl_flush_fn_t only for the kernel. 2008-11-10 20:30:31 +00:00
joerg 3fbdfc8af9 Reduce internals of WAPBL exposed to the rest of the system. 2008-11-10 20:12:13 +00:00