failed, and a "raidctl -C" was run afterwards, triggering mutex locking
issues. fix this by moving alloc and destroy of mutex/condvar for a
raid device into separate functions, and call the destroy function from
the DO_RAID_FAIL() macro.
probably needs a netbsd-6 pullup. sigh.
different buffer queue strategies.
Initialize raid sets to use the default buffer queue strategy for the given
architecture, rather than forcing raidframe to use fcfs in all cases.
This should cause raidframe to use the same buffer queue strategy as the
underlying disks.
On I386, I see performance enhancements of between 14 and 16% with raid5
sets with no other change.
See http://mail-index.NetBSD.org/tech-kern/2012/08/08/msg013758.html
for a discussion of this issue.
spares. While here, observe that we were actually doing one more
stripe than we thought we were, and correct that too (it didn't matter
for non-RAID5_RS, but it definitely does for RAID5_RS). Add some
bounds-checking at the beginning to handle the case where the number
of stripes in the set is smaller than the sliding reconstruction window.
XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.
This change causes components on raw disks, as opposed to components inside
partitions or wedges, to be autoconfigured if the raid set is configured
for autoconfiguration.
Approved by oster@ and mrg@ for submission after the NetBSD-6 tag. I've
been running these changes in production at my day job for over a year
without a problem.
See http://mail-index.NetBSD.org/tech-kern/2010/11/09/msg009167.html
for the original discussion of this patch and for a version of this patch
that works with NetBSD-5.x systems.
instead of setting it in code, so it can easily be checked and changed in an
on-disk kernel with gdb. Use a separate raidautoconfigdone variable to keep
track of whether raid configuration has actually occurred.
initialized before use.
Validate the component label before considering a component for use,
and make sure we only consider components that are optimal.
Fixes PR#44251. All atf RAIDframe tests now pass.
cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond.
convert all remaining simple_lock's to kmutexes (they're not used or compiled
right now... even with all options enabled) and remove the support for them.
this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.
- remove RF_DECLARE_EXTERN_MUTEX and RF_DECLARE_STATIC_MUTEX, the qualifier
can be provided at the use point with the normal define
- rename the *LGMGR_MUTEX() macros to *mutex2() names, and add some more
defines for use:
rf_declare_mutex2()
rf_declare_cond2()
rf_lock_mutex2()
rf_unlock_mutex2()
rf_init_mutex2()
rf_destroy_mutex2()
rf_init_cond2()
rf_destroy_cond2()
rf_wait_cond2()
rf_signal_cond2()
rf_broadcast_cond2()
- use the new names for the configureMutex(), which previous used some combo
of direct mutex* calls and macros
- convert the node_queue to use a mutex/cv combo
- in rf_ShutdownEngine() and DAGExecutionThread(), also signal the former from
the latter when it is done and about to exit
- convert iodone_lock to use the new macros
this only handles the smallest use of old simple_lock/tsleep/wakeup
APIs inside raidframe, and it points out that cv(9)'s have only one
wait channel per cv, whereas each tsleep() caller can specify a
different wait channel. this change removes the difference between
normal raidio and waiting for IO during shutdown.
i've tested this one 3 systems, ran atf, and had mlelstv and rmind
review the change.
to do so, move the call to fix the label inside of rf_reasonable_label()
itself, so we can fix the partition sizes before calling
rf_component_label_partitionsize() itself.
fixes the failure mode where i had garbage not in numBlocksHi but in
partitionSizeHi, and the check against rf_component_label_partitionsize()
would fail and my raid would not auto-configure.