Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.
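
For illustration only, the "sliding status window" idea can be sketched as a
fixed-size ring of status slots that follows the rebuild front, so memory use
no longer depends on the number of reconstruction units. This is not the
RAIDframe code: the names status_window, window_mark, window_advance and the
WINDOW_SLOTS size are all hypothetical, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical window size, in reconstruction units (RUs). */
#define WINDOW_SLOTS 8

enum ru_state { RU_NOTHING = 0, RU_IN_PROGRESS, RU_DONE };

/* Status for only the RUs in [base, base + WINDOW_SLOTS). */
struct status_window {
	uint64_t base;                    /* first RU covered by the window */
	uint64_t total;                   /* total RUs in the RAID set */
	enum ru_state slot[WINDOW_SLOTS]; /* ring, indexed by ru % WINDOW_SLOTS */
};

static void
window_init(struct status_window *w, uint64_t total_rus)
{
	w->base = 0;
	w->total = total_rus;
	for (size_t i = 0; i < WINDOW_SLOTS; i++)
		w->slot[i] = RU_NOTHING;
}

/* Record the state of one RU; returns -1 if it lies outside the window. */
static int
window_mark(struct status_window *w, uint64_t ru, enum ru_state s)
{
	if (ru < w->base || ru >= w->base + WINDOW_SLOTS || ru >= w->total)
		return -1;
	w->slot[ru % WINDOW_SLOTS] = s;
	return 0;
}

/* Slide the window past finished RUs at its head; returns RUs skipped. */
static uint64_t
window_advance(struct status_window *w)
{
	uint64_t moved = 0;

	while (w->base < w->total &&
	    w->slot[w->base % WINDOW_SLOTS] == RU_DONE) {
		/* The freed slot now stands for RU base + WINDOW_SLOTS. */
		w->slot[w->base % WINDOW_SLOTS] = RU_NOTHING;
		w->base++;
		moved++;
	}
	return moved;
}
```

In this sketch a caller that gets -1 from window_mark() would have to wait
for the head of the window to advance before starting work on that unit.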
As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur
at least at the very end of a disk during reconstruction under heavy
IO load.)
Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best usable state possible.
- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue
- change how we clean up the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.
- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.
- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.
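
As a rough illustration of the buffer cleanup item above, freeing a list by
detecting its end rather than trusting a separately kept count might look
like the sketch below. It is not the rf_FreeReconControl() code: struct
recon_buf, buf_push and buf_free_all are hypothetical names, and userland
malloc()/free() stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a floating reconstruction buffer. */
struct recon_buf {
	struct recon_buf *next;
};

/* Push a new buffer on the list; returns the new head. */
static struct recon_buf *
buf_push(struct recon_buf *head)
{
	struct recon_buf *b = malloc(sizeof(*b));

	if (b == NULL)
		return head;	/* allocation failed, list unchanged */
	b->next = head;
	return b;
}

/*
 * Free every buffer by walking until the NULL terminator. A stale or
 * wrong element count can neither leak buffers nor run off the end.
 */
static size_t
buf_free_all(struct recon_buf **headp)
{
	size_t freed = 0;
	struct recon_buf *b = *headp;

	while (b != NULL) {
		struct recon_buf *next = b->next; /* save before freeing */

		free(b);
		b = next;
		freed++;
	}
	*headp = NULL;
	return freed;
}
```

The returned count is only informational here; the loop itself never
consults a count to decide when to stop.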
XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.
such that we don't actually hold a simplelock while we are doing
a pool_get(), but are still effectively protecting critical code.
This should fix all of the outstanding LOCKDEBUG warnings related to
rebuilding RAID sets.
This removes 3 more RF_PANIC()'s (but we'll currently still panic if any of these cases occur).
fix up a few printf's.
XXX: still needs more cleanup and testing (and be taught to not panic).
of strenuous agreement, and some general agreement, this commit is
going ahead because it's now starting to block some other changes I
wish to make.]
Remove most of the support for the concept of "rows" from RAIDframe.
While the "row" interface has been exported to the world, RAIDframe
internals have really only supported a single row, even though they
have feigned support of multiple rows.
Nothing changes in configuration land -- config files still need to
specify a single row, etc. All auto-config structures remain fully
forward/backwards compatible.
The only visible difference to the average user should be a
reduction in the size of a GENERIC kernel (i386) by 4.5K. For those
of us trolling through RAIDframe kernel code, a lot of the driver
configuration code has become a LOT easier to read.
This is the last of the 'easy' ones that Krister made me aware of.
Total savings on i386 GENERIC kernel: 13151 bytes
RAIDframe in GENERIC is now at: 179033
Thanks again Krister!
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster.h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.
No functional changes to RAIDframe.
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.
No functional changes to the kernel code in this commit.
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.