original RAIDframe code had the same bug with dag_h being used when
possibly NULL. Use dagList as the starting point for any potential
dag_h's. Move the initialization of dag_h in this part to a
little later. Loop now runs through in equivalent lock-step with the
construction of the dagList earlier in the function.
Addresses Coverity CID 1129 (id=6841 Run 5).
write paths within RAIDframe. They also resolve the "panics with
RAID 5 sets with more than 3 components" issue which was present
(briefly) in the commits which were previously supposed to address
the malloc() issue.
With this new code the 5-component RAID 5 set panics are now gone.
It is also now also possible to swap to RAID 5.
The changes made are:
1) Introduce rf_AllocStripeBuffer() and rf_FreeStripeBuffer() to
allocate/free one stripe's worth of space. rf_AllocStripeBuffer() is
used in rf_MapUnaccessedPortionOfStripe() where it is not sufficient to
allocate memory using just rf_AllocBuffer(). rf_FreeStripeBuffer() is
called from rf_FreeRaidAccDesc(), well after the DAG is finished.
2) Add a set of emergency "stripe buffers" to struct RF_Raid_s.
Arrange for their initialization in rf_Configure(). In low-memory
situations these buffers will be returned by rf_AllocStripeBuffer()
and re-populated by rf_FreeStripeBuffer().
3) Move RF_VoidPointerListElem_t *iobufs from the dagHeader into
into struct RF_RaidAccessDesc_s. This is more consistent with the
original code, and will not result in items being freed "too early".
4) Add a RF_RaidAccessDesc_t *desc to RF_DagHeader_s so that we have a
way to find desc->iobufs.
5) Arrange for desc in the DagHeader to be initialized in InitHdrNode().
6) Don't cleanup iobufs in rf_FreeDAG() -- the freeing is now delayed
until rf_FreeRaidAccDesc() (which is how the original code handled the
allocList, and for which there seem to be some subtle, undocumented
assumptions).
7) Rename rf_AllocBuffer2() to be rf_AllocBuffer() and remove the
former rf_AllocBuffer(). Fix all callers of rf_AllocBuffer().
(This was how it was *supposed* to be after the last time these
changes were made, before they were backed out).
8) Remove RF_IOBufHeader and all references to it.
9) Remove desc->cleanupList and all references to it.
Fixes PR#20191
elements from the pools.
Re-work rf_SelectAlgorithm() to get rid of all the 8 malloc's, and to
use the new functions to get/put these 'support structures'. I'm not
overly happy with some of the variable names, but them's the breaks.
In the process of changing things, fix a bug:
- in the case where we can't create a dag, free asmh_b and blockFuncs
too!!
[if you were able to look at the source code related to these changes,
and comprehend what was going on without having your eyes bleed or
getting dizzy, please contact me... I'm sure I'll have more code
which would benefit by you having a look at it before I commit it :) ]
Provide rf_AllocDAGNode() and rf_FreeDAGNode() to handle
allocation/freeing.
- Introduce a "nodes" linked list of RF_DagNode_t's into the DAG header.
Initialize nodes in InitHdrNode(). Arrange for nodes cleanup in rf_FreeDAG().
- Add a "list_next" to RF_DagNode_t to keep track of nodes on the
above "nodes" list. (This is distinct from the "next" field of
RF_DagNode_t, which keeps track of the firing order of nodes.)
"list_next" gets used in the cleanup routines, and in traversing
through a set of nodes that belong to a particular set of nodes
(e.g. those belonging to xorNodes for a given DAG).
- use rf_AllocDAGNode() instead of mallocs of variable-sized arrays of
RF_DagNode_t's. Mostly mechanical changes to convert the DAG construction
from "access nodes via an array index" to "access nodes via a 'nextnode'
pointer".
- rework a couple of tricky spots where assumptions about the node order
was being abused.
- performance remains consistent with performance before these changes.
[Thanks to Simon Burge (simonb at you.know.where) for looking over
the mechanical changes to make sure I didn't biff anything.]
dynamically allocated variable-sized array (dagArray). Convert code
to use the new linked list stuff instead of the array stuff (the ratio
of one dagList per stripe still applies). The big advantage is in
being able to more efficiently allocate the dagLists on-the-fly, and
not have to know the size(s) of the array beforehand.
~forever. This requires a number of things:
1) If we can't create a DAG, set desc->numStripes to 0 in
rf_SelectAlgorithm. This will ensure that we don't attempt to free
any dagArray[] elements in rf_StateCleanup.
2) Modify rf_State_CreateDAG() to not panic in the event of a DAG
failure. Instead, set the bp->b_flags and bp->b_error, and set things
up to skip to rf_State_Cleanup().
3) Need to mark desc->status as "bad" so that we actually stop looking
for a different DAG. (which we won't find... no matter how many times
we try).
4) rf_State_LastState() will then do the biodone(), and return EIO for
the IO in question.
5) Remove some " || 1 "'s from ProcessNode(). These were for
debugging, and we don't need the failure notices spewing
over and over again as the failing DAGs are processed.
6) Needed to change
if (asmap->numDataFailed + asmap->numParityFailed > 1)
to
if ((asmap->numDataFailed + asmap->numParityFailed > 1) ||
(raidPtr->numFailures > 1)){
in rf_raid5.c so that it doesn't try to return
rf_CreateNonRedundantWriteDAG as the creation function.
7) Note that we can't apply the above change to the RAID 1 code as
with the silly "fake 2-D" RAID 1 sets, it is possible to have 2 failed
components in the RAID 1 set, and that would stop them from working.
(I really don't know why/how those "fake 2-D" RAID 1 sets even work
with all the "single-fault" assumptions present in the rest of the
code.)
8) Needed to protect rf_RAID0DagSelect() in a similar way -- it should
return NULL as the createFunc.
9) No point printing out "Multiple disks failed..." a zillion times.
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
do is tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".
This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster,h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.
No functional changes to RAIDframe.
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.
No functional changes to the kernel code in this commit.
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.