As noted by Thierry Deval in a posting to misc/at/openbsd.org,
rf_DecrAccessesCountState wasn't in the correct spot in RF_AccessState_e. Following up on that has resulted in one other correction. Changing orderings of these states is tricky, and shouldn't be attempted without some thorough analysis. For the changes committed, the following analysis is offerred: 1) RAIDframe uses a little state machine to take care of building, executing, and processing the DAGs used to direct IO. 2) The rf_DecrAccessesCountState state is handled by the function rf_State_DecrAccessCount(). The purpose of this state is to decrement the number of "accesses-in-flight". 3) rf_Cleanup_State is handled by rf_State_Cleanup(). Its job is to do general cleanup of DAG arrays and any stripe locks. 4) DefaultStates[] in rf_layout.c indicates that the right spot for rf_DecrAccessesCountState is just before rf_Cleanup_State. Analysis of code for both states indicates that the order doesn't matter too much, although rf_State_DecrAccessCount() should probably take place *after* rf_State_Cleanup() to be more correct. 5) Comments in rf_State_ProcessDAG() indicates that the next state should be rf_Cleanup_State. However: it attempts to get there by using desc->state++; which actually takes it to just rf_DecrAccessesCountState! This turned out to be OK before, since rf_Cleanup_State would follow right after, and all would be taken careof (albeit in arguably the "less correct" order). 6) With the current ordering, if we head directly to rf_Cleanup_State (as we do, for example, if multiple components fail in a RAID 5 set), then we'll actually miss going trough rf_DecrAccessesCountState), and could end up never being able to reach quiescence! Perhaps not too big of a deal, given that the RAID set is pretty much toast by that point at which such a drastic state change happens, but might as well have this correct. The changes made are: 1) Since having rf_State_DecrAccessCount() come after rf_State_Cleanup() is just fine, change rf_layout.c to reflect that rf_DecrAccessesCountState comes after rf_Cleanup_State (i.e. they swap positions in the state list). This means that going to rf_Cleanup_State after bailing on a failed DAG access will do all the right things -- the state will get cleaned up, and then the access counts will get decremented properly. The comment in rf_State_ProcessDAG() is now actually correct -- the next state *will* be rf_Cleanup_State. 2) Move rf_DecrAccessesCountState in RF_AccessState_e to just after rf_CleanupState. This puts RF_AccessState_e in sync with DefaultStates[]. Fortunately, these states are rarely referred to by name, and so this change ends up being mostly cosmetic -- it really only fixes cleanup behaviour for the recent "Failed to create a DAG" changes.
This commit is contained in:
parent
232d108953
commit
87e6ffb752
@ -1,4 +1,4 @@
|
|||||||
/* $NetBSD: raidframevar.h,v 1.5 2003/12/29 02:38:17 oster Exp $ */
|
/* $NetBSD: raidframevar.h,v 1.6 2004/01/04 06:37:16 oster Exp $ */
|
||||||
/*-
|
/*-
|
||||||
* Copyright (c) 1996, 1997, 1998 The NetBSD Foundation, Inc.
|
* Copyright (c) 1996, 1997, 1998 The NetBSD Foundation, Inc.
|
||||||
* All rights reserved.
|
* All rights reserved.
|
||||||
@ -281,7 +281,6 @@ typedef enum RF_AccessState_e {
|
|||||||
/* original states */
|
/* original states */
|
||||||
rf_QuiesceState, /* handles queisence for reconstruction */
|
rf_QuiesceState, /* handles queisence for reconstruction */
|
||||||
rf_IncrAccessesCountState, /* count accesses in flight */
|
rf_IncrAccessesCountState, /* count accesses in flight */
|
||||||
rf_DecrAccessesCountState,
|
|
||||||
rf_MapState, /* map access to disk addresses */
|
rf_MapState, /* map access to disk addresses */
|
||||||
rf_LockState, /* take stripe locks */
|
rf_LockState, /* take stripe locks */
|
||||||
rf_CreateDAGState, /* create DAGs */
|
rf_CreateDAGState, /* create DAGs */
|
||||||
@ -289,6 +288,7 @@ typedef enum RF_AccessState_e {
|
|||||||
rf_ProcessDAGState, /* DAGs are completing- check if correct,
|
rf_ProcessDAGState, /* DAGs are completing- check if correct,
|
||||||
* or if we need to retry */
|
* or if we need to retry */
|
||||||
rf_CleanupState, /* release stripe locks, clean up */
|
rf_CleanupState, /* release stripe locks, clean up */
|
||||||
|
rf_DecrAccessesCountState,
|
||||||
rf_LastState /* must be the last state */
|
rf_LastState /* must be the last state */
|
||||||
} RF_AccessState_t;
|
} RF_AccessState_t;
|
||||||
|
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/* $NetBSD: rf_layout.c,v 1.15 2003/12/30 21:59:03 oster Exp $ */
|
/* $NetBSD: rf_layout.c,v 1.16 2004/01/04 06:37:16 oster Exp $ */
|
||||||
/*
|
/*
|
||||||
* Copyright (c) 1995 Carnegie-Mellon University.
|
* Copyright (c) 1995 Carnegie-Mellon University.
|
||||||
* All rights reserved.
|
* All rights reserved.
|
||||||
@ -30,7 +30,7 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
#include <sys/cdefs.h>
|
#include <sys/cdefs.h>
|
||||||
__KERNEL_RCSID(0, "$NetBSD: rf_layout.c,v 1.15 2003/12/30 21:59:03 oster Exp $");
|
__KERNEL_RCSID(0, "$NetBSD: rf_layout.c,v 1.16 2004/01/04 06:37:16 oster Exp $");
|
||||||
|
|
||||||
#include <dev/raidframe/raidframevar.h>
|
#include <dev/raidframe/raidframevar.h>
|
||||||
|
|
||||||
@ -85,8 +85,8 @@ static const RF_AccessState_t DefaultStates[] = {
|
|||||||
rf_CreateDAGState,
|
rf_CreateDAGState,
|
||||||
rf_ExecuteDAGState,
|
rf_ExecuteDAGState,
|
||||||
rf_ProcessDAGState,
|
rf_ProcessDAGState,
|
||||||
rf_DecrAccessesCountState,
|
|
||||||
rf_CleanupState,
|
rf_CleanupState,
|
||||||
|
rf_DecrAccessesCountState,
|
||||||
rf_LastState};
|
rf_LastState};
|
||||||
|
|
||||||
#define RF_NU(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p) a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p
|
#define RF_NU(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p) a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p
|
||||||
|
Loading…
Reference in New Issue
Block a user