Add a description of how to possibly recover a RAID set in the
event of a multiple disk failure.
parent 15d16b2223
commit c4aed2da0e
@@ -1,4 +1,4 @@
-.\" $NetBSD: raidctl.8,v 1.26 2001/11/16 11:06:46 wiz Exp $
+.\" $NetBSD: raidctl.8,v 1.27 2002/01/20 02:30:11 oster Exp $
.\"
.\" Copyright (c) 1998 The NetBSD Foundation, Inc.
.\" All rights reserved.
@@ -962,6 +962,93 @@ raidctl -F component1 raid0
at which point the data missing from
.Sq component1
would be reconstructed onto /dev/sd3e.
.Pp
When more than one component is marked as
.Sq failed
due to a non-component hardware failure (e.g. loss of power to two
components, adapter problems, termination problems, or cabling issues) it
is quite possible to recover the data on the RAID set. The first
thing to be aware of is that the first disk to fail will almost certainly
be out-of-sync with the remainder of the array. If any IO was
performed between the time the first component is considered
.Sq failed
and when the second component is considered
.Sq failed ,
then the first component to fail will
.Ar not
contain correct data, and should be ignored. When the second
component is marked as failed, however, the RAID device will
(currently) panic the system. At this point the data on the RAID set
(not including the first failed component) is still self-consistent,
and will be in no worse state of repair than had the power gone out in
the middle of a write to a filesystem on a non-RAID device.
The problem, however, is that the component labels may now have 3
different 'modification counters' (one value on the first component
that failed, one value on the second component that failed, and a
third value on the remaining components). In such a situation, the
RAID set will not autoconfigure, and can only be forcibly re-configured
with the
.Fl C
option. To recover the RAID set, one must first remedy whatever physical
problem caused the multiple-component failure. After that is done,
the RAID set can be restored by forcibly configuring the raid set
.Ar without
the component that failed first. For example, if /dev/sd1e and
/dev/sd2e fail (in that order) in a RAID set of the following
configuration:
.Bd -literal -offset indent
START array
1 4 0

START drives
/dev/sd1e
/dev/sd2e
/dev/sd3e
/dev/sd4e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
64 1 1 5

START queue
fifo 100

.Ed
.Pp
then the following configuration (say "recover_raid0.conf")
.Bd -literal -offset indent
START array
1 4 0

START drives
/dev/sd6e
/dev/sd2e
/dev/sd3e
/dev/sd4e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
64 1 1 5

START queue
fifo 100
.Ed
.Pp
(where /dev/sd6e has no physical device) can be used with
.Bd -literal -offset indent
raidctl -C recover_raid0.conf raid0
.Ed
.Pp
to force the configuration of raid0. A
.Bd -literal -offset indent
raidctl -I 12345 raid0
.Ed
.Pp
will be required in order to synchronize the component labels.
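.Pp
Once the forced configuration and label synchronization have been done,
the state of the set can be confirmed with the usual status query, for
example:
.Bd -literal -offset indent
raidctl -s raid0
.Ed
.Pp
where the slot that was configured with the non-existent /dev/sd6e
should show up as failed.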
At this point the filesystems on the RAID set can then be checked and
corrected.
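.Pp
For example, if the set carries a single filesystem in the
.Sq a
partition, the check might be no more than (the partition letter here is
purely illustrative):
.Bd -literal -offset indent
fsck /dev/rraid0a
.Ed
.Pp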
To complete the re-construction of the RAID set, /dev/sd1e is simply
hot-added back into the array, and reconstructed as described earlier.
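.Pp
Assuming the slot left empty by the missing /dev/sd6e appears in the
status output as
.Sq component0
(the actual name should be taken from the output of raidctl -s), that
final step would look something like:
.Bd -literal -offset indent
raidctl -a /dev/sd1e raid0
raidctl -F component0 raid0
.Ed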
.Ss RAID on RAID
RAID sets can be layered to create more complex and much larger RAID
sets. A RAID 0 set, for example, could be constructed from four RAID