diff --git a/sbin/raidctl/raidctl.8 b/sbin/raidctl/raidctl.8 index b54cbee2d9af..eb8c32ebb50a 100644 --- a/sbin/raidctl/raidctl.8 +++ b/sbin/raidctl/raidctl.8 @@ -1,4 +1,4 @@ -.\" $NetBSD: raidctl.8,v 1.13 2000/02/24 23:52:46 oster Exp $ +.\" $NetBSD: raidctl.8,v 1.14 2000/02/25 22:24:46 oster Exp $ .\" .\" Copyright (c) 1998 The NetBSD Foundation, Inc. .\" All rights reserved. @@ -156,7 +156,8 @@ eligible to be the root partition. A RAID set configured this way will .Ar override the use of the boot disk as the root device. All components of the -set must be of type RAID in the disklabel. +set must be of type RAID in the disklabel. Note that the kernel being +booted must currently reside on a non-RAID set. .It Fl B Ar dev Initiate a copyback of reconstructed data from a spare disk to its original disk. This is performed after a component has failed, @@ -188,13 +189,11 @@ the reconstruction process if a component does have a hardware failure. .It Fl g Ar component Ar dev Get the component label for the specified component. .It Fl i Ar dev -Initialize (re-write) the parity on the device. This +Initialize the RAID device. In particular, (re-write) the parity on +the selected device. This .Ar MUST -be done before the RAID device is labeled and before -filesystems are created on the RAID device, and is normally used after -a system crash (and before a -.Xr fsck 8 ) -to ensure the integrity of the parity. +be done for all RAID sets before the RAID device is labeled and before +filesystems are created on the RAID device. .It Fl I Ar serial_number Ar dev Initialize the component labels on each component of the device. .Ar serial_number @@ -210,6 +209,9 @@ message, and returns successfully if the parity is up-to-date. .It Fl P Ar dev Check the status of the parity on the RAID set, and initialize (re-write) the parity if the parity is not known to be up-to-date. +This is normally used after a system crash (and before a +.Xr fsck 8 ) +to ensure the integrity of the parity. 
.It Fl r Ar component Ar dev Remove the spare disk specified by .Ar component @@ -225,8 +227,9 @@ component has a hardware failure. Display the status of the RAIDframe device for each of the components and spares. .It Fl S Ar dev -Check the status of component reconstruction. The output indicates -the amount of progress achieved in reconstructing a failed component. +Check the status of parity re-writing, component reconstruction, and +component copyback. The output indicates the amount of progress +achieved in each of these areas. .It Fl u Ar dev Unconfigure the RAIDframe device. .It Fl v @@ -250,7 +253,7 @@ files, a indicates the beginning of a comment. .Pp There are 4 required sections of a configuration file, and 2 -optional components. Each section begins with a +optional sections. Each section begins with a .Sq START , followed by the section name, and the confuration parameters associated with that @@ -289,8 +292,13 @@ operate in degraded mode. Note that it is .Ar imperative that the order of the components in the configuration file does not change between configurations of a RAID device. Changing the order -of the components (at least at the time of this writing) will result in -data loss. +of the components will result in data loss if the set is configured +with the +.Fl C +option. In normal circumstances, the RAID set will not configure if +only +.Fl c +is specified, and the components are out-of-order. .Pp The next section, which is the .Sq spare @@ -310,7 +318,7 @@ START spare for a configuration with a single spare component. If no spare drives are to be used in the configuration, then the .Sq spare -section may be omitted. +section may be omitted. .Pp The next section is the .Sq layout @@ -361,13 +369,11 @@ section. This is most often specified as: .Bd -unfilled -offset indent START queue -fifo 1 +fifo 100 .Ed .Pp where the queueing method is specified as fifo (first-in, first-out), -and the size of the per-component queue is limited to 1 request. 
A -value of 1 is quite conservative here, and values of 100 or more may -been used to increase the driver performance. +and the size of the per-component queue is limited to 100 requests. Other queuing methods may also be specified, but a discussion of them is beyond the scope of this document. .Pp @@ -385,38 +391,71 @@ for a more complete configuration file example. .Sh EXAMPLES -The examples in this section will focus on a RAID 5 configuration. -Other RAID configurations will behave similarly. It is highly -recommended that before using the RAID driver for real filesystems -that the system administrator(s) have used -.Ar all -of the options for +It is highly recommended that before using the RAID driver for real +filesystems that the system administrator(s) become quite familiar +with the use of .Nm "" , and that they understand how the component reconstruction process -works. While this example is not created as a tutorial, the steps -shown here can be easily duplicated using four equal-sized partitions -from any number of disks (including all four from a single disk). +works. The examples in this section will focus on configuring a +number of different RAID sets of varying degrees of redundancy. +By working through these examples, administrators should be able to +develop a good feel for how to configure a RAID set, and how to +initiate reconstruction of failed components. .Pp -The first step to configuring a RAID set is to mark the components -that will be used for that set. This is typically done by using -.Xr disklabel 8 -to create partitions of type -.Dv FS_BSDFFS or -.Dv FS_RAID . -The type -.Dv FS_RAID -is reserved for RAIDframe use, and is required for features such as -auto-configuration. A typical disklabel entry for a RAID component +In the following examples +.Sq raid0 +will be used to denote the RAID device. Depending on the +architecture, +.Sq /dev/rraid0c +or +.Sq /dev/rraid0d +may be used in place of +.Sq raid0 . 
+.Pp +The initial step in configuring a RAID set is to identify the components +that will be used in the RAID set. All components should be the same +size. Each component should have a disklabel type of +.Dv FS_RAID , +and a typical disklabel entry for a RAID component might look like: .Bd -unfilled -offset indent f: 1800000 200495 RAID # (Cyl. 405*- 4041*) .Ed .Pp -The primary uses of +While +.Dv FS_BSDFFS +will also work as the component type, the type +.Dv FS_RAID +is preferred for RAIDframe use, as it is required for features such as +auto-configuration. As part of the initial configuration of each RAID +set, each component will be given a +.Sq component label . +A +.Sq component label +contains important information about the component, including a +user-specified serial number, the row and column of that component in +the RAID set, the redundancy level of the RAID set, a 'modification +counter', and whether the parity information (if any) on that +component is known to be correct. Component labels are an integral +part of the RAID set, since they are used to ensure that components +are configured in the correct order, and used to keep track of other +vital information about the RAID set. Component labels are also +required for the auto-detection and auto-configuration of RAID sets at +boot time. For a component label to be considered valid, that +particular component label must be in agreement with the other +component labels in the set. For example, the serial number, +'modification counter', number of rows and number of columns must all +be in agreement. If any of these are different, then the component is +not considered to be part of the set. See +.Xr raid 4 +for more information about component labels. +.Pp +Once the components have been identified, and the disks have +appropriate labels, .Nm "" -is to configure and unconfigure +is then used to configure the .Xr raid 4 -devices. To configure the device, a configuration +device. 
To configure the device, a configuration file which looks something like: .Bd -unfilled -offset indent START array @@ -439,46 +478,20 @@ START queue fifo 100 .Ed .Pp -is first created. In short, this configuration file specifies a RAID -5 configuration consisting of the components /dev/sd1e, +is created in a file. In this example, the above configuration +will be in a file called +.Sq raid0.conf . +The above configuration file specifies a RAID +5 set consisting of the components /dev/sd1e, /dev/sd2e, and /dev/sd3e, with /dev/sd4e available as a .Sq hot spare in case one of -the three main drives should fail. If the above configuration is in a -file called -.Sq rfconfig , -raid device 0 in the normal case can be configured with: -.Bd -unfilled -offset indent -raidctl -c rfconfig raid0 -.Ed -.Pp -The above is equivalent to the following: -.Bd -unfilled -offset indent -raidctl -c rfconfig /dev/rraid0d -.Ed -.Pp -on the i386 architecture. On all other architectures, /dev/rraid0c -is used in place of /dev/rraid0d. -.Pp -A RAID set will not configure with -.Fl c -if the component labels are not correct. A -.Sq component label -contains important information about the component, including a -user-specified serial number, the row and column of that component in the RAID -set, and whether the data (and parity) on the component is -.Sq clean . -See -.Xr raid 4 -for more information about component labels. -.Pp -Since new RAID sets will not have correct component labels, the first -configuration of a RAID set must use +the three main drives should fail. +The first time a RAID set is configured, the .Fl C -instead of -.Fl c : +option must be used: .Bd -unfilled -offset indent -raidctl -C rfconfig raid0 +raidctl -C raid0.conf raid0 .Ed .Pp The @@ -487,7 +500,13 @@ forces the configuration to succeed, even if any of the component labels are incorrect. 
This option should not be used lightly in situations other than initial configurations, as if the system is refusing to configure a RAID set, there is probably a -very good reason for it. +very good reason for it. After the initial configuration is done (and +appropriate component labels are added with the +.Ar I +option) then raid0 can be configured normally with: +.Bd -unfilled -offset indent +raidctl -c raid0.conf raid0 +.Ed .Pp When the RAID set is configured for the first time, it is necessary to initialize the component labels, and to initialize the @@ -499,17 +518,37 @@ raidctl -I 112341 raid0 where .Sq 112341 is a user-specified serial number for the RAID set. Using different -serial numbers between RAID sets is strongly encouraged, as using the +serial numbers between RAID sets is +.Ar strongly encouraged , +as using the same serial number for all RAID sets will only serve to decrease the usefulness of the component label checking. .Pp -Initializing the parity on the RAID set is done via: +Initializing the RAID set is done via: .Bd -unfilled -offset indent raidctl -i raid0 .Ed .Pp -Initializing the parity in this way may also be required after an -unclean shutdown. Since it is the parity that provides the +This initialization includes ensuring that the parity (if any) on the +RAID set is correct. Since this initialization may be quite +time-consuming, the +.Ar v +option may also be used in conjunction with +.Ar i . +This will give more verbose output on the +status of the initialization: +.Bd -unfilled -offset indent +Initiating re-write of parity +Parity Re-write status: + 10% |**** | ETA: 06:03 / +.Ed +.Pp +The output provides a +.Sq Percent Complete +in both a numeric and graphical format, as well as an estimated time +to completion of the operation. +.Pp +Since it is the parity that provides the 'redundancy' part of RAID, it is critical that the parity is correct as much as possible. 
If the parity is not correct, then there is no guarantee that data will not be lost if a component fails. @@ -529,7 +568,8 @@ raidctl -p raid0 .Ed .Pp can be used to check the current status of the parity. To check the -parity and rebuild it necessary the command: +parity and rebuild it necessary (for example, after an unclean +shutdown) the command: .Bd -unfilled -offset indent raidctl -P raid0 .Ed @@ -555,9 +595,46 @@ Components: /dev/sd3e: optimal Spares: /dev/sd4e: spare +Component label for /dev/sd1e: + Row: 0 Column: 0 Num Rows: 1 Num Columns: 3 + Version: 2 Serial Number: 13432 Mod Counter: 65 + Clean: No Status: 0 + sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1 + RAID Level: 5 blocksize: 512 numBlocks: 1799936 + Autoconfig: No + Last configured as: raid0 +Component label for /dev/sd2e: + Row: 0 Column: 1 Num Rows: 1 Num Columns: 3 + Version: 2 Serial Number: 13432 Mod Counter: 65 + Clean: No Status: 0 + sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1 + RAID Level: 5 blocksize: 512 numBlocks: 1799936 + Autoconfig: No + Last configured as: raid0 +Component label for /dev/sd3e: + Row: 0 Column: 2 Num Rows: 1 Num Columns: 3 + Version: 2 Serial Number: 13432 Mod Counter: 65 + Clean: No Status: 0 + sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1 + RAID Level: 5 blocksize: 512 numBlocks: 1799936 + Autoconfig: No + Last configured as: raid0 +Parity status: clean +Reconstruction is 100% complete. +Parity Re-write is 100% complete. +Copyback is 100% complete. .Ed .Pp -This indicates that all is well with the RAID set. +This indicates that all is well with the RAID set. Of importance here +are the component lines which read +.Sq optimal , +and the +.Sq Parity status +line which indicates that the parity is up-to-date. Note that if +there are filesystems open on the RAID set, the individual components +will not be +.Sq clean +but the set as a whole can still be clean. 
.Pp To check the component label of /dev/sd1e, the following is used: .Bd -unfilled -offset indent @@ -566,25 +643,16 @@ raidctl -g /dev/sd1e raid0 .Pp The output of this command will look something like: .Bd -unfilled -offset indent -Component label for /dev/sd2e: -Version: 1 -Serial Number: 112341 -Mod counter: 6 -Row: 0 -Column: 1 -Num Rows: 1 -Num Columns: 3 -Clean: 0 -Status: optimal +Component label for /dev/sd1e: + Row: 0 Column: 0 Num Rows: 1 Num Columns: 3 + Version: 2 Serial Number: 13432 Mod Counter: 65 + Clean: No Status: 0 + sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1 + RAID Level: 5 blocksize: 512 numBlocks: 1799936 + Autoconfig: No + Last configured as: raid0 .Ed .Pp -For a component label to be considered valid, that particular -component label must be in agreement with the other component labels -in the set. For example, the serial number, 'modification counter', -number of rows and number of columns must all be in agreement. If any -of these are different, then the component is not considered to be -part of the set. -.Pp If for some reason (perhaps to test reconstruction) it is necessary to pretend a drive has failed, the following will perform that function: @@ -594,7 +662,7 @@ raidctl -f /dev/sd2e raid0 .Pp The system will then be performing all operations in degraded mode, where missing data is re-computed from existing data and the parity. -In this case, obtaining the status of raid0 will return: +In this case, obtaining the status of raid0 will return (in part): .Bd -unfilled -offset indent Components: /dev/sd1e: optimal @@ -627,6 +695,11 @@ Components: /dev/sd3e: optimal Spares: /dev/sd4e: used_spare +[...] +Parity status: clean +Reconstruction is 10% complete. +Parity Re-write is 100% complete. +Copyback is 100% complete. .Ed .Pp This indicates that a reconstruction is in progress. To find out how @@ -644,6 +717,11 @@ Components: /dev/sd3e: optimal Spares: /dev/sd4e: used_spare +[...] +Parity status: clean +Reconstruction is 100% complete. 
+Parity Re-write is 100% complete. +Copyback is 100% complete. .Ed .Pp At this point there are at least two options. First, if /dev/sd2e is @@ -656,7 +734,7 @@ be initiated with the .Fl B option. In this example, this would copy the entire contents of /dev/sd4e to /dev/sd2e. Once the copyback procedure is complete, the -status of the device would be: +status of the device would be (in part): .Bd -unfilled -offset indent Components: /dev/sd1e: optimal @@ -691,7 +769,7 @@ in a configuration file should be changed. .Pp If a component fails and there are no hot spares -available on-line, the status of the RAID set might look like: +available on-line, the status of the RAID set might (in part) look like: .Bd -unfilled -offset indent Components: /dev/sd1e: optimal