mdadm: A New Tool For Linux Software RAID Management
Starting an Array
Assemble mode is used to start an array that already exists. If you created an /etc/mdadm.conf file, you can automatically start an array listed there with the following command:
# mdadm -As /dev/md0
mdadm: /dev/md0 has been started with 2 drives.
The -A option denotes assemble mode. You can also use --assemble. The -s or --scan option tells mdadm to look in /etc/mdadm.conf for information about arrays and devices. If you want to start every array listed in /etc/mdadm.conf, don't specify an md device on the command line.
If you didn't create an /etc/mdadm.conf file, you will need to specify additional information on the command line in order to start an array. For example, this command attempts to start /dev/md0 using the devices listed on the command line:
# mdadm -A /dev/md0 /dev/sdb1 /dev/sdc1
Since using mdadm -A in this way assumes you have some prior knowledge about how arrays are arranged, it might not be useful on systems with arrays that were created by someone else. In that case, you may wish to examine some devices to get a better picture of how the arrays should be assembled. The examine option (-E or --examine) prints the md superblock (if present) from a block device that could be an array component.
# mdadm -E /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 84788b68:1bb79088:9a73ebcc:2ab430da
Creation Time : Mon Sep 23 16:02:33 2002
Raid Level : raid0
Device Size : 17920384 (17.09 GiB 18.40 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Mon Sep 23 16:14:52 2002
State : clean, no-errors
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : 8ab5e437 - correct
Events : 0.10
Chunk Size : 128K
      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1
   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync   /dev/sdd1
   3     3       8       65        3      active sync   /dev/sde1
mdadm's examine option displays quite a bit of useful information about component disks. In this case we can tell that /dev/sdc1 belongs to a RAID-0 made up of four member disks. What I specifically want to point out is the line of output that contains the UUID. A UUID is a 128-bit number that is guaranteed to be reasonably unique both on the local system and across other systems. It is randomly generated, using system hardware and timestamps as part of its seed. UUIDs are commonly used by many programs to uniquely tag devices. See the uuidgen and libuuid manual pages for more information.
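As a minimal sketch of how the examine output might be put to work, the shell function below reads mdadm -E output on standard input and prints the assemble command it implies. The function name assemble_cmd_from_examine is hypothetical, and the parsing assumes the version 00.90 superblock layout shown above. It only prints the command; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: turn `mdadm -E` output (00.90 layout, as shown above)
# into the matching assemble command. It prints the command; it does not run it.
assemble_cmd_from_examine() {
    awk '
        # "Preferred Minor : 0" gives the md device number.
        /Preferred Minor/ { minor = $NF }
        # Member rows start with a numeric index and end with a device path.
        # The row starting with "this" repeats one member, so it is skipped.
        $1 ~ /^[0-9]+$/ && $NF ~ /^\/dev\// { members = members " " $NF }
        END {
            # Fail if the input did not look like a superblock dump.
            if (minor == "" || members == "") exit 1
            print "mdadm -A /dev/md" minor members
        }
    '
}
```

It could be used as `mdadm -E /dev/sdc1 | assemble_cmd_from_examine`, which for the superblock above should suggest assembling /dev/md0 from the four listed partitions.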
When an array is created, the md driver generates a UUID for the array and stores it in the md superblock. You can use the UUID as a criterion for array assembly. In the next example I am going to activate the array to which /dev/sdc1 belongs using its UUID.
# mdadm -Av /dev/md0 --uuid=84788b68:1bb79088:9a73ebcc:2ab430da /dev/sd*
This command scans every SCSI disk (/dev/sd*) to see if it's a member of the array with the UUID 84788b68:1bb79088:9a73ebcc:2ab430da and then starts the array, assuming it found each component device. mdadm will produce a lot of output each time it tries to scan a device that does not exist. You can safely ignore such warnings.
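Rather than copying the UUID by hand, it can be pulled out of the examine output by a script. The following is a rough sketch, not part of mdadm itself: the function names extract_uuid and uuid_assemble_cmd are hypothetical, and the parsing assumes the "UUID : value" line format shown earlier. The second function only prints the assemble command.

```shell
#!/bin/sh
# Hypothetical helper: pull the array UUID out of `mdadm -E` output on stdin.
extract_uuid() {
    awk '$1 == "UUID" && $2 == ":" { print $3; exit }'
}

# Hypothetical helper: print (not run) the UUID-based assemble command
# for the given md device, using the same /dev/sd* pattern as above.
uuid_assemble_cmd() {
    md_dev=${1:-}
    uuid=$(extract_uuid) || return 1
    # No UUID line means the input was not a usable superblock dump.
    [ -n "$md_dev" ] && [ -n "$uuid" ] || return 1
    printf 'mdadm -A %s --uuid=%s /dev/sd*\n' "$md_dev" "$uuid"
}
```

For example, `mdadm -E /dev/sdc1 | uuid_assemble_cmd /dev/md0` would print a command of the same form as the one shown above.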
Managing Arrays
Using Manage mode, you can add disks to and remove disks from a running array. This is useful for removing failed disks, adding spare disks, or adding replacement disks. Manage mode can also be used to mark a member disk as failed. Manage mode replicates the functions of raidtools programs such as raidsetfaulty, raidhotremove, and raidhotadd.
For example, to add a disk to an active array, replicating the raidhotadd command:
# mdadm /dev/md0 --add /dev/sdc1
Or, to remove /dev/sdc1 from /dev/md0 try:
# mdadm /dev/md0 --fail /dev/sdc1 --remove /dev/sdc1
Notice that I first mark /dev/sdc1 as failed and then remove it. This is the same as using the raidsetfaulty and raidhotremove commands with raidtools. It's fine to combine add, fail, and remove options on a single command line as long as they make sense in terms of array management. For example, you have to fail a disk before removing it.
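The ordering rule can be captured in a small dry-run helper. This is only a sketch: swap_disk_plan is a hypothetical name, and /dev/sdf1 in the usage note is an assumed replacement partition that does not appear elsewhere in this article. The function prints the Manage-mode commands in a safe order and leaves running them to you.

```shell
#!/bin/sh
# Hypothetical helper: print the Manage-mode commands for swapping a disk.
# Arguments: array, disk to retire, replacement disk.
# Nothing is executed; review the output before running it by hand.
swap_disk_plan() {
    array=${1:-} old=${2:-} new=${3:-}
    if [ -z "$array" ] || [ -z "$old" ] || [ -z "$new" ]; then
        echo "usage: swap_disk_plan ARRAY OLD_DISK NEW_DISK" >&2
        return 2
    fi
    # A disk must be marked failed before it can be removed.
    echo "mdadm $array --fail $old --remove $old"
    # The replacement is added only after the old disk is out.
    echo "mdadm $array --add $new"
}
```

Calling `swap_disk_plan /dev/md0 /dev/sdc1 /dev/sdf1` prints the fail-and-remove command followed by the add command.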
Monitoring Arrays
Follow, or Monitor, mode provides some of mdadm's best and most distinctive features. Using Follow/Monitor mode you can daemonize mdadm and configure it to send email alerts to system administrators when arrays encounter errors or fail. You can also use Follow mode to execute arbitrary commands when a disk fails. For example, you might want to try removing and reinserting a failed disk in an attempt to correct a non-fatal failure without user intervention.
The following command will monitor /dev/md0 (polling every 300 seconds) for critical events. When a fatal error occurs, mdadm will send an email to sysadmin. You can tailor the polling interval and email address to meet your needs.
# mdadm --monitor --mail=sysadmin --delay=300 /dev/md0
When using monitor mode, mdadm will not exit, so you might want to wrap it in nohup and run it in the background with an ampersand:
# nohup mdadm --monitor --mail=sysadmin --delay=300 /dev/md0 &
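Monitor mode can also run a program when an event occurs, by way of the --program option. The handler below is a minimal sketch under one assumption you should check against your mdadm manual page: that the program receives the event name, the md device, and (for some events) the affected component device as its arguments. The function name handle_md_event is hypothetical.

```shell
#!/bin/sh
# Hypothetical event handler for mdadm's monitor mode.
# Assumption: mdadm calls the program with the event name, the md device,
# and (for some events) the affected component device.
handle_md_event() {
    event=${1:-} array=${2:-} component=${3:-}
    case $event in
        Fail)
            # A member disk was marked faulty.
            echo "disk ${component:-unknown} failed in $array" ;;
        "")
            echo "no event given" >&2
            return 2 ;;
        *)
            # Any other event is just reported.
            echo "event $event on $array" ;;
    esac
}
```

To use it as a real handler, you would save it as an executable script that ends with `handle_md_event "$@"` and pass that script's path to --program alongside the --monitor options shown above.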
Follow/Monitor mode also allows arrays to share spare disks, a feature that has been lacking in Linux
software RAID since its inception. That means you only need to provide one spare disk for a group of
arrays or for all arrays. It also means that system administrators don't have to manually intervene to shuffle around spare disks when arrays fail. Previously this functionality was available only using hardware RAID. When Follow/Monitor mode is invoked, it polls arrays at regular intervals. When a disk failure is detected on an array without a spare disk, mdadm will remove an available spare disk from another array and insert it into the array with the failed disk. To facilitate this process, each ARRAY line in /etc/mdadm.conf needs to have a spare-group defined.
DEVICE /dev/sd*
ARRAY /dev/md0 level=raid1 num-devices=3 spare-group=database \
UUID=410a299e:4cdd535e:169d3df4:48b7144a
ARRAY /dev/md1 level=raid1 num-devices=2 spare-group=database \
UUID=59b6e564:739d4d28:ae0aa308:71147fe7
In this example, both /dev/md0 and /dev/md1 are part of the spare group database. Assume that /dev/md0 is a two-disk RAID-1 with a single spare disk. If mdadm is running in monitor mode (as I showed earlier), and a disk in /dev/md1 fails, mdadm will remove the spare disk from /dev/md0 and insert it into /dev/md1.
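Before relying on spare sharing, it may help to confirm which arrays actually share a group. The sketch below reads mdadm.conf-style text on standard input and prints one "group array" pair per line. The function name list_spare_groups is hypothetical, and it assumes that spare-group= sits on the same line as the ARRAY keyword, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: list which arrays belong to which spare group.
# Reads mdadm.conf-style text on stdin and prints "group array" pairs.
# Assumption: spare-group= appears on the same line as the ARRAY keyword.
list_spare_groups() {
    awk '
        $1 == "ARRAY" {
            # Field 2 is the md device; scan the remaining fields.
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^spare-group=/) {
                    sub(/^spare-group=/, "", $i)
                    print $i, $2
                }
            }
        }
    '
}
```

Running `list_spare_groups < /etc/mdadm.conf | sort` groups the arrays by spare group, which makes an array that was left out of a group easy to spot.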
mdadm has many other options that I haven't covered here. I strongly recommend reading its manual page for further details. Remember, you don't have to switch to mdadm. raidtools is still in development and has many years of work behind it. Still, I find that mdadm is a worthy replacement. It is both feature-rich and intuitive, and there's no harm in trying out alternatives.
Derek Vadala is the author of O'Reilly's Managing RAID on Linux. He has written for magazines including SysAdmin, The Perl Journal and Linux Journal.
O'Reilly & Associates will soon release (December 2002) Managing RAID on Linux.