Commit | Line | Data |
---|---|---|
7675959b | 1 | .\" See file COPYING in distribution for details. |
7f0066ba | 2 | .TH MDMON 8 "" v3.1 |
7675959b DW |
3 | .SH NAME |
4 | mdmon \- monitor MD external metadata arrays | |
5 | ||
6 | .SH SYNOPSIS | |
7 | ||
8 | .BI mdmon " CONTAINER [NEWROOT]" | |
9 | ||
10 | .SH OVERVIEW | |
11 | The 2.6.27 kernel brings the ability to support external metadata arrays. | |
12 | External metadata implies that user space handles all updates to the metadata. | |
13 | The kernel's responsibility is to notify user space when a "metadata event" | |
14 | occurs, like disk failures and clean-to-dirty transitions. The kernel, in | |
15 | important cases, waits for user space to take action on these notifications. | |
16 | ||
17 | .SH DESCRIPTION | |
e0fe762a N |
18 | .SS Metadata updates: |
19 | To service metadata update requests a daemon, | |
20 | .IR mdmon , | |
21 | is introduced. | |
22 | .I Mdmon | |
23 | is tasked with polling the sysfs namespace looking for changes in | |
7675959b DW |
24 | .BR array_state , |
25 | .BR sync_action , | |
26 | and per disk | |
27 | .BR state | |
28 | attributes. When a change is detected it calls a per metadata type | |
29 | handler to make modifications to the metadata. The following actions | |
30 | are taken: | |
31 | .RS | |
32 | .TP | |
33 | .B array_state \- inactive | |
34 | Clear the dirty bit for the volume and let the array be stopped | |
35 | .TP | |
36 | .B array_state \- write pending | |
37 | Set the dirty bit for the array and then set | |
38 | .B array_state | |
39 | to | |
40 | .BR active . | |
41 | Writes | |
42 | are blocked until userspace writes | |
43 | .BR active . | |
44 | .TP | |
45 | .B array_state \- active-idle | |
46 | The safe mode timer has expired, so set the array state to clean to block writes to the array | |
47 | .TP | |
48 | .B array_state \- clean | |
49 | Clear the dirty bit for the volume | |
50 | .TP | |
51 | .B array_state \- read-only | |
e0fe762a N |
52 | This is the initial state that all arrays start at. |
53 | .I mdmon | |
54 | takes one of three actions: | |
7675959b DW |
55 | .RS |
56 | .TP | |
57 | 1/ | |
58 | Transition the array to read-auto keeping the dirty bit clear if the metadata | |
59 | handler determines that the array does not need resyncing or other modification | |
60 | .TP | |
61 | 2/ | |
62 | Transition the array to active if the metadata handler determines a resync or | |
63 | some other manipulation is necessary | |
64 | .TP | |
65 | 3/ | |
66 | Leave the array read\-only if the volume is marked to not be monitored; for | |
67 | example, the metadata version has been set to "external:\-dev/md127" instead of | |
68 | "external:/dev/md127" | |
69 | .RE | |
70 | .TP | |
71 | .B sync_action \- resync\-to\-idle | |
72 | Notify the metadata handler that a resync may have completed. If a resync | |
73 | process is idled before it completes, this event allows the metadata handler to | |
74 | checkpoint resync. | |
75 | .TP | |
76 | .B sync_action \- recover\-to\-idle | |
77 | A spare may have completed rebuilding so tell the metadata handler about the | |
e0fe762a N |
78 | state of each disk. This is the metadata handler's opportunity to clear |
79 | any "out-of-sync" bits and clear the volume's degraded status. If a recovery | |
7675959b DW |
80 | process is idled before it completes, this event allows the metadata handler to |
81 | checkpoint recovery. | |
82 | .TP | |
83 | .B <disk>/state \- faulty | |
84 | A disk failure kicks off a series of events. First, notify the metadata | |
85 | handler that a disk has failed, and then notify the kernel that it can unblock | |
86 | writes that were dependent on this disk. After unblocking the kernel this disk | |
e0fe762a | 87 | is set to be removed+ from the member array. Finally the disk is marked failed |
7675959b DW |
88 | in all other member arrays in the container. |
89 | .IP | |
e0fe762a | 90 | + Note: this behavior differs slightly from native MD arrays, where
7675959b DW |
91 | removal is reserved for a |
92 | .B mdadm --remove | |
93 | event. In the external metadata case the container holds the final | |
94 | reference on a block device and a | |
95 | .B mdadm --remove <container> <victim> | |
96 | call is still required. | |
97 | .RE | |
98 | ||
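The array_state handling listed above can be summarized as a small decision table. The following Python sketch is illustrative only: the function name, its parameters, and the returned descriptions are hypothetical and are not taken from the mdmon source. It assumes the sysfs spellings of the states ("write-pending", "readonly"), which differ slightly from the headings used in this page.

```python
# Hypothetical sketch of the array_state decisions described above.
# Nothing here is mdmon's real code; names and return values are illustrative.

def next_action(array_state, needs_resync=False, monitored=True):
    """Return (metadata action, new array_state or None) for an observed state.

    needs_resync and monitored stand in for what the metadata handler and
    the metadata version string would tell a real monitor (assumptions).
    """
    if array_state == "inactive":
        # Clear the dirty bit and let the array be stopped.
        return ("clear dirty bit", None)
    if array_state == "write-pending":
        # Writes stay blocked until user space writes "active".
        return ("set dirty bit", "active")
    if array_state == "active-idle":
        # Safe mode timer expired: move to clean to block writes.
        return ("safe mode timer expired", "clean")
    if array_state == "clean":
        return ("clear dirty bit", None)
    if array_state == "readonly":
        if not monitored:
            return ("leave read-only", None)
        if needs_resync:
            return ("resync or other change needed", "active")
        return ("keep dirty bit clear", "read-auto")
    # Any other state requires no metadata action in this sketch.
    return ("no metadata action", None)
```

A real monitor would write the second element of the result back to the array's array_state attribute when it is not None.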
e0fe762a | 99 | .SS Containers: |
7675959b DW |
100 | .P |
101 | External metadata formats, like DDF, differ from the native MD metadata | |
102 | formats in that they define a set of disks and a series of sub-arrays | |
103 | within those disks. MD metadata, in comparison, defines a 1:1 | |
104 | relationship between a set of block devices and a raid array. For | |
105 | example, to create 2 arrays at different raid levels on a single | |
106 | set of disks, MD metadata requires the disks be partitioned and then | |
107 | each array can then be created with a subset of those partitions. The | |
108 | supported external formats perform this disk carving internally. | |
109 | .P | |
110 | Container devices simply hold references to all member disks and allow | |
e0fe762a N |
111 | tools like |
112 | .I mdmon | |
113 | to determine which active arrays belong to which | |
7675959b DW |
114 | container. Some array management commands like disk removal and disk |
115 | add are now only valid at the container level. Attempts to perform | |
116 | these actions on member arrays are blocked with error messages like: | |
117 | .IP | |
118 | "mdadm: Cannot remove disks from a \'member\' array, perform this | |
119 | operation on the parent container" | |
120 | .P | |
121 | Containers are identified in /proc/mdstat with a metadata version string | |
122 | "external:<metadata name>". Member devices are identified by | |
123 | "external:/<container device>/<member index>", or "external:-<container | |
124 | device>/<member index>" if the array is to remain read\-only. | |
125 | ||
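The metadata version strings described above can be told apart mechanically. The Python sketch below is a hypothetical helper, not part of mdadm or mdmon; it assumes the string has exactly the forms given in this section, with the member index as the last "/"-separated field.

```python
# Hypothetical parser for the metadata version strings described above.
# The function name and the returned dictionary keys are illustrative.

def parse_metadata_version(version):
    prefix = "external:"
    if not version.startswith(prefix):
        # Native MD metadata, e.g. a plain version number.
        return {"kind": "native", "raw": version}
    rest = version[len(prefix):]
    if rest[:1] in ("/", "-"):
        # Member array: "/" means monitored, "-" means keep read-only.
        container, sep, member = rest[1:].rpartition("/")
        if not sep:
            # No member index present; treat the whole field as the container.
            container, member = member, None
        return {
            "kind": "member",
            "container": container,
            "member": member,
            "monitored": rest[0] == "/",
        }
    # Container: the remainder is the metadata format name.
    return {"kind": "container", "format": rest}
```

For example, a string of the form "external:\-md127/0" would be reported as a member of md127 that is not to be monitored.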
126 | .SH OPTIONS | |
127 | .TP | |
128 | CONTAINER | |
129 | The | |
130 | .B container | |
131 | device to monitor. It can be a full path like /dev/md/container, a simple md | |
e0fe762a N |
132 | device name like md127, or /proc/mdstat which tells |
133 | .I mdmon | |
134 | to scan for containers and launch an | |
135 | .I mdmon | |
136 | instance for each one found. | |
7675959b DW |
137 | .TP |
138 | [NEWROOT] | |
e0fe762a N |
139 | In order to support an external metadata raid array as the rootfs |
140 | .I mdmon | |
141 | needs to be started in the initramfs environment. Once the initramfs | |
142 | environment mounts the final rootfs | |
143 | .I mdmon | |
144 | needs to be restarted in the new namespace. When NEWROOT is specified | |
145 | .I mdmon | |
146 | will terminate any | |
147 | .I mdmon | |
148 | instances that are running in the current namespace, | |
149 | .IR chroot (2) | |
150 | to NEWROOT, and continue monitoring the container. | |
151 | .PP | |
152 | Note that | |
153 | .I mdmon | |
154 | is automatically started by | |
155 | .I mdadm | |
156 | when needed and so does not need to be considered when working with | |
157 | RAID arrays. The only time it is run other than by | |
158 | .I mdadm | |
159 | is when the boot scripts need to restart it after mounting the new | |
160 | root filesystem. | |
7675959b | 161 | |
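As a rough illustration of the synopsis, a boot script written in Python might assemble the command line as follows. The helper name is hypothetical, and the sketch only builds the argument list described in this page; it does not run anything.

```python
# Hypothetical helper that builds an argument list matching the synopsis
# "mdmon CONTAINER [NEWROOT]". It does not execute mdmon.

def mdmon_argv(container, newroot=None):
    argv = ["mdmon", container]
    if newroot is not None:
        # NEWROOT is optional and follows CONTAINER.
        argv.append(newroot)
    return argv
```

A script could pass the result to subprocess.run() after the final root filesystem has been mounted; whether that is appropriate depends on the distribution's boot scripts.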
e0fe762a N |
162 | .SH SEE ALSO |
163 | .IR mdadm (8), | |
164 | .IR md (4). |