.\" the Free Software Foundation; either version 2 of the License, or
.\" (at your option) any later version.
.\" See file COPYING in distribution for details.
-.TH MDADM 8 "" v3.2.5
+.TH MDADM 8 "" v4.0
.SH NAME
mdadm \- manage MD devices
.I aka
Linear and RAID levels 0/1/4/5/6,
changing the RAID level between 0, 1, 5, and 6, and between 0 and 10,
changing the chunk size and layout for RAID 0,4,5,6,10 as well as adding or
-removing a write-intent bitmap.
+removing a write-intent bitmap and changing the array's consistency policy.
.TP
.B "Incremental Assembly"
.P
If a device is given before any options, or if the first option is
+one of
.BR \-\-add ,
+.BR \-\-re\-add ,
+.BR \-\-add\-spare ,
.BR \-\-fail ,
.BR \-\-remove ,
or
by a digit string). See below under
.BR "Auto Assembly" .
+The special name "\fBany\fP" can be used as a wild card. If an array
+is created with
+.B \-\-homehost=any
+then the name "\fBany\fP" will be stored in the array and it can be
+assembled in the same way on any host. If an array is assembled with
+this option, then the homehost recorded on the array will be ignored.
+
.TP
.B \-\-prefer=
When
and
.BR \-\-monitor .
+.TP
+.B \-\-home\-cluster=
+specifies the cluster name for the md device. The md device can be assembled
+only on the cluster which matches the name specified. If this option is not
+provided, mdadm tries to detect the cluster name automatically.
+
.SH For create, build, or grow:
.TP
.TP
.BR \-z ", " \-\-size=
-Amount (in Kibibytes) of space to use from each drive in RAID levels 1/4/5/6.
+Amount (in Kilobytes) of space to use from each drive in RAID levels 1/4/5/6.
This must be a multiple of the chunk size, and must leave about 128Kb
of space at the end of the drive for the RAID superblock.
If this is not specified
size, though if there is a variance among the drives of greater than 1%, a warning is
issued.
-A suffix of 'M' or 'G' can be given to indicate Megabytes or
+A suffix of 'K', 'M' or 'G' can be given to indicate Kilobytes, Megabytes or
Gigabytes respectively.
Sometimes a replacement drive can be a little smaller than the
.B "\-\-grow \-\-array\-size="
command.
-A suffix of 'M' or 'G' can be given to indicate Megabytes or
+A suffix of 'K', 'M' or 'G' can be given to indicate Kilobytes, Megabytes or
Gigabytes respectively.
A value of
.B max
restores the apparent size of the array to be whatever the real
amount of available space is.
+Clustered arrays do not support this parameter yet.
+
.TP
.BR \-c ", " \-\-chunk=
-Specify chunk size of kibibytes. The default when creating an
+Specify chunk size in kilobytes. The default when creating an
array is 512KB. To ensure compatibility with earlier versions, the
-default when Building and array with no persistent metadata is 64KB.
+default when building an array with no persistent metadata is 64KB.
This is only meaningful for RAID0, RAID4, RAID5, RAID6, and RAID10.
RAID4, RAID5, RAID6, and RAID10 require the chunk size to be a power
of 2. In any case it must be a multiple of 4KB.
-A suffix of 'M' or 'G' can be given to indicate Megabytes or
+A suffix of 'K', 'M' or 'G' can be given to indicate Kilobytes, Megabytes or
Gigabytes respectively.
.TP
.B "none"
is given with
.B \-\-grow
-mode, then any bitmap that is present is removed.
+mode, then any bitmap that is present is removed. If the word
+.B "clustered"
+is given, the array is created for a clustered environment. One bitmap
+is created for each node as defined by the
+.B \-\-nodes
+parameter, and these bitmaps are stored internally.
To help catch typing errors, the filename must contain at least one
slash ('/') if it is a real file (not 'internal' or 'none').
.I mdadm
automatically adds an internal bitmap as it will usually be
beneficial. This can be suppressed with
-.B "\-\-bitmap=none".
+.B "\-\-bitmap=none"
+or by selecting a different consistency policy with
+.BR \-\-consistency\-policy .
.TP
.BR \-\-bitmap\-chunk=
bitmap, the chunksize defaults to 64Meg, or larger if necessary to
fit the bitmap into the available space.
-A suffix of 'M' or 'G' can be given to indicate Megabytes or
+A suffix of 'K', 'M' or 'G' can be given to indicate Kilobytes, Megabytes or
Gigabytes respectively.
.TP
.BR \-\-create ,
or
.B \-\-add
-command will be flagged as 'write-mostly'. This is valid for RAID1
+command will be flagged as 'write\-mostly'. This is valid for RAID1
only and means that the 'md' driver will avoid reading from these
devices if at all possible. This can be useful if mirroring over a
slow link.
mode, and write-behind is only attempted on drives marked as
.IR write-mostly .
+.TP
+.BR \-\-failfast
+subsequent devices listed in a
+.B \-\-create
+or
+.B \-\-add
+command will be flagged as 'failfast'. This is valid for RAID1 and
+RAID10 only. IO requests to these devices will be encouraged to fail
+quickly rather than cause long delays due to error handling. Also no
+attempt is made to repair a read error on these devices.
+
+If an array becomes degraded so that the 'failfast' device is the only
+usable device, the 'failfast' flag will then be ignored and extended
+delays will be preferred to complete failure.
+
+The 'failfast' flag is appropriate for storage arrays which have a
+low probability of true failure, but which may sometimes
+cause unacceptable delays due to internal maintenance functions.
+
.TP
.BR \-\-assume\-clean
Tell
which computed a different offset.
Setting the offset explicitly over-rides the default. The value given
-is in Kilobytes unless an 'M' or 'G' suffix is given.
+is in Kilobytes unless a suffix of 'K', 'M' or 'G' is used to explicitly
+indicate Kilobytes, Megabytes or Gigabytes respectively.
Since Linux 3.4,
.B \-\-data\-offset
Start the array
.B read only
rather than read-write as normal. No writes will be allowed to the
-array, and no resync, recovery, or reshape will be started.
+array, and no resync, recovery, or reshape will be started. This option works in
+Create, Assemble, Manage and Misc modes.
.TP
.BR \-a ", " "\-\-auto{=yes,md,mdp,part,p}{NN}"
the number of devices in a RAID0, it is necessary to set the new
number of devices, and to add the new devices, in the same command.
+.TP
+.BR \-\-nodes
+Only works when the array is for a clustered environment. It specifies
+the maximum number of nodes in the cluster that will use this device
+simultaneously. If not specified, this defaults to 4.
+
+.TP
+.BR \-\-write\-journal
+Specify a journal device for a RAID-4/5/6 array. The journal device
+should be an SSD with reasonable lifetime.
+
+.TP
+.BR \-\-symlinks
+Control automatic creation of symlinks in /dev to /dev/md. The value of
+\-\-symlinks must be 'no' or 'yes'. This option works with \-\-create
+and \-\-build.
+
+.TP
+.BR \-k ", " \-\-consistency\-policy=
+Specify how the array maintains consistency in case of unexpected shutdown.
+Only relevant for RAID levels with redundancy.
+Currently supported options are:
+.RS
+
+.TP
+.B resync
+Full resync is performed and all redundancy is regenerated when the array is
+started after unclean shutdown.
+
+.TP
+.B bitmap
+Resync assisted by a write-intent bitmap. Implicitly selected when using
+.BR \-\-bitmap .
+
+.TP
+.B journal
+For RAID levels 4/5/6, a journal device is used to log transactions and replay
+them after an unclean shutdown. Implicitly selected when using
+.BR \-\-write\-journal .
+
+.TP
+.B ppl
+For RAID5 only, the Partial Parity Log is used to close the write hole and
+eliminate resync. PPL is stored in the metadata region of RAID member drives,
+so no additional journal drive is needed.
+
+.PP
+Can be used with \-\-grow to change the consistency policy of an active array
+in some cases. See CONSISTENCY POLICY CHANGES below.
+.RE
+
+
.SH For assemble:
.TP
.BR summaries ,
.BR uuid ,
.BR name ,
+.BR nodes ,
.BR homehost ,
+.BR home\-cluster ,
.BR resync ,
.BR byteorder ,
.BR devicesize ,
.BR no\-bitmap ,
.BR bbl ,
-.BR no-\bbl ,
+.BR no\-bbl ,
+.BR ppl ,
+.BR no\-ppl ,
.BR metadata ,
or
.BR super\-minor .
of the array as stored in the superblock. This is only supported for
version-1 superblocks.
+The
+.B nodes
+option will change the
+.I nodes
+of the array as stored in the bitmap superblock. This option only
+works for a clustered environment.
+
The
.B homehost
option will change the
same as updating the UUID.
For version-1 superblocks, this involves updating the name.
+The
+.B home\-cluster
+option will change the cluster name as recorded in the superblock and
+bitmap. This option only works for a clustered environment.
+
The
.B resync
option will cause the array to be marked
The
.B byteorder
option allows arrays to be moved between machines with different
-byte-order.
+byte-order, such as from a big-endian machine like a Sparc or some
+MIPS machines, to a little-endian x86_64 machine.
When assembling such an array for the first time after a move, giving
.B "\-\-update=byteorder"
will cause
removed. If the bad block list contains entries, this will fail, as
removing the list could cause data corruption.
+The
+.B ppl
+option will enable PPL for a RAID5 array and reserve space for PPL on each
+device. There must be enough free space between the data and superblock, and a
+write-intent bitmap or journal must not be used.
+
+The
+.B no\-ppl
+option will disable PPL in the superblock.
+
.TP
.BR \-\-freeze\-reshape
Option is intended to be used in start-up scripts during initrd boot phase.
.B \-\-continue
option for the grow command.
+.TP
+.BR \-\-symlinks
+See this option under Create and Build options.
+
.SH For Manage mode:
.TP
useful if you are certain that the reason for failure has been
resolved.
+.TP
+.B \-\-add\-spare
+Add a device as a spare. This is similar to
+.B \-\-add
+except that it does not attempt
+.B \-\-re\-add
+first. The device will be added as a spare even if it looks like it
+could be a recent member of the array.
+
.TP
.BR \-r ", " \-\-remove
remove listed devices. They must not be active. i.e. they should
.BR \-\-readwrite
Subsequent devices that are added or re\-added will have the 'write-mostly'
flag cleared.
+.TP
+.BR \-\-cluster\-confirm
+Confirm the existence of the device. This is issued in response to an \-\-add
+request by a node in a cluster. When a node adds a device it sends a message
+to all nodes in the cluster to look for a device with a UUID. This translates
+to a udev notification with the UUID of the device to be added and the slot
+number. The receiving node must acknowledge this message
+with \-\-cluster\-confirm. Valid arguments are <slot>:<devicename> in case
+the device is found or <slot>:missing in case the device is not found.
+
+.TP
+.BR \-\-add\-journal
+Recreate the journal for a RAID-4/5/6 array that lost a journal device. In the
+current implementation, this command cannot add a journal to an array
+that had a failed journal. To avoid interrupting on-going write operations,
+.B \-\-add\-journal
+only works for an array in Read-Only state.
+
+.TP
+.BR \-\-failfast
+Subsequent devices that are added or re\-added will have
+the 'failfast' flag set. This is only valid for RAID1 and RAID10 and
+means that the 'md' driver will avoid long timeouts on error handling
+where possible.
+.TP
+.BR \-\-nofailfast
+Subsequent devices that are re\-added will be re\-added without
+the 'failfast' flag set.
.P
Each of these options requires that the first device listed is the array
.TP
.BR \-Y ", " \-\-export
When used with
-.B \-\-detail , \-\-detail-platform
-or
+.BR \-\-detail ,
+.BR \-\-detail\-platform ,
.BR \-\-examine ,
+or
+.B \-\-incremental
output will be formatted as
.B key=value
pairs for easy import into the environment.
+With
+.BR \-\-incremental ,
+the value
+.B MD_STARTED
+indicates whether an array was started
+.RB ( yes )
+or not, which may include a reason
+.RB ( unsafe ", " nothing ", " no ).
+Also the value
+.B MD_FOREIGN
+indicates if the array is expected on this host
+.RB ( no ),
+or seems to be from elsewhere
+.RB ( yes ).
+
.TP
.BR \-E ", " \-\-examine
Print contents of the metadata stored on the named device(s).
kernel handles dirty-clean transitions at shutdown. No action is taken
if safe-mode handling is disabled.
+.TP
+.B \-\-action=
+Set the "sync_action" for all md devices given to one of
+.BR idle ,
+.BR frozen ,
+.BR check ,
+.BR repair .
+Setting to
+.B idle
+will abort any currently running action though some actions will
+automatically restart.
+Setting to
+.B frozen
+will abort any current action and ensure no other action starts
+automatically.
+
+Details of
+.B check
+and
+.B repair
+can be found in
+.IR md (4)
+under
+.BR "SCRUBBING AND MISMATCHES" .
+
.SH For Incremental Assembly mode:
.TP
.BR \-\-rebuild\-map ", " \-r
will automatically be added unless some other option is explicitly
requested with the
.B \-\-bitmap
-option. In any case space for a bitmap will be reserved so that one
-can be added layer with
+option or a different consistency policy is selected with the
+.B \-\-consistency\-policy
+option. In any case space for a bitmap will be reserved so that one
+can be added later with
.BR "\-\-grow \-\-bitmap=internal" .
If the metadata type supports it (currently only 1.x metadata), space
.TP
.B \-\-readonly
-start the array readonly \(em not supported yet.
+start the array in readonly mode.
.SH MANAGE MODE
.HP 12
.B \-U
or
.B \-\-update=
-option. Currently only
-.B name
-is supported.
+option. The supported options are
+.BR name ,
+.B ppl
+and
+.BR no\-ppl .
-The
+The
.B name
option updates the subarray name in the metadata, it may not affect the
device node name or the device node symlink until the subarray is
-re\-assembled. If updating
+re\-assembled. If updating
.B name
would change the UUID of an active subarray this operation is blocked,
and the command will end in an error.
+The
+.B ppl
+and
+.B no\-ppl
+options enable and disable PPL in the metadata. Currently supported only for
+IMSM subarrays.
+
.TP
.B \-\-examine
The device should be a component of an md array.
.TP
.B RebuildStarted
-An md array started reconstruction. (syslog priority: Warning)
+An md array started reconstruction (e.g. recovery, resync, reshape,
+check, repair). (syslog priority: Warning)
.TP
.BI Rebuild NN
.IP \(bu 4
add a write-intent bitmap to any array which supports these bitmaps, or
remove a write-intent bitmap from such an array.
+.IP \(bu 4
+change the array's consistency policy.
.PP
Using GROW on containers is currently supported only for Intel's IMSM
in a filesystem that is on the RAID array being affected, the system
will deadlock. The bitmap must be on a separate filesystem.
+.SS CONSISTENCY POLICY CHANGES
+
+The consistency policy of an active array can be changed by using the
+.B \-\-consistency\-policy
+option in Grow mode. Currently this works only for the
+.B ppl
+and
+.B resync
+policies and allows the RAID5 Partial Parity Log (PPL) to be enabled or disabled.
+
.SH INCREMENTAL MODE
.HP 12
.RB [ \-\-run ]
.RB [ \-\-quiet ]
.I component-device
+.RI [ optional-aliases-for-device ]
.HP 12
Usage:
.B mdadm \-\-incremental \-\-fail
.B DEVICES
line in that file. If
.B DEVICES
-is absent then the default it to allow any device. Similar if
+is absent then the default is to allow any device. Similarly if
.B DEVICES
contains the special word
.B partitions
then any device is allowed. Otherwise the device name given to
-.I mdadm
+.IR mdadm ,
+or one of the aliases given, or an alias found in the filesystem,
must match one of the names or patterns in a
.B DEVICES
line.
+This is the only context where the aliases are used. They are
+usually provided by a
+.I udev
+rule mentioning
+.BR ${DEVLINKS} .
+
.IP +
Does the device have a valid md superblock? If a specific metadata
version is requested with
.I mdadm
will create and devices that are needed.
+.TP
+.B MDADM_NO_SYSTEMCTL
+If
+.I mdadm
+detects that
+.I systemd
+is in use it will normally request
+.I systemd
+to start various background tasks (particularly
+.IR mdmon )
+rather than forking and running them in the background. This can be
+suppressed by setting
+.BR MDADM_NO_SYSTEMCTL=1 .
+
.TP
.B IMSM_NO_PLATFORM
A key value of IMSM metadata is that it allows interoperability with
recovery. You should be aware that interoperability may be
compromised by setting this value.
+.TP
+.B MDADM_GROW_ALLOW_OLD
+If an array is stopped while it is performing a reshape and that
+reshape was making use of a backup file, then when the array is
+re-assembled
+.I mdadm
+will sometimes complain that the backup file is too old. If this
+happens and you are certain it is the right backup file, you can
+over-ride this check by setting
+.B MDADM_GROW_ALLOW_OLD=1
+in the environment.
+
.TP
.B MDADM_CONF_AUTO
Any string given in this variable is added to the start of the
From kernel version 2.6.28 the "non-partitioned array" can actually
be partitioned. So the "md_d\fBNN\fP"
names are no longer needed, and
-partitions such as "/dev/md\fBNN\fPp\fBXX\fp"
+partitions such as "/dev/md\fBNN\fPp\fBXX\fP"
are possible.
.PP
From kernel version 2.6.29 standard names can be non-numeric following