RAID Management in Linux

The mdadm utility is the primary tool for creating, managing, and monitoring Intel VROC RAID volumes in Linux. This chapter follows the original guide’s commands and examples to ensure alignment with the source material.

4.1 Examine System RAID Capabilities

Before creating arrays, verify that Intel VROC capabilities are active on the system.

The mdadm --detail-platform command displays supported RAID levels, the maximum number of devices per array, the metadata format, and Intel VROC-related controller and disk information.

# mdadm --detail-platform
Platform : Intel(R) Virtual RAID on CPU
Version : 8.0.0.1304
RAID Levels : raid0 raid1 raid10 raid5
Chunk Sizes : 4k 8k 16k 32k 64k 128k
2TB volumes : supported
2TB disks : supported
Max Disks : 8
Max Volumes : 2 per array, 8 per controller
I/O Controller : /sys/devices/pci0000:00/0000:00:17.0 (SATA)
Port7 : /dev/sdj (CVEM948500JX032HGN)
Port3 : /dev/sdd (PEPR304600TX120LGN)
Port4 : /dev/sde (BTPR142501MH120LGN)
Port1 : /dev/sdb (BTPR209202AA120LGN)
Port5 : /dev/sdf (BTPR147300GW120LGN)
Port2 : /dev/sdc (BTPR212500QE120LGN)
Port6 : /dev/sdi (CVEM947301KY032HGN)
Port0 : /dev/sda (BTPR212500UG120LGN)
[…]
Platform : Intel(R) Virtual RAID on CPU
Version : 8.0.0.1304
RAID Levels : raid0 raid1 raid10 raid5
Chunk Sizes : 4k 8k 16k 32k 64k 128k
2TB volumes : supported
2TB disks : supported
Max Disks : 48
Max Volumes : 2 per array, 24 per controller
3rd party NVMe : supported
I/O Controller : /sys/devices/pci0000:97/0000:97:00.5 (VMD)
NVMe under VMD : /dev/nvme11n1 (PHLN843500CR6P4EGN-2)
NVMe under VMD : /dev/nvme10n1 (PHLN843500CR6P4EGN-1)
I/O Controller : /sys/devices/pci0000:37/0000:37:00.5 (VMD)
NVMe under VMD : /dev/nvme1n1 (PHLE7134002M2P0IGN)
I/O Controller : /sys/devices/pci0000:48/0000:48:00.5 (VMD)
NVMe under VMD : /dev/nvme3n1 (BTLN90550F7B1P6AGN)
NVMe under VMD : /dev/nvme2n1 (PHLF730000Y71P0GGN)
NVMe under VMD : /dev/nvme5n1 (PHLN929100AH1P6AGN)
NVMe under VMD : /dev/nvme4n1 (BTLF7320075H1P0GGN)
I/O Controller : /sys/devices/pci0000:59/0000:59:00.5 (VMD)
NVMe under VMD : /dev/nvme6n1 (PHAB9435004C7P6GGN)
NVMe under VMD : /dev/nvme9n1 (PHAL02860029800LGN)
NVMe under VMD : /dev/nvme8n1 (PHAC0301009X3P8AGN)
NVMe under VMD : /dev/nvme7n1 (PHAL029300AE1P6MGN)
I/O Controller : /sys/devices/pci0000:80/0000:80:00.5 (VMD)
NVMe under VMD : /dev/nvme12n1 (PHLF64120016480AGN)


4.2 Creating RAID Volumes

Intel VROC supports creating RAID volumes within an IMSM container. The container device must exist and be active before creating volumes.

Create IMSM Container

Example 2 – Create IMSM container
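A minimal sketch of the container-creation command, assuming the container name imsm0 and the four NVMe member drives referenced below:

# mdadm -C /dev/md/imsm0 /dev/nvme[0-3]n1 -n 4 -e imsm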

RAID Volume Creation

You can create a RAID volume when the IMSM container device exists and is active.
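A sketch of the volume-creation command, with the RAID level, device count, and names matching the description that follows:

# mdadm -C /dev/md/volume /dev/md/imsm0 -n 4 -l 5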

This command creates a RAID 5 volume named volume within the container imsm0. The device node /dev/md/volume will be created. Four drives /dev/nvme[0-3]n1 assigned to the IMSM container are used by default.

| Note: Any RAID volume type created after the OS has been installed on a RAID volume of a different RAID type may be set to inactive after a reboot. This is because the Linux installer automatically rebuilds initramfs with only the RAID modules needed for the OS RAID type. When creating different RAID volumes afterward, the missing RAID modules may prevent arrays from starting, leaving them in an “inactive” state. To avoid this, rebuild initramfs after creating new RAID volumes of a different type, then restart the system to activate the volumes.

Intel® Matrix RAID Volume Creation

Intel® Matrix RAID allows the creation of two independent RAID volumes within a single RAID container. The initial step follows the same procedure described in Section 4.2.1, where you first create an IMSM container device.

Next, create the first RAID volume using the --size parameter to reserve free space for the second volume. Both RAID volumes must utilize the same set of member drives within the IMSM container to ensure configuration consistency and performance balance.

In this example, a RAID 10 volume named “volume0” with a size of 10 GiB is created within the “imsm0” container:
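A sketch of one plausible form of the command, assuming four member drives in the container (see Table 4-1 for the exact --size semantics):

# mdadm -C /dev/md/volume0 /dev/md/imsm0 -n 4 -l 10 -z 10G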

Expected output:

Now, you can proceed to create the second RAID volume.

In this example, a RAID 5 volume named “volume1” is created within the same “imsm0” container. This second RAID 5 volume utilizes the remaining available space on the drives within the container.
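A sketch of one plausible form of the command; omitting --size lets the volume consume the remaining free space:

# mdadm -C /dev/md/volume1 /dev/md/imsm0 -n 4 -l 5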

Expected output:

Manage RAID Volumes

Intel Matrix RAID allows the creation of multiple RAID volumes of different types from the same set of physical drives. For example, part of the drives can be used to create a RAID 0 volume, and the remaining space can be used to create a RAID 5 volume.


RAID Volume Creation Parameters

There are several optional parameters which can be used during Intel VROC RAID creation. These parameters are tested and verified for Intel VROC Linux*. The open source mdadm utility may offer other parameters which may not be compatible with Intel VROC.

Table 4-1. RAID Volume Creation Customizable Parameters

--bitmap= (short form: -b)

Default is none. Can be set to internal to enable the internal bitmap feature.

--chunk= (short form: -c)

Specifies the chunk (strip) size, in kibibytes by default. Meaningful for RAID 0, RAID 5, and RAID 10. Must be a power of 2. The optimal chunk size depends on the expected workload profile.

--consistency-policy= (short form: -k)

Meaningful only for RAID 5. Default is resync. Can be set to ppl to enable the RWH (RAID Write Hole) closure mechanism.

--force (short form: -f)

Indicates that an unusual operation is intentional. It allows you to:

· Create a container with a single drive.

· Create a RAID 0 volume with a single drive.

--level= (short form: -l)

Specifies the RAID level. The levels supported by Intel® VROC RAID are 0, 1, 5, and 10.

--metadata= (short form: -e)

Specifies the metadata format to be used during creation. Meaningful only for container creation. Must be set to imsm.

--raid-devices= (short form: -n)

Number of devices to be used in the array. Must be set as follows:

· For RAID 1, it must be set to 2.

· For RAID 10, it must be set to 4.

--run (short form: -R)

Automatically confirms the operation when confirmation would otherwise be required.

--size= (short form: -z)

Specifies the size (in kibibytes by default) of the space dedicated on each disk to the RAID volume. This must be a multiple of the chunk size. By default, all available space is used. The following suffixes can be used to specify the unit of size:

· M for Megabytes.

· G for Gigabytes.

· T for Terabytes.

Note: Intel VROC supports RAID creation only with names specified as /dev/md/name or simply name. Other naming forms are not supported and may cause errors or improper behavior.

Additional Intel VROC RAID Volume Creation Examples

Example 1: Create a 2-drive RAID 0 volume:

Example 2: Create a 1-drive RAID 0 volume with force parameter:

Example 3: Create a 2-drive RAID 1 volume with 100G size:

Example 4: Create a 3-drive RAID 5 volume:

Example 5: Create a 3-drive RAID 5 volume with 64k chunk size and PPL enabled:

Example 6: Create a 4-drive RAID 10 volume:

Example 7: Create a 4-drive Intel Matrix RAID with RAID 5 and RAID 10 volumes:
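The following sketches show one plausible form of each example, assuming an IMSM container /dev/md/imsm0 already exists with enough member drives; volume names and sizes are illustrative:

Example 1:
# mdadm -C /dev/md/volume0 /dev/md/imsm0 -n 2 -l 0

Example 2:
# mdadm -C /dev/md/volume0 /dev/md/imsm0 -n 1 -l 0 --force

Example 3:
# mdadm -C /dev/md/volume0 /dev/md/imsm0 -n 2 -l 1 -z 100G

Example 4:
# mdadm -C /dev/md/volume0 /dev/md/imsm0 -n 3 -l 5

Example 5:
# mdadm -C /dev/md/volume0 /dev/md/imsm0 -n 3 -l 5 -c 64 --consistency-policy=ppl

Example 6:
# mdadm -C /dev/md/volume0 /dev/md/imsm0 -n 4 -l 10

Example 7:
# mdadm -C /dev/md/volume0 /dev/md/imsm0 -n 4 -l 5 -z 100G
# mdadm -C /dev/md/volume1 /dev/md/imsm0 -n 4 -l 10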

4.3 Reporting Intel® VROC RAID Information

After an Intel VROC RAID volume is created, there are multiple ways to obtain information about RAID volumes in Linux*. This section shows some basic methods to check and retrieve Intel VROC RAID information.

4.4 Creating Intel VROC RAID Configuration File

After creating an Intel VROC RAID container or volume device, the Linux* block device is created under the /dev directory with the name mdXXX, where XXX is a number allocated automatically starting from 127. The user-defined name specified when creating the RAID is represented in the /dev/md directory as a symbolic link to the /dev/mdXXX block device. The following is an example of RAID container and volume devices named “imsm” and “volume”, which are linked to the /dev/md127 and /dev/md126 block devices respectively.
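One way to see these links (a sketch; the mdXXX numbers depend on system enumeration):

# ls -l /dev/md/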

Note: The /dev/mdXXX block device name is not persistent. The XXX number may change after system reboots. The user-defined name in the /dev/md directory remains persistent when a configuration file is created with the specific array name defined.

Note: After creating an Intel VROC RAID volume device, the block device of the RAID member drive is still visible and accessible to the user. For that reason, other system utilities (e.g., lsblk) may export the device to the user. This is the current Intel VROC Linux* product design.

Retrieve RAID Status through /proc/mdstat

/proc/mdstat is a Linux* special file for the user to read the status of all the containers and RAID volumes in the system.
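For example (an illustrative, abridged listing reconstructed from the field descriptions in Tables 4-2 and 4-3; actual values vary by configuration):

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md126 : active raid5 nvme6n1[3] nvme5n1[2] nvme4n1[1] nvme3n1[0]
      125958144 blocks super external:/md127/0 level 5, 128k chunk, algorithm 0 [4/4] [UUUU]

md127 : inactive nvme6n1[3](S) nvme5n1[2](S) nvme4n1[1](S) nvme3n1[0](S)
      super external:imsm

unused devices: <none>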

In the above example, the content of /proc/mdstat is presented with Intel VROC RAID 5 volume (md126) and Intel VROC IMSM container (md127). Here are some explanations of each field displayed on the md126 and md127 devices.

Table 4-2. Explanation of md126 Global Properties

Output snippet                                   Explanation
md126                                            Device name
active                                           Array state
raid5                                            RAID level 5
nvme6n1[3] nvme5n1[2] nvme4n1[1] nvme3n1[0]      RAID volume member devices

Table 4-3. Explanation of md126 Additional Properties

Output snippet              Explanation
125958144 blocks            Volume size in 1 KiB blocks
super external:/md127/0     RAID metadata details, container name, and array index
level 5                     RAID level
128k chunk                  Chunk size
algorithm 0 [4/4] [UUUU]    Additional information about member disk availability

Note: Some parameters reported by /proc/mdstat are determined by the Linux* MD RAID personality and may differ depending on the RAID level in use.

In the example above, md127 represents the container device. It stores metadata only. Intel VROC RAID uses IMSM metadata, which can be identified by the string super external:imsm printed in the container line.

Extracting Detailed RAID Information

The mdadm utility offers a special command to extract all RAID array information from the system. To extract all RAID details the following command is used:
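Common forms (the RAID device name is illustrative); the first scans all arrays on the system, the second queries a specific volume:

# mdadm --detail --scan --verbose
# mdadm --detail /dev/md/volume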

Reading Intel VROC RAID Metadata
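The on-disk IMSM metadata can be read from a member drive with mdadm's examine mode; a sketch, with an illustrative drive name:

# mdadm -E /dev/nvme0n1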

Note: Reviewing IMSM metadata is a recommended practice when diagnosing array issues or validating the correct RAID configuration after deployment.

RAID Volume States

After creating RAID volumes, it is necessary to create the configuration file to record the existing RAID volumes’ information as well as to add specific RAID management policies. This configuration file allows consistent RAID volume naming in /dev/md after system reboots. This reliable, constant name can be used for system auto-configuration (e.g., in /etc/fstab for automounting). The MD device nodes in /dev are not consistent and may change depending on system enumeration.

The following command extracts system RAID volume information and stores it to the configuration file /etc/mdadm.conf.
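One common form (it appends ARRAY lines for the currently active arrays; adjust to your distribution's conventions):

# mdadm --detail --scan >> /etc/mdadm.conf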

The configuration file is typically stored at the default location of /etc/mdadm.conf. Depending on the OS distribution, there may be different file locations and alternative means of saving the configuration file. It is important to reference the mdadm.conf man page to determine what applies to your distribution.

4.5 Intel® VROC RAID Volume Initialization/Resync

As soon as a RAID volume is created, an initialization (or resync) process automatically begins if the RAID level is 1, 10, or 5. During this phase, data integrity is not fully guaranteed on RAID 5 arrays. If a drive fails before initialization completes, data recovery will not be possible. The same risk applies when a RAID volume is undergoing a rebuild.

Adjusting Initialization/Resync Speed

By default, Linux* sets a maximum speed of 200 MB/s for RAID volume initialization and rebuild operations. This limit was originally chosen as a community standard based on the performance of spinning-platter hard drives. With the advancement of storage technologies, however, this default may be unnecessarily restrictive.

This value can be modified to avoid excessive initialization or rebuild times on modern storage devices. The following commands illustrate how to check the current speed setting and how to modify it:

Check the current speed limit:
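A sketch using the raid sysctl interface (the value is reported in KiB/s):

# cat /proc/sys/dev/raid/speed_limit_max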

You can raise this to a higher value, such as 5 GB/s for NVMe drives:
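A sketch; because the sysctl value is expressed in KiB/s, roughly 5 GB/s corresponds to 5000000:

# echo 5000000 > /proc/sys/dev/raid/speed_limit_max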

Note: Different values may be appropriate for different drive types. 5 GB/s is reasonable for typical data center NVMe SSD models.

Adding a Hot Spare Drive

Adding a hot spare drive allows for immediate reconstruction of the RAID volume when a device failure is detected. The mdadm tool will mark the failed device as bad and begin reconstruction using the first available hot spare drive. Hot spare drives also support capacity expansion scenarios. During normal operations, these drives remain idle.

Intel VROC Linux* RAID uses mdadm with IMSM metadata, meaning that any hot spare drive added to a container is dedicated to that specific container.

To add a hot spare drive to the container /dev/md/imsm0:
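A sketch, with an illustrative spare drive name:

# mdadm -a /dev/md/imsm0 /dev/nvme4n1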

Note: The metadata UUID for the hot spare inside the container will remain set to all zeros until the drive is re‑assigned as an active member of the RAID volume. A hot spare drive with a zero UUID may be allocated to a different container after a system reboot.

Note: Add hot spare drives one at a time. This avoids a known limitation of the mdadm tool, where adding multiple spares in a single command may not be handled correctly.

4.6 Configuring Global Hot Spare

By default, the Intel VROC Linux* RAID hot spare drive is dedicated to a specific container. Only the Intel VROC RAID volumes inside that container can be automatically reconstructed to the hot spare drive when a redundant RAID volume becomes degraded. Intel VROC Linux* RAID allows configuring a global hot spare drive that can be used by any degraded Intel VROC RAID volume on the system for automatic recovery (rebuilding).

The following command adds the policy with the same domain and the same action for all drives to the mdadm configuration file, which enables the global hot spare function that allows the spare drives to move from one container to another for rebuilding:
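One plausible form of such a policy entry (treat this as a sketch rather than a verbatim line; see mdadm.conf(5) for the POLICY keywords):

# echo "POLICY domain=global path=* metadata=imsm action=spare" >> /etc/mdadm.conf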

After configuring the policy in the configuration file, restart the mdmonitor service for the change to take effect.
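A sketch, assuming a systemd-based distribution that ships the mdmonitor unit:

# systemctl restart mdmonitor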

4.7 Removing Intel® VROC RAID Volumes

Completely removing Intel VROC RAID in Linux* requires several steps. The following sections illustrate the mdadm commands to stop and remove an Intel VROC RAID volume in Linux*.

Stopping RAID Volume and Container

The first step is to stop the RAID volume and container devices. Stopping a RAID volume or container essentially removes the Linux* MD block device, but the Intel VROC RAID metadata remains on the member drives. This means the RAID volume can be reassembled later if needed.

Note: An Intel VROC RAID volume can only be stopped if it is not in use (for example, when no filesystem is mounted). Similarly, a RAID container device can only be stopped if no member RAID volumes are active.

To stop a RAID volume or container device:
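A sketch, using the volume name from the earlier examples:

# mdadm -S /dev/md/volume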

Expected result:

Multiple MD devices can be stopped at once. Each will be processed individually, and the sequence should not matter:
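For example (device names illustrative):

# mdadm -S /dev/md/volume /dev/md/imsm0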

To stop all active RAID volumes and containers on the system:
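One way to do this:

# mdadm --stop --scan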

This command scans for and stops all running RAID volumes and containers.

Erasing RAID Metadata

Once the Intel VROC RAID volume and container devices are stopped correctly, the next step to completely remove Intel VROC RAID in Linux* is to erase Intel VROC RAID metadata. Having incorrect or bad metadata can cause RAID volumes to be assembled incorrectly. The metadata can be erased on each potential member drive with the following command to ensure the drive is clean. This operation does not attempt to wipe existing user data but will delete an array if it’s a current member drive or spare drive. The RAID volumes and containers must be stopped and deactivated to run the erase operation.
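A sketch, with an illustrative drive name:

# mdadm --zero-superblock /dev/nvme0n1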

Multiple drives can be specified to clear their superblocks at the same time.

Removing One RAID Volume of Intel Matrix RAID

Intel Matrix RAID configurations contain two RAID volumes inside a single container, so removing one volume requires extra care.

First, stop both RAID volumes within the Matrix RAID container. For example, to stop volumes vol0 and vol1:
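For example (volume names as described):

# mdadm -S /dev/md/vol0 /dev/md/vol1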

Next, remove the metadata for the volume you want to delete. The following example removes the first RAID volume (index 0) from container imsm0:
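A sketch of the command:

# mdadm --kill-subarray=0 /dev/md/imsm0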

Note: The --kill-subarray option requires a volume index of either 0 or 1, representing the first or second volume.

Note: After a subarray is deleted, RAID volume UUIDs and indices are reassigned. If subarray index 1 is deleted, index 0 remains. If index 0 is deleted, index 1 is renumbered to 0.

Assembling Intel® VROC RAID Volumes

Intel VROC RAID volumes can be assembled and activated using the mdadm utility. By default, the tool scans the configuration file located at /etc/mdadm.conf to assemble arrays:
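For example:

# mdadm --assemble --scan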

If the configuration file is not present, mdadm scans all available drives for RAID members and assembles any detected volumes. In this case, the RAID volume name under /dev/md will automatically include the suffix _0.

Note: If the configuration file exists but contains incorrect or incomplete information about RAID volumes or containers, mdadm will refuse to assemble the arrays.

If you prefer to assemble arrays manually without relying on the configuration file, use the following steps:

  1. Assemble the container device. For example, to assemble the container /dev/md/imsm0 from IMSM metadata and a list of member drives:
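A sketch, assuming four NVMe member drives:

# mdadm -A /dev/md/imsm0 -e imsm /dev/nvme[0-3]n1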

  2. Assemble the RAID volume. After the container is active, assemble the RAID volume inside it. For example, to assemble the volume /dev/md/md0 within the container /dev/md/imsm0:
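One known way to start the volumes described by the container metadata is incremental assembly of the container; the volume name is then taken from the metadata. A sketch:

# mdadm -I /dev/md/imsm0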

This step-by-step process ensures that Intel VROC RAID volumes are brought online correctly, whether or not a configuration file is available.

Creating File Systems

Once a RAID volume has been created, it must be formatted with a file system before it can be mounted and used. The Linux* mkfs utility provides this capability.

For example, to create an EXT4 file system on the Intel VROC RAID volume md0 and then mount it to /mnt/data:
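A sketch of the sequence (the mount point is created first):

# mkfs.ext4 /dev/md/md0
# mkdir -p /mnt/data
# mount /dev/md/md0 /mnt/data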

After the file system is created, you can mount the RAID volume to any desired directory. In the example above, the md0 volume is mounted at /mnt/data.

To configure automatic mounting at boot, edit the Linux file system table (/etc/fstab). It is recommended to include either the _netdev option or the noauto,x-systemd.automount option. These options ensure the RAID member drives and RAID volume device have sufficient time to be detected before the system attempts to mount the file system.

Two configuration examples in /etc/fstab:

  • Option 1: Use _netdev mount option

  • Option 2: Use noauto,x-systemd.automount mount option
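Illustrative /etc/fstab entries corresponding to the two options (device path and mount point as in the example above):

/dev/md/md0    /mnt/data    ext4    _netdev                       0 0
/dev/md/md0    /mnt/data    ext4    noauto,x-systemd.automount    0 0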

Removing an Active Intel VROC RAID Member Disk

Intel VROC Linux* provides two methods for removing an active drive from a RAID volume: surprise hot removal and graceful removal.

  • Surprise hot removal: The physical drive is removed directly from the system without advance notice. Depending on the RAID level, the array state changes to degraded or failed. If the array fails, its logical device will be removed from Linux* and will not be reassembled after reboot.

  • Graceful removal: For RAID levels that support redundancy (RAID 1, RAID 5, RAID 10), you can remove a member disk in a controlled way using a two-step process.

  1. Mark the drive as faulty in the RAID volume

Example: Mark nvme0n1 as faulty in volume /dev/md/volume0:
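A sketch of the command:

# mdadm -f /dev/md/volume0 /dev/nvme0n1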

This action removes the drive from the RAID volume but keeps it within the container.

  2. Remove the faulty drive from the container

Execute the following command to remove the drive from container /dev/md/imsm0:
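A sketch of the command:

# mdadm -r /dev/md/imsm0 /dev/nvme0n1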

After these steps, the logical drive is detached from the RAID volume, and you can safely remove the physical disk from the system.

Note: Intel VROC Linux* only allows you to mark a drive as faulty when the RAID volume is redundant and not already degraded (or double-degraded in the case of RAID 10). This safeguard prevents accidental failure of the entire array.

RAID Recovery

Recovery is one of the most critical aspects of RAID management, as it enables rebuilding of redundant RAID volumes when a drive failure occurs. Recovery is supported only for RAID levels 1, 5, and 10.

  • RAID 1 and RAID 5: Recovery is possible if no more than one drive fails. In these levels, the array can be rebuilt by reconstructing data on a replacement drive.

  • RAID 10: Recovery may be possible even if two drives fail, provided the failures occur in two different mirrored pairs. However, if both drives in the same mirrored pair fail, recovery is not possible.

Intel VROC supports multiple recovery scenarios for degraded RAID volumes, ensuring flexibility and resilience in data protection strategies.

Auto Rebuilding to a Hot Spare

When a hot spare drive is available in the container, Intel VROC automatically starts rebuilding a degraded array to that spare. Progress can be monitored using /proc/mdstat:
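For example:

# cat /proc/mdstat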

The output shows recovery progress and resync speed. The rebuilding details also confirm which spare drive (for example, /dev/nvme2n1) is being used.

If no spare was pre-configured, you can manually add one to trigger recovery. Refer to the Adding a Hot Spare Drive section above for instructions. For example, adding /dev/nvme0n1 as a hot spare to a degraded RAID 5 volume (md126) will start the rebuild process automatically.

Auto Rebuilding to a New Drive

Another recovery scenario occurs when a failed RAID member drive is replaced with a brand-new disk. Intel VROC can be configured to automatically start rebuilding the degraded volume once the new drive is hot-inserted into the same slot.

To enable this functionality, complete the following steps:

  1. Configure global hot spare policy

Add the policy entry into /etc/mdadm.conf to allow spare drives to move across containers for rebuilding:

Example configuration file:
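An illustrative sketch of such a file (UUIDs and names are placeholders, and the appropriate action keyword, for example spare or spare-same-slot, depends on the desired behavior; see mdadm.conf(5)):

ARRAY metadata=imsm UUID=<container UUID>
ARRAY /dev/md/volume0 container=<container UUID> member=0 UUID=<volume UUID>
POLICY domain=global path=* metadata=imsm action=spare-same-slot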

  2. Generate and reload udev rules

After updating the configuration file, run:
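One plausible sequence, assuming a recent mdadm release that provides the --udev-rules misc option (the rules file name is illustrative):

# mdadm --udev-rules > /etc/udev/rules.d/66-md-auto-re-add.rules
# udevadm control --reload-rules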

This ensures that new devices are correctly recognized and associated with the array.

  3. Confirm the udev rule

A rule similar to the following will be added:

  4. Verify mdmonitor service

Ensure the mdmonitor service is running. This service enables automatic rebuild operations when a replacement drive is inserted.
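A sketch, assuming a systemd-based distribution:

# systemctl status mdmonitor
# systemctl enable --now mdmonitor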

  5. Prepare the replacement drive

The new drive must be a bare disk, meaning at least the first and last 4 KB must be zeroed out. For example, to prepare /dev/nvme0n1:
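A sketch using standard tools (destructive; double-check the device name). blockdev --getsz reports the size in 512-byte sectors, so dividing by 8 gives the number of 4 KiB blocks, and seeking to the last block zeroes the final 4 KiB:

# dd if=/dev/zero of=/dev/nvme0n1 bs=4096 count=1 oflag=direct
# dd if=/dev/zero of=/dev/nvme0n1 bs=4096 count=1 oflag=direct seek=$(( $(blockdev --getsz /dev/nvme0n1) / 8 - 1 ))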

With these steps complete, Intel VROC will automatically begin rebuilding to the newly inserted drive when it replaces a failed member in the same slot.
