mdadm, the Linux software RAID manager, is in no way inferior to a typical hardware RAID controller and, just like a hardware controller, lets us swap physical disks inside a RAID array. The process does require executing a few commands, but it is still quite straightforward.
Usually, we replace a disk in a RAID array when it starts failing, but there are also scenarios where you simply want to swap the mechanical SATA disks in the array for SSDs, one by one, to gain performance without reinstalling the whole OS.
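Either way, before touching the array it is worth checking the SMART health of the disk you are about to replace; assuming the smartmontools package is installed, a quick overall verdict can be obtained with:
[root@fixxxer ~]# smartctl -H /dev/sdb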
Our RAID1 setup is based on two physical disks, /dev/sda and /dev/sdb, and consists of two MD arrays: /dev/md126 and /dev/md127:
[root@fixxxer ~]# cat /proc/mdstat
Personalities : [raid1]
md126 : active raid1 sdb2[1] sda2[2]
487252992 blocks super 1.2 [2/2] [UU]
bitmap: 3/4 pages [12KB], 65536KB chunk
md127 : active raid1 sdb1[1] sda1[2]
999424 blocks super 1.2 [2/2] [UU]
bitmap: 0/1 pages [0KB], 65536KB chunk
The /dev/md126 array consists of two corresponding partitions, /dev/sda2 and /dev/sdb2, and serves as an LVM Physical Volume:
[root@fixxxer ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/md126 fedora lvm2 a-- 464,68g 0
The /dev/md127 array consists of two corresponding standard partitions, /dev/sda1 and /dev/sdb1, and is mounted in the system as the /boot directory:
[root@fixxxer ~]# df -hT | grep boot
/dev/md127 xfs 973M 294M 679M 31% /boot
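To see the whole storage stack at a glance (physical disks, partitions, MD arrays and the LVM volumes on top), lsblk is convenient; the exact output will of course depend on your layout:
[root@fixxxer ~]# lsblk -o NAME,TYPE,SIZE,MOUNTPOINT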
In this tutorial, we are replacing the /dev/sdb drive in our RAID1 (mirror) setup with a new disk and reinstalling the GRUB bootloader on the new disk.
Steps:
1. Mark disk partitions as failed
Because the /dev/sdb disk contains two partitions, each belonging to a different RAID array, we need to mark both partitions as failed in order to be able to remove the disk from the arrays:
[root@fixxxer ~]# mdadm --manage /dev/md126 --fail /dev/sdb2
mdadm: set /dev/sdb2 faulty in /dev/md126
[root@fixxxer ~]# mdadm --manage /dev/md127 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md127
Both partitions from /dev/sdb are now marked with the (F) flag and both arrays are in a degraded state:
[root@fixxxer ~]# cat /proc/mdstat
Personalities : [raid1]
md126 : active raid1 sdb2[1](F) sda2[2]
487252992 blocks super 1.2 [2/1] [U_]
bitmap: 4/4 pages [16KB], 65536KB chunk
md127 : active raid1 sdb1[1](F) sda1[2]
999424 blocks super 1.2 [2/1] [U_]
bitmap: 0/1 pages [0KB], 65536KB chunk
2. Remove disk partitions from the RAID arrays
[root@fixxxer ~]# mdadm --manage /dev/md126 --remove /dev/sdb2
mdadm: hot removed /dev/sdb2 from /dev/md126
[root@fixxxer ~]# mdadm --manage /dev/md127 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1 from /dev/md127
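As a side note, mdadm accepts several operations in one invocation, so marking a partition as failed and removing it can be combined into a single command per array, for example:
[root@fixxxer ~]# mdadm /dev/md126 --fail /dev/sdb2 --remove /dev/sdb2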
3. Replace the disk
Physically replace the /dev/sdb disk with the new one.
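Before pulling the drive, if you are not sure which physical unit is /dev/sdb, you can match it by model and serial number via the persistent by-id symlinks:
[root@fixxxer ~]# ls -l /dev/disk/by-id/ | grep sdb
And if you plan to reuse the old disk elsewhere, consider wiping the stale RAID metadata from its partitions first, so it will not be detected as an array member later:
[root@fixxxer ~]# mdadm --zero-superblock /dev/sdb1 /dev/sdb2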
4. Create a partition table on the new disk
Copy the partition table from the existing disk, /dev/sda, to the new one, /dev/sdb:
[root@fixxxer ~]# sfdisk -d /dev/sda | sfdisk /dev/sdb
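A word of caution: sfdisk can copy GPT partition tables only on reasonably recent versions of util-linux (2.26 and later); on older systems, such as CentOS 7, GPT disks require a different tool (for example sgdisk from the gdisk package). Either way, verify afterwards that both disks have an identical layout:
[root@fixxxer ~]# lsblk /dev/sda /dev/sdb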
5. Recreate the RAID1 mirrors using the new disk partitions
Now add the newly created partitions /dev/sdb1 and /dev/sdb2 to the corresponding RAID1 arrays, that is /dev/md127 and /dev/md126, respectively:
[root@fixxxer ~]# mdadm --manage /dev/md126 --add /dev/sdb2
mdadm: added /dev/sdb2
[root@fixxxer ~]# mdadm --manage /dev/md127 --add /dev/sdb1
mdadm: added /dev/sdb1
Warning: if the disk replacement required powering your computer off and on, the drive letters of your partitions may have changed, and you will have to adjust the above commands accordingly.
6. Verify the RAID1 mirror status
After adding the new partitions to the existing arrays /dev/md126 and /dev/md127, the RAID mirrors start to rebuild:
[root@fixxxer ~]# cat /proc/mdstat
Personalities : [raid1]
md126 : active raid1 sdb2[3] sda2[2]
487252992 blocks super 1.2 [2/1] [U_]
[=====>...............] recovery = 26.7% (130184064/487252992) finish=93.9min speed=63315K/sec
bitmap: 4/4 pages [16KB], 65536KB chunk
md127 : active raid1 sdb1[3] sda1[2]
999424 blocks super 1.2 [2/2] [UU]
bitmap: 0/1 pages [0KB], 65536KB chunk
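Instead of re-running cat manually, you can watch the rebuild progress refresh automatically:
[root@fixxxer ~]# watch -n 5 cat /proc/mdstat
The resync rate is also throttled by the kernel's md speed limits; if the rebuild is unusually slow, you can raise them temporarily (the values are in KiB/s and reset on reboot):
[root@fixxxer ~]# sysctl -w dev.raid.speed_limit_min=100000
[root@fixxxer ~]# sysctl -w dev.raid.speed_limit_max=500000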
Rebuilding takes a while, depending on the size and speed of both disks. Once it has finished, verify the status of both RAID arrays:
[root@fixxxer ~]# mdadm --detail /dev/md126
/dev/md126:
Version : 1.2
Creation Time : Sun Sep 15 02:36:11 2019
Raid Level : raid1
Array Size : 487252992 (464.68 GiB 498.95 GB)
Used Dev Size : 487252992 (464.68 GiB 498.95 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu May 5 19:14:46 2022
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : bitmap
Name : localhost-live:pv00
UUID : e802093f:ef7988e5:4337bde0:662bf872
Events : 73263
Number Major Minor RaidDevice State
2 8 2 0 active sync /dev/sda2
3 8 18 1 active sync /dev/sdb2
[root@fixxxer ~]# mdadm --detail /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Sun Sep 15 02:35:57 2019
Raid Level : raid1
Array Size : 999424 (976.00 MiB 1023.41 MB)
Used Dev Size : 999424 (976.00 MiB 1023.41 MB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu May 5 18:55:49 2022
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : bitmap
Name : localhost-live:boot
UUID : b9680eb2:2139f4ae:e099a845:e12657da
Events : 232
Number Major Minor RaidDevice State
2 8 1 0 active sync /dev/sda1
3 8 17 1 active sync /dev/sdb1
7. Reinstall the GRUB bootloader on the new disk
Both disks in our RAID1 setup, /dev/sda and /dev/sdb, are bootable, so each of them should contain the GRUB bootloader. That is why we need to install GRUB on the new disk as well, so the system can still boot from it if the first disk fails:
[root@fixxxer ~]# grub2-install /dev/sdb
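As a rough sanity check (not a substitute for an actual boot test), you can scan the first sector of the new disk for the GRUB signature that the boot code embeds:
[root@fixxxer ~]# dd if=/dev/sdb bs=512 count=1 2>/dev/null | strings | grep GRUB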
After installing GRUB, it is a good idea to temporarily change the disk boot priority in the BIOS and try booting from /dev/sdb, just to make sure it will actually work when needed.
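Note that installing GRUB to the disk's MBR, as shown above, applies to legacy BIOS boot; on UEFI systems the bootloader lives on the EFI System Partition and the procedure is different. If you are unsure which mode your machine booted in, you can check for the EFI firmware interface:
[root@fixxxer ~]# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS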
Troubleshooting
On CentOS 7 you may encounter an error when trying to reinstall GRUB after replacing a disk (for example /dev/sda) in an mdadm RAID array (I haven't noticed this issue on other distros so far):
[root@fixxxer ~]# grub2-install /dev/sda
Installing for i386-pc platform.
grub2-install: error: disk `mduuid/13a75c2dd7275e4aedb6fb4acd9b9f7a' not found.
A possible fix is to reassemble the RAID array which holds the /boot directory.
Steps:
1. Find out which RAID array holds your /boot directory
[root@fixxxer ~]# df -hT | grep /boot
/dev/md127 xfs 973M 294M 679M 31% /boot
2. Find out which partitions are included in that array
[root@fixxxer ~]# cat /proc/mdstat
Personalities : [raid1]
md126 : active raid1 sda2[2] sdb2[3]
487252992 blocks super 1.2 [2/2] [UU]
bitmap: 3/4 pages [12KB], 65536KB chunk
md127 : active raid1 sda1[2] sdb1[3]
999424 blocks super 1.2 [2/2] [UU]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
3. Unmount the /boot directory
[root@fixxxer ~]# umount /boot
4. Stop the RAID array
[root@fixxxer ~]# mdadm --stop /dev/md127
5. Reassemble the RAID array using the same partitions
[root@fixxxer ~]# mdadm --assemble --run /dev/md127 /dev/sda1 /dev/sdb1
6. Install GRUB on both disks involved in the array
[root@fixxxer ~]# grub2-install /dev/sda
[root@fixxxer ~]# grub2-install /dev/sdb
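7. Mount the /boot directory back
Since /boot was unmounted in step 3, remember to mount it again; assuming the original /etc/fstab entry is still in place, this is enough:
[root@fixxxer ~]# mount /boot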