#2702 PR closed: Software RAID: IMSM support

Labels: enhancement, won't fix / can't fix / obsolete

rmetrich opened issue at 2021-10-25 15:29:

The current code fails badly with Software RAID setups that use containers, such as IMSM.

Relax-and-Recover (ReaR) Pull Request Template

Please fill in the following items before submitting a new pull request:

Pull Request Details:
  • Type: Enhancement

  • Impact: Normal

  • Reference to related issue (URL):

  • How was this pull request tested?

    On RHEL 8 with upstream ReaR, by creating fake IMSM arrays on a QEMU/KVM guest

  • Brief description of the changes in this pull request:

    When encountering an IMSM array, the mdadm command line needed to recreate it is somewhat different.
    In particular, the volumes inside the container have only the container as parent, and the size of each volume can be specified.
    Finally, because the normal partitions are also seen, it is necessary to wipe the devices before setting the array up (see the sketch below).
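
    A minimal sketch of the recreation sequence this implies, reusing the example
    disks and array names from the reproducer below (the actual commands are
    generated by ReaR into diskrestore.sh):

    # Wipe leftover signatures from the component disks first
    wipefs -a /dev/sdc /dev/sdd
    # Recreate the IMSM container from the raw disks
    mdadm --create /dev/md/imsm1 -n 2 --metadata=imsm /dev/sdc /dev/sdd
    # Recreate each volume with the container (not the raw disks) as parent,
    # optionally with an explicit size
    mdadm --create /dev/md/Raid1_1 -n 2 --level 1 /dev/md/imsm1 --size 200M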

Reproducer (on a QEMU/KVM guest)

  1. Create a QEMU/KVM guest with additional SCSI disks and assign an ID to each disk.

  2. Create an IMSM container

    # mdadm --create /dev/md/imsm1 -n 2 --metadata=imsm /dev/sdc /dev/sdd
    
  3. Create volumes in the container (size is optional)

    # mdadm --create /dev/md/Raid1_1 -n 2 --level 1 /dev/md/imsm1 --size 200M
    # mdadm --create /dev/md/Raid0_1 -n 2 --level 0 /dev/md/imsm1 --size 300M
    
  4. Amend /etc/rear/local.conf

    export IMSM_NO_PLATFORM=1
    AUTOEXCLUDE_DISKS=n
    
  5. Execute rear mkrescue

  6. Restore the ISO on a clone system

With the current code, the restore fails due to the missing raid-devices option for the container, wipefs issues, and badly formed lines for the volumes that list the raw devices making up the container, whereas the container device should be used.

Example with this patch:

raid /dev/md127 metadata=imsm level=container raid-devices=2 uuid=cd51b012:c06cc1a1:f8ecdb57:690899ac name=imsm1 devices=/dev/sdc,/dev/sdd

raid /dev/md124 level=raid1 raid-devices=2 uuid=c49d5e0b:f78f04be:42d81ecc:7d5ca98a name=Raid1_1 devices=/dev/md127 size=204800
raid /dev/md123 level=raid0 raid-devices=2 uuid=11d7c434:a1be15ae:cd0049d8:53bd36b5 name=Raid0_1 chunk=128 devices=/dev/md127 size=307200

pcahyna commented at 2021-10-25 15:38:

Great, thanks for the contribution! I am currently in the process of learning how to use IMSM on actual IMSM-capable hardware, but I haven't written any code yet. I am going to try your code in an IMSM test setup, hopefully soon.

rmetrich commented at 2021-10-25 15:46:

See also RHBZ 2017107 for how to reproduce this without any IMSM hardware (just some hack is required to let the system boot).
If the IMSM devices are created for non-system partitions, this is straightforward.

jsmeix commented at 2021-10-26 08:18:

@rmetrich
thank you for your enhancement!

Could you do me a favour and also post here your output of

lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT,UUID

on your test system - ideally the one where you posted your "Example with this patch"
in your initial comment https://github.com/rear/rear/pull/2702#issue-765592871

I like that lsblk output so much because, at least for me, it provides
an easily comprehensible overview of the whole block device structure.

rmetrich commented at 2021-10-26 08:41:

Actually I don't have it anymore; I modified the configuration for wider testing ...
Anyway, this is what I have with the following configuration (4 additional 2 GB disks):

# mdadm --create /dev/md/imsm1 -n 4 --metadata=imsm /dev/sdc /dev/sdd /dev/sde /dev/sdf
# mdadm --create /dev/md/Raid1_1 -n 2 --level 1 /dev/md/imsm1 --size 200M
# mdadm --create /dev/md/Raid0_1 -n 2 --level 0 /dev/md/imsm1 --size 200M
# mdadm --create /dev/md/Raid10_1 -n 4 --level 10 /dev/md/imsm1 --size 300M

# lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT,UUID
NAME                        KNAME        PKNAME       TRAN TYPE   FSTYPE           SIZE MOUNTPOINT UUID
/dev/sda                    /dev/sda                       disk   isw_raid_member   20G            
`-/dev/md126                /dev/md126   /dev/sda          raid1                    20G            
  |-/dev/md126p1            /dev/md126p1 /dev/md126        md     xfs                1G /boot      93e3a8a6-e365-4e53-9648-f699e9bb6a28
  `-/dev/md126p2            /dev/md126p2 /dev/md126        md     LVM2_member       19G            dFAVuo-eQTU-c0Jc-BhpB-eXcK-Lfde-1c388C
    |-/dev/mapper/rhel-root /dev/dm-0    /dev/md126p2      lvm    xfs               17G /          67fcc871-8f65-4592-8cb4-5ff9b94e5f9b
    `-/dev/mapper/rhel-swap /dev/dm-1    /dev/md126p2      lvm    swap               2G [SWAP]     9e2259c8-a730-4e64-8d0e-66e579e5bdc2
/dev/sdb                    /dev/sdb                       disk   isw_raid_member    2G            
|-/dev/md122                /dev/md122   /dev/sdb          raid10                  600M            
|-/dev/md123                /dev/md123   /dev/sdb          raid0                   400M            
`-/dev/md124                /dev/md124   /dev/sdb          raid1                   200M            
/dev/sdc                    /dev/sdc                       disk   isw_raid_member    2G            
|-/dev/md122                /dev/md122   /dev/sdc          raid10                  600M            
`-/dev/md123                /dev/md123   /dev/sdc          raid0                   400M            
/dev/sdd                    /dev/sdd                       disk   isw_raid_member    2G            
`-/dev/md122                /dev/md122   /dev/sdd          raid10                  600M            
/dev/sde                    /dev/sde                       disk   isw_raid_member    2G            
|-/dev/md122                /dev/md122   /dev/sde          raid10                  600M            
`-/dev/md124                /dev/md124   /dev/sde          raid1                   200M            
/dev/sdf                    /dev/sdf                       disk   isw_raid_member   20G            
`-/dev/md126                /dev/md126   /dev/sdf          raid1                    20G            
  |-/dev/md126p1            /dev/md126p1 /dev/md126        md     xfs                1G /boot      93e3a8a6-e365-4e53-9648-f699e9bb6a28
  `-/dev/md126p2            /dev/md126p2 /dev/md126        md     LVM2_member       19G            dFAVuo-eQTU-c0Jc-BhpB-eXcK-Lfde-1c388C
    |-/dev/mapper/rhel-root /dev/dm-0    /dev/md126p2      lvm    xfs               17G /          67fcc871-8f65-4592-8cb4-5ff9b94e5f9b
    `-/dev/mapper/rhel-swap /dev/dm-1    /dev/md126p2      lvm    swap               2G [SWAP]     9e2259c8-a730-4e64-8d0e-66e579e5bdc2

jsmeix commented at 2021-10-26 09:03:

To me as a non-RAID expert the lsblk output seems
not to match the mdadm command that created the container,
because to me it seems

mdadm --create /dev/md/imsm1 -n 4 --metadata=imsm /dev/sdc /dev/sdd /dev/sde /dev/sdf

should create a container of sdc sdd sde sdf,
while the lsblk output looks as if the container consists of sdb sdc sdd sde
and there is a separate RAID1 consisting of sda and sdf.

rmetrich commented at 2021-10-26 09:08:

To me as a non-RAID expert the lsblk output seems not to match the mdadm command that created the container, because to me it seems

mdadm --create /dev/md/imsm1 -n 4 --metadata=imsm /dev/sdc /dev/sdd /dev/sde /dev/sdf

should create a container of sdc sdd sde sdf, while the lsblk output looks as if the container consists of sdb sdc sdd sde and there is a separate RAID1 consisting of sda and sdf.

Right, actually there was a disk renaming after I rebooted, explaining the difference (sdX naming is not reliable).

jsmeix commented at 2021-10-26 09:47:

Ah - "reboot" explains it all.

Unfortunately it seems the NAME output column of lsblk
shows /dev/mapper/ symlinks as NAME (in contrast to KNAME),
but does not show /dev/md/ symlinks as NAME.

E.g. on my homeoffice laptop with LUKS (excerpt)

# lsblk -ipo NAME,KNAME,PKNAME,TYPE,FSTYPE,MOUNTPOINT
NAME                                                      KNAME     PKNAME    TYPE  FSTYPE      MOUNTPOINT
/dev/sda                                                  /dev/sda            disk              
|-/dev/sda1                                               /dev/sda1 /dev/sda  part              
|-/dev/sda2                                               /dev/sda2 /dev/sda  part  crypto_LUKS 
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2 /dev/dm-1 /dev/sda2 crypt swap        [SWAP]
|-/dev/sda3                                               /dev/sda3 /dev/sda  part  crypto_LUKS 
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3 /dev/dm-0 /dev/sda3 crypt ext4        /
|-/dev/sda4                                               /dev/sda4 /dev/sda  part  ext4        /nfs

where one can see the /dev/mapper/ symlinks as NAME

/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3 -> ../dm-0
/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2 -> ../dm-1

versus on a QEMU/KVM test system with RAID1 (excerpt)

# lsblk -ipo NAME,KNAME,PKNAME,TYPE,FSTYPE,MOUNTPOINT
NAME                      KNAME        PKNAME       TYPE  FSTYPE            MOUNTPOINT
/dev/sda                  /dev/sda                  disk  linux_raid_member 
`-/dev/md127              /dev/md127   /dev/sda     raid1                   
  |-/dev/md127p1          /dev/md127p1 /dev/md127   part                    
  `-/dev/md127p2          /dev/md127p2 /dev/md127   part  crypto_LUKS       
    `-/dev/mapper/cr_root /dev/dm-0    /dev/md127p2 crypt btrfs             /
/dev/sdb                  /dev/sdb                  disk  linux_raid_member 
`-/dev/md127              /dev/md127   /dev/sdb     raid1                   
  |-/dev/md127p1          /dev/md127p1 /dev/md127   part                    
  `-/dev/md127p2          /dev/md127p2 /dev/md127   part  crypto_LUKS       
    `-/dev/mapper/cr_root /dev/dm-0    /dev/md127p2 crypt btrfs             /

where one cannot see the stable /dev/md/ symlinks as NAME

/dev/md/raid1sdab2 -> ../md127p2
/dev/md/raid1sdab1 -> ../md127p1
/dev/md/raid1sdab -> ../md127

but only the unstable kernel device nodes :-(
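
For reference, a small sketch of how to map such an unstable kernel md node
back to its stable name and UUID (device names as in the example above):

# List the stable /dev/md/ symlinks and the kernel nodes they point to
ls -l /dev/md/
# Show the mdadm name and UUID of a particular kernel node
mdadm --detail /dev/md127 | grep -e 'Name' -e 'UUID'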

DEvil0000 commented at 2021-10-29 12:13:

md127 and the like is the name given to the RAID by the kernel (I guess) if the array has no name set; see mdadm assemble for renaming.
For plain disk names and order (like sdb), this is normally controlled by your BIOS/firmware and the configured boot order, but you can identify the disks by UUID or label instead.
Another option is using LVM, which can internally make use of (MD) RAID, and you get the easy naming and other LVM features for free.
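
A small sketch of those alternatives (device and array names here are only examples):

# Give the array an explicit name at creation time instead of relying on mdNNN
mdadm --create /dev/md/data --name=data --level=1 --raid-devices=2 /dev/sdb /dev/sdc
# Identify the member disks via stable identifiers instead of sdX kernel names
ls -l /dev/disk/by-id/
blkid /dev/sdb /dev/sdc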

jsmeix commented at 2021-11-15 12:52:

@pcahyna
if you approve it - provided it looks OK to you - I would merge it.
No problem if you need more time for review.
I would just like to know what its status is.

pcahyna commented at 2021-11-15 13:20:

@jsmeix sorry for the delay, it took me some time to do proper tests (backup+restore) on actual IMSM-capable hardware. Here are the results (not with the current code though - I backported the change to ReaR 2.6 in RHEL 8.5).
Output from rear recover:

Creating software RAID /dev/md127
Creating software RAID /dev/md126
Disk '/dev/md126': creating 'gpt' partition table
Disk '/dev/md126': creating partition number 1 with name ''EFI System Partition''
Disk '/dev/md126': creating partition number 2 with name ''md126p2''
Disk '/dev/md126': creating partition number 3 with name ''md126p3''
Creating LVM PV /dev/md126p3
Restoring LVM VG 'rhel_sgi-uv300-02'
Sleeping 3 seconds to let udev or systemd-udevd create their devices...
Creating filesystem of type xfs with mount point / on /dev/mapper/rhel_sgi--uv300--02-root.
Mounting filesystem /
Creating filesystem of type xfs with mount point /home on /dev/mapper/rhel_sgi--uv300--02-home.
Mounting filesystem /home
Creating filesystem of type xfs with mount point /boot on /dev/md126p2.
Mounting filesystem /boot
Creating filesystem of type vfat with mount point /boot/efi on /dev/md126p1.
Mounting filesystem /boot/efi
Creating swap on /dev/mapper/rhel_sgi--uv300--02-swap
Disk layout created.
Restoring from '/tmp/rear.HPhb0jLJMJzuKqD/outputfs/sgi-uv300-02/backup.tar.gz' (restore log in /var/lib/rear/restore/recover.backup.tar.gz.2620.restore.log) ...
Restoring tar: boot/extlinux: time stamp 2021-11-11 04:53:49.916804272 is 14206.848134038 s in the future OK
Backup restore 'tar' finished with zero exit code
Restoring finished (verify backup restore log messages in /var/lib/rear/restore/recover.backup.tar.gz.2620.restore.log)
Created SELinux /mnt/local/.autorelabel file : after reboot SELinux will relabel all files
Recreating directories (with permissions) from /var/lib/rear/recovery/directories_permissions_owner_group
Migrating disk-by-id mappings in certain restored files in /mnt/local to current disk-by-id mappings ...
Migrating filesystem UUIDs in certain restored files in /mnt/local to current UUIDs ...
Patching symlink etc/sysconfig/grub target /mnt/local/etc/default/grub
Patching filesystem UUIDs in /mnt/local/etc/default/grub to current UUIDs
Skip patching symlink etc/mtab target /mnt/local/proc/6394/mounts on /proc/ /sys/ /dev/ or /run/
Patching filesystem UUIDs in etc/fstab to current UUIDs
Patching filesystem UUIDs in etc/mtools.conf to current UUIDs
Patching filesystem UUIDs in boot/efi/EFI/redhat/grub.cfg to current UUIDs
Running mkinitrd...
Updated initrd with new drivers for kernel 4.18.0-348.el8.x86_64.
Creating EFI Boot Manager entries...
Cannot create EFI Boot Manager entry for ESP /dev/md126p1 (unable to find the underlying disk)
efibootmgr failed to create EFI Boot Manager entry for 'EFI\redhat\grubx64.efi' (UEFI_BOOTLOADER='/boot/efi/EFI/redhat/grubx64.efi')
WARNING:
For this system
RedHatEnterpriseServer/8 on Linux-i386 (based on Fedora/8/i386)
there is no code to install a boot loader on the recovered system
or the code that we have failed to install the boot loader correctly.
Please contribute appropriate code to the Relax-and-Recover project,
see http://relax-and-recover.org/development/
Take a look at the scripts in /usr/share/rear/finalize - for example
for PC architectures like x86 and x86_64 see the script
/usr/share/rear/finalize/Linux-i386/660_install_grub2.sh
and for POWER architectures like ppc64le see the script
/usr/share/rear/finalize/Linux-ppc64le/660_install_grub2.sh
---------------------------------------------------
|  IF YOU DO NOT INSTALL A BOOT LOADER MANUALLY,  |
|  THEN YOUR SYSTEM WILL NOT BE ABLE TO BOOT.     |
---------------------------------------------------
You can use 'chroot /mnt/local bash --login'
to change into the recovered system and
manually install a boot loader therein.

So, the first problem: ReaR is not able to execute efibootmgr to make the RAID device bootable. (This looks similar to #2595 and I can fix this separately, because I fixed that issue as well.)
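
A possible manual workaround (only a sketch, assuming /dev/sda is one of the
mirrored disks and its first partition is the ESP) would be to create the
entry against one of the underlying disks:

efibootmgr -c -d /dev/sda -p 1 -L 'Red Hat Enterprise Linux' -l '\EFI\redhat\grubx64.efi'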

After fixing the boot order in UEFI and rebooting, the machine gets stuck in the initrd and drops to the emergency shell because it is not able to find its LVM storage. Investigation reveals that the recovery process changed the MD UUIDs, but the GRUB command line and /etc/mdadm.conf in the initrd still have the old UUIDs. After fixing them in GRUB (I had to do it both in /boot/efi/EFI/redhat/grub.cfg and in /boot/efi/EFI/redhat/grubenv, where the current boot entry is saved - grub2-editenv /boot/efi/EFI/redhat/grubenv unset saved_entry kernelopts boot_success did the trick for me), the restored system worked properly.
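
Roughly, as interactive steps inside the recovery system after 'rear recover'
(a sketch of the procedure described above, not a literal transcript):

chroot /mnt/local /bin/bash --login
mdadm --detail --scan > /etc/mdadm.conf       # record the new array UUIDs
dracut -f                                     # rebuild the initrd so it contains them
# then edit /boot/efi/EFI/redhat/grub.cfg and replace the old MD UUIDs on the
# kernel command line with the new ones, and reset the saved boot entry:
grub2-editenv /boot/efi/EFI/redhat/grubenv unset saved_entry kernelopts boot_success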

Not sure about the second problem - ReaR should be able to patch configuration files to solve it, but shouldn't it restore the arrays with the same UUIDs in the first place to avoid the problem from the start?
Here is the saved layout (note that it includes RAID array UUIDs):

# Disk layout dated 20211111104549 (YYYYmmddHHMMSS)
# NAME                                       KNAME        PKNAME       TRAN   TYPE  FSTYPE            SIZE MOUNTPOINT
# /dev/sda                                   /dev/sda                  sata   disk  isw_raid_member 372.6G 
# `-/dev/md126                               /dev/md126   /dev/sda            raid1                 372.6G 
#   |-/dev/md126p1                           /dev/md126p1 /dev/md126          md    vfat              600M /boot/efi
#   |-/dev/md126p2                           /dev/md126p2 /dev/md126          md    xfs                 1G /boot
#   `-/dev/md126p3                           /dev/md126p3 /dev/md126          md    LVM2_member       371G 
#     |-/dev/mapper/rhel_sgi--uv300--02-root /dev/dm-0    /dev/md126p3        lvm   xfs                70G /
#     |-/dev/mapper/rhel_sgi--uv300--02-swap /dev/dm-1    /dev/md126p3        lvm   swap                4G [SWAP]
#     `-/dev/mapper/rhel_sgi--uv300--02-home /dev/dm-2    /dev/md126p3        lvm   xfs               297G /home
# /dev/sdb                                   /dev/sdb                  sata   disk  isw_raid_member 372.6G 
# `-/dev/md126                               /dev/md126   /dev/sdb            raid1                 372.6G 
#   |-/dev/md126p1                           /dev/md126p1 /dev/md126          md    vfat              600M /boot/efi
#   |-/dev/md126p2                           /dev/md126p2 /dev/md126          md    xfs                 1G /boot
#   `-/dev/md126p3                           /dev/md126p3 /dev/md126          md    LVM2_member       371G 
#     |-/dev/mapper/rhel_sgi--uv300--02-root /dev/dm-0    /dev/md126p3        lvm   xfs                70G /
#     |-/dev/mapper/rhel_sgi--uv300--02-swap /dev/dm-1    /dev/md126p3        lvm   swap                4G [SWAP]
#     `-/dev/mapper/rhel_sgi--uv300--02-home /dev/dm-2    /dev/md126p3        lvm   xfs               297G /home
# /dev/sr0                                   /dev/sr0                  usb    rom                    1024M 
# Disk /dev/sda
# Format: disk <devname> <size(bytes)> <partition label type>
disk /dev/sda 400088457216 gpt
# Partitions on /dev/sda
# Format: part <device> <partition size(bytes)> <partition start(bytes)> <partition type|name> <flags> /dev/<partition>
# Disk /dev/sdb
# Format: disk <devname> <size(bytes)> <partition label type>
disk /dev/sdb 400088457216 gpt
# Partitions on /dev/sdb
# Format: part <device> <partition size(bytes)> <partition start(bytes)> <partition type|name> <flags> /dev/<partition>
# List of Software Raid devices (mdadm --detail --scan --config=partitions):
# ARRAY /dev/md/imsm0 metadata=imsm UUID=4d5cf215:80024c95:e19fdff4:2fba35a8
# ARRAY /dev/md/Volume0_0 container=/dev/md/imsm0 member=0 UUID=ffb80762:127807b3:3d7e4f4d:4532022f
# 
# mdadm --misc --detail /dev/md127
# /dev/md127:
#            Version : imsm
#         Raid Level : container
#      Total Devices : 2
# 
#    Working Devices : 2
# 
# 
#               UUID : 4d5cf215:80024c95:e19fdff4:2fba35a8
#      Member Arrays : /dev/md/Volume0_0
# 
#     Number   Major   Minor   RaidDevice
# 
#        -       8        0        -        /dev/sda
#        -       8       16        -        /dev/sdb
raid /dev/md127 metadata=imsm level=container raid-devices=2 uuid=4d5cf215:80024c95:e19fdff4:2fba35a8 name=imsm0 devices=/dev/sda,/dev/sdb
# mdadm --misc --detail /dev/md126
# /dev/md126:
#          Container : /dev/md/imsm0, member 0
#         Raid Level : raid1
#         Array Size : 390706176 (372.61 GiB 400.08 GB)
#      Used Dev Size : 390706176 (372.61 GiB 400.08 GB)
#       Raid Devices : 2
#      Total Devices : 2
# 
#              State : active 
#     Active Devices : 2
#    Working Devices : 2
#     Failed Devices : 0
# 
# Consistency Policy : resync
# 
# 
#               UUID : ffb80762:127807b3:3d7e4f4d:4532022f
#     Number   Major   Minor   RaidDevice State
#        1       8       16        0      active sync   /dev/sdb
#        0       8        0        1      active sync   /dev/sda
raid /dev/md126 level=raid1 raid-devices=2 uuid=ffb80762:127807b3:3d7e4f4d:4532022f name=Volume0_0 devices=/dev/md127 size=390706176
part /dev/md126 629145600 1048576 EFI%20System%20Partition boot,esp /dev/md126p1
part /dev/md126 1073741824 630194176 md126p2 none /dev/md126p2
part /dev/md126 398378139648 1703936000 md126p3 lvm /dev/md126p3
# Format for LVM PVs
# lvmdev <volume_group> <device> [<uuid>] [<size(bytes)>]
lvmdev /dev/rhel_sgi-uv300-02 /dev/md126p3 hqqg0f-9bZF-bKqR-nnEv-lR0Y-AjxB-X3EM2i 778082304
# Format for LVM VGs
# lvmgrp <volume_group> <extentsize> [<size(extents)>] [<size(bytes)>]
lvmgrp /dev/rhel_sgi-uv300-02 4096 94980 389038080
# Format for LVM LVs
# lvmvol <volume_group> <name> <size(bytes)> <layout> [key:value ...]
lvmvol /dev/rhel_sgi-uv300-02 home 318918098944b linear 
lvmvol /dev/rhel_sgi-uv300-02 root 75161927680b linear 
lvmvol /dev/rhel_sgi-uv300-02 swap 4294967296b linear 
# Filesystems (only ext2,ext3,ext4,vfat,xfs,reiserfs,btrfs are supported).
# Format: fs <device> <mountpoint> <fstype> [uuid=<uuid>] [label=<label>] [<attributes>]
fs /dev/mapper/rhel_sgi--uv300--02-home /home xfs uuid=7f2000c1-6555-4d4a-91ca-c95a30afc5c1 label=  options=rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
fs /dev/mapper/rhel_sgi--uv300--02-root / xfs uuid=3b8f6908-860a-4a13-aa21-27dfd61efa5e label=  options=rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
fs /dev/md126p1 /boot/efi vfat uuid=97C0-D758 label= options=rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro
fs /dev/md126p2 /boot xfs uuid=c63225cb-ac04-4220-8200-2d03a5a9b0be label=  options=rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
# Swap partitions or swap files
# Format: swap <filename> uuid=<uuid> label=<label>
swap /dev/mapper/rhel_sgi--uv300--02-swap uuid=9af4ffb5-2a09-480e-82c2-ad55a54ff6cd label=

jsmeix commented at 2021-11-15 13:57:

@pcahyna

I never tested software RAID with containers such as IMSM,
so I cannot tell how this behaves differently.

From just looking at the code in layout/prepare/GNU/Linux/120_include_raid_code.sh
in the current master code (i.e. without this pull request's code),
ReaR should recreate the RAID with the UUID that is stored in disklayout.conf:

if version_newer "$mdadm_version" 2.0 ; then
    FEATURE_MDADM_UUID="y"
fi
...
create_raid() {
    local raid device options
    read raid device options < <(grep "^raid $1 " "$LAYOUT_FILE")
    ...
    for option in $options ; do
        case "$option" in
        ...
            (uuid=*)
                if [ -n "$FEATURE_MDADM_UUID" ] ; then
                    mdadmcmd="$mdadmcmd --$option"
                fi
                ;;

So you should run "rear -D recover" to see how that code
with the changes of this pull request actually behaves in your case
and inspect the diskrestore.sh script in the recovery system.

Before you run "rear -D recover" you may "export MIGRATION_MODE='true'"
to let the recovery process stop at certain steps/stages so you can better
inspect how things actually are after each step/stage.
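
For example, in the recovery system shell:

export MIGRATION_MODE='true'
rear -D recover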

By the way:
The code in layout/prepare/GNU/Linux/120_include_raid_code.sh
is yet another piece of mostly old code that is almost impossible for me to understand,
in particular because of its meaningless variable names "device", "devices" and "raiddevice"
that seem to contradict the names that are used in "man mdadm" (of mdadm-4.1):

SYNOPSIS
       mdadm [mode] <raiddevice> [options] <component-devices>
...
For create, build, or grow:
       -n, --raid-devices=

I think "device" in 120_include_raid_code.sh is "raiddevice" in "man mdadm" and
"devices" in 120_include_raid_code.sh is "component-devices" in "man mdadm" and
"raiddevice" in 120_include_raid_code.sh would be "component-device" in "man mdadm"
where only the option name "raid-devices" seem to match what is used in "man mdadm"
but all that could be just wrong because of my confusion...

pcahyna commented at 2021-11-15 15:34:

@jsmeix thanks for looking so quickly and for the warning about the code in layout/prepare/GNU/Linux/120_include_raid_code.sh. Indeed, I would expect the code to already restore the correct UUIDs, because otherwise we would have had similar problems with regular MD arrays. I have captured the -D log from the recovery, so I will check whether it provides a hint on what went wrong.
I am also only marginally familiar with IMSM (I had to learn it now because of ReaR), but I can say that the metadata container and the RAID volume in it both have their own independent UUIDs, and both are wrong (changed from their previous values).

jsmeix commented at 2021-11-16 11:53:

@pcahyna
via https://github.com/rear/rear/pull/2714
I overhauled layout/save/GNU/Linux/210_raid_layout.sh
which is based on the changes of this pull request here.
I would much appreciate it if you could have an early look.
It is not yet finished.
It seems to work for my simple RAID1 array.
I could have made mistakes for other types of RAID arrays.
I just cannot work with confusing variable names and things like that.
So to be able to actually do something for RAID I must make its code readable.

jsmeix commented at 2021-11-16 13:36:

See https://github.com/rear/rear/pull/2714#issuecomment-970279152
for my current results with a RAID1 array
of plain /dev/sda and /dev/sdb disks as component devices.

pcahyna commented at 2021-11-16 15:38:

So, I looked at the debug log and ReaR is not doing anything weird: it merely passes the right --uuid=... options to mdadm. Further experiments revealed that mdadm silently ignores this option when creating IMSM arrays (both containers and the volumes inside them) and picks a random UUID. The mdadm manual mentions that it is possible to change the UUID of an existing array during assembly: -U uuid. This would be a workaround, but it does not work either. The problem is present both in RHEL 7.9 and RHEL 8.5. Setting IMSM_NO_PLATFORM=1 has no effect either, so I wonder how it worked in @rmetrich's tests. Can you please examine your test results for the actual and expected MD UUIDs?
With this problem, I see two possible ways forward:

  • report the mdadm problem to upstream and don't support IMSM until this is fixed
  • write a workaround that will deal with changed UUIDs by substituting them in all files where they can be used (there is already some code to change storage IDs, but it does not apply to GRUB and mdadm configuration) - a rough idea is sketched below.
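
A rough sketch of such a substitution helper (variable and file names are only
illustrative, and the UUID values are the ones from the layout above):

# UUID recorded in disklayout.conf for /dev/md126
old_uuid="ffb80762:127807b3:3d7e4f4d:4532022f"
# UUID of the array as it was actually recreated
new_uuid="$( mdadm --detail /dev/md126 | sed -n -e 's/^ *UUID : //p' )"
# Substitute the old UUID in restored files that still reference it
sed -i -e "s/$old_uuid/$new_uuid/g" \
    /mnt/local/etc/mdadm.conf \
    /mnt/local/boot/efi/EFI/redhat/grub.cfg
# (grubenv is a fixed-size file, so grub2-editenv is the safer tool for it)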

jsmeix commented at 2021-11-17 08:06:

@pcahyna
thank you very much for your laborious investigation!
It helps so much to understand what actually goes on.

Regarding how to deal with such kind of issues in ReaR
see the part "Dirty hacks welcome" in
https://github.com/rear/rear/wiki/Coding-Style

I would try to add a test in layout/prepare/GNU/Linux/120_include_raid_code.sh
or in a separate new script like layout/recreate/default/220_verify_layout.sh
(i.e. after diskrestore.sh was run via layout/recreate/default/200_run_layout_code.sh)
that checks whether the recreated array has the specified UUID (if a UUID is there at all),
and if this test fails, show a LogPrintError so the user is informed
that he must check that after "rear recover" in the recovery system.
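
A rough sketch of what such a check could look like (LogPrintError and LAYOUT_FILE
are ReaR's usual helpers; the script name above is only a suggestion):

while read keyword raiddevice options ; do
    # Pick the uuid=... value that disklayout.conf specifies for this array (if any)
    uuid=""
    for option in $options ; do
        case "$option" in
            (uuid=*) uuid="${option#uuid=}" ;;
        esac
    done
    test "$uuid" || continue
    # Compare it with the UUID of the actually recreated array
    recreated_uuid="$( mdadm --detail "$raiddevice" | sed -n -e 's/^ *UUID : //p' )"
    if test "$recreated_uuid" != "$uuid" ; then
        LogPrintError "Recreated RAID $raiddevice has UUID $recreated_uuid but disklayout.conf specifies $uuid - check mdadm.conf and the bootloader config after 'rear recover'"
    fi
done < <( grep '^raid ' "$LAYOUT_FILE" )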

Optionally - but only as far as possible with reasonable effort - I would
try to correct that UUID in the restored files during the 'finalize' stage,
preferably with DebugPrint messages about the files where that UUID was
actually corrected, but the latter only if it is easy to implement.

In general the user must test whether "rear recover" works for him
in his particular environment, so it is OK when ReaR cannot
automatically recreate his specific system 100% identically,
but then ReaR should tell the user about it.

jsmeix commented at 2021-11-19 14:16:

@rmetrich @pcahyna
because the changes of this pull request
are all also in https://github.com/rear/rear/pull/2714
(hopefully I did not mess up anything with my major cleanup work)
I suggest closing this pull request as "obsolete" and
continuing only in https://github.com/rear/rear/pull/2714

If I messed up something in https://github.com/rear/rear/pull/2714
I will of course fix it there.

rmetrich commented at 2021-11-19 14:17:

Sure, feel free to close this, I have no time to follow all the details :)

jsmeix commented at 2021-11-19 14:19:

@rmetrich
wow - that was a fast response - thank you!


[Export of Github issue for rear/rear.]