#3230 Issue closed: ReaR confuses Device Mapper Multipathing (DM) devices with Non-volatile Memory Express™ (NVMe™) multipathing devices.

Labels: support / question, fixed / solved / done

StefanFromMunich opened issue at 2024-05-22 08:22:

  • ReaR version ("/usr/sbin/rear -V"):
    Relax-and-Recover 2.6 / 2020-06-17

  • OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):

NAME="Oracle Linux Server"
VERSION="8.5"
ID="ol"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="8.5"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Oracle Linux Server 8.5"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:oracle:linux:8:5:server"
HOME_URL="https://linux.oracle.com/"
BUG_REPORT_URL="https://bugzilla.oracle.com/"

ORACLE_BUGZILLA_PRODUCT="Oracle Linux 8"
ORACLE_BUGZILLA_PRODUCT_VERSION=8.5
ORACLE_SUPPORT_PRODUCT="Oracle Linux"
ORACLE_SUPPORT_PRODUCT_VERSION=8.5
  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
OUTPUT=ISO
OUTPUT_URL=nfs://193.14.236.134/volume1/Rear
BACKUP=NETFS
BACKUP_URL=nfs://193.14.236.134/volume1/Rear
BACKUP_PROG_EXCLUDE=("${BACKUP_PROG_EXCLUDE[@]}" '/media' '/var/tmp' '/var/crash' '/u02' '/dev/nvme0n1'  '/dev/nvme1n1'  '/dev/nvme2n1'  '/dev/nvme3n1'  '/dev/nvme4n1'  '/dev/nvme5n1' )
NETFS_KEEP_OLD_BACKUP_COPY=
UEFI_BOOTLOADER=/boot/efi/EFI/redhat/grubx64.efi
SECURE_BOOT_BOOTLOADER=/boot/efi/EFI/redhat/shimx64.efi
EXCLUDE_MOUNTPOINTS=( '/u03/flashdata/ZVK_MUC' '/u02/oradata/ZVK_MUC' )
AUTOEXCLUDE_MULTIPATH=n
BOOT_OVER_SAN=y
AUTORESIZE_PARTITIONS=true
AUTOSHRINK_DISK_SIZE_LIMIT_PERCENTAGE=100
  • Hardware vendor/product (PC or PowerNV BareMetal or ARM) or VM (KVM guest or PowerVM LPAR):
    PC

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    UEFI and bootloader GRUB

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    SSD (DM and NVMe)

  • Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT"):

NAME                                      KNAME          PKNAME         TRAN   TYPE  FSTYPE          LABEL   SIZE MOUNTPOINT
/dev/sda                                  /dev/sda                      sata   disk  isw_raid_member       447,1G
|-/dev/sda1                               /dev/sda1      /dev/sda              part  vfat                    600M
|-/dev/sda2                               /dev/sda2      /dev/sda              part  xfs                       1G
|-/dev/sda3                               /dev/sda3      /dev/sda              part  LVM2_member             154G
|-/dev/sda4                               /dev/sda4      /dev/sda              part  LVM2_member           269,2G
`-/dev/md126                              /dev/md126     /dev/sda              raid1                       424,8G
  |-/dev/md126p1                          /dev/md126p1   /dev/md126            md    vfat                    600M /boot/efi
  |-/dev/md126p2                          /dev/md126p2   /dev/md126            md    xfs                       1G /boot
  |-/dev/md126p3                          /dev/md126p3   /dev/md126            md    LVM2_member             154G
  | |-/dev/mapper/ol-root                 /dev/dm-0      /dev/md126p3          lvm   xfs                      30G /
  | |-/dev/mapper/ol-swap                 /dev/dm-1      /dev/md126p3          lvm   swap                     24G [SWAP]
  | `-/dev/mapper/ol-u01                  /dev/dm-11     /dev/md126p3          lvm   xfs                     100G /u01
  `-/dev/md126p4                          /dev/md126p4   /dev/md126            md    LVM2_member           269,2G
/dev/sdb                                  /dev/sdb                      sata   disk  isw_raid_member       447,1G
|-/dev/sdb1                               /dev/sdb1      /dev/sdb              part  vfat                    600M
|-/dev/sdb2                               /dev/sdb2      /dev/sdb              part  xfs                       1G
|-/dev/sdb3                               /dev/sdb3      /dev/sdb              part  LVM2_member             154G
|-/dev/sdb4                               /dev/sdb4      /dev/sdb              part  LVM2_member           269,2G
`-/dev/md126                              /dev/md126     /dev/sdb              raid1                       424,8G
  |-/dev/md126p1                          /dev/md126p1   /dev/md126            md    vfat                    600M /boot/efi
  |-/dev/md126p2                          /dev/md126p2   /dev/md126            md    xfs                       1G /boot
  |-/dev/md126p3                          /dev/md126p3   /dev/md126            md    LVM2_member             154G
  | |-/dev/mapper/ol-root                 /dev/dm-0      /dev/md126p3          lvm   xfs                      30G /
  | |-/dev/mapper/ol-swap                 /dev/dm-1      /dev/md126p3          lvm   swap                     24G [SWAP]
  | `-/dev/mapper/ol-u01                  /dev/dm-11     /dev/md126p3          lvm   xfs                     100G /u01
  `-/dev/md126p4                          /dev/md126p4   /dev/md126            md    LVM2_member           269,2G
/dev/nvme0n1                              /dev/nvme0n1                  nvme   disk                          5,8T
`-/dev/nvme0n1p1                          /dev/nvme0n1p1 /dev/nvme0n1   nvme   part  LVM2_member             5,8T
  |-/dev/mapper/vg_oracle-u02_zvk_muc     /dev/dm-2      /dev/nvme0n1p1        lvm   ext4                    2,5T /u02/oradata/ZVK_MUC
  |-/dev/mapper/vg_oracle-u02_zvk_da      /dev/dm-3      /dev/nvme0n1p1        lvm   ext4                   1000G /u02/oradata/ZVK_DA
  |-/dev/mapper/vg_oracle-u02_bsv         /dev/dm-4      /dev/nvme0n1p1        lvm   ext4                   1000G /u02/oradata/BSV
  |-/dev/mapper/vg_oracle-u02_kvv         /dev/dm-5      /dev/nvme0n1p1        lvm   ext4                    500G /u02/oradata/KVV
  `-/dev/mapper/vg_oracle-u02_daisy       /dev/dm-6      /dev/nvme0n1p1        lvm   ext4                    500G /u02/oradata/DAISY
/dev/nvme3n1                              /dev/nvme3n1                  nvme   disk                          2,9T
/dev/nvme1n1                              /dev/nvme1n1                  nvme   disk                          5,8T
`-/dev/nvme1n1p1                          /dev/nvme1n1p1 /dev/nvme1n1   nvme   part  LVM2_member             5,8T
  |-/dev/mapper/vg_oracle-u02_vip         /dev/dm-7      /dev/nvme1n1p1        lvm   ext4                    500G /u02/oradata/VIP
  |-/dev/mapper/vg_oracle-u02_vbo--portal /dev/dm-8      /dev/nvme1n1p1        lvm   ext4                    500G /u02/oradata/VBO-Portal
  |-/dev/mapper/vg_oracle-u02_bsv--portal /dev/dm-9      /dev/nvme1n1p1        lvm   ext4                    500G /u02/oradata/BSV-Portal
  `-/dev/mapper/vg_oracle-u02_vbo         /dev/dm-10     /dev/nvme1n1p1        lvm   ext4                    500G /u02/oradata/VBO
/dev/nvme4n1                              /dev/nvme4n1                  nvme   disk                          2,9T
/dev/nvme2n1                              /dev/nvme2n1                  nvme   disk                          2,9T
`-/dev/nvme2n1p1                          /dev/nvme2n1p1 /dev/nvme2n1   nvme   part  ext4                    2,9T /u03/flashdata/ZVK_MUC
/dev/nvme5n1                              /dev/nvme5n1                  nvme   disk                          2,9T
  • Description of the issue (ideally so that others can reproduce it):

My analysis of the attached files revealed the following facts:

ReaR confuses DM (Device Mapper Multipathing) devices with NVMe™ (Non-volatile Memory Express™) multipathing devices.

In the original disklayout.conf file, ReaR correctly detects the following DM multipath devices (SATA devices):

/dev/sda (sata)
/dev/sdb (sata)

ReaR also correctly detects the following NVMe™ multipathing devices:

/dev/nvme0n1 (nvme)
/dev/nvme1n1 (nvme)
/dev/nvme2n1 (nvme)
/dev/nvme3n1 (nvme)
/dev/nvme4n1 (nvme)
/dev/nvme5n1 (nvme)

When testing disaster recovery, ReaR can no longer correctly reconstruct the multipathing devices.

ReaR correctly recognizes /dev/sdb as a SATA device.
ReaR does not recognize /dev/sda as a SATA device but incorrectly treats it as an NVMe device.
ReaR incorrectly links /dev/sda (sata) to /dev/nvme0n1 (nvme).

During disaster recovery, ReaR then fails with the error message:
ERROR: No file system mounted on mnt/local

AUTOEXCLUDE_MULTIPATH=n does not solve the problem.
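
For reference, how ReaR classified the devices can be double-checked in its layout file, for example like this (only a minimal sketch; the path below is the default ReaR layout file location):

# list the disk and multipath entries that ReaR recorded during "rear mkbackup"
grep -E '^(disk|multipath|raid|part|lvmdev) ' /var/lib/rear/layout/disklayout.conf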

What measures are required for proper disaster recovery?

jsmeix commented at 2024-05-22 12:33:

I am not at all a multipath expert and
I know nothing at all about "NVMe multipathing",
so this is only an offhand side note regarding

ReaR confuses DM (Device Mapper Multipathing) devices
with NVMe™ (Non-volatile Memory Express™) multipathing devices.

cf. starting at
https://github.com/rear/rear/issues/3085#issuecomment-1890616502
which - unfortunately - did not proceed any further there.

So current ReaR may not work with "NVMe multipathing"
because "NVMe multipathing" may require some special
treatment in ReaR which is currently not implemented.
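
As far as I know (again, I am no NVMe multipath expert, so treat this only as an unverified sketch) one could check whether native NVMe multipathing is actually in use with something like

# shows Y or N depending on whether native NVMe multipathing is enabled in the kernel
cat /sys/module/nvme_core/parameters/multipath
# lists the NVMe subsystems and their paths (needs the nvme-cli package)
nvme list-subsys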

StefanFromMunich commented at 2024-08-19 09:54:

We use NVMe™ drives only as flash storage.
Therefore, we only store volatile data on NVMe™ drives.
We do not store permanent data on NVMe™ drives.
Therefore, we do not require ReaR disaster recovery
for the NVMe™ drives.
I therefore excluded the NVMe™ disks
('/dev/nvme0n1' '/dev/nvme1n1' '/dev/nvme2n1' '/dev/nvme3n1' '/dev/nvme4n1' '/dev/nvme5n1')
from the backup with the following configuration:

BACKUP_PROG_EXCLUDE=("${BACKUP_PROG_EXCLUDE[@]}"
                     '/media' '/var/tmp' '/var/crash' '/u02'
                     '/dev/nvme0n1' '/dev/nvme1n1' '/dev/nvme2n1' '/dev/nvme3n1' '/dev/nvme4n1' '/dev/nvme5n1' )

However, this did not help,
because ReaR disaster recovery did not work.

Is it possible to exclude the NVMe™ disks from the ReaR backup
so that ReaR disaster recovery works?

jsmeix commented at 2024-08-19 15:02:

@StefanFromMunich

the various exclude-this-or-that functionalities in ReaR
are rather complicated and may behave in confusing ways, so you
may have to experiment with some trial-and-error tests
until you get what you need in your specific case.

You may have a look at my comments in
https://github.com/rear/rear/issues/2772
starting at
https://github.com/rear/rear/issues/2772#issuecomment-1066754013
and as needed also follow links to other issues therein.

I think what you need first and foremost is to exclude
what is called "component" in ReaR because disk devices
are "components" in ReaR.

Because you use ReaR 2.6
which is several years old (from June 2020):

In general I recommend trying out our latest GitHub master code
because the GitHub master code is the only place where we fix things
and if there are issues it helps when you use exactly the code
where we could fix things.

See the section
"Testing current ReaR upstream GitHub master code" in
https://en.opensuse.org/SDB:Disaster_Recovery
for how you can try out our current ReaR GitHub master code
without conflicts with your already installed ReaR version.
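
Roughly, the basic steps are something like the following (only a sketch from memory; see the SDB article for the exact details):

# get the current ReaR GitHub master code into a separate directory
git clone https://github.com/rear/rear.git
cd rear
# as far as I know the git checkout uses its own etc/rear/ directory (not /etc/rear)
cp /etc/rear/local.conf etc/rear/local.conf
# run 'rear' directly from the checkout so the installed ReaR version stays untouched
sudo usr/sbin/rear -D mkbackup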

In general we at ReaR upstream do not support older ReaR versions.
We at ReaR upstream do not reject issues with older ReaR versions outright
(e.g. we may answer easy to solve questions also for older ReaR versions)
but we do not spend much time on issues with older ReaR versions because
we do not (and cannot) fix issues in released ReaR versions.
Issues in released ReaR versions are not fixed by us (by ReaR upstream).
Issues in released ReaR versions that got fixed in current ReaR upstream
GitHub master code might be fixed (if the fix can be backported with
reasonable effort) by the Linux distributor wherefrom you got ReaR.

jsmeix commented at 2024-08-19 15:43:

@StefanFromMunich

a side note regarding your

AUTORESIZE_PARTITIONS=true
AUTOSHRINK_DISK_SIZE_LIMIT_PERCENTAGE=100

I think those two settings contradict each other
because usr/share/rear/conf/default.conf
reads - even for the old ReaR 2.6 - (excerpts):

# When AUTORESIZE_PARTITIONS is true, all active partitions on all active disks
# get resized by the 430_autoresize_all_partitions.sh script
...
# When the first value in AUTORESIZE_PARTITIONS is neither true nor false
# only the last active partition on each active disk gets resized
# by the 420_autoresize_last_partitions.sh script.
#
# The following applies only when the last active partition on each active disk
# gets resized by the 420_autoresize_last_partitions.sh script
...
AUTOSHRINK_DISK_SIZE_LIMIT_PERCENTAGE...

see starting at
https://github.com/rear/rear/blob/rear-2.6/usr/share/rear/conf/default.conf#L413
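
In other words, roughly a sketch of the two consistent combinations described in the excerpt above (untested, only to illustrate the logic):

# either resize only the last partition on each disk (420_autoresize_last_partitions.sh)
# - this is the case where AUTOSHRINK_DISK_SIZE_LIMIT_PERCENTAGE is actually used
AUTORESIZE_PARTITIONS=
AUTOSHRINK_DISK_SIZE_LIMIT_PERCENTAGE=100

# or resize all partitions (430_autoresize_all_partitions.sh)
# - in this case AUTOSHRINK_DISK_SIZE_LIMIT_PERCENTAGE has no effect
AUTORESIZE_PARTITIONS=true

# (AUTORESIZE_PARTITIONS=false disables automatic partition resizing altogether)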

Regarding the 430_autoresize_all_partitions.sh script see
the current GitHub master code default.conf that reads

# Using AUTORESIZE_PARTITIONS='true' with 430_autoresize_all_partitions.sh
# may result badly aligned partitions in particular possibly harmful aligned
# according to what flash memory based disks (i.e. SSDs) actually need
# which is usually 4MiB or 8MiB alignment where a too small value
# will result lower speed and less lifetime of flash memory devices

see
https://github.com/rear/rear/blob/master/usr/share/rear/conf/default.conf#L567
and see our release notes where this was introduced
in ReaR 2.4 which reads (excerpts):

Version 2.4 (June 2018)
...
-   Major rework and changed default behaviour how ReaR behaves in migration
    mode when partitions can or must be resized to fit on replacement disks
    with different size. The new default behaviour is that only the partition
    end value of the last partition on a disk (and therefore its partition
    size) may get changed if really needed but no partition start value gets
    changed to avoid changes of the partition alignment. The new
    420_autoresize_last_partitions script implements the new behaviour and the
    old 400_autoresize_disks was renamed into 430_autoresize_all_partitions to
    still provide the old behaviour if that is explicitly requested by the
    user but the old behaviour may result unexpected changes of arbitrary
    partitions on a disk.

https://github.com/rear/rear/blob/master/doc/rear-release-notes.txt#L2778

For details see also
https://github.com/rear/rear/issues/1750
and the other issues mentioned therein
in particular
https://github.com/rear/rear/pull/1733

StefanFromMunich commented at 2024-09-05 05:30:

@jsmeix

Thank you for your comments.
I have taken all your comments into consideration.
I used the Git master code:

Relax-and-Recover 2.7 / Git

I updated the operating system:

NAME="Oracle Linux Server"
VERSION="8.10"
ID="ol"
ID_LIKE="fedora"

I have adjusted the ReaR configuration according to your instructions:

OUTPUT=ISO
OUTPUT_URL=nfs://193.14.236.134/volume1/Rear
BACKUP=NETFS
BACKUP_URL=nfs://193.14.236.134/volume1/Rear
BACKUP_PROG_EXCLUDE=("${BACKUP_PROG_EXCLUDE[@]}" '/media' '/var/tmp' '/var/crash' '/u02' '/dev/nvme0n1'  '/dev/nvme1n1'  '/dev/nvme2n1'  '/dev/nvme3n1'  '/dev/nvme4n1'  '/dev/nvme5n1' )
EXCLUDE_COMPONENTS=( "${EXCLUDE_COMPONENTS[@]}" "/dev/nvme0n1"  "/dev/nvme1n1"  "/dev/nvme2n1"  "/dev/nvme3n1"  "/dev/nvme4n1"  "/dev/nvme5n1" )
NETFS_KEEP_OLD_BACKUP_COPY=
UEFI_BOOTLOADER=/boot/efi/EFI/redhat/grubx64.efi
SECURE_BOOT_BOOTLOADER=/boot/efi/EFI/redhat/shimx64.efi
EXCLUDE_MOUNTPOINTS=( '/u03/flashdata/ZVK_MUC' '/u02/oradata/ZVK_MUC' )
AUTOEXCLUDE_MULTIPATH=n
AUTORESIZE_PARTITIONS=false

There are no more problems with the multipath devices.

However, there seems to be something wrong with the ISO image.
The virtual machine cannot be booted with it.
Attached you will find three screenshots with the screen messages.
The numbering corresponds to the order in which the screen messages appear.
After "Press any key to continue..." nothing happens.
What else do I have to do to make the machine bootable?

[Three screenshots attached: Rear_boot_Screen1, Rear_boot_Screen2, Rear_boot_Screen3]

StefanFromMunich commented at 2024-09-05 06:44:

@jsmeix

I also noticed that the two files xzio and fshelp,
which were not found during the boot process,
are listed in the backup log file:

block 7609484: /usr/lib/grub/x86_64-efi/xzio.mod
block 7600337: /usr/lib/grub/x86_64-efi/fshelp.mod

Only kernel.exec is listed in the ISO log file:

/usr/lib/grub/x86_64-efi/kernel.exec

The two files xzio and fshelp are not listed in the ISO log file.

The initrd.cgz file is mentioned in the ISO log file:

2024-09-04 11:35:00.755448505 Created initrd.cgz with gzip default compression (778 MiB) in 56 seconds
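
One way to check whether the missing GRUB modules actually ended up inside the ISO would be to loop-mount it, for example like this (the ISO path is only a placeholder for wherever the image is stored):

# loop-mount the ReaR ISO read-only and search for the GRUB modules
# that were reported as "not found" during boot
mount -o loop,ro /path/to/rear-HOSTNAME.iso /mnt
find /mnt -name 'xzio.mod' -o -name 'fshelp.mod'
umount /mnt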

StefanFromMunich commented at 2024-09-23 10:39:

@jsmeix
Can you please remove the label "old version"?
I have completed the upgrade and am now working with the Git master code:
Relax-and-Recover 2.7 / Git
I also documented this in the last bug description.

jsmeix commented at 2024-09-23 13:52:

@StefanFromMunich
according to your
https://github.com/rear/rear/issues/3230#issuecomment-2330635374
(excerpt)

There are no more problems with the multipath devices.

However, there seems to be something wrong with the ISO image.
The virtual machine cannot be booted with it.

this issue has now changed its topic / subject
from something about

NVMe multipathing devices

to something about

UEFI booting from ReaR's ISO image

which is somewhat problematic for us because
we at ReaR upstream (and in particular I)
prefer separate issues for separate topics
so that the right people who know best about
a certain area can focus on a particular issue.

So ideally I wish you could re-report your
"UEFI booting from ReaR's ISO image" issue
as a new separate issue here at GitHub.

I am not at all an expert in UEFI booting and
I am also in general not an expert in booting stuff
so personally I cannot help much with booting issues.

In particular I am not a Red Hat / Fedora / Oracle Linux user
so I cannot reproduce issues which are specific for
those Linux distributions (and in particular booting issues
are often specific for a particular Linux distribution).

For my personal experience with UEFI booting
from ReaR's ISO image you may read through
https://github.com/rear/rear/issues/3084
in particular therein see
https://github.com/rear/rear/issues/3084#issuecomment-1833496190
(and follow the links therein as needed)
and see
https://github.com/rear/rear/issues/3084#issuecomment-1835773844
and finally see
https://github.com/rear/rear/issues/3084#issuecomment-2330953577

By the way:
I think both

UEFI_BOOTLOADER=/boot/efi/EFI/redhat/grubx64.efi
SECURE_BOOT_BOOTLOADER=/boot/efi/EFI/redhat/shimx64.efi

at the same time is currently not supported in ReaR because
setting SECURE_BOOT_BOOTLOADER overwrites UEFI_BOOTLOADER
and when SECURE_BOOT_BOOTLOADER is set it specifies the
first-stage bootloader (which is shim) and then shim
runs as second-stage bootloader something that is
basically hardcoded inside shim so that what ReaR does
is to copy all possible second-stage bootloaders
(basically all GRUB EFI binaries as far as I remember)
into the ISO to make shim's second-stage bootloader
available via the ISO, cf. the comments in
usr/share/rear/output/USB/Linux-i386/100_create_efiboot.sh
starting at

# Copy UEFI bootloader:
if test -f "$SECURE_BOOT_BOOTLOADER" ; then

currently online at
https://github.com/rear/rear/blob/master/usr/share/rear/output/USB/Linux-i386/100_create_efiboot.sh#L65
This is for OUTPUT=USB but since
https://github.com/rear/rear/pull/3031
it is similar to what is done for OUTPUT=ISO.
For background information what I had done in the past
you may have a look at
https://github.com/rear/rear/pull/3031
(and follow the links therein as needed).
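
So as far as I understand it (only a sketch based on the above, nothing I verified on your system), when booting via shim with Secure Boot it should be sufficient to set only

# SECURE_BOOT_BOOTLOADER specifies the first-stage bootloader (shim);
# when it is set it takes precedence over UEFI_BOOTLOADER
SECURE_BOOT_BOOTLOADER=/boot/efi/EFI/redhat/shimx64.efi

in /etc/rear/local.conf.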

StefanFromMunich commented at 2024-10-04 09:53:

@jsmeix

Thank you for the detailed analysis.
As per your suggestion, I am now closing this issue. As a replacement, I have opened issue
UEFI booting from ReaR ISO image does not work #3326


[Export of Github issue for rear/rear.]