#3459 Issue closed: Cannot create EFI Boot Manager entry for ESP /dev/md0p1 (unable to find the underlying disk)

Labels: enhancement, fixed / solved / done

sduehr opened issue at 2025-04-22 13:28:

ReaR version

Relax-and-Recover 2.9 / 2025-01-31

Describe the ReaR bug in detail

On a UEFI Secure Boot enabled system with software RAID-1 of two NVMe disks, rear recover fails with

WARNING:
For this system               
SUSE_LINUX/15.6 on Linux-i386 (based on SUSE/15/i386)
there is no code to install a boot loader on the recovered system
or the code that we have failed to install the boot loader correctly.
Please contribute appropriate code to the Relax-and-Recover project,
see http://relax-and-recover.org/development/
Take a look at the scripts in /usr/share/rear/finalize - for example
for PC architectures like x86 and x86_64 see the script 
/usr/share/rear/finalize/Linux-i386/660_install_grub2.sh
and for POWER architectures like ppc64le see the script
/usr/share/rear/finalize/Linux-ppc64le/660_install_grub2.sh
---------------------------------------------------
|  IF YOU DO NOT INSTALL A BOOT LOADER MANUALLY,  |
|  THEN YOUR SYSTEM WILL NOT BE ABLE TO BOOT.     |
---------------------------------------------------
You can use 'chroot /mnt/local bash --login'
to change into the recovered system and 
manually install a boot loader therein.

Platform

Linux x64

OS version

openSUSE Leap 15.6

Backup

BAREOS

Storage layout

# lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT
NAME           KNAME        PKNAME       TRAN   TYPE  FSTYPE            LABEL         SIZE MOUNTPOINT
/dev/sr0       /dev/sr0                  sata   rom   iso9660           REAR-ISO    848.2M 
/dev/nvme0n1   /dev/nvme0n1              nvme   disk  linux_raid_member sl15test1:0    16G 
`-/dev/md0     /dev/md0     /dev/nvme0n1        raid1                                  16G 
  |-/dev/md0p1 /dev/md0p1   /dev/md0            part                                  512M /mnt/local/boot/efi
  `-/dev/md0p2 /dev/md0p2   /dev/md0            part                                 15.3G /mnt/local
/dev/nvme0n2   /dev/nvme0n2              nvme   disk  linux_raid_member sl15test1:0    16G 
`-/dev/md0     /dev/md0     /dev/nvme0n2        raid1                                  16G 
  |-/dev/md0p1 /dev/md0p1   /dev/md0            part                                  512M /mnt/local/boot/efi
  `-/dev/md0p2 /dev/md0p2   /dev/md0            part                                 15.3G /mnt/local

What steps will reproduce the bug?

rear -D recover

Workaround, if any

No response

Additional information

rear-sl15test1.log

The error message seems to come from https://github.com/rear/rear/blob/master/usr/share/rear/finalize/Linux-i386/670_run_efibootmgr.sh#L79-L97
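For background, the failing helper has to map an ESP partition back to its underlying whole disk. A rough sysfs-based lookup (illustrative only, not ReaR's actual get_device_from_partition code) shows why this dead-ends for /dev/md0p1:

# Illustrative sketch - resolve the parent block device of a partition
# via sysfs, which is roughly what a partition-to-disk lookup amounts to:
part=md0p1
disk=$( basename "$( readlink -f /sys/class/block/$part/.. )" )
echo "$disk"
# prints "md0" - a RAID device, not a physical disk; the RAID member
# disks (/dev/nvme0n1, /dev/nvme0n2) are not reachable this way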

jsmeix commented at 2025-04-22 14:45:

@sduehr

what does your etc/rear/local.conf contain?
In your
https://github.com/user-attachments/files/19850189/rear-sl15test1.log
I find

+ source /etc/rear/local.conf
++ BACKUP=BAREOS
++ BAREOS_CLIENT=sl15test1.ovirt1.bareos.test-fd
++ BAREOS_FILESET=LinuxAll
++ BAREOS_RESTORE_JOB=RestoreFiles
+ source_return_code=0

so it seems

BACKUP=BAREOS
BAREOS_CLIENT=sl15test1.ovirt1.bareos.test-fd
BAREOS_FILESET=LinuxAll
BAREOS_RESTORE_JOB=RestoreFiles

is all you have in etc/rear/local.conf,
which looks insufficient (e.g. no OUTPUT=...).
I don't want to reverse-engineer your ReaR configuration,
so please show it to me in human-readable form.

For RAID there is no automatism in ReaR to determine
on which devices the bootloader should be installed,
so you need to tell ReaR the disk devices
where the bootloader should be installed,
for BIOS normally via GRUB2_INSTALL_DEVICES;
cf. my RAID tests (all with BIOS) on
https://github.com/rear/rear/wiki/Test-Matrix-ReaR-2.7

For UEFI I don't know off the top of my head
how ReaR works with RAID ...

sduehr commented at 2025-04-23 15:44:

@jsmeix

I already tried to add GRUB2_INSTALL_DEVICES to /etc/rear/local.conf, as I noticed that you mentioned it in https://github.com/rear/rear/issues/3357, so it looks like this:

BACKUP=BAREOS
BAREOS_CLIENT=sl15test1.ovirt1.bareos.test-fd
BAREOS_FILESET=LinuxAll
BAREOS_RESTORE_JOB=RestoreFiles
GRUB2_INSTALL_DEVICES="/dev/nvme0n1 /dev/nvme0n2"

and OUTPUT=ISO is the default in /usr/share/rear/conf/default.conf, which is why I didn't add it.

ReaR does recreate the software RAID successfully; that already works in ReaR 2.9 even without explicitly setting GRUB2_INSTALL_DEVICES. But obviously 670_run_efibootmgr.sh doesn't use that variable.

I just tried this experimental quick hack in 670_run_efibootmgr.sh:

...
boot_efi_parts="/dev/nvme0n1p1 /dev/nvme0n2p1"
for efipart in $boot_efi_parts ; do
    # /dev/sda1 or /dev/mapper/vol34_part2 or /dev/mapper/mpath99p4
    partition_block_device=$( get_device_name $efipart )
    # 1 or 2 or 4 for the examples above
    partition_number=$( get_partition_number $partition_block_device )
    if ! disk=$( get_device_from_partition $partition_block_device $partition_number ) ; then
        LogPrintError "Cannot create EFI Boot Manager entry for ESP $partition_block_device (unable to find the underlying disk)"
        # do not error out - we may be able to locate other disks if there are more of them
        continue
    fi
    LogPrint "Creating  EFI Boot Manager entry '$OS_VENDOR $OS_VERSION' for '$bootloader' (UEFI_BOOTLOADER='$UEFI_BOOTLOADER') "
    Log efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label \"${OS_VENDOR} ${OS_VERSION}\" --loader \"\\${bootloader}\"
    if efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label "${OS_VENDOR} ${OS_VERSION}" --loader "\\${bootloader}" ; then
        # ok, boot loader has been set-up - continue with other disks (ESP can be on RAID)
        NOBOOTLOADER=''
    else
        LogPrintError "efibootmgr failed to create EFI Boot Manager entry on $disk partition $partition_number (ESP $partition_block_device )"
    fi
done

But then I get

2025-04-23 16:21:30.923649503 ERROR: 
                              ====================
                              BUG in /usr/share/rear/lib/layout-functions.sh line 462:
                              get_device_from_partition called with '/dev/nvme0n1p1' that is no block device
                              --------------------
                              Please report it at https://github.com/rear/rear/issues
                              and include all related parts from /var/log/rear/rear-sl15test1.log
                              preferably the whole debug information via 'rear -D recover'
                              ====================

because /dev/nvme0n1p1 does not exist. That seems to be normal, and probably as it should be, since the whole disk is part of the software RAID.

RESCUE sl15test1:~ # parted /dev/nvme0n1 print
Warning: Not all of the space available to /dev/nvme0n1 appears to be used, you can fix the GPT to use all of the space (an
extra 256 blocks) or continue with the current setting? 
Fix/Ignore? I                                                             
Model: NVMe Device (nvme)
Disk /dev/nvme0n1: 17.2GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name   Flags
 1      1049kB  538MB   537MB   fat32        md0p1
 2      538MB   17.0GB  16.5GB  ext4         md0p2

RESCUE sl15test1:~ # fdisk -l /dev/nvme0n1 
GPT PMBR size mismatch (33554175 != 33554431) will be corrected by write.
The backup GPT table is not on the end of the device.
Disk /dev/nvme0n1: 16 GiB, 17179869184 bytes, 33554432 sectors
Disk model: VMware Virtual NVMe Disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 5138A5AD-0349-46CC-B662-444781FEADE4

Device           Start      End  Sectors  Size Type
/dev/nvme0n1p1    2048  1050623  1048576  512M Linux filesystem
/dev/nvme0n1p2 1050624 33241087 32190464 15.3G Linux filesystem
RESCUE sl15test1:~ # 
RESCUE sl15test1:~ # ls /dev/nvme0n1p1
ls: cannot access '/dev/nvme0n1p1': No such file or directory
RESCUE sl15test1:~ # ls /dev/nvme0n1p2
ls: cannot access '/dev/nvme0n1p2': No such file or directory
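
In other words, the kernel exposes partition device nodes only on the assembled RAID device, not on its member disks, even though the members carry a mirrored copy of the partition table. A quick sanity check in the rescue system (a sketch using the devices from above):

# Partition nodes exist on the assembled RAID but not on its members:
for dev in /dev/md0p1 /dev/nvme0n1p1 ; do
    test -b "$dev" && echo "$dev: block device exists" || echo "$dev: missing"
done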

Manually setting the variables and running efibootmgr works like this:

RESCUE sl15test1:~ # . /etc/rear/os.conf
RESCUE sl15test1:~ # echo $OS
GNU/Linux
RESCUE sl15test1:~ # bootloader='EFI\opensuse\shim.efi'
RESCUE sl15test1:~ # disk="/dev/nvme0n1"
RESCUE sl15test1:~ # partition_number=1
RESCUE sl15test1:~ # echo efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label "${OS_VENDOR} ${OS_VERSION}" --loader "\\${bootloader}"
efibootmgr --create --gpt --disk /dev/nvme0n1 --part 1 --write-signature --label SUSE_LINUX 15.6 --loader \EFI\opensuse\shim.efi
RESCUE sl15test1:~ # 
RESCUE sl15test1:~ # efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label "${OS_VENDOR} ${
OS_VERSION}" --loader "\\${bootloader}"
BootCurrent: 0002
BootOrder: 0004,0000,0001,0002,0003
Boot0000* EFI VMware Virtual NVME Namespace (NSID 1)
Boot0001* EFI VMware Virtual NVME Namespace (NSID 2)
Boot0002* EFI VMware Virtual SATA CDROM Drive (0.0)
Boot0003* EFI Network
Boot0004* SUSE_LINUX 15.6
RESCUE sl15test1:~ # echo $?
0

RESCUE sl15test1:~ # disk="/dev/nvme0n2"
RESCUE sl15test1:~ # efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label "${OS_VENDOR} ${
OS_VERSION}" --loader "\\${bootloader}"
efibootmgr: ** Warning ** : Boot0004 has same label SUSE_LINUX 15.6
BootCurrent: 0002
BootOrder: 0005,0004,0000,0001,0002,0003
Boot0000* EFI VMware Virtual NVME Namespace (NSID 1)
Boot0001* EFI VMware Virtual NVME Namespace (NSID 2)
Boot0002* EFI VMware Virtual SATA CDROM Drive (0.0)
Boot0003* EFI Network
Boot0004* SUSE_LINUX 15.6
Boot0005* SUSE_LINUX 15.6
RESCUE sl15test1:~ # echo $?
0
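
The manual steps above can be wrapped into a single loop. A minimal sketch, assuming (as on this system) that the ESP is partition 1 on both RAID member disks and that shim is the bootloader:

# Create one NVRAM boot entry per RAID member disk,
# so the system can boot from either one:
. /etc/rear/os.conf
bootloader='EFI\opensuse\shim.efi'
partition_number=1
for disk in /dev/nvme0n1 /dev/nvme0n2 ; do
    efibootmgr --create --gpt --disk "$disk" --part "$partition_number" \
        --write-signature --label "${OS_VENDOR} ${OS_VERSION}" --loader "\\${bootloader}"
done

As seen above, efibootmgr warns about the duplicate label for the second entry but still exits with 0.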

sduehr commented at 2025-04-23 20:36:

This is not really nice; I couldn't find a better way to make it work:

--- 670_run_efibootmgr.sh.orig  2025-04-23 21:20:10.992824491 +0200
+++ 670_run_efibootmgr.sh   2025-04-23 21:51:59.862882588 +0200
@@ -76,6 +76,23 @@
 # EFI\fedora\shim.efi
 bootloader=$( echo $UEFI_BOOTLOADER | cut -d"/" -f4- | sed -e 's;/;\\;g' )

+if test "$GRUB2_INSTALL_DEVICES" ; then
+    if [[ $boot_efi_parts == "/dev/md0p"* ]]; then
+        partition_number=$( get_partition_number $boot_efi_parts )
+        for disk in $GRUB2_INSTALL_DEVICES; do
+            LogPrint "Creating  EFI Boot Manager entry '$OS_VENDOR $OS_VERSION' for '$bootloader' (UEFI_BOOTLOADER='$UEFI_BOOTLOADER') "
+            Log efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label \"${OS_VENDOR} ${OS_VERSION}\" --loader \"\\${bootloader}\"
+            if efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label "${OS_VENDOR} ${OS_VERSION}" --loader "\\${bootloader}" ; then
+                 # ok, boot loader has been set-up - continue with other disks (ESP can be on RAID)
+                 NOBOOTLOADER=''
+            else
+                 LogPrintError "efibootmgr failed to create EFI Boot Manager entry on $disk partition $partition_number (ESP $boot_efi_parts )"
+            fi
+        done
+        is_true $NOBOOTLOADER && return 1 || return 0
+    fi
+fi
+
 for efipart in $boot_efi_parts ; do
     # /dev/sda1 or /dev/mapper/vol34_part2 or /dev/mapper/mpath99p4
     partition_block_device=$( get_device_name $efipart )

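The "/dev/md0p"* test above is hardcoded for this system. A possible generalization (illustrative only, not part of the patch) would detect any RAID-backed ESP by checking whether its parent kernel device is an MD device:

# Illustrative sketch: lsblk's PKNAME column reports the parent kernel
# device, e.g. "md0" for /dev/md0p1 (assumes a single ESP partition):
esp_parent=$( lsblk -rno PKNAME $boot_efi_parts 2>/dev/null )
if [[ $esp_parent == md* ]] ; then
    # ... create one efibootmgr entry per disk in GRUB2_INSTALL_DEVICES ...
    :
fi
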
Is it worth creating a PR from that?

jsmeix commented at 2025-04-24 08:47:

@sduehr

thank you so much for your debugging
and even more for your enhancement to make it work!

Yes, of course, please submit a pull request.

It does not matter that it is not yet a perfect solution.
Compared to something that does not work at all,
anything which makes it work is a vital step forward.
From there, things can be further enhanced as needed.

jsmeix commented at 2025-05-12 08:50:

@sduehr @rear/contributors
via
https://github.com/rear/rear/pull/3471
I implemented what I wrote in
https://github.com/rear/rear/pull/3466#issuecomment-2857379995
and because it works OK for me on my particular test system,
I would much appreciate it if you could test
https://github.com/rear/rear/pull/3471
(as time permits, of course) and provide feedback
on how it behaves in your environment.

pcahyna commented at 2025-05-15 17:41:

For RAID there is no automatism in ReaR to determine
on which devices the bootloader should be installed,
so you need to tell ReaR the disk devices
where the bootloader should be installed

There is an automatism, created by yours truly in #2608, but it works only when each disk is independently partitioned and the partitions (including the ESP) are then combined into separate RAIDs (which is apparently the standard RAID install on Fedora-based distros). In this case we have a whole-disk RAID, which is apparently not supported by ReaR's automatisms. I was able to install RHEL this way only by injecting some shell commands, at least when automating the installation with Kickstart. I used this:

ignoredisk --only-use=Volume0|Volume0_0
%pre --log=/dev/console
mdadm --create /dev/md/Volume0 --name=Volume0 --metadata=1.0 --level raid1 --raid-disks=2 /dev/sda /dev/sdb
mdadm --wait /dev/md/Volume0
%end
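
For contrast, the layout that the #2608 automatism does handle combines per-disk partitions into separate RAIDs, roughly like this (hypothetical lsblk output, device names assumed):

NAME            TYPE  FSTYPE            MOUNTPOINT
/dev/sda        disk
|-/dev/sda1     part  linux_raid_member
| `-/dev/md125  raid1 vfat              /boot/efi
`-/dev/sda2     part  linux_raid_member
  `-/dev/md126  raid1 xfs               /
/dev/sdb        disk
|-/dev/sdb1     part  linux_raid_member
| `-/dev/md125  raid1 vfat              /boot/efi
`-/dev/sdb2     part  linux_raid_member
  `-/dev/md126  raid1 xfs               /

Here each member disk has real partition device nodes (/dev/sda1, /dev/sdb1), so ReaR can map the ESP back to the underlying disks automatically.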

jsmeix commented at 2025-06-12 14:30:

With https://github.com/rear/rear/pull/3471 merged,
this issue is fixed to the extent that the user
can now manually specify what efibootmgr will create.

For what is still missing for a fully automated proper solution see
https://github.com/rear/rear/pull/3471#issuecomment-2962009633


[Export of Github issue for rear/rear.]