#3466 PR closed: Fix calling efibootmgr on software RAID
Labels: enhancement, won't fix / can't fix / obsolete
sduehr opened issue at 2025-04-30 15:01:
- Type: Bug Fix
- Impact: Low
- Reference to related issue (URL): https://github.com/rear/rear/issues/3459
- How was this pull request tested? Manually on OpenSUSE Leap 15.6
- Description of the changes in this pull request:
On a system with software RAID, it is not easily possible to determine
the underlying disks automatically. The following makes use of
GRUB2_INSTALL_DEVICES to determine the appropriate devices for calling
efibootmgr; this requires, for example, the following configuration:
GRUB2_INSTALL_DEVICES="/dev/nvme0n1 /dev/nvme0n2"
See issue #3459 for more details.
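A minimal configuration sketch (the devices are the example values from this description; in ReaR, user settings like this typically go into /etc/rear/local.conf):
# /etc/rear/local.conf (sketch - adjust the devices to the actual RAID member disks):
GRUB2_INSTALL_DEVICES="/dev/nvme0n1 /dev/nvme0n2"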
jsmeix commented at 2025-04-30 15:21:
@sduehr
thank you for your improvement!
Tomorrow is a public holiday in Germany
and I am on vacation on Friday,
so (hopefully) next week I can have a look...
jsmeix commented at 2025-05-06 08:45:
Regarding my comment above
https://github.com/rear/rear/pull/3466#discussion_r2073042135
(excerpt):
the current code
partition_number=$( get_partition_number $boot_efi_parts )
assigns what seems to be meant to be one single number
'partition_number' from a variable that can hold
more than one value, 'boot_efi_parts' - note the plural 's'.
So I wonder what 'partition_number' becomes
when 'boot_efi_parts' contains several partitions with
different partition numbers, for example
/dev/md127p1 /dev/md128p2
I did a test on a RAID1 VM which I already have,
but that test VM uses BIOS, so I have no ESP to mount;
still I could do the test with the root filesystem.
I used a GitHub master code clone after "rear mkrescue"
(to get the diskdeps.conf and disktodo.conf files)
and then ran the actual test within the ReaR shell:
# git clone https://github.com/rear/rear.git
# mv rear rear.github.master
# cd rear.github.master
# vi etc/rear/local.conf
...
# usr/sbin/rear -D mkrescue
...
# export LAYOUT_DEPS=/root/rear.github.master/var/lib/rear/layout/diskdeps.conf
# export LAYOUT_TODO=/root/rear.github.master/var/lib/rear/layout/disktodo.conf
# usr/sbin/rear shell
REAR ... # boot_efi_parts=$( find_partition "fs:/" 'btrfsmountedsubvol fs' )
REAR ... # echo "'$boot_efi_parts'"
'/dev/md127p3
/dev/md127p4'
REAR ... # get_partition_number $boot_efi_parts
3
REAR ... # ( set -x ; get_partition_number $boot_efi_parts )
+ get_partition_number /dev/md127p3 /dev/md127p4
+ local partition_block_device=/dev/md127p3
++ grep --color=auto -o -E '[0-9]+$'
++ echo /dev/md127p3
+ local partition_number=3
+ test 3 -gt 0
+ test 3 -le 128
+ echo 3
3
REAR ... # ( set -x ; get_partition_number '/dev/md127p3 /dev/md127p4' )
+ get_partition_number '/dev/md127p3 /dev/md127p4'
+ local 'partition_block_device=/dev/md127p3 /dev/md127p4'
++ grep --color=auto -o -E '[0-9]+$'
++ echo '/dev/md127p3 /dev/md127p4'
+ local partition_number=4
+ test 4 -gt 0
+ test 4 -le 128
+ echo 4
4
REAR ... # get_partition_number "$boot_efi_parts"
bash: test: too many arguments
ERROR:
====================
BUG in /root/rear.github.master/usr/share/rear/lib/layout-functions.sh line 11:
Partition number '3
4' of partition /dev/md127p3
/dev/md127p4 is not a valid partition number.
--------------------
Please report it at https://github.com/rear/rear/issues
and include all related parts from /root/rear.github.master/var/log/rear/rear-localhost.log
preferably the whole debug information via 'rear -D shell'
====================
Some latest log messages since the last called script 998_dump_variables.sh:
2025-05-06 09:54:23.684387854 Finished running 'init' stage in 0 seconds
2025-05-06 09:54:23.687733321 Running shell workflow
Use debug mode '-d' for some debug messages or debugscript mode '-D' for full debug messages with 'set -x' output
===== Stack trace =====
Trace 0: /root/rear.github.master/usr/share/rear/lib/layout-functions.sh:11 get_partition_number
Trace 1: /root/rear.github.master/usr/share/rear/lib/_framework-setup-and-functions.sh:10 BugError
=== End stack trace ===
Aborting due to an error, check /root/rear.github.master/var/log/rear/rear-localhost.log for details
Error exit of rear shell (PID 31960) and its descendant processes
bash: kill: (1600) - No such process
bash: kill: (1600) - No such process
bash: test: too many arguments
ERROR:
====================
BUG in /root/rear.github.master/usr/share/rear/lib/layout-functions.sh line 35:
Partition /dev/md127p3
/dev/md127p4 is numbered '3
4'. More than 128 partitions are not supported.
--------------------
Please report it at https://github.com/rear/rear/issues
and include all related parts from /root/rear.github.master/var/log/rear/rear-localhost.log
preferably the whole debug information via 'rear -D shell'
====================
Some latest log messages since the last called script 998_dump_variables.sh:
Please report it at https://github.com/rear/rear/issues
and include all related parts from /root/rear.github.master/var/log/rear/rear-localhost.log
preferably the whole debug information via 'rear -D shell'
====================
2025-05-06 10:28:08.962743462 Error exit of rear shell (PID 31960) and its descendant processes
2025-05-06 10:28:08.979008571 bash,31960 --rcfile /root/rear.github.master/usr/share/rear/lib/rear-shell.bashrc -i
`-bash,1595 --rcfile /root/rear.github.master/usr/share/rear/lib/rear-shell.bashrc -i
`-pstree,1596 -Aplau 31960
Use debug mode '-d' for some debug messages or debugscript mode '-D' for full debug messages with 'set -x' output
===== Stack trace =====
Trace 0: /root/rear.github.master/usr/share/rear/lib/layout-functions.sh:35 get_partition_number
Trace 1: /root/rear.github.master/usr/share/rear/lib/_framework-setup-and-functions.sh:10 BugError
=== End stack trace ===
Aborting due to an error, check /root/rear.github.master/var/log/rear/rear-localhost.log for details
Error exit of rear shell (PID 31960) and its descendant processes
bash: kill: (1644) - No such process
bash: kill: (1644) - No such process
3 4
REAR ... # exit
exit
Exiting rear shell (PID 31960) and its descendant processes ...
bash: kill: (1679) - No such process
Running exit tasks
Exiting rear shell (PID 31138) and its descendant processes ...
Running exit tasks
So the current code somehow works by luck
or errors out by bad luck, depending on
how get_partition_number() is called:
It works for the intended cases like
get_partition_number /dev/md127p3
which results in '3', and
get_partition_number /dev/md127p4
which results in '4'.
It somehow works for some unintended cases like
get_partition_number /dev/md127p3 /dev/md127p4
which results in '3', and
get_partition_number '/dev/md127p3 /dev/md127p4'
which results in '4'.
It errors out for other unintended cases like
get_partition_number "$( echo -e '/dev/md127p3\n/dev/md127p4' )"
jsmeix commented at 2025-05-06 12:11:
@sduehr
meanwhile I am wondering why the if clause
if [[ $boot_efi_parts == "/dev/md"*"p"* ]]; then
...
fi
is needed at all.
When the disk devices for calling 'efibootmgr'
are specified by the user in GRUB2_INSTALL_DEVICES,
then those devices should be "just used as specified"
to provide "final power to the user",
the same as in finalize/Linux-i386/660_install_grub2.sh.
So I wonder why not simplify the code to:
# On a system with software RAID, it is not easily possible to determine the
# underlying disks automatically. The following makes use of GRUB2_INSTALL_DEVICES
# to determine the appropriate devices for calling efibootmgr, this requires for
# example to add the following configuration:
# GRUB2_INSTALL_DEVICES="/dev/nvme0n1 /dev/nvme0n2"
# See also https://github.com/rear/rear/issues/3459
if test "$GRUB2_INSTALL_DEVICES" ; then
    # FIXME: See https://github.com/rear/rear/pull/3466#issuecomment-2853742407
    partition_number=$( get_partition_number $boot_efi_parts )
    for disk in $GRUB2_INSTALL_DEVICES; do
        LogPrint "Creating EFI Boot Manager entry '$OS_VENDOR $OS_VERSION' for '$bootloader' (UEFI_BOOTLOADER='$UEFI_BOOTLOADER') "
        Log efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label \"${OS_VENDOR} ${OS_VERSION}\" --loader \"\\${bootloader}\"
        if efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label "${OS_VENDOR} ${OS_VERSION}" --loader "\\${bootloader}" ; then
            # ok, boot loader has been set-up - continue with other disks (ESP can be on RAID)
            NOBOOTLOADER=''
        else
            LogPrintError "efibootmgr failed to create EFI Boot Manager entry on $disk partition $partition_number (ESP $partition_block_device )"
        fi
    done
    is_true $NOBOOTLOADER && return 1 || return 0
fi
Meanwhile I also wonder if we should not rather introduce
a separate user config variable EFIBOOTMGR_INSTALL_DEVICES
instead of misusing an existing user config variable
with a misleading name, because it is meant for GRUB2.
But this does not need to be done via this pull request.
Such cleanup can be done in a subsequent pull request
as long as we are in the ReaR 3.0 development phase.
As long as nothing is documented in default.conf
we can experiment with what the best way is.
sduehr commented at 2025-05-06 14:31:
Sure, yes, to find a rather quick solution I kind of abused
GRUB2_INSTALL_DEVICES. So obviously checking for /dev/md*p* is not a
good idea; my idea was to detect software RAID like this and in that
case assume that $boot_efi_parts contains only one element.
Probably the approach you proposed, introducing a new config variable EFIBOOTMGR_INSTALL_DEVICES, makes more sense. I will create a new PR with that, then this one can just be dropped.
jsmeix commented at 2025-05-07 07:13:
@sduehr
thank you for your efforts to provide a proper solution!
With a new config variable EFIBOOTMGR_INSTALL_DEVICES
we are free to implement things in whatever way works best
for the 'efibootmgr' case.
I suggest implementing the 'efibootmgr' case
with EFIBOOTMGR_INSTALL_DEVICES basically the same way
as I did with GRUB2_INSTALL_DEVICES in
finalize/Linux-i386/660_install_grub2.sh,
to provide "final power to the user", which means
that with EFIBOOTMGR_INSTALL_DEVICES there must not be
automatisms whose results are out of the user's control;
instead, with EFIBOOTMGR_INSTALL_DEVICES the user must
get full control over what 'efibootmgr' will do.
In the current code of this PR 'efibootmgr' is called this way
efibootmgr --create --gpt --disk $disk --part $partition_number --write-signature --label "${OS_VENDOR} ${OS_VERSION}" --loader "\\${bootloader}"
where only $disk is under the user's control
via GRUB2_INSTALL_DEVICES.
But $partition_number is under the control of an
unreliably working automatism because of
partition_number=$( get_partition_number $boot_efi_parts )
cf. https://github.com/rear/rear/pull/3466#issuecomment-2853742407
and $boot_efi_parts is also determined by some automatism.
It seems $bootloader is somewhat under the user's control
via UEFI_BOOTLOADER, though only indirectly, via
# UEFI_BOOTLOADER=/boot/efi/EFI/opensuse/shim.efi
# bootloader=$( echo $UEFI_BOOTLOADER | cut -d"/" -f4- | sed -e 's;/;\\;g' )
# echo "\\${bootloader}"
\EFI\opensuse\shim.efi
but that seems to be sufficiently under the user's control.
So $partition_number needs to get under the user's control
when the user specifies EFIBOOTMGR_INSTALL_DEVICES.
I assume that $partition_number must be the number of a
partition on the $disk which is used for 'efibootmgr',
but for a RAID1 of /dev/nvme0n1 and /dev/nvme0n2
there are no partition device nodes
like /dev/nvme0n1p1 and /dev/nvme0n2p1,
cf. https://github.com/rear/rear/issues/3459#issuecomment-2824742925
So specifying partition device nodes like
EFIBOOTMGR_INSTALL_DEVICES=( /dev/nvme0n1p1 /dev/nvme0n2p1 )
and extracting $disk and $partition_number from them
looks somewhat wrong.
Perhaps it is sufficient in practice when usually
only disk device nodes need to be specified like
EFIBOOTMGR_INSTALL_DEVICES=( /dev/nvme0n1 /dev/nvme0n2 )
and then $partition_number defaults to '1'
(the same as "man efibootmgr" tells),
which should work in the vast majority of cases
because usually the ESP is the first partition.
And for exceptional cases it is also possible
to specify partition device nodes like
EFIBOOTMGR_INSTALL_DEVICES=( /dev/nvme0n1p2 /dev/nvme0n2p3 )
even if they do not actually exist, and then $disk
and $partition_number get somehow extracted from them
and 'efibootmgr' gets called two times as
efibootmgr ... --disk /dev/nvme0n1 --part 2 ...
and
efibootmgr ... --disk /dev/nvme0n2 --part 3 ...
according to what the user had specified, which
provides "final power to the user", which also means
that ReaR has to obey what the user had specified.
Regarding "$disk and $partition_number get somehow extracted":
Perhaps it is better to use a specific separator character
by which an optional partition number can be appended
to a disk device, e.g. with a colon as separator character
EFIBOOTMGR_INSTALL_DEVICES=( /dev/nvme0n1:2 /dev/nvme0n2:3 )
to make it unambiguous whether or not
an optional partition number is appended
(see the parsing sketch below).
Alternatively, when EFIBOOTMGR_INSTALL_DEVICES is an array
we can use strings of words as array elements like
EFIBOOTMGR_INSTALL_DEVICES=( '/dev/nvme0n1 2' '/dev/nvme0n2 3' )
to append an optional partition number.
In general I would recommend using arrays for variables
which can contain more than one value.
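A minimal sketch, assuming the colon-separator form from above, of how $disk and an optional $partition_number could be extracted (illustration only, not ReaR code; the default of '1' matches the "man efibootmgr" default mentioned above):
# Illustration only - split hypothetical EFIBOOTMGR_INSTALL_DEVICES elements
# of the form /dev/nvme0n1 or /dev/nvme0n1:2 into disk and partition number:
EFIBOOTMGR_INSTALL_DEVICES=( /dev/nvme0n1:2 /dev/nvme0n2 )
for device in "${EFIBOOTMGR_INSTALL_DEVICES[@]}" ; do
    disk="${device%%:*}"
    partition_number="${device##*:}"
    # Without an appended ':<number>' the whole element is the disk,
    # so fall back to partition 1:
    test "$partition_number" = "$device" && partition_number=1
    echo "efibootmgr ... --disk $disk --part $partition_number ..."
done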
jsmeix commented at 2025-05-08 07:40:
@sduehr
if what I wrote in
https://github.com/rear/rear/pull/3466#issuecomment-2857379995
looks strange to you
and/or if you don't find time to implement it,
I could try to implement what I mean with
https://github.com/rear/rear/pull/3466#issuecomment-2857379995
probably next week - as time permits.
jsmeix commented at 2025-05-09 07:14:
When Linux kernel RAID1 is used I am wondering
if it is correct at all to install the bootloader
on each of the RAID1 member disks.
I am neither a RAID expert nor a bootloader expert
so my following reasoning could be plain wrong.
My reasoning, as far as I can imagine things, is:
When Linux kernel RAID1 is used for two whole disks,
then the whole disks are RAID1 members
and both whole disks get combined into
a single RAID1 block device,
so both RAID1 member disks become a single RAID1 disk
and on that RAID1 disk the partitions get created.
'lsblk' shows this as follows (on my current RAID1 test VM):
# lsblk -ipo NAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINTS /dev/vda /dev/vdb
NAME TRAN TYPE FSTYPE LABEL SIZE MOUNTPOINTS
/dev/vda virtio disk linux_raid_member any:myRAID1 10G
`-/dev/md127 raid1 10G
|-/dev/md127p1 part vfat 880M /boot/efi
|-/dev/md127p2 part ext4 8G /
`-/dev/md127p3 part swap 1G [SWAP]
/dev/vdb virtio disk linux_raid_member any:myRAID1 10G
`-/dev/md127 raid1 10G
|-/dev/md127p1 part vfat 880M /boot/efi
|-/dev/md127p2 part ext4 8G /
`-/dev/md127p3 part swap 1G [SWAP]
In particular there are no partition device nodes
on the RAID1 member disks
# ls -l /dev/vd*
brw-rw---- 1 root disk 254, 0 May 9 08:32 /dev/vda
brw-rw---- 1 root disk 254, 16 May 9 08:32 /dev/vdb
which is correct because the partitions
are on the RAID1 disk /dev/md127
# ls -l /dev/md127*
brw-rw---- 1 root disk 9, 127 May 9 08:32 /dev/md127
brw-rw---- 1 root disk 259, 0 May 9 08:32 /dev/md127p1
brw-rw---- 1 root disk 259, 1 May 9 08:32 /dev/md127p2
brw-rw---- 1 root disk 259, 2 May 9 08:32 /dev/md127p3
and this is also the same in disklayout.conf:
# grep -v '^#' var/lib/rear/layout/disklayout.conf | cut -b-140
disk /dev/vda 10737418240 gpt
disk /dev/vdb 10737418240 gpt
raidarray /dev/md127 level=raid1 raid-devices=2 devices=/dev/vda,/dev/vdb name=myRAID1 metadata=1.0 uuid=68d5e836:0f5d6359:f8e5bd03:6a69a1a1
raiddisk /dev/md127 10737287168 gpt
part /dev/md127 922746880 1048576 rear-noname boot,esp /dev/md127p1
part /dev/md127 8589934592 923795456 rear-noname none /dev/md127p2
part /dev/md127 1078051328 9513730048 rear-noname swap /dev/md127p3
fs /dev/md127p1 /boot/efi vfat uuid=4D3E-73EE label= options=rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mi
fs /dev/md127p2 / ext4 uuid=2e1395b7-db94-471d-920e-09531ac1b8c9 label= blocksize=4096 reserved_blocks=4% max_mounts=-1 check_interval=0d by
swap /dev/md127p3 uuid=779b25f1-046a-4caf-90a7-5b457517b038 label=
The Linux kernel RAID1 ensures that
the data on both RAID1 member disks is the same.
Here I checked only the first GiB, which contains
in particular the primary GPT and the ESP:
# dd if="/dev/vda" of=vda.1GiB bs=1M count=1024 status=progress
689963008 bytes (690 MB, 658 MiB) copied, 1 s, 676 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.73144 s, 620 MB/s
# dd if="/dev/vdb" of=vdb.1GiB bs=1M count=1024 status=progress
897581056 bytes (898 MB, 856 MiB) copied, 2 s, 449 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.49388 s, 431 MB/s
# ls -l vd*
-rw-r--r-- 1 root root 1073741824 May 9 08:45 vda.1GiB
-rw-r--r-- 1 root root 1073741824 May 9 08:45 vdb.1GiB
# diff -s vda.1GiB vdb.1GiB
Files vda.1GiB and vdb.1GiB are identical
So I think when Linux kernel RAID1 is used
the correct way to install the bootloader is
to "simply" install the bootloader on the RAID1 disk and
rely on the Linux kernel RAID1 to do its job and ensure
the bootloader ends up the same on both RAID1 member disks,
so that the system can be booted identically from each one
of the two RAID1 member disks, in particular also
when one of the two RAID1 member disks fails.
jsmeix commented at 2025-05-09 10:52:
https://github.com/rear/rear/pull/3471
is my first attempt to implement what I wrote above in
https://github.com/rear/rear/pull/3466#issuecomment-2857379995
pcahyna commented at 2025-05-15 17:30:
> When Linux kernel RAID1 is used I am wondering if it is correct at all to install the bootloader on each of the RAID1 member disks.
> [...]
> So I think when Linux kernel RAID1 is used the correct way how to install the bootloader is to "simply" install the bootloader on the RAID1 disk and rely on the Linux kernel RAID1 to do its job to ensure the bootloader gets same on both RAID1 member disks so that the system can be booted same from each one of the two RAID1 member disks, in particular also when one of the two RAID1 member disks fails.
That's my conclusion as well, see
https://github.com/rear/rear/pull/3471#issuecomment-2884527473 .
Note though that efibootmgr does not install anything; it merely tells the
firmware which partition to boot from. This actually works, as the
partition is identified by its GUID, not by some hardware identifier,
and the GUID is stored in the disk data. For other firmware approaches,
this would not have worked (one that we actually support is
OpenFirmware, where the disks are identified by their hardware paths,
and thus one needs to tell the firmware to try booting from both
component disks, as shown by #1887 (although that was for multipaths,
the idea is the same)).
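For illustration, the GUID in question is the ESP's partition GUID (PARTUUID), which can be inspected on the running system, e.g. (device name taken from the lsblk output above; the actual value will differ):
# Show the partition GUID (PARTUUID) of the ESP on the RAID1 device:
lsblk -no PARTUUID /dev/md127p1
# 'efibootmgr -v' shows the same GUID inside the HD(...) device path
# of the boot entry, which is why the entry works regardless of which
# RAID1 member disk the firmware actually reads from.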
sduehr commented at 2025-05-19 17:38:
PR #3471 looks good, thanks. I think we don't need this one any more.
jsmeix commented at 2025-05-20 08:57:
This one is superseded by
https://github.com/rear/rear/pull/3471
[Export of Github issue for rear/rear.]