#3267 PR merged: improve device recognition when creating efibootmgr entry

damm620 opened issue at 2024-07-04 13:12:

Pull Request Details:
  • Type: Bug Fix

  • Impact: Critical

  • Reference to related issue (URL):
    fixes https://github.com/rear/rear/issues/3265

  • How was this pull request tested?
    on own servers

  • Description of the changes in this pull request:
    the devices and partition numbers used to create an entry in efibootmgr were not properly crafted when having multipath devices (mpd_boot_1p1). this is fixed by this PR

gdha commented at 2024-07-08 09:37:

@rear/contributors Would like to merge this PR on Friday 12 July 2024 if there are no objections from the ReaR contributors?

damm620 commented at 2024-07-11 14:42:

@schlomo

Can't test, LGTM in principle.

I find this code (both the old and new version) very hard to read & understand because it re-uses variables with different content and it is not really self-explanatory.

May I therefore suggest to expand it a little bit, to use different variable names for different steps of the process, and to maybe add a comment that illustrates the data coming in and going out of this code block? To help understand how the input and output look like.

Especially for special cases like multipath it will help future developers to understand the code if there are examples of typical input and output in the comments.

I did hopefully clearify the code a bit. Do you have any additional suggestions?

damm620 commented at 2024-07-11 15:01:

Ahh. now I understand the code.

I guess you could do this instead:

read pkname kname mountpoint too_many_results <<<$(lsblk -nrpo PKNAME,KNAME,MOUNTPOINT | grep '/boot/efi'|sort -u)
test "$pkname" -a -z "$too_many_results" || LogPrintError "cannot determine /boot/efi disk and partition devices:$LF$pkname $kname $mountpoint $too_many_results"
efi_disk=$(get_device_name $pkname)
efi_part=$(get_device_name $kname)

I personally would find that easier to read but that again is also a lot of personal opinion.

Your example confuses me more than clearifies. Everybodys taste is a bit different I guess, but I am glad we find a solution here :)

Could you test that the recovered system can boot from the other multipath device, e.g. if you recover, shutdown, remove the first link to the multipath device, and boot again?

I our systems we use multipath to have different path to the same storage lun. So in the end you are writing data to the same lun, regardless with (multi-)path you are using. Or did I misunterstood you question?

schlomo commented at 2024-07-11 15:05:

Could you test that the recovered system can boot from the other multipath device, e.g. if you recover, shutdown, remove the first link to the multipath device, and boot again?

I our systems we use multipath to have different path to the same storage lun. So in the end you are writing data to the same lun, regardless with (multi-)path you are using. Or did I misunterstood you question?

With multipath I would always check that the system can boot from the secondary path. The problem is maybe not the boot code on disk but the UEFI boot entry that only points to the primary path and not the secondary path as a fallback.

When I read the issue description I thought that for multipath devices you would actually create multie EFI boot entries, one per path, to facilitate booting off the secondary paths.

damm620 commented at 2024-07-12 07:27:

Could you test that the recovered system can boot from the other multipath device, e.g. if you recover, shutdown, remove the first link to the multipath device, and boot again?

I our systems we use multipath to have different path to the same storage lun. So in the end you are writing data to the same lun, regardless with (multi-)path you are using. Or did I misunterstood you question?

With multipath I would always check that the system can boot from the secondary path. The problem is maybe not the boot code on disk but the UEFI boot entry that only points to the primary path and not the secondary path as a fallback.

When I read the issue description I thought that for multipath devices you would actually create multie EFI boot entries, one per path, to facilitate booting off the secondary paths.

Maybe I do not understand the concept of EFI correctly, but I our case we do have the /boot/efi and /boot as two partitions on the same disk. So when the system finds the EFI partition, it should also find the kernel to load the linux.

Should I do anything now regarding this PR?

schlomo commented at 2024-07-12 07:43:

I'll merge it now and we see how it goes.

I'd really appreciate it if you could do the test I mentioned, to be sure that a recovered system will boot also from the other / secondary multipath devices.

Please ensure to wipe the EFI boot records on the recovery system before the test to be sure that there is no "leftover" old configuration.

For illustration: On my home PC the EFI boot entries look like this:

$ sudo efibootmgr -v
BootCurrent: 0001
Timeout: 5 seconds
BootOrder: 0001,0002
Boot0001* ubuntu    HD(1,GPT,ef2a7e35-5378-4d76-8d7c-024c8f67de08,0x800,0x100000)/File(\EFI\UBUNTU\GRUBX64.EFI)..BO
Boot0002* UEFI OS   HD(1,GPT,ef2a7e35-5378-4d76-8d7c-024c8f67de08,0x800,0x100000)/File(\EFI\BOOT\BOOTX64.EFI)..BO

I must confess to the fact that I also don't know the details of the EFI boot entries, and I don't know if a secondary multipath device will look the same as the primary device to the UEFI firmware. Hence the idea to test.


[Export of Github issue for rear/rear.]