#3176 PR merged: Skip btrfs subvolumes when detecting ESP partitions

Labels: bug, fixed / solved / done

lzaoral opened issue at 2024-03-07 16:40:

Pull Request Details:
  • Type: Bug Fix

  • Impact: Normal

  • Reference to related issue (URL): N/A

  • How was this pull request tested? Recovery of Fedora Rawhide UEFI machine.

  • Description of the changes in this pull request:

The idea is to find all direct partitions that contain the ESP mount point and to skip all other transitive fs: dependencies.

The diskdeps.conf file contains following entries on default Fedora installations (the list was shortened to only the relevant ones):

/dev/vda1 /dev/vda
/dev/vda4 /dev/vda
/dev/vda5 /dev/vda
fs:/boot/efi /dev/vda1
fs:/boot/efi fs:/boot
fs:/boot/efi fs:/
fs:/boot/efi btrfsmountedsubvol:/
fs:/boot /dev/vda4
fs:/boot fs:/
fs:/boot btrfsmountedsubvol:/
fs:/ /dev/vda5
btrfsmountedsubvol:/ /dev/vda5

The ESP partition is only on /dev/vda1. However, the find_partition call
was not taking into account the need to skip mounted btrfs subvolumes as well.
Therefore, /dev/vda5 was listed as an ESP partition as well.

This change makes sure that only direct ESP partitions are listed and
fixes a bug where ReaR would create broken BootXXXX entries which point to
completely unrelated partitions.

Relevant excerpts from logs:

++ efibootmgr --create --gpt --disk /dev/vda --part 1 --write-signature --label 'RedHatEnterpriseServer 41' --loader '\EFI\fedora\grubx64.efi'
...
++ efibootmgr --create --gpt --disk /dev/vda --part 5 --write-signature --label 'RedHatEnterpriseServer 41' --loader '\EFI\fedora\grubx64.efi'

pcahyna commented at 2024-03-08 12:01:

disktodo.conf contains following entries

ITYM diskdeps.conf, disktodo.conf has a different format. Please fix also the commit message.

a bug where ReaR would create broken BootXXXX entries which point to completely unrelated partitions.

Can you please show an example if you have saved it somewhere?

lzaoral commented at 2024-03-08 12:41:

Thank you, @pcahyna! I've fixed the typo as I meant the diskdeps.conf file and amended the commit message with the creation of those BootXXXX entries.

pcahyna commented at 2024-03-19 14:27:

thank you a lot for the fix and for providing an example of the buggy behavior!

jsmeix commented at 2024-03-20 07:40:

Only a side question FYI:

I am wondering why the ESP kernel device node
is determined so indirectly.

Do you know if there is a reason why not
something like 'lsblk' is used to get directly
the partition kernel device node
that matches a mount point?

E.g. on a running system
(my homeoffice workstation):

# lsblk -nrpo PKNAME,KNAME,MOUNTPOINT | grep '/boot/efi'
/dev/nvme0n1 /dev/nvme0n1p1 /boot/efi

I don't use the SUSE btrfs structure
so I don't have mounted btrfs subvolumes.

I assume 'lsblk' cannot be used here because
we are in the recovery system (in 'finalize' stage)
and need the ESP partition of the recreated system
below /mnt/local where the ESP is perhaps not mounted
so we need to determine it from the diskdeps.conf file?

Or why not directly from disklayout.conf?

E.g. on my homeoffice workstation (excerpt):

# Format: part <device> <partition size(bytes)> <partition start(bytes)> <partition type|name> <flags> /dev/<partition>
part /dev/nvme0n1 535822336 1048576 rear-noname boot,esp /dev/nvme0n1p1

I.e. why indirectly via the ESP mount point
than directly the ESP 'part' entry in disklayout.conf?

jsmeix commented at 2024-03-20 07:41:

@lzaoral @pcahyna
don't worry about my questions.
Feel free to merge as you like (I approved it "as is").

lzaoral commented at 2024-03-20 08:59:

@jsmeix I'll add the comment explaining why the exclusions are required.

I.e. why indirectly via the ESP mount point than directly the ESP 'part' entry in disklayout.conf?

AFAIK, it is necessary to create boot entry for every physical device present in the RAID array which contains the ESP:

lzaoral commented at 2024-03-20 11:23:

@pcahyna @jsmeix I've rebased the PR and added the explanation comment.

pcahyna commented at 2024-03-20 13:48:

@jsmeix I looked at the places that determine the ESP filesystem / device and it seems like a mess to me (several different methods are used), but I don't have enough motivation to fix this as the code "mostly works" now (I suspect it might have some problems when multipath is used though) so I am going to stop investigating and merge the code as it is.

jsmeix commented at 2024-03-20 14:57:

@pcahyna
yes,
feel free to merge when it is OK for you.
We can only improve things step by step
as far as possible with reasonable effort.


[Export of Github issue for rear/rear.]