#2748 PR merged: Include dmsetup and dmeventd unconditionally

Labels: bug, cleanup, fixed / solved / done

lzaoral opened issue at 2022-01-25 16:50:

Pull Request Details:
  • Type: Bug Fix

  • Impact: High

  • Reference to related issue (URL): None

  • How was this pull request tested?

Successful recovery of a RHEL 8.5 VM with BIOS without encryption, LVM
or multipath with this patch applied and unsuccessful recovery of the same
machine without it. RHEL 8.5 has os-prober 1.74 and does not ship
grub-mount binary.

  • Brief description of the changes in this pull request:

Changes usr/share/rear/conf/GNU/Linux.conf

Older releases of os-prober (1.74 and below) use dmsetup as a fallback
solution for mounting when grub-mount is missing.

However, dmsetup was included in the rescue image if and only if LVM,
multipath or encryption were detected. Thus, BIOS machines that do
not use these but still have dmsetup present, would block indefinitely
on the "Installing GRUB2 boot loader..." step.

GRUB2 installation is performed in a chroot after the data have already
been recovered. ReaR would call grub-mkconfig which calls os-prober
which then executes dmsetup. However, it would never receive the response
trough host's udev as the rescue system would not have dmsetup present
to send it.

EDIT: reworded the description (thanks @pcahyna for suggestions!)
EDIT2: usr/share/rear/conf/GNU/Linux.conf is now being changed.

pcahyna commented at 2022-01-25 17:04:

Similar problem was reported for another case outside the context of ReaR where grub2-mkconfig is executed in a chroot with udev running outside the chroot: the Debian installer. The result is the same: it hangs forever when executing os-prober. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=853927 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=860833 . The root cause of the problem has never been properly understood and the code in os-prober that executes dmsetup got eventually removed to fix this problem: https://bazaar.launchpad.net/~ubuntu-installer/os-prober/master/revision/419 . Some distributions still have the problematic os-prober though.

pcahyna commented at 2022-01-25 17:09:

In order to show the problem the system must not use any LVM, thus the previous code did not include dmsetup in the rescue system. Note also that the old message "Device mapper found enabled. Including LVM tools." was a bit incorrect: LVM tools got included only if both device-mapper and LVM were found, finding device-mapper enabled was not enough.

jsmeix commented at 2022-01-26 13:24:

In general when programs are useful in any case in the recovery system
shoudn't such programs be better added to PROGS in
the generic usr/share/rear/conf/GNU/Linux.conf ?

jsmeix commented at 2022-01-26 13:29:

Only FYI:
The "Device mapper found enabled. Including LVM tools." message originated from
https://github.com/rear/rear/commit/777133cb9fb944af2681a82f2ff65cf1ea869e98

pcahyna commented at 2022-01-31 09:21:

In general when programs are useful in any case in the recovery system
shoudn't such programs be better added to PROGS in
the generic usr/share/rear/conf/GNU/Linux.conf ?

@jsmeix I agree. This case of dmsetup has nothing to do with LVM, really, although superficially it may appear related. So it belong to another location, and usr/share/rear/conf/GNU/Linux.conf is a good one. (Only a comment could be left here, explaining why dmsetup is not being added here despite being related to LVM, or perhaps the current code could be left as it is - I suppose that there is no problem with adding a tool to PROGS twice?)

lzaoral commented at 2022-01-31 15:47:

@jsmeix @pcahyna I've added dmsetup and dmeventd with explanation for doing so to usr/share/rear/conf/GNU/Linux.conf.

jsmeix commented at 2022-02-01 07:15:

There should be no problem adding something to PROGS or REQUIRED_PROGS several times
because in build/GNU/Linux/390_copy_binaries_libraries.sh
there is a sort -u

local all_binaries=( $( for bin in "${PROGS[@]}" "${REQUIRED_PROGS[@]}" ; do
                            ...
                        done 2>>/dev/$DISPENSABLE_OUTPUT_DEV | sort -u ) )
...
copy_binaries "$ROOTFS_DIR/bin" "${all_binaries[@]}"

and even without sort -u copying something several times should not cause harm.

jsmeix commented at 2022-02-01 07:16:

When there are no objections I would like to merge it today afternoon.

jsmeix commented at 2022-02-02 12:29:

I assume the description is OK now
so I would like to merge it today in about one hour
unless there are objections soon.

jsmeix commented at 2022-02-02 13:54:

@lzaoral @pcahyna
thank you for your deep analysis of the problem and for your fix!

I think I understand "something" but the main point is that one understands
"without it things could block indefinitely at 'Installing GRUB2 boot loader...'"
so better safe than sorry and have sometimes needed things in the recovery system
because after "rear recover" failed it is a bit late to add missing stuff to the recovery system ;-)

pcahyna commented at 2022-02-02 14:02:

so better safe than sorry and have sometimes needed things in the recovery system
because after "rear recover" failed it is a bit late to add missing stuff to the recovery system ;-)

Technically, this is not 100% true. You can copy dmsetup from the restored chroot to the running rescue ramdisk and restart rear recover, so this is one of the less disastrous errors that may happen (there are other ways to add missing stuff to the ramdisk, even a forgotten backup client can be added). But in this case the problem is really mysterious and it took us days to even figure out that dmsetup is what one needs (debian / ubuntu were never able to debug this problem in their installer and they rather requested removal of the problematic code from os-prober entirely), so it is very unlikely that one would find out the workaround.
Also, if you get a heart attack because the recovery of your broken $CRITICAL_SERVER failed, workarounds won't help you, as you are already dead :-)


[Export of Github issue for rear/rear.]