#2153 Issue closed: Missing 'grubenv' after ReaR 2.5 recovery causes boot failure in RHEL 7

Labels: support / question, fixed / solved / done, special hardware or VM

prolly-brent opened issue at 2019-06-03 15:42:

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • ReaR version ("/usr/sbin/rear -V"):
    Relax-and-Recover 2.5 / 2019-05-10

  • OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"):
    OS_VENDOR=RedHatEnterpriseServer
    OS_VERSION=7.6

  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):

USER_INPUT_TIMEOUT=10
OUTPUT=ISO
OUTPUT_URL=cifs://thrnas01/TSSGeneral/Midrange/Linux/REAR/
OUTPUT_OPTIONS="credentials=/etc/rear/cifs,gid=816800513,file_mode=0660,dir_mode=0770"
BACKUP=NETFS
BACKUP_URL=cifs://thrnas01/TSSGeneral/Midrange/Linux/REAR/
BACKUP_OPTIONS="credentials=/etc/rear/cifs,gid=816800513,file_mode=0660,dir_mode=0770"
NETWORKING_PREPARATION_COMMANDS=( 'ip addr add 10.198.13.225/23 dev eno1' 'ip link set dev eno1 up' 'ip route add default via 10.198.12.1' 'return' )
ISO_PREFIX="rear-lnxtst2-2019-06-03"
BACKUP_PROG_ARCHIVE="rear-lnxtst2-2019-06-03-backup"
LOGFILE="$LOG_DIR/rear-lnxtst2-2019-06-03.log"
  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
    BareMetal - HPE ProLiant DL380 Gen9

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    UEFI, GRUB

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    local disk

  • Description of the issue (ideally so that others can reproduce it):
    After successful rear recovery, can't boot the recovered image because the system can't locate the kernel. After further inspection, the following symbolic link (which is in the original image), is missing from the recovered image:

/boot/grub2/grubenv -> ../efi/EFI/redhat/grubenv
  • Workaround, if any:
    After successful recovery, go to the recovered image (before final boot) and create the symbolic link manually.

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
    rear-lnxtst2.log.gz

jsmeix commented at 2019-06-04 07:13:

@prolly-brent
had it worked before with ReaR 2.4?

If yes, I think it is related to my changes in
https://github.com/rear/rear/pull/1843
therein in particular this commit
https://github.com/rear/rear/pull/1843/commits/32098b9add5eec20b28df8e7da26b785316961b8
where its commit message reads (excerpt):

... move restored /boot/grub2/grubenv away by default

that is done via a change of BACKUP_RESTORE_MOVE_AWAY_FILES
in usr/share/rear/conf/default.conf, see the comment in default.conf
for the reason behind why the restored /boot/grub2/grubenv
is moved away by default.

See also in
http://relax-and-recover.org/documentation/release-notes-2-5
the entry

Fixed, simplified, and enhanced GRUB2 installation on x86 and ppc64le architecture (issues #1828, #1845, #1847, #1437)

that lists some related GitHub issues.

To keep the restored /boot/grub2/grubenv
set in your /etc/rear/local.conf file

BACKUP_RESTORE_MOVE_AWAY_FILES=()

I got the information about the usage/meaning of the grubenv file
from our (i.e. SUSE's) GRUB2 maintainer and if that description is right
a missing /boot/grub2/grubenv -> ../efi/EFI/redhat/grubenv symlink
or a missing grubenv file cannot cause a boot failure.

Simply put I think GRUB2 should always be able to boot a system
without any information from a previous boot.

Accordingly I assume the root cause of your boot failure is not
missing information from a previous boot but something else.

But I am not at all a bootloader or GRUB2 expert so I could be wrong
and in some cases the grubenv file might be even mandatory
for GRUB2 to be able to boot a system.

jsmeix commented at 2019-06-04 07:19:

@rmetrich
I guess for you a RHEL7 system recovery with ReaR 2.5 booted o.k.?

jsmeix commented at 2019-06-04 07:25:

@prolly-brent
your rear-lnxtst2.log.gz is only about rear -D mkbackup
but what actually failed for you is rear -D recover.

Have a look at the section
"Debugging issues with Relax-and-Recover" at
https://en.opensuse.org/SDB:Disaster_Recovery
how to get a rear -D recover log file.

prolly-brent commented at 2019-06-04 13:54:

rear-lnxtst2-2019-06-04.log.gz
rear-lnxtst2.log.gz

I updated the HP Proliant BIOS firmware to the latest level and ran the recover again and was able to boot (or better said "able to locate the kernel") after a rear recovery (even without the symlink in place).
Per your request I attached the recover debug logs. I am a bit concerned that this program is removing files that were put in place by the RHEL installation and possible downstream negative effects of that. I will look into using BACKUP_RESTORE_MOVE_AWAY_FILES to prevent this. Thanks for the suggestions and assistance - much appreciated.

jsmeix commented at 2019-06-04 14:09:

@prolly-brent
this program (i.e. ReaR) even dares to modify files like /etc/fstab
and lots of others e.g. to update UUIDs and things like that
and - even more - this program recreates initrd and bootloader files
from scratch anew and all that after the backup was restored - but that
is needed to get a consistent new recreated system without old/outdated
remainders from the original system that do no longer make sense
or may even break the new recreated system.
For details see the finalize scripts in ReaR.
Cf. "Disaster recovery means installation (reinstalling from scratch)" at
https://en.opensuse.org/SDB:Disaster_Recovery

jsmeix commented at 2019-06-05 07:32:

According to
https://github.com/rear/rear/issues/2153#issuecomment-498681818
this issue does no longer happen after a firmware update.


[Export of Github issue for rear/rear.]