#2864 Issue closed: EFI boot stick freezes reproducable on systems other than the originating one

Labels: support / question, special hardware or VM, no-issue-activity

guru4712 opened issue at 2022-09-16 11:33:

  • ReaR version ("/usr/sbin/rear -V"):
    Relax-and-Recover 2.7 / 2022-07-13

  • OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):
    Red Hat Enterprise Linux release 8.5 (Ootpa)

  • ReaR configuration files

"cat /etc/rear/site.conf"

USING_UEFI_BOOTLOADER=1
AUTOSHRINK_DISK_SIZE_LIMIT_PERCENTAGE=70
AUTOINCREASE_DISK_SIZE_THRESHOLD_PERCENTAGE=200
USE_DHCLIENT=NO
USE_STATIC_NETWORKING=YES
USE_RESOLV_CONF=( NO )
PROGS+=( lsusb more )

"cat /etc/rear/local.conf"

OUTPUT=USB
USB_DEVICE_FILESYSTEM_LABEL='REAR-2.7-1'
USB_DEVICE=/dev/disk/by-label/$USB_DEVICE_FILESYSTEM_LABEL
  • Hardware vendor/product (PC or PowerNV BareMetal or ARM) or VM (KVM guest or PowerVM LPAR):
    Server Fujitsu MXbla

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86_64

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    UEFI , GRUB

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    local SSD - RAID1 by controller AVAGO PRaid CP500i

  • Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT"):

done by automatic installation

NAME                            KNAME     PKNAME    TRAN   TYPE FSTYPE      LABEL   SIZE MOUNTPOINT
/dev/sda                        /dev/sda                   disk                   223,1G
|-/dev/sda1                     /dev/sda1 /dev/sda         part vfat                600M /boot/efi
|-/dev/sda2                     /dev/sda2 /dev/sda         part xfs                   1G /boot
`-/dev/sda3                     /dev/sda3 /dev/sda         part LVM2_member       221,5G
  |-/dev/mapper/rhel_c3772-root /dev/dm-0 /dev/sda3        lvm  xfs                  70G /
  |-/dev/mapper/rhel_c3772-swap /dev/dm-1 /dev/sda3        lvm  swap               15,6G [SWAP]
  `-/dev/mapper/rhel_c3772-home /dev/dm-2 /dev/sda3        lvm  xfs               135,9G /home
  • Description of the issue (ideally so that others can reproduce it):

Prepare USB bootstick

rear format  -- --efi /dev/sdX
rear mkrescue

Bootstick works on the server it was created on. But ONLY THERE!

There are 2 other identical servers.
When trying to boot either,

- GRUB starts;
- choose "unsecure boot"
- system freezes after showing
    Loading kernel /EFI/BOOT/kernel ...
    Loading initial ramdisk /EFI/BOOT/initrd.cgz
  • Workaround, if any:

Stick to ReaR Version 2.6-4.
Using that version, everything runs smoothly.
So imho there exists no hardware-oriented problem concerning
the other servers that are failing with version 2.7-1!

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
    debug log from MKRESCUE
    rear-c3772 Issue V2.7.log

jsmeix commented at 2022-09-16 12:40:

@guru4712
could you describbe in more detail what you mean with

There are 2 other identical servers.
When trying to boot either,

- GRUB starts;
- choose "unsecure boot"
- system freezes after showing
    Loading kernel /EFI/BOOT/kernel ...
    Loading initial ramdisk /EFI/BOOT/initrd.cgz

What kind of UEFI booting do you need on
those 2 other identical servers?
With or without Secure Boot?

Only an offhanded bind shot in the dark
(because I am not at all an expert in UEFI booting):
In
https://github.com/rear/rear/files/9583585/rear-c3772.Issue.V2.7.log
I spotted

2022-09-14 13:16:32.851869822 Including output/USB/Linux-i386/100_create_efiboot.sh
2022-09-14 13:16:32.855324765 Configuring EFI partition '/dev/disk/by-label/REAR-EFI' for EFI boot with 'grubx64.efi'
'/boot/efi/EFI/redhat/grubx64.efi' -> '/var/tmp/rear-efi.atBlRkkxSa//EFI/BOOT/BOOTX64.efi'
'/boot/vmlinuz-4.18.0-348.23.1.el8_5.x86_64' -> '/var/tmp/rear-efi.atBlRkkxSa//EFI/BOOT/kernel'
'/var/tmp/rear.CNfIul43hPLvpyK/tmp/initrd.cgz' -> '/var/tmp/rear-efi.atBlRkkxSa//EFI/BOOT/initrd.cgz'
/usr/share/rear/lib/_input-output-functions.sh: line 525: type: grub-install: not found
2022-09-14 13:16:34.185773408 Configuring GRUB2 for EFI boot
2022-09-14 13:16:34.188102495 Configuring GRUB2 kernel /EFI/BOOT/kernel
2022-09-14 13:16:34.190179074 Configuring GRUB2 initrd /EFI/BOOT/initrd.cgz
2022-09-14 13:16:34.192298262 Configuring GRUB2 root device as 'search --no-floppy --set=root --label REAR-EFI'
.
.
.
2022-09-14 13:18:10.681445260 Including output/USB/Linux-i386/850_make_USB_bootable.sh
2022-09-14 13:18:10.689368102 Making /dev/sdb bootable with syslinux/extlinux
/var/tmp/rear.CNfIul43hPLvpyK/outputfs/boot/syslinux is device /dev/sdb2
2022-09-14 13:19:47.491048043 Writing syslinux MBR /usr/share/syslinux/gptmbr.bin of type gpt to /dev/sdb

To me this looks somewhat contradicting because
it seems two bootloaders get installed (GRUB2 and SYSLINUX)
and witing a MBR on a GPT disk seems to be at least useless.

jsmeix commented at 2022-09-16 12:44:

I see right now that it had worked this way for me,
see https://github.com/rear/rear/pull/2829 therein
https://github.com/rear/rear/pull/2829#issuecomment-1166925033
but I do not use Secure Boot.

jsmeix commented at 2022-09-16 12:52:

@guru4712
I suggest you may read through the whole
https://github.com/rear/rear/pull/2829
because that lists my BIOS and
UEFI (without Secure Boot) dual boot experience.
Perhaps you notice some difference between how things
behaved for me versus how it behaves for you
which may help us to find the root cause
why it does not work in your specific case.

guru4712 commented at 2022-09-19 07:21:

Let me first clarify the situation: there are 3 identical servers. Only EFI boot is possible.
On one of them, the rescue system is built and then tested for restore purposes on the others.
Astonishing enough, the resuce stick boots only on the particular server the stick was just built on, not on the others.

#2829 brougth no help: I am not dealing with dual boot situation, since the hardware needs EFI.

After adding "nomodes" to the kernel line in grub.cfg the server simply gets stuck.

After adding "vga=normal" to the kernel line in grub.cfg the server changes console output: already shown lines widen (see picture) and server gets stuck. This is identical to the behavior without tweaks.CaptureScreen during boot
STUCK

guru4712 commented at 2022-09-29 08:53:

Refined investigation. Upgraded to RHEL 8.6, because former versions contain a nasty bug:
when booting directly into graphical mode, one cannot enter rescue mode - console freezes.
If there was an working SSH-connection, it can be used to change run level again (and so freeing console).
Workaround: boot into multiuser-mode. From there any other mode is reachable.
RH admitted a bug in GDM but refused to backport since 8.5 is out of support.

Made EFI-rescue-sticks on 2 beforementioned servers called C3770 an C3772.

  • Each server can boot from the rescue stick it produced itself;
  • trying to boot on the other server fails showing the picture shown in comment above;
  • both rescue sticks boot successfully on an old PC (Compaq Pro 6305 SFF).

c3770-mkrescue.log
c3772-mkrescue.log

It is most disappointing that booting fails on hardware-identical twin but works on completely different metal.
And again, Version 2.6-4 is fine!

Any idea for further investigation?

DEvil0000 commented at 2022-10-19 14:06:

I remember seeing some HP (maybe back at gen8 or even older) issues similar to this.
If I remember correctly it was some combination of firmware bugs and EFI or GPT.
you may try:

  • firmware updates
  • legacy bios boot instead of efi
  • msdos partition table instead of gpt
  • msdos table + legacy boot

beside that your kernel version or bootloader version may be involved.
so you may try:

  • a different kernel version
  • a different bootloader (grub?) version
  • a non grub bootloader (syslinux) - or grub if you used syslinux before

also some firmware may expect certain partition positions, sizes or filesystem types (which would be a bug there as well).

guru4712 commented at 2022-10-28 06:15:

As stated before, EFI is mandatory.

  • Updated OS to RHEL 8.6 with patches as from 2022-10-27 (kernel-4.18.0-372.32.1.el8_6.x86_64);
  • updated firmware to V5.0.0.13 - R.1.22.0 as from 2022-08-25 (latest).

No improvement at all !

I do not understand how to change the bootloader used by REAR.
By what means? Any variables?

DEvil0000 commented at 2022-10-28 09:55:

I think I have reproduced this by accident when testing/finding #2884 .
extlinux was showing the two lines for loading kernel/initrd but looked like it was not booting. Not sure if the output was just not shown on the console or if it was actually hanging. I also kind of ignored it since I tried extlinux with vfat boot partition - so modified rear code.
Since the grub bootloader variation is also not working with efi at the moment (see #2883 ) this is a bit of a problem.

you should be able to manually fix the boot device (if it is a disk or usb device) by installing for example grub manually.
In case of a iso or such that is not so easy.

guru4712 commented at 2022-10-28 10:23:

I would like to defend GRUB, since same machines with identical OS work fine using ReaR 2.6-4 !
So what happened to ReaR in transition to version 2.7 ?

DEvil0000 commented at 2022-10-28 11:25:

I think the newer grub versions are more picky also a lot of rear code changed of course and new bugs may have been introduced.
I am however not sure yet if this issue is acutually due to that or something else.

github-actions commented at 2022-12-28 02:25:

Stale issue message


[Export of Github issue for rear/rear.]