#3324 Issue closed: Creating BMR ISO for Rubrik Recovery

Labels: support / question, fixed / solved / done, external tool, special hardware or VM

RLDuckworth opened issue at 2024-09-30 18:42:

  • ReaR version ("/usr/sbin/rear -V"): Relax-and-Recover 2.7 / Git

  • If your ReaR version is not the current version, explain why you can't upgrade: N/A

  • OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"): RHEL7.9, planning on In Place upgrade to RHEL8

  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):

OUTPUT=ISO
BACKUP=CDM
UEFI_BOOTLOADER=/boot/efi/EFI/redhat/grubx64.efi
SECURE_BOOT_BOOTLOADER=/boot/efi/EFI/redhat/shimx64.efi
  • Hardware vendor/product (PC or PowerNV BareMetal or ARM) or VM (KVM guest or PowerVM LPAR): Lenovo SR630

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device): x86_64

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot): UEFI / GRUB

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe): SSD

  • Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT"):

-sh-4.2# lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT
NAME                              KNAME     PKNAME    TRAN   TYPE FSTYPE      LABEL         SIZE MOUNTPOINT
/dev/sda                          /dev/sda            sata   disk                         119.2G
|-/dev/sda1                       /dev/sda1 /dev/sda         part vfat                     1015M /boot/efi
|-/dev/sda2                       /dev/sda2 /dev/sda         part xfs                         1G /boot
`-/dev/sda3                       /dev/sda3 /dev/sda         part LVM2_member             117.2G
  |-/dev/mapper/vg0-root          /dev/dm-0 /dev/sda3        lvm  xfs                        20G /
  |-/dev/mapper/vg0-swap          /dev/dm-1 /dev/sda3        lvm  swap                        8G [SWAP]
  |-/dev/mapper/vg0-usr           /dev/dm-2 /dev/sda3        lvm  xfs                         5G /usr
  |-/dev/mapper/vg0-home          /dev/dm-3 /dev/sda3        lvm  xfs                         5G /home
  |-/dev/mapper/vg0-var           /dev/dm-4 /dev/sda3        lvm  xfs                        20G /var
  |-/dev/mapper/vg0-tmp           /dev/dm-5 /dev/sda3        lvm  xfs                        10G /tmp
  |-/dev/mapper/vg0-var_log       /dev/dm-6 /dev/sda3        lvm  xfs                         5G /var/log
  |-/dev/mapper/vg0-var_log_audit /dev/dm-7 /dev/sda3        lvm  xfs                         5G /var/log/audit
  |-/dev/mapper/vg0-var_tmp       /dev/dm-8 /dev/sda3        lvm  xfs                         5G /var/tmp
  `-/dev/mapper/vg0-opt           /dev/dm-9 /dev/sda3        lvm  xfs                         5G /opt
/dev/sdb                          /dev/sdb            usb    disk                          14.9G
`-/dev/sdb1                       /dev/sdb1 /dev/sdb         part vfat        RHEL-7_9 SE  14.9G
  • Description of the issue (ideally so that others can reproduce it):

This is my first time working with ReaR. I'm working with a "test" system and running through the "rear -v mkrescue" and it appears the iso is being written but I'm getting three errors or "warning" messages. It looks like It'd be safe to delete the symlinks in message 1, but I'd like to verify that it's safe to delete the symlink for message 2, or does it need to link to a valid grub.cfg file? Does the warning in message 3 need to be or can it be corrected?

Message 1) Failed to copy all contents of /lib/modules/3.10.0-1160.119.1.el7.x86_64 (dangling symlinks could be a reason)
Since there the symbolic link point to non-existent files (there are 4 of them).

Message 2) Broken symlink '/etc/grub2.cfg' in recovery system because 'readlink' cannot determine its link target
lrwxrwxrwx. 1 root root 22 Sep 26 08:37 /etc/grub2.cfg -> ../boot/grub2/grub.cfg
The target link does not exist, the only thing in /etc/grub2 is grubenv.

Message 3) Did not find /boot/grub2/locale files (minor issue for UEFI ISO boot)

  • Workaround, if any:
    Have not tried the recovery yet.

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):

You can drag-drop log files into this editor to create an attachment
or paste verbatim text like command output or file content
by including it between a leading and a closing line of
three backticks like this:

[root@rhel7-8-test rear]# !239
rear -v mkrescue
Relax-and-Recover 2.7 / Git
Running rear mkrescue (PID 99452 date 2024-09-30 09:16:48)
Using log file: /var/log/rear/rear-rhel7-8-test.log
Running workflow mkrescue on the normal/original system
Using UEFI Boot Loader for Linux (USING_UEFI_BOOTLOADER=1)
Using autodetected kernel '/boot/vmlinuz-3.10.0-1160.119.1.el7.x86_64' as kernel in the recovery system
Creating disk layout
Overwriting existing disk layout file /var/lib/rear/layout/disklayout.conf
Disabling component 'disk /dev/sdb' in /var/lib/rear/layout/disklayout.conf
Disabling component 'part /dev/sdb' in /var/lib/rear/layout/disklayout.conf
Verifying that the entries in /var/lib/rear/layout/disklayout.conf are correct
Created disk layout (check the results in /var/lib/rear/layout/disklayout.conf)
Creating recovery system root filesystem skeleton layout
Skipping 'lo': not bound to any physical interface.
To log into the recovery system via ssh set up /root/.ssh/authorized_keys or specify SSH_ROOT_PASSWORD
Using '/boot/efi/EFI/redhat/shimx64.efi' as UEFI Secure Boot bootloader file
Copying logfile /var/log/rear/rear-rhel7-8-test.log into initramfs as '/tmp/rear-rhel7-8-test-partial-2024-09-30T09:17:00-05                               00.log'
Copying files and directories
Copying binaries and libraries
Copying all kernel modules in /lib/modules/3.10.0-1160.119.1.el7.x86_64 (MODULES contains 'all_modules')

Failed to copy all contents of /lib/modules/3.10.0-1160.119.1.el7.x86_64 (dangling symlinks could be a reason)

Copying all files in /lib*/firmware/

Broken symlink '/etc/grub2.cfg' in recovery system because 'readlink' cannot determine its link target

Testing that the ReaR recovery system in '/var/tmp/rear.rmxc1GBgSiWQa1a/rootfs' contains a usable system
Creating recovery/rescue system initramfs/initrd initrd.cgz with gzip default compression
Created initrd.cgz with gzip default compression (447 MiB) in 41 seconds

Did not find /boot/grub2/locale files (minor issue for UEFI ISO boot)

Making ISO image
Wrote ISO image: /var/lib/rear/output/rear-rhel7-8-test.iso (557M)
Exiting rear mkrescue (PID 99452) and its descendant processes ...
Running exit tasks
[root@rhel7-8-test rear]#

Thanks very much!
rear-rhel7-8-test.log

gdha commented at 2024-10-03 09:44:

@RLDuckworth Did you already try to boot from the ISO image as a test? I think it should work.

RLDuckworth commented at 2024-10-03 13:57:

Good morning,

I was able to boot the ISO image, select Relax-and-Recover (UEFI and Secure Boot) and it looks like it’s ready to run rear recover. Before I had been selecting the Relax-and-Recover (BIOS or UEFI without Secure Boot) and it would not boot.
Should I be able to boot either one?

I will go ahead and run a rear recover and see what happens. This is all trial and error right now. Is there a way to determine exactly what backup will be restored from Rubrik?

We’re making progress. Thanks very much!

Randall L. Duckworth III
Linux Systems Administrator III | Information Technology
O (940) 321-7800 x.7502
7701 S Stemmons, Corinth, TX 76210

gdha commented at 2024-10-03 14:23:

Good morning, I was able to boot the ISO image, select Relax-and-Recover (UEFI and Secure Boot) and it looks like it’s ready to run rear recover. Before I had been selecting the Relax-and-Recover (BIOS or UEFI without Secure Boot) and it would not boot. Should I be able to boot either one? I will go ahead and run a rear recover and see what happens. This is all trial and error right now. Is there a way to determine exactly what backup will be restored from Rubrik?

Rubrik is an external backup program where I do not have any knowledge of - sorry.

RLDuckworth commented at 2024-10-03 15:14:

But should we be able to boot from either menu item - Relax-and-Recover (UEFI and Secure Boot) or Relax-and-Recover (BIOS or UEFI without Secure Boot)? I noticed in the ISO creation output, it said Using '/boot/efi/EFI/redhat/shimx64.efi' as UEFI Secure Boot bootloader file but didn’t say anything about without Secure Boot bootloader file.

Thanks,

Randall L. Duckworth III
Linux Systems Administrator III | Information Technology
O (940) 321-7800 x.7502
7701 S Stemmons, Corinth, TX 76210

gdha commented at 2024-10-04 06:48:

UEFI_BOOTLOADER=/boot/efi/EFI/redhat/grubx64.efi
SECURE_BOOT_BOOTLOADER=/boot/efi/EFI/redhat/shimx64.efi

You configured it to use Relax-and-Recover (UEFI and Secure Boot), therefore, that's the way to boot from then.

However, you have a point as well and the ReaR team is aware of it and it is on our to-do list for improvements. The ISO booting architecture is 20 years old and need a proper rewrite! However, as always there is a but, lack of sponsoring, which forces us to lower its priority.

End users always forget that Open Source contributors need food as well and that is not free. This is a general remark and is not meant for you.

jsmeix commented at 2024-10-07 14:45:

@RLDuckworth
regarding

But should we be able to boot from either menu item
Relax-and-Recover (UEFI and Secure Boot) or
Relax-and-Recover (BIOS or UEFI without Secure Boot)?

Both items are there because what works on your specific
replacement hardware (physical or virtual "hardware")
depends on your specific replacement hardware
which could be a bit different compared to your
original hardware so try out what works for you.
Both may work or only one or even none may work.
As far as I can imagine what is used to boot
does not change what later "rear recover" does.

FYI:

My "boilerplate" regarding third-party backup tools:

In general regarding issues with third-party backup tools:

Usually we at ReaR upstream do not have or use
third-party backup tools (in particular not if
a third-party backup tool is proprietary software)
so usually we cannot reproduce issues with
third-party backup tools.

In case of issues with third-party backup tools and ReaR
we at ReaR upstream can usually do nothing but totally
depend on contributions and help from those specific users
who use and know about each specific third-party backup tool.

RLDuckworth commented at 2024-10-10 22:56:

Hi jsmeix,

I really do appreciate your assistance and what you guys do as I continue learning about this product!

What I'm working with -
I have two Lenovo SR630's so my hardware is exactly the same. We'll call them server A (a test server), and Server B (a production server). I'm fixing to do an in place upgrade from RHEL7 to RHEL8 on server B. My goal is to be able to create the ReaR ISO on server B, boot it on server A, and recover the system.

On server A, I'm able to create the ISO, boot the ISO, and recover the server. All is good.

On server B, I create the ISO and it looks like everything is good and I should be able to boot it on server A but it will not boot using the UEFI Secure Boot option (or any option).

I get this error

No filesystem could mount root, tried:

Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

I think it's probably a simple issue but I don't understand what it is just yet. The hardware is the same. The only difference is the firmware on server B is older than the firmware on server A. Do you think this could be the part of the issue?
rear-.log

I'm attaching the log file for review / reference.

Thanks very much,
Randall

jsmeix commented at 2024-10-11 08:16:

@RLDuckworth
I know nothing about Lenovo SR630 server hardware
and/or its firmware so I cannot really help
when the issue depends on that specific
hardware and/or its firmware.

Did you try out (only as a test) if it perhaps works
when you disable Secure Boot within the UEFI firmware
on your test server?
When it works without Secure Boot you know at least
that "something related to Secure Boot" is the reason.

When it works to boot the ReaR recovery system
without Secure Boot on the replacement hardware,
I think it should work to recreate the system
with "rear recover" and then shut down the
ReaR recovery system, re-enable Secure Boot
within the UEFI firmware of the replacement hardware
and reboot the recreated system with Secure Boot enabled.

If reboot the recreated system with Secure Boot
enabled fails, it should (hopefully) at least work,
to "just reboot" after "rear recover" so that the
recreated system boots with Secure Boot disabled,
then in the booted recreated system reinstall its
bootloader (which hopefully does "the right thing"
to reinstall its bootloader so that booting will also
work with Secure Boot enabled), then shut down the system,
re-enable Secure Boot within the UEFI firmware of the
replacement hardware and reboot the recreated system
with Secure Boot enabled.
If that also fails I am at my wit's end.
But I am not a booting expert in general
and even less a UEFI Secure Boot expert.
Perhaps asking Lenovo or Red Hat for support is then
the only feasible way out of "UEFI Secure Boot hell"?

For the (not so) fun of it:
Recently I installed for a friend Linux beside
the existing Windows on his Lenovo ThinkPad laptop and
its UEFI firmware (together with Windows) annoyed me
a lot, but finally - after a terrible fight - I won ;-)
What I hate with basically all firmware and bootloaders
that I ever had to do with is that firmware and bootloaders
do not verbosely tell what is going on when they search
and load a program and start it e.g. when the firmware
searches and loads a bootloader like shim or GRUB and
starts it and then what shim is doing when it searches
and loads a second stage bootloader like GRUB and then
what GRUB is doing when it searches and loads initrd
plus kernel and finally starts the kernel which is the
first program that verbosely tells what is going on
but when the kernel fails to start up the root cause is
usually that one of the steps before had silently failed.
Either it (somehow) boots or it silently fails somewhere
without meaningful error message and then one is left alone
to imagine where and what might have gone wrong and
to also imagine what the root cause might be :-(

jsmeix commented at 2024-10-14 13:10:

@RLDuckworth
above I wrote that I would be at my wit's end
if things also fail with Secure Boot disabled
but - as always - sleeping on an issue helps
so I remembered my
"Launching the ReaR recovery system via kexec"
section in
https://en.opensuse.org/SDB:Disaster_Recovery

It describes a generic way how one could boot
the ReaR recovery system without the need for
a proper setup of bootloader stuff (EFI programs
and first/second stage bootloaders setup)
within the ReaR recovery system because for
"Launching the ReaR recovery system via kexec"
only the ReaR recovery system kernel and its
matching ReaR recovery system initrd is needed.

RLDuckworth commented at 2024-10-17 14:05:

I was able to get past the boot issue. I had opened a support case with Red Hat and the support engineer led me to a KB article that
https://access.redhat.com/solutions/7010813

Excerpt of the KB -
Because of the hardware limitation, you must make sure that the initrd file is as small as possible, to get a chance to be loaded properly. For this to happen, ReaR must be tuned to limit the number of files embedded in the initrd._

The fix in my case was the MODULES=() stanza added to the local.conf and rerun mkrescue -d -v. I was then able to boot the ISO. Now I'm on to attempting the recovery of Server A to Server B which is new hardware.

Thanks very much for your assistance!

jsmeix commented at 2024-10-17 15:27:

@RLDuckworth
I am confused because in
https://github.com/rear/rear/issues/3324#issue-2557402429
you wrote

System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
x86_64

which matches what I see at the Lenovo site about their
Lenovo SR630 systems that the CPU is Intel Xeon.

In contrast
https://access.redhat.com/solutions/7010813
is about POWER architecture (PPC64LE)
where it is known that there are rather low limits
of the maximum size of the initrd, cf.
https://github.com/rear/rear/issues/3189

But - at least as far as I experienced up to now - there
are no such limits of the maximum size of the initrd
on x86 compatible architectures like x86_64.

RLDuckworth commented at 2024-10-17 15:34:

Yes, my architecture is x86_64 but the Red Hat engineer suspected
that it could be the same issue as what was described in
https://access.redhat.com/solutions/7010813.

jsmeix commented at 2024-10-18 07:42:

Ah!
That's interesting.
It means that also on x86_64 a ReaR recovery system initrd
can become so big in practice that it hits certain limits
how big an initrd can be on x86_64.

Your initial posting shows

Created initrd.cgz with gzip default compression (447 MiB) ...

which doesn't look overly huge - in contrast to
https://github.com/rear/rear/issues/2386
where the initrd became falsely far too big (more than 3GB).

By quick "googling" for initrd maximum size x86 I found
https://lists.nongnu.org/archive/html/help-grub/2015-02/msg00007.html
(dated Feb 2015) which reads (excerpt)

We have a special application (GNU/Linux 3.4.47, x86_64, BIOS,
PXE, OpenSuSE  grub2-2.00-1.6.1) that requires large initrds.
We are approaching the 462M size limit that appears to exist
on our architecture.

and by "googling" for GRUB_LINUX_INITRD_MAX_ADDRESS I found
https://lists.gnu.org/archive/html/grub-devel/2019-07/msg00068.html
(dated Jul 2019) which reads (excerpt)

So 32 bit arm requires kernel + initrd less than 512MB,
64 bit arm requires kernel + inirrd [sic] less than 32GB.
If I read the code correctly, booting x86 in efi mode
the max initrd file allocation is 1GB (0x3fffffff).
Seems the code is shared for 32 and 64 bit EFI and
doesn't allow anything more on 64 bit than 32 bit.

I didn't research what the maximum size of the initrd
is nowadays but it shows that there are some limits
at least in GRUB is one.
I don't know what "max initrd file allocation is 1GB"
means for the actual maximum size of the initrd
(in particular because 447MiB is much less than 1GB).

RLDuckworth commented at 2024-10-30 13:44:

Hi Johannes,

I've been able to create the ISO, boot the ISO, and do a simple restore from Rubrik so I'm going to call it good from here. I do need time to "absorb" all the material you sent and or mentioned. I mainly wanted to say thanks for your assistance and for all you guys do.

Thank you so much!
Randall

jsmeix commented at 2024-10-30 14:51:

You are welcome!
Thank you for your positive feedback.


[Export of Github issue for rear/rear.]