#2118 Issue closed: After booting from rear iso, it does not show testing server hard disk

Labels: support / question, special hardware or VM, no-issue-activity

samurdhi opened issue at 2019-04-12 19:13:

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • ReaR version ("/usr/sbin/rear -V"):
[root@edda rear]# rear -V
Relax-and-Recover 2.4 / 2018-06-21
[root@edda rear]#
  • OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"):
[root@edda rear]# cat /etc/oracle-release 
Oracle Linux Server release 6.4
[root@edda rear]#
  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
OUTPUT=ISO
OUTPUT_OPTIONS="nfsvers=3,nolock"
OUTPUT_URL=file:///home/rear

BACKUP=NETFS
BACKUP_OPTIONS="nfsvers=3,nolock"
BACKUP_URL=file:///home/rear

REQUIRED_PROGS=( "${REQUIRED_PROGS[@]}" lsblk vim )
BACKUP_PROG_EXCLUDE=( ${BACKUP_PROG_EXCLUDE[@]} '/home/rear/*' '/tmp/*' )
ONLY_INCLUDE_VG=( "vg_edda" )
EXCLUDE_COMPONENTS=( "${EXCLUDE_COMPONENTS[@]}" "fs:/dev/sdb" )
  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PowerVM LPAR):

Intel Physical Server: HP ProLiant DL380 Gen9

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):

x86

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):

UEFI

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):

Local Disk

  • Description of the issue (ideally so that others can reproduce it):

I can successfully run the ReaR backup of this server. The problem happens when I test this backup on another physical test box (on which I have successfully tested lots of ReaR images of other servers).

When I boot this test box with the ReaR image of the server, it fails to recover.
After running the "rear recover" command it fails because it says the target system does not have /dev/sda, but this problem does not occur when I mount the ReaR images of other servers.

[screenshot: pic-01]

When I run the lsblk command to list the local disks on the test system, it shows nothing.

[screenshot: pic-02]

  • Workaround, if any: Not yet

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):

jsmeix commented at 2019-04-15 10:17:

@samurdhi
I have no really good idea what the reason could be
when, on an Intel physical server (HP ProLiant DL380 Gen9),
its local disk is not available to the kernel.

I guess there is something special about HP ProLiant hardware,
but I do not use or have any HP ProLiant system.

I think when there is no device node (like /dev/sda) for the local disk,
that local disk needs a special kernel module to be loaded.

Usually kernel modules for local disks get loaded automatically,
provided the needed kernel modules are there.
But by default the ReaR recovery system does not include all kernel modules,
so special kernel modules could be missing in the recovery system, cf.
https://github.com/rear/rear/issues/1202
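
For example (a sketch, assuming the local disks sit behind an HP Smart Array
controller, whose usual driver is hpsa, or cciss for older controllers -
the driver name for your exact controller may differ), one could check
and try things by hand from the ReaR recovery system shell:

# lsmod | grep -e hpsa -e cciss
# modprobe hpsa
# dmesg | tail -20
# lsblk

If modprobe reports that the module cannot be found,
it was not included in the recovery system.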

In particular, when you use the ReaR recovery system
on replacement hardware that is not 100% compatible
(cf. "Fully compatible replacement hardware is needed"
in https://en.opensuse.org/SDB:Disaster_Recovery),
you usually need to specify in etc/rear/local.conf

MODULES=( 'all_modules' )

see its documentation in usr/share/rear/conf/default.conf

By the way: MODULES=( 'all_modules' )
will be the default in the upcoming ReaR 2.5
cf. https://github.com/rear/rear/issues/2041
and https://github.com/rear/rear/pull/2069
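
To double-check that a particular driver really ended up in the rebuilt
rescue image, one could inspect the initrd inside the ISO (a sketch:
the ISO file name below is made up, and the initrd path and compression
inside the ISO may differ per ReaR version):

# mount -o loop /home/rear/edda/rear-edda.iso /mnt
# zcat /mnt/isolinux/initrd.cgz | cpio -t 2>/dev/null | grep hpsa
# umount /mnt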

samurdhi commented at 2019-04-15 14:39:

@jsmeix

Thanks, jsmeix, for your reply.
I have updated the local.conf file to include the line MODULES=( 'all_modules' ) and then took a new ReaR image. Then I booted my recovery testing server with this image, but no luck.
I am still unable to see any internal disks.

[screenshot: pic-01]

jsmeix commented at 2019-04-24 10:00:

@samurdhi
I am afraid I have neither a further good idea what might help here,
nor can I imagine what the reason could be why there is
no device node for the local disk in the ReaR recovery system.

In your initial description
https://github.com/rear/rear/issues/2118#issue-432707046
you wrote that you have an HP ProLiant DL380 Gen9.
I guess that is where your original system runs.
But you did not write what exactly your other physical test box is.
Is it also an HP ProLiant DL380 Gen9, or something different?
If it is different, what exactly is that other physical test box?

Some shots in the dark:

When you boot another installation system on that physical test box,
e.g. an Oracle Linux Server release 6.4 installation system
or another Linux distribution's installation system (preferably the matching RHEL 6.4,
according to "Release history" in https://en.wikipedia.org/wiki/Oracle_Linux),
is there then a device node for the local disk in those installation systems?

Because you are
successfully testing lots of ReaR images of other servers
on that physical test box:
what Linux systems do you use on those other servers?

Perhaps it helps to enforce loading, in the ReaR recovery system,
of all the kernel modules that are loaded on the original system
where "rear mkbackup/mkrescue" was run, via
MODULES_LOAD=( first_module second_module ... )

You may use a command like

# tac /proc/modules | cut -d ' ' -f1

to get the currently loaded modules in the right order
for MODULES_LOAD, i.e. the first loaded module first.
But loading all kernel modules of the original system
in the ReaR recovery system may totally fail when the
ReaR recovery system hardware is too different from
the original system hardware.
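
For example (a sketch to run on the original system; double-check the
resulting line in etc/rear/local.conf afterwards), that command output
can be turned into a MODULES_LOAD line in one go:

# echo "MODULES_LOAD=( $(tac /proc/modules | cut -d ' ' -f1 | tr '\n' ' ') )" >> /etc/rear/local.conf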

To better see what happens during ReaR recovery system startup,
add the kernel command line option 'debug'.
When you are in the boot menu of the ReaR recovery system,
select the topmost entry

Recover host_name_of_your_original_system

and then press the Tab key to edit the kernel command line,
where you can append debug (separated with a space).
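
For illustration, the edited line could then look like this
(the options before debug are only an example of a typical syslinux entry;
yours will differ - what matters is the appended debug):

initrd=initrd.cgz root=/dev/ram0 vga=normal rw debug
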
This lets the ReaR recovery system startup scripts
run one by one (you need to press Enter to start each one)
where each one is run with 'set -x' so that it prints the commands
and their arguments as they are executed.
Perhaps you can detect something suspicious?

My reasoning behind this is:
In the ReaR recovery system the exact same kernel is run as
on the original system where "rear mkbackup/mkrescue" was run.
With MODULES=( 'all_modules' ) you get the exact same kernel
and all its kernel modules in the ReaR recovery system as on the
original system where "rear mkbackup/mkrescue" was run.
When there is no device node for the local disk in the ReaR recovery system
on another physical test box, it means - as far as I can imagine - that
the kernel with all its kernel modules from the original system
does not work on that other physical test box.
Accordingly, I think the original system and that other physical test box
must somehow differ, so that the kernel with all its kernel modules from the
original system does not work on that other test box.

github-actions commented at 2020-06-27 01:33:

Stale issue message


[Export of Github issue for rear/rear.]