#1912 Issue closed: RHEL 6.9 and REAR 1.17.2 - restore problem

Labels: waiting for info, support / question

will-code-for-pizza opened issue at 2018-09-18 16:13:

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • ReaR version ("/usr/sbin/rear -V"):
    1.17.2

  • OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"):
    rhel 6.9 amd64

  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
    site.conf

BACKUP=NETFS
OUTPUT=ISO
RETENTION_TIME="Week"
USE_CFG2HTML=y
BACKUP_PROG=tar
BACKUP_PROG_COMPRESS_OPTIONS="--gzip"
LOGFILE="/var/log/rear/$HOSTNAME.log"
BACKUP_PROG_EXCLUDE=( '/cache/*' '/tmp/*' '/dev/shm/*' '/appl/log/*' '/appl/glide/nodes/*' '/appl/glide/temp/*' '/appl/glide/logs/*' '/var/log/*' '/var/cache/*' '/var/tmp/*' $VAR_DIR/output/\* )

local.conf

NETFS_URL=nfs://139.1.xxx.xxx/backup/REAR
NETFS_KEEP_OLD_BACKUP_COPY=yes
  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
    HP Proliant DL 360/380

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86 compatible

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    System ROM P89 v2.60 (05/21/2018)

  • Storage (lokal disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    local disks. Backup-Server provides nfs v3 share (jus nfs server)

  • Description of the issue (ideally so that others can reproduce it):
    after creating restore-ISO file with
    /usr/sbin/rear -v -c /etc/rear/server2backup mkrescue > /var/log/rear/rear_rescue.log
    and creating backup with
    /usr/sbin/rear -v -c /etc/rear/server2backup mkbackuponly > /var/log/rear/rear_backup.log

we want to restore server by booting with rescue-ISO image.

Choosing "Automatic recover" and the following kernel parameters:

kernel initrd=initrd.cgz root=/dev/ram0 vga=normal rw selinux=0 console=ttyS0,9600 console=ttyS1,9600 console=tty0 auto_recover unattended ip=139.1.xxx.xxx nm=255.255.255.192 netdev=eth0 gw=139.1.xxx.xxx

the restore process stops in
rear>

"Please start the restore process on your backup host."

But this host is just an NFS server.

What to do next ?

Regards and thanks in advance.
Will.

  • Work-around, if any:
    Not yet.

jsmeix commented at 2018-09-19 07:59:

ReaR version 1.17.2 is rather old - from August 2015, cf.
http://relax-and-recover.org/documentation/release-notes-2-4

Try out if a more current version works better for you, cf.
http://relax-and-recover.org/download/

See
http://relax-and-recover.org/documentation/release-notes-2-4
what has changed from ReaR 1.17.2 to the current ReaR 2.4.
There are some possibly backward incompatible changes.

I would also recommend to try out our current
ReaR upstream GitHub master code as follows:

Basically "git clone" it into a separated directory and then
configure and run ReaR from within that directory like:

# git clone https://github.com/rear/rear.git

# mv rear rear.github.master

# cd rear.github.master

# vi etc/rear/local.conf

# usr/sbin/rear -D mkbackup

Note the relative paths "etc/rear/" and "usr/sbin/".

I recommend to try out our current ReaR upstream GitHub master code
because that is the only place where we fix bugs - i.e. bugs in older
ReaR versions are not fixed by us (i.e. by ReaR upstream).

I don't know if your issue is a bug in ReaR, a missing feature in ReaR,
or if it is caused by inappropriate configuration settings by you.

If your issue is a bug or a missing feature in ReaR:

Bugs in older ReaR versions are not fixed by us (i.e. by ReaR upstream).
Missing features in older ReaR versions are not backported by us.

Bugs in older ReaR versions that got fixed in current ReaR upstream
GitHub master code might be fixed (provided the fix can be backported
with reasonable effort) by the Linux distributor wherefrom you got your
older ReaR version.

Missing features in older ReaR versions are usually not backported
by the Linux distributor wherefrom you got your older ReaR version.

If the issue also happens with current ReaR upstream GitHub master code
please provide us a complete ReaR debug log file of "rear -D recover"
plus your initial/original disklayout.conf file from your original system
so that we can have a look how it behaves in your particular environment
cf. "Debugging issues with Relax-and-Recover" at
https://en.opensuse.org/SDB:Disaster_Recovery

If it works with current ReaR upstream GitHub master code
we would really appreciate an explicit positive feedback.

gdha commented at 2018-09-19 08:45:

@will-code-for-pizza if you have a support contract with RedHat you may log a case with them as they still support 1.17.2 for RHEL 6.9

will-code-for-pizza commented at 2018-09-20 13:53:

Thanks for reply. I have opened a ticket at RedHat. For me, the issue is the old 2.6.xx kernel, which is not able to re-read the partition layout after parted.

Creating partitions for disk /dev/sda (msdos)
An error occurred during layout recreation. 
1) View Relax-and-Recover log 
2) View original disk space usage 
3) Go to Relax-and-Recover shell 
4) Edit restore script (diskrestore.sh) 
5) Continue restore script 
6) Abort Relax-and-Recover #? 

6 

ERROR: There was an error restoring the system layout. 
See /var/log/rear/rear-biesv007005.log for details. 
Aborting due to an error, check /var/log/rear/rear-hostname.log for details 
You should also rm -Rf /tmp/rear.K24Hi7OQTdcQQOl 
Terminated 

RESCUE hostname:~ # partprobe 
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy). As a result, it may not reflect all of your changes until after reboot.

RESCUE hostname:~ # uname -a
Linux hostname 2.6.32-696.23.1.el6.x86_64 # 1 SMP Sat Feb 10 11:10:31 EST 2018 x86_64 x86_64 x86_64 GNU/Linux

I guess that upgrading to RHEL 6.10 does not change anything (kernel-2.6.32-754.3.5.el6.x86_64.rpm)

will-code-for-pizza commented at 2018-09-20 15:46:

Trying ReaR 2.4-git does not improve anything.

https://pastebin.com/VjPfvRHU (line 375)

schlomo commented at 2018-09-20 16:01:

I remember that re-reading the partition table was working in 2015 and even in 2008 and before. So maybe there is another reason for your problem? Can you try sfdisk -R instead? Or blockdev --rereadpt (not sure if available on your OS). Maybe fuser -avm /dev/sd* will show you the culprit?

As a workaround, maybe the following works: restart the system and try again and then choose to continue the restore script.

will-code-for-pizza commented at 2018-09-20 16:32:

Hi Schlomo...

Thanks for the quick reply.
I´ve tried your commands in ReaR 2.4-git

rear> sfdisk -R /dev/sda
BLKRRPART: Device or resource busy
This disk is currently in use.

rear> blockdev --rereadpt /dev/sda
BLKRRPART: Device or resource busy

rear> fuser -avm /dev/sd*
bash: fuser: command not found   ( --> not integrated in the rear-iso)

Rebooting and re-initiating the recovery process didn´t solve the problem. ReaR created the disk layout again and the kernel tried to re-read the partitions... and failed.

Regards.

schlomo commented at 2018-09-20 16:41:

I meant to chose 5 in the error menu where you chose 6

Trying to narrow it down further: Does this happen on all your RHEL 6 systems?

jsmeix commented at 2018-09-21 09:37:

@will-code-for-pizza
as far as I see special issues like your
https://github.com/rear/rear/issues/1912#issuecomment-423191706
can be in general solved by editing the diskrestore.sh script
as you need it in your particular case.

The diskrestore.sh script is run with set -e which lets it
exit immediately if a command exits with a non-zero status.

I think in your particular case it might work when you simply
disable the partprobe calls or change them to something like

... partprobe ... || true

to get a command that exits always with zero exit code.

Background information how "rear recover" recreates the disk layout:
According to the disk layout information in the disklayout.conf file
the diskrestore.sh script is generated. The diskrestore.sh script
contains all commands that are needed to recreate the disk layout.
Accordingly by editing the diskrestore.sh script you can change
anything as you need it to make recreating the disk layout work
in your particular case.

In general when you know in advance about an issue
I would recommend to edit the diskrestore.sh script
before it is run and not after it was run partially and failed
because when it was run partially you have a partially
set up disk which means a second run of the (now edited)
diskrestore.sh script may fail early because of whatever
conflicts with stuff that was already partially set up.

To be able to edit the diskrestore.sh script before it is run
you need to enforce the so called "migration mode"
where manual disk layout configuration happens
via several user dialogs where in particular one of
those user dialogs lets you edit the diskrestore.sh script.

To be able to enforce the "migration mode"
you need a current ReaR version, cf.
http://relax-and-recover.org/documentation/release-notes-2-4
about the changes related to "migration mode".

See usr/share/rear/conf/default.conf about how to
enforce the "migration mode":
In the running ReaR recovery system call

export MIGRATION_MODE='true'

before you run rear -D recover.

jsmeix commented at 2018-09-21 12:34:

@will-code-for-pizza
your
https://pastebin.com/VjPfvRHU
for ReaR 2.4-git shows that diskrestore.sh fails with

+++ parted -s /dev/sda mklabel msdos
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy).  

so that now it is parted itself that fails which does not look really promising.
Nevertheless you may try out if things work when you ignore that warning via

parted ... || true

For the (not really) fun of it you may have a look at
https://github.com/rear/rear/issues/793
and other issues linked therein, in particular
https://github.com/rear/rear/issues/791

jsmeix commented at 2018-09-21 13:08:

@will-code-for-pizza
in your
https://pastebin.com/VjPfvRHU
it seems you have LVM volumes or things like that.

Perhaps you get that
kernel failed ... /dev/sda (Device or resource busy)
because you are hit by things as described in
https://github.com/rear/rear/issues/799
or in the other issues linked therein.

Does it perhaps help when you manually wipe the disk
like I described in https://github.com/rear/rear/issues/799
in the running ReaR recovery system before you run rear recover
i.e. something like

wipefs -a -f /dev/sda9
wipefs -a -f /dev/sda8
...
wipefs -a -f /dev/sda1
wipefs -a -f /dev/sda

(which wipefs options work depend on your particular wipefs version).

will-code-for-pizza commented at 2018-09-21 17:37:

Many thanks for these great hints and tips. I will try them all and will reply soon.

Regards.

will-code-for-pizza commented at 2018-09-25 07:54:

I meant to chose 5 in the error menu where you chose 6
Trying to narrow it down further: Does this happen on all your RHEL 6 systems?

Yes, it does. All systems are RHEL 6.9

I am about to test all your hints now. Give me some hours to reply.
Regards.

will-code-for-pizza commented at 2018-09-25 10:57:

I meant to chose 5 in the error menu where you chose 6

Hi Schlomo,

if I choose 5) from the list in 1.17-2, it throws the following message:

"Script /var/lib/rear/layout/diskrestore.sh has not been changed, restarting has no impact." and returns to the list.

Nevertheless, the rear-2.4-git from TODAY (last commit 24b5853) works out of the box ! :-)
The 2.4-git from Thursday doesn´t.

This is funny, because the only change was in
rear/usr/share/rear/backup/NETFS/default/500_make_backup.sh

https://github.com/rear/rear/commit/8c835421ace2f86df80c6c70d8138e22e358d471#diff-f00be59f5ba7104a134e2ddc6864a423

https://github.com/rear/rear/commit/ba1b6db879514ca34e679ec9ec481a7361b36e36#diff-f00be59f5ba7104a134e2ddc6864a423

https://github.com/rear/rear/commit/dab8dbc5887ee4fbc4405e2a6c38aaf7eae45488#diff-f00be59f5ba7104a134e2ddc6864a423

and for me the commits have no relations to my problem. But I might be wrong.

Additional tests with 1.17-2 results with ongoing and steady problems, as described above.
I now will patch the 1.17-2 script with the " || true" option for all "parted" calls.

Keep your fingers crossed.
Regards.

will-code-for-pizza commented at 2018-09-25 11:03:

Perhaps you get that
kernel failed ... /dev/sda (Device or resource busy)
because you are hit by things as described in
#799
or in the other issues linked therein.

Does it perhaps help when you manually wipe the disk
like I described in #799
in the running ReaR recovery system before you run rear recover
i.e. something like

wipefs -a -f /dev/sda9
...

Yes, this message occurs permanently.

Unfortunatelly 'wipefs' is not available on the rescue ISO (RHEL 6.9) :-(
Would it help to install the rpm ? Would it be integrated in the rescue ISO after "mkrescue" ??

Regards.

will-code-for-pizza commented at 2018-09-25 12:20:

Additional info:

# parted --version
parted (GNU parted) 2.1

jsmeix commented at 2018-09-25 14:27:

@will-code-for-pizza
regarding 'wipefs' not available on the rescue ISO:

If 'wipefs' is installed on your original system it should get
automatically included in the ReaR recovery system via
usr/share/rear/layout/save/GNU/Linux/230_filesystem_layout.sh

# If available wipefs is used in the recovery system by 130_include_filesystem_code.sh
# as a generic way to cleanup disk partitions before creating a filesystem on a disk partition,
# see https://github.com/rear/rear/issues/540
# and https://github.com/rear/rear/issues/649#issuecomment-148725865
# Therefore if wipefs exists here in the original system it is added to REQUIRED_PROGS
# so that it will become also available in the recovery system (cf. 260_crypt_layout.sh):
has_binary wipefs && REQUIRED_PROGS=( "${REQUIRED_PROGS[@]}" wipefs ) || true

You can make anything available in the ReaR recovery system
provided you have that installed on your original system
when you specify it via config variables like
COPY_AS_IS, PROGS, REQUIRED_PROGS, and LIBS,
see default.conf about those config variables.

To make 'wipefs' available in the ReaR recovery system use

REQUIRED_PROGS=( "${REQUIRED_PROGS[@]}" wipefs )

To see what there actually is in the ReaR recovery system set

KEEP_BUILD_DIR="yes"

and ReaR's (temporary) working area where it builds in particular
the rescue/recovery system ISO image is not automatically removed
at the end of "rear mkrescue/mkbackup" so that you can
inspect the ReaR recovery system content in
$TMPDIR/rear.XXXXXXXXXXXXXXX/rootfs/
directly without the need to extract it from the initramfs/initrd
in the ISO image (again see default.conf for that info).

gdha commented at 2018-10-02 16:23:

@will-code-for-pizza is your issue sufficient answered?

gdha commented at 2018-10-10 15:26:

no reply means all fine I guess?


[Export of Github issue for rear/rear.]