#2594 Issue closed: rsync connectivity check failed in auto recovery mode in centOS7 with latest rear 2.6 version

Labels: enhancement, support / question, fixed / solved / done

cvijayvinoth opened issue at 2021-03-30 17:31:

  • ReaR version ("/usr/sbin/rear -V"):
    Relax-and-Recover 2.6 / Git

  • OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
OUTPUT=ISO
BACKUP=RSYNC
RSYNC_PREFIX="test2_${HOSTNAME}"
BACKUP_PROG="/var/www/html/imageBackup/rsync"
OUTPUT_URL=rsync://test2@XXXXXXXX::rsync_backup
BACKUP_URL=rsync://test2@XXXXXXXX::rsync_backup
ISO_DIR="/var/www/html/imageBackup/iso/$HOSTNAME"
MESSAGE_PREFIX="$$: "
BACKUP_RSYNC_OPTIONS+=(-z --progress --password-file=/etc/rear/rsync_pass)
BACKUP_PROG_EXCLUDE+=( "$(</etc/rear/path.txt)/imageBackup/iso/*" "$(</etc/rear/path.txt)/imageBackup/Log/*" "/etc/rear/rsync_pass" )
PROGRESS_MODE="plain"
AUTOEXCLUDE_PATH=( /tmp )
PROGRESS_WAIT_SECONDS="1"
export TMPDIR="/var/www/html/imageBackup/iso/"
PXE_RECOVER_MODE=automatic
ISO_FILES=("/var/www/html/imageBackup/rsync")
ISO_PREFIX="${HOSTNAME}"
ISO_DEFAULT="automatic"
  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
    PC

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86 compatible

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    BIOS and GRUB

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    local disk

  • Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT" or "lsblk" as makeshift):

NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0 111.8G  0 disk 
├─sda1            8:1    0     1G  0 part /boot
└─sda2            8:2    0 110.8G  0 part 
  ├─centos-root 253:0    0  82.8G  0 lvm  /
  ├─centos-swap 253:1    0     8G  0 lvm  [SWAP]
  └─centos-home 253:2    0    20G  0 lvm  /home
  • Description of the issue (ideally so that others can reproduce it):

It is not working on my rear 2.5 codebase.
So i have installed and tested with rear latest version 2.6 too.

When I run auto recover, on 100_check_rsync.sh
the rsync check protocol connectivity is getting failed and it throw's error like

Test: rsync --sparse --archive --hard-links --numeric-ids --stats -z --progress --password-file=/etc/rear/rsync_pass rsync://test2@XXXXXX:1873/
446: 2021-03-30 09:46:44.770914837 ERROR: Rsync daemon not running on XXXXXX

( post this if i run rear -v recover manually it works fine )

If I run auto recover with debug mode, on 100_check_rsync.sh
the rsync check protocol connectivity is working fine.
With manual recover option too it works fine.
May I know what might be the issue?
Here I have attached both debug mode and normal auto recovery logs attached for your reference.

  • Workaround, if any:

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):

rear-localhost_debug.log
rear-localhost.log

jsmeix commented at 2021-03-31 09:31:

Only an offhanded blind guess:

Perhaps a race happens in auto recovery mode where the rsync connectivity check
is run earlier than the rsync daemon needs to get started up completely?

Some daemons need some startup time to do some initialization work
until the daemon becomes ready to accept actual client connections.

So perhaps adding some sleep time like sleep 10 at the beginning of
usr/share/rear/prep/RSYNC/default/100_check_rsync.sh
might help?

cvijayvinoth commented at 2021-03-31 09:48:

@jsmeix : Thanks for your reply. Let me check once and update .

cvijayvinoth commented at 2021-04-01 06:06:

Yes. Post applying the sleep, it started to work.

gdha commented at 2021-04-06 14:49:

@cvijayvinoth Try to decrease the 'sleep 10' to 'sleep 5' to verify if that would be sufficient as waiting on the remote rsync daemon. Too much sleep delays the recovery time in its whole and we do not want users to press control-C because they think it is hanging for some reason. Thanks for checking this out for us. We appreciate it ;-)

cvijayvinoth commented at 2021-04-09 14:35:

Sure @gdha : Thanks for your reply and support.

cvijayvinoth commented at 2021-04-15 14:17:

@gdha : Its working fine with sleep 5. Thanks a lot...

gdha commented at 2021-04-15 14:19:

@cvijayvinoth Thanks for the check on your side and in the meantime a PR is waiting to update this in our master branch.

jsmeix commented at 2021-04-16 08:34:

Only FYI:

I do no longer understand my "offhanded blind guess" in
https://github.com/rear/rear/issues/2594#issuecomment-810922738

Reason (as fas as I imagine how things are):

When "rear recover" errors out in 100_check_rsync.sh with

"Rsync daemon not running on $RSYNC_HOST"

it is about the rsync daemon on the remote RSYNC_HOST
(or do I misunderstand something here?)
and I do not understand how a sleep 5 on the machine where "rear recover" is running
could have any "positive effect" on the rsync daemon on the remote RSYNC_HOST.

A sleep 5 on the machine where "rear recover" is running must be related
to something on the machine where "rear recover" is running
that needs some time.
Currently I don't know what that "something" could be.
My next "offhanded blind guess" what that "something" could be
is "something network related" on the machine where "rear recover" is running.

So perhaps in general with things like PXE_RECOVER_MODE=automatic
or any other automated run of "rear recover" we should perhaps have
a general sleep 5 delay before "rear recover" is launched automatically
to be more on the safe side that all network related stuff is ready to be used
on the machine where "rear recover" is running?

jsmeix commented at 2021-05-03 10:18:

Seems to be fixed via https://github.com/rear/rear/pull/2599


[Export of Github issue for rear/rear.]