#2262 Issue closed: CDM: When an incorrect IP address is specified for the new CDM cluster, ReaR should prompt the user again

Labels: enhancement, no-issue-activity

DamaniN opened issue at 2019-10-24 19:46:

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • ReaR version ("/usr/sbin/rear -V"):
    2.6 - dev

  • OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"):
    All

  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
    N/A

  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PowerVM LPAR):
    All

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    All

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    All

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    N/A

  • Description of the issue (ideally so that others can reproduce it):
    When using the Rubrik CDM integration, a user who is recovering from a cluster other than the original one is prompted to enter the IP address of that cluster. If they enter an IP address that does not work, they are dropped back to the command prompt. Instead, ReaR should prompt them for the IP address again (see the sketch after the log output below).

  • Workaround, if any:
    Re-run 'rear recover' and start over.

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
RESCUE dn-rear-centos-7-6:~ # rear recover
Relax-and-Recover 2.5 / Git
Running rear recover (PID 5415)
Using log file: /var/log/rear/rear-dn-rear-centos-7-6.log
Running workflow recover within the ReaR rescue/recovery system

Is the data being restored from the original CDM Cluster? (y/n)
(default 'y' timeout 300 seconds)
n
Enter one of the IP addresses for the replica CDM cluster:
1.2.3.4
ERROR: Could not download https://1.2.3.4/connector/rubrik-agent-sunos5.10.sparc.tar.gz
Some latest log messages since the last called script 410_use_replica_cdm_cluster_cert.sh:
  * About to connect() to 1.2.3.4 port 443 (#0)
  *   Trying 1.2.3.4...
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                   Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* No route to host
  * Failed connect to 1.2.3.4:443; No route to host
  * Closing connection 0
  curl: (7) Failed connect to 1.2.3.4:443; No route to host
Aborting due to an error, check /var/log/rear/rear-dn-rear-centos-7-6.log for details
Exiting rear recover (PID 5415) and its descendant processes ...
Running exit tasks
Terminated
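
A minimal sketch of the requested behavior (hypothetical code, not the current BACKUP=CDM script; the prompt text and the connector URL are taken from the log output above, everything else is an assumption): keep asking for the replica cluster IP address until the connector tarball can actually be downloaded, instead of aborting on the first failure.

# Hypothetical retry loop for the replica CDM cluster IP address
cdm_ip=""
until test "$cdm_ip" ; do
    read -r -p "Enter one of the IP addresses for the replica CDM cluster: " cdm_ip
    # URL as shown in the error message in the log above
    connector_url="https://$cdm_ip/connector/rubrik-agent-sunos5.10.sparc.tar.gz"
    # Reuse the download that currently aborts 'rear recover' as the validation step
    if ! curl -fksS --connect-timeout 10 -o /dev/null "$connector_url" ; then
        echo "Could not download $connector_url - please enter another IP address"
        cdm_ip=""
    fi
done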

DamaniN commented at 2019-10-24 19:47:

This issue can be assigned to me to fix.

jsmeix commented at 2019-10-25 07:39:

@DamaniN
thank you for your continuous verification and testing
and for your improvements to your BACKUP=CDM implementation
while ReaR 2.6 is still under development, so that when it gets released
most of the outstanding BACKUP=CDM issues will be solved for the users.

In general regarding
"in case of invalid user input - let the user retry if possible"
you may have a look at https://github.com/rear/rear/issues/2253
and my current implementation, for example in
verify/NBU/default/390_request_point_in_time_restore_parameters.sh
https://github.com/rear/rear/blob/master/usr/share/rear/verify/NBU/default/390_request_point_in_time_restore_parameters.sh
and
verify/TSM/default/390_request_point_in_time_restore_parameters.sh
https://github.com/rear/rear/blob/master/usr/share/rear/verify/TSM/default/390_request_point_in_time_restore_parameters.sh
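
The retry pattern in those scripts is, roughly, a loop around ReaR's UserInput function that only terminates on valid (or deliberately empty) input. A simplified sketch of that pattern follows; the user input ID, prompt text and validation are made up for illustration, and the exact UserInput options used here are an assumption:

# Sketch of the "let the user retry" pattern used in the 390_request_* scripts
while true ; do
    point_in_time=$( UserInput -I RESTORE_POINT_IN_TIME -p "Enter a point in time (YYYY-MM-DD HH:mm:ss) or press ENTER to skip" )
    # Empty input means: no point-in-time restore requested
    test "$point_in_time" || break
    # Accept the input only when it is a valid date/time, otherwise ask again
    date -d "$point_in_time" >/dev/null 2>&1 && break
    LogPrintError "Invalid date/time '$point_in_time' - please try again"
done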

jsmeix commented at 2019-10-25 14:45:

@DamaniN
FYI a side note regarding "Re-run 'rear recover' and start over":

It is not possible to "just" re-run 'rear recover', in particular not
when it failed during or after recreation of the disk layout
(e.g. when it fails while the backup is being restored),
because in this case the target system disks are either in an
incomplete state (when the disk layout was only partially recreated)
or filesystems on the target system disks are already mounted
(below /mnt/local in the recovery system).

So before one can re-run 'rear recover', one has to manually
clean up the state of the target system disks; see also the section
"Prepare replacement hardware for disaster recovery" in
https://en.opensuse.org/SDB:Disaster_Recovery

We do not yet have a "cleanupdisk" script that could do that
automatically, cf. https://github.com/rear/rear/issues/799
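
For reference, a minimal manual cleanup inside the rescue/recovery system could look roughly like the following sketch; /dev/sda is only an example device, the wipefs step destroys all data on it, and what actually needs to be cleaned up depends on how far the previous 'rear recover' run got:

# Unmount whatever the previous run left mounted below /mnt/local
umount --recursive /mnt/local
# Deactivate any LVM volume groups that were already recreated on the target disks
vgchange -a n
# Remove filesystem, RAID and LVM signatures plus the partition table
# from the target disk (example device only - destroys all data on it)
wipefs --all /dev/sda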

DamaniN commented at 2019-10-25 16:00:

@jsmeix, this issue occurs before the disk layout is recreated, so I think the re-run workaround is still valid. Thanks for the pointers to your earlier work; I'll use them as a guide.

github-actions commented at 2020-06-29 01:37:

Stale issue message


[Export of Github issue for rear/rear.]