#1818 Issue closed: we have faced a network connectivity issue when the REAR ISO has been booted

Labels: support / question, fixed / solved / done

nirmal21s opened issue at 2018-05-23 09:38:

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • ReaR version ("/usr/sbin/rear -V"): 2.3
  • OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"): RHEL 7.5
  • ReaR configuration files ("cat /etc/rear/site.conf" or "cat /etc/rear/local.conf"):
NRSERVER=xxxxxxxxxxxx
NSR_CLIENT_MODE=y
OUTPUT=ISO
ISO_PREFIX="rear-nsr-$HOSTNAME"
BACKUP=NSR
OUTPUT_URL=file:///tmp
# Static IP (no DHCP!)
USE_DHCLIENT=
USE_STATIC_NETWORKING="y"
# NTP
TIMESYNC=NTP
  • System architecture (x86 compatible or POWER and/or what kind of virtual machine):x86
  • Are you using BIOS or UEFI or another way to boot? Legacy BIOS
  • Brief description of the issue:we have faced a network connectivity issue when the REAR ISO has been booted:
    The problem is in the mapping of the server physical interfces to the ethXX logaical interface names. As you can see in the screenshot (1) the eth names are assigned to the physical interfaces almost in random which is incorrect configuration. When I removed network card kernel modules (rmmod tg3;rmmod igb) and probed than back again (modprobe tg3, modprobe igb) the mapping between the physical interfaces and logical interface names is correct as you can see on screenshot (2). The mapping is configured in the udev conf file located in /lib/udev/rules.d/70-persistent-net.rules This file is present on the REAR iso as well, seems not used by the REAR initramdisk during bootup though.
    image001
    image002
    70-persistent-net_rules.txt

nirmal21s commented at 2018-05-24 04:53:

Hello @jsmeix @hpannenb

Any suggestion for the above issue.
Kindly suggest me a fix....
Thanks in Advance

jsmeix commented at 2018-05-24 10:02:

@nirmal21s
I am afraid, my networking and/or udev knowledge is insufficient
to actually help with such issues in the recovery system.

As a possible workaround in this particular case when manual
rmmod tg3 ; mmod igb ; modprobe tg3 ; modprobe igb
helps before running "rear recover" you can automate that
via the PRE_RECOVERY_SCRIPT config variable,
see usr/share/rear/conf/default.conf

Such a PRE_RECOVERY_SCRIPT is run during "rear recover" via
usr/share/rear/setup/default/010_pre_recovery_script.sh
at a very early stage, cf.

# rear -s recover
Simulation mode activated ...
Source conf/Linux-i386.conf
Source conf/GNU/Linux.conf
Source conf/SUSE_LINUX.conf
Source init/default/005_verify_os_conf.sh
Source init/default/010_set_drlm_env.sh
Source init/default/030_update_recovery_system.sh
Source init/default/050_check_rear_recover_mode.sh
Source setup/default/005_ssh_agent_start.sh
Source setup/default/010_pre_recovery_script.sh
...

jsmeix commented at 2018-05-24 10:16:

@nirmal21s
another blind guess what might perhaps help is to use MODULES_LOAD
to enforce to load the specified kernel modules in the given order
in the rescue/recovery system so that e.g. something like

MODULES_LOAD=( tg3 igb )

might help in your case.

In general to debug what happens during recovery system startup:

In general regarding debugging issues with the start up scripts
that are run initially in the ReaR recovery system
the basic idea behind is to not let those start up scripts
run automatically and mostly silently but manually
one after the other with 'set -x' bash debugging mode.

Add 'debug' to the ReaR kernel command line
when booting the ReaR recovery/rescue system.

In the ReaR recovery/rescue system boot menue select
the topmost enty of the form "Recover HOSTNAME"
and press the [Tab] key to edit the boot command line
and append the word ' debug' at its end and boot that.

You may found yourself stopped at a blank screen.
In this case press [Enter] which runs the very first of the
start up scripts (/etc/scripts/system-setup.d/00functions.sh).
There is some bug that the initial message is not always
printed so you may need to type the very first [Enter] blindly.

The further start up scripts are run one by one
each one after pressing [Enter].

In 'debug' mode the start up scripts are run with 'set -x'
so that this way you can better see what actually goes on
in each of the start up scripts.

Cf.
https://github.com/rear/rear/issues/1703#issuecomment-360398508

nirmal21s commented at 2018-05-25 06:24:

@jsmeix
I will add the below parameters in local.conf and test it :

PRE_RECOVERY_SCRIPT=(rmmod tg3 ; mmod igb ; modprobe tg3 ; modprobe igb)
MODULES_LOAD=( tg3 igb )

I hope the parameter is correct.

jsmeix commented at 2018-05-25 09:17:

PRE_RECOVERY_SCRIPT and POST_RECOVERY_SCRIPT
have same syntax and in conf/examples/SLE12-SP2-btrfs-example.conf
https://github.com/rear/rear/blob/master/usr/share/rear/conf/examples/SLE12-SP2-btrfs-example.conf
there is an example of a POST_RECOVERY_SCRIPT

nirmal21s commented at 2018-05-25 13:28:

@jsmeix
Just want to know if the reason behind the issue might be due to this parameter "USE_STATIC_NETWORKING="y"
Since the Eth interface is :
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
NM_CONTROLLED=NO

So do i need to remove "USE_STATIC_NETWORKING="y"" and use USE_DHCLIENT=y ..

hpannenb commented at 2018-05-25 17:53:

@nirmal21s
I do not know where this "70-persistent-net.rules" file is coming from. As far as I read it in RHEL 7.x this is not used any more; at least I cannot find it under RHEL7.4 or Centos 7.5.

About USE_DHCLIENT: If You have an DHCP server available use it. But I assume not using DHCP is not the issue here.

BTW, can You provide the information which kind of server is in use?

Thanks.

nirmal21s commented at 2018-05-28 07:37:

@hpannenb
I am not getting your question .. you mean whether is a DHCP server or ?
The OS version is RHEL 7.5..
Thanks.

gdha commented at 2018-05-28 15:42:

@nirmal21s I believe the question was "do you have internally a DHCP server you can use to gain an IP address (during a recovery)?"
If the answer is yes, then use "USE_DHCLIENT=yes" in the local.conf file. It would fix your problem in an easy way.

nirmal21s commented at 2018-05-28 16:56:

@gdha @hpannenb
server has static ip only :
DEVICE=bond2
NAME=bond2
TYPE=bond
BOOTPROTO=static
ONBOOT=yes
IPADDR=10.209.x.xxx
PREFIX=22
#NETMASK=255.255.255.0
#USERCTL=no
NM_CONTROLLED=no
#PEERDNS=no
BONDING_MASTER=yes
BONDING_OPTS="mode=1 miimon=100"

nirmal21s commented at 2018-05-28 17:01:

@gdha
I am not able to prepare the syntax for PRE_RECOVERY_SCRIPT=(rmmod tg3 ; mmod igb ; modprobe tg3 ; modprobe igb)..

The below mentioned is correct :

for interface_count in `seq 0 9`

do

if [ -f "/sys/class/net/eth$interface_count" ]; then

            status="success"

else

            status="failed"

fi

done

if [ "$status" = "success" ]; then

            exit

else

            echo -e "Incorrect Interfaece sequence"


rmmod tg3 ; mmod igb ; modprobe tg3 ; modprobe igb

fi

gdha commented at 2018-05-29 07:45:

@nirmal21s You could try the following. Add above commands into a script and saved it as /usr/share/rear/skel/default/etc/scripts/system-setup.d/77-my-network-setup.sh and re-run rear -v mkbackup. When booting from the resulting ISO image this script should be kicked in.

nirmal21s commented at 2018-05-29 08:12:

@gdha
Thanks for the above recommendation.
Here the issue was after rear recover the network interface mixed up .
I was going through the issue 770 (https://github.com/rear/rear/issues/770) .
Just wondering if this would fix the issue
BACKUP_RESTORE_MOVE_AWAY_FILES=( /lib/udev/rules.d/70-persistent-net.rules )

gdha commented at 2018-05-29 08:44:

@nirmal21s good thinking as on RHEL 7.5 the /lib/udev/rules.d/70-persistent-net.rules is not required it is indeed best to remove it.

nirmal21s commented at 2018-05-29 11:00:

@gdha I will test it and confirm ..
created a ISO image by running mkrescue . for BACKUP=NSR.
For restoring the files we use Networker tool.
Hope everything goes well..

nirmal21s commented at 2018-05-31 15:25:

After removing the below parameters :

BACKUP_RESTORE_MOVE_AWAY_FILES=( /lib/udev/rules.d/70-persistent-net.rules )
USE_DHCLIENT=

We tried to crated a new ISO image and tried to boot.
Everything went well , after rear recover ,we got the MAC address.

But we were able to ping only this servers gateway and not able to ping the backup server IP.
issue was with routing i guess
After running the /etc/scripts/system-setup.d/60-netowrk-devices.sh & 62-routing.sh manually it worked and this server was rechable from backup server.
Any clue why REAR behaved like this.

razorfish-sl commented at 2018-06-03 04:34:

There were changes made to how linux boots & identifies the Ethernet interfaces.
potentially there were race conditions in how linux enumerated interfaces with some physical adaptions of the motheboards
Also sometimes the initial config of the setup may make it IMPOSSIBLE for REAR to discover the correct network....
It is NOT just a matter of sticking an IP address onto an interface in some cases.

In my case I ALWAYS have to boot my HP servers with "REAR" in "manual mode", since it is impossible for rear to guess my network config. (unless i make manual scripts)
A simple

"ip addr" gets me
2: enp3s0f0:

I then

MUST
ip addr add 172.18.10.xx/24 dev enp3s0f0
ip link set enp3s0f0 up

So how can REAR fail to find the ipaddress on a functioning network, seems foolish ?
here is an example
we already know enp3s0f0 is the external interface
but during the BACKUP... REAR sees this:

cat ifcfg-enp3s0f0
NM_CONTROLLED="no"
IPV4_FAILURE_FATAL="no"
NAME="enp3s0f0"
UUID="7452703b-4fdc-4e9f-accb-8c30f7803b3d"
DEVICE="enp3s0f0"
ONBOOT="yes"
TYPE=OVSPort
DEVICETYPE=ovs
OVS_BRIDGE=br-ex

cat ifcfg-br-ex
NM_CONTROLLED="no"
DEVICE=br-ex
DEVICETYPE=ovs
TYPE=OVSBridge
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="static"
IPADDR=172.18.10.70
NETMASK=255.255.255.0
GATEWAY=172.18.10.1
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="no"
IPV6_AUTOCONF="no"
IPV6_DEFROUTE="no"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
ONBOOT="yes"

Ha Har......
during recovery ,how can we expect REAR to deal with getting a suitable ip config from the above?
it is a backup program... not an AI to discover how data is routed out of the box.

jsmeix commented at 2018-06-04 12:16:

FYI
in general regarding
how to prepare network devices setup in the rescue/recovery system
see NETWORKING_PREPARATION_COMMANDS
in usr/share/rear/conf/default.conf

Perhaps that could help in this case?

nirmal21s commented at 2018-06-12 14:02:

Yes.. I have added the PRE_RECOVERY_SCRIPT=/usr/share/rear/skel/default/etc/scripts/system-setup.d/77-my-network-setup.sh.. It worked fine..

Thanks a lot for your support.

jsmeix commented at 2018-06-12 14:17:

@nirmal21s
thanks for your feedback!


[Export of Github issue for rear/rear.]