#960 PR merged: Some fixes and addons for recovery system networking setup

Labels: enhancement, cleanup, waiting for info, fixed / solved / done

jsmeix opened issue at 2016-08-11 15:53:

Changed how rescue/GNU/Linux/31_network_devices.sh
behaves when DHCP should be used for recovery system
networking setup: Now in this case it does no longer generate /etc/scripts/system-setup.d/60-network-devices.sh
(before it generated one which does only 'return').

Added support for manual recovery system networking setup
via NETWORKING_SETUP_COMMANDS, see default.conf.
This functionality is intended to be helpful as some kind of
"last resort" when the default automated network devices setup
does not at all work in some special cases.
I tested it on my SLES12-SP2-beta5 test system with

NETWORKING_SETUP_COMMANDS=( 'ip addr add 10.160.64.54/16 dev eth0' 'ip link set dev eth0 up' 'ip route add default via 10.160.255.254' )

and for me it works.

see https://github.com/rear/rear/issues/951

jsmeix commented at 2016-08-11 17:37:

@gdha
please have a look in particular at this change in
usr/share/rear/rescue/GNU/Linux/31_network_devices.sh
Before it was:

# Skip netscript if noip is configured on the command line:
cat <<EOT >> $netscript
if [[ -e /proc/cmdline ]] ; then
    if grep -q 'noip' /proc/cmdline ; then
        return
    fi
fi
EOT
# Add a line at the top of netscript to skip if dhclient will be used:
cat - <<EOT > $netscript
# if USE_DHCLIENT=y then use DHCP instead and skip 60-network-devices.sh
[[ ! -z "\$USE_DHCLIENT" && -z "\$USE_STATIC_NETWORKING" ]] && return
...
EOT

Therin the > $netscript was plain wrong because
it made the above >> $netscript useless.
I fixed that but that is not my actual question here.

Now it is

# When DHCP is used via 58-start-dhclient.sh do not mess up the existing networking setup
# here by possibly conflicting network devices setup via 60-network-devices.sh:
if test $USE_DHCLIENT -a -z $USE_STATIC_NETWORKING ; then
    Log "No network devices setup via 60-network-devices.sh because that happens via DHCP in 58-start-dhclient.sh."
    # To be on the safe side remove network_devices_setup_script if it somehow already exists:
    rm $v -f $network_devices_setup_script >&2
    return
fi

This changes the behaviour becaus now it evaluates
the USE_DHCLIENT and USE_STATIC_NETWORKING
variables during "rear mkbackup" and not during booting
in the recovery system.

But I think now it is still correct and much simpler
because I cannot imagine how the values of the
USE_DHCLIENT and USE_STATIC_NETWORKING
variables could be changed between "rear mkbackup"
and booting in the recovery system.

I know that after the recovery system has booted
(i.e. after one has logged in ther as root) one
can change /etc/rear/local.conf but then
/etc/scripts/system-setup.d/60-network-devices.sh
was already run so that changes of the
USE_DHCLIENT and USE_STATIC_NETWORKING
variables in the recovery system have no effect on
/etc/scripts/system-setup.d/60-network-devices.sh

gdha commented at 2016-08-11 17:52:

@jsmeix issue #819 is also linked to this I think. Will check it tomorrow.

jsmeix commented at 2016-08-11 17:57:

Ah!
your
https://github.com/rear/rear/issues/819#issuecomment-224315823
shows a real behavioural difference / regression
because with my change there is no longer a
/etc/scripts/system-setup.d/60-network-devices.sh
in case of DHCP.

Therefore I will change my change so that even in case of DHCP
the /etc/scripts/system-setup.d/60-network-devices.sh is
generated as usual so that it is available in the recovery system.

@gdha
many thanks for pointing out that regression to me!

gozora commented at 2016-08-11 21:19:

UEFI starts to be pretty boring, so here I am messing a bit with networking :-) (hope nobody minds).
My first stop was 58-start-dhclient.sh and I was a bit surprised here.

I have three basic questions (most probably for @gdha):

  1. Did I understood it well, that once original system have DHCP enabled on at least one interface, rear recover will start $DHCLIENT_BIN and stop processing with remaining network configuration?
  2. Is it desired behavior to try to start $DHCLIENT_BIN on first interface only?
    e.g. since I had eth0 connected to DHCP (network 10.0.2.0) and eth1 to static setup (network 192.168.0.0) all was "fine", (eth0 was started with DHCP and eth1 left down)
    but as soon as I "switch cables" and connected eth0 to network 192.168.0.0 and eth1 to (10.0.2.0 - DHCP network) none of the interfaces went up :-( (network addresses are here only for illustration)
  3. do you have some documents/issues/comments/anything describing how you intended to setup network in rear recover phase?

I personally don't mind setting up network manually (I've used recovery solution where I needed to load firmware to NIC prior driver loading not talking about pure happiness when I was able to successfully finish PXE boot without kernel panic or hang, so for me ReaR is real RELAX and RECOVER :-)), but I've understood that your policy is to bring rear recover to point where everything is ready.

jsmeix commented at 2016-08-12 09:41:

@gozora
I guess you mean since you fixed basically all UEFI issues
in Relax-and-Recover UEFI starts to be pretty boring
but you like to avoid to get over-relaxed so that now
you intend to fix basically all networking issues ;-)
I have great plans for you:
After you had fixed basically all networking issues
you could fix all systemd issues which nowadays
include all udev issues ;-)

Seriously:
I have to do some major changes here
see https://github.com/rear/rear/pull/960#issuecomment-239239792
that I (hopefully) commit today...

gozora commented at 2016-08-12 10:32:

@jsmeix actually working on ReaR gave a the reason to study UEFI and boot loaders in more detail (there is nothing like a good motivation). And yes, couple days ago I've started to dig into systemd, so I hope this knowledge will become useful one day.

When talking about network automatic configuration, I'd like to know what is ReaR policy:

  • would you like to have network configured exactly same way as on original system when in recovery mode (including all potentional networks / bonds / taps / tuns ...)
  • or having minimal network configuration that is sufficient for restore

?

Thanks for that link!

jsmeix commented at 2016-08-12 11:36:

I replaced support for NETWORKING_SETUP_COMMANDS
with support for NETWORKING_PREPARATION_COMMANDS
see default.conf for description.

jsmeix commented at 2016-08-12 11:36:

I still need to test it...

jsmeix commented at 2016-08-12 12:37:

Regarding
https://github.com/rear/rear/pull/960#issuecomment-239414330
what network config should be set up in the recovery system:

Personally I would prefer to do only a simple
minimal network setup that is sufficient for restore.

But I fear enterprise users love to have complicated
network setups where there is no such thing as a
simple network setup that is sufficient for restore.

Cf. https://github.com/rear/rear/issues/951#issuecomment-238651910

Unfortunately in our env we do not have DHCP assigned
IP addresses and we MUST use VLAN tagging.

and I also remember a SUSE customer where there
was no simple network setup that is sufficient for restore
because he can only access his server where his backup is
after a complicated networking setup.

Because I think it is hopeless to try to make the
network setup automatism in Relax-and-Recover
"just work" for any kind of complicated networking setup,
I implemented here the new support for the
NETWORKING_PREPARATION_COMMANDS
so that users can - if needed - specify whatever they need
in their particular case to set up networking in the recovery system.

gozora commented at 2016-08-12 12:50:

Well, VLAN is definitely a must have. I was rather talking about luxury things like bonding that could be dropped in theory (during recovery phase).
Ok, so if we would have network in state where we car run rear recover and it would work, everybody is happy. Hope I get it right ...

jsmeix commented at 2016-08-12 13:40:

My current fixes and the new support for the
NETWORKING_PREPARATION_COMMANDS
work perfectly well for me according to my test.

When I have only

USE_DHCLIENT="yes"

nothing changed for me - i.e. I get only networking setup
via DHCP via 58-start-dhclient.sh and 60-network-devices.sh
skips with 'return'.

When I have only

NETWORKING_PREPARATION_COMMANDS=( 'ip addr add 10.160.4.255/16 dev eth0' 'ip link set dev eth0 up' 'ip route add default via 10.160.255.254' 'return' )

I get networking setup only with that commands.

When I have (same as before but without the 'return'):

NETWORKING_PREPARATION_COMMANDS=( 'ip addr add 10.160.4.255/16 dev eth0' 'ip link set dev eth0 up' 'ip route add default via 10.160.255.254' )

I get networking setup via the
NETWORKING_PREPARATION_COMMANDS
plus the other commands in in 60-network-devices.sh - i.e. I
get two IP addresses for 'eth0'.

When I have

USE_DHCLIENT="yes"
NETWORKING_PREPARATION_COMMANDS=( 'ip addr add 10.160.4.255/16 dev eth0' 'ip link set dev eth0 up' 'ip route add default via 10.160.255.254' 'return' )

I get first networking setup via DHCP via 58-start-dhclient.sh and
additionally via the NETWORKING_PREPARATION_COMMANDS
in 60-network-devices.sh - i.e. I get two IP addresses for 'eth0'.

When I have

USE_DHCLIENT="yes"
NETWORKING_PREPARATION_COMMANDS=( 'ip addr add 10.160.4.255/16 dev eth0' 'ip link set dev eth0 up' 'ip route add default via 10.160.255.254' 'USE_STATIC_NETWORKING="yes"' )

I get first networking setup via DHCP via 58-start-dhclient.sh and
then via the NETWORKING_PREPARATION_COMMANDS
plus the other commands in in 60-network-devices.sh - i.e. I get
three IP addresses for 'eth0'.

Not that it makes much sense to have many IP addresses
but I only liked to test if it works as intended.

I think I will merge it soon if there is no furious objection...

jsmeix commented at 2016-08-12 14:07:

I merged it because I like it so much now ;-)

I would very much apprerciate feedback if the new
NETWORKING_PREPARATION_COMMANDS
is really helpful to get problems with recovery system
networking setup more easily fixed - or at least to get
more easily a quick and dirty workaround in case of
issues with recovery system networking setup.


[Export of Github issue for rear/rear.]