#1701 Issue closed: On PPC64 "rear mkrescue" fails with BugError in 310_network_devices.sh

Labels: enhancement, support / question, fixed / solved / done, special hardware or VM

schubiduuu opened issue at 2018-01-23 11:58:

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • rear version (/usr/sbin/rear -V): 2-3-36.git.0.0293785.unknown.changed.ppc64
  • OS version (cat /etc/rear/os.conf or lsb_release -a):
LSB Version:    core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-ppc64:core-3.2-ppc64:core-4.0-ppc64:desktop-4.0-noarch:desktop-4.0-ppc32:desktop-4.0-ppc64:graphics-2.0-noarch:graphics-2.0-ppc32:graphics-2.0-ppc64:graphics-3.2-noarch:graphics-3.2-ppc32:graphics-3.2-ppc64:graphics-4.0-noarch:graphics-4.0-ppc32:graphics-4.0-ppc64
Distributor ID: SUSE LINUX
Description:    SUSE Linux Enterprise Server 11 (ppc64)
Release:        11
Codename:       n/a
  • rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):
AUTOEXCLUDE_MULTIPATH=n
BOOT_OVER_SAN=y

OUTPUT=ISO
BACKUP=TSM
OUTPUT_URL=file:///iso/


EXCLUDE_VG=( vgHANA-data-HC2 vgHANA-data-HC3 vgHANA-log-HC2 vgHANA-log-HC3 vgHANA-shared-HC2 vgHANA-hared-HC3 )
BACKUP_PROG_EXCLUDE=( "${BACKUP_PROG_EXCLUDE[@]}" '/hana/*' )
COPY_AS_IS_TSM=( /etc/adsm/TSM.PWD /opt/tivoli/tsm/client/ba/bin/dsmc /opt/tivoli/tsm/client/ba/bin/inclexcl /opt/tivoli/tsm/client/ba/bin/dsm.sys /opt/tivoli/tsm/client/ba/bin/dsm.opt /opt/tivoli/tsm/client/api/bin64/libgpfs.so /opt/tivoli/tsm/client/api/bin64/libdmapi.so /opt/tivoli/tsm/client/ba/bin/EN_US/dsmclientV3.cat /usr/local/ibm/gsk8* )
  • Are you using legacy BIOS or UEFI boot? BIOS
  • Brief description of the issue:

BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306:
'Unexpected operational state 'unknown' for 'eth0'.'

  • Work-around, if any: no

rear-system.log:

2018-01-23 12:52:37.747091226 Relax-and-Recover 2.3-git.0.0293785.unknown.changed / 2018-01-17
2018-01-23 12:52:37.748090177 Command line options: /usr/sbin/rear mkrescue
2018-01-23 12:52:37.749022831 Using log file: /var/log/rear/rear-system.log
2018-01-23 12:52:37.750400246 Including /etc/rear/os.conf
2018-01-23 12:52:37.753286485 Including conf/Linux-ppc64.conf
2018-01-23 12:52:37.754739055 Including conf/GNU/Linux.conf
2018-01-23 12:52:37.771722352 Including conf/SUSE_LINUX.conf
2018-01-23 12:52:37.773190475 Including /etc/rear/local.conf
2018-01-23 12:52:37.775043330 ======================
2018-01-23 12:52:37.775957912 Running 'init' stage
2018-01-23 12:52:37.776901710 ======================
2018-01-23 12:52:37.783248732 Including init/default/010_set_drlm_env.sh
2018-01-23 12:52:37.786144714 Including init/default/030_update_recovery_system.sh
2018-01-23 12:52:37.789034150 Including init/default/050_check_rear_recover_mode.sh
2018-01-23 12:52:37.790102592 Finished running 'init' stage in 0 seconds
2018-01-23 12:52:37.797255559 Using build area '/tmp/rear.lth9kCedXLHuKMC'
2018-01-23 12:52:37.799787295 Running mkrescue workflow
2018-01-23 12:52:37.800835441 ======================
2018-01-23 12:52:37.801710355 Running 'prep' stage
2018-01-23 12:52:37.802581819 ======================
2018-01-23 12:52:37.808916529 Including prep/default/005_remove_workflow_conf.sh
2018-01-23 12:52:37.813107033 Including prep/default/020_translate_url.sh
2018-01-23 12:52:37.816155164 Including prep/default/030_translate_tape.sh
2018-01-23 12:52:37.819060068 Including prep/default/040_check_backup_and_output_scheme.sh
2018-01-23 12:52:37.823924636 Including prep/default/050_check_keep_old_output_copy_var.sh
2018-01-23 12:52:37.826667099 Including prep/default/100_init_workflow_conf.sh
2018-01-23 12:52:37.830165444 Including prep/GNU/Linux/200_include_getty.sh
2018-01-23 12:52:37.851152133 Including prep/GNU/Linux/200_include_serial_console.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: getty: not found
2018-01-23 12:52:37.870002279 Including prep/GNU/Linux/210_include_dhclient.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: dhclient: not found
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: dhcp6c: not found
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: dhclient6: not found
2018-01-23 12:52:37.885511137 Including prep/GNU/Linux/220_include_lvm_tools.sh
2018-01-23 12:52:37.887115932 Device mapper found enabled. Including LVM tools.
2018-01-23 12:52:37.890996559 Including prep/GNU/Linux/230_include_md_tools.sh
2018-01-23 12:52:37.894339015 Including prep/GNU/Linux/240_include_multipath_tools.sh
2018-01-23 12:52:37.903815648 Including prep/GNU/Linux/280_include_systemd.sh
2018-01-23 12:52:37.918046129 Including prep/GNU/Linux/280_include_virtualbox.sh
2018-01-23 12:52:37.922187698 Including prep/GNU/Linux/280_include_vmware_tools.sh
2018-01-23 12:52:37.926067497 Including prep/GNU/Linux/290_include_drbd.sh
2018-01-23 12:52:37.930035414 Including prep/GNU/Linux/300_check_backup_and_output_url.sh
2018-01-23 12:52:37.935275996 Including prep/ISO/default/300_check_iso_dir.sh
2018-01-23 12:52:37.937950015 Including prep/GNU/Linux/300_include_grub_tools.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: grub-probe: not found
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: grub2-probe: not found
2018-01-23 12:52:37.943281247 Including prep/GNU/Linux/310_include_cap_utils.sh
2018-01-23 12:52:37.946094767 Including prep/ISO/default/320_check_cdrom_size.sh
2018-01-23 12:52:37.948417136 ISO Directory '/var/lib/rear/output' [/dev/mapper/system-root] has 9868 MB free space
2018-01-23 12:52:37.951286820 Including prep/default/320_include_uefi_env.sh
2018-01-23 12:52:37.956210203 Including prep/ISO/GNU/Linux/320_verify_mkisofs.sh
2018-01-23 12:52:37.957463045 Using '/usr/bin/mkisofs' to create ISO images
2018-01-23 12:52:37.960284339 Including prep/default/330_include_uefi_tools.sh
2018-01-23 12:52:37.963077191 Including prep/ISO/GNU/Linux/340_add_isofs_module.sh
2018-01-23 12:52:37.967192727 Including prep/default/380_include_opal_tools.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: sedutil-cli: not found
2018-01-23 12:52:37.969947817 Including prep/TSM/default/400_prep_tsm.sh
2018-01-23 12:52:37.976027119 Including prep/default/400_save_directories.sh
/home 755 root root
2018-01-23 12:52:38.027068427 FHS directory /run does not exist
/usr/sap 755 root sapsys
/usr/sap/DAA 755 root root
2018-01-23 12:52:38.084920940 Including prep/default/950_check_missing_programs.sh
2018-01-23 12:52:38.089250057 Finished running 'prep' stage in 1 seconds
2018-01-23 12:52:38.090225975 ======================
2018-01-23 12:52:38.091188723 Running 'layout/save' stage
2018-01-23 12:52:38.092083387 ======================
2018-01-23 12:52:38.098409601 Including layout/save/GNU/Linux/100_create_layout_file.sh
2018-01-23 12:52:38.099461282 Creating disk layout
2018-01-23 12:52:38.100707462 Preparing layout directory.
2018-01-23 12:52:38.106154398 Removing old layout file.
2018-01-23 12:52:38.109204458 Including layout/save/GNU/Linux/150_save_diskbyid_mappings.sh
2018-01-23 12:52:38.405675647 Saved diskbyid_mappings
2018-01-23 12:52:38.408504934 Including layout/save/GNU/Linux/190_opaldisk_layout.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: sedutil-cli: not found
2018-01-23 12:52:38.411400432 Including layout/save/GNU/Linux/200_partition_layout.sh
2018-01-23 12:52:38.419659696 Saving disk partitions.
2018-01-23 12:52:38.424955954 Ignoring sda: it is a path of a multipath device
2018-01-23 12:52:38.428456808 Ignoring sdb: it is a path of a multipath device
2018-01-23 12:52:38.432124182 Ignoring sdc: it is a path of a multipath device
2018-01-23 12:52:38.436499143 Ignoring sdd: it is a path of a multipath device
2018-01-23 12:52:38.440080070 Ignoring sde: it is a path of a multipath device
2018-01-23 12:52:38.443884737 Ignoring sdf: it is a path of a multipath device
2018-01-23 12:52:38.447407754 Ignoring sdg: it is a path of a multipath device
2018-01-23 12:52:38.451078496 Ignoring sdh: it is a path of a multipath device
2018-01-23 12:52:38.454180629 Including layout/save/GNU/Linux/210_raid_layout.sh
2018-01-23 12:52:38.458198118 Including layout/save/GNU/Linux/220_lvm_layout.sh
2018-01-23 12:52:38.459504308 Saving LVM layout.
2018-01-23 12:52:38.731401799 Including layout/save/GNU/Linux/230_filesystem_layout.sh
2018-01-23 12:52:38.732577287 Begin saving filesystem layout
2018-01-23 12:52:38.734721590 Saving filesystem layout (using the findmnt command).
2018-01-23 12:52:38.916414856 End saving filesystem layout
2018-01-23 12:52:38.919581815 Including layout/save/GNU/Linux/240_swaps_layout.sh
2018-01-23 12:52:38.920870962 Saving Swap information.
2018-01-23 12:52:38.968141342 Including layout/save/GNU/Linux/250_drbd_layout.sh
2018-01-23 12:52:38.971125457 Including layout/save/GNU/Linux/260_crypt_layout.sh
2018-01-23 12:52:38.972303787 Saving Encrypted volumes.
2018-01-23 12:52:38.977694792 Including layout/save/GNU/Linux/270_hpraid_layout.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: hpacucli: not found
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: hpssacli: not found
2018-01-23 12:52:38.980899306 Including layout/save/GNU/Linux/280_multipath_layout.sh
2018-01-23 12:52:39.333969346 Including layout/save/default/300_list_dependencies.sh
2018-01-23 12:52:39.454504984 Including layout/save/default/310_autoexclude_usb.sh
2018-01-23 12:52:39.460451134 Including layout/save/default/310_include_exclude.sh
2018-01-23 12:52:39.461876800 Excluding Volume Group vgHANA-data-HC2.
2018-01-23 12:52:39.465843306 Excluding Volume Group vgHANA-data-HC3.
2018-01-23 12:52:39.469901349 Excluding Volume Group vgHANA-log-HC2.
2018-01-23 12:52:39.474108518 Excluding Volume Group vgHANA-log-HC3.
2018-01-23 12:52:39.478167861 Excluding Volume Group vgHANA-shared-HC2.
2018-01-23 12:52:39.482250464 Excluding Volume Group vgHANA-hared-HC3.
2018-01-23 12:52:39.490950283 Including layout/save/default/320_autoexclude.sh
2018-01-23 12:52:40.152757606 Excluding multipath slave /dev/sdd.
2018-01-23 12:52:40.156280410 Excluding multipath slave /dev/sdh.
2018-01-23 12:52:40.159894026 Excluding multipath slave /dev/sda.
2018-01-23 12:52:40.163344608 Excluding multipath slave /dev/sde.
2018-01-23 12:52:40.166907938 Excluding multipath slave /dev/sdc.
2018-01-23 12:52:40.170333086 Excluding multipath slave /dev/sdg.
2018-01-23 12:52:40.173921191 Excluding multipath slave /dev/sdb.
2018-01-23 12:52:40.177307039 Excluding multipath slave /dev/sdf.
2018-01-23 12:52:40.182493435 Including layout/save/default/330_remove_exclusions.sh
2018-01-23 12:52:40.187702767 Including layout/save/default/335_remove_excluded_multipath_vgs.sh
2018-01-23 12:52:40.208731785 Including layout/save/GNU/Linux/340_false_blacklisted.sh
2018-01-23 12:52:40.222444741 Including layout/save/default/340_generate_mountpoint_device.sh
2018-01-23 12:52:40.554312358 Including layout/save/GNU/Linux/350_copy_drbdtab.sh
2018-01-23 12:52:40.557121957 Including layout/save/default/350_save_partitions.sh
2018-01-23 12:52:40.559969037 Including layout/save/default/400_check_backup_special_files.sh
2018-01-23 12:52:40.563154679 Including layout/save/default/445_guess_bootloader.sh
2018-01-23 12:52:40.566820289 Using sysconfig bootloader 'ppc'
2018-01-23 12:52:40.570848481 Including layout/save/default/450_check_bootloader_files.sh
2018-01-23 12:52:40.574964533 Including layout/save/default/450_check_network_files.sh
2018-01-23 12:52:40.577866747 Including layout/save/GNU/Linux/500_extract_vgcfg.sh
2018-01-23 12:52:40.646217553 Including layout/save/GNU/Linux/510_current_disk_usage.sh
2018-01-23 12:52:40.652741955 Including layout/save/default/600_snapshot_files.sh
2018-01-23 12:52:40.657238847 Finished running 'layout/save' stage in 2 seconds
2018-01-23 12:52:40.658320841 ======================
2018-01-23 12:52:40.659256628 Running 'rescue' stage
2018-01-23 12:52:40.660177691 ======================
2018-01-23 12:52:40.666510210 Including rescue/default/010_merge_skeletons.sh
2018-01-23 12:52:40.667652288 Creating root filesystem layout
2018-01-23 12:52:40.669159495 Adding 'default'
2018-01-23 12:52:40.678139092 Adding 'Linux-ppc64'
2018-01-23 12:52:40.685437421 Including rescue/default/100_hostname.sh
2018-01-23 12:52:40.688275903 Including rescue/default/200_etc_issue.sh
2018-01-23 12:52:40.692613682 Including rescue/GNU/Linux/220_load_modules_from_initrd.sh
2018-01-23 12:52:40.696346972 Including rescue/GNU/Linux/230_storage_and_network_modules.sh
2018-01-23 12:52:40.697535341 Including storage drivers
2018-01-23 12:52:40.702444802 Including network drivers
2018-01-23 12:52:40.707341261 Including crypto drivers
2018-01-23 12:52:40.710992313 Including virtualization drivers
2018-01-23 12:52:40.714194110 Including additional drivers
2018-01-23 12:52:40.718988754 Including rescue/GNU/Linux/240_kernel_modules.sh
2018-01-23 12:52:40.725827218 Including rescue/GNU/Linux/250_udev.sh
2018-01-23 12:52:40.730229664 Including rescue/GNU/Linux/260_collect_initrd_modules.sh
2018-01-23 12:52:40.735880316 Including rescue/GNU/Linux/260_storage_drivers.sh
2018-01-23 12:52:40.863859994 Including rescue/GNU/Linux/270_fc_transport_info.sh
2018-01-23 12:52:40.869720857 Including rescue/GNU/Linux/290_kernel_cmdline.sh
2018-01-23 12:52:40.875161739 Including rescue/GNU/Linux/300_dns.sh
2018-01-23 12:52:40.878419887 Including rescue/GNU/Linux/310_network_devices.sh
2018-01-23 12:52:40.895364103 ERROR:
====================
BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306:
'Unexpected operational state 'unknown' for 'eth0'.'
--------------------
Please report this issue at https://github.com/rear/rear/issues
and include the relevant parts from /var/log/rear/rear-system.log
preferably with full debug information via 'rear -d -D mkrescue'
====================
==== Stack trace ====
Trace 0: /usr/sbin/rear:543 main
Trace 1: /usr/share/rear/lib/mkrescue-workflow.sh:16 WORKFLOW_mkrescue
Trace 2: /usr/share/rear/lib/framework-functions.sh:101 SourceStage
Trace 3: /usr/share/rear/lib/framework-functions.sh:49 Source
Trace 4: /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh:812 source
Trace 5: /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh:306 is_interface_up
Trace 6: /usr/share/rear/lib/_input-output-functions.sh:307 BugError
Message:
====================
BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306:
'Unexpected operational state 'unknown' for 'eth0'.'
--------------------
Please report this issue at https://github.com/rear/rear/issues
and include the relevant parts from /var/log/rear/rear-system.log
preferably with full debug information via 'rear -d -D mkrescue'
====================
== End stack trace ==
2018-01-23 12:52:40.899819732 Running exit tasks.
2018-01-23 12:52:40.901709363 Finished in 3 seconds
2018-01-23 12:52:40.902953087 Removing build area /tmp/rear.lth9kCedXLHuKMC
2018-01-23 12:52:40.911961106 End of program reached

jsmeix commented at 2018-01-23 15:57:

@schubiduuu
please provide a debug log with

rear -D mkrescue

as requested by ReaR's BUG error message.

schubiduuu commented at 2018-01-26 08:35:

rear-SYSTEM.log

Sorry for the delay but here is the log.

schubiduuu commented at 2018-01-30 10:34:

Is anything else missing for further investigation?

jsmeix commented at 2018-01-30 11:20:

@rmetrich
could you have a look what goes on here because according to

# git log -p --follow usr/share/rear/rescue/GNU/Linux/310_network_devices.sh | egrep 'Unexpected operational state|^commit|^Author'
...
commit 15567ede425401b008e5b1680db36a2c62752b8f
Author: Renaud Métrich 
+        BugError "Unexpected operational state '$state' for '$network_interface'."
...

this BugError belongs to your new 310_network_devices.sh code.
Many thanks in advance!

gdha commented at 2018-01-30 11:26:

@schubiduuu What does your network layout look like? Can we see the output of ip a?

2018-01-26 09:27:10.748644798 Including rescue/GNU/Linux/310_network_devices.sh
2018-01-26 09:27:10.749557540 Entering debugscripts mode via 'set -x'.
+ source /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh
++ network_devices_setup_script=/tmp/rear.uCrRqgsehRpGe5L/rootfs/etc/scripts/system-setup.d/60-network-devices.sh
++ echo '# Network devices setup:'
++ cat -
++ test ''
++ cat -
++ cat -
++ readlink /foo /bar
++ net_devices_have_lower_links=false
++ ls /sys/class/net/eth0 /sys/class/net/eth1 /sys/class/net/lo
++ grep -q '^lower_'
++ ip_link_supports_bridge=false
++ ip link help
++ grep -qw bridge
++ ip_link_supports_bridge=true
++ MAPPED_NETWORK_INTERFACES=()
++ rc_success=0
++ rc_error=1
++ rc_ignore=2
++ already_set_up_interfaces=
++ already_set_up_bridges=
++ already_set_up_teams=
++ already_set_up_bonds=
++ already_set_up_vlans=
++ generated_vlans=
++ already_set_up_physdevs=
++ already_set_up_physdev_drivers=
+++ mktemp
++ tmpfile=/tmp/tmp.Br88ibzhue
++ rc=
+++ ip r
+++ awk '$2 == "dev" && $8 == "src" { print $3 }'
+++ sort -u
++ for network_interface in '$( ip r | awk '\''$2 == "dev" && $8 == "src" { print $3 }'\'' | sort -u )'
++ is_linked_to_physical eth0
++ local network_interface=eth0
++ local sysfspath=/sys/class/net/eth0
++ is_devpath_linked_to_physical /sys/class/net/eth0
++ local sysfspath=/sys/class/net/eth0
++ '[' '!' -e /sys/class/net/eth0/device ']'
++ return 0
++ is_interface_up eth0
++ local network_interface=eth0
++ local sysfspath=/sys/class/net/eth0
+++ cat /sys/class/net/eth0/operstate
++ local state=unknown
++ '[' unknown = down ']'
++ '[' unknown = up ']'
++ BugError 'Unexpected operational state '\''unknown'\'' for '\''eth0'\''.'

@schabrolles Can you also have a closer look at this issue please?

schubiduuu commented at 2018-01-30 11:29:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
    inet 127.0.0.2/8 brd 127.255.255.255 scope host secondary lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether f6:44:7d:92:b2:04 brd ff:ff:ff:ff:ff:ff
    inet 10.224.0.14/25 brd 10.224.0.127 scope global eth0
    inet6 fe80::f444:7dff:fe92:b204/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether f6:44:7d:92:b2:05 brd ff:ff:ff:ff:ff:ff
    inet 10.224.0.230/25 brd 10.224.0.255 scope global eth1
    inet6 fe80::f444:7dff:fe92:b205/64 scope link
       valid_lft forever preferred_lft forever

rmetrich commented at 2018-01-30 11:39:

@schubiduuu Is your network eth0 currently operational?
Could you please dump the output of the following:
grep "" /sys/class/net/eth0/*

In the code, I'm relying on "operstate" to set up the link up/down mode. But it looks like this is not reliable.

schubiduuu commented at 2018-01-30 11:50:

/sys/class/net/eth0/addr_assign_type:0
/sys/class/net/eth0/addr_len:6
/sys/class/net/eth0/address:f6:44:7d:92:b2:04
/sys/class/net/eth0/broadcast:ff:ff:ff:ff:ff:ff
/sys/class/net/eth0/carrier:1
/sys/class/net/eth0/dev_id:0x0
/sys/class/net/eth0/dev_port:0
/sys/class/net/eth0/dormant:0
/sys/class/net/eth0/duplex:full
/sys/class/net/eth0/features:0x20114813
/sys/class/net/eth0/flags:0x1003
/sys/class/net/eth0/ifindex:2
/sys/class/net/eth0/iflink:2
/sys/class/net/eth0/link_mode:0
/sys/class/net/eth0/mtu:9000
/sys/class/net/eth0/netdev_group:0
/sys/class/net/eth0/operstate:unknown
/sys/class/net/eth0/speed:1000
/sys/class/net/eth0/tx_queue_len:1000
/sys/class/net/eth0/type:1
/sys/class/net/eth0/uevent:INTERFACE=eth0
/sys/class/net/eth0/uevent:IFINDEX=2

schubiduuu commented at 2018-01-30 11:50:

and yes eth0 is operational.

rmetrich commented at 2018-01-30 12:11:

@schubiduuu could you please set the interface explicitly down (using ip link set dev eth0 down) and dump the /sys/class/net/eth0 output again?

schubiduuu commented at 2018-01-30 12:54:

I will post the output tomorrow morning as I need to agree on a downtime.

rmetrich commented at 2018-01-30 13:05:

@schubiduuu maybe do that on eth1?

schubiduuu commented at 2018-01-30 13:09:

Unfortunately both interfaces are used productively.

rmetrich commented at 2018-01-30 13:14:

ok, as a workaround, you may edit the /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh file, and comment out the line 306 and replace by a "return 0", which will evaluate as "interface is up".

schubiduuu commented at 2018-01-30 13:15:

ok, I will edit the file and start another test.

rmetrich commented at 2018-01-30 13:22:

could you dump the lspci -vv output and lsmod output for reference?

schubiduuu commented at 2018-01-30 13:37:

@rmetrich Should I perform "lspci -vv" and "lsmod" on the source system? or after rebooting via the rescue iso?
The test after commenting out line 306 shows that I can't modify the interface config, and this message is displayed every time:

SIOCSIFFLAGS: Invalid argument

rmetrich commented at 2018-01-30 13:46:

Please run lspci -vv and lsmod from the source system, no need to reboot in rescue iso.
Please try with the following patch:

function is_interface_up () {
    local network_interface=$1
    local sysfspath=/sys/class/net/$network_interface

    # Treat an unreadable carrier file as "no carrier".
    local state=$( cat $sysfspath/carrier 2>/dev/null || echo 0 )
    if [ "$state" -eq 0 ] ; then
        return 1
    else
        return 0
    fi
}

If that fails, please provide the debug output of rear -dD mkrescue for the 310_network_devices.sh part.

schubiduuu commented at 2018-01-30 13:50:

No output for "lspci -vv" and for "lsmod":

Module                  Size  Used by
st                     51177  0
sr_mod                 23543  0
ide_cd_mod             39518  0
cdrom                  51354  2 sr_mod,ide_cd_mod
joydev                 16492  0
nfs                   583351  0
fscache                85204  1 nfs
lockd                 113448  1 nfs
auth_rpcgss            61878  1 nfs
nfs_acl                 3888  1 nfs
sunrpc                366268  7 nfs,lockd,auth_rpcgss,nfs_acl
fuse                  127337  1
xfs                  1077534  7
loop                   25404  0
ipv6                    1881  1
ipv6_lib              440978  127 ipv6
sg                     45603  0
nx_crypto              45618  0
ibmveth                32645  0
ext3                  206842  2
jbd                   102286  1 ext3
mbcache                10827  1 ext3
dm_mirror              23171  0
dm_region_hash         16423  1 dm_mirror
dm_log                 15841  2 dm_mirror,dm_region_hash
linear                  6958  0
sd_mod                 54931  8
crc_t10dif              1691  1 sd_mod
dm_service_time         4794  5
dm_least_pending        4374  0
dm_queue_length         4402  0
dm_round_robin          3996  0
dm_multipath           30372  8 dm_service_time,dm_least_pending,dm_queue_length,dm_round_robin
scsi_dh_hp_sw           6663  0
scsi_dh_rdac           12199  0
scsi_dh_emc            10476  0
scsi_dh_alua           17206  0
scsi_dh                11199  5 dm_multipath,scsi_dh_hp_sw,scsi_dh_rdac,scsi_dh_emc,scsi_dh_alua
dm_snapshot            47338  0
dm_mod                127248  58 dm_mirror,dm_log,dm_multipath,dm_snapshot
ibmvscsic              37737  16
scsi_transport_srp      8416  1 ibmvscsic
scsi_tgt               17750  1 scsi_transport_srp
scsi_mod              291791  12 st,sr_mod,sg,sd_mod,scsi_dh_hp_sw,scsi_dh_rdac,scsi_dh_emc,scsi_dh_alua,scsi_dh,ibmvscsic,scsi_transport_srp,scsi_tgt

rmetrich commented at 2018-01-30 13:54:

You may need to be root for lspci -vv.
You may also try dmidecode > /tmp/dmidecode.txt and upload the file.
The idea is to know which driver/hardware doesn't provide the data.

schubiduuu commented at 2018-01-30 13:56:

I am root on the system and the system says "dmidecode: not found".

When executing "rear -D mkrescue" there are no error messages.

rmetrich commented at 2018-01-30 14:15:

Could you upload the log anyway, for me to check.
dmidecode is not installed by default, maybe @jsmeix may help (I have no SuSE in house :-) )

schubiduuu commented at 2018-01-30 14:18:

rear-SYSTEM.log

rmetrich commented at 2018-01-30 14:41:

Sorry, I missed an "operstate" use just above.

function get_interface_state () {
local network_interface=$1
local sysfspath=/sys/class/net/$network_interface

cat $sysfspath/operstate
}

should be replaced by

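function get_interface_state () {
    local network_interface=$1
    local sysfspath=/sys/class/net/$network_interface

    # The caller uses the printed string, so echo the state instead of
    # using 'return' (which only accepts a numeric exit code).
    if [ "$( cat $sysfspath/carrier 2>/dev/null || echo 0 )" -eq 0 ] ; then
        echo "down"
    else
        echo "up"
    fi
}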

rmetrich commented at 2018-01-30 14:42:

With that, in the log file, you should see something like the following:

Was "unknown":

++ code='ip link set dev eth0 unknown
ip link set dev eth0 mtu 9000'

Should then be "up":

++ code='ip link set dev eth0 up
ip link set dev eth0 mtu 9000'

schubiduuu commented at 2018-01-30 14:59:

Do you mean the log file used by "rear -D mkrescue"?

rmetrich commented at 2018-01-30 15:16:

yes, the one you uploaded 1 hour ago.
we can see "ip link set dev ... unknown" in it.
With the latest "fix", this should disappear and you should get "up" instead.

I'll then produce a real patch once we know more about a reliable way of getting the interface status.
Hence my question about the hardware/driver used.

jsmeix commented at 2018-01-30 15:27:

Regarding dmidecode:
On my SLES11 x86 system I have:

# type -a dmidecode 
dmidecode is /usr/sbin/dmidecode

# rpm -qf /usr/sbin/dmidecode
pmtools-20071116-44.33.1

I don't know if dmidecode is also available for ppc64 architecture.

rmetrich commented at 2018-01-30 16:06:

Didn't notice the ppc64 arch, may be a hint

schubiduuu commented at 2018-01-31 06:31:

Output of grep "" /sys/class/net/eth0/* after executing "ip link set dev eth0 down":

/sys/class/net/eth0/addr_assign_type:0
/sys/class/net/eth0/addr_len:6
/sys/class/net/eth0/address:f6:44:7d:92:b2:04
/sys/class/net/eth0/broadcast:ff:ff:ff:ff:ff:ff
grep: /sys/class/net/eth0/carrier: Invalid argument
/sys/class/net/eth0/dev_id:0x0
/sys/class/net/eth0/dev_port:0
grep: /sys/class/net/eth0/dormant: Invalid argument
grep: /sys/class/net/eth0/duplex: Invalid argument
/sys/class/net/eth0/features:0x20114813
/sys/class/net/eth0/flags:0x1002
/sys/class/net/eth0/ifindex:2
/sys/class/net/eth0/iflink:2
/sys/class/net/eth0/link_mode:0
/sys/class/net/eth0/mtu:9000
/sys/class/net/eth0/netdev_group:0
/sys/class/net/eth0/operstate:down
grep: /sys/class/net/eth0/speed: Invalid argument
/sys/class/net/eth0/tx_queue_len:1000
/sys/class/net/eth0/type:1
/sys/class/net/eth0/uevent:INTERFACE=eth0
/sys/class/net/eth0/uevent:IFINDEX=2
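
For reference, the flags value changes from 0x1003 in the earlier dump to 0x1002 here, i.e. only the lowest bit (IFF_UP, 0x1) is cleared. A minimal sketch of how that bit could be tested in shell follows; is_admin_up is a hypothetical helper name and not something taken from ReaR:

```shell
# Hypothetical helper, not part of ReaR: succeed when the IFF_UP bit (0x1)
# is set in a flags value as printed by /sys/class/net/<interface>/flags.
is_admin_up() {
    local flags="$1"
    # Shell arithmetic accepts the 0x... form that sysfs prints.
    [ $(( $flags & 1 )) -ne 0 ]
}

# The two values seen in this thread for eth0:
is_admin_up 0x1003 && echo "0x1003: IFF_UP is set"
is_admin_up 0x1002 || echo "0x1002: IFF_UP is not set"
```

This only reflects the administrative state (what "ip link set ... up/down" changes), not whether a link is actually present.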

jsmeix commented at 2018-01-31 08:50:

@schubiduuu
FYI how you could work around this issue for now:

Usually you do not need the full networking setup of your original system
also in the ReaR recovery system.

In general, in the recovery system you only need a minimal networking setup
that is just sufficient to access your backup from within the running
recovery system (with sufficient throughput if your backup is big)
so that your backup can be restored during "rear recover".

If ReaR's automated networking setup for ReaR's recovery system
does not work, you can manually specify the networking setup commands
that you need in your particular case in your recovery system via the
NETWORKING_PREPARATION_COMMANDS
config variable, see usr/share/rear/conf/default.conf for documentation.
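
A minimal sketch of what such a setting could look like in /etc/rear/local.conf, assuming the address and interface name from the "ip a" output earlier in this thread. The values are illustrative only, and the trailing 'return' (meant to skip the automated network devices setup) is an assumption based on the default.conf description that should be verified there before use:

```shell
# Hypothetical /etc/rear/local.conf fragment (bash array syntax, as used
# by the other ReaR config variables in this issue).
# Address and interface are copied from this thread and must be adapted.
NETWORKING_PREPARATION_COMMANDS=( 'ip addr add 10.224.0.14/25 dev eth0'
                                  'ip link set dev eth0 up'
                                  'return' )
```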

In this particular case here you need additionally to modify
usr/share/rear/rescue/GNU/Linux/310_network_devices.sh
to let it return early after its very basic initial stuff to avoid that
later it fails with BugError by adding an early return 0 e.g. after
the "If IPADDR=1.2.3.4 has been defined at boot time" part like

# If IPADDR=1.2.3.4 has been defined at boot time via ip=1.2.3.4
...
EOT

# FIXME: Early 'return' to avoid https://github.com/rear/rear/issues/1701
return 0

# Detect whether 'readlink' supports multiple filenames or not

Furthermore if you need additional special commands to be run
in the ReaR recovery system (e.g. launching a special service)
you can specify that via the config variable
PRE_RECOVERY_SCRIPT
see usr/share/rear/conf/default.conf for documentation.

rmetrich commented at 2018-01-31 08:53:

@schubiduuu @jsmeix I'll submit a patch later today (or tomorrow), based on operstate or carrier availability.
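
For illustration only, a minimal sketch of what "operstate or carrier" could mean, based on the two sysfs dumps in this thread: with the interface up, operstate is "unknown" and carrier reads 1; with the interface set down, operstate is "down" and carrier cannot be read ("Invalid argument"). This is not the patch that was eventually submitted; the function name and the optional second argument (a sysfs root, added only so the function can be tried against a scratch directory) are assumptions:

```shell
# Hypothetical sketch, not the actual ReaR fix.
# $1: interface name, $2: optional sysfs root (defaults to /sys/class/net).
get_interface_state_sketch() {
    local network_interface="$1"
    local sysfspath="${2:-/sys/class/net}/$network_interface"
    local state carrier

    state=$( cat "$sysfspath/operstate" 2>/dev/null )
    case "$state" in
        (up|down)
            # operstate reports a definite value, so use it as is.
            echo "$state"
            return 0
            ;;
    esac

    # operstate is 'unknown' or unreadable: fall back to carrier.
    # Reading carrier fails on an interface that is set down,
    # so an unreadable carrier is treated as 'down'.
    carrier=$( cat "$sysfspath/carrier" 2>/dev/null )
    if [ "$carrier" = "1" ] ; then
        echo up
    else
        echo down
    fi
}
```

Calling get_interface_state_sketch eth0 prints either "up" or "down", so the generated "ip link set dev eth0 ..." line never contains "unknown".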

schubiduuu commented at 2018-01-31 13:04:

@jsmeix: Even if I don't need the full network configuration of the source system, I should be able to change manually the IP config with commands like ifconfig. When trying to change the IP address or changing the operational state of eth0 it keeps telling me: SIOCSIFFLAGS: Invalid argument.

jsmeix commented at 2018-01-31 14:50:

@schubiduuu
do you mean that in the running recovery system
manual network setup with "usual commands"
does not work
while in contrast that same "usual commands"
work in the original system?

schubiduuu commented at 2018-01-31 15:38:

I can use the usual command while the source system is running but not in the recovery environment.

schubiduuu commented at 2018-02-02 08:40:

Should I start another issue for the nonfunctional network commands?

rmetrich commented at 2018-02-02 08:55:

What do you exactly mean by "nonfunctional network commands?".
Please provide some examples and context (in the source system, or within recovery)

schubiduuu commented at 2018-02-02 09:00:

Within the recovery environment after booting via a rear ISO I am not able to make any changes to the network configuration. Ifconfig or ip don't work as I get always this error message: SIOCSIFFLAGS: Invalid argument

rmetrich commented at 2018-02-02 09:30:

Then, yes, this looks like a different issue, unless the network script, when running, sets up something bad (but I doubt).
To make sure, please boot the recovery with "ip=1.2.3.4", this will then bypass the script.
If the issue persists, there is for sure another issue, likely related to some missing driver.

schubiduuu commented at 2018-02-02 09:32:

Ok and I did this change but still get the same error. The strange thing about this issue is that I was able to use an older rear version for some time but then it didn't work suddenly.

rmetrich commented at 2018-02-02 09:34:

Try comparing the modules loaded (lsmod) and please give some examples of command you run.
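
One way to do that comparison, as a rough sketch: save the lsmod output of each system to a file and list the module names that appear only in the first file. The helper name modules_missing_in_second and the file names in the usage note are made up for this example:

```shell
# Hypothetical helper: print modules that are listed in the first lsmod dump
# but not in the second one. Both arguments are files holding lsmod output.
modules_missing_in_second() {
    local first="$1"
    local second="$2"
    local tmpdir
    tmpdir=$( mktemp -d ) || return 1
    # Skip the 'Module Size Used by' header and keep only the module names.
    awk 'NR > 1 { print $1 }' "$first"  | sort > "$tmpdir/first"
    awk 'NR > 1 { print $1 }' "$second" | sort > "$tmpdir/second"
    # comm -23 prints the lines that occur only in the first sorted file.
    comm -23 "$tmpdir/first" "$tmpdir/second"
    rm -rf "$tmpdir"
}
```

For example, after running "lsmod > lsmod.source" on the source system and "lsmod > lsmod.recovery" in the recovery system, "modules_missing_in_second lsmod.source lsmod.recovery" lists the modules that are loaded on the source system only.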

schubiduuu commented at 2018-02-02 09:47:

lsmod on source system:

Module                  Size  Used by
st                     51177  0
sr_mod                 23543  0
ide_cd_mod             39518  0
cdrom                  51354  2 sr_mod,ide_cd_mod
joydev                 16492  0
nfs                   583351  0
fscache                85204  1 nfs
lockd                 113448  1 nfs
auth_rpcgss            61878  1 nfs
nfs_acl                 3888  1 nfs
sunrpc                366268  7 nfs,lockd,auth_rpcgss,nfs_acl
fuse                  127337  1
xfs                  1077534  7
loop                   25404  0
ipv6                    1881  1
ipv6_lib              440978  127 ipv6
sg                     45603  0
nx_crypto              45618  0
ibmveth                32645  0
ext3                  206842  2
jbd                   102286  1 ext3
mbcache                10827  1 ext3
dm_mirror              23171  0
dm_region_hash         16423  1 dm_mirror
dm_log                 15841  2 dm_mirror,dm_region_hash
linear                  6958  0
sd_mod                 54931  8
crc_t10dif              1691  1 sd_mod
dm_service_time         4794  5
dm_least_pending        4374  0
dm_queue_length         4402  0
dm_round_robin          3996  0
dm_multipath           30372  8 dm_service_time,dm_least_pending,dm_queue_length,dm_round_robin
scsi_dh_hp_sw           6663  0
scsi_dh_rdac           12199  0
scsi_dh_emc            10476  0
scsi_dh_alua           17206  0
scsi_dh                11199  5 dm_multipath,scsi_dh_hp_sw,scsi_dh_rdac,scsi_dh_emc,scsi_dh_alua
dm_snapshot            47338  0
dm_mod                127248  58 dm_mirror,dm_log,dm_multipath,dm_snapshot
ibmvscsic              37737  16
scsi_transport_srp      8416  1 ibmvscsic
scsi_tgt               17750  1 scsi_transport_srp
scsi_mod              291791  12 st,sr_mod,sg,sd_mod,scsi_dh_hp_sw,scsi_dh_rdac,scsi_dh_emc,scsi_dh_alua,scsi_dh,ibmvscsic,scsi_transport_srp,scsi_tgt

lsmod on recovery system:

Module                  Size  Used by
ipv6                    1881  1
ipv6_lib              440978  137 ipv6
dm_mod                127248  0
nx_crypto              45618  0
ibmveth                32645  0
ibmvscsic              37737  0
scsi_transport_srp      8416  1 ibmvscsic
scsi_tgt               17750  1 scsi_transport_srp
scsi_mod              291791  3 ibmvscsic,scsi_transport_srp,scsi_tgt

example of executed command:

ifconfig -a -s eth0 1.2.3.4/24
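
A quick way to compare two lsmod listings like the ones above is to reduce each to its module names and print the names that are only present on the source system. The following is a minimal sketch, assuming both listings were saved to files; the function name `missing_modules` and the file names are hypothetical and not part of ReaR.

```shell
# Hypothetical helper: print modules that are loaded on the source system
# but not in the recovery system.
#   $1: file with the lsmod output of the source system
#   $2: file with the lsmod output of the recovery system
missing_modules() {
    awk '
        FNR == 1  { next }                    # skip the "Module Size Used by" header
        NR == FNR { loaded[$1] = 1; next }    # first file: modules in the recovery system
        NF && !($1 in loaded) { print $1 }    # second file: source modules missing there
    ' "$2" "$1"
}

# Example usage (file names are assumptions):
#   lsmod > /tmp/lsmod.source      # run on the source system
#   lsmod > /tmp/lsmod.recovery    # run in the recovery system
#   missing_modules /tmp/lsmod.source /tmp/lsmod.recovery
```

Applied to the two listings above, this would report, among others, sd_mod and dm_multipath, which are loaded on the source system but not in the recovery system.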

jsmeix commented at 2018-02-06 09:36:

By default during "rear mkrescue/mkbackup" at least those
kernel modules that are currently loaded on the original system
get included in the recovery system.

But by default not all those currently loaded modules on the original system
will also get (automatically) loaded in the running recovery system.

See the description of the config variables
MODULES and MODULES_LOAD
in usr/share/rear/conf/default.conf

To be more on the safe side you may include all
files in /lib/modules/$KERNEL_VERSION
in the recovery system by using
MODULES=( 'all_modules' )
cf. https://github.com/rear/rear/issues/1202
and https://github.com/rear/rear/issues/1355
which show examples of awkward unexpected failures
when needed kernel modules are missing.

If you use special hardware that may even
need special kernel modules outside of
/lib/modules/$KERNEL_VERSION
you must use something like
COPY_AS_IS=( "${COPY_AS_IS[@]:-}" /path/to/special/modules )
to also get your special kernel modules included
in your recovery system.

To enforce loading of kernel modules during startup
of the recovery system use something like
MODULES_LOAD=( first_module second_module ... )

If you use special hardware that needs firmware
(e.g. some network interface cards need firmware)
you may additionally have to get all firmware files
included in your recovery system, see the
FIRMWARE_FILES config variable in
usr/share/rear/conf/default.conf
and note the special exception about ppc64
so that you may have to enforce via
FIRMWARE_FILES=( 'yes' )
to get all files from the /lib*/firmware/ directories
included in the recovery system.

But both
MODULES=( 'all_modules' )
and even more
FIRMWARE_FILES=( 'yes' )
will increase the size of the recovery system
which is stored in ReaR's initrd.

But at least on some POWER architecture systems
with some POWER architecture specific bootloaders
(e.g. the yaboot bootloader)
the maximum size of the initrd is rather small, see the
REAR_INITRD_COMPRESSION config variable
in usr/share/rear/conf/default.conf
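
Taken together, the settings described above could be sketched as the following fragment for /etc/rear/local.conf. This is only an illustration: the module names in MODULES_LOAD are picked from the source system's lsmod output earlier in this issue as an assumption about what might be needed, not a verified list, and whether "lzma" is usable depends on what the kernel and the bootloader support.

```shell
# /etc/rear/local.conf (sketch, adapt to your system)

# Include all files in /lib/modules/$KERNEL_VERSION in the recovery system.
MODULES=( 'all_modules' )

# Enforce loading of kernel modules during startup of the recovery system.
# The names below are examples only (an assumption, not a verified list).
MODULES_LOAD=( sd_mod dm_multipath )

# Include all files from the /lib*/firmware/ directories.
FIRMWARE_FILES=( 'yes' )

# Use a stronger initrd compression to counter the increased size.
REAR_INITRD_COMPRESSION="lzma"
```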

schubiduuu commented at 2018-02-06 10:48:

I used MODULES=( 'all_modules' ) and FIRMWARE_FILES=( 'yes' ) in the local.conf.
To reduce the size of initrd I used "REAR_INITRD_COMPRESSION="lzma"".

Unfortunately this didn't help. Did I do anything wrong?

jsmeix commented at 2018-02-06 13:10:

As in
https://github.com/rear/rear/issues/1724#issuecomment-363413974
I guess (but I am not at all a sufficient networking expert
to make an authoritative decision here) that this issue
depends on "special hardware" - probably not on the ppc64
architecture because @schabrolles tested ReaR a lot
on POWER architecture - but perhaps more likely on special
networking hardware (e.g. special network interface card).

jsmeix commented at 2018-02-07 12:56:

@rmetrich
in addition to your https://github.com/rear/rear/pull/1719
I did https://github.com/rear/rear/pull/1725
that also fixes and enhances some minor issues
regarding networking setup in the recovery system.
I would appreciate it if you could have a look at
my https://github.com/rear/rear/pull/1725
whether or not it looks o.k. to you.

rmetrich commented at 2018-02-07 13:25:

@jsmeix this looks good to me, but I will not be able to test in the coming days.

jsmeix commented at 2018-02-08 07:58:

With https://github.com/rear/rear/pull/1719 merged
this particular issue here should be fixed.

jsmeix commented at 2018-02-08 08:05:

@rmetrich
many thanks for your analysis of what the root cause is
and for your enhancement that makes ReaR even work
with network drivers that do not set /sys/class/net/INTERFACE/operstate
to one of the usually expected values 'up' or 'down' which is
a nice example of ReaR's "Dirty hacks welcome" style in
https://github.com/rear/rear/wiki/Coding-Style
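
To illustrate the idea of tolerating an operstate other than 'up' or 'down', a check could be sketched roughly as follows. This is not the code from the pull request: the function name `is_interface_usable` and the optional sysfs root argument (added only so the sketch can be tried against a fake directory) are hypothetical, and falling back to the sysfs `carrier` file is an assumption about one reasonable way to handle such drivers.

```shell
# Hypothetical helper, not ReaR's actual implementation.
#   $1: network interface name, e.g. eth0
#   $2: sysfs root (optional, defaults to /sys/class/net)
is_interface_usable() {
    sysfs_dir="${2:-/sys/class/net}/$1"
    state=$(cat "$sysfs_dir/operstate" 2>/dev/null)
    case "$state" in
        up)   return 0 ;;
        down) return 1 ;;
        *)
            # Some drivers report e.g. 'unknown': fall back to the carrier
            # flag, which reads as 1 when a link is detected.
            carrier=$(cat "$sysfs_dir/carrier" 2>/dev/null)
            [ "$carrier" = "1" ]
            ;;
    esac
}

# Example usage:
#   is_interface_usable eth0 && echo "eth0 looks usable"
```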

jsmeix commented at 2018-02-27 10:41:

According to https://github.com/rear/rear/issues/1741
this issue

BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306:
'Unexpected operational state 'unknown' for ...

could have also happened on non-POWER architecture,
in that case on "Ubuntu 16.04.4 LTS" as

BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306:
'Unexpected operational state 'unknown' for 'vnet0'.'

[Export of Github issue for rear/rear.]