#1701 Issue closed: On PPC64 "rear mkrescue" fails with BugError in 310_network_devices.sh¶
Labels: enhancement, support / question, fixed / solved / done, special hardware or VM
schubiduuu opened issue at 2018-01-23 11:58:¶
Relax-and-Recover (ReaR) Issue Template¶
Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):
- rear version (/usr/sbin/rear -V): 2-3-36.git.0.0293785.unknown.changed.ppc64
- OS version (cat /etc/rear/os.conf or lsb_release -a):
LSB Version: core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-ppc64:core-3.2-ppc64:core-4.0-ppc64:desktop-4.0-noarch:desktop-4.0-ppc32:desktop-4.0-ppc64:graphics-2.0-noarch:graphics-2.0-ppc32:graphics-2.0-ppc64:graphics-3.2-noarch:graphics-3.2-ppc32:graphics-3.2-ppc64:graphics-4.0-noarch:graphics-4.0-ppc32:graphics-4.0-ppc64
Distributor ID: SUSE LINUX
Description: SUSE Linux Enterprise Server 11 (ppc64)
Release: 11
Codename: n/a
- rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):
AUTOEXCLUDE_MULTIPATH=n
BOOT_OVER_SAN=y
OUTPUT=ISO
BACKUP=TSM
OUTPUT_URL=file:///iso/
EXCLUDE_VG=( vgHANA-data-HC2 vgHANA-data-HC3 vgHANA-log-HC2 vgHANA-log-HC3 vgHANA-shared-HC2 vgHANA-hared-HC3 )
BACKUP_PROG_EXCLUDE=( "${BACKUP_PROG_EXCLUDE[@]}" '/hana/*' )
COPY_AS_IS_TSM=( /etc/adsm/TSM.PWD
                 /opt/tivoli/tsm/client/ba/bin/dsmc
                 /opt/tivoli/tsm/client/ba/bin/inclexcl
                 /opt/tivoli/tsm/client/ba/bin/dsm.sys
                 /opt/tivoli/tsm/client/ba/bin/dsm.opt
                 /opt/tivoli/tsm/client/api/bin64/libgpfs.so
                 /opt/tivoli/tsm/client/api/bin64/libdmapi.so
                 /opt/tivoli/tsm/client/ba/bin/EN_US/dsmclientV3.cat
                 /usr/local/ibm/gsk8* )
- Are you using legacy BIOS or UEFI boot? BIOS
- Brief description of the issue:
BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306:
'Unexpected operational state 'unknown' for 'eth0'.'
- Work-around, if any: no
rear-system.log:
2018-01-23 12:52:37.747091226 Relax-and-Recover 2.3-git.0.0293785.unknown.changed / 2018-01-17
2018-01-23 12:52:37.748090177 Command line options: /usr/sbin/rear mkrescue
2018-01-23 12:52:37.749022831 Using log file: /var/log/rear/rear-system.log
2018-01-23 12:52:37.750400246 Including /etc/rear/os.conf
2018-01-23 12:52:37.753286485 Including conf/Linux-ppc64.conf
2018-01-23 12:52:37.754739055 Including conf/GNU/Linux.conf
2018-01-23 12:52:37.771722352 Including conf/SUSE_LINUX.conf
2018-01-23 12:52:37.773190475 Including /etc/rear/local.conf
2018-01-23 12:52:37.775043330 ======================
2018-01-23 12:52:37.775957912 Running 'init' stage
2018-01-23 12:52:37.776901710 ======================
2018-01-23 12:52:37.783248732 Including init/default/010_set_drlm_env.sh
2018-01-23 12:52:37.786144714 Including init/default/030_update_recovery_system.sh
2018-01-23 12:52:37.789034150 Including init/default/050_check_rear_recover_mode.sh
2018-01-23 12:52:37.790102592 Finished running 'init' stage in 0 seconds
2018-01-23 12:52:37.797255559 Using build area '/tmp/rear.lth9kCedXLHuKMC'
2018-01-23 12:52:37.799787295 Running mkrescue workflow
2018-01-23 12:52:37.800835441 ======================
2018-01-23 12:52:37.801710355 Running 'prep' stage
2018-01-23 12:52:37.802581819 ======================
2018-01-23 12:52:37.808916529 Including prep/default/005_remove_workflow_conf.sh
2018-01-23 12:52:37.813107033 Including prep/default/020_translate_url.sh
2018-01-23 12:52:37.816155164 Including prep/default/030_translate_tape.sh
2018-01-23 12:52:37.819060068 Including prep/default/040_check_backup_and_output_scheme.sh
2018-01-23 12:52:37.823924636 Including prep/default/050_check_keep_old_output_copy_var.sh
2018-01-23 12:52:37.826667099 Including prep/default/100_init_workflow_conf.sh
2018-01-23 12:52:37.830165444 Including prep/GNU/Linux/200_include_getty.sh
2018-01-23 12:52:37.851152133 Including prep/GNU/Linux/200_include_serial_console.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: getty: not found
2018-01-23 12:52:37.870002279 Including prep/GNU/Linux/210_include_dhclient.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: dhclient: not found
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: dhcp6c: not found
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: dhclient6: not found
2018-01-23 12:52:37.885511137 Including prep/GNU/Linux/220_include_lvm_tools.sh
2018-01-23 12:52:37.887115932 Device mapper found enabled. Including LVM tools.
2018-01-23 12:52:37.890996559 Including prep/GNU/Linux/230_include_md_tools.sh
2018-01-23 12:52:37.894339015 Including prep/GNU/Linux/240_include_multipath_tools.sh
2018-01-23 12:52:37.903815648 Including prep/GNU/Linux/280_include_systemd.sh
2018-01-23 12:52:37.918046129 Including prep/GNU/Linux/280_include_virtualbox.sh
2018-01-23 12:52:37.922187698 Including prep/GNU/Linux/280_include_vmware_tools.sh
2018-01-23 12:52:37.926067497 Including prep/GNU/Linux/290_include_drbd.sh
2018-01-23 12:52:37.930035414 Including prep/GNU/Linux/300_check_backup_and_output_url.sh
2018-01-23 12:52:37.935275996 Including prep/ISO/default/300_check_iso_dir.sh
2018-01-23 12:52:37.937950015 Including prep/GNU/Linux/300_include_grub_tools.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: grub-probe: not found
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: grub2-probe: not found
2018-01-23 12:52:37.943281247 Including prep/GNU/Linux/310_include_cap_utils.sh
2018-01-23 12:52:37.946094767 Including prep/ISO/default/320_check_cdrom_size.sh
2018-01-23 12:52:37.948417136 ISO Directory '/var/lib/rear/output' [/dev/mapper/system-root] has 9868 MB free space
2018-01-23 12:52:37.951286820 Including prep/default/320_include_uefi_env.sh
2018-01-23 12:52:37.956210203 Including prep/ISO/GNU/Linux/320_verify_mkisofs.sh
2018-01-23 12:52:37.957463045 Using '/usr/bin/mkisofs' to create ISO images
2018-01-23 12:52:37.960284339 Including prep/default/330_include_uefi_tools.sh
2018-01-23 12:52:37.963077191 Including prep/ISO/GNU/Linux/340_add_isofs_module.sh
2018-01-23 12:52:37.967192727 Including prep/default/380_include_opal_tools.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: sedutil-cli: not found
2018-01-23 12:52:37.969947817 Including prep/TSM/default/400_prep_tsm.sh
2018-01-23 12:52:37.976027119 Including prep/default/400_save_directories.sh
/home 755 root root
2018-01-23 12:52:38.027068427 FHS directory /run does not exist
/usr/sap 755 root sapsys
/usr/sap/DAA 755 root root
2018-01-23 12:52:38.084920940 Including prep/default/950_check_missing_programs.sh
2018-01-23 12:52:38.089250057 Finished running 'prep' stage in 1 seconds
2018-01-23 12:52:38.090225975 ======================
2018-01-23 12:52:38.091188723 Running 'layout/save' stage
2018-01-23 12:52:38.092083387 ======================
2018-01-23 12:52:38.098409601 Including layout/save/GNU/Linux/100_create_layout_file.sh
2018-01-23 12:52:38.099461282 Creating disk layout
2018-01-23 12:52:38.100707462 Preparing layout directory.
2018-01-23 12:52:38.106154398 Removing old layout file.
2018-01-23 12:52:38.109204458 Including layout/save/GNU/Linux/150_save_diskbyid_mappings.sh
2018-01-23 12:52:38.405675647 Saved diskbyid_mappings
2018-01-23 12:52:38.408504934 Including layout/save/GNU/Linux/190_opaldisk_layout.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: sedutil-cli: not found
2018-01-23 12:52:38.411400432 Including layout/save/GNU/Linux/200_partition_layout.sh
2018-01-23 12:52:38.419659696 Saving disk partitions.
2018-01-23 12:52:38.424955954 Ignoring sda: it is a path of a multipath device
2018-01-23 12:52:38.428456808 Ignoring sdb: it is a path of a multipath device
2018-01-23 12:52:38.432124182 Ignoring sdc: it is a path of a multipath device
2018-01-23 12:52:38.436499143 Ignoring sdd: it is a path of a multipath device
2018-01-23 12:52:38.440080070 Ignoring sde: it is a path of a multipath device
2018-01-23 12:52:38.443884737 Ignoring sdf: it is a path of a multipath device
2018-01-23 12:52:38.447407754 Ignoring sdg: it is a path of a multipath device
2018-01-23 12:52:38.451078496 Ignoring sdh: it is a path of a multipath device
2018-01-23 12:52:38.454180629 Including layout/save/GNU/Linux/210_raid_layout.sh
2018-01-23 12:52:38.458198118 Including layout/save/GNU/Linux/220_lvm_layout.sh
2018-01-23 12:52:38.459504308 Saving LVM layout.
2018-01-23 12:52:38.731401799 Including layout/save/GNU/Linux/230_filesystem_layout.sh
2018-01-23 12:52:38.732577287 Begin saving filesystem layout
2018-01-23 12:52:38.734721590 Saving filesystem layout (using the findmnt command).
2018-01-23 12:52:38.916414856 End saving filesystem layout
2018-01-23 12:52:38.919581815 Including layout/save/GNU/Linux/240_swaps_layout.sh
2018-01-23 12:52:38.920870962 Saving Swap information.
2018-01-23 12:52:38.968141342 Including layout/save/GNU/Linux/250_drbd_layout.sh
2018-01-23 12:52:38.971125457 Including layout/save/GNU/Linux/260_crypt_layout.sh
2018-01-23 12:52:38.972303787 Saving Encrypted volumes.
2018-01-23 12:52:38.977694792 Including layout/save/GNU/Linux/270_hpraid_layout.sh
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: hpacucli: not found
/usr/share/rear/lib/_input-output-functions.sh: line 208: type: hpssacli: not found
2018-01-23 12:52:38.980899306 Including layout/save/GNU/Linux/280_multipath_layout.sh
2018-01-23 12:52:39.333969346 Including layout/save/default/300_list_dependencies.sh
2018-01-23 12:52:39.454504984 Including layout/save/default/310_autoexclude_usb.sh
2018-01-23 12:52:39.460451134 Including layout/save/default/310_include_exclude.sh
2018-01-23 12:52:39.461876800 Excluding Volume Group vgHANA-data-HC2.
2018-01-23 12:52:39.465843306 Excluding Volume Group vgHANA-data-HC3.
2018-01-23 12:52:39.469901349 Excluding Volume Group vgHANA-log-HC2.
2018-01-23 12:52:39.474108518 Excluding Volume Group vgHANA-log-HC3.
2018-01-23 12:52:39.478167861 Excluding Volume Group vgHANA-shared-HC2.
2018-01-23 12:52:39.482250464 Excluding Volume Group vgHANA-hared-HC3.
2018-01-23 12:52:39.490950283 Including layout/save/default/320_autoexclude.sh
2018-01-23 12:52:40.152757606 Excluding multipath slave /dev/sdd.
2018-01-23 12:52:40.156280410 Excluding multipath slave /dev/sdh.
2018-01-23 12:52:40.159894026 Excluding multipath slave /dev/sda.
2018-01-23 12:52:40.163344608 Excluding multipath slave /dev/sde.
2018-01-23 12:52:40.166907938 Excluding multipath slave /dev/sdc.
2018-01-23 12:52:40.170333086 Excluding multipath slave /dev/sdg.
2018-01-23 12:52:40.173921191 Excluding multipath slave /dev/sdb.
2018-01-23 12:52:40.177307039 Excluding multipath slave /dev/sdf.
2018-01-23 12:52:40.182493435 Including layout/save/default/330_remove_exclusions.sh
2018-01-23 12:52:40.187702767 Including layout/save/default/335_remove_excluded_multipath_vgs.sh
2018-01-23 12:52:40.208731785 Including layout/save/GNU/Linux/340_false_blacklisted.sh
2018-01-23 12:52:40.222444741 Including layout/save/default/340_generate_mountpoint_device.sh
2018-01-23 12:52:40.554312358 Including layout/save/GNU/Linux/350_copy_drbdtab.sh
2018-01-23 12:52:40.557121957 Including layout/save/default/350_save_partitions.sh
2018-01-23 12:52:40.559969037 Including layout/save/default/400_check_backup_special_files.sh
2018-01-23 12:52:40.563154679 Including layout/save/default/445_guess_bootloader.sh
2018-01-23 12:52:40.566820289 Using sysconfig bootloader 'ppc'
2018-01-23 12:52:40.570848481 Including layout/save/default/450_check_bootloader_files.sh
2018-01-23 12:52:40.574964533 Including layout/save/default/450_check_network_files.sh
2018-01-23 12:52:40.577866747 Including layout/save/GNU/Linux/500_extract_vgcfg.sh
2018-01-23 12:52:40.646217553 Including layout/save/GNU/Linux/510_current_disk_usage.sh
2018-01-23 12:52:40.652741955 Including layout/save/default/600_snapshot_files.sh
2018-01-23 12:52:40.657238847 Finished running 'layout/save' stage in 2 seconds
2018-01-23 12:52:40.658320841 ======================
2018-01-23 12:52:40.659256628 Running 'rescue' stage
2018-01-23 12:52:40.660177691 ======================
2018-01-23 12:52:40.666510210 Including rescue/default/010_merge_skeletons.sh
2018-01-23 12:52:40.667652288 Creating root filesystem layout
2018-01-23 12:52:40.669159495 Adding 'default'
2018-01-23 12:52:40.678139092 Adding 'Linux-ppc64'
2018-01-23 12:52:40.685437421 Including rescue/default/100_hostname.sh
2018-01-23 12:52:40.688275903 Including rescue/default/200_etc_issue.sh
2018-01-23 12:52:40.692613682 Including rescue/GNU/Linux/220_load_modules_from_initrd.sh
2018-01-23 12:52:40.696346972 Including rescue/GNU/Linux/230_storage_and_network_modules.sh
2018-01-23 12:52:40.697535341 Including storage drivers
2018-01-23 12:52:40.702444802 Including network drivers
2018-01-23 12:52:40.707341261 Including crypto drivers
2018-01-23 12:52:40.710992313 Including virtualization drivers
2018-01-23 12:52:40.714194110 Including additional drivers
2018-01-23 12:52:40.718988754 Including rescue/GNU/Linux/240_kernel_modules.sh
2018-01-23 12:52:40.725827218 Including rescue/GNU/Linux/250_udev.sh
2018-01-23 12:52:40.730229664 Including rescue/GNU/Linux/260_collect_initrd_modules.sh
2018-01-23 12:52:40.735880316 Including rescue/GNU/Linux/260_storage_drivers.sh
2018-01-23 12:52:40.863859994 Including rescue/GNU/Linux/270_fc_transport_info.sh
2018-01-23 12:52:40.869720857 Including rescue/GNU/Linux/290_kernel_cmdline.sh
2018-01-23 12:52:40.875161739 Including rescue/GNU/Linux/300_dns.sh
2018-01-23 12:52:40.878419887 Including rescue/GNU/Linux/310_network_devices.sh
2018-01-23 12:52:40.895364103 ERROR:
====================
BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306:
'Unexpected operational state 'unknown' for 'eth0'.'
--------------------
Please report this issue at https://github.com/rear/rear/issues
and include the relevant parts from /var/log/rear/rear-system.log
preferably with full debug information via 'rear -d -D mkrescue'
====================
==== Stack trace ====
Trace 0: /usr/sbin/rear:543 main
Trace 1: /usr/share/rear/lib/mkrescue-workflow.sh:16 WORKFLOW_mkrescue
Trace 2: /usr/share/rear/lib/framework-functions.sh:101 SourceStage
Trace 3: /usr/share/rear/lib/framework-functions.sh:49 Source
Trace 4: /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh:812 source
Trace 5: /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh:306 is_interface_up
Trace 6: /usr/share/rear/lib/_input-output-functions.sh:307 BugError
Message:
====================
BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306:
'Unexpected operational state 'unknown' for 'eth0'.'
--------------------
Please report this issue at https://github.com/rear/rear/issues
and include the relevant parts from /var/log/rear/rear-system.log
preferably with full debug information via 'rear -d -D mkrescue'
====================
== End stack trace ==
2018-01-23 12:52:40.899819732 Running exit tasks.
2018-01-23 12:52:40.901709363 Finished in 3 seconds
2018-01-23 12:52:40.902953087 Removing build area /tmp/rear.lth9kCedXLHuKMC
2018-01-23 12:52:40.911961106 End of program reached
jsmeix commented at 2018-01-23 15:57:¶
@schubiduuu
please provide a debug log with
rear -D mkrescue
as requested by ReaR's BUG error message.
schubiduuu commented at 2018-01-26 08:35:¶
Sorry for the delay but here is the log.
schubiduuu commented at 2018-01-30 10:34:¶
Is anything else missing for further investigation?
jsmeix commented at 2018-01-30 11:20:¶
@rmetrich
could you have a look what goes on here because according to
# git log -p --follow usr/share/rear/rescue/GNU/Linux/310_network_devices.sh | egrep 'Unexpected operational state|^commit|^Author'
...
commit 15567ede425401b008e5b1680db36a2c62752b8f
Author: Renaud Métrich
+ BugError "Unexpected operational state '$state' for '$network_interface'."
...
this BugError belongs to your new 310_network_devices.sh code.
Many thanks in advance!
gdha commented at 2018-01-30 11:26:¶
@schubiduuu What does your network layout look like? Can we see the
output of ip a?
2018-01-26 09:27:10.748644798 Including rescue/GNU/Linux/310_network_devices.sh
2018-01-26 09:27:10.749557540 Entering debugscripts mode via 'set -x'.
+ source /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh
++ network_devices_setup_script=/tmp/rear.uCrRqgsehRpGe5L/rootfs/etc/scripts/system-setup.d/60-network-devices.sh
++ echo '# Network devices setup:'
++ cat -
++ test ''
++ cat -
++ cat -
++ readlink /foo /bar
++ net_devices_have_lower_links=false
++ ls /sys/class/net/eth0 /sys/class/net/eth1 /sys/class/net/lo
++ grep -q '^lower_'
++ ip_link_supports_bridge=false
++ ip link help
++ grep -qw bridge
++ ip_link_supports_bridge=true
++ MAPPED_NETWORK_INTERFACES=()
++ rc_success=0
++ rc_error=1
++ rc_ignore=2
++ already_set_up_interfaces=
++ already_set_up_bridges=
++ already_set_up_teams=
++ already_set_up_bonds=
++ already_set_up_vlans=
++ generated_vlans=
++ already_set_up_physdevs=
++ already_set_up_physdev_drivers=
+++ mktemp
++ tmpfile=/tmp/tmp.Br88ibzhue
++ rc=
+++ ip r
+++ awk '$2 == "dev" && $8 == "src" { print $3 }'
+++ sort -u
++ for network_interface in '$( ip r | awk '\''$2 == "dev" && $8 == "src" { print $3 }'\'' | sort -u )'
++ is_linked_to_physical eth0
++ local network_interface=eth0
++ local sysfspath=/sys/class/net/eth0
++ is_devpath_linked_to_physical /sys/class/net/eth0
++ local sysfspath=/sys/class/net/eth0
++ '[' '!' -e /sys/class/net/eth0/device ']'
++ return 0
++ is_interface_up eth0
++ local network_interface=eth0
++ local sysfspath=/sys/class/net/eth0
+++ cat /sys/class/net/eth0/operstate
++ local state=unknown
++ '[' unknown = down ']'
++ '[' unknown = up ']'
++ BugError 'Unexpected operational state '\''unknown'\'' for '\''eth0'\''.'
@schabrolles Can you also have a closer look at this issue please?
schubiduuu commented at 2018-01-30 11:29:¶
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
    inet 127.0.0.2/8 brd 127.255.255.255 scope host secondary lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether f6:44:7d:92:b2:04 brd ff:ff:ff:ff:ff:ff
    inet 10.224.0.14/25 brd 10.224.0.127 scope global eth0
    inet6 fe80::f444:7dff:fe92:b204/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether f6:44:7d:92:b2:05 brd ff:ff:ff:ff:ff:ff
    inet 10.224.0.230/25 brd 10.224.0.255 scope global eth1
    inet6 fe80::f444:7dff:fe92:b205/64 scope link
       valid_lft forever preferred_lft forever
rmetrich commented at 2018-01-30 11:39:¶
@schubiduuu Is your network eth0 currently operational?
Could you please dump the output of the following:
grep "" /sys/class/net/eth0/*
In the code, I'm relying on "operstate" to set up the link up/down mode. But it looks like this is not reliable.
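For reference, the two sysfs attributes discussed here can be listed side by side for every interface. The following is only a minimal sketch: the helper name show_link_state and its optional directory argument are made up for illustration and are not part of ReaR.

```shell
#!/usr/bin/env bash
# Print operstate and carrier for each interface below a sysfs net directory.
# 'show_link_state' is a hypothetical helper, not part of ReaR.
show_link_state () {
    local netdir=${1:-/sys/class/net}
    local devpath name operstate carrier
    for devpath in "$netdir"/* ; do
        # Skip plain files; only interface directories are of interest
        [ -d "$devpath" ] || continue
        name=${devpath##*/}
        # Either read can fail, so report that instead of aborting
        operstate=$( cat "$devpath/operstate" 2>/dev/null ) || operstate="unreadable"
        carrier=$( cat "$devpath/carrier" 2>/dev/null ) || carrier="unreadable"
        echo "$name operstate=$operstate carrier=$carrier"
    done
}
```

Calling show_link_state without an argument reads /sys/class/net. On the system in this issue one would expect a line such as "eth0 operstate=unknown carrier=1", judging from the sysfs dump posted in this thread.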
schubiduuu commented at 2018-01-30 11:50:¶
/sys/class/net/eth0/addr_assign_type:0
/sys/class/net/eth0/addr_len:6
/sys/class/net/eth0/address:f6:44:7d:92:b2:04
/sys/class/net/eth0/broadcast:ff:ff:ff:ff:ff:ff
/sys/class/net/eth0/carrier:1
/sys/class/net/eth0/dev_id:0x0
/sys/class/net/eth0/dev_port:0
/sys/class/net/eth0/dormant:0
/sys/class/net/eth0/duplex:full
/sys/class/net/eth0/features:0x20114813
/sys/class/net/eth0/flags:0x1003
/sys/class/net/eth0/ifindex:2
/sys/class/net/eth0/iflink:2
/sys/class/net/eth0/link_mode:0
/sys/class/net/eth0/mtu:9000
/sys/class/net/eth0/netdev_group:0
/sys/class/net/eth0/operstate:unknown
/sys/class/net/eth0/speed:1000
/sys/class/net/eth0/tx_queue_len:1000
/sys/class/net/eth0/type:1
/sys/class/net/eth0/uevent:INTERFACE=eth0
/sys/class/net/eth0/uevent:IFINDEX=2
schubiduuu commented at 2018-01-30 11:50:¶
and yes eth0 is operational.
rmetrich commented at 2018-01-30 12:11:¶
@schubiduuu could you please set the interface explicitly down
(using ip link set dev eth0 down) and dump the
/sys/class/net/eth0 output again?
schubiduuu commented at 2018-01-30 12:54:¶
I will post the output tomorrow morning as I need to agree on a downtime.
rmetrich commented at 2018-01-30 13:05:¶
@schubiduuu maybe do that on eth1?
schubiduuu commented at 2018-01-30 13:09:¶
Unfortunately both interfaces are used productively.
rmetrich commented at 2018-01-30 13:14:¶
ok, as a workaround, you may edit the
/usr/share/rear/rescue/GNU/Linux/310_network_devices.sh
file: comment out line 306 and replace it with a "return 0",
which will evaluate as "interface is up".
schubiduuu commented at 2018-01-30 13:15:¶
ok, I will edit the file and start another test.
rmetrich commented at 2018-01-30 13:22:¶
could you dump the lspci -vv output and lsmod output for reference?
schubiduuu commented at 2018-01-30 13:37:¶
@rmetrich Should I perform "lspci -vv" and "lsmod" on the source system?
or after rebooting via the rescue iso?
The test after commenting out line 306 shows that I can't modify the
interface config, and this message is displayed every time:
SIOCSIFFLAGS: Invalid argument
rmetrich commented at 2018-01-30 13:46:¶
Please run lspci -vv and lsmod from the source system, no need to
reboot into the rescue ISO.
Please try with the following patch:
function is_interface_up () {
    local network_interface=$1
    local sysfspath=/sys/class/net/$network_interface
    # Use 'carrier' instead of 'operstate'; an unreadable carrier counts as 0
    local state=$( cat $sysfspath/carrier 2>/dev/null || echo 0 )
    if [ $state -eq 0 ] ; then
        return 1
    else
        return 0
    fi
}
If that fails, please provide the debug output of rear -dD mkrescue
(the 310_network_devices.sh part).
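Cutting the requested part out of a debug log can be scripted. The sketch below assumes the log format shown earlier in this thread, where each sourced script is announced by a timestamped "Including path/to/script.sh" line; the function name extract_script_section is hypothetical.

```shell
#!/usr/bin/env bash
# Print the part of a ReaR log that belongs to one sourced script.
# 'extract_script_section' is a hypothetical helper, not part of ReaR.
# $1 = log file, $2 = script path exactly as logged after "Including ".
extract_script_section () {
    local logfile=$1 script=$2
    awk -v script="$script" '
        # A timestamped "Including ....sh" line starts a new section:
        # switch printing on only if it names the wanted script
        /^[0-9][0-9-]* [0-9][0-9:.]* Including .*\.sh$/ {
            printing = ( index( $0, "Including " script ) > 0 )
        }
        printing
    ' "$logfile"
}
```

For example: extract_script_section /var/log/rear/rear-system.log rescue/GNU/Linux/310_network_devices.sh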
schubiduuu commented at 2018-01-30 13:50:¶
No output for "lspci -vv" and for "lsmod":
Module Size Used by
st 51177 0
sr_mod 23543 0
ide_cd_mod 39518 0
cdrom 51354 2 sr_mod,ide_cd_mod
joydev 16492 0
nfs 583351 0
fscache 85204 1 nfs
lockd 113448 1 nfs
auth_rpcgss 61878 1 nfs
nfs_acl 3888 1 nfs
sunrpc 366268 7 nfs,lockd,auth_rpcgss,nfs_acl
fuse 127337 1
xfs 1077534 7
loop 25404 0
ipv6 1881 1
ipv6_lib 440978 127 ipv6
sg 45603 0
nx_crypto 45618 0
ibmveth 32645 0
ext3 206842 2
jbd 102286 1 ext3
mbcache 10827 1 ext3
dm_mirror 23171 0
dm_region_hash 16423 1 dm_mirror
dm_log 15841 2 dm_mirror,dm_region_hash
linear 6958 0
sd_mod 54931 8
crc_t10dif 1691 1 sd_mod
dm_service_time 4794 5
dm_least_pending 4374 0
dm_queue_length 4402 0
dm_round_robin 3996 0
dm_multipath 30372 8 dm_service_time,dm_least_pending,dm_queue_length,dm_round_robin
scsi_dh_hp_sw 6663 0
scsi_dh_rdac 12199 0
scsi_dh_emc 10476 0
scsi_dh_alua 17206 0
scsi_dh 11199 5 dm_multipath,scsi_dh_hp_sw,scsi_dh_rdac,scsi_dh_emc,scsi_dh_alua
dm_snapshot 47338 0
dm_mod 127248 58 dm_mirror,dm_log,dm_multipath,dm_snapshot
ibmvscsic 37737 16
scsi_transport_srp 8416 1 ibmvscsic
scsi_tgt 17750 1 scsi_transport_srp
scsi_mod 291791 12 st,sr_mod,sg,sd_mod,scsi_dh_hp_sw,scsi_dh_rdac,scsi_dh_emc,scsi_dh_alua,scsi_dh,ibmvscsic,scsi_transport_srp,scsi_tgt
rmetrich commented at 2018-01-30 13:54:¶
You may need to be root for lspci -vv.
You may also try dmidecode > /tmp/dmidecode.txt
and upload the file.
The idea is to know which driver/hardware doesn't provide the data.
schubiduuu commented at 2018-01-30 13:56:¶
I am root on the system and the system says "dmidecode: not found".
When executing "rear -D mkrescue" there are no error messages.
rmetrich commented at 2018-01-30 14:15:¶
Could you upload the log anyway, for me to check.
dmidecode is not installed by default, maybe @jsmeix may help (I have no
SuSE in house :-) )
schubiduuu commented at 2018-01-30 14:18:¶
rmetrich commented at 2018-01-30 14:41:¶
Sorry, I missed a "operstate" use just above.
function get_interface_state () {
    local network_interface=$1
    local sysfspath=/sys/class/net/$network_interface
    cat $sysfspath/operstate
}
should be replaced by
function get_interface_state () {
    local network_interface=$1
    local sysfspath=/sys/class/net/$network_interface
    # Derive the state from 'carrier'; an unreadable carrier counts as 0 (down).
    # The state is printed (not returned) because 'return' only accepts numbers.
    if [ $( cat $sysfspath/carrier 2>/dev/null || echo 0 ) -eq 0 ] ; then
        echo "down"
    else
        echo "up"
    fi
}
rmetrich commented at 2018-01-30 14:42:¶
With that, in the log file, you should see something like the following:
Was "unknown":
++ code='ip link set dev eth0 unknown
ip link set dev eth0 mtu 9000'
Should then be "up":
++ code='ip link set dev eth0 up
ip link set dev eth0 mtu 9000'
schubiduuu commented at 2018-01-30 14:59:¶
Do you mean the log file used by "rear -D mkrescue"?
rmetrich commented at 2018-01-30 15:16:¶
yes, the one you uploaded 1 hour ago.
we can see "ip link set dev ... unknown" in it.
With the latest "fix", this should disappear and you should get "up"
instead.
I'll then produce a real patch once we know more about a reliable way of
getting the interface status.
Hence my question about the hardware/driver used.
jsmeix commented at 2018-01-30 15:27:¶
Regarding dmidecode:
On my SLES11 x86 system I have:
# type -a dmidecode
dmidecode is /usr/sbin/dmidecode
# rpm -qf /usr/sbin/dmidecode
pmtools-20071116-44.33.1
I don't know if dmidecode is also available for ppc64 architecture.
rmetrich commented at 2018-01-30 16:06:¶
Didn't notice the ppc64 arch, may be a hint
schubiduuu commented at 2018-01-31 06:31:¶
Output of grep "" /sys/class/net/eth0/* after executing "ip link set dev eth0 down":
/sys/class/net/eth0/addr_assign_type:0
/sys/class/net/eth0/addr_len:6
/sys/class/net/eth0/address:f6:44:7d:92:b2:04
/sys/class/net/eth0/broadcast:ff:ff:ff:ff:ff:ff
grep: /sys/class/net/eth0/carrier: Invalid argument
/sys/class/net/eth0/dev_id:0x0
/sys/class/net/eth0/dev_port:0
grep: /sys/class/net/eth0/dormant: Invalid argument
grep: /sys/class/net/eth0/duplex: Invalid argument
/sys/class/net/eth0/features:0x20114813
/sys/class/net/eth0/flags:0x1002
/sys/class/net/eth0/ifindex:2
/sys/class/net/eth0/iflink:2
/sys/class/net/eth0/link_mode:0
/sys/class/net/eth0/mtu:9000
/sys/class/net/eth0/netdev_group:0
/sys/class/net/eth0/operstate:down
grep: /sys/class/net/eth0/speed: Invalid argument
/sys/class/net/eth0/tx_queue_len:1000
/sys/class/net/eth0/type:1
/sys/class/net/eth0/uevent:INTERFACE=eth0
/sys/class/net/eth0/uevent:IFINDEX=2
jsmeix commented at 2018-01-31 08:50:¶
@schubiduuu
FYI how you could work around this issue for now:
Usually you do not need the full networking setup of your original
system in the ReaR recovery system as well.
In general the recovery system needs only a minimal networking setup
that is just sufficient to access your backup from within the running
recovery system (with sufficient throughput if your backup is big)
so that your backup can be restored during "rear recover".
If ReaR's automated networking setup for ReaR's recovery system
does not work, you can manually specify the networking setup commands
that you need in your particular case in your recovery system via the
NETWORKING_PREPARATION_COMMANDS
config variable, see usr/share/rear/conf/default.conf for documentation.
In this particular case you additionally need to modify
usr/share/rear/rescue/GNU/Linux/310_network_devices.sh
so that it returns early after its very basic initial stuff,
to avoid that it later fails with BugError. Add an early return 0
e.g. after the "If IPADDR=1.2.3.4 has been defined at boot time" part like
# If IPADDR=1.2.3.4 has been defined at boot time via ip=1.2.3.4
...
EOT
# FIXME: Early 'return' to avoid https://github.com/rear/rear/issues/1701
return 0
# Detect whether 'readlink' supports multiple filenames or not
Furthermore if you need additional special commands to be run
in the ReaR recovery system (e.g. launching a special service)
you can specify that via the config variable
PRE_RECOVERY_SCRIPT
see usr/share/rear/conf/default.conf for documentation.
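As an illustration of the NETWORKING_PREPARATION_COMMANDS approach described above, a local.conf entry could look like the following sketch. The address and interface name are taken from the ip a output posted in this thread; the exact set of commands is an assumption, and the authoritative description of the variable is in usr/share/rear/conf/default.conf.

```shell
# /etc/rear/local.conf (fragment, illustrative only)
# Each array element is one command that is run in the recovery system
# to set up networking manually instead of relying on the generated setup.
# Address and interface are the ones shown for eth0 in this issue.
NETWORKING_PREPARATION_COMMANDS=( 'ip link set dev eth0 up'
                                  'ip addr add 10.224.0.14/25 dev eth0' )
```

A default route would have to be added the same way if the backup server is not in the same subnet; no gateway address appears in this thread, so none is shown here.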
rmetrich commented at 2018-01-31 08:53:¶
@schubiduuu @jsmeix I'll submit a patch later today (or tomorrow), based
on operstate or carrier availability.
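The idea of combining both attributes can be sketched as follows. This is not the patch that was eventually merged into ReaR, only an illustration built from the sysfs dumps in this thread: operstate is trusted when it says up or down, otherwise carrier decides. The function name link_is_up and its optional second argument (a sysfs base directory, added only so the function can be tried against a test directory) are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the code merged into ReaR.
# 'link_is_up' is a hypothetical name; the optional second argument
# (sysfs base directory) exists only to make the function testable.
link_is_up () {
    local network_interface=$1
    local sysfspath=${2:-/sys/class/net}/$network_interface
    local state carrier
    state=$( cat "$sysfspath/operstate" 2>/dev/null )
    case "$state" in
        (up) return 0 ;;
        (down) return 1 ;;
    esac
    # operstate is 'unknown' (or unreadable): fall back to carrier.
    # In this issue reading carrier failed with 'Invalid argument' after
    # the link was set down, so a failed read counts as "down".
    carrier=$( cat "$sysfspath/carrier" 2>/dev/null ) || return 1
    [ "$carrier" = "1" ]
}
```

With the values reported in this issue (operstate "unknown", carrier "1"), link_is_up eth0 would succeed; after ip link set dev eth0 down (operstate "down") it would fail.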
schubiduuu commented at 2018-01-31 13:04:¶
@jsmeix: Even if I don't need the full network configuration of the source system, I should be able to change manually the IP config with commands like ifconfig. When trying to change the IP address or changing the operational state of eth0 it keeps telling me: SIOCSIFFLAGS: Invalid argument.
jsmeix commented at 2018-01-31 14:50:¶
@schubiduuu
do you mean that in the running recovery system
manual network setup with "usual commands"
does not work
while in contrast those same "usual commands"
work in the original system?
schubiduuu commented at 2018-01-31 15:38:¶
I can use the usual commands while the source system is running but not in the recovery environment.
schubiduuu commented at 2018-02-02 08:40:¶
Should I start another issue for the nonfunctional network commands?
rmetrich commented at 2018-02-02 08:55:¶
What exactly do you mean by "nonfunctional network commands"?
Please provide some examples and context (in the source system, or
within recovery)
schubiduuu commented at 2018-02-02 09:00:¶
Within the recovery environment, after booting via a rear ISO, I am not able to make any changes to the network configuration. Neither ifconfig nor ip works, as I always get this error message: SIOCSIFFLAGS: Invalid argument
rmetrich commented at 2018-02-02 09:30:¶
Then, yes, this looks like a different issue, unless the network script,
when running, sets up something bad (but I doubt it).
To make sure, please boot the recovery with "ip=1.2.3.4", this will then
bypass the script.
If the issue persists, there is for sure another issue, likely related
to some missing driver.
schubiduuu commented at 2018-02-02 09:32:¶
Ok and I did this change but still get the same error. The strange thing about this issue is that I was able to use an older rear version for some time but then it didn't work suddenly.
rmetrich commented at 2018-02-02 09:34:¶
Try comparing the modules loaded (lsmod) and please give some examples
of the commands you run.
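Such a comparison can be automated once both lsmod outputs have been saved to files. The following sketch lists the modules that are loaded on the source system but not in the recovery system; the helper name missing_modules and the file names in the usage note are made up for illustration.

```shell
#!/usr/bin/env bash
# List modules that appear in the first saved 'lsmod' output
# but not in the second one.
# 'missing_modules' is a hypothetical helper, not part of ReaR.
# $1 = lsmod output of the source system, $2 = lsmod output of the recovery system
missing_modules () {
    local source_lsmod=$1 recovery_lsmod=$2
    awk '
        FNR == 1 { next }                   # skip the "Module Size Used by" header
        NR == FNR { loaded[$1] = 1; next }  # first file: modules in the recovery system
        !( $1 in loaded ) { print $1 }      # second file: source modules not seen above
    ' "$recovery_lsmod" "$source_lsmod" | sort
}
```

For example, after lsmod > /tmp/lsmod.source on the original system and lsmod > /tmp/lsmod.recovery in the recovery system: missing_modules /tmp/lsmod.source /tmp/lsmod.recovery. The sketch assumes both files contain at least the lsmod header line.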
schubiduuu commented at 2018-02-02 09:47:¶
lsmod on source system:
Module Size Used by
st 51177 0
sr_mod 23543 0
ide_cd_mod 39518 0
cdrom 51354 2 sr_mod,ide_cd_mod
joydev 16492 0
nfs 583351 0
fscache 85204 1 nfs
lockd 113448 1 nfs
auth_rpcgss 61878 1 nfs
nfs_acl 3888 1 nfs
sunrpc 366268 7 nfs,lockd,auth_rpcgss,nfs_acl
fuse 127337 1
xfs 1077534 7
loop 25404 0
ipv6 1881 1
ipv6_lib 440978 127 ipv6
sg 45603 0
nx_crypto 45618 0
ibmveth 32645 0
ext3 206842 2
jbd 102286 1 ext3
mbcache 10827 1 ext3
dm_mirror 23171 0
dm_region_hash 16423 1 dm_mirror
dm_log 15841 2 dm_mirror,dm_region_hash
linear 6958 0
sd_mod 54931 8
crc_t10dif 1691 1 sd_mod
dm_service_time 4794 5
dm_least_pending 4374 0
dm_queue_length 4402 0
dm_round_robin 3996 0
dm_multipath 30372 8 dm_service_time,dm_least_pending,dm_queue_length,dm_round_robin
scsi_dh_hp_sw 6663 0
scsi_dh_rdac 12199 0
scsi_dh_emc 10476 0
scsi_dh_alua 17206 0
scsi_dh 11199 5 dm_multipath,scsi_dh_hp_sw,scsi_dh_rdac,scsi_dh_emc,scsi_dh_alua
dm_snapshot 47338 0
dm_mod 127248 58 dm_mirror,dm_log,dm_multipath,dm_snapshot
ibmvscsic 37737 16
scsi_transport_srp 8416 1 ibmvscsic
scsi_tgt 17750 1 scsi_transport_srp
scsi_mod 291791 12 st,sr_mod,sg,sd_mod,scsi_dh_hp_sw,scsi_dh_rdac,scsi_dh_emc,scsi_dh_alua,scsi_dh,ibmvscsic,scsi_transport_srp,scsi_tgt
lsmod on recovery system:
Module Size Used by
ipv6 1881 1
ipv6_lib 440978 137 ipv6
dm_mod 127248 0
nx_crypto 45618 0
ibmveth 32645 0
ibmvscsic 37737 0
scsi_transport_srp 8416 1 ibmvscsic
scsi_tgt 17750 1 scsi_transport_srp
scsi_mod 291791 3 ibmvscsic,scsi_transport_srp,scsi_tgt
example of executed command:
ifconfig -a -s eth0 1.2.3.4/24
jsmeix commented at 2018-02-06 09:36:¶
By default during "rear mkrescue/mkbackup" at least those
kernel modules that are currently loaded on the original system
get included in the recovery system.
But by default not all those currently loaded modules on the original
system will also get (automatically) loaded in the running recovery system.
See the description of the config variables
MODULES and MODULES_LOAD
in usr/share/rear/conf/default.conf
To be more on the safe side you may include all
files in /lib/modules/$KERNEL_VERSION
in the recovery system by using
MODULES=( 'all_modules' )
cf.
https://github.com/rear/rear/issues/1202
and
https://github.com/rear/rear/issues/1355
which show examples of awkward unexpected failures
when needed kernel modules are missing.
If you use special hardware that may even
need special kernel modules outside of
/lib/modules/$KERNEL_VERSION
you must use something like
COPY_AS_IS=( "${COPY_AS_IS[@]:-}" /path/to/special/modules )
to also get your special kernel modules included
in your recovery system.
To enforce loading of kernel modules during startup
of the recovery system use something like
MODULES_LOAD=( first_module second_module ... )
If you use special hardware that needs firmware
(e.g. some network interface cards need firmware)
you may additionally have to get all firmware files
included in your recovery system, see the
FIRMWARE_FILES config variable in
usr/share/rear/conf/default.conf
and note the special exception about ppc64
so that you may have to enforce via
FIRMWARE_FILES=( 'yes' )
to get all files from the /lib*/firmware/ directories
included in the recovery system.
But both
MODULES=( 'all_modules' )
and even more
FIRMWARE_FILES=( 'yes' )
will increase the size of the recovery system
which is stored in ReaR's initrd.
But at least on some POWER architecture systems
with some POWER architecture specific bootloaders
(e.g. the yaboot bootloader)
the maximum size of the initrd is rather small, see the
REAR_INITRD_COMPRESSION config variable
in usr/share/rear/conf/default.conf
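Put together, the settings named in this comment could appear in local.conf as in the following sketch. The values for MODULES, FIRMWARE_FILES and REAR_INITRD_COMPRESSION are the ones mentioned in this thread. The MODULES_LOAD entry is only a placeholder taken from the lsmod output above (ibmveth was in fact already loaded in the recovery system of this issue), so it would have to be replaced by whatever module is actually missing.

```shell
# /etc/rear/local.conf (fragment, illustrative only)

# Include all kernel modules from /lib/modules/$KERNEL_VERSION
MODULES=( 'all_modules' )

# Enforce loading of particular modules during recovery system startup
# (placeholder value; use the module that is really needed)
MODULES_LOAD=( ibmveth )

# Include all firmware files (note the ppc64 exception described in default.conf)
FIRMWARE_FILES=( 'yes' )

# Stronger initrd compression to offset the increased recovery system size
REAR_INITRD_COMPRESSION="lzma"
```

See usr/share/rear/conf/default.conf for the authoritative description of each variable and its supported values.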
schubiduuu commented at 2018-02-06 10:48:¶
I used MODULES=( 'all_modules' ) and FIRMWARE_FILES=( 'yes' ) in the
local.conf.
To reduce the size of initrd I used "REAR_INITRD_COMPRESSION="lzma"".
Unfortunately this didn't help. Did I do anything wrong?
jsmeix commented at 2018-02-06 13:10:¶
As in
https://github.com/rear/rear/issues/1724#issuecomment-363413974
I guess (but I am not at all a sufficient networking expert
to make an authoritative decision here) that this issue
depends on "special hardware" - probably not on the ppc64
architecture because @schabrolles tested ReaR a lot
on POWER architecture - but perhaps more likely on special
networking hardware (e.g. special network interface card).
jsmeix commented at 2018-02-07 12:56:¶
@rmetrich
in addition to your
https://github.com/rear/rear/pull/1719
I did
https://github.com/rear/rear/pull/1725
that also fixes and enhances some minor issues
regarding networking setup in the recovery system.
I would appreciate it if you could have a look at
my
https://github.com/rear/rear/pull/1725
whether or not it looks o.k. to you.
rmetrich commented at 2018-02-07 13:25:¶
@jsmeix this looks good to me, but I will not be able to test in the coming days.
jsmeix commented at 2018-02-08 07:58:¶
With
https://github.com/rear/rear/pull/1719
merged
this particular issue here should be fixed.
jsmeix commented at 2018-02-08 08:05:¶
@rmetrich
many thanks for your analysis what the root cause is
and for your enhancement that makes ReaR even work
with network drivers that do not set
/sys/class/net/INTERFACE/operstate
to one of the usually expected values 'up' or 'down' which is
a nice example of ReaR's "Dirty hacks welcome" style in
https://github.com/rear/rear/wiki/Coding-Style
jsmeix commented at 2018-02-27 10:41:¶
According to
https://github.com/rear/rear/issues/1741
this issue
BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306: 'Unexpected operational state 'unknown' for ...
could have also happened on non-POWER architecture,
in that case on "Ubuntu 16.04.4 LTS" as
BUG in /usr/share/rear/rescue/GNU/Linux/310_network_devices.sh line 306: 'Unexpected operational state 'unknown' for 'vnet0'.'
[Export of Github issue for rear/rear.]