#2142 PR merged: support for s390 (zLinux) arch

Labels: enhancement, fixed / solved / done

mutable-dan opened issue at 2019-05-09 15:33:

Pull Request Details:
  • Testing against s390 RHEL 7, using mkrescue
  • Not tested against SLES > 11 but will be
  • Details:
    Start of process to add s390 support to ReaR for RHEL. This pull adds some support to identify and verify the boot partition and to identify dependencies. Not tested with SLES, which boots to zipl and hands off to GRUB.

mutable-dan commented at 2019-05-09 19:14:

Fixed a syntax error in 950_verify_disklayout_file.sh after the pull.
It looks like the fix made it into the pull.

jsmeix commented at 2019-05-10 08:23:

@mutable-dan
thank you for your early pull request!

I added some comments about what I recognized
from plain looking at the code.

jsmeix commented at 2019-05-17 13:57:

@mutable-dan
many thanks for all your work here - it is much appreciated.
As time permits I will try to have a closer look next week.
I am still struggling with getting used to using an IBM Z system,
e.g. the x3270 terminal emulator, XEDIT, #cp, and things like that
(I feel I drop one clanger after another)...

jsmeix commented at 2019-05-21 14:53:

Here are more of the changes that I did (as an IBM Z noob)
in my (currently basically blind) attempt to move forward a bit:

Because OUTPUT=ISO does not make sense on IBM Z
I created a new file usr/share/rear/output/IPL/Linux-s390/800_create_ipl.sh

# Create the 'initial program' to boot/load the ReaR recovery system
# on IBM Z via IPL (initial program load)
LogPrint "Creating initial program for IPL on IBM Z"
RESULT_FILES+=( $KERNEL_FILE $TMP_DIR/$REAR_INITRD_FILENAME )

which is currently an almost empty dummy that only causes
the kernel and ReaR's initrd to be copied to wherever
the RESULT_FILES are normally copied.

Now I can use a new tentative output method OUTPUT=IPL
(we can rename or change all that as we like).

This way I get the currently used kernel of my running original system
and ReaR's created initrd that contains the ReaR recovery system
copied to my NFS server (I use BACKUP=NETFS).
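The RESULT_FILES flow behind this can be sketched roughly as follows. This is a simplified illustration using temporary stand-in paths, not ReaR's actual code: output scripts append files to the RESULT_FILES array, and 950_copy_result_files.sh later copies everything in the array to the output location.

```shell
#!/bin/bash
# Simplified stand-in for the RESULT_FILES mechanism (not ReaR's code):
TMP_DIR="$(mktemp -d)"
OUTPUT_DIR="$(mktemp -d)"   # stands in for the NFS output location
RESULT_FILES=()

# What 800_create_ipl.sh effectively does: register kernel and initrd.
echo kernel > "$TMP_DIR/image"
echo initrd > "$TMP_DIR/initrd.cgz"
RESULT_FILES+=( "$TMP_DIR/image" "$TMP_DIR/initrd.cgz" )

# What 950_copy_result_files.sh effectively does with the array:
cp "${RESULT_FILES[@]}" "$OUTPUT_DIR/"
ls "$OUTPUT_DIR"
```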

But I also needed some generic improvements and fixes in
usr/share/rear/output/default/950_copy_result_files.sh
to make things work for me:

--- usr/share/rear/output/default/950_copy_result_files.sh.original     2019-05-20 13:50:22.938205486 +0200
+++ usr/share/rear/output/default/950_copy_result_files.sh      2019-05-21 15:32:50.638205486 +0200
@@ -1,39 +1,40 @@
 #
 # copy resulting files to network output location

+# For example for "rear mkbackuponly" there are usually no result files
+# that would need to be copied here to the network output location:
+test "$RESULT_FILES" || return 0
+
 local scheme=$( url_scheme $OUTPUT_URL )
 local host=$( url_host $OUTPUT_URL )
 local path=$( url_path $OUTPUT_URL )
 local opath=$( output_path $scheme $path )

 # if $opath is empty return silently (e.g. scheme tape)
-if [[ -z "$opath" || -z "$OUTPUT_URL" || "$scheme" == "obdr" || "$scheme" == "tape" ]]; then
+if [[ -z "$opath" || -z "$OUTPUT_URL" || "$scheme" == "obdr" || "$scheme" == "tape" ]] ; then
     return 0
 fi

 LogPrint "Copying resulting files to $scheme location"

 echo "$VERSION_INFO" >"$TMP_DIR/VERSION" || Error "Could not create $TMP_DIR/VERSION file"
-if test -s $(get_template "RESULT_usage_$OUTPUT.txt") ; then
-    cp $v $(get_template "RESULT_usage_$OUTPUT.txt") "$TMP_DIR/README" >&2
-    StopIfError "Could not copy '$(get_template RESULT_usage_$OUTPUT.txt)'"
+RESULT_FILES+=( "$TMP_DIR/VERSION" )
+
+local usage_readme_file=$( get_template "RESULT_usage_$OUTPUT.txt" )
+if test -s $usage_readme_file ; then
+    cp $v $usage_readme_file "$TMP_DIR/README" || Error "Could not copy $usage_readme_file to $TMP_DIR/README"
+    RESULT_FILES+=( "$TMP_DIR/README" )
 fi

 # Usually RUNTIME_LOGFILE=/var/log/rear/rear-$HOSTNAME.log
 # The RUNTIME_LOGFILE name is set by the main script from LOGFILE in default.conf
 # but later user config files are sourced in the main script where LOGFILE can be set different
 # so that the user config LOGFILE basename is used as final logfile name:
-final_logfile_name=$( basename $LOGFILE )
+local final_logfile_name=$( basename $LOGFILE )
 cat "$RUNTIME_LOGFILE" > "$TMP_DIR/$final_logfile_name" || Error "Could not copy $RUNTIME_LOGFILE to $TMP_DIR/$final_logfile_name"
+RESULT_FILES+=( "$TMP_DIR/$final_logfile_name" )
 LogPrint "Saving $RUNTIME_LOGFILE as $final_logfile_name to $scheme location"

-# Add the README, VERSION and the final logfile to the RESULT_FILES array
-RESULT_FILES=( "${RESULT_FILES[@]}" "$TMP_DIR/VERSION" "$TMP_DIR/README" "$TMP_DIR/$final_logfile_name" )
-
-# For example for "rear mkbackuponly" there are usually no result files
-# that would need to be copied here to the network output location:
-test "$RESULT_FILES" || return 0
-
 # The real work (actually copying resulting files to the network output location):
 case "$scheme" in
     (nfs|cifs|usb|file|sshfs|ftpfs|davfs)

With these changes and this etc/rear/local.conf

OUTPUT=IPL
BACKUP=NETFS
BACKUP_OPTIONS="nfsvers=3,nolock"
BACKUP_URL=nfs://10.160.67.243/nfs
SSH_ROOT_PASSWORD="rear"
USE_DHCLIENT="yes"
KEEP_BUILD_DIR="yes"
FIRMWARE_FILES=( 'no' )
PROGRESS_MODE="plain"
PROGRESS_WAIT_SECONDS="5"

"rear -D mkbackup" does this for me

# usr/sbin/rear -D mkbackup
Relax-and-Recover 2.4 / Git
Running rear mkbackup (PID 2922)
Using log file: /root/rear.pull2142/var/log/rear/rear-linux-coqs.log
Using backup archive '/tmp/rear.mj7lzA4sKBwh903/outputfs/linux-coqs/backup.tar.gz'
Using autodetected kernel '/boot/image-4.12.14-94.41-default' as kernel in the recovery system
Creating disk layout
Using sysconfig bootloader 'grub2'
Verifying that the entries in /root/rear.pull2142/var/lib/rear/layout/disklayout.conf are correct ...
Creating root filesystem layout
Handling network interface 'eth0'
eth0 is a physical device
Handled network interface 'eth0'
Copying logfile /root/rear.pull2142/var/log/rear/rear-linux-coqs.log into initramfs as '/tmp/rear-linux-coqs-partial-2019-05-21T16:44:13+02:00.log'
Copying files and directories
Copying binaries and libraries
Copying all kernel modules in /lib/modules/4.12.14-94.41-default (MODULES contains 'all_modules')
Omit copying files in /lib*/firmware/ (FIRMWARE_FILES='no')
Skip copying broken symlink '/etc/mtab' target '/proc/11599/mounts' on /proc/ /sys/ /dev/ or /run/
Testing that the recovery system in /tmp/rear.mj7lzA4sKBwh903/rootfs contains a usable system
Creating recovery/rescue system initramfs/initrd initrd.cgz with gzip default compression
Created initrd.cgz with gzip default compression (75587348 bytes) in 13 seconds
Creating initial program for IPL on IBM Z
Copying resulting files to nfs location
Saving /root/rear.pull2142/var/log/rear/rear-linux-coqs.log as rear-linux-coqs.log to nfs location
Copying result files '/boot/image-4.12.14-94.41-default /tmp/rear.mj7lzA4sKBwh903/tmp/initrd.cgz /tmp/rear.mj7lzA4sKBwh903/tmp/VERSION /tmp/rear.mj7lzA4sKBwh903/tmp/rear-linux-coqs.log' to /tmp/rear.mj7lzA4sKBwh903/outputfs/linux-coqs at nfs location
Creating tar archive '/tmp/rear.mj7lzA4sKBwh903/outputfs/linux-coqs/backup.tar.gz'
Preparing archive operation
Archived 50 MiB [avg 7442 KiB/sec] 
...
Archived 990 MiB [avg 4737 KiB/sec] 
OK
Archived 990 MiB in 219 seconds [avg 4629 KiB/sec]
Exiting rear mkbackup (PID 2922) and its descendant processes ...
Running exit tasks
You should also rm -Rf /tmp/rear.mj7lzA4sKBwh903

and it results in these files on my NFS server

# ls -lhtr /nfs/linux-coqs
total 1.1G
-rw-rw-rw- 1 nobody nobody   13M May 21 16:44 image-4.12.14-94.41-default
-rw-rw-rw- 1 nobody nobody   73M May 21 16:44 initrd.cgz
-rw-rw-rw- 1 nobody nobody   267 May 21 16:44 VERSION
-rw-rw-rw- 1 nobody nobody  2.7M May 21 16:44 rear-linux-coqs.log
-rw-rw-rw- 1 nobody nobody 1013M May 21 16:48 backup.tar.gz
-rw-rw-rw- 1 nobody nobody  9.6M May 21 16:48 backup.log

So now I have the kernel, the ReaR recovery system initrd, and the backup
out of the original system and safely stored on my NFS server.

My next step is now to learn how I could "boot" (i.e. IPL)
an IBM Z system with that kernel and ReaR's initrd,
or to learn that I need to do things totally differently...

mutable-dan commented at 2019-05-21 19:28:

Here are more of the changes that I did (as an IBM Z noob)
in my (currently basically blind) attempt to move forward a bit:

Thanks, will merge your changes. On RHEL, there are errors in the networking... which I am looking at.

I will make the changes regarding partition info (above) and push them. Then I will continue with the networking problems. There is code for copying kernel modules (for the network (and dasd) drivers for s390) which I am not following 100%. I made some very small changes to add directories with modules.

jsmeix commented at 2019-05-22 06:58:

@mutable-dan
in general regarding "copying kernel modules" i.e. what kernel modules
will be copied into the ReaR recovery system:

Since the current ReaR 2.5 release by default all kernel modules
will be copied into the ReaR recovery system, see in the
"Release Notes for Relax-and-Recover version 2.5" at
http://relax-and-recover.org/documentation/release-notes-2-5
the part about

Version 2.5 (May 2019)

Abstract

New features, bigger enhancements, and possibly backward incompatible changes:

...

Now there is in default.conf MODULES=( 'all_modules' )
which means that now by default all kernel modules
get included in the recovery system (issue #2041).
Usually this is required when ...
Additionally it makes the recovery system better prepared
when ... (issue #1870) ... (issue #1202) ...
Furthermore this is helpful to be on the safe side ... (issue #1355) ...

and see the issues mentioned therein for background information.

Therefore since ReaR 2.5 you should not need to worry about
possibly missing kernel modules in the recovery system.
If on IBM Z architecture some kernel modules are missing
in the recovery system it would be a bug in the current
usr/share/rear/build/GNU/Linux/400_copy_modules.sh
https://raw.githubusercontent.com/rear/rear/master/usr/share/rear/build/GNU/Linux/400_copy_modules.sh
that would have to be fixed.
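The effect of the MODULES=( 'all_modules' ) default can be sketched like this. It is a simplified stand-in using temporary directories; the real logic lives in build/GNU/Linux/400_copy_modules.sh:

```shell
#!/bin/bash
# Simplified sketch of the MODULES=( 'all_modules' ) default: when
# 'all_modules' is requested, the whole kernel module tree is copied
# into the recovery system, so e.g. s390 dasd drivers cannot be missed.
# Both directories are temporary stand-ins, not the real paths.
MODULE_TREE="$(mktemp -d)"   # stands in for /lib/modules/$KERNEL_VERSION
ROOTFS_DIR="$(mktemp -d)"    # stands in for the recovery system rootfs
mkdir -p "$MODULE_TREE/kernel/drivers/s390/block"
touch "$MODULE_TREE/kernel/drivers/s390/block/dasd_mod.ko"

MODULES=( 'all_modules' )
case " ${MODULES[*]} " in
    (*" all_modules "*)
        # Copy the entire module tree, no per-directory selection:
        mkdir -p "$ROOTFS_DIR/lib/modules"
        cp -a "$MODULE_TREE" "$ROOTFS_DIR/lib/modules/"
        ;;
esac
find "$ROOTFS_DIR" -name '*.ko'
```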

jsmeix commented at 2019-05-22 10:07:

Here are some more of my further steps to move forward a bit:

I use two identical IBM Z test systems, each on z/VM 5.4.

On the first one (my "original system") with hostname s390vsl179
I did "rear mkbackup" as described in my above
https://github.com/rear/rear/pull/2142#issuecomment-494426331

On the second one (my "replacement hardware") with hostname s390vsl180
I try to do the "rear recover".

I learned that IPLing the kernel and ReaR's initrd that I saved on my NFS server
on an empty IBM Z system is an advanced task, which I postpone learning for later.

The easier way I found to "boot" my kernel and ReaR's initrd
on an IBM Z system is to "boot" them via kexec from within a
normally running IBM Z system that runs SLES12-SP4.

I.e. on my second IBM Z system with hostname s390vsl180
I have SLES12-SP4 normally running and there I copied
my kernel and ReaR's initrd from my NFS server
so that I have both locally as those files

/root/s390vsl179/initrd.cgz
/root/s390vsl179/image-4.12.14-94.41-default

Then I did

# sync

# kexec -l /root/s390vsl179/image-4.12.14-94.41-default --initrd=/root/s390vsl179/initrd.cgz --command-line=root=/dev/ram0 rw vga=normal hvc_iucv=8 TERM=dumb

# kexec -e

kexec -e does an immediate hard boot of the new kernel and initrd,
which is why I ran sync beforehand to be on the safe side,
so that I could still reboot the system from disk if kexec -e failed.
But for me kexec -e just worked (that was really unexpected).

I got the --command-line parameters for the kexec -l call from
a combination of what /proc/cmdline shows on my original
system s390vsl179, wherefrom I took

hvc_iucv=8 TERM=dumb

plus what /proc/cmdline shows from another booted
ReaR recovery system on x86 hardware wherefrom I took

root=/dev/ram0 rw vga=normal

but I did something wrong with the kexec -l call
(probably I forgot proper quoting of the command-line parameters)
because in the ReaR recovery system booted via kexec -e I get

RESCUE s390vsl179:~ # cat /proc/cmdline 
root=/dev/ram0
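The lost parameters do look like exactly that quoting issue: without quotes the shell splits the --command-line value at spaces, so kexec receives only the first word. A small stand-in function (fake_kexec is purely illustrative, not the real kexec) shows the difference:

```shell
#!/bin/bash
# fake_kexec just prints what it receives as the --command-line
# argument, to show how the shell splits unquoted values.
fake_kexec() {
    local arg
    for arg in "$@" ; do
        case $arg in
            (--command-line=*) echo "${arg#--command-line=}" ;;
        esac
    done
}

# Unquoted: only 'root=/dev/ram0' arrives as the command line;
# 'rw vga=normal' become separate, ignored arguments - matching
# the /proc/cmdline output seen above.
fake_kexec -l image --initrd=initrd.cgz --command-line=root=/dev/ram0 rw vga=normal

# Quoted: the whole kernel command line arrives as one argument.
fake_kexec -l image --initrd=initrd.cgz --command-line="root=/dev/ram0 rw vga=normal hvc_iucv=8 TERM=dumb"
```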

After the kexec -e I used the x3270 terminal to log in
at that newly booted system to find out what its IP address is.
For me it is fortunately still the same as before so that I could
log in via ssh to the running ReaR recovery system on s390vsl179
because I use in my etc/rear/local.conf

SSH_ROOT_PASSWORD="rear"
USE_DHCLIENT="yes"

cf. https://github.com/rear/rear/pull/2142#issuecomment-494426331
The DHCP server that I happen to use in my particular environment
does fortunately "the right thing" for me (otherwise I would have had
to do some manual networking setup from within the x3270 terminal).

In the ReaR recovery system I called "rear -D recover"

RESCUE s390vsl179:~ # rear -D recover
Relax-and-Recover 2.4 / Git
Running rear recover (PID 1021)
Using log file: /var/log/rear/rear-s390vsl179.log
Running workflow recover within the ReaR rescue/recovery system
Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if available.
Started RPC portmapper 'rpcbind'.
RPC portmapper 'rpcbind' available.
Started rpc.statd.
RPC status rpc.statd available.
Using backup archive '/tmp/rear.7zf40H5a5ZjPj1S/outputfs/s390vsl179/backup.tar.gz'
Calculating backup archive size
Backup archive size is 672M     /tmp/rear.7zf40H5a5ZjPj1S/outputfs/s390vsl179/backup.tar.gz (compressed)
Comparing disks
Device dasda has expected (same) size 7385333760 (will be used for recovery)
Disk configuration looks identical
UserInput -I DISK_LAYOUT_PROCEED_RECOVERY needed in /usr/share/rear/layout/prepare/default/250_compare_disks.sh line 148
Proceed with recovery (yes) otherwise manual disk layout configuration is enforced
(default 'yes' timeout 30 seconds)

UserInput: No real user input (empty or only spaces) - using default input
UserInput: No choices - result is 'yes'
User confirmed to proceed with recovery
Partition rear-noname on /dev/dasda: size reduced to fit on disk.
Start system layout restoration.
Creating partitions for disk /dev/dasda (dasd)
UserInput -I LAYOUT_CODE_RUN needed in /usr/share/rear/layout/recreate/default/200_run_layout_code.sh line 127
The disk layout recreation script failed
1) Rerun disk recreation script (/var/lib/rear/layout/diskrestore.sh)
2) View 'rear recover' log file (/var/log/rear/rear-s390vsl179.log)
3) Edit disk recreation script (/var/lib/rear/layout/diskrestore.sh)
4) View original disk space usage (/var/lib/rear/layout/config/df.txt)
5) Use Relax-and-Recover shell and return back to here
6) Abort 'rear recover'
(default '1' timeout 300 seconds)
6
UserInput: Valid choice number result 'Abort 'rear recover''
ERROR: User chose to abort 'rear recover' in /usr/share/rear/layout/recreate/default/200_run_layout_code.sh
Some latest log messages since the last called script 200_run_layout_code.sh:
  2019-05-22 11:32:22.413374182 4) View original disk space usage (/var/lib/rear/layout/config/df.txt)
  2019-05-22 11:32:22.414657490 5) Use Relax-and-Recover shell and return back to here
  2019-05-22 11:32:22.415920179 6) Abort 'rear recover'
  2019-05-22 11:32:22.417189991 (default '1' timeout 300 seconds)
  2019-05-22 11:32:37.982610365 UserInput: 'read' got as user input '6'
  2019-05-22 11:32:37.985135427 UserInput: Valid choice number result 'Abort 'rear recover''
  2019-05-22 11:32:37.986851756 Error detected during restore.
  2019-05-22 11:32:37.987994430 Restoring saved original /var/lib/rear/layout/disklayout.conf
Aborting due to an error, check /var/log/rear/rear-s390vsl179.log for details
Exiting rear recover (PID 1021) and its descendant processes ...
Running exit tasks
You should also rm -Rf /tmp/rear.7zf40H5a5ZjPj1S
Terminated

so ReaR works in general, but as expected this very first
attempt at "rear recover" fails.

My next step is now to analyze the "rear recover" debug log
to find out what went wrong...

jsmeix commented at 2019-05-22 10:16:

Excerpt from the "rear recover" debug log
(i.e. what the diskrestore.sh script does)
that shows why it failed:

++ source /var/lib/rear/layout/diskrestore.sh
+++ LogPrint 'Start system layout restoration.'
+++ Log 'Start system layout restoration.'
+++ echo '2019-05-22 11:32:22.177658372 Start system layout restoration.'
2019-05-22 11:32:22.177658372 Start system layout restoration.
+++ Print 'Start system layout restoration.'
+++ mkdir -p /mnt/local
+++ create_component vgchange rear
+++ local device=vgchange
+++ local type=rear
+++ local touchfile=rear-vgchange
+++ '[' -e /tmp/rear.7zf40H5a5ZjPj1S/tmp/touch/rear-vgchange ']'
+++ return 0
+++ lvm vgchange -a n
+++ component_created vgchange rear
+++ local device=vgchange
+++ local type=rear
+++ local touchfile=rear-vgchange
+++ touch /tmp/rear.7zf40H5a5ZjPj1S/tmp/touch/rear-vgchange
+++ set -e
+++ set -x
+++ create_component /dev/dasda disk
+++ local device=/dev/dasda
+++ local type=disk
+++ local touchfile=disk--dev-dasda
+++ '[' -e /tmp/rear.7zf40H5a5ZjPj1S/tmp/touch/disk--dev-dasda ']'
+++ return 0
+++ Log 'Stop mdadm'
+++ echo '2019-05-22 11:32:22.262948865 Stop mdadm'
2019-05-22 11:32:22.262948865 Stop mdadm
+++ grep -q md /proc/mdstat
+++ Log 'Erasing MBR of disk /dev/dasda'
+++ echo '2019-05-22 11:32:22.264773053 Erasing MBR of disk /dev/dasda'
2019-05-22 11:32:22.264773053 Erasing MBR of disk /dev/dasda
+++ dd if=/dev/zero of=/dev/dasda bs=512 count=1
1+0 records in
1+0 records out
512 bytes copied, 2.8952e-05 s, 17.7 MB/s
+++ sync
+++ LogPrint 'Creating partitions for disk /dev/dasda (dasd)'
+++ Log 'Creating partitions for disk /dev/dasda (dasd)'
+++ echo '2019-05-22 11:32:22.267478178 Creating partitions for disk /dev/dasda (dasd)'
2019-05-22 11:32:22.267478178 Creating partitions for disk /dev/dasda (dasd)
+++ Print 'Creating partitions for disk /dev/dasda (dasd)'
+++ my_udevsettle
+++ has_binary udevadm
+++ for bin in '$@'
+++ type udevadm
+++ return 0
+++ udevadm settle
+++ return 0
+++ parted -s /dev/dasda mklabel dasd
+++ my_udevsettle
+++ has_binary udevadm
+++ for bin in '$@'
+++ type udevadm
+++ return 0
+++ udevadm settle
+++ return 0
+++ my_udevsettle
+++ has_binary udevadm
+++ for bin in '$@'
+++ type udevadm
+++ return 0
+++ udevadm settle
+++ return 0
+++ parted -s /dev/dasda mkpart ''\''dasda1'\''' 98304B 314621951B
parted: invalid token: dasda1
Error: Expecting a file system type.

Those are the parted calls in diskrestore.sh

RESCUE s390vsl179:~ # grep ^parted /root/rear.pull2142/var/lib/rear/layout/diskrestore.sh

parted -s /dev/dasda mklabel dasd >&2
parted -s /dev/dasda mkpart "'dasda1'" 98304B 314621951B >&2
parted -s /dev/dasda mkpart "'dasda2'" 314621952B 838926335B >&2
parted -s /dev/dasda mkpart "'dasda3'" 838926336B 7385198591B >&2

I will now run manual parted commands in the
ReaR recovery system to learn how to correctly
do that for the IBM Z architecture...

jsmeix commented at 2019-05-22 13:55:

Here are the parted commands that were run by YaST
when installing the original system:

# grep ' SystemCmd Executing: ' /var/log/YaST2/y2log-1 | grep -o 'parted .*' | egrep -v ' print | rm '

parted -s --wipesignatures --align=optimal '/dev/dasda' unit cyl mkpart ext2 1 3414"
parted -s --wipesignatures --align=optimal '/dev/dasda' unit cyl mkpart ext2 3414 9103"
parted -s --wipesignatures --align=optimal '/dev/dasda' unit cyl mkpart ext2 9103 80135"

and how the disk looks on the original system:

# parted -s /dev/dasda unit cyl print unit MiB print unit B print

Model: IBM S390 DASD drive (dasd)
Disk /dev/dasda: 80135cyl
Sector size (logical/physical): 512B/4096B
BIOS cylinder,head,sector geometry: 80136,15,12.  Each cylinder is 92.2kB.
Partition Table: dasd
Disk Flags:

Number  Start    End       Size      File system     Flags
 1      1cyl     3413cyl   3412cyl   ext2
 2      3413cyl  9102cyl   5689cyl   linux-swap(v1)
 3      9102cyl  80135cyl  71033cyl  ext4

Model: IBM S390 DASD drive (dasd)
Disk /dev/dasda: 7043MiB
Sector size (logical/physical): 512B/4096B
Partition Table: dasd
Disk Flags:

Number  Start    End      Size     File system     Flags
 1      0.09MiB  300MiB   300MiB   ext2
 2      300MiB   800MiB   500MiB   linux-swap(v1)
 3      800MiB   7043MiB  6243MiB  ext4

Model: IBM S390 DASD drive (dasd)
Disk /dev/dasda: 7385333760B
Sector size (logical/physical): 512B/4096B
Partition Table: dasd
Disk Flags:

Number  Start       End          Size         File system     Flags
 1      98304B      314621951B   314523648B   ext2
 2      314621952B  838926335B   524304384B   linux-swap(v1)
 3      838926336B  7385333759B  6546407424B  ext4

jsmeix commented at 2019-05-22 14:19:

The following parted commands work for me in the recovery system

RESCUE s390vsl179:~# parted -s /dev/dasda mklabel dasd

RESCUE s390vsl179:~# parted -s /dev/dasda mkpart ext2 98304B 314621951B

RESCUE s390vsl179:~# parted -s /dev/dasda mkpart linux-swap 314621952B 838926335B

RESCUE s390vsl179:~# parted -s /dev/dasda mkpart ext4 838926336B 7385333759B

RESCUE s390vsl179:~ # parted -s /dev/dasda unit B print
Model: IBM S390 DASD drive (dasd)
Disk /dev/dasda: 7385333760B
Sector size (logical/physical): 512B/4096B
Partition Table: dasd
Disk Flags:

Number  Start       End          Size         File system  Flags
 1      98304B      314621951B   314523648B   ext2
 2      314621952B  838926335B   524304384B
 3      838926336B  7385333759B  6546407424B

RESCUE s390vsl179:~ # wipefs --all --force /dev/dasda3 || wipefs --all /dev/dasda3 || dd if=/dev/zero of=/dev/dasda3 bs=512 count=1 || true

RESCUE s390vsl179:~ # mkfs -t ext4 -b 4096 -i 16372 -U 6e3eac1a-6b90-450a-87b4-1d4336c43282 -F /dev/dasda3
mke2fs 1.43.8 (1-Jan-2018)
Creating filesystem with 1598244 4k blocks and 400624 inodes
Filesystem UUID: 6e3eac1a-6b90-450a-87b4-1d4336c43282
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

RESCUE s390vsl179:~ # tune2fs  -m 4 -c -1 -i 0d -o user_xattr,acl /dev/dasda3
tune2fs 1.43.8 (1-Jan-2018)
Setting maximal mount count to -1
Setting interval between checks to 0 seconds
Setting reserved blocks percentage to 4% (63929 blocks)

RESCUE s390vsl179:~ # mkdir -p /mnt/local/

RESCUE s390vsl179:~ # mount -o rw,relatime,data=ordered /dev/dasda3 /mnt/local/

RESCUE s390vsl179:~ # wipefs --all --force /dev/dasda1 || wipefs --all /dev/dasda1 || dd if=/dev/zero of=/dev/dasda1 bs=512 count=1 || true
/dev/dasda1: 2 bytes were erased at offset 0x00000438 (ext2): 53 ef

RESCUE s390vsl179:~ # mkfs -t ext2 -b 4096 -i 4095 -U 5b24febc-c69f-4555-9ec0-d15ce8d51b79 -F /dev/dasda1
mke2fs 1.43.8 (1-Jan-2018)
Creating filesystem with 76788 4k blocks and 76896 inodes
Filesystem UUID: 5b24febc-c69f-4555-9ec0-d15ce8d51b79
Superblock backups stored on blocks: 
        32768

Allocating group tables: done                            
Writing inode tables: done                            
Writing superblocks and filesystem accounting information: done

RESCUE s390vsl179:~ # tune2fs  -m 4 -c -1 -i 0d -o user_xattr,acl /dev/dasda1
tune2fs 1.43.8 (1-Jan-2018)
Setting maximal mount count to -1
Setting interval between checks to 0 seconds
Setting reserved blocks percentage to 4% (3071 blocks)

RESCUE s390vsl179:~ # mkdir -p /mnt/local/boot/zipl

RESCUE s390vsl179:~ # mount -o rw,relatime,block_validity,barrier,user_xattr,acl /dev/dasda1 /mnt/local/boot/zipl

RESCUE s390vsl179:~ # mkswap -U 8093294b-52ef-4e9f-999e-f1cbaa49a2b9 /dev/dasda2
Setting up swapspace version 1, size = 500 MiB (524300288 bytes)
no label, UUID=8093294b-52ef-4e9f-999e-f1cbaa49a2b9

# mount | grep dasd
/dev/dasda3 on /mnt/local type ext4 (rw,relatime,data=ordered)
/dev/dasda1 on /mnt/local/boot/zipl type ext2 (rw,relatime,block_validity,barrier,user_xattr,acl)

RESCUE s390vsl179:~ # rear -D restoreonly
Relax-and-Recover 2.4 / Git
Running rear restoreonly (PID 7859)
Using log file: /var/log/rear/rear-s390vsl179.7859.log
Running workflow restoreonly within the ReaR rescue/recovery system
Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if available.
Started RPC portmapper 'rpcbind'.
RPC portmapper 'rpcbind' available.
RPC status rpc.statd available.
Using backup archive '/tmp/rear.JQtYoJkM2KD4JnQ/outputfs/s390vsl179/backup.tar.gz'
Calculating backup archive size
Backup archive size is 672M     /tmp/rear.JQtYoJkM2KD4JnQ/outputfs/s390vsl179/backup.tar.gz (compressed)
Restoring from '/tmp/rear.JQtYoJkM2KD4JnQ/outputfs/s390vsl179/backup.tar.gz' (restore log in /var/lib/rear/restore/restoreonly.backup.tar.gz.7859.restore.log) ...
Backup restore program 'tar' started in subshell (PID=8568)
Restored 300 MiB [avg. 61472 KiB/sec] 
Restored 442 MiB [avg. 45309 KiB/sec] 
Restored 585 MiB [avg. 40000 KiB/sec] 
Restored 719 MiB [avg. 36850 KiB/sec] 
Restored 857 MiB [avg. 35138 KiB/sec] 
Restored 989 MiB [avg. 33790 KiB/sec] 
Restored 1204 MiB [avg. 35250 KiB/sec] 
Restored 1566 MiB [avg. 40102 KiB/sec] 
OK
Restored 1608 MiB in 45 seconds [avg. 36592 KiB/sec]
Restoring finished (verify backup restore log messages in /var/lib/rear/restore/restoreonly.backup.tar.gz.7859.restore.log)
Recreating directories (with permissions) from /var/lib/rear/recovery/directories_permissions_owner_group
Finished recovering your system. You can explore it under '/mnt/local'.
Saving /var/log/rear/rear-s390vsl179.7859.log as /var/log/rear/rear-s390vsl179.log
Exiting rear restoreonly (PID 7859) and its descendant processes ...
Running exit tasks
You should also rm -Rf /tmp/rear.JQtYoJkM2KD4JnQ

RESCUE s390vsl179:~ # du -hs /mnt/local/*
5.1M    /mnt/local/bin
85M     /mnt/local/boot
4.0K    /mnt/local/dev
13M     /mnt/local/etc
48K     /mnt/local/home
49M     /mnt/local/lib
17M     /mnt/local/lib64
16K     /mnt/local/lost+found
4.0K    /mnt/local/mnt
4.0K    /mnt/local/opt
4.0K    /mnt/local/proc
37M     /mnt/local/root
4.0K    /mnt/local/run
14M     /mnt/local/sbin
4.0K    /mnt/local/selinux
20K     /mnt/local/srv
4.0K    /mnt/local/sys
4.0K    /mnt/local/tmp
1.4G    /mnt/local/usr
63M     /mnt/local/var

The du -hs /mnt/local/* output in the recovery system is basically
identical to the du -hs /* output on the original system,
so that so far things look good.

Next step is to find out how to manually install the bootloader
from within the recovery system...

jsmeix commented at 2019-05-22 14:56:

On SLES12-SP4 manually installing the bootloader
from within the recovery system is just

RESCUE s390vsl179:~ # mount --bind /proc /mnt/local/proc

RESCUE s390vsl179:~ # mount --bind /sys /mnt/local/sys

RESCUE s390vsl179:~ # mount --bind /dev /mnt/local/dev

RESCUE s390vsl179:~ # chroot /mnt/local

s390vsl179:/ # update-bootloader --reinit

s390vsl179:/ # exit
exit

RESCUE s390vsl179:~ # reboot
umounting all filesystems
/mnt/local/dev           : successfully unmounted
/mnt/local/sys           : ignored
/mnt/local/proc          : ignored
/mnt/local/boot/zipl     : successfully unmounted
/mnt/local               : successfully unmounted
/dev/pts                 : ignored
/sys/fs/cgroup/cpuset    : successfully unmounted
/sys/fs/cgroup/memory    : successfully unmounted
/sys/fs/cgroup/rdma      : successfully unmounted
/sys/fs/cgroup/devices   : successfully unmounted
/sys/fs/cgroup/perf_event: successfully unmounted
/sys/fs/cgroup/pids      : successfully unmounted
/sys/fs/cgroup/freezer   : successfully unmounted
/sys/fs/cgroup/hugetlb   : successfully unmounted
/sys/fs/cgroup/cpu,cpuacct: successfully unmounted
/sys/fs/cgroup/blkio     : successfully unmounted
/sys/fs/cgroup/net_cls,net_prio: successfully unmounted
/sys/fs/pstore           : successfully unmounted
umount: /sys/fs/cgroup/systemd: not mounted
/sys/fs/cgroup           : successfully unmounted
umount: /run: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)
/dev/pts                 : ignored
/dev/shm                 : successfully unmounted
/sys/kernel/security     : successfully unmounted
/dev                     : successfully unmounted
/proc                    : ignored
/sys                     : ignored
umount: /: not mounted
syncing disks... waiting 3 seconds before reboot

and the recreated system boots and works.

jsmeix commented at 2019-05-22 14:59:

In https://github.com/rear/rear/pull/2142#issuecomment-494823543
the mkfs related commands are just copies from what I already have
in my diskrestore.sh script in the running ReaR recovery system.

So all that needs to be adapted for now to make "rear recover"
work in this particular IBM Z case is the parted commands
in the diskrestore.sh script, plus how the bootloader gets installed.
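The needed diskrestore.sh adaptation could be sketched as a conditional on the disk label type. Function and variable names below are illustrative, not ReaR's actual layout code: on a dasd label, parted's mkpart expects a filesystem type where it otherwise takes a partition name (on gpt):

```shell
#!/bin/bash
# Illustrative sketch (not ReaR's actual layout code): emit the right
# parted mkpart call depending on the disk label type. On a dasd label
# mkpart expects a filesystem type; on gpt it takes a partition name.
emit_mkpart() {
    local disk=$1 label=$2 name=$3 fstype=$4 start=$5 end=$6
    if [ "$label" = dasd ] ; then
        echo "parted -s $disk mkpart $fstype $start $end"
    else
        echo "parted -s $disk mkpart '$name' $start $end"
    fi
}

emit_mkpart /dev/dasda dasd dasda1 ext2 98304B 314621951B
emit_mkpart /dev/sda gpt sda1 ext2 98304B 314621951B
```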

mutable-dan commented at 2019-05-22 16:29:

in general regarding "copying kernel modules" i.e. what kernel modules
will be copied into the ReaR recovery system:

@jsmeix this is the area in question; I have not checked it into my fork yet because I wanted to debug it first. Looking at the code for creating the rescue system, it did not seem to copy all modules.

Here are changes:

git diff rescue/GNU/Linux/230_storage_and_network_modules.sh

diff --git a/usr/share/rear/rescue/GNU/Linux/230_storage_and_network_modules.sh b/usr/share/rear/rescue/GNU/Linux/230_storage_and_network_modules.sh
index 365ef5b3..2fa6d6fe 100644
--- a/usr/share/rear/rescue/GNU/Linux/230_storage_and_network_modules.sh
+++ b/usr/share/rear/rescue/GNU/Linux/230_storage_and_network_modules.sh
@@ -14,15 +14,15 @@ function find_modules_in_dirs () {

 # Include storage drivers
 Log "Including storage drivers"
-STORAGE_DRIVERS=( $( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/drivers/{block,firewire,ide,ata,md,message,scsi,usb/storage} ) )
+STORAGE_DRIVERS=( $( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/drivers/{block,firewire,ide,ata,md,message,scsi,usb/storage,s390/block,s390/scsi} ) )

 # Include network drivers
 Log "Including network drivers"
-NETWORK_DRIVERS=( $( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/drivers/net ) )
+NETWORK_DRIVERS=( $( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/drivers/{net,s390/net} ) )

 # Include crypto drivers
 Log "Including crypto drivers"
-CRYPTO_DRIVERS=( $( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/crypto ) )
+CRYPTO_DRIVERS=( $( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/{crypto,s390/crypto} ) )
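The brace expansions in those changed lines expand to separate directory arguments for find_modules_in_dirs, e.g. (the kernel version here is just an example value):

```shell
#!/bin/bash
# Show how the brace expansion in the changed lines produces separate
# directory arguments before find_modules_in_dirs ever sees them:
KERNEL_VERSION=3.10.0-957.10.1.el7.s390x
echo /lib/modules/$KERNEL_VERSION/kernel/drivers/{net,s390/net}
```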

Here is a dir listing for example:

ls -l /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/s390/
total 4
drwxr-xr-x 2 root root  143 Apr  4 13:27 block
drwxr-xr-x 2 root root 4096 Apr  4 13:27 char
drwxr-xr-x 2 root root   74 Apr  4 13:27 cio
drwxr-xr-x 2 root root  114 Apr  4 13:27 crypto
drwxr-xr-x 2 root root  134 Apr  4 13:27 net
drwxr-xr-x 2 root root   20 Apr  4 13:27 scsi

ls -l /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/s390/block/
total 804
-rw-r--r-- 1 root root  55985 Feb  7 02:38 dasd_diag_mod.ko
-rw-r--r-- 1 root root 235513 Feb  7 02:38 dasd_eckd_mod.ko
-rw-r--r-- 1 root root  58097 Feb  7 02:38 dasd_fba_mod.ko
-rw-r--r-- 1 root root 271673 Feb  7 02:38 dasd_mod.ko
-rw-r--r-- 1 root root  66297 Feb  7 02:38 dcssblk.ko
-rw-r--r-- 1 root root  71449 Feb  7 02:38 scm_block.ko
-rw-r--r-- 1 root root  48249 Feb  7 02:38 xpram.ko

ls -l /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/s390/net/
total 1188
-rw-r--r-- 1 root root 244969 Feb  7 02:38 ctcm.ko
-rw-r--r-- 1 root root  34833 Feb  7 02:38 fsm.ko
-rw-r--r-- 1 root root 150873 Feb  7 02:38 lcs.ko
-rw-r--r-- 1 root root 317497 Feb  7 02:38 qeth.ko
-rw-r--r-- 1 root root 154057 Feb  7 02:38 qeth_l2.ko
-rw-r--r-- 1 root root 220377 Feb  7 02:38 qeth_l3.ko
-rw-r--r-- 1 root root  38201 Feb  7 02:38 smsgiucv_app.ko
-rw-r--r-- 1 root root  42321 Feb  7 02:38 smsgiucv.ko

Here is the log before this change:

   4822 + source /usr/share/rear/rescue/GNU/Linux/230_storage_and_network_modules.sh
   4823 ++ Log 'Including storage drivers'
   4824 ++ echo '2019-05-10 15:06:37.460544032 Including storage drivers'
   4825 2019-05-10 15:06:37.460544032 Including storage drivers
   4826 ++ STORAGE_DRIVERS=($( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/drivers/{block,firewire,ide,ata,md,message,scsi,usb/storage} ))
   4827 +++ find_modules_in_dirs /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/block /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/firewire /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/ide /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/ata /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/md /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/message /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/scsi /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/usb/storage
   4828 +++ sed -e 's/^\(.*\)\.ko.*/\1/'
   4829 +++ find /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/block /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/firewire /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/ide /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/ata /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/md /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/message /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/scsi /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/usb/storage -type f -name '*.ko*' -printf '%f\n'
   4830 ++ Log 'Including network drivers'
   4831 ++ echo '2019-05-10 15:06:37.585248173 Including network drivers'
   4832 2019-05-10 15:06:37.585248173 Including network drivers
   4833 ++ NETWORK_DRIVERS=($( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/drivers/net ))
   4834 +++ find_modules_in_dirs /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/net
   4835 +++ sed -e 's/^\(.*\)\.ko.*/\1/'
   4836 +++ find /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/net -type f -name '*.ko*' -printf '%f\n'
   4837 ++ Log 'Including crypto drivers'
   4838 ++ echo '2019-05-10 15:06:37.708446923 Including crypto drivers'
   4839 2019-05-10 15:06:37.708446923 Including crypto drivers
   4840 ++ CRYPTO_DRIVERS=($( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/crypto ))
   4841 +++ find_modules_in_dirs /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/crypto
   4842 +++ find /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/crypto -type f -name '*.ko*' -printf '%f\n'
   4843 +++ sed -e 's/^\(.*\)\.ko.*/\1/'
   4844 ++ Log 'Including virtualization drivers'
   4845 ++ echo '2019-05-10 15:06:37.915648720 Including virtualization drivers'
   4846 2019-05-10 15:06:37.915648720 Including virtualization drivers
   4847 ++ VIRTUAL_DRIVERS=($( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/drivers/{virtio,xen} ))
   4848 +++ find_modules_in_dirs /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/virtio /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/xen
   4849 +++ sed -e 's/^\(.*\)\.ko.*/\1/'
   4850 +++ find /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/virtio /lib/modules/3.10.0-957.10.1.el7.s390x/kernel/drivers/xen -type f -name '*.ko*' -printf '%f\n'
   4851 ++ Log 'Including additional drivers'
   4852 ++ echo '2019-05-10 15:06:38.007149501 Including additional drivers'
   4853 2019-05-10 15:06:38.007149501 Including additional drivers
   4854 ++ EXTRA_DRIVERS=($( find_modules_in_dirs /lib/modules/$KERNEL_VERSION/{extra,weak-updates} ))
   4855 +++ find_modules_in_dirs /lib/modules/3.10.0-957.10.1.el7.s390x/extra /lib/modules/3.10.0-957.10.1.el7.s390x/weak-updates
   4856 +++ sed -e 's/^\(.*\)\.ko.*/\1/'
   4857 +++ find /lib/modules/3.10.0-957.10.1.el7.s390x/extra /lib/modules/3.10.0-957.10.1.el7.s390x/weak-updates -type f -name '*.ko*' -printf '%f\n'
   4858 ++ unset -f find_modules_in_dirs
   4859 + source_return_code=0
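
The find_modules_in_dirs helper seen in the '+++' trace lines above boils down to something like the following (a minimal sketch reconstructed from the trace, not the exact ReaR function):

```shell
# Minimal sketch (assumption: not the exact ReaR implementation) of what
# find_modules_in_dirs does according to the trace above: list all *.ko*
# files in the given directories and strip the .ko suffix (and anything
# after it, e.g. .ko.xz) to get bare module names.
find_modules_in_dirs() {
    find "$@" -type f -name '*.ko*' -printf '%f\n' 2>/dev/null \
        | sed -e 's/^\(.*\)\.ko.*/\1/'
}

# With bash brace expansion the changed lines above pass both the generic
# and the s390-specific directories in a single call, e.g.:
# find_modules_in_dirs /lib/modules/$KERNEL_VERSION/kernel/drivers/{net,s390/net}
```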

mutable-dan commented at 2019-05-22 23:21:

@jsmeix

there are four problems on RHEL so far

and then comes actual recovery!

  1. StopIfError 'Could not copy '''/usr/share/rear/conf/templates/RESULT_usage_RAMDISK.txt
    which looks a little similar to the changes you made to 950_copy_result_files (have not looked deeply at it)

  2. this looks to be fixed by:
    in https://github.com/rear/rear/pull/2142#issuecomment-494426331
    file: usr/share/rear/output/default/950_copy_result_files.sh
    -- do you want me to implement your changes or can I merge your fix?

  3. Handling network interface 'enccw0.0.0610'
    enccw0.0.0610 is a physical device
    Skipping 'enccw0.0.0610': not yet supported.
    Failed to handle network interface 'enccw0.0.0610'.
    -- Will look at next.

  4. kernel not detected
    will investigate where the kernel autodetect is done

I am currently outputting to a ramdisk;
next, I will set up NETFS to NFS. We probably don't need IPL, we will see.
Writing the initrd, kernel, etc. to an NFS share should be sufficient.
Once I get the initrd, kernel and params on NFS, I can test the boot.

I am getting the initrd written but not the kernel; will investigate where the kernel autodetect is done.

Creating root filesystem layout
Handling network interface 'enccw0.0.0610'
enccw0.0.0610 is a physical device
Skipping 'enccw0.0.0610': not yet supported.
Failed to handle network interface 'enccw0.0.0610'.
To log into the recovery system via ssh set up /root/.ssh/authorized_keys or specify SSH_ROOT_PASSWORD
Copying logfile /var/log/rear/rear-red72a1.log into initramfs as '/tmp/rear-red72a1-partial-2019-05-22T18:18:40-0400.log'
Copying files and directories
Copying binaries and libraries
Copying all kernel modules in /lib/modules/3.10.0-957.10.1.el7.s390x (MODULES contains 'all_modules')
Copying all files in /lib*/firmware/
Skip copying broken symlink '/etc/mtab' target '/proc/2798/mounts' on /proc/ /sys/ /dev/ or /run/
Broken symlink '/usr/lib/modules/3.10.0-957.10.1.el7.s390x/build' in recovery system because 'readlink' cannot determine its link target
Broken symlink '/usr/lib/modules/3.10.0-957.10.1.el7.s390x/source' in recovery system because 'readlink' cannot determine its link target
Testing that the recovery system in /tmp/rear.MMwixuIEPrKkf3l/rootfs contains a usable system
Creating recovery/rescue system initramfs/initrd initrd.cgz with gzip default compression

Created initrd.cgz with gzip default compression (213731276 bytes) in 309 seconds
Copying resulting files to file location
Saving /var/log/rear/rear-red72a1.log as rear-red72a1.log to file location
Copying result files '/tmp/rear.MMwixuIEPrKkf3l/tmp/VERSION /tmp/rear.MMwixuIEPrKkf3l/tmp/README /tmp/rear.MMwixuIEPrKkf3l/tmp/rear-red72a1.log' to /home/garyd/ramdisk/red72a1 at file location
Exiting rear mkrescue (PID 58731) and its descendant processes ...
Running exit tasks
You should also rm -Rf /tmp/rear.MMwixuIEPrKkf3l

That was interesting, what you did with kexec. I have read about people doing that, but never did it myself.

jsmeix commented at 2019-05-23 06:55:

@mutable-dan
I did my generic improvements and fixes in
usr/share/rear/output/default/950_copy_result_files.sh
that I listed in my above
https://github.com/rear/rear/pull/2142#issuecomment-494426331
only in my personal ReaR scripts that I run on my IBM Z test systems.

I will do a GitHub pull request with those changes and merge it
into our current GitHub ReaR master code...

Regarding your changes in
rescue/GNU/Linux/230_storage_and_network_modules.sh

You won't actually need them because with the current ReaR 2.5
default.conf setting MODULES=( 'all_modules' )
you get all modules in the recovery system by default, cf. my
https://github.com/rear/rear/pull/2142#issuecomment-494677718

I checked that on my IBM Z test system where I had run "rear mkbackup"
(I use KEEP_BUILD_DIR="yes")

# find /tmp/rear.NlvtWuz8Sx0tjOd/rootfs/lib/modules/ | grep s390 | wc -l
69

# find /lib/modules/ | grep s390 | wc -l
69

i.e. the same number of 's390' kernel module files (and directories)
in the recovery system (at /tmp/rear.NlvtWuz8Sx0tjOd/rootfs/)
as in the original system and

# find /lib/modules/ | wc -l
1003

# find /tmp/rear.NlvtWuz8Sx0tjOd/rootfs/lib/modules/ | wc -l
1002

# find /lib/modules/ | grep scsi_debug
/lib/modules/4.12.14-94.41-default/kernel/drivers/scsi/scsi_debug.ko

# find /tmp/rear.NlvtWuz8Sx0tjOd/rootfs/lib/modules/ | grep scsi_debug || echo not found
not found

# grep ^EXCLUDE_MODULES usr/share/rear/conf/default.conf
EXCLUDE_MODULES=( scsi_debug )

i.e. the same number (except one) of all kernel module files (and directories)
in the recovery system (at /tmp/rear.NlvtWuz8Sx0tjOd/rootfs/)
as in the original system - the exception is scsi_debug, which
is intentionally excluded, see usr/share/rear/conf/default.conf

Nevertheless your changes in
rescue/GNU/Linux/230_storage_and_network_modules.sh
are useful and needed if a user sets MODULES=() in
his etc/rear/local.conf

jsmeix commented at 2019-05-23 07:47:

@mutable-dan
with https://github.com/rear/rear/pull/2147 merged
my generic improvements and fixes for
usr/share/rear/output/default/950_copy_result_files.sh
are now in our current GitHub ReaR master code
so that the items 1. and 2. in
https://github.com/rear/rear/pull/2142#issuecomment-495010023
should be fixed now.

jsmeix commented at 2019-05-23 08:07:

@mutable-dan
regarding item 3. Handling network interface 'enccw0.0.0610' in
https://github.com/rear/rear/pull/2142#issuecomment-495010023
I cannot help because I don't know much about networking details
(I use DHCP and am happy that on SLES12 eth0 is used).

I think @rmetrich can help with crazy network interface names
because he implemented most of our current
usr/share/rear/rescue/GNU/Linux/310_network_devices.sh
which is where the network device configuration for
the ReaR recovery system happens.

A general workaround to deal with things like

Failed to handle network interface 'enccw0.0.0610'.

is to specify NETWORKING_PREPARATION_COMMANDS
see its description in usr/share/rear/conf/default.conf
or - probably easier - do it as I do with

SSH_ROOT_PASSWORD="rear"
USE_DHCLIENT="yes"

to use DHCP for recovery system networking setup
and get remote access to it via SSH.
Usually you do not need full networking setup in the recovery system.
You need just enough to be able to access your backup.

Regarding item 4. kernel not detected in
https://github.com/rear/rear/pull/2142#issuecomment-495010023
see usr/share/rear/prep/GNU/Linux/400_guess_kernel.sh
which is where the kernel is autodetected
i.e. where the KERNEL_FILE variable is set.
You must use ReaR 2.5 because before ReaR 2.5
KERNEL_FILE was set by a mess of various scripts
that I cleaned up for ReaR 2.5, see the
"Release Notes for Relax-and-Recover version 2.5" at
http://relax-and-recover.org/documentation/release-notes-2-5
therein the part about

Cleaned up how KERNEL_FILE is set: Now the
KERNEL_FILE variable is set in the ‘prep’ stage only by
the new prep/GNU/Linux/400_guess_kernel.sh
that replaces the old
...
(issues #1851 #1983 #1985)

for details see those issues
https://github.com/rear/rear/issues/1851
https://github.com/rear/rear/pull/1983
https://github.com/rear/rear/pull/1985

The KERNEL_FILE value is the file of the currently running kernel
and that kernel should be used in the ReaR recovery system
i.e. the ReaR recovery system should be booted with that kernel.
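
The kind of autodetection involved can be sketched roughly like this (an illustration only, assuming conventional kernel file naming; it is not the actual code of prep/GNU/Linux/400_guess_kernel.sh, and the guess_kernel_file name is hypothetical):

```shell
# Rough sketch of autodetecting the file of the currently running kernel:
# try conventional kernel file names for the running kernel version in a
# boot directory (parameter added here only to make the sketch testable).
guess_kernel_file() {
    local bootdir="${1:-/boot}"
    local kernel_version candidate
    kernel_version="$(uname -r)"
    for candidate in "$bootdir/vmlinuz-$kernel_version" \
                     "$bootdir/image-$kernel_version" \
                     "$bootdir/vmlinux-$kernel_version" ; do
        # test -s: the file must exist and be non-empty
        if test -s "$candidate" ; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```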

In my usr/share/rear/output/IPL/Linux-s390/800_create_ipl.sh
cf. https://github.com/rear/rear/pull/2142#issuecomment-494426331
I use KERNEL_FILE to copy the currently used kernel
whereto the RESULT_FILES are normally copied
(which is in my case the NFS share on my NFS server)
plus the ReaR recovery system files that are in ReaR's initrd
and ReaR's initrd file is $TMP_DIR/$REAR_INITRD_FILENAME

In
https://github.com/rear/rear/pull/2142#issuecomment-494739752
I copy that kernel and ReaR's initrd from my NFS server
to my "replacement hardware" IBM Z system to boot the
ReaR recovery system there via kexec -l ... and kexec -e.

jsmeix commented at 2019-05-23 12:24:

@mutable-dan
using OUTPUT=RAMDISK on IBM Z will not call
usr/share/rear/output/RAMDISK/Linux-i386/900_copy_ramdisk.sh
because that is in .../Linux-i386/.

To get an architecture-dependent script run on IBM Z you would need
usr/share/rear/output/RAMDISK/Linux-s390/900_copy_ramdisk.sh

Cf. my usr/share/rear/output/IPL/Linux-s390/800_create_ipl.sh
in https://github.com/rear/rear/pull/2142#issuecomment-494426331

jsmeix commented at 2019-05-23 12:37:

@mutable-dan
it seems things work for me with OUTPUT=RAMDISK on IBM Z
with the following completely overhauled 900_copy_ramdisk.sh
stored as usr/share/rear/output/RAMDISK/default/900_copy_ramdisk.sh

#
# output/RAMDISK/default/900_copy_ramdisk.sh
#
# Add kernel and the ReaR recovery system initrd to RESULT_FILES
# so that the subsequent output/default/950_copy_result_files.sh
# will copy them to the output location specified via OUTPUT_URL
# (if not specified OUTPUT_URL is inherited from BACKUP_URL).
#

local kernel_file="$KERNEL_FILE"
local initrd_file="$TMP_DIR/$REAR_INITRD_FILENAME"

# The 'test' intentionally also fails when RAMDISK_SUFFIX is more than one word
# because we do not want blanks in kernel or initrd file names:
if test $RAMDISK_SUFFIX ; then
    kernel_file="$TMP_DIR/kernel-$RAMDISK_SUFFIX"
    cp $v -pLf $KERNEL_FILE $kernel_file || Error "Failed to copy KERNEL_FILE '$KERNEL_FILE'"
    initrd_file="$TMP_DIR/initramfs-$RAMDISK_SUFFIX.img"
    cp $v -pLf $TMP_DIR/$REAR_INITRD_FILENAME $initrd_file || Error "Failed to copy initramfs '$REAR_INITRD_FILENAME'"
fi

DebugPrint "Adding $kernel_file and $initrd_file to RESULT_FILES"
RESULT_FILES+=( $kernel_file $initrd_file )
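
As an aside, the intentionally unquoted 'test $RAMDISK_SUFFIX' above really does reject multi-word values, because word splitting makes 'test' see two arguments, which is an error. A standalone demonstration (not ReaR code):

```shell
# Demonstration of the unquoted-'test' behavior used above:
# a single non-empty word is true, empty is false, and two words make
# 'test' fail with "binary operator expected" (hidden here).
check_suffix() {
    # shellcheck disable=SC2086 -- word splitting is intentional here
    if test $1 2>/dev/null ; then
        echo accepted
    else
        echo rejected
    fi
}

check_suffix "rescue"       # one non-empty word: accepted
check_suffix ""             # empty: rejected
check_suffix "two words"    # two words: 'test' errors out, so rejected
```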

With that I get (excerpts)

# usr/sbin/rear -D mkbackup
...
Adding /tmp/rear.ofaJcBdmGc6LlM5/tmp/kernel-s390vsl179 and /tmp/rear.ofaJcBdmGc6LlM5/tmp/initramfs-s390vsl179.img to RESULT_FILES
Copying resulting files to nfs location
Saving /root/rear.pull2142/var/log/rear/rear-s390vsl179.log as rear-s390vsl179.log to nfs location
Copying result files '/tmp/rear.ofaJcBdmGc6LlM5/tmp/kernel-s390vsl179 /tmp/rear.ofaJcBdmGc6LlM5/tmp/initramfs-s390vsl179.img /tmp/rear.ofaJcBdmGc6LlM5/tmp/VERSION /tmp/rear.ofaJcBdmGc6LlM5/tmp/README /tmp/rear.ofaJcBdmGc6LlM5/tmp/rear-s390vsl179.log' to /tmp/rear.ofaJcBdmGc6LlM5/outputfs/s390vsl179 at nfs location

and on my NFS server I get

# ls -lhtr /nfs/s390vsl179/
total 770M
-rw-rw-rw- 1 nobody nobody  13M May 23 14:21 kernel-s390vsl179
-rw-rw-rw- 1 nobody nobody  69M May 23 14:21 initramfs-s390vsl179.img
-rw-rw-rw- 1 nobody nobody  271 May 23 14:21 VERSION
-rw-rw-rw- 1 nobody nobody  190 May 23 14:21 README
-rw-rw-rw- 1 nobody nobody 2.7M May 23 14:21 rear-s390vsl179.log
-rw-rw-rw- 1 nobody nobody 683M May 23 14:23 backup.tar.gz
-rw-rw-rw- 1 nobody nobody 3.3M May 23 14:23 backup.log

mutable-dan commented at 2019-05-23 19:42:

@jsmeix

Summary

  • Backup to nfs - ok
  • Kernel is recognized and copied
  • verified kernel modules are correct

https://github.com/rear/rear/pull/2142#issuecomment-495091647

Nevertheless your changes in
rescue/GNU/Linux/230_storage_and_network_modules.sh
are useful and needed if a user sets MODULES=() in
his etc/rear/local.conf

  • please let me know if you want the modules changes checked in. Right now they are stashed.
  • ran rescue to NFS and confirmed the module count

https://github.com/rear/rear/pull/2142#issuecomment-495010023
items: 1, 2, 4 are fixed

https://github.com/rear/rear/pull/2142#issuecomment-495112806

Regarding item 4. kernel not detected in
#2142 (comment)
see usr/share/rear/prep/GNU/Linux/400_guess_kernel.sh
which is where the kernel is autodetected
i.e. where the KERNEL_FILE variable is set.
You must use ReaR 2.5 because before ReaR 2.5...

I am merging from master to my fork, so I have the changes in v2.5 of usr/share/rear/prep/GNU/Linux/400_guess_kernel.sh

Will start the networking next, then IPL param
Thank you for your help

jsmeix commented at 2019-05-24 09:28:

@mutable-dan
yes, I would like to have your changes in
rescue/GNU/Linux/230_storage_and_network_modules.sh
committed because they are useful and needed
if a user sets MODULES=() in his etc/rear/local.conf

jsmeix commented at 2019-05-24 10:02:

Right now I did another (partially manual) recovery
on my "replacement hardware" with what I recently
created on my NFS server with OUTPUT=RAMDISK
(and the new output/RAMDISK/default/900_copy_ramdisk.sh)
as shown in
https://github.com/rear/rear/pull/2142#issuecomment-495200737

Here my etc/rear/local.conf that I used

OUTPUT=RAMDISK
BACKUP=NETFS
BACKUP_OPTIONS="nfsvers=3,nolock"
BACKUP_URL=nfs://10.160.67.243/nfs
SSH_ROOT_PASSWORD="rear"
USE_DHCLIENT="yes"
KEEP_BUILD_DIR="yes"
FIRMWARE_FILES=( 'no' )
PROGRESS_MODE="plain"
PROGRESS_WAIT_SECONDS="5"

I booted kernel-s390vsl179 and initramfs-s390vsl179.img
on my "replacement hardware" via kexec from within the
running first recreated system from my above very first attempt
to do a "rear recover" (excerpts of what I did now):

s390vsl180:~ # scp root@...:/nfs/s390vsl179/*s390vsl179* .
initramfs-s390vsl179.img
kernel-s390vsl179

s390vsl180:~ # kexec -l kernel-s390vsl179 --initrd=initramfs-s390vsl179.img --command-line='root=/dev/ram0 rw vga=normal hvc_iucv=8 TERM=dumb'

s390vsl180:~ # sync

s390vsl180:~ # kexec -e

Now s390vsl180 does a hard reboot and boots the
ReaR recovery system where I log in via SSH
and run "rear recover" in enforced MIGRATION_MODE
so that I get the dialogs that allow me to adapt the
still wrong parted commands in diskrestore.sh

Don't get confused that the running recovery system shows
as hostname s390vsl179 which is the hostname of the original system
but the recovery system is actually running on my "replacement hardware".

RESCUE s390vsl179:~ # export MIGRATION_MODE=true

RESCUE s390vsl179:~ # rear -D recover
...
Confirm or edit the disk recreation script
1) Confirm disk recreation script and continue 'rear recover'
2) Edit disk recreation script (/var/lib/rear/layout/diskrestore.sh)
3) View disk recreation script (/var/lib/rear/layout/diskrestore.sh)
4) View original disk space usage (/var/lib/rear/layout/config/df.txt)
5) Use Relax-and-Recover shell and return back to here
6) Abort 'rear recover'
(default '1' timeout 300 seconds)
2

Before editing, the parted commands in diskrestore.sh are

RESCUE s390vsl179:~ # grep ^parted /root/rear.pull2142/var/lib/rear/layout/diskrestore.sh    
parted -s /dev/dasda mklabel dasd >&2
parted -s /dev/dasda mkpart "'dasda1'" 98304B 314621951B >&2
parted -s /dev/dasda mkpart "'dasda2'" 314621952B 838926335B >&2
parted -s /dev/dasda mkpart "'dasda3'" 838926336B 7385198591B >&2

I changed them to

RESCUE s390vsl179:~ # grep ^parted /root/rear.pull2142/var/lib/rear/layout/diskrestore.sh
parted -s /dev/dasda mklabel dasd >&2
parted -s /dev/dasda mkpart ext2 98304B 314621951B >&2
parted -s /dev/dasda mkpart linux-swap 314621952B 838926335B >&2
parted -s /dev/dasda mkpart ext4 838926336B 7385333759B >&2

Note that in particular for the last partition the partition end byte
value was not right: Before it was 7385198591B
which is not what parted reports for the disk size

# parted -s /dev/dasda unit B print
Model: IBM S390 DASD drive (dasd)
Disk /dev/dasda: 7385333760B
...

so that this is another issue that needs to be investigated.

With the changed parted commands in diskrestore.sh
the disk layout setup "just worked" for me
and "rear recover" continues with

UserInput -I LAYOUT_MIGRATED_CONFIRMATION needed in /usr/share/rear/layout/recreate/default/200_run_layout_code.sh line 98
Confirm the recreated disk layout or go back one step
1) Confirm recreated disk layout and continue 'rear recover'
2) Go back one step to redo disk layout recreation
3) Use Relax-and-Recover shell and return back to here
4) Abort 'rear recover'
(default '1' timeout 300 seconds)
1
UserInput: Valid choice number result 'Confirm recreated disk layout and continue 'rear recover''
User confirmed recreated disk layout
Restoring from '/tmp/rear.pH3rsfVF7M6pgA4/outputfs/s390vsl179/backup.tar.gz' (restore log in /var/lib/rear/restore/recover.backup.tar.gz.586.restore.log) ...
Backup restore program 'tar' started in subshell (PID=2697)
Restored 465 MiB [avg. 95322 KiB/sec] 
Restored 823 MiB [avg. 84298 KiB/sec] 
Restored 1215 MiB [avg. 82959 KiB/sec] 
Restoring MB/s 
OK
Restored 1628 MiB in 25 seconds [avg. 66716 KiB/sec]
Restoring finished (verify backup restore log messages in /var/lib/rear/restore/recover.backup.tar.gz.586.restore.log)
Recreating directories (with permissions) from /var/lib/rear/recovery/directories_permissions_owner_group
Migrating disk-by-id mappings in certain restored files in /mnt/local to current disk-by-id mappings ...
Replacing restored udev rule '/mnt/local//etc/udev/rules.d/70-persistent-net.rules' with the one from the ReaR rescue system
Migrating network configuration files according to the mapping files ...
UserInput -I RESTORED_FILES_CONFIRMATION needed in /usr/share/rear/finalize/default/520_confirm_finalize.sh line 41
Confirm restored config files are OK or adapt them as needed
1) Confirm it is OK to recreate initrd and reinstall bootloader and continue 'rear recover'
2) Edit restored etc/fstab (/mnt/local/etc/fstab)
3) View restored etc/fstab (/mnt/local/etc/fstab)
4) Use Relax-and-Recover shell and return back to here
5) Abort 'rear recover'
(default '1' timeout 300 seconds)
1
UserInput: Valid choice number result 'Confirm it is OK to recreate initrd and reinstall bootloader and continue 'rear recover''
User confirmed restored files
WARNING:
For this system
SUSE_LINUX/12.4 on Linux-s390 (based on SUSE/12/s390)
there is no code to install a boot loader on the recovered system
or the code that we have failed to install the boot loader correctly.
Please contribute appropriate code to the Relax-and-Recover project,
see http://relax-and-recover.org/development/
Take a look at the scripts in /usr/share/rear/finalize,
for example see the scripts
/usr/share/rear/finalize/Linux-i386/630_install_grub.sh
/usr/share/rear/finalize/Linux-i386/660_install_grub2.sh

---------------------------------------------------
|  IF YOU DO NOT INSTALL A BOOT LOADER MANUALLY,  |
|  THEN YOUR SYSTEM WILL NOT BE ABLE TO BOOT.     |
---------------------------------------------------

You can use 'chroot /mnt/local bash --login'
to change into the recovered system.
You should at least mount /proc in the recovered system
e.g. via 'mount -t proc none /mnt/local/proc'
before you change into the recovered system
and manually install a boot loader therein.

Finished recovering your system. You can explore it under '/mnt/local'.
Exiting rear recover (PID 586) and its descendant processes ...
Running exit tasks
You should also rm -Rf /tmp/rear.pH3rsfVF7M6pgA4

This is expected because we do not yet have
code to install a boot loader on IBM Z
so that I need to do that manually as before:

RESCUE s390vsl179:~ # mount --bind /proc /mnt/local/proc

RESCUE s390vsl179:~ # mount --bind /sys /mnt/local/sys

RESCUE s390vsl179:~ # mount --bind /dev /mnt/local/dev

RESCUE s390vsl179:~ # chroot /mnt/local bash --login

s390vsl179:/ # update-bootloader --reinit

s390vsl179:/ # exit
logout

RESCUE s390vsl179:~ # reboot

Now the recreated system on my "replacement hardware" reboots
and it works and I can log in via SSH

# ssh root@10.161.155.180
...
s390vsl180:~ #

Now it shows the right hostname that belongs to its IP 10.161.155.180
because its hostname is set via DHCP (in contrast to the recovery system
that does not set its hostname via DHCP regardless that I also use DHCP
for the network setup of the recovery system).

mutable-dan commented at 2019-05-24 13:42:

@mutable-dan
yes, I would like to have your changes in
rescue/GNU/Linux/230_storage_and_network_modules.sh
committed because they are useful and needed
if a user sets MODULES=() in his etc/rear/local.conf

completed

mutable-dan commented at 2019-05-24 18:13:

I will try booting the kernel on a throw-away VM. I heard people talk about using kexec to boot a new kernel in a running machine, but I have not tried it yet.

jsmeix commented at 2019-05-27 07:36:

@mutable-dan
using kexec to boot a new kernel plus initrd in an already running system
is only my current workaround to boot the recovery system for my tests,
cf. https://github.com/rear/rear/pull/2142#issuecomment-494739752

But that is not a real solution because for real disaster recovery
it must be possible to boot the recovery system on "bare metal"
(where "bare metal" on IBM Z means an empty virtual machine).

jsmeix commented at 2019-05-27 09:35:

I found out why during "rear recover" the last partition end byte value
was decreased from 7385333760 to 7385198592
cf. https://github.com/rear/rear/pull/2142#issuecomment-495554406

I have in my "rear recover" log those lines (excerpts):

++ device_size=7385333760
++ [[ dasd == \g\p\t ]]
++ [[ dasd == \g\p\t\_\s\y\n\c\_\m\b\r ]]
++ [[ dasd == \d\a\s\d ]]
+++ mathlib_calculate '7385333760 - 33*4096'
+++ bc -ql
++ device_size=7385198592
...
++ start=838926336
++ end=7385333760
++ [[ -n 7385198592 ]]
++ ((  end > 7385198592  ))
++ LogPrint 'Partition rear-noname on /dev/dasda: size reduced to fit on disk.'
++ Log 'Partition rear-noname on /dev/dasda: size reduced to fit on disk.'
++ echo '2019-05-24 11:09:55.182732415 Partition rear-noname on /dev/dasda: size reduced to fit on disk.'
2019-05-24 11:09:55.182732415 Partition rear-noname on /dev/dasda: size reduced to fit on disk.
++ Print 'Partition rear-noname on /dev/dasda: size reduced to fit on disk.'
++ Log 'End changed from 7385333760 to 7385198592.'
++ echo '2019-05-24 11:09:55.184125605 End changed from 7385333760 to 7385198592.'
2019-05-24 11:09:55.184125605 End changed from 7385333760 to 7385198592.
++ end=7385198592
++ is_true ''
++ case "$1" in
++ return 1
++ [[ rear-noname == \r\e\a\r\-\n\o\n\a\m\e ]]
+++ basename /dev/dasda3
++ name=dasda3
++ [[ -n y ]]
++ [[ -n 7385198592 ]]
+++ mathlib_calculate '7385198592 - 1'
+++ bc -ql
++ end=7385198591B

The matching code for that is in the create_partitions() function in
usr/share/rear/layout/prepare/GNU/Linux/100_include_partition_code.sh

            device_size=$( get_disk_size  "$sysfs_name" )

            ### GPT disks need 33 LBA blocks at the end of the disk
            # For the SUSE specific gpt_sync_mbr partitioning scheme
            # see https://github.com/rear/rear/issues/544
            if [[ "$label" == "gpt" || "$label" == "gpt_sync_mbr" || "$label" == "dasd" ]] ; then
                device_size=$( mathlib_calculate "$device_size - 33*$block_size" )
                # Only if resizing all partitions is explicitly wanted
                # resizing of arbitrary partitions may also happen via the code below
                # in addition to layout/prepare/default/430_autoresize_all_partitions.sh
                if is_true "$autoresize_partitions" ; then
                    Log "Size reductions of GPT partitions probably needed."
                fi
            fi
...
        ### Test to make sure we're not past the end of the disk.
        if [[ "$device_size" ]] && (( end > $device_size )) ; then
            LogPrint "Partition $name on $device: size reduced to fit on disk."
            Log "End changed from $end to $device_size."
            end="$device_size"
        fi

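The size reduction in the trace can be re-checked directly with the numbers it logs:

```shell
# Re-checking the reduction from the trace: dasd was (wrongly) treated
# like GPT, which reserves 33 LBA blocks at the disk end, so
# 33 * 4096 bytes were subtracted from the disk size.
device_size=7385333760                        # disk size parted reports
block_size=4096                               # physical block size
reduced=$(( device_size - 33 * block_size ))
echo "$reduced"                               # 7385198592
echo "$(( reduced - 1 ))B"                    # 7385198591B, the too-small
                                              # partition end in diskrestore.sh
```
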
So the cause is that in this case dasd should not be treated
the same as GPT (which needs 33 LBA blocks at the end of the disk),
which means my added dasd must be removed
from this particular if clause so that it is again as it was before:

            ### GPT disks need 33 LBA blocks at the end of the disk
            # For the SUSE specific gpt_sync_mbr partitioning scheme
            # see https://github.com/rear/rear/issues/544
            if [[ "$label" == "gpt" || "$label" == "gpt_sync_mbr" ]] ; then

jsmeix commented at 2019-05-27 11:22:

@mutable-dan
now I use your current code from your
GitHub Innovation-Data-Processing/s390 branch.

I needed the following changes to make
the disk layout recreation "just work" for me:

--- usr/share/rear/layout/prepare/GNU/Linux/100_include_partition_code.sh.original      2019-05-27 10:38:53.938907196 +0200
+++ usr/share/rear/layout/prepare/GNU/Linux/100_include_partition_code.sh       2019-05-27 11:36:03.538907196 +0200
@@ -141,7 +141,7 @@
             # For the SUSE specific gpt_sync_mbr partitioning scheme
             # see https://github.com/rear/rear/issues/544
             # see https://github.com/rear/rear/pull/2142 for s390 partitioning
-            if [[ "$label" == "gpt" || "$label" == "gpt_sync_mbr" || "$label" == "dasd" ]] ; then
+            if [[ "$label" == "gpt" || "$label" == "gpt_sync_mbr" ]] ; then
                 device_size=$( mathlib_calculate "$device_size - 33*$block_size" )
                 # Only if resizing all partitions is explicity wanted
                 # resizing of arbitrary partitions may also happen via the code below

to fix https://github.com/rear/rear/pull/2142#issuecomment-496149143

Additionally I needed some bigger changes in
usr/share/rear/layout/save/GNU/Linux/200_partition_layout.sh
to get things right with the partition 'Name' value and hopefully
also with partition 'Flags' (but in my case there are no flags):

200_partition_layout.sh.txt

I needed to append .txt to the actual file name 200_partition_layout.sh
to be able to attach it here because GitHub does not support *.sh files, see
https://help.github.com/en/articles/file-attachments-on-issues-and-pull-requests

jsmeix commented at 2019-05-27 12:20:

To get the bootloaders (GRUB2 and ZIPL) installed
during "rear recover" on SLES I made a new script
usr/share/rear/finalize/SUSE_LINUX/s390/660_install_grub2_and_zipl.sh

660_install_grub2_and_zipl.sh.txt

Now for the first time "rear recover" on SLES-12-SP4
"just worked" for me:

RESCUE s390vsl179:~ # rear -D recover
Relax-and-Recover 2.5 / Git
Running rear recover (PID 578)
Using log file: /var/log/rear/rear-s390vsl179.log
Running workflow recover within the ReaR rescue/recovery system
Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if available.
Started RPC portmapper 'rpcbind'.
RPC portmapper 'rpcbind' available.
Started rpc.statd.
RPC status rpc.statd available.
Using backup archive '/tmp/rear.3uDnuGbRknJ8oB2/outputfs/s390vsl179/backup.tar.gz'
Will do driver migration (recreating initramfs/initrd)
Calculating backup archive size
Backup archive size is 692M     /tmp/rear.3uDnuGbRknJ8oB2/outputfs/s390vsl179/backup.tar.gz (compressed)
Comparing disks
Device dasda has expected (same) size 7385333760 (will be used for recovery)
Disk configuration looks identical
UserInput -I DISK_LAYOUT_PROCEED_RECOVERY needed in /usr/share/rear/layout/prepare/default/250_compare_disks.sh line 148
Proceed with recovery (yes) otherwise manual disk layout configuration is enforced
(default 'yes' timeout 30 seconds)

UserInput: No real user input (empty or only spaces) - using default input
UserInput: No choices - result is 'yes'
User confirmed to proceed with recovery
Start system layout restoration.
Disk '/dev/dasda': creating 'dasd' partition table
Disk '/dev/dasda': creating partition number 1 with name ''ext2''
Disk '/dev/dasda': creating partition number 2 with name ''ext2''
Disk '/dev/dasda': creating partition number 3 with name ''ext2''
Creating filesystem of type ext4 with mount point / on /dev/dasda3.
Mounting filesystem /
Creating filesystem of type ext2 with mount point /boot/zipl on /dev/dasda1.
Mounting filesystem /boot/zipl
Creating swap on /dev/dasda2
Disk layout created.
Restoring from '/tmp/rear.3uDnuGbRknJ8oB2/outputfs/s390vsl179/backup.tar.gz' (restore log in /var/lib/rear/restore/recover.backup.tar.gz.578.restore.log) ...
Backup restore program 'tar' started in subshell (PID=2175)
Restored 394 MiB [avg. 80801 KiB/sec] 
Restored 801 MiB [avg. 82065 KiB/sec] 
Restored 1172 MiB [avg. 80016 KiB/sec] 
Restored 1583 MiB [avg. 81078 KiB/sec] 
OK
Restored 1646 MiB in 25 seconds [avg. 67428 KiB/sec]
Restoring finished (verify backup restore log messages in /var/lib/rear/restore/recover.backup.tar.gz.578.restore.log)
Recreating directories (with permissions) from /var/lib/rear/recovery/directories_permissions_owner_group
Migrating disk-by-id mappings in certain restored files in /mnt/local to current disk-by-id mappings ...
Replacing restored udev rule '/mnt/local//etc/udev/rules.d/70-persistent-net.rules' with the one from the ReaR rescue system
Migrating network configuration files according to the mapping files ...
Installing GRUB2 boot loader plus ZIPL...
Finished recovering your system. You can explore it under '/mnt/local'.
Exiting rear recover (PID 578) and its descendant processes ...
Running exit tasks
You should also rm -Rf /tmp/rear.3uDnuGbRknJ8oB2
RESCUE s390vsl179:~ # reboot

For comparison on the original system

# parted -s /dev/dasda unit B print
Model: IBM S390 DASD drive (dasd)
Disk /dev/dasda: 7385333760B
Sector size (logical/physical): 512B/4096B
Partition Table: dasd
Disk Flags:

Number  Start       End          Size         File system     Flags
 1      98304B      314621951B   314523648B   ext2
 2      314621952B  838926335B   524304384B   linux-swap(v1)
 3      838926336B  7385333759B  6546407424B  ext4

versus on the recreated system

# parted -s /dev/dasda unit B print
Model: IBM S390 DASD drive (dasd)
Disk /dev/dasda: 7385333760B
Sector size (logical/physical): 512B/4096B
Partition Table: dasd
Disk Flags:

Number  Start       End          Size         File system     Flags
 1      98304B      314621951B   314523648B   ext2
 2      314621952B  838926335B   524304384B   linux-swap(v1)
 3      838926336B  7385333759B  6546407424B  ext4

i.e. byte-by-byte identical partitioning.
The correct 'File system' entries result from the mkfs commands,
so the hardcoded dummy 'ext2' partition names that are shown during

Disk '/dev/dasda': creating partition number 1 with name ''ext2''
Disk '/dev/dasda': creating partition number 2 with name ''ext2''
Disk '/dev/dasda': creating partition number 3 with name ''ext2''

do not matter in the end.
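A minimal sketch of why the dummy value is harmless (illustrative shell, not actual ReaR code; the shortened `fs` entry is an assumption for brevity):

```shell
# Sketch (not actual ReaR code): the type field of a 'part' entry is
# only used as a cosmetic partition name, while the real filesystem
# type comes from the corresponding 'fs' entry, which ReaR turns into
# the matching mkfs call when recreating the layout.
part_line='part /dev/dasda 6546407424 838926336 ext2 none /dev/dasda3'
fs_line='fs /dev/dasda3 / ext4'

# Split the entries on whitespace (field 5 of 'part' is the dummy name):
set -- $part_line
part_name=$5
set -- $fs_line
device=$2
fs_type=$4

echo "partition name (cosmetic only): $part_name"
echo "would run: mkfs.$fs_type $device"
```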

jsmeix commented at 2019-05-28 10:33:

My pull request https://github.com/rear/rear/pull/2150
contains the changes of my above
https://github.com/rear/rear/pull/2142#issuecomment-496178838
and
https://github.com/rear/rear/pull/2142#issuecomment-496194073

mutable-dan commented at 2019-05-28 21:31:

> My pull request #2150
> contains the changes of my above
> #2142 (comment)
> and
> #2142 (comment)

hello,

I will merge your changes mentioned above into my fork. The one area I will need to look at is the assumption of ext2: if it acts just as a placeholder, it should be OK; if it overrides the actual filesystem type, it could be a problem. RHEL on Z uses xfs.

thank you

jsmeix commented at 2019-05-29 10:23:

@mutable-dan
right now I merged https://github.com/rear/rear/pull/2150
so that now there should be generic support for dasd disks
in the current ReaR GitHub master code.

Regarding why using a dummy file system type 'ext2' for
dasd disk 'part' entries should work, e.g. as in my disklayout.conf

part /dev/dasda 314523648 98304 ext2 none /dev/dasda1
part /dev/dasda 524304384 314621952 ext2 none /dev/dasda2
part /dev/dasda 6546407424 838926336 ext2 none /dev/dasda3

see the comment at
https://github.com/rear/rear/blob/master/usr/share/rear/layout/save/GNU/Linux/200_partition_layout.sh#L175

Regarding (re)-installing the bootloader on s390 on Red Hat
I think you need to make a Red Hat specific new script named like
(directories named .../Fedora/... are also used for RHEL)

usr/share/rear/finalize/Fedora/s390/660_install_zipl.sh

since Red Hat uses plain zipl as the bootloader on IBM Z.
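As a starting point, such a finalize script might look roughly like the following sketch (an assumption, not the merged implementation; `LogPrint`/`LogPrintError` are stubbed here so the sketch is self-contained, and `TARGET_FS_ROOT` is ReaR's mount point for the restored system, normally /mnt/local):

```shell
# Hypothetical sketch of finalize/Fedora/s390/660_install_zipl.sh:
# re-run plain zipl inside the restored system via chroot.
# Minimal stand-ins for ReaR's log functions:
LogPrint() { echo "$*" ; }
LogPrintError() { echo "Error: $*" >&2 ; }

# TARGET_FS_ROOT is where "rear recover" mounts the restored system:
TARGET_FS_ROOT="${TARGET_FS_ROOT:-/mnt/local}"

if chroot "$TARGET_FS_ROOT" /bin/bash --login -c 'type -p zipl' >/dev/null 2>&1 ; then
    # zipl reads the restored /etc/zipl.conf and writes the IPL records:
    chroot "$TARGET_FS_ROOT" /bin/bash --login -c 'zipl' || LogPrintError "zipl failed"
    LogPrint "Re-installed zipl bootloader in $TARGET_FS_ROOT"
else
    LogPrintError "No zipl found in $TARGET_FS_ROOT - bootloader not re-installed"
fi
```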

mutable-dan commented at 2019-05-29 21:17:

re https://github.com/rear/rear/pull/2142#issuecomment-496878540
understood

I merged your commits to master into my fork.

mutable-dan commented at 2019-05-31 22:02:

Working on the RHEL zIPL bootloader:
https://github.com/rear/rear/pull/2142#issuecomment-496194073
and
https://github.com/rear/rear/pull/2142#issuecomment-496878540

Created the initrd and kernel on NFS and we were able to IPL. We don't have the recover menu yet because of the above.
I have SUSE 11 & 12 zLinux systems which I will be able to IPL also.

mutable-dan commented at 2019-06-04 20:21:

We have been IPLing the generated kernel and initrd. We found we were missing a few tools, so we
added s390 tools for working with dasd and qeth, see below. Working on recovery...

PROGS=( "${PROGS[@]}" cmsfscat  cmsfsck  cmsfscp  cmsfslst  cmsfsvol )
PROGS=( "${PROGS[@]}" chccwdev chshut chiucvallow chchp tape390_crypt tape390_display )
PROGS=( "${PROGS[@]}" lsdasd lsqeth lstape )
PROGS=( "${PROGS[@]}" cio_ignore zdump zfcpconf.sh zgetdump ziomon ziomon_mgr ziomon_zfcpdd ziorep_traffic znetconf )
PROGS=( "${PROGS[@]}" fcp_cio_free zfcpdbf zic ziomon_fcpconf ziomon_util ziorep_config ziorep_utilization znet_cio_free zramctl )

jsmeix commented at 2019-06-05 09:46:

@mutable-dan
a general note regarding things as in
https://github.com/rear/rear/pull/2142#issuecomment-498828073

In
https://github.com/rear/rear/pull/2142/files
I do not see where e.g. the program cmsfscat is called by a ReaR script.

In general regarding things like PROGS, COPY_AS_IS, REQUIRED_PROGS
and LIBS:

What is specified in those arrays is only what is needed to be run
within the running ReaR recovery system, i.e. what is called
during recovery system startup (e.g. by skel/default/etc/scripts)
and during "rear recover".

Currently I fail to see how all those PROGS and REQUIRED_PROGS
that you specified are run from within the ReaR recovery system
or at all by a ReaR script.

REQUIRED_PROGS are additionally required to exist
in the original system because e.g. "rear mkrescue"
aborts in init/default/950_check_missing_programs.sh
when an element of the REQUIRED_PROGS is missing.

See the comment in init/default/950_check_missing_programs.sh
regarding how to add programs to REQUIRED_PROGS as needed
from other ReaR scripts, for example see the script
layout/save/GNU/Linux/230_filesystem_layout.sh
and cf. https://github.com/rear/rear/issues/1963
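The pattern described above can be sketched as follows (illustrative only; the /sys/block/dasd* condition is an assumption for the example, not code from this pull request):

```shell
# Sketch of the pattern jsmeix describes: a ReaR script appends to
# REQUIRED_PROGS only when the system actually needs the tool, so
# "rear mkrescue" aborts only for genuinely required programs; merely
# useful tools go into PROGS, which does not abort when missing.
REQUIRED_PROGS=()
PROGS=()

if ls /sys/block/dasd* >/dev/null 2>&1 ; then
    # dasd disks present: these tools are mandatory for recovery
    REQUIRED_PROGS+=( lsdasd chccwdev )
else
    # no dasd disks: ship the tools if available, but do not abort
    PROGS+=( lsdasd chccwdev )
fi

echo "required: ${REQUIRED_PROGS[*]:-none}"
echo "optional: ${PROGS[*]:-none}"
```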

mutable-dan commented at 2019-06-05 16:54:

> @mutable-dan
> a general note regarding things as in
> #2142 (comment)

Hi
We put everything Z-related in right now while we figure out how to manually get the network, storage devices and filesystems up. It's a slow process building recovery images and booting under z/VM.

Some of the links provided add clarity to the usage of PROGS, COPY_AS_IS, REQUIRED_PROGS and LIBS, and https://github.com/rear/rear/blob/master/doc/user-guide/09-design-concepts.adoc helps a little too, but I have been unable to find a complete description of these variables and how they should be used. Any info would be helpful.
The difference between a recovery and original system was also a bit confusing; once we figure out what is needed, I'll pull out the rest.

jsmeix commented at 2019-06-06 14:55:

@mutable-dan
while you are trying things out I suggest to set things
like PROGS, COPY_AS_IS, REQUIRED_PROGS, LIBS
only in your etc/rear/local.conf

When you know what you need where and when,
you could add it into scripts.

This way I think I could merge this pull request soon,
provided it no longer contains things that are not needed
for a very first (limited) usage of ReaR on IBM Z,
e.g. only having the kernel and ReaR's initrd on an NFS server
but not yet support for how to boot/IPL them on a Z system.

The support how to boot/IPL them on a Z system
can later be added via separated subsequent pull requests,
cf. the part about "Submit early, submit often" in
https://github.com/rear/rear/commit/f719fee860f656859ec12838b0c5b36d32532d6a#commitcomment-33455129
and
https://github.com/rear/rear/commit/f719fee860f656859ec12838b0c5b36d32532d6a#commitcomment-33473579

mutable-dan commented at 2019-06-07 17:45:

Moved most of the tools referenced in 305_include_s390_tools.sh to local.conf.

mutable-dan commented at 2019-06-13 22:58:

@jsmeix

Manually enabled networking and the dasd device on a zLinux instance. Modified 200_partition_layout.sh to write the additional info needed to dasd-format and partition the drive to disklayout.conf:

# dasdfmt - disk layout is either cdl for the compatible disk layout (default) or ldl
#  example usage: dasdfmt -b 4096 -d cdl -y /dev/dasda
# dasdfmt /dev/dasda
# dasdfmt -b <blocksize> -d <layout> -y <devname>
dasdfmt -b 4096 -d CDL -y /dev/dasda
# fdasd /dev/dasda
# write fdasd as device [start,end,type]
# repeat [start,end,type] for each partition
# where: start - start track, end - end track
# parse line fdasd_config into a file and call fdasd -c filename /dev/dasda
#        type: optional and specifies the partition type. <type> can be one of: native, swap, raid, lvm, or gpfs.  If omitted, 'native' is used
# fdasd -c /var/lib/rear/layout/fdasd.conf <devname>
fdasd_config /dev/dasda [2,10668,native] [10669,491399,lvm] 
# Disk /dev/dasda
# Format: disk <devname> <size(bytes)> <partition label type>
disk /dev/dasda 24153292800 dasd
# Partitions on /dev/dasda
# Format: part <device> <partition size(bytes)> <partition start(bytes)> <partition type|name> <flags> /dev/<partition>
part /dev/dasda 524304384 98304 ext2 none /dev/dasda1
part /dev/dasda 23628890112 524402688 ext2 none /dev/dasda2
# Format for LVM PVs

Please let me know if there is a more appropriate place for this code. The next step is to find the place in recovery to read this and prep the drives for the 'normal' Linux setup.
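The fdasd_config entry above could be expanded into a config file for `fdasd -c` roughly like this (a sketch; the parsing and temp file handling are illustrative, and the fdasd call itself is only echoed, not run):

```shell
# Sketch: expand the fdasd_config entry from disklayout.conf into a
# config file for 'fdasd -c'. The file path and parsing are
# illustrative; the fdasd call itself is only echoed, not run.
layout_line='fdasd_config /dev/dasda [2,10668,native] [10669,491399,lvm]'

config_file=$(mktemp)
set -f                    # disable globbing: the specs contain [ ]
set -- $layout_line
set +f
devname=$2
shift 2
# one "[start,end,type]" partition spec per line, as 'fdasd -c' expects:
printf '%s\n' "$@" > "$config_file"

echo "would run: fdasd -c $config_file $devname"
cat "$config_file"
rm -f "$config_file"
```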

mutable-dan commented at 2019-06-14 15:35:

> I think for overall consistency it must be ensured that there is
> a non-empty <partition type|name> value in disklayout.conf
> cf. https://github.com/rear/rear/pull/2142/files#r285583301

Completed; it just was not referenced in the commit.

jsmeix commented at 2019-06-17 13:19:

@mutable-dan
with your changes in layout/save/GNU/Linux/200_partition_layout.sh via your
https://github.com/rear/rear/pull/2142/commits/335f060bbb89e46baa9d484fa501bfa28e243378
you introduce support for preparing a dasd disk
with the commands dasdfmt plus fdasd
instead of parted which is normally used to prepare a disk
but you do not exclude that parted is also used
when dasdfmt plus fdasd should be used.

As far as I know newer parted supports preparing a dasd disk
and on current SLES we (i.e. SUSE) do no longer provide the
commands dasdfmt and fdasd so that for IBM Z support
on current SLES only parted is used to prepare a dasd disk
and it seems parted is the only tool that is supported by SUSE
to prepare a dasd disk.
At least for my tests here plain parted had worked for me
to prepare the dasd disk on my IBM Z test system.

Therefore my question:
What is the reason why you need to additionally support
preparing a dasd disk with dasdfmt plus fdasd?

When two kind of tools can be used to do the same thing in ReaR
both methods must exclude each other (otherwise parted would
be run in addition after dasdfmt and fdasd had been run)
and both methods must be continuously maintained
in ReaR for a possibly very long time in the future.

In other words:
When there is no mandatory reason to use dasdfmt plus fdasd
to prepare a dasd disk, I suggest to not add support for them to
keep the code in ReaR simpler and better maintainable
(the current code is already rather hard to maintain).

mutable-dan commented at 2019-06-17 20:41:

> @mutable-dan
> with your changes in layout/save/GNU/Linux/200_partition_layout.sh via your

On RHEL dasdfmt looks to be needed; without it the disk does not show up as a block device. It looks like I can use parted instead of fdasd on RHEL. I am testing...

jsmeix commented at 2019-06-18 14:10:

@mutable-dan
what parted does during rear recover is implemented in
layout/prepare/GNU/Linux/100_include_partition_code.sh
therein rather indirectly via the create_disk() function
that gets called during rear recover as needed by
layout/prepare/default/540_generate_device_code.sh
therein again indirectly via the create_device() function
that is implemented in lib/layout-functions.sh which calls

create_$type "$device"

where $type is basically the keyword in disklayout.conf
cf. "Disk layout file syntax" at
https://github.com/rear/rear/blob/master/doc/user-guide/06-layout-configuration.adoc

E.g. for the disk keyword in disklayout.conf

create_disk "$device"

is called (i.e. the create_disk() function in 100_include_partition_code.sh)
and that function calls the create_partitions() function that is also
implemented in 100_include_partition_code.sh so that finally
the create_partitions() function emits parted command calls
(and other stuff) into the diskrestore.sh script that is finally called by
usr/share/rear/layout/recreate/default/200_run_layout_code.sh

Just simple - isn't it ;-)
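Condensed into a runnable sketch (simplified, not the real ReaR functions):

```shell
# Simplified sketch of the dispatch described above (not real ReaR
# code): create_device() derives a function name from the disklayout
# keyword and calls it, and each create_* function appends commands
# to the diskrestore.sh script (here $LAYOUT_CODE) instead of
# executing anything directly.
LAYOUT_CODE=$(mktemp)    # stands in for diskrestore.sh

create_disk() {
    # emit the disk preparation commands into diskrestore.sh:
    cat >> "$LAYOUT_CODE" <<EOF
parted -s $1 mklabel dasd
EOF
}

create_device() {
    local device=$1 type=$2
    # the disklayout.conf keyword ("disk") selects create_disk():
    "create_$type" "$device"
}

create_device /dev/dasda disk
generated=$(cat "$LAYOUT_CODE")
rm -f "$LAYOUT_CODE"
echo "$generated"
```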

Seriously:

The indirection via the diskrestore.sh script is crucial and mandatory.

Nothing must ever happen to the user's target system
before the diskrestore.sh script is run.

All commands that prepare the user's target system disks must be written
into the diskrestore.sh script (via appropriate create_$type functions
or other appropriate functions or code in rear recover scripts), and
no command that changes a user's target system disk
must ever be run directly by a rear recover script.

The reason behind is:

The diskrestore.sh script is the one and only piece of code
that recreates the disk layout (i.e. partitions, filesystems, mount points)
of all the user's target system disks.

This way the user can see the actual commands
that recreate his target system disks, cf.
https://github.com/rear/rear/pull/2081#issuecomment-496175444
and
the user can modify those commands as needed
e.g. when something fails cf.

... if the worst comes to the worst - even temporary quick
and dirty workarounds are relatively easily possible ...

in the section "Disaster recovery with Relax-and-Recover (ReaR)"
in https://en.opensuse.org/SDB:Disaster_Recovery
or in case of a migration where the target system disks
are not fully compatible with the original system disks
(e.g. migration from two smaller disks onto one big disk).

mutable-dan commented at 2019-06-24 16:10:

With this check-in rhel is able to properly partition dasd drives. There are some items that need to be discussed, such as placement of the dasd enable and dasd info placed in the disklayout.conf. All suggestions are, of course, welcome.

usr/share/rear/layout/prepare/GNU/Linux/100_include_partition_code.sh
This is a temp fix:
+ #if [[ "$label" == "gpt" || "$label" == "gpt_sync_mbr" || "$label" == "dasd" ]] ; then
+ if [[ "$label" == "gpt" || "$label" == "gpt_sync_mbr" ]] ; then
removed dasd while we figure out how to handle this
If I put the dasd condition back in and add something for SUSE such as "$OS_VENDOR" == "SUSE_LINUX", will that work?
RHEL dasd disks basically use DOS partitions; the GPT re-calculation caused RHEL to fail when partitioning

usr/share/rear/layout/prepare/default/205_s390_enable_disk.sh
The dasd drive on RHEL needs to be enabled before partitioning; is this a good place?

usr/share/rear/layout/save/GNU/Linux/200_partition_layout.sh
currently putting dasd info, specifically the bus-id, in disklayout.conf
echo "dasd_channel $( lsdasd | grep dasd | awk '{ print $1 " " $2 " " $3 " " $4}' )"
Putting it here simplifies 200_partition_layout.sh. We would probably need to write another loop over /sys/block if we do not put it in disklayout.conf, unless you know a bash way to temporarily redirect stdout to another file within the loop. Does this potentially break anything?

Bus-ID     Status      Name      Device  Type  BlkSz  Size      Blocks
==============================================================================
0.0.0100   active      dasda     94:0    ECKD  4096   23034MB   5896800

jsmeix commented at 2019-06-25 07:51:

@mutable-dan
to have different code for different Linux distributions
I recommend to use OS_MASTER_VENDOR
because that is more general than OS_VENDOR.

Regarding how OS_MASTER_VENDOR is set see
usr/share/rear/lib/config-functions.sh

Accordingly code like the following is fine:

if test "SUSE_LINUX" = "$OS_VENDOR" ; then
    # Special case for SUSE
    ...
else
    # Generic case for all other Linux distributions 
    ...
fi

Probably even better because the case construct can be more easily
enhanced and adapted if needed in the future for more Linux distributions:

case $OS_MASTER_VENDOR in
    (SUSE)
        # Specific code for SUSE
        ...
        ;;
    (Fedora)
        # Specific code for Fedora, Red Hat, CentOS, Scientific Linux, and Oracle Linux
        ...
        ;;
    (Debian)
        # Specific code for Debian, Ubuntu, and Linux Mint
        ...
        ;;
    (Arch)
        # Specific code for Arch Linux
        ...
        ;;
    (*)
        # Generic default case
        ...
        ;;
esac

cf. get_part_device_name_format() in usr/share/rear/lib/layout-functions.sh


Regarding dasd drive on rhel... needs to be enabled before partitioning:

In your usr/share/rear/layout/prepare/default/205_s390_enable_disk.sh
I do not see how you write the commands to enable a dasd drive
into the diskrestore.sh file - I would expect code like

    cat >> "$LAYOUT_CODE" <<EOF
# enable the dasd device on $bus (i.e. sets the device online)
chccwdev -e $bus
EOF

cf. usr/share/rear/layout/prepare/GNU/Linux/100_include_partition_code.sh

I would add the dasd drive enable code to the create_disk() function in
usr/share/rear/layout/prepare/GNU/Linux/100_include_partition_code.sh
e.g. something like

if [[ "$ARCH" == "Linux-s390" && "$OS_VENDOR" != "SUSE_LINUX" ]] ; then

   cat >> "$LAYOUT_CODE" <<EOF
# enable the dasd device on $bus (i.e. sets the device online)
chccwdev -e $bus
EOF

fi

but I do not know about Red Hat on IBM Z so perhaps a separate script
is better.

In any case you must never call commands that change a disk directly,
cf. my https://github.com/rear/rear/pull/2142#issuecomment-503152699 -
all those commands must happen via the diskrestore.sh script.

I do not know if chccwdev -e $bus actually changes an existing disk.

According to
https://www.ibm.com/support/knowledgecenter/en/linuxonibm/com.ibm.linux.z.lgdd/lgdd_r_chccwdev_cmd.html

Use the chccwdev command to set attributes for CCW devices
and to set CCW devices online or offline.

the chccwdev command can set attributes which seems it
can actually change an existing disk.

In general I would prefer to have all commands that are needed
to set up the disks in the diskrestore.sh script - i.e. also needed
commands that do not actually change a disk.

Reason:

I would prefer to have the diskrestore.sh script complete
so that the user could use that script alone on bare metal
to set up the disks, cf. my above
https://github.com/rear/rear/pull/2142#issuecomment-503152699

> The diskrestore.sh script is the one and only piece of code
> that recreates the disk layout

I know in current ReaR this is not always true.
There is some "initial preparation stuff" elsewhere
(e.g. there could be some special udev scripts and
there are initially needed commands like mpathconf
and initially needed services like multipathd, cf.
usr/share/rear/layout/prepare/GNU/Linux/210_load_multipath.sh).
But I think the goal should be to try to have as much as possible
in the diskrestore.sh script (as far as possible with reasonable effort).


Regarding putting dasd info, specifically bus-id, info in disklayout.conf

I think it is fine to put such info into disklayout.conf
as long as the general disklayout.conf syntax

# a comment
unique_component_keyword first_value  second_value ...

is used, cf. the section "Disk layout file syntax" in
https://github.com/rear/rear/blob/master/doc/user-guide/06-layout-configuration.adoc
in particular

None of the component keywords is a leading substring
of another component keyword

so your new component keyword dasd_channel is o.k.
because there is no component keyword dasd.
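A quick illustration of why the "no leading substring" rule matters: entries are picked out of disklayout.conf with keyword-anchored grep, so `disk`, `part` and the new `dasd_channel` keyword stay cleanly apart:

```shell
# Illustrative check of the "no leading substring" rule: because no
# existing keyword is 'dasd', a keyword-anchored grep for
# 'dasd_channel' cannot collide with 'disk' or 'part' entries.
disklayout='disk /dev/dasda 24153292800 dasd
part /dev/dasda 524304384 98304 ext2 none /dev/dasda1
dasd_channel 0.0.0100 active dasda 94:0'

disks=$(printf '%s\n' "$disklayout" | grep '^disk ')
channels=$(printf '%s\n' "$disklayout" | grep '^dasd_channel ')

echo "$disks"
echo "$channels"
```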

mutable-dan commented at 2019-06-25 22:04:

@jsmeix
I am looking at your comments above....

Also looking at the networking.

on RHEL the network devices are not enabled on boot:
/sys/class/net/*
has no devices, and thus we cannot find a MAC

the manual process would be to (without persisting device info)
scan the devices

znetconf -u
Scanning for network devices...
Device IDs                 Type    Card Type      CHPID Drv.
------------------------------------------------------------
0.0.0600,0.0.0601,0.0.0602 1731/01 OSA (QDIO)        00 qeth
0.0.0610,0.0.0611,0.0.0612 1731/01 OSA (QDIO)        02 qeth
0.0.7400,0.0.7401,0.0.7402 1731/05 HiperSockets      03 qeth

ignore the HiperSockets
then for each device

Update:
need to remove the ignore entry for each device scanned by znetconf -u (excluding HiperSockets)

cio_ignore -r 0.0.0600,0.0.0601,0.0.0602
cio_ignore -r 0.0.0610,0.0.0611,0.0.0612

then add the devices:

znetconf -a 0600
Scanning for network devices...
Successfully configured device 0.0.0600 (enccw0.0.0600)

znetconf -a 0610
Scanning for network devices...
/bin/znetconf: line 560: echo: write error: No such device
znetconf: Error: Failed to make 0.0.0610 online

in this case 0610 is not configured

and for those added run:

znetconf -a 0600
Scanning for network devices...
Successfully configured device 0.0.0600 (enccw0.0.0600)

this needs to be done before etc/scripts/system-setup.d/55-migrate-network-devices.sh
How would you propose doing this for zLinux only?
Perhaps:

  • write a 50-s390-add-network-device.sh to this dir if s390 and not suse --> znetconf
  • in 55-migrate-network-devices.sh, check if znetconf exists and if /sys/class/net/ does not have a device --> znetconf
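A hedged sketch of what such a generic setup script could look like (the znetconf/cio_ignore handling is an assumption derived from the output shown above, not the merged implementation):

```shell
# Hypothetical sketch of a generic system-setup.d script that enables
# s390 network devices only where the tooling exists and no network
# device is up yet (SLES also ships znetconf, but there the devices
# are usually already online, so the same code would do nothing).
prepare_s390_network_devices() {
    case "$ARCH" in (Linux-s390*) ;; (*) return 0 ;; esac
    type znetconf >/dev/null 2>&1 || return 0
    # skip when network devices already exist:
    ls /sys/class/net/ 2>/dev/null | grep -q . && return 0
    # enable each qeth device reported by 'znetconf -u', skipping
    # HiperSockets: unignore its IDs, then add it by the first ID:
    znetconf -u | awk '/qeth/ && !/HiperSockets/ { print $1 }' |
    while read -r ids ; do
        cio_ignore -r "$ids"
        znetconf -a "${ids%%,*}"
    done
}

# system-setup sources /usr/share/rear/conf/default.conf which sets
# ARCH; default it here so the sketch is self-contained:
ARCH="${ARCH:-Linux-$(uname -m)}"
prepare_s390_network_devices
```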

jsmeix commented at 2019-06-26 10:30:

@mutable-dan
I think in the ReaR recovery system setup scripts

usr/share/rear/skel/default/etc/scripts/system-setup
usr/share/rear/skel/default/etc/scripts/system-setup.d/*.sh

you don't have all those Linux distribution specific variables
like OS_MASTER_VENDOR set because the recovery system setup scripts
do not call the SetOSVendorAndVersion function in lib/config-functions.sh
so that in the recovery system setup scripts you cannot write code like

if test "Fedora" = $OS_MASTER_VENDOR ; then

as you could do in "normal" ReaR scripts where "normal ReaR scripts"
mean those scripts that are run when e.g. "rear mkbackup"
or "rear recover" are run, i.e. scripts that run "within" usr/sbin/rear.

I think a new separated generic recovery system setup script

usr/share/rear/skel/default/etc/scripts/system-setup.d/50-prepare-network-devices.sh

would be best to enable network devices as needed.

I think adding such code to the 55-migrate-network-devices.sh script is wrong
because it is not about migrating network devices.

I think the script name should not be specific to the architecture
(i.e. no 's390' in the script name) so that this script can also be used
if needed on other architectures in the future.

Inside the script the code can do what is needed to distinguish between
different architectures.

On SUSE we also have /sbin/znetconf (via our s390-tools RPM) installed
so that you cannot distinguish between RHEL and SLES whether or not
a command znetconf exists.

E.g. on my s390 test system I get

# znetconf -u
Scanning for network devices...
Device IDs                 Type    Card Type      CHPID Drv. 
------------------------------------------------------------
0.0.0700,0.0.0701,0.0.0702 1731/01 OSA (QDIO)        01 qeth

# znetconf -a 0700
Scanning for network devices...
Successfully configured device 0.0.0700 (eth1)

# znetconf -c
Device IDs                 Type    Card Type      CHPID Drv. Name             State  
-------------------------------------------------------------------------------------
0.0.0800,0.0.0801,0.0.0802 1731/01 Virt.NIC QDIO     00 qeth eth0             online 
0.0.0700,0.0.0701,0.0.0702 1731/01 Virt.NIC QDIO     01 qeth eth1             online

But I assume when a command znetconf exists
(i.e. this is the test to find out we are on s390 architecture)
you can run znetconf -u and znetconf -a in any case
and nothing will go actually wrong so that
on RHEL the code would enable network devices while
on SLES the same code would actually do nothing.

To distinguish the architecture it seems you can use
the ARCH variable in recovery system setup scripts
because usr/share/rear/skel/default/etc/scripts/system-setup
sources /usr/share/rear/conf/default.conf which sets ARCH.

jsmeix commented at 2019-06-26 10:35:

@mutable-dan
FYI:
The main networking setup in the ReaR recovery system
happens during "rear mkrescue/mkbackup" via

usr/share/rear/rescue/GNU/Linux/310_network_devices.sh

that generates an additional recovery system setup script

$ROOTFS_DIR/etc/scripts/system-setup.d/60-network-devices.sh

which is only in the ReaR recovery system.
To see what files there are in your particular ReaR recovery system
after you did run "rear mkrescue/mkbackup" specify

KEEP_BUILD_DIR="yes"

in you etc/rear/local.conf - cf. my a bit outdated etc/rear/local.conf in
https://github.com/rear/rear/pull/2142#issuecomment-494426331
my current etc/rear/local.conf is

OUTPUT=RAMDISK
BACKUP=NETFS
BACKUP_OPTIONS="nfsvers=3,nolock"
BACKUP_URL=nfs://10.160.67.243/nfs
SSH_ROOT_PASSWORD="rear"
USE_DHCLIENT="yes"
KEEP_BUILD_DIR="yes"
FIRMWARE_FILES=( 'no' )
PROGRESS_MODE="plain"
PROGRESS_WAIT_SECONDS="5"

and see the KEEP_BUILD_DIR description in default.conf

mutable-dan commented at 2019-06-28 23:00:

@jsmeix
see https://github.com/rear/rear/pull/2142#issuecomment-505818214
I have updated 310_network_devices.sh to write scripts/system-setup.d/50-prepare-network-devices.sh
Tested on RHEL.

see https://github.com/rear/rear/pull/2142#issuecomment-505330190

> In your usr/share/rear/layout/prepare/default/205_s390_enable_disk.sh
> I do not see how you write the commands to enable a dasd drive
> into the diskrestore.sh file - I would expect code like

For 205_s390_enable_disk.sh, I tried writing to diskrestore.sh but had some problems:
currently I am not using zipl to modify the boot loader, so the system boots, runs the setup scripts (which enable the networking device), and I then run rear recover,
specifically rear -vDd recover,
but the dasd devices are not enabled because diskrestore.sh has not run yet and I had changed layout/prepare/default/205_s390_enable_disk.sh to conditionally write to diskrestore.sh.

When does diskrestore.sh run?
Is it called by the recover workflow or meant to be run manually?
If manually, should the dasd disks also be enabled in layout/prepare/default/205_s390_enable_disk.sh?

jsmeix commented at 2019-07-01 08:24:

@mutable-dan
in general to get a better idea which scripts are run when
use the simulation mode via the -s option like (excerpt)

# rear -s recover
...
Source layout/prepare/GNU/Linux/100_include_partition_code.sh
...
Source layout/prepare/default/540_generate_device_code.sh
...
Source layout/recreate/default/100_confirm_layout_code.sh
Source layout/recreate/default/200_run_layout_code.sh

usually the layout/prepare/.../...include...code.sh scripts
contain only functions that are run later,
the layout/prepare/default/540_generate_device_code.sh
generates the diskrestore.sh script,
the layout/recreate/default/100_confirm_layout_code.sh
is a user dialog that appears in MIGRATION_MODE
where the user can edit the diskrestore.sh script if needed,
finally the diskrestore.sh script is run by
the layout/recreate/default/200_run_layout_code.sh script.

I would recommend to run "rear recover" in MIGRATION_MODE via

# export MIGRATION_MODE="true"

# rear -D recover

This way you get several user dialogs that you can confirm step by step.
In each one you can also Use Relax-and-Recover shell
and return back to that particular user dialog.
In the Relax-and-Recover shell you can inspect or even adapt
the system at that state during the "rear recover" process.

For example this way you could manually enable your dasd disks
before the diskrestore.sh script is run to find out what exact
command works for you to enable your dasd disks.
Or you could add that command to your diskrestore.sh script.

mutable-dan commented at 2019-07-02 21:25:

see https://github.com/rear/rear/pull/2142#issuecomment-505330190

> In your usr/share/rear/layout/prepare/default/205_s390_enable_disk.sh
> I do not see how you write the commands to enable a dasd drive
> into the diskrestore.sh file - I would expect code like

I don't see where diskrestore.sh gets called.

Below are all the scripts called

on boot:


Running 00-functions.sh...
Running 01-run-ldconfig.sh...
Running 10-console-setup.sh...
Running 20-check-boot-options.sh...
Running 40-start-udev-or-load-modules.sh...
Running 41-load-special-modules.sh...
Running 42-engage-scsi.sh...
Running 45-serial-console.sh...
Running 50-prepare-network-devices.sh...  <----------------------- zlinux rhel enable network devices
Running 55-migrate-network-devices.sh...
Running 58-start-dhclient.sh...
Running 60-network-devices.sh...
Running 62-routing.sh...
Running 65-sysctl.sh...
Running 99-makedev.sh...
Right now zipl is not called to set up the boot loader, so rear is not started automatically when the recovery system boots.

(I need to look into the proper sequences)

rear -vDd recover - or as you mentioned, rear -s recover


Relax-and-Recover 2.4 / Git
Running rear recover (PID 1465)
Using log file: /var/log/rear/rear-red72a1.log
Simulation mode activated, Relax-and-Recover base directory: /usr/share/rear
Source conf/GNU/Linux.conf
Source init/default/005_verify_os_conf.sh
Source init/default/010_EFISTUB_check.sh
Source init/default/010_set_drlm_env.sh
Source init/default/030_update_recovery_system.sh
Source init/default/050_check_rear_recover_mode.sh
Source init/default/950_check_missing_programs.sh
Source setup/default/005_ssh_agent_start.sh
Source setup/default/010_pre_recovery_script.sh
Source verify/default/020_cciss_scsi_engage.sh
Source verify/default/020_translate_url.sh
Source verify/default/030_translate_tape.sh
Source verify/default/040_validate_variables.sh
Source verify/default/050_create_mappings_dir.sh
Source verify/GNU/Linux/050_sane_recovery_check.sh
Source verify/GNU/Linux/230_storage_and_network_modules.sh
Source verify/GNU/Linux/260_recovery_storage_drivers.sh
Source layout/prepare/default/010_prepare_files.sh
Source layout/prepare/GNU/Linux/100_include_partition_code.sh
Source layout/prepare/GNU/Linux/110_include_lvm_code.sh
Source layout/prepare/GNU/Linux/120_include_raid_code.sh
Source layout/prepare/GNU/Linux/131_include_filesystem_code.sh
Source layout/prepare/GNU/Linux/133_include_mount_filesystem_code.sh
Source layout/prepare/GNU/Linux/135_include_btrfs_subvolumes_generic_code.sh
Source layout/prepare/GNU/Linux/136_include_btrfs_subvolumes_SLES_code.sh
Source layout/prepare/GNU/Linux/140_include_swap_code.sh
Source layout/prepare/GNU/Linux/150_include_drbd_code.sh
Source layout/prepare/GNU/Linux/160_include_luks_code.sh
Source layout/prepare/GNU/Linux/170_include_hpraid_code.sh
Source layout/prepare/GNU/Linux/180_include_opaldisk_code.sh
Source layout/prepare/default/200_recreate_hpraid.sh
Source layout/prepare/default/205_s390_enable_disk.sh   <----------------------- does chccwdev -e [channel] from disklayout.conf :-o
Source layout/prepare/GNU/Linux/210_load_multipath.sh
Source layout/prepare/default/250_compare_disks.sh
Source layout/prepare/default/270_overrule_migration_mode.sh
Source layout/prepare/default/300_map_disks.sh
Source layout/prepare/default/310_remove_exclusions.sh
Source layout/prepare/default/320_apply_mappings.sh
Source layout/prepare/default/420_autoresize_last_partitions.sh
Source layout/prepare/default/430_autoresize_all_partitions.sh
Source layout/prepare/default/500_confirm_layout_file.sh
Source layout/prepare/default/510_list_dependencies.sh
Source layout/prepare/default/520_exclude_components.sh
Source layout/prepare/default/540_generate_device_code.sh
Source layout/prepare/default/550_finalize_script.sh
Source layout/prepare/default/600_show_unprocessed.sh
Source layout/prepare/default/610_exclude_from_restore.sh
Source layout/recreate/default/100_confirm_layout_code.sh
Source layout/recreate/default/200_run_layout_code.sh
Source layout/recreate/default/250_verify_mount.sh
Source restore/Fedora/050_copy_dev_files.sh
Source restore/default/050_remount_async.sh
......
Source finalize/Fedora/s390/660_install_zipl.sh  <------- will look at
Source finalize/default/880_check_for_mount_by_id.sh
Source finalize/default/890_finish_checks.sh
Source finalize/default/900_remount_sync.sh
Source wrapup/default/500_post_recovery_script.sh
Source wrapup/default/980_good_bye.sh
Source wrapup/default/990_copy_logfile.sh
Exiting rear recover (PID 1465) and its descendant processes ...
Running exit tasks
You should also rm -Rf /tmp/rear.3GFZkfaiyaAeaBG

How disks are enabled

currently layout/prepare/default/205_s390_enable_disk.sh
reads from /var/lib/rear/layout/disklayout.conf

where:
   /usr/share/rear/layout/save/GNU/Linux/200_partition_layout.sh:363
      calls lsdasd and writes
         dasd_channel 0.0.0100 active dasda 94:0 --> /var/lib/rear/layout/disklayout.conf
   I know that this is not part of the disklayout.conf spec;
   one way to get around this (if we need to) is to have a separate device loop that writes to another file

   We can't use lsdasd -a to get all devices and then enable each of them,
   because they get enabled as dasda, dasdb, dasdc, ...
   and we don't know which channel belongs to which device

As mentioned, I don't see where diskrestore.sh is called, and if I do not enable the disk devices before 250_compare_disks.sh then recovery fails
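The enabling step described above can be sketched roughly like this (a simplified illustration, not the verbatim 205_s390_enable_disk.sh; a temp file stands in for /var/lib/rear/layout/disklayout.conf, and chccwdev is only echoed because it exists only on IBM Z):

```shell
# Simplified sketch of the idea behind 205_s390_enable_disk.sh:
# read the 'dasd_channel' entries that 200_partition_layout.sh wrote
# into disklayout.conf and enable each channel before the disks get
# compared and recreated.
disklayout=$(mktemp)
echo "dasd_channel 0.0.0100 active dasda 94:0" > "$disklayout"
enabled=""
while read -r keyword channel state name rest ; do
    test "$keyword" = "dasd_channel" || continue
    # on a real IBM Z system this would be: chccwdev -e "$channel"
    echo "would enable DASD channel $channel (kernel device $name)"
    enabled="$enabled $channel"
done < "$disklayout"
rm -f "$disklayout"
```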

jsmeix commented at 2019-07-03 08:04:

@mutable-dan

regarding the recovery system startup scripts in etc/scripts/system-setup.d/

To debug issues with the recovery system startup scripts, add debug
to the kernel command line when booting the ReaR recovery system.
This way the recovery system startup scripts are run with set -x,
and each one only after you confirm its execution with [Enter], so that
you can see each one's set -x output on the console.

Regarding where diskrestore.sh is called:

It is called by layout/recreate/default/200_run_layout_code.sh
that runs diskrestore.sh via source $LAYOUT_CODE
which happens in a user dialog while loop (excerpt):

# Run the disk layout recreation code (diskrestore.sh)
# again and again until it succeeds or the user aborts:
while true ; do
    # Run LAYOUT_CODE in a sub-shell because it sets 'set -e'
    # so that it exits the running shell in case of an error
    # but that exit must not exit this running bash here:
    ( source $LAYOUT_CODE )
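The reason for the sub-shell can be demonstrated in isolation (a sketch, not ReaR code): the sourced layout code enables 'set -e', so without the parentheses a failing command would terminate the calling shell instead of returning to the dialog loop.

```shell
# Demonstration (not ReaR code): sourcing 'set -e' code in a sub-shell
# confines the exit to that sub-shell, so the caller survives and can
# offer the user a retry.
layout_code=$(mktemp)
cat > "$layout_code" <<'EOF'
set -e
false                     # simulated failing recreation command
echo "never reached"
EOF
# run it as a standalone sub-shell command, then inspect its exit code
# (putting the sub-shell directly into an 'if' or '||' would make bash
# ignore 'set -e' inside it, defeating the demonstration):
( source "$layout_code" )
if test $? -ne 0 ; then
    result="layout code failed, parent shell still alive"
fi
echo "$result"
rm -f "$layout_code"
```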

jsmeix commented at 2019-07-03 08:17:

@mutable-dan
FYI what I personally basically always do to get an initial idea
what belongs to a certain "thingy" in all those ReaR scripts
is searching through all those ReaR scripts for it like

# find usr/sbin/rear usr/share/rear/ -type f | xargs grep 'diskrestore'

or when I need primarily "suspicious" ReaR script file names

# find usr/sbin/rear usr/share/rear/ -type f | xargs grep -li 'diskrestore'

The main problem is to know how a certain "thingy" is called in ReaR
e.g. diskrestore.sh or LAYOUT_CODE versus
disklayout.conf or LAYOUT_FILE or DISKLAYOUT_FILE
cf. https://github.com/rear/rear/issues/718

mutable-dan commented at 2019-07-17 20:27:

@jsmeix rhel 7 is in a working state:
able to build the recovery kernel and initramfs
send it to zvm
punch the kernel, initrd and params and ipl the recovery machine
run rear recover, test that the dasd drives are partitioned with the zipl boot loader
ipl the dasd bus-id to boot the recovered system

will look at sles while we put rhel through some more rigorous testing

mutable-dan commented at 2019-07-25 17:37:

@jsmeix will start testing sles15

jsmeix commented at 2019-08-02 09:50:

@mutable-dan
take your time - I am not in the office for some weeks.
I add @gdha as another reviewer so that this issue can be merged
as soon as things work for you and when the changes look o.k. for us.

mutable-dan commented at 2019-08-02 16:14:

> @mutable-dan
> take your time - I am not in the office for some weeks.
> I add @gdha as another reviewer so that this issue can be merged

sles15 is in dev
rhel7.x is out of dev testing and going to proper testing

our additional minimum list of testing
rhel 6.x, 8.x
sles 11-15

mutable-dan commented at 2019-08-08 21:45:

@jsmeix @gdha

i am seeing the following error on sles15

5641 Create subvolume '/mnt/local//@/boot/grub2/s390x-emu'
5642 ++++ btrfs subvolume list -a /mnt/local/
5643 ++++ sed -e 's/<FS_TREE>\///'
5644 ++++ grep ' @$'
5645 ++++ tr -s '[:blank:]' ' '
5646 ++++ cut -d ' ' -f 2
5647 +++ subvolumeID=257
5648 +++ btrfs subvolume set-default 257 /mnt/local/
5649 +++ umount /mnt/local/
5650 +++ test -x /usr/lib/snapper/installation-helper
5651 +++ LogPrintError '/usr/lib/snapper/installation-helper not executable may indicate an error with btrfs default subvolume setup for @/.snapshots/1/snapshot on /dev/dasda2'
5652 +++ Log '/usr/lib/snapper/installation-helper not executable may indicate an error with btrfs default subvolume setup for @/.snapshots/1/snapshot on /dev/dasda2'
5653 +++ echo '2019-08-08 16:08:33.436925452 /usr/lib/snapper/installation-helper not executable may indicate an error with btrfs default subvolume setup for @/.snapshots/1/snapsho
5654 2019-08-08 16:08:33.436925452 /usr/lib/snapper/installation-helper not executable may indicate an error with btrfs default subvolume setup for @/.snapshots/1/snapshot on /dev/
5655 +++ PrintError '/usr/lib/snapper/installation-helper not executable may indicate an error with btrfs default subvolume setup for @/.snapshots/1/snapshot on /dev/dasda2'
5656 +++ mount -t btrfs -o subvolid=0 -o rw,relatime,ssd,space_cache /dev/dasda2 /mnt/local/
5657 +++ umount /mnt/local/
5658 +++ mount -t btrfs -o rw,relatime,ssd,space_cache /dev/dasda2 /mnt/local/
5659 +++ mount -t btrfs
5660 +++ tr -s '[:blank:]' ' '
5661 +++ grep -q ' on /mnt/local/root '
5662 +++ test -d /mnt/local/root
5663 +++ mount -t btrfs -o rw,relatime,ssd,space_cache -o subvol=@/root /dev/dasda2 /mnt/local/root
5664 +++ mount -t btrfs
5665 +++ tr -s '[:blank:]' ' '
5666 +++ grep -q ' on /mnt/local/boot/grub2/s390x-emu '
5667 +++ test -d /mnt/local/boot/grub2/s390x-emu
5668 +++ mount -t btrfs -o rw,relatime,ssd,space_cache -o subvol=@/boot/grub2/s390x-emu /dev/dasda2 /mnt/local/boot/grub2/s390x-emu
5669 +++ mount -t btrfs
5670 +++ tr -s '[:blank:]' ' '
5671 +++ grep -q ' on /mnt/local/tmp '
5672 +++ test -d /mnt/local/tmp
5673 +++ mount -t btrfs -o rw,relatime,ssd,space_cache -o subvol=@/tmp /dev/dasda2 /mnt/local/tmp
5674 +++ mount -t btrfs
5675 +++ tr -s '[:blank:]' ' '
5676 +++ grep -q ' on /mnt/local/.snapshots '
5677 +++ test -d /mnt/local/.snapshots
5678 +++ mkdir -p /mnt/local/.snapshots
5679 +++ mount -t btrfs -o rw,relatime,ssd,space_cache -o subvol=@/.snapshots /dev/dasda2 /mnt/local/.snapshots
5680 mount: /mnt/local/.snapshots: wrong fs type, bad option, bad superblock on /dev/dasda2, missing codepage or helper program, or other error.

the mount:

mount -t btrfs -o rw,relatime,ssd,space_cache -o subvol=@/root /dev/dasda2 /mnt/local/root

is failing. i played around with the mount cmd...

lsblk -f
NAME     FSTYPE LABEL UUID                                 MOUNTPOINT
dasda                                                      
|-dasda1                                                   
|-dasda2 btrfs        2829d9d0-d58f-4aaf-a6a8-d1ed02b3a78c /mnt/local
`-dasda3

i am wondering if
/usr/lib/snapper/installation-helper
is needed

i saw this older issue: https://github.com/rear/rear/issues/944

I have just started to look at it but thought you might have some insight?

jsmeix commented at 2019-08-09 05:19:

For SLES 12 and 15 with its special default btrfs structure
you need a special ReaR config, see the SLE* example config files in
usr/share/rear/conf/examples in particular for SLES 15 use
https://github.com/rear/rear/blob/master/usr/share/rear/conf/examples/SLE12-SP2-btrfs-example.conf
as template.
Therein see the comments on how to add the right btrfs subvolumes to the backup.

Alternatively to simplify testing use a simpler SLES system without btrfs
e.g. with only a single ext4 filesystem, cf. the section
"First steps with Relax-and-Recover" therein in particular item 3. at
https://en.opensuse.org/SDB:Disaster_Recovery
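The gist of those example config comments is that btrfs subvolumes which would otherwise be skipped must be explicitly added to the backup. A minimal illustration of the pattern (the directory names below are placeholders, not real SLES subvolumes - take the actual list from SLE12-SP2-btrfs-example.conf):

```shell
# Illustration only: append normally-excluded btrfs subvolumes to the
# backup; the real entries must come from SLE12-SP2-btrfs-example.conf.
BACKUP_PROG_INCLUDE+=( '/some/btrfs/subvolume' '/another/subvolume' )
```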

mutable-dan commented at 2019-09-10 20:46:

@jsmeix

hi,
on rescue boot, i am seeing:

summary:

systemd: Running with unpopulated /etc. systemd: Failed to populate /etc with preset unit settings, ignoring: Invalid argument

Do you know what this means?

long:

[    3.561919] Loaded X.509 cert 'Red Hat Enterprise Linux kernel signing key: 0
26b6326e6920a99e57c526059c3007a11236fc8'
[    3.561936] registered taskstats version 1
[    3.562283] Freeing unused kernel memory: 676K (b81000 - c2a000)
[    3.567292] ip_tables: (C) 2000-2006 Netfilter Core Team
[    3.567307] systemd[1]: Inserted module 'ip_tables'
[    3.569119] random: systemd: uninitialized urandom read (16 bytes read)
[    3.569560] random: systemd: uninitialized urandom read (16 bytes read)
[    3.573099] systemd[1]: systemd 219 running in system mode. (+PAM +AUDIT +SEL
INUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +
XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
[    3.573286] systemd[1]: Detected virtualization zvm.
[    3.573293] systemd[1]: Detected architecture s390x.
[    3.573299] systemd[1]: Running with unpopulated /etc.

Welcome to Red Hat Enterprise Linux Server 7.6 (Maipo)!

[    3.573367] systemd[1]: Set hostname to <r76a1>.
[    3.573383] random: systemd: uninitialized urandom read (16 bytes read)
[    3.573387] systemd[1]: Initializing machine ID from random generator.
[    3.574060] random: systemd: uninitialized urandom read (16 bytes read)
[    3.574355] random: systemd: uninitialized urandom read (16 bytes read)
[    3.574889] random: systemd: uninitialized urandom read (16 bytes read)
[    3.575209] systemd[1]: Failed to populate /etc with preset unit settings, ig
noring: Invalid argument
[    3.577334] random: systemd: uninitialized urandom read (16 bytes read)
[    3.577351] random: systemd: uninitialized urandom read (16 bytes read)
[    3.577406] random: systemd: uninitialized urandom read (16 bytes read)
[    3.578011] random: systemd: uninitialized urandom read (16 bytes read)
[    3.580211] systemd[1]: Reached target System Initialization.
[  OK  ] Reached target System Initialization.
[    3.580494] systemd[1]: Created slice -.slice.
[  OK  ] Created slice -.slice.
[    3.580546] systemd[1]: Listening on Syslog Socket.
[  OK  ] Listening on Syslog Socket.
[    3.580580] systemd[1]: Listening on Delayed Shutdown Socket.
[  OK  ] Listening on Delayed Shutdown Socket.
[    3.580724] systemd[1]: Created slice system.slice.
[  OK  ] Created slice system.slice.
[    3.580810] systemd[1]: Listening on udev Control Socket.
[  OK  ] Listening on udev Control Socket.
[    3.580911] systemd[1]: Created slice system-getty.slice.
[  OK  ] Created slice system-getty.slice.
[    3.580980] systemd[1]: Listening on udev Kernel Socket.
[  OK  ] Listening on udev Kernel Socket.
[    3.581496] systemd[1]: Starting udev Kernel Device Manager...
         Starting udev Kernel Device Manager...
[    3.582078] systemd[1]: Starting udev Coldplug all Devices...
         Starting udev Coldplug all Devices...
[    3.582223] systemd[1]: Created slice system-serial\x2dgetty.slice.
[  OK  ] Created slice system-serial\x2dgetty.slice.
[    3.582879] systemd[1]: Starting Relax-and-Recover boot script...
         Starting Relax-and-Recover boot script...
[    3.583582] systemd[1]: Started Relax-and-Recover sshd service.
[  OK  ] Started Relax-and-Recover sshd service.
[    3.587758] systemd-udevd[49]: starting version 219
[    3.598385] systemd[1]: Listening on Logging Socket.
[  OK  ] Listening on Logging Socket.
[    3.598445] systemd[1]: Listening on D-Bus System Message Bus Socket.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[    3.598520] systemd[1]: Listening on Journal Socket.
[  OK  ] Listening on Journal Socket.
[    3.598548] systemd[1]: Reached target Sockets.
[  OK  ] Reached target Sockets.
[    3.599012] systemd[1]: Started Journal Service.
[  OK  ] Started Journal Service.
[    3.611874] systemd[1]: Started udev Coldplug all Devices.
[  OK  ] Started udev Coldplug all Devices.
[  OK  ] Started udev Kernel Device Manager.
[  OK  ] Reached target Basic System.
         Starting Initialize Rescue System...
[  OK  ] Started Relax-and-Recover boot script.
[  OK  ] Found device /dev/hvc0.
[  OK  ] Found device /dev/ttyS0.
[  OK  ] Found device /dev/ttysclp0.

Verifying md5sums of the files in the Relax-and-Recover rescue system

md5sums are OK


Configuring Relax-and-Recover rescue system

mutable-dan commented at 2019-09-10 20:47:

> For SLES 12 and 15 with its special default btrfs structure
> you need a special ReaR config, see the SLE* example config files in
> usr/share/rear/conf/examples in particular for SLES 15 use

note: this is working....
thx

mutable-dan commented at 2019-09-19 21:57:

@jsmeix
adding ed for 3270 terminal editing
currently working on getting the recovery kernel to see the network device on rhel 7.6

mutable-dan commented at 2019-09-19 22:01:

@jsmeix
adding the ed line editor for 3270 editing
good to keep this. currently using it while working on a networking issue with the recovery kernel on rhel 7.6. not sure about this one: the recovery system does not see any net devices

mutable-dan commented at 2019-09-24 23:06:

@jsmeix @gdha
Having a problem with rhel 7.6:

  • not loading kernel modules
  • not seeing qeth devices even after qeth, qeth_l2 (even l3) are loaded
/etc/scripts/system-setup.d/01-run-ldconfig.sh: FAILED  (both 7.2 & 7.6)
./etc/udev/udev.conf: FAILED
./root/rear/usr/share/rear/skel/default/etc/scripts/system-setup.d/01-run-ldcon??? FAILED

I put a set -x in 01-run-ldconfig.sh;
on boot it looks roughly the same up to 40-start-udev-or-load-modules.sh,
and 40-start-udev-or-load-modules.sh looks the same

we verified it's not the vm, any thoughts?

jsmeix commented at 2019-09-25 09:39:

@mutable-dan
I have been out of the office for some weeks and will be for some more,
so I cannot do much for ReaR.
I cannot try out or test anything for ReaR - in particular not on IBM Z.
I expect to be back in the office at about beginning of October.
But I also expect that I have to do first and foremost other stuff with higher priority.
Cf. https://github.com/rear/rear/issues/799#issuecomment-531247109

Regarding RHEL specific things
@rmetrich and @pcahyna
are our Red Hat experts.

Currently we do not have a dedicated IBM Z specialist.
Perhaps @schabrolles from IBM might be able to help
but he is actually an IBM POWER specialist who is
currently on vacation until the end of the month
and then busy with client on-site requests
cf. https://github.com/rear/rear/pull/2232#issuecomment-531280327

mutable-dan commented at 2019-11-05 21:13:

adding support for kernel and initrd override for s390

override kernel and initrd name if arch is s390 and ZVM_NAMING=Y is in local.conf

note: s390 kernel copy is only through nfs

s390 optional naming override of initrd and kernel to match the s390 file system naming conventions:
on s390 in ReaR, there is an option to name the initrd and kernel in the form of VMID kernel, VMID initrd

file names on s390 are in the form of name type mode; we are concerned with name and type
if the vm name (cp q userid) is HOSTA then the files written will be HOSTA kernel and HOSTA initrd

The reason for this option is that, on the flat file system of s390 with many VMs, there needs to be a way to organize the generated recovery files

vars needed:
ZVM_NAMING - set in local.conf, if Y then enable naming override
ZVM_KERNEL_NAME - keeps track of kernel name in results array (internal)
ARCH - override only if ARCH is Linux-s390 (internal)

initrd name override is handled in 900_create_initramfs.sh
kernel name override is handled in 400_guess_kernel.sh
and in 950_copy_result_files.sh
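A rough sketch of the naming idea (illustrative only; the actual logic is spread over the three scripts listed above, and both the sample 'cp q userid' output string and the exact spelling of the result file names are assumptions for this example):

```shell
# Illustrative sketch of the ZVM_NAMING override: derive the z/VM user
# ID (the VM name) and use it to name the kernel and initrd results,
# so that on the flat "name type mode" s390 file system a VM called
# HOSTA gets HOSTA kernel and HOSTA initrd.
ZVM_NAMING="Y"
userid_output="HOSTA    AT ZVMNODE"      # stand-in for: cp q userid
if test "$ZVM_NAMING" = "Y" ; then
    VM_UID=${userid_output%% *}          # first word is the VM name
    kernel_name="$VM_UID kernel"
    initrd_name="$VM_UID initrd"
fi
echo "$kernel_name / $initrd_name"
```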

jsmeix commented at 2019-11-06 09:27:

@mutable-dan
thank you for your continuous work on this issue!

Please tell me when things are in an at least basically usable state for you
because I would like to merge this pull request as soon as it is
sufficiently complete to provide a single useful piece of functionality.

There is no need to wait until all is done for full featured IBM Z support.

As soon as IBM Z support is usable at least in a very basic way
things should be merged into our ReaR GitHub master code
so that other IBM Z users could try it out and test it
which helps a lot to improve IBM Z support in ReaR.

mutable-dan commented at 2019-11-06 15:34:

@jsmeix

thank you.
looking at some disk issues with LDL formatted disks. I would like to wait until the end of this week before we discuss merging. CDL formatted disks work fine, but LDL does not.
SELinux kept us busy (recovery is a little different than on non-s390 systems); I will need to make some notes for the end users. Simple solution, but it wasn't obvious.

mutable-dan commented at 2019-11-18 18:26:

@jsmeix the LDL code is in testing, re-testing sles and then we can look to merge

jsmeix commented at 2019-11-19 08:18:

@mutable-dan
thank you for your current status notification!

mutable-dan commented at 2019-11-21 22:11:

@jsmeix when you are ready, we are ready to begin the merge

jsmeix commented at 2019-11-22 12:44:

@mutable-dan
thank you for the notification.
I will re-do the review next week (as time permits).

jsmeix commented at 2019-12-02 10:50:

I would like to merge it tomorrow
unless there are objections.

jsmeix commented at 2019-12-02 13:52:

@mutable-dan @gdha @rmetrich
I did a lot of (hopefully correct) adaptations here.
I would much appreciate it if you could review what I did.

mutable-dan commented at 2019-12-02 19:22:

@jsmeix

all of your changes look good. I moved 305_include_s390_tools.sh to the Linux-s390 dir and am testing it.

will remove commented conditional once tests look ok

jsmeix commented at 2019-12-03 08:42:

@mutable-dan
thank you for testing it!
I will wait with merging it until you give me an OK
that you have finished your testing and things work for you.

There is another script where I think that it could be moved
into an architecture specific directory:

From
usr/share/rear/layout/prepare/default/205_s390_enable_disk.sh
to
usr/share/rear/layout/prepare/Linux-s390/205_s390_enable_disk.sh

See
https://github.com/rear/rear/blob/fb02447031edafb68685a93bc0ac9bdc911ab625/usr/share/rear/layout/prepare/default/205_s390_enable_disk.sh#L7

That script has also several FIXME: comments from me
because I do not understand the intent behind how
the echo commands are meant to work in that script.

mutable-dan commented at 2019-12-10 17:51:

@jsmeix
i do not see any outstanding issues, did i address all?
thx

jsmeix commented at 2019-12-13 10:14:

@rmetrich
you had

requested changes on behalf of rear/contributors on May 13

https://github.com/rear/rear/pull/2142#pullrequestreview-236492905
which is meanwhile marked as resolved.

But the change request from your review at that time is still there
so I would like to ask you to have one more look here
and dismiss or re-confirm your change request.

mutable-dan commented at 2019-12-13 16:40:

> Now all looks OK to me from plain looking at the code
> so I would like to merge it next Monday
> unless there are objections.

no objections, please move forward with the merge on monday.
thank you

jsmeix commented at 2019-12-16 14:04:

@mutable-dan
thank you for all your continuous work
that you contributed to ReaR here!

mutable-dan commented at 2019-12-16 15:53:

@jsmeix
thank you, glad to help. we will continue to help support, bug-fix and test releases. glad to have the pull request completed, and thank you for your help as well.

we should make a wiki section on s390 (which i would be happy to contribute to) to give people tips and support for trying out zLinux, so that others do not repeat our mistakes and hurdles.

for example:
on rhel with selinux, it was necessary on booting the recovery to put selinux=0 on the kernel cmdline.... ie in the zvm param file.

and
3270 can be a challenge for those used to vt220
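The SELinux tip above can be sketched like this (the file name rear.parm and the other kernel parameters are made up for the example; the point is only that the z/VM parameter file carries the kernel command line, so selinux=0 belongs there):

```shell
# Illustrative sketch: the z/VM PARM file is plain text containing the
# kernel command line for the IPLed recovery system; appending
# selinux=0 disables SELinux during recovery. The file name and the
# other parameters here are examples, not ReaR defaults.
cat > rear.parm <<'EOF'
root=/dev/ram0 ro selinux=0
EOF
grep -q 'selinux=0' rear.parm && echo "recovery system will boot with SELinux disabled"
```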

jsmeix commented at 2019-12-16 16:59:

@gdha
is it possible to have a wiki section on "IBM Z" (a.k.a. "s390")
where @mutable-dan could contribute to?

@mutable-dan
don't worry about 3270 terminal,
even I managed to use it (for some very limited meaning of "use")
and whoever fails with the 3270 terminal is not old enough to use IBM Z ;-)

jsmeix commented at 2019-12-19 16:43:

@mutable-dan @gdha
I wonder if a wiki section on "IBM Z" is actually the right place
because the wiki is outside of what is provided in ReaR itself.

I wonder if it is perhaps better to provide documentation about
"ReaR on IBM Z (s390)" in the same way as other ReaR documentation
https://github.com/rear/rear/tree/master/doc/user-guide
inside ReaR e.g. as doc/user-guide/17-ReaR-on-IBM-Z-s390.adoc

mutable-dan commented at 2019-12-19 21:11:

@jsmeix @gdha
having 17-rear....

makes sense, as it is a guide to s390 zlinux and not rear

mutable-dan commented at 2020-01-02 16:05:

you mentioned a document section
https://github.com/rear/rear/tree/master/doc/user-guide above, i also see there is a zlinux section in the wiki. i am thinking that for the helper info discussed, you would want it in the wiki?

would you want a branch made to handle this, to be merged back on approval?

thx

jsmeix commented at 2020-01-08 12:13:

@gdha
could you describe us what
https://github.com/rear/rear/wiki/zLinux
is intended for
and where documentation about "ReaR on IBM Z (s390)"
should be usually provided?

wasifmoh commented at 2020-08-14 10:10:

Hi,
I was trying to test ReaR 2.6 on the s390x architecture, but the RPM build fails. Can you please help me?

[root@test9 rear]# ls
COPYING  MAINTAINERS  Makefile  README.adoc  build  dist  doc  etc  packaging  tests  usr
[root@test9 rear]# make rpm 
rm -Rf dist build
rm -f build-stamp
make -C doc clean
make[1]: Entering directory `/data/rear/doc'
rm -f unconv.8 *.html *.xml
make -C user-guide clean
make[2]: Entering directory `/data/rear/doc/user-guide'
rm -f *.html *.svg *.xml
make[2]: Leaving directory `/data/rear/doc/user-guide'
make[1]: Leaving directory `/data/rear/doc'
== Validating scripts and configuration ==
find etc/ usr/share/rear/conf/ -name '*.conf' | xargs -n 1 bash -n
bash -n usr/sbin/rear
find . -name '*.sh' | xargs -n 1 bash -O extglob -O nullglob -n
find usr/share/rear -name '*.sh' | grep -v -E '(lib|skel|conf)' | while read FILE ; do \
    num=$(echo ${FILE##*/} | cut -c1-3); \
    if [[ "$num" = "000" || "$num" = "999" ]] ; then \
        echo "ERROR: script $FILE may not start with $num"; \
        exit 1; \
    else \
        if $( grep '[_[:alpha:]]' <<< $num >/dev/null 2>&1 ) ; then \
            echo "ERROR: script $FILE must start with 3 digits"; \
            exit 1; \
        fi; \
    fi; \
done
== Prepare manual ==
make -C doc man
make[1]: Entering directory `/data/rear/doc'
make[1]: Nothing to be done for `man'.
make[1]: Leaving directory `/data/rear/doc'
== Building archive rear-2.6-git.4110.df5e18b.master ==
rm -Rf build/rear-2.6-git.4110.df5e18b.master
mkdir -p dist build/rear-2.6-git.4110.df5e18b.master
tar -c --exclude-from=.gitignore --exclude=.gitignore --exclude=".??*" * | \
    tar -C build/rear-2.6-git.4110.df5e18b.master -x
== Rewriting packaging/rpm/rear.spec, packaging/debian/rear.dsc and usr/sbin/rear ==
sed -i.orig \
    -e 's#^Source:.*#Source: https://sourceforge.net/projects/rear/files/rear/2.6/rear-2.6-git.4110.df5e18b.master.tar.gz#' \
    -e 's#^Version:.*#Version: 2.6#' \
    -e 's#^%define rpmrelease.*#%define rpmrelease .git.4110.df5e18b.master#' \
    -e 's#^%setup.*#%setup -q -n rear-2.6-git.4110.df5e18b.master#' \
    build/rear-2.6-git.4110.df5e18b.master/packaging/rpm/rear.spec
sed -i.orig \
    -e 's#^Version:.*#Version: 2.6-0git.4110.df5e18b.master#' \
    build/rear-2.6-git.4110.df5e18b.master/packaging/debian/rear.dsc
sed -i.orig \
    -e 's#^readonly VERSION=.*#readonly VERSION=2.6-git.4110.df5e18b.master#' \
    -e 's#^readonly RELEASE_DATE=.*#readonly RELEASE_DATE="2020-08-10"#' \
    build/rear-2.6-git.4110.df5e18b.master/usr/sbin/rear
tar -czf dist/rear-2.6-git.4110.df5e18b.master.tar.gz -C build rear-2.6-git.4110.df5e18b.master
== Building SRPM package rear-2.6-git.4110.df5e18b.master ==
rpmbuild -ts --clean --nodeps \
    --define="_topdir /data/rear/build/rpmbuild" \
    --define="_sourcedir /data/rear/dist" \
    --define="_srcrpmdir /data/rear/dist" \
    --define "debug_package %{nil}" \
    dist/rear-2.6-git.4110.df5e18b.master.tar.gz
Wrote: /data/rear/dist/rear-2.6-1.git.4110.df5e18b.master.el7.src.rpm
Executing(--clean): /bin/sh -e /var/tmp/rpm-tmp.SaDiwM
+ umask 022
+ cd /data/rear/build/rpmbuild/BUILD
+ rm -rf rear-2.6-git.4110.df5e18b.master
+ exit 0
== Building RPM package rear-2.6-git.4110.df5e18b.master ==
rpmbuild --rebuild --clean \
    --define="_topdir /data/rear/build/rpmbuild" \
    --define="_rpmdir /data/rear/dist" \
    --define "_rpmfilename %%{NAME}-%%{VERSION}-%%{RELEASE}.%%{ARCH}.rpm" \
    --define "debug_package %{nil}" \
    dist/rear-2.6-1*.src.rpm
Installing dist/rear-2.6-1.git.4110.df5e18b.master.el7.src.rpm
error: Architecture is not included: s390x
make: *** [rpm] Error 1
[root@test9 rear]#
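A possible cause of the failure (an assumption - inspect the spec file first) is that packaging/rpm/rear.spec restricts builds via an ExclusiveArch tag that does not list s390x; if so, appending the architecture would let rpmbuild proceed. Sketched here on a sample file rather than the real spec:

```shell
# Hypothetical workaround sketch: append s390x to an ExclusiveArch tag.
# A sample spec stands in for packaging/rpm/rear.spec (assumption: the
# real spec restricts architectures this way; verify before editing).
spec=rear.spec.sample
printf 'Name: rear\nExclusiveArch: %%ix86 x86_64 ppc64le\n' > "$spec"
sed -i -e '/^ExclusiveArch:/ s/$/ s390x/' "$spec"
grep '^ExclusiveArch:' "$spec"
```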

wasifmoh commented at 2020-08-14 10:12:

[root@test9 rear]# uname -a 
Linux test9 3.10.0-957.el7.s390x #1 SMP Thu Oct 4 16:53:20 EDT 2018 s390x s390x s390x GNU/Linux
[root@test9 rear]# lscpu
Architecture:          s390x
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Big Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s) per book:    1
Book(s) per drawer:    1
Drawer(s):             2
Vendor ID:             IBM/S390
Machine type:          3906
BogoMIPS:              21881.00
Hypervisor:            z/VM 7.1.0
Hypervisor vendor:     IBM
Virtualization type:   full
Dispatching mode:      horizontal
L1d cache:             128K
L1i cache:             128K
L2d cache:             4096K
L2i cache:             2048K
L3 cache:              131072K
L4 cache:              688128K
Flags:                 esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te vx vxd vxe gs sie
[root@test9 rear]#

jsmeix commented at 2020-08-14 12:30:

@wasifmoh
instead of trying to make an RPM locally
(I think you do not need to have ReaR as an RPM package to test it)
I would recommend
"Testing current ReaR upstream GitHub master code"
as described in
https://en.opensuse.org/SDB:Disaster_Recovery

If you have issues with booting the ReaR recovery system
on your replacement IBM Z virtual machine you may have a look at
"Launching the ReaR recovery system via kexec" in
https://en.opensuse.org/SDB:Disaster_Recovery
as a workaround.

See also the part about
"Initial preliminary first basic support for IBM Z architecture" in
https://github.com/rear/rear/blob/master/doc/rear-release-notes.txt#L202

Many thanks in advance for testing ReaR on IBM Z architecture!


[Export of Github issue for rear/rear.]