#1026 Issue closed: REAR issues on SGI UV300/HPE MC990X

Labels: support / question, fixed / solved / done, won't fix / can't fix / obsolete

zensonic opened issue at 2016-10-05 08:55:

Relax-and-Recover (rear) Issue Template

Please fill in the following items before submitting a new issue:

  • rear version (/usr/sbin/rear -V):
    Relax-and-Recover 1.18 / Git

  • OS version (cat /etc/rear/os.conf or lsb_release -a):
    LSB Version: n/a
    Distributor ID: SUSE LINUX
    Description: SUSE Linux Enterprise Server for SAP Applications 12 SP1
    Release: 12.1.0.1
    Codename: n/a

  • rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):

    [root@nfsserver etc]# more rear/local.conf
    OUTPUT=ISO
    ONLY_INCLUDE_VG=(vgsystem)

    Exclude specific folders

    This means that if files are needed below specific forlders, they need to be r

    estored from NetBackup BACKUP_PROG_EXCLUDE=( "${BACKUP_PROG_EXCLUDE[@]}" '/gpfsFS1/' '/var/crash/' '/dev/shm/' '/home/BIA/' '/home/bia_/' '/oracle/' '/opt/oracle/' '/sapmnt/' '/var/pg-backup/' ) ISO_MKISOFS_BIN=/usr/bin/ebiso BACKUP=NETFS NETFS_KEEP_OLD_BACKUP_COPY=1 BACKUP_URL=nfs://nfsserver/export/sysdata/rear

  • Brief description of the issue
    Multiple issues:

Backup:
(1.) To make it to produce a rear at all, I had to edit /usr/share/rear/lib/uefi-functions.sh and edit efiboot_img_size to avoid "no space" issue

(2.) That produced a rear image, but with warning/errors in the log around 20_partition_layout.sh

2016-10-04 14:52:37 Including layout/save/GNU/Linux/20_partition_layout.sh
2016-10-04 14:52:37 Saving disk partitions.
Error: The backup GPT table is not at the end of the disk, as it should be.  This might mean that another operating system believes the disk is smaller.  Fix, by moving the backup to the end (and r
emoving the old backup)?
Warning: Not all of the space available to /dev/sda appears to be used, you can fix the GPT to use all of the space (an extra 6320 blocks) or continue with the current setting?

(3.) Errors in the backup log around 21_raid_layout.sh

2016-10-04 14:53:15 Including layout/save/GNU/Linux/21_raid_layout.sh
2016-10-04 14:53:15 Saving Software RAID configuration.
/usr/share/rear/layout/save/GNU/Linux/21_raid_layout.sh: line 44: let: sparedevices=2-: syntax error: operand expected (error token is "-")
/usr/share/rear/layout/save/GNU/Linux/21_raid_layout.sh: line 65: [: : integer expression expected

Restoring

(4.) We had to choose secure-boot from the produced iso. Booting into non-secure didn't work. Issues around finding kernel and initrd.

(5.) While restoring it tried to create both /dev/md126 and /dev/md127 while the system only has a single semi-hw assisted disk (called /dev/md126). So an ekstra mdadm create stanza, that should not be there, without any working raiddevices=... statement. Probably related to the issue in point 2.

When we edited the restore script we could recreate the /dev/md126 and restore the system, but the system could not boot

(6.) system restored with MBR partition while the original had GPT.

  • Work-around, if any

See above.

I have access to these UV300/MC990X for some time. I can't do a restore test on the system available now, but we have a timeframe, where I am able to help fixing the issues in 1...3 and then, when I get the next UV300/MC990X inhouse I can try a full restore test again. The one I tested on yesterday has been handed over to our sapbasis guys due to timelines in the project.

BR Thomas

jsmeix commented at 2016-10-05 10:14:

I do not have such hardware.
Therefore I cannot reproduce any of your issues.

At least provide your disklayout.conf file so that
we might have a chance to see what goes on
on your particular system.

Furthermore provide your changes in the rear scripts
so that we might have a chance to imagine what
could go wrong on your particular system.

In general see
"Debugging issues with Relax-and-Recover" in
https://en.opensuse.org/SDB:Disaster_Recovery

When you use SUSE Linux Enterprise Server:
Don't you have an official support contract?

jsmeix commented at 2016-10-05 10:26:

There is no string
"The backup GPT table is not at the end of the disk"
in the rear scripts so that this error comes from
a program that is called by rear.

On my openSUSE Leap 42.1 system that message is in
/usr/lib64/libparted.so.2.0.0

When parted exits with an error I assume there is a valid
reason for it.

@zensonic
inspect the rear log file (run it with '-d -D') perhaps there are
issues before your parted error that indicate what the root cause
is of your particular parted error.

zensonic commented at 2016-10-05 21:05:

Hi

Thanks for picking up on this. A couple of points to the general case

  1. I know that this hardware is exotic (to make an understatement ;-)), but lets see if we can get around supporting it anyow
  2. Support. Well, in a couple of days I will recieve the licenses. Lost in transit between HPE and my company at the moment. Stuff happens I suppose

And now back to the technical bits

I have installed rear-1.18-3.x86_64.rpm and ebiso-0.2.3.1-1.x86_64.rpm from

http://download.opensuse.org/repositories/Archiving:/Backup:/Rear/SLE_12_SP1/x86_64/

Since I created the ticket we have reinstalled the system but disabled secureboot in SLES when installing. That actually fixed the need for altering the initrd size

A sample run from today (2016-10-05) where SLES12SP1 has been installed without secure EFI boot

hana01:~ # rear -d -D -v mkbackup
Relax-and-Recover 1.18 / Git
Using log file: /var/log/rear/rear-dchana01.log
mkdir: created directory '/var/lib/rear/output'
Using UEFI Boot Loader for Linux (USING_UEFI_BOOTLOADER=1)
Creating disk layout
Excluding Volume Group vgHWZdata
Excluding Volume Group vgHWDshared
Excluding Volume Group vgHWTlog
Excluding Volume Group vgHWTshared
Excluding Volume Group vgHWZlog
Excluding Volume Group vgHWDlog
Excluding Volume Group vgHWZshared
Excluding Volume Group vgsap
Excluding Volume Group vgHWTdata
Excluding Volume Group vgHWDdata
Creating root filesystem layout
TIP: To login as root via ssh you need to set up /root/.ssh/authorized_keys or SSH_ROOT_PASSWORD in your configuration file
Copying files and directories
Copying binaries and libraries
Copying kernel modules
Creating initramfs
Making ISO image
Wrote ISO image: /var/lib/rear/output/rear-hana01.iso (162M)
Copying resulting files to nfs location
Encrypting disabled
Creating tar archive '/tmp/rear.RHYcPa7XlIx7eZu/outputfs/dchana01/backup.tar.gz'
Archived 1475 MiB [avg 7266 KiB/sec]OK
Archived 1475 MiB in 209 seconds [avg 7231 KiB/sec]
You should also rm -Rf /tmp/rear.RHYcPa7XlIx7eZu

Looking into the GPT issue

+++ [[ sda =~ ^mapper/ ]]
++++ readlink /dev/sda
+++ my_dm=
+++ name=sda
+++ echo /dev/sda
+++ return 1
++ devname=/dev/sda
+++ get_disk_size sda
+++ local disk_name=sda
++++ get_block_size sda
++++ '[' -r /sys/block/sda/queue/logical_block_size ']'
++++ echo 512
+++ local block_size=512
+++ '[' -r /sys/block/sda/size ']'
+++ BugIfError 'Could not determine size of disk sda, please file a bug.'
+++ ((  0 != 0  ))
+++ local nr_blocks=781422768
+++ local disk_size=400088457216
+++ echo 400088457216
++ devsize=400088457216
+++ parted -s /dev/sda print
+++ grep -E 'Partition Table|Disk label'
+++ cut -d : -f 2
+++ tr -d ' '
Error: The backup GPT table is not at the end of the disk, as it should be.  This might mean that another operating system believes the disk is smaller.  Fix, by moving the backup to the end (and removing the old backup)?
Warning: Not all of the space available to /dev/sda appears to be used, you can fix the GPT to use all of the space (an extra 176 blocks) or continue with the current setting?
++ disktype=gpt
++ echo '# Disk /dev/sda'
++ echo '# Format: disk   '
++ echo 'disk /dev/sda 400088457216 gpt'
++ echo '# Partitions on /dev/sda'
++ echo '# Format: part      /dev/'
++ extract_partitions /dev/sda
++ declare device=/dev/sda
+++ get_sysfs_name /dev/sda
+++ local name=sda
+++ name=sda
+++ [[ -e /sys/block/sda ]]
+++ echo sda
+++ return 0
++ declare sysfs_name=sda
++ sysfs_paths=(/sys/block/$sysfs_name/$sysfs_name*)
++ declare -a sysfs_paths
++ declare path sysfs_path
++ [[ 0 -eq 0 ]]
++ [[ /dev/sda != /dev/sda ]]
++ :
++ declare partition_name partition_prefix start_block
++ declare partition_nr size start
++ sort -n /tmp/rear.xNLRCa9150gPXfx/tmp/partitions_unsorted
++ [[ ! -s /tmp/rear.xNLRCa9150gPXfx/tmp/partitions ]]
++ Debug 'No partitions found on /dev/sda.'
++ test -n 1
++ Log 'No partitions found on /dev/sda.'
++ test 1 -gt 0
+++ Stamp
+++ date '+%Y-%m-%d %H:%M:%S.%N '
++ echo '2016-10-05 21:47:30.269266798 No partitions found on /dev/sda.'
2016-10-05 21:47:30.269266798 No partitions found on /dev/sda.
++ return
dchana01:/var/lib/rear/layout # grep -v ^# disklayout.conf
disk /dev/sda 400088457216 gpt
disk /dev/sdb 400088457216 gpt
raid /dev/md127 metadata=0.90 level=raid1 raid-devices=2 uuid=baad6a94:8c86c5c3:720bbfe8:1cc4b06b name=126_0 devices=/dev/sda,/dev/sdb
part /dev/md127 163577856 1048576 primary none /dev/md127p1
part /dev/md127 399922692096 164626432 primary lvm /dev/md127p2
lvmdev /dev/vgsystem /dev/md127p2 XVxaeZ-sreI-bwv6-3duX-7epu-1s3N-Goo5h9 781099008
lvmgrp /dev/vgsystem 4096 95348 390545408
lvmvol /dev/vgsystem lvaudit 512 4194304
lvmvol /dev/vgsystem lvhome 1024 8388608
lvmvol /dev/vgsystem lvopenv 2048 16777216
lvmvol /dev/vgsystem lvopt 1024 8388608
lvmvol /dev/vgsystem lvroot 2048 16777216
lvmvol /dev/vgsystem lvswap 2048 16777216
lvmvol /dev/vgsystem lvtmp 1024 8388608
lvmvol /dev/vgsystem lvusr 2048 16777216
lvmvol /dev/vgsystem lvvar 1024 8388608
fs /dev/mapper/vgsystem-lvaudit /var/log/audit xfs uuid=b96f9372-ecd1-4003-9370-629bea53de89 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvhome /home xfs uuid=33cb75f8-d2b1-4189-a8b2-6c85de75d368 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvopenv /usr/openv xfs uuid=b908b2e8-24c7-4ed9-a457-2600d3cc1896 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvopt /opt xfs uuid=fb68b3a3-8bf7-40d5-be99-7f24644a500f label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvroot / xfs uuid=a618f9f3-ac44-40d0-b87d-c32e18cdfce4 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvtmp /tmp xfs uuid=4b03ac00-67dc-4329-bb57-2033a206464c label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvusr /usr xfs uuid=9e4e8a25-50eb-40ef-b1f1-9d24a126d572 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvvar /var xfs uuid=38386ab0-9f20-455b-a5f7-966c737cc496 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/md127p1 /boot/efi vfat uuid=CC7C-CED7 label= options=rw,relatime,fmask=0002,dmask=0002,allow_utime=0020,codepage=437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro
swap /dev/mapper/vgsystem-lvswap uuid=a0cb31bc-81d2-4cff-9247-94416f5a1190 label=

Run from 2016-10-04 (where SLES12SP1 was installed using secure UEFI boot)

[root@nfs hana01]# tar xvzf backup.tar.gz var/lib/rear
var/lib/rear/
var/lib/rear/output/
var/lib/rear/recovery/
var/lib/rear/recovery/bootdisk
var/lib/rear/recovery/mountpoint_permissions
var/lib/rear/recovery/bootloader
var/lib/rear/recovery/diskbyid_mappings
var/lib/rear/recovery/mountpoint_device
var/lib/rear/recovery/initrd_modules
var/lib/rear/recovery/storage_drivers
var/lib/rear/recovery/if_inet6
var/lib/rear/recovery/uefi-variables
var/lib/rear/layout/
var/lib/rear/layout/config/
var/lib/rear/layout/config/df.txt
var/lib/rear/layout/config/files.md5sum
var/lib/rear/layout/disklayout.conf
var/lib/rear/layout/diskdeps.conf
var/lib/rear/layout/disktodo.conf
var/lib/rear/layout/lvm/
var/lib/rear/layout/lvm/vgsystem.cfg
var/lib/rear/layout/lvm/vgHWZdata.cfg
var/lib/rear/layout/lvm/vgHWTlog.cfg
var/lib/rear/layout/lvm/vgHWDshared.cfg
var/lib/rear/layout/lvm/vgHWTshared.cfg
var/lib/rear/layout/lvm/vgHWDlog.cfg
var/lib/rear/layout/lvm/vgHWZshared.cfg
var/lib/rear/layout/lvm/vgsap.cfg
var/lib/rear/layout/lvm/vgHWTdata.cfg
var/lib/rear/layout/lvm/vgHWDdata.cfg
var/lib/rear/sysreqs/
var/lib/rear/sysreqs/Minimal_System_Requirements.txt
[root@rigel dchana01]# grep -v ^# var/lib/rear/layout/disklayout.conf
disk /dev/sda 400088457216 gpt
disk /dev/sdb 400088457216 gpt
raid /dev/md127 metadata=imsm level=container raid-devices= uuid=5fd887e0:5890fccf:098c48b8:fb9f7111 name=imsm0 devices=/dev/sdb,/dev/sda
raid /dev/md126 metadata= level=raid1 raid-devices=2 uuid=baad6a94:8c86c5c3:720bbfe8:1cc4b06b name=Volume0 devices=/dev/sda,/dev/sdb
part /dev/md126 163577856 1048576 primary boot /dev/md126p1
part /dev/md126 399919546368 164626432 primary lvm /dev/md126p2
lvmdev /dev/vgsystem /dev/md126p2 Pn540W-WThX-d65s-12cI-zHps-D7YY-jtJmSY 781092864
lvmgrp /dev/vgsystem 4096 95348 390545408
lvmvol /dev/vgsystem lvaudit 512 4194304
lvmvol /dev/vgsystem lvhome 1024 8388608
lvmvol /dev/vgsystem lvopenv 2048 16777216
lvmvol /dev/vgsystem lvopt 1024 8388608
lvmvol /dev/vgsystem lvroot 2048 16777216
lvmvol /dev/vgsystem lvswap 2048 16777216
lvmvol /dev/vgsystem lvtmp 1024 8388608
lvmvol /dev/vgsystem lvusr 2048 16777216
lvmvol /dev/vgsystem lvvar 1024 8388608
fs /dev/mapper/vgsystem-lvaudit /var/log/audit xfs uuid=67a13259-ece6-4b71-8f28-47b6c75f2d3c label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvhome /home xfs uuid=2b6e8a4d-3f7a-4630-a4ed-499eb5532c25 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvopenv /usr/openv xfs uuid=9eace0aa-ef0d-48c8-a00a-ab38235b504b label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvopt /opt xfs uuid=86bd1006-9bb8-4b74-89db-6651607d3d9c label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvroot / xfs uuid=f0580a4b-4eb8-4364-801c-86c35b47eb56 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvtmp /tmp xfs uuid=f09b2773-8846-4987-bcb8-1e747b7f74a7 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvusr /usr xfs uuid=6a20d967-1c0e-4f41-acfd-e489778f7880 label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/mapper/vgsystem-lvvar /var xfs uuid=e7c55276-3b3c-45d2-954b-6c7ab459207f label=  options=rw,relatime,attr2,inode64,noquota
fs /dev/md126p1 /boot/efi vfat uuid=0953-0319 label= options=rw,relatime,fmask=0002,dmask=0002,allow_utime=0020,codepage=437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro
swap /dev/mapper/vgsystem-lvswap uuid=54ce1f11-01da-4516-8345-f13133701ff1 label=

I can't explain why the mdraid works today but not yesterday (before reinstallation).

I consider leaving it as is now and then restoring the image onto the next UV300 I get (2-3 weeks from now) and then see if it works of if I get an MBR partition again.

Thomas

jsmeix commented at 2016-10-06 08:29:

The parted error message come from the command

parted -s /dev/sda print

What is the output of that command on your system?

What is on your system the output of

parted -s /dev/sda unit B print

jsmeix commented at 2016-10-14 11:53:

Leaving it as is now, see
https://github.com/rear/rear/issues/1026#issuecomment-251799468

If it appears again it can be reopened or (if it is a bit different)
a new issue can be submitted.


[Export of Github issue for rear/rear.]