#2616 Issue closed
: diskrestore.sh failed with "mdadm: layout -unknown- not understood for raid0"¶
Labels: bug
, fixed / solved / done
cvijayvinoth opened issue at 2021-05-18 05:43:¶
-
ReaR version ("/usr/sbin/rear -V"):
Relax-and-Recover 2.6 / Git -
OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
- ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
OUTPUT=ISO
BACKUP=RSYNC
RSYNC_PREFIX="diskimage_${HOSTNAME}"
BACKUP_PROG="/var/www/html/imageBackup/rsync"
OUTPUT_URL=rsync://diskimage@192.168.1.123::rsync_backup
BACKUP_URL=rsync://diskimage@192.168.1.123::rsync_backup
BACKUP_RSYNC_OPTIONS+=(-z --progress --password-file=/var/www/html/imageBackup/user_profile/diskimage/rsync_pass)
BACKUP_PROG_EXCLUDE+=( "$(</etc/rear/path.txt)/imageBackup/iso/*" "$(</etc/rear/path.txt)/imageBackup/user_profile/*" "$(</etc/rear/path.txt)/imageBackup/data/rsync_pass" )
ISO_DIR="/var/www/html/imageBackup/iso/$HOSTNAME"
MESSAGE_PREFIX="$$: "
PROGRESS_MODE="plain"
AUTOEXCLUDE_PATH=( /tmp )
PROGRESS_WAIT_SECONDS="1"
export TMPDIR="/var/www/html/imageBackup/iso/"
PXE_RECOVER_MODE=automatic
ISO_FILES=("/var/www/html/imageBackup/rsync")
ISO_PREFIX="${HOSTNAME}"
ISO_DEFAULT="automatic"
-
Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
virtual machine -
System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
x86 -
Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
BIOS and GRUB -
Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
local disk -
Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT" or "lsblk" as makeshift):
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 55M 1 loop /snap/core18/1880
loop1 7:1 0 29.9M 1 loop /snap/snapd/8542
loop2 7:2 0 71.3M 1 loop /snap/lxd/16099
sda 8:0 0 500G 0 disk
├─sda1 8:1 0 1M 0 part
└─sda2 8:2 0 500G 0 part
└─md0 9:0 0 999.8G 0 raid0
├─md0p1 259:0 0 300G 0 part /
├─md0p2 259:1 0 50G 0 part /boot
├─md0p3 259:2 0 100G 0 part /home
└─md0p4 259:3 0 15G 0 part [SWAP]
sdb 8:16 0 500G 0 disk
├─sdb1 8:17 0 1M 0 part
└─sdb2 8:18 0 500G 0 part
└─md0 9:0 0 999.8G 0 raid0
├─md0p1 259:0 0 300G 0 part /
├─md0p2 259:1 0 50G 0 part /boot
├─md0p3 259:2 0 100G 0 part /home
└─md0p4 259:3 0 15G 0 part [SWAP]
-
Description of the issue (ideally so that others can reproduce it):
used RAID 0. -
Workaround, if any:
-
Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files): rear -D recover
rear-vijayyyyyyyyyyy1.log
jsmeix commented at 2021-05-18 12:05:¶
@cvijayvinoth
your
https://github.com/rear/rear/files/6498661/rear-vijayyyyyyyyyyy1.log
is no '-D' debugscript log so we cannot see what actually goes on.
It contains (excerpt):
2022: 2021-05-18 05:36:17.435825730 Start system layout restoration.
2022: 2021-05-18 05:36:17.516015546 Stop mdadm
2022: 2021-05-18 05:36:17.521266345 Erasing MBR of disk /dev/sda
2022: 2021-05-18 05:36:17.530102618 Disk '/dev/sda': creating 'gpt' partition table
2022: 2021-05-18 05:36:17.837457944 Disk '/dev/sda': creating partition number 1 with name ''sda1''
2022: 2021-05-18 05:36:18.365539190 Disk '/dev/sda': creating partition number 2 with name ''sda2''
2022: 2021-05-18 05:36:20.268499349 Stop mdadm
2022: 2021-05-18 05:36:20.272854214 Erasing MBR of disk /dev/sdb
2022: 2021-05-18 05:36:20.282599710 Disk '/dev/sdb': creating 'gpt' partition table
2022: 2021-05-18 05:36:20.525363389 Disk '/dev/sdb': creating partition number 1 with name ''sdb1''
2022: 2021-05-18 05:36:20.957689644 Disk '/dev/sdb': creating partition number 2 with name ''sdb2''
2022: 2021-05-18 05:36:22.956187543 Creating software RAID /dev/md0
2022: 2021-05-18 05:36:22.961024143 UserInput: called in /usr/share/rear/layout/recreate/default/200_run_layout_code.sh line 127
2022: 2021-05-18 05:36:22.964361086 UserInput: Default input in choices - using choice number 1 as default input
2022: 2021-05-18 05:36:22.965551677 The disk layout recreation script failed
but that tells nothing what actually failed and nothing at all why it failed.
See the part
"To analyze and debug a "rear recover" failure the following information
is mandatory" in
https://en.opensuse.org/SDB:Disaster_Recovery#Debugging_issues_with_Relax-and-Recover
for a complete list of what information we may need to analyze a "rear
recover" failure.
cvijayvinoth commented at 2021-05-18 14:18:¶
rear-vijayyyyyyyyyyy1.log
@jsmeix : Here I uploaded the correct log for your reference .
jsmeix commented at 2021-05-18 14:25:¶
OK - now we can see the actually failing command in
https://github.com/rear/rear/files/6501998/rear-vijayyyyyyyyyyy1.log
+++ Print 'Creating software RAID /dev/md0'
+++ test -b /dev/md0
+++ mdadm --create /dev/md0 --force --metadata=1.2 --level=raid0 --raid-devices=2 --uuid=65bb9239:f0273dfd:7fc22ff5:1319e0fb --layout=-unknown- --chunk=512 /dev/sda2 /dev/sdb2
mdadm: layout -unknown- not understood for raid0.
jsmeix commented at 2021-05-18 14:29:¶
There is no unknown
in
layout/prepare/GNU/Linux/120_include_raid_code.sh
so I assume it comes from something in your disklayout.conf
file
so we need that too to be able to do the next step in analyzing what
went wrong.
cvijayvinoth commented at 2021-05-19 07:12:¶
Here I attached disklayout.conf file.
pcahyna commented at 2021-05-19 08:03:¶
Here is the culprit
raid /dev/md0 metadata=1.2 level=raid0 raid-devices=2 uuid=65bb9239:f0273dfd:7fc22ff5:1319e0fb layout=-unknown- chunk=512 devices=/dev/sda2,/dev/sdb2
Can you please attach the debug log from rear -D mkrescue
?
cvijayvinoth commented at 2021-05-19 09:16:¶
Here I attached rear -D mkrescue log file.
pcahyna commented at 2021-05-19 09:56:¶
So the culprit is
+++ grep Layout /var/www/html/imageBackup/iso/rear.sLsjObhTRCeYDd8/tmp/mdraid
++ layout=-unknown-
which gets printed by
mdadm --misc --detail /dev/md0
Looking at the source, mdadm can indeed print -unknown-
for a RAID
layout and it got recently added to RAID0 (it existed before for RAID5
and RAID6 and RAID10):
https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/Detail.c?id=329dfc28debb58ffe7bd1967cea00fc583139aca
I suspect we don't support RAID levels other than RAID1 very well...
jsmeix commented at 2021-05-19 10:04:¶
It is also my gut feeling that (software) RAID0 is not well supported in
ReaR
because I assume RAID0 is not often used by system admins on their
servers
in contrast to (software) RAID1 which is more often used but I even
assume
whatever software RAID is not often used by system admins on their
servers
in contrast to real hardware RAID or even SAN storage and things like
that
but I am really not an expert in advanced storage technologies.
By the way:
I wonder why software RAID0 and not just LVM "as usual"?
cvijayvinoth commented at 2021-05-21 07:28:¶
@jsmeix : Currently we are having some of the machines in the format of RAID 0. So i tried to check it out.
jsmeix commented at 2021-05-21 09:55:¶
As fas as I see from plain looking at the code the relevant code parts are
during "rear mkrescue" in layout/save/GNU/Linux/210_raid_layout.sh
if [ -n "$layout" ] ; then
layout=" layout=$layout"
else
layout=""
fi
...
echo "raid ${device}${metadata}${level}${ndevices}${uuid}${name}${sparedevices}${layout}${chunksize}${devices}"
during "rear recover" in layout/prepare/GNU/Linux/120_include_raid_code.sh
for option in $options ; do
case "$option" in
...
(*)
mdadmcmd="$mdadmcmd --$option"
;;
esac
done
So I assume a possible fix in
layout/save/GNU/Linux/210_raid_layout.sh
is to treat layout=-unknown-
same as if $layout
was empty i.e.
--- usr/share/rear/layout/save/GNU/Linux/210_raid_layout.sh.original 2021-05-18 13:20:14.270440485 +0200
+++ usr/share/rear/layout/save/GNU/Linux/210_raid_layout.sh 2021-05-21 11:53:25.389723396 +0200
@@ -67,9 +67,16 @@
else
sparedevices=""
fi
- if [ -n "$layout" ] ; then
+ # mdadm can print '-unknown-' for a RAID layout
+ # which got recently (2019-12-02) added to RAID0 (it existed before for RAID5 and RAID6 and RAID10) see
+ # https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/Detail.c?id=329dfc28debb58ffe7bd1967cea00fc583139aca
+ # so we treat '-unknown-' same as an empty value to avoid that layout/prepare/GNU/Linux/120_include_raid_code.sh
+ # will create a 'mdadm' command in diskrestore.sh like "mdadm ... --layout=-unknown- ..." which would fail
+ # during "rear recover" with something like "mdadm: layout -unknown- not understood for raid0"
+ # see https://github.com/rear/rear/issues/2616
+ if test "$layout" -a '-unknown-' != "$layout" ; then
layout=" layout=$layout"
else
layout=""
fi
jsmeix commented at 2021-05-21 10:04:¶
@cvijayvinoth
please test if the change in
https://github.com/rear/rear/pull/2618
makes things work for you.
cvijayvinoth commented at 2021-06-09 04:24:¶
sure.. let me check and update you..
cvijayvinoth commented at 2021-06-10 10:02:¶
still facing the same issue. here I have attached the rear -D recover
log file.
rear-vijay.log
jsmeix commented at 2021-06-18 09:48:¶
@cvijayvinoth
as far as I see your
https://github.com/rear/rear/files/6630326/rear-vijay.log
is only a "rear -D recover" log but my proposed fix in
https://github.com/rear/rear/pull/2618/files
is in usr/share/rear/layout/save/GNU/Linux/210_raid_layout.sh
and the layout/save
stage is run during "rear mkrescue/mkbackup"
So to test if
https://github.com/rear/rear/pull/2618
makes things work for you
you need to test the whole thing i.e.
first "rear -D mkrescue" or "rear -D mkbackup" and attach its log here
then boot that new created ReaR recovery system on replacement test
hardware
and run "rear -D recover" there and also attach its log plus
disklayout.conf here.
After the "rear -D mkrescue" or "rear -D mkbackup"
check the new disklayout.conf if it still contains layout=-unknown-
.
With my changes in
https://github.com/rear/rear/pull/2618
the new disklayout.conf should no longer contain layout=-unknown-
.
jsmeix commented at 2021-07-05 09:43:¶
@cvijayvinoth
could you please test again and verify as I described in
https://github.com/rear/rear/issues/2616#issuecomment-863911028
if
https://github.com/rear/rear/pull/2618
fixes the issue for you?
cvijayvinoth commented at 2021-07-08 13:05:¶
yest recovery is working fine with this change on the fresh vm. Here I
attached the log files
rear-vijayraid0new-recover.log
jsmeix commented at 2021-07-13 11:35:¶
With
https://github.com/rear/rear/pull/2618
merged
this issue should be fixed.
[Export of Github issue for rear/rear.]