#1722 Issue closed: Cannot restore disk built on multipath + md¶
Labels: enhancement, fixed / solved / done
rmetrich opened issue at 2018-02-01 14:57:¶
Relax-and-Recover (ReaR) Issue Template¶
Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):
- rear version (/usr/sbin/rear -V):
rear 2.3
- OS version (cat /etc/rear/os.conf or lsb_release -a):
RHEL 6 or 7
- rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):
AUTOEXCLUDE_MULTIPATH=n
- Brief description of the issue:
When the system disk consists of a multipath disk + software RAID, recovering the system will fail if the disk is already partitioned.
This is because the software RAID array is assembled at boot time, when
disks are discovered. On RHEL, at least, this is done using a udev rule.
Because the software array is already assembled, the multipath map
cannot be created (the devices are busy).
- Work-around, if any:
Stop the Software Arrays, and disable the udev rule.
rmetrich commented at 2018-02-01 15:03:¶
Some more details below.
- rear 2.0 would fail trying to partition the multipath device /dev/mpathX
- rear 2.3 proposes some disk mapping from /dev/mpathX to one of the real disks (e.g. /dev/sda), because it believes the disk doesn't exist. Later, when parted runs, it fails with the following error:
+++ parted -s /dev/sda mklabel msdos
Error: Partition(s) 2 on /dev/sda have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.
The error is due to the use of /dev/sda by the MD array.
On RHEL, disabling the udev rule
/lib/udev/rules.d/65-md-incremental.rules
does the trick, because arrays are then no longer assembled automatically.
I don't know about other Linux variants ...
jsmeix commented at 2018-02-06 10:17:¶
@rmetrich
regarding "disabling the udev rule
/lib/udev/rules.d/65-md-incremental.rules"
cf. what we already try to do in
usr/share/rear/layout/prepare/GNU/Linux/100_include_partition_code.sh
and see
https://github.com/rear/rear/commit/ff1bb730d38eb42e2abf866736cf188bef0b8b9b
and
https://github.com/rear/rear/issues/533
how doing such things unconditionally could have
bad consequences on other Linux distributions.
rmetrich commented at 2018-02-06 10:32:¶
What is tried in #533 doesn't work for multipath devices: multipath devices are currently not "discovered" at boot, so the software RAID arrays get assembled first, which then causes multipath, once loaded, to fail mapping the devices.
I see 2 solutions:
- either disable software RAID assembly at boot completely (by overriding the udev rule as I proposed; see the sketch below)
- or include multipath support in the ReaR initramfs, as is done in the standard initramfs
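For illustration, a minimal sketch of what such an override could look like in the ReaR recovery system (hypothetical commands, not the actual ReaR change; the rule file name is the RHEL one mentioned above, and an empty file of the same name in /etc/udev/rules.d takes precedence over the one shipped in /lib/udev/rules.d):
# touch /etc/udev/rules.d/65-md-incremental.rules
# udevadm control --reload-rules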
jsmeix commented at 2018-02-06 10:36:¶
@rmetrich
because I have zero experience with multipath
I agree with anything that you do for multipath.
Perhaps @schabrolles could have a look (as time permits)
because he did lots of improvements for multipath in ReaR.
jsmeix commented at 2018-02-06 13:14:¶
@schabrolles
I dared to assign this issue to you - even if you may not have time
right now.
I will still have a look, but as a multipath-noob I cannot actually
help here or do a meaningful review of a pull request.
jsmeix commented at 2018-06-05 08:57:¶
Because the initial description
https://github.com/rear/rear/issues/1722#issue-293560693
reads (excerpts):
... recovering the system if the disk is already partitioned will fail ... ... at boot time ... [something] ... is done using a udev rule ... ... devices are busy
the actual root cause of this issue is old (meta-)data on the disk,
which is known to cause arbitrary kinds of weird failures for
"rear recover", and ReaR still has no generic "cleanupdisk" script, see
https://github.com/rear/rear/issues/799
@rmetrich
could you check in the ReaR recovery system, before you run "rear recover",
what disk device nodes (like /dev/sda) and - even more important -
what partition device nodes (like /dev/sda1, /dev/sda2, ...) already exist.
Could you test if it helps to run wipefs -a -f for all of them
in reverse ordering, i.e. first wipefs the partitions starting
with the last one and finally wipefs the disk, e.g. like
wipefs -a -f /dev/sda2
wipefs -a -f /dev/sda1
wipefs -a -f /dev/sda
cf. https://github.com/rear/rear/issues/799#issue-141001306
Probably, to make the old partition device nodes go away,
an additional parted /dev/sda mklabel command is needed, cf.
https://github.com/rear/rear/issues/799#issuecomment-197286229
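Put together with the parted step, the full manual sequence for the example disk above could look like this (only a sketch, not yet verified in the multipath case):
wipefs -a -f /dev/sda2
wipefs -a -f /dev/sda1
wipefs -a -f /dev/sda
parted -s /dev/sda mklabel msdos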
rmetrich commented at 2018-06-05 15:10:¶
@jsmeix The issue is not solvable using wipefs.
Indeed, at boot, ReaR deliberately doesn't load the multipath driver,
so the disks are not mapped by device-mapper-multipath.
However, the MD discovery mechanism still runs automatically (through
the udev rule), causing the arrays to be rebuilt automatically
based on the content of the /etc/mdadm.conf file.
For example, you will have this on SLES12sp3:
# cat /etc/mdadm.conf
DEVICE containers partitions
ARRAY /dev/md0 UUID=2e09978c:ab423fe6:33d5210b:66a86e34
This causes the /dev/md0 array to be built by selecting some disks
matching the UUID, as shown below:
# blkid
/dev/sda1: UUID="2e09978c-ab42-3fe6-33d5-210b66a86e34" UUID_SUB="d588af1a-de50-199f-bd5d-caac8f70ea16" LABEL="any:0" TYPE="linux_raid_member" PARTUUID="000d64bb-01"
/dev/sdb1: UUID="2e09978c-ab42-3fe6-33d5-210b66a86e34" UUID_SUB="5cfbaa1a-a277-2979-15c5-e2abb5b49dbc" LABEL="any:0" TYPE="linux_raid_member" PARTUUID="0006447c-01"
/dev/sdc1: UUID="2e09978c-ab42-3fe6-33d5-210b66a86e34" UUID_SUB="d588af1a-de50-199f-bd5d-caac8f70ea16" LABEL="any:0" TYPE="linux_raid_member" PARTUUID="000d64bb-01"
/dev/sdd1: UUID="2e09978c-ab42-3fe6-33d5-210b66a86e34" UUID_SUB="5cfbaa1a-a277-2979-15c5-e2abb5b49dbc" LABEL="any:0" TYPE="linux_raid_member" PARTUUID="0006447c-01"
/dev/md0: UUID="603f8753-f80a-4799-b862-ef2f5968bdb6" TYPE="ext4"
/dev/sr0: UUID="2018-06-05-14-43-15-00" LABEL="RELAXRECOVER" TYPE="iso9660"
In my case, it selected paths sdd1 and sdc1:
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdd1[1] sdc1[0]
20970368 blocks super 1.0 [2/2] [UU]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
Then, upon running rear recover, multipath fails to create the
multipath mapping (e.g. /dev/mpatha) because the disks are busy
due to the MD array being assembled.
jsmeix commented at 2018-06-06 09:11:¶
@rmetrich
thank you so much for your explanatory description
of how things behave in the special case of multipath.
It helps so much!
If you find time to play around even more with it
I would like to know if you think there is a generic way
how to clean up a used disk so that multipath issues
would no longer happen.
As far as I see, the UUID is the crucial point here,
because I assume that on an unused disk there is no
partition UUID that matches the UUID in /etc/mdadm.conf,
so that no disk array is automatically built.
If my assumption is right, it might be possible to somehow
clean all partition UUIDs from a disk as a generic solution
to make an already used disk look as if it was a new disk.
My basic idea behind such a generic "cleanupdisk" script
is to have a single place of code (i.e. one script) that
does all that is needed to make an already used disk
behave the same for "rear recover" as a new disk,
instead of fixing each particular issue with used disks
at various different places in the code.
But such a generic "cleanupdisk" script is for a future ReaR release.
At least for now your
https://github.com/rear/rear/pull/1819
is perfectly fine.
jsmeix commented at 2018-06-06 09:32:¶
@rmetrich
can you explain why this issue is not solvable using wipefs
in your case?
In my non-multipath case I can clean all disk partition UUIDs
from the disks by using wipefs in the recovery system:
RESCUE f121:~ # blkid
/dev/sdb1: UUID="e1996af3-bc9d-46fc-a5fc-a3aac09e63d9" TYPE="ext4" PARTUUID="00070810-01"
/dev/sda1: UUID="b754c8dc-cd7d-42fa-8edc-a69c02c54f68" TYPE="swap" PARTUUID="00046bb9-01"
/dev/sda2: UUID="83a3ba1c-7c87-4249-a0cd-a60c3e41ac8a" UUID_SUB="0e0ab27c-9995-4005-a533-1013dd616675" TYPE="btrfs" PARTUUID="00046bb9-02"
/dev/sr0: UUID="2018-05-09-16-17-41-00" LABEL="RELAXRECOVER" TYPE="iso9660"
RESCUE f121:~ # parted -s /dev/sda unit MiB print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 20480MiB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start    End       Size      Type     File system     Flags
 1      1.00MiB  1490MiB   1489MiB   primary  linux-swap(v1)  type=83
 2      1490MiB  20480MiB  18990MiB  primary  btrfs           boot, type=83
RESCUE f121:~ # parted -s /dev/sdb unit MiB print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdb: 2048MiB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start    End      Size     Type     File system  Flags
 1      1.00MiB  2047MiB  2046MiB  primary  ext4         type=83
RESCUE f121:~ # ls -l /dev/sd*
brw-rw---- 1 root disk 8,  0 Jun  6 09:16 /dev/sda
brw-rw---- 1 root disk 8,  1 Jun  6 09:16 /dev/sda1
brw-rw---- 1 root disk 8,  2 Jun  6 09:16 /dev/sda2
brw-rw---- 1 root disk 8, 16 Jun  6 09:16 /dev/sdb
brw-rw---- 1 root disk 8, 17 Jun  6 09:16 /dev/sdb1
RESCUE f121:~ # wipefs -a -f /dev/sdb1
/dev/sdb1: 2 bytes were erased at offset 0x00000438 (ext4): 53 ef
RESCUE f121:~ # wipefs -a -f /dev/sdb
/dev/sdb: 2 bytes were erased at offset 0x000001fe (dos): 55 aa
RESCUE f121:~ # wipefs -a -f /dev/sda2
/dev/sda2: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
RESCUE f121:~ # wipefs -a -f /dev/sda1
/dev/sda1: 10 bytes were erased at offset 0x00000ff6 (swap): 53 57 41 50 53 50 41 43 45 32
RESCUE f121:~ # wipefs -a -f /dev/sda
/dev/sda: 2 bytes were erased at offset 0x000001fe (dos): 55 aa
RESCUE f121:~ # blkid
/dev/sr0: UUID="2018-05-09-16-17-41-00" LABEL="RELAXRECOVER" TYPE="iso9660"
RESCUE f121:~ # parted -s /dev/sda unit MiB print
Error: /dev/sda: unrecognised disk label
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 20480MiB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:
RESCUE f121:~ # parted -s /dev/sdb unit MiB print
Error: /dev/sdb: unrecognised disk label
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdb: 2048MiB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:
RESCUE f121:~ # ls -l /dev/sd*
brw-rw---- 1 root disk 8,  0 Jun  6 09:21 /dev/sda
brw-rw---- 1 root disk 8,  2 Jun  6 09:21 /dev/sda2
brw-rw---- 1 root disk 8, 16 Jun  6 09:21 /dev/sdb
RESCUE f121:~ # parted -s /dev/sda mklabel msdos
RESCUE f121:~ # parted -s /dev/sdb mklabel msdos
RESCUE f121:~ # parted -s /dev/sda unit MiB print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 20480MiB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start  End  Size  Type  File system  Flags
RESCUE f121:~ # parted -s /dev/sdb unit MiB print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdb: 2048MiB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start  End  Size  Type  File system  Flags
RESCUE f121:~ # ls -l /dev/sd*
brw-rw---- 1 root disk 8,  0 Jun  6 09:26 /dev/sda
brw-rw---- 1 root disk 8, 16 Jun  6 09:27 /dev/sdb
RESCUE f121:~ # blkid
/dev/sr0: UUID="2018-05-09-16-17-41-00" LABEL="RELAXRECOVER" TYPE="iso9660"
/dev/sdb: PTUUID="000aaa99" PTTYPE="dos"
/dev/sda: PTUUID="00079970" PTTYPE="dos"
Now those already used disks look to me as if they were new disks.
But perhaps something subtle is still left somewhere, which is
the reason why wipefs (in reverse ordering) plus parted mklabel msdos
are not yet sufficient in case of multipath.
rmetrich commented at 2018-06-06 14:33:¶
For MD devices, you first need to stop the array:
# mdadm --stop /dev/md0
Then wiping the partitions and disks effectively works: after executing
rear recover, the multipath driver will be brought up and the devices
mapped properly.
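For the example above, the manual sequence in the recovery system would be roughly the following (a sketch using the device names from my test; the RAID member partitions have to be identified with blkid first, and the remaining sdX devices are presumably just additional paths to the same LUNs):
# mdadm --stop /dev/md0
# wipefs -a -f /dev/sdd1
# wipefs -a -f /dev/sdc1
# wipefs -a -f /dev/sdd
# wipefs -a -f /dev/sdc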
However, how do you want to do this automatic wipe? How will you select
the devices to wipe? You cannot just wipe everything blkid reports,
otherwise USB keys connected to the host would be wiped as well.
Will you wipe based on /var/lib/rear/layout/disklayout.conf?
Even with wiping, this PR can still be useful in case the admin doesn't
want a real recovery but only wants to fix some things in the system.
I don't know if this is a ReaR use case, however.
jsmeix commented at 2018-06-06 14:54:¶
Currently I do not know how to do an automatic wipe.
Currently I am trying to find out if it is possible at all
to generically wipe a disk, at least for the common use cases.
I would try to wipe each active (i.e. non-commented) disk in
disklayout.conf, i.e. wipe each disk where diskrestore.sh would
do partitioning, because partitioning destroys all existing content
on a disk anyway (the current "rear recover" cannot leave some old
partitions unchanged on a disk, cf.
https://github.com/rear/rear/issues/1681
), so things cannot get worse when such a disk is wiped beforehand.
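As a rough illustration of that idea, a generic wipe could look something like the sketch below (hypothetical code, not an actual ReaR script; it assumes the usual "disk <device> <size> <label>" entries in disklayout.conf, simple /dev/sdXN partition naming, and that any auto-assembled MD arrays were already stopped with mdadm --stop):
while read keyword device size label junk ; do
    # act only on the active (non-commented) 'disk' entries
    test "$keyword" = "disk" || continue
    # wipe the partitions first, in reverse order ...
    for partition in $( ls ${device}?* 2>/dev/null | sort -rV ) ; do
        wipefs -a -f $partition
    done
    # ... and finally the disk itself
    wipefs -a -f $device
done < /var/lib/rear/layout/disklayout.conf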
jsmeix commented at 2018-06-12 07:07:¶
With
https://github.com/rear/rear/pull/1819
merged
this issue should be fixed.
[Export of Github issue for rear/rear.]