#2564 PR merged: Update 110_include_lvm_code.sh to make sure vgremove is called before recreating the VG

Labels: enhancement, fixed / solved / done, minor bug

rmetrich opened issue at 2021-02-05 13:00:

Pull Request Details:
  • Type: Bug Fix

  • Impact: Normal

  • How was this pull request tested? Tested on RHEL7 with an LVM RAID

  • Brief description of the changes in this pull request:

Make sure we delete the volume group before re-creating it.
The issue happens in Migration mode, when ReaR does not try to use vgcfgrestore.
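
A minimal sketch of the idea (not the exact change in 110_include_lvm_code.sh; the VG name and devices below are just the ones from the reproducer):

    # Illustrative only: remove a leftover VG of the same name, if any,
    # before recreating it, so that vgcreate does not fail with
    # "A volume group called ... already exists".
    vg="data"
    if lvm vgs "$vg" >/dev/null 2>&1 ; then
        # --force twice plus --yes to avoid any interactive prompt
        lvm vgremove --force --force --yes "$vg"
    fi
    lvm vgcreate --physicalextentsize 4096k "$vg" /dev/vdb /dev/vdc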

Reproducer:
  1. Install a system

  2. Add 2 additional disks that will be used to host a LVM VG

    • /dev/vdb
    • /dev/vdc
  3. Create a RAID volume

    # pvcreate /dev/vdb
    # pvcreate /dev/vdc
    # vgcreate data /dev/vdb /dev/vdc
    # lvcreate -n vol1 -L 1G -m 1 data
    # mkfs.xfs /dev/data/vol1
    # mount /dev/data/vol1 /mnt
    
  4. Build a rescue image and perform a recovery

    Error is shown below:

    Start system layout restoration.
    Disk '/dev/vda': creating 'msdos' partition table
    Disk '/dev/vda': creating partition number 1 with name 'primary'
    Disk '/dev/vda': creating partition number 2 with name 'primary'
    Creating LVM PV /dev/vdb
    Creating LVM PV /dev/vdc
    Creating LVM PV /dev/vda2
    Creating LVM VG 'data'; Warning: some properties may not be preserved...
    The disk layout recreation script failed
    

    Log excerpt:

    +++ Print 'Creating LVM VG '\''data'\''; Warning: some properties may not be preserved...'
    +++ '[' -e /dev/data ']'
    +++ lvm vgcreate --physicalextentsize 4096k data /dev/vdb /dev/vdc
      A volume group called data already exists.
    

gdha commented at 2021-02-05 14:29:

@rmetrich Yes, I encountered the same issue as well and fixed it the same way via a Chef deployment on >20k systems ;-)

jsmeix commented at 2021-02-08 14:59:

@rmetrich
nice to hear from you again!

rmetrich commented at 2021-02-08 15:10:

Hello,
Yes, I agree that --force twice and --yes would be better, just in case.

As for a VG having a PV on a disk that should not be touched, I would consider that a bug/limitation: it's not possible to re-create such a VG at all.

jsmeix commented at 2021-02-08 15:28:

But lvm vgremove --force --force --yes happens unconditionally for the VG,
so I still fear this might do bad things when there is a VG that also has PVs or LVs
on another disk that must not be touched by "rear recover".

If I am right with my concerns, then a test is missing that a VG
that is to be forcibly 'vgremove'd does not have any PV or LV
on any other disk that must not be touched by "rear recover",
cf. my code about belongs_to_a_disk_to_be_wiped
and DISKS_TO_BE_WIPED in the files in
https://github.com/rear/rear/pull/2514/files

Or I am wrong with my concerns and it is safe to do an enforced 'vgremove'
regardless of where that VG has its PVs and LVs, because it will only remove
that specific VG without possible bad side effects on other storage objects.

I would be happy to be wrong with my concerns, because that would simplify
how to automatically get rid of old LVM stuff.
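
If such a test is indeed needed, a minimal sketch could look like the following. Everything here is illustrative: the helper name is hypothetical, and DISKS_TO_BE_WIPED is assumed to be a space-separated list of whole-disk device nodes like "/dev/vdb /dev/vdc".

    # Hypothetical helper (not ReaR code): succeed only when every PV of the
    # given VG sits on a disk that is listed in DISKS_TO_BE_WIPED.
    vg_only_on_disks_to_be_wiped() {
        local vg="$1" pv parent disk
        # 'vgs -o pv_name' reports one row per PV of the VG:
        for pv in $( lvm vgs --noheadings -o pv_name "$vg" ) ; do
            # Map the PV to its whole-disk device
            # (PKNAME is empty when the PV is the whole disk itself):
            parent="$( lsblk -dno PKNAME "$pv" )"
            disk="/dev/${parent:-${pv##*/}}"
            # A PV on a disk outside the list means the VG must not be removed:
            case " $DISKS_TO_BE_WIPED " in
                (*" $disk "*) ;;
                (*) return 1 ;;
            esac
        done
        return 0
    }

    # Only then do the enforced removal:
    vg_only_on_disks_to_be_wiped "$vg" && lvm vgremove --force --force --yes "$vg"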

jsmeix commented at 2021-02-08 15:31:

My concern is not that, for a VG having a PV on a disk that should not be touched,
it is not possible to re-create that VG at all.

My concern is that an unconditioned enforced vgremove may somehow
damage things on a disk that should not be touched.

My concern is also not that such an enforced vgremove may leave
unused remainders on another disk (e.g. PVs there that are no longer
used in the recreated VG).

My concern is only possible damage / loss of data / whatever problem
on a disk that should not be touched.

gdha commented at 2021-02-08 15:44:

But lvm vgremove --force --force --yes happens unconditionally for the VG,
so I still fear this might do bad things when there is a VG that also has PVs or LVs
on another disk that must not be touched by "rear recover".

If I am right with my concerns, then a test is missing that a VG
that is to be forcibly 'vgremove'd does not have any PV or LV
on any other disk that must not be touched by "rear recover",
cf. my code about belongs_to_a_disk_to_be_wiped
and DISKS_TO_BE_WIPED in the files in
https://github.com/rear/rear/pull/2514/files

@jsmeix I did test your code but that did not remove the VG in question. Only when I used a line similar to what @rmetrich provides did it work perfectly, and I have tested a real DR several times with real servers used by developers.

jsmeix commented at 2021-02-08 16:11:

@gdha
my code in https://github.com/rear/rear/pull/2514/files
does not remove VGs or LVs.
It only removes PVs because for PVs it is easy to determine
to what disk a PV belongs.

Some code is there and it had worked in my tests, but I commented it out

#for detected_LV in $detected_LVs ; do
...
#    if lvremove -f -y $detected_LV ; then

...

#for detected_VG in $detected_VGs ; do
#    if vgremove -ff -y $detected_VG ; then

because of my concern that enforced removal of VGs and/or LVs might result in
bad things on disks that do not belong to DISKS_TO_BE_WIPED,
and at that time I did not find a solution for the problem of how to determine
to what disks a VG belongs. Probably this can be somehow deduced
from the disks that the associated PVs or LVs belong to.
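
One possible way to deduce it, sketched here as an assumption rather than existing ReaR code, is to reuse the same PV-to-disk mapping as in the sketch above:

    # Illustrative only: print the whole-disk devices a VG spans,
    # deduced from its PVs (PKNAME is empty when a PV is a whole disk):
    vg="data"
    for pv in $( lvm vgs --noheadings -o pv_name "$vg" ) ; do
        parent="$( lsblk -dno PKNAME "$pv" )"
        echo "/dev/${parent:-${pv##*/}}"
    done | sort -u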

What needs to be tested is an LVM setup with VGs that spread their PVs
over disks that should be wiped and other disks that should not be wiped,
to see how an enforced vgremove and/or lvremove behaves then.

Did you test such an LVM setup?
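
For reference, a minimal way to build such a mixed setup (device names are only examples; assume /dev/vdb is to be wiped while /dev/vdd must not be touched):

    # Illustrative only: a VG spreading its PVs over a disk to be wiped
    # (/dev/vdb) and a disk that must not be touched (/dev/vdd):
    pvcreate /dev/vdb /dev/vdd
    vgcreate mixed /dev/vdb /dev/vdd
    lvcreate -n vol1 -L 1G mixed
    mkfs.xfs /dev/mixed/vol1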

gdha commented at 2021-02-09 07:21:

@jsmeix

What needs to be tested is an LVM setup with VGs that spread their PVs
over disks that should be wiped and other disks that should not be wiped,
to see how an enforced vgremove and/or lvremove behaves then.

Did you test such an LVM setup?

In short - yes ;-)


[Export of Github issue for rear/rear.]