#3061 PR merged: Save LVM pool metadata volume size in disk layout

Labels: enhancement, bug, fixed / solved / done

pcahyna opened issue at 2023-10-30 17:42:

Pull Request Details:
  • Type: Bug Fix

  • Impact: Normal

  • Reference to related issue (URL):

  • How was this pull request tested?

    • Installed a RHEL 9 system onto a thin pool
    • Extended the thin pool to another disk, while artificially keeping its metadata volume size unchanged, and using 100% of the VG (no free space in the VG left).
    • Backed up and recovered

With a ReaR version from before this change, recovery fails with:

+++ lvm lvcreate -y -L 1073741824b -n swap rhel_kvm-08-guest06
  Volume group "rhel_kvm-08-guest06" has insufficient free space (250 extents): 256 required.

swap is not part of the thin pool and is created after it. Without the change, the thin pool consumes too much space in the VG because its metadata volume is too large, leaving not enough space for the swap volume and causing the error above. With this change the recovery is successful.

  • Description of the changes in this pull request:

Instead of letting LVM use the default pool metadata volume size when restoring a layout with thin pools, use the size from the original system. The pool metadata size will be saved in disklayout.conf as the new key "poolmetadatasize" on the thin pool LV.
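
For illustration, a thin pool entry in disklayout.conf could then look like the following hypothetical line (VG name, LV name and sizes are invented here, and as I understand it the general format is lvmvol <volume_group> <name> <size(bytes)> <layout> [key:value ...]):

    # Hypothetical example - not taken from the tested system:
    lvmvol /dev/rhel_kvm-08-guest06 pool00 10737418240b thin,pool chunksize:65536b poolmetadatasize:16777216b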

This makes the layout of the recovered system closer to the original system.

Prevents some cases of recovery failure during layout restoration: if the original system used a non-default (in particular, smaller) metadata volume size and the space in the VG was fully used, there would not be enough space in the VG to recover all LVs (layout restoration would create a larger pool metadata volume than the original had, so when restoring to disks of the same size, that space would be missing elsewhere).

jsmeix commented at 2023-10-31 09:05:

Because the code already contains

# Check for 'lvs' support of the 'lv_layout' field:

I wonder whether the newly added lv_metadata_size field
is always supported in practice - i.e. whether it is also supported
on reasonably old Linux distributions (older than RHEL 9)?

jsmeix commented at 2023-10-31 09:30:

Here is a general overview of which fields are supported
by older LVM versions:

I test with

# lvs_fields="origin,lv_name,vg_name,lv_size,lv_layout,pool_lv,chunk_size,stripes,stripe_size,seg_size,lv_metadata_size"

# lvs --separator=':' --noheadings --units b --nosuffix -o $lvs_fields 2>&1 | grep "Unrecognised field"

and reduce lvs_fields step by step by removing whatever is reported as an "Unrecognised field".
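
The same probe can be automated; here is a minimal sketch of my own (not from this thread), reusing the grep from above:

    # Probe each field individually and report the unsupported ones:
    for field in ${lvs_fields//,/ } ; do
        lvs --noheadings -o "$field" 2>&1 | grep -q "Unrecognised field" && echo "unsupported: $field"
    done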

On SLES10 SP4
with lvm2-2.02.17
these fields are supported:
origin,lv_name,vg_name,lv_size,chunk_size,stripes,stripe_size,seg_size
i.e. these fields are not supported:
lv_layout
pool_lv
lv_metadata_size

# lvs_fields="origin,lv_name,vg_name,lv_size,chunk_size,stripes,stripe_size,seg_size"

# lvs --separator=':' --noheadings --units b --nosuffix -o $lvs_fields && echo OK
  No volume groups found
OK

ReaR would fail on SLE10.

On SLES11 SP3
with lvm2-2.02.98
these fields are supported:
origin,lv_name,vg_name,lv_size,pool_lv,chunk_size,stripes,stripe_size,seg_size,lv_metadata_size
i.e. only one field is not supported:
lv_layout

# lvs_fields="origin,lv_name,vg_name,lv_size,pool_lv,chunk_size,stripes,stripe_size,seg_size,lv_metadata_size"

# lvs --separator=':' --noheadings --units b --nosuffix -o $lvs_fields && echo OK
  No volume groups found
OK

ReaR should still work on SLE11.

On SLES12 SP5
with lvm2-2.02.180
all of these fields are supported:
origin,lv_name,vg_name,lv_size,lv_layout,pool_lv,chunk_size,stripes,stripe_size,seg_size,lv_metadata_size
so ReaR should work on SLE12 and later.

Summary:

I do not care about SLE10.
ReaR on SLE10 was never officially supported by SUSE.

I no longer care about SLE11.
ReaR on SLE11 was officially supported by SUSE.
Meanwhile SLE11 is out of official SUSE support, see
https://www.suse.com/lifecycle/#product-suse-linux-enterprise-high-availability-extension

SUSE Linux Enterprise High Availability Extension 11
General Support Ends    31 Mar 2019
LTSS Ends               31 Mar 2022

But in this case ReaR should still work even on SLE11.

pcahyna commented at 2023-10-31 10:03:

@jsmeix I tested that as well: lv_metadata_size (and lv_layout) are supported even on RHEL 6.

Given that

> On SLES12 SP5
> with lvm2-2.02.180
> all of these fields are supported:
> origin,lv_name,vg_name,lv_size,lv_layout,pool_lv,chunk_size,stripes,stripe_size,seg_size,lv_metadata_size

and

> I no longer care about SLE11.

I propose to remove the whole lv_layout_supported=no branch, which would simplify the code a bit.
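
A hedged sketch of the resulting simplification (illustrative only - variable name as in the probes above, not necessarily the actual code in ReaR's layout-saving script):

    # With the lv_layout_supported=no fallback branch removed, one
    # unconditional lvs call with the full field list suffices
    # (supported per the tests above, and even on RHEL 6):
    lvs_fields="origin,lv_name,vg_name,lv_size,lv_layout,pool_lv,chunk_size,stripes,stripe_size,seg_size,lv_metadata_size"
    lvm lvs --separator=':' --noheadings --units b --nosuffix -o "$lvs_fields"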

pcahyna commented at 2023-10-31 10:22:

> LGTM.
>
> Reading the code again (really not trivial), I realized - probably again - that those "LV Pairs" are actually command line arguments to lvcreate and that ReaR sort of relies on that implicitly.
>
> Maybe worth explaining this context somewhere.

@schlomo Good point! It also took me a while to realize that I cannot give the key any name I want, unless I want to implement a special case in 110_include_lvm_code.sh. Please have a look at my last commit.
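
To make that implicit contract concrete, here is a minimal sketch of my own of the key:value to option mapping (invented values; not the actual code in 110_include_lvm_code.sh):

    # Each key:value pair from the lvmvol line becomes the lvcreate
    # long option of the same name, which is why the key cannot be
    # named arbitrarily:
    lvopts=""
    for keyvalue in poolmetadatasize:16777216b chunksize:65536b ; do
        lvopts+=" --${keyvalue%%:*} ${keyvalue#*:}"
    done
    # lvopts is now: --poolmetadatasize 16777216b --chunksize 65536b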

jsmeix commented at 2023-10-31 13:19:

@pcahyna
feel free to remove everything that belongs to "lv_layout_supported".

By the way:
I noticed an outdated comment and an outdated message
at another code place:
Meanwhile (cf.
https://github.com/rear/rear/commit/29e739ae7c0651f8f77c60846bfbe2b6c91baa29
and https://github.com/rear/rear/pull/2903 )
the code calls lvm pvdisplay -C --separator '|' ...
but later there is still lvm pvdisplay -c:

    # Check the exit code of "lvm pvdisplay -c"
    # in the "lvm pvdisplay -c | while read line ; do ... done" pipe:
    pvdisplay_exit_code=${PIPESTATUS[0]}
    test $pvdisplay_exit_code -eq 0 || Error "LVM command 'lvm pvdisplay -c' failed with exit code $pvdisplay_exit_code"

which should be updated to something like

    # Check the exit code of "lvm pvdisplay -C ..."
    # in the "lvm pvdisplay -C ... | while read line ; do ... done" pipe:
    pvdisplay_exit_code=${PIPESTATUS[0]}
    test $pvdisplay_exit_code -eq 0 || Error "LVM command 'lvm pvdisplay' failed with exit code $pvdisplay_exit_code"

schlomo commented at 2023-10-31 13:37:

I'd recommend merging this before it turns into a general overhaul of that code area.

pcahyna commented at 2023-10-31 17:45:

> I noticed an outdated comment and an outdated message
> at another code place:

@jsmeix thanks, fixed in the last commit (although it does not really belong in this PR).

pcahyna commented at 2023-10-31 17:47:

> Big thanks for making ReaR solve even more challenging DR scenarios!! This is where we show the commercial tools our true value :-)

@schlomo and the scenario is very much real. One can encounter quite unusual setups in practice (sometimes due to creative ways of deploying systems).


[Export of Github issue for rear/rear.]