#2696 Issue closed: 420_autoresize_last_partitions.sh does not support RAID devices

Labels: enhancement, no-issue-activity

jsmeix opened issue at 2021-10-13 12:41:

Current layout/prepare/default/420_autoresize_last_partitions.sh
has the main loop

while read component_type disk_device old_disk_size disk_label junk ; do
    ...
done < <( grep "^disk " "$LAYOUT_FILE" )

so currently only disk devices in disklayout.conf can be autoresized
but e.g. no raid devices in disklayout.conf like for example (excerpts):

disk /dev/sda 12884901888 gpt
disk /dev/sdb 12884901888 gpt
disk /dev/sdc 1073741824 gpt
part /dev/sdc 1072676352 1048576 rear-noname swap /dev/sdc1
raid /dev/md127 metadata=1.0 level=raid1 raid-devices=2 uuid=8d05eb84:2de831d1:dfed54b2:ad592118 name=raid1sdab devices=/dev/sda,/dev/sdb
part /dev/md127 10485760 1048576 rear-noname bios_grub /dev/md127p1
part /dev/md127 12739067392 11534336 rear-noname none /dev/md127p2
fs /dev/mapper/cr_root / btrfs ...
swap /dev/sdc1 uuid=9c606f48-92cd-4f98-be22-0f8a75358bed label=
crypt /dev/mapper/cr_root /dev/md127p2 type=luks1 cipher=aes-xts-plain64 key_size=512 hash=sha256 uuid=d0446c00-9e79-4872-abaa-2d464fd71c99

on my test VM

# lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT,UUID
NAME                      KNAME        PKNAME       TRAN TYPE  FSTYPE             SIZE MOUNTPOINT UUID
/dev/sda                  /dev/sda                  ata  disk  linux_raid_member   12G            8d05eb84-2de8-31d1-dfed-54b2ad592118
`-/dev/md127              /dev/md127   /dev/sda          raid1                     12G            
  |-/dev/md127p1          /dev/md127p1 /dev/md127        part                      10M            
  `-/dev/md127p2          /dev/md127p2 /dev/md127        part  crypto_LUKS       11.9G            d0446c00-9e79-4872-abaa-2d464fd71c99
    `-/dev/mapper/cr_root /dev/dm-0    /dev/md127p2      crypt btrfs             11.9G /          85406026-0559-4b0d-8f67-ec19d3b556f5
/dev/sdb                  /dev/sdb                  ata  disk  linux_raid_member   12G            8d05eb84-2de8-31d1-dfed-54b2ad592118
`-/dev/md127              /dev/md127   /dev/sdb          raid1                     12G            
  |-/dev/md127p1          /dev/md127p1 /dev/md127        part                      10M            
  `-/dev/md127p2          /dev/md127p2 /dev/md127        part  crypto_LUKS       11.9G            d0446c00-9e79-4872-abaa-2d464fd71c99
    `-/dev/mapper/cr_root /dev/dm-0    /dev/md127p2      crypt btrfs             11.9G /          85406026-0559-4b0d-8f67-ec19d3b556f5
/dev/sdc                  /dev/sdc                  ata  disk                       1G            
`-/dev/sdc1               /dev/sdc1    /dev/sdc          part  swap              1023M [SWAP]     9c606f48-92cd-4f98-be22-0f8a75358bed
/dev/sr0                  /dev/sr0                  ata  rom                     1024M

On replacement VM
with smaller RAID disks sda and sdb only 11 GiB
and same 1 GiB sdc
"rear -D recover" fails with

RESCUE localhost:~ # rear -D recover
...
Comparing disks
Ambiguous disk layout needs manual configuration (more than one disk with same size used in '/var/lib/rear/layout/disklayout.conf')
Switching to manual disk layout configuration
Using /dev/sdc (same name and same size) for recreating /dev/sdc
Original disk /dev/sda does not exist (with same size) in the target system
UserInput -I LAYOUT_MIGRATION_REPLACEMENT_SDA needed in /usr/share/rear/layout/prepare/default/300_map_disks.sh line 238
Choose an appropriate replacement for /dev/sda
1) /dev/sda
2) /dev/sdb
3) Do not map /dev/sda
4) Use Relax-and-Recover shell and return back to here
(default '1' timeout 300 seconds)

UserInput: No real user input (empty or only spaces) - using default input
UserInput: Valid choice number result '/dev/sda'
Using /dev/sda (chosen by user) for recreating /dev/sda
Original disk /dev/sdb does not exist (with same size) in the target system
Using /dev/sdb (the only appropriate) for recreating /dev/sdb
Current disk mapping table (source => target):
  /dev/sdc => /dev/sdc
  /dev/sda => /dev/sda
  /dev/sdb => /dev/sdb

UserInput -I LAYOUT_MIGRATION_CONFIRM_MAPPINGS needed in /usr/share/rear/layout/prepare/default/300_map_disks.sh line 275
Confirm or edit the disk mapping
1) Confirm disk mapping and continue 'rear recover'
2) Confirm identical disk mapping and proceed without manual configuration
3) Edit disk mapping (/var/lib/rear/layout/disk_mappings)
4) Use Relax-and-Recover shell and return back to here
5) Abort 'rear recover'
(default '1' timeout 300 seconds)

UserInput: No real user input (empty or only spaces) - using default input
UserInput: Valid choice number result 'Confirm disk mapping and continue 'rear recover''
User confirmed disk mapping
Disabling excluded components in /var/lib/rear/layout/disklayout.conf
Applied disk layout mappings to /var/lib/rear/layout/disklayout.conf
Applied disk layout mappings to /var/lib/rear/layout/config/df.txt
Applied disk layout mappings to /etc/rear/rescue.conf
Examining gpt disk /dev/sdc to automatically resize its last active partition
Skipping /dev/sdc (size of new disk same as size of old disk)
UserInput -I LAYOUT_FILE_CONFIRMATION needed in /usr/share/rear/layout/prepare/default/500_confirm_layout_file.sh line 26
Confirm or edit the disk layout file
1) Confirm disk layout and continue 'rear recover'
2) Edit disk layout (/var/lib/rear/layout/disklayout.conf)
3) View disk layout (/var/lib/rear/layout/disklayout.conf)
4) View original disk space usage (/var/lib/rear/layout/config/df.txt)
5) Use Relax-and-Recover shell and return back to here
6) Abort 'rear recover'
(default '1' timeout 300 seconds)

UserInput: No real user input (empty or only spaces) - using default input
UserInput: Valid choice number result 'Confirm disk layout and continue 'rear recover''
User confirmed disk layout file
Marking component '/dev/sda' as done in /var/lib/rear/layout/disktodo.conf
Marking component '/dev/sdb' as done in /var/lib/rear/layout/disktodo.conf
Marking component '/dev/sdc' as done in /var/lib/rear/layout/disktodo.conf
Marking component '/dev/sdc1' as done in /var/lib/rear/layout/disktodo.conf
Marking component '/dev/md127' as done in /var/lib/rear/layout/disktodo.conf
Marking component '/dev/md127p1' as done in /var/lib/rear/layout/disktodo.conf
Marking component '/dev/md127p2' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'swap:/dev/sdc1' as done in /var/lib/rear/layout/disktodo.conf
Marking component '/dev/mapper/cr_root' as done in /var/lib/rear/layout/disktodo.conf
Fallback SLES-like btrfs subvolumes setup for /dev/mapper/cr_root on / (no match in BTRFS_SUBVOLUME_GENERIC_SETUP or BTRFS_SUBVOLUME_SLES_SETUP)
Marking component 'fs:/' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/boot/grub2/x86_64-efi' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/home' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/boot/grub2/i386-pc' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/root' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/srv' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/opt' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/tmp' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/usr/local' as done in /var/lib/rear/layout/disktodo.conf
Marking component 'btrfsmountedsubvol:/var' as done in /var/lib/rear/layout/disktodo.conf
UserInput -I LAYOUT_CODE_CONFIRMATION needed in /usr/share/rear/layout/recreate/default/100_confirm_layout_code.sh line 26
Confirm or edit the disk recreation script
1) Confirm disk recreation script and continue 'rear recover'
2) Edit disk recreation script (/var/lib/rear/layout/diskrestore.sh)
3) View disk recreation script (/var/lib/rear/layout/diskrestore.sh)
4) View original disk space usage (/var/lib/rear/layout/config/df.txt)
5) Use Relax-and-Recover shell and return back to here
6) Abort 'rear recover'
(default '1' timeout 300 seconds)

UserInput: No real user input (empty or only spaces) - using default input
UserInput: Valid choice number result 'Confirm disk recreation script and continue 'rear recover''
User confirmed disk recreation script
Start system layout restoration.
Disk '/dev/sdc': creating 'gpt' partition table
Disk '/dev/sdc': creating partition number 1 with name ''sdc1''
Creating software RAID /dev/md127
Disk '/dev/md127': creating 'gpt' partition table
Disk '/dev/md127': creating partition number 1 with name ''md127p1''
Disk '/dev/md127': creating partition number 2 with name ''md127p2''
UserInput -I LAYOUT_CODE_RUN needed in /usr/share/rear/layout/recreate/default/200_run_layout_code.sh line 127
The disk layout recreation script failed
1) Rerun disk recreation script (/var/lib/rear/layout/diskrestore.sh)
2) View 'rear recover' log file (/var/log/rear/rear-localhost.log)
3) Edit disk recreation script (/var/lib/rear/layout/diskrestore.sh)
4) View original disk space usage (/var/lib/rear/layout/config/df.txt)
5) Use Relax-and-Recover shell and return back to here
6) Abort 'rear recover'
(default '1' timeout 300 seconds)
6
UserInput: Valid choice number result 'Abort 'rear recover''
ERROR: User chose to abort 'rear recover' in /usr/share/rear/layout/recreate/default/200_run_layout_code.sh

Note that there is only

Examining gpt disk /dev/sdc to automatically resize its last active partition
Skipping /dev/sdc (size of new disk same as size of old disk)

but nothing for the RAID disks sda and sdb.

Accordingly the ReaR log file shows what failed in diskrestore.sh

+++ parted -s /dev/md127 mkpart ''\''md127p2'\''' 11534336B 12750601727B
Error: The location 12750601727B is outside of the device /dev/md127.

as expected.

jsmeix commented at 2021-10-14 13:55:

As far as I see currently I think
autoresizing partitions on RAID1 block devices
might become terribly overcomplicated
up to impossible in practice with reasonable effort.

Reasoning:
A main difference between making partitions directly on a disk
versus making partitions on a Linux software RAID1 block device is that
the size of the replacement disk is known directly on the hardware
so the replacement disk size can be compared with the original disk size
cf. in layout/prepare/default/420_autoresize_last_partitions.sh

sysfsname=$( get_sysfs_name $disk_device )
...
new_disk_size=$( get_disk_size "$sysfsname" )
is_positive_integer $new_disk_size || Error "Failed to get_disk_size() for $disk_device"

But the size of a RAID1 block device is not known on the hardware because
there is no Linux software RAID1 block device on the plain hardware
because that would have to be first created but that is not how
the whole automated disk migration code must work in ReaR
to be in compliance how ReaR recovery works by design.

The whole disk recreation during "rear recover" happens
by running a script diskrestore.sh that is generated by ReaR
from the predefined values in the disklayout.conf file.

This way the user gets a script diskrestore.sh that he can adapt
if needed to do or fix special things before that script is run.

Current automated disk migration code adapts the values
in the disklayout.conf file before diskrestore.sh is generated.
The diskrestore.sh script contains the right fixed commands
to set up Linux software RAID1 block devices and the right
fixed commands to create partitions on block devices.

But the setup of the RAID1 block device is a precondition
to know the RAID1 size which is a precondition to calculate
the matching values to create partitions on RAID1 block devices.
So the diskrestore.sh script would have to be generated code
that does all that dynamically which looks terribly overcomplicated
up to impossible in practice with reasonable effort and the user
could no longer easily adapt such a dynamic diskrestore.sh script
(indirections instead of plain commands with the actual values).

jsmeix commented at 2021-10-15 07:12:

I think there is a way out
that is sufficiently simple and works in compliance
with how ReaR recovery works by design.

In disklayout.conf is (excerpts)

disk /dev/sda 12884901888 gpt
disk /dev/sdb 12884901888 gpt
...
raid /dev/md127 ... devices=/dev/sda,/dev/sdb

so I know the RAID array member devices /dev/sda and /dev/sdb
and their size on the original system
and the size of /dev/sda and /dev/sdb on the replacement hardware
so I can directly compare disk sizes.

Things could become a bit tricky but hopefully still sufficiently simple
when the RAID array member devices have different size.
For example:

original system  replacement system
sda 14 GiB       sda 11 GiB
sdb 12 GiB       sdb 10 GiB

In case of RAID1 the original system RAID array
has maximum size of 12 GiB minus about 1 MB for RAID metadata
and the replacement system RAID array
has maximum size of 10 GiB minus about 1 MB for RAID metadata
so the maximum size difference is exactly 2 GiB
so the last partition on the RAID array should be shrinked by 2 GiB.

jsmeix commented at 2021-12-14 13:43:

With https://github.com/rear/rear/pull/2726 merged
automated resizing active last partitions on RAID1 disks
should work - but only RAID level 1 is currently supported.

jsmeix commented at 2021-12-22 12:57:

With https://github.com/rear/rear/pull/2730 merged
automated resizing active last partitions on RAID0 disks
should work.

There are still issues with resizing RAID1 in extreme cases, see
https://github.com/rear/rear/pull/2726#issuecomment-999546292

github-actions commented at 2022-02-21 02:09:

Stale issue message

github-actions commented at 2022-04-23 02:49:

Stale issue message

jsmeix commented at 2022-06-02 07:47:

Automated resizing active last partitions
should work for RAID1 and RAID0
for the intended use case which is
resizing when sizes differ only a little bit
(in particular when a hardware disk was replaced
with another hardware disk with "same size" e.g. both
have "1TB" but the actually usable disk space differs a bit).

What does not yet work are

  • other RAID levels (in particular neither RAID5 nor RAID10)
  • reliable error detection (in particular detection of impossible cases)

This is something for ReaR 2.8

github-actions commented at 2022-08-02 03:53:

Stale issue message

jsmeix commented at 2022-08-10 06:18:

According to my tests that I listed at
https://github.com/rear/rear/wiki/Test-Matrix-ReaR-2.7

SLES15 SP4 with RAID1 migrated to replacement systems
with smaller and bigger disk size

and

SLES15 SP4 with RAID0 migrated to replacement systems
with smaller and bigger disk size

it works sufficiently for RAID1 and RAID0
for the intended use cases.

What is not yet supported are RAID5 and RAID10.

github-actions commented at 2022-10-10 04:00:

Stale issue message

github-actions commented at 2022-12-10 02:29:

Stale issue message

github-actions commented at 2023-02-11 02:30:

Stale issue message

github-actions commented at 2023-04-15 02:19:

Stale issue message


[Export of Github issue for rear/rear.]