#544 Issue closed: rear does not support SUSE-specific 'gpt_sync_mbr' partition scheme (unable to recover when hybrid MBR found)

Labels: enhancement, bug, fixed / solved / done

roseswe opened issue at 2015-02-04 15:13:

Output of rear recover
:: cut ::

2015-02-04 15:07:13 Size of device sdi matches.
2015-02-04 15:07:13 Looking for sdj...
2015-02-04 15:07:13 Device sdj exists.
2015-02-04 15:07:13 Size of device sdj matches.
2015-02-04 15:07:13 Disk configuration is identical, proceeding with restore.
2015-02-04 15:07:13 Including layout/prepare/default/30_map_disks.sh
2015-02-04 15:07:13 Including layout/prepare/default/31_remove_exclusions.sh
2015-02-04 15:07:13 Including layout/prepare/default/32_apply_mappings.sh
2015-02-04 15:07:13 Including layout/prepare/default/40_autoresize_disks.sh
2015-02-04 15:07:13 Including layout/prepare/default/50_confirm_layout.sh
2015-02-04 15:07:13 Including layout/prepare/default/51_list_dependencies.sh
2015-02-04 15:07:13 Including layout/prepare/default/52_exclude_components.sh
2015-02-04 15:07:13 Including layout/prepare/default/54_generate_device_code.sh
2015-02-04 15:07:13 ERROR: Partition number '' of partition  is not a valid number.
=== Stack trace ===
Trace 0: /bin/rear:249 main
Trace 1: /usr/share/rear/lib/recover-workflow.sh:29 WORKFLOW_recover
Trace 2: /usr/share/rear/lib/framework-functions.sh:70 SourceStage
Trace 3: /usr/share/rear/lib/framework-functions.sh:31 Source
Trace 4: /usr/share/rear/layout/prepare/default/54_generate_device_code.sh:52 source
Trace 5: /usr/share/rear/lib/layout-functions.sh:35 create_device
Trace 6: /usr/share/rear/layout/prepare/GNU/Linux/10_include_partition_code.sh:68 create_disk
Trace 7: /usr/share/rear/layout/prepare/GNU/Linux/10_include_partition_code.sh:210 create_partitions
Trace 8: /usr/share/rear/lib/layout-functions.sh:328 get_partition_number
Trace 9: /usr/share/rear/lib/_input-output-functions.sh:132 StopIfError
Message: Partition number '' of partition  is not a valid number.
===================
/usr/share/rear/lib/layout-functions.sh: line 331: ((: <= 128 : syntax error: operand expected (error token is "<= 128 ")
2015-02-04 15:07:13 ERROR: Partition  is numbered ''. More than 128 partitions is not supported.
=== Stack trace ===
Trace 0: /bin/rear:249 main
Trace 1: /usr/share/rear/lib/recover-workflow.sh:29 WORKFLOW_recover
Trace 2: /usr/share/rear/lib/framework-functions.sh:70 SourceStage
Trace 3: /usr/share/rear/lib/framework-functions.sh:31 Source
Trace 4: /usr/share/rear/layout/prepare/default/54_generate_device_code.sh:52 source
Trace 5: /usr/share/rear/lib/layout-functions.sh:35 create_device
Trace 6: /usr/share/rear/layout/prepare/GNU/Linux/10_include_partition_code.sh:68 create_disk
Trace 7: /usr/share/rear/layout/prepare/GNU/Linux/10_include_partition_code.sh:210 create_partitions
Trace 8: /usr/share/rear/lib/layout-functions.sh:332 get_partition_number
Trace 9: /usr/share/rear/lib/_input-output-functions.sh:132 StopIfError
Message: Partition  is numbered ''. More than 128 partitions is not supported.
===================
2015-02-04 15:07:13 Running exit tasks.
2015-02-04 15:07:13 Finished in 71 seconds
2015-02-04 15:07:13 Removing build area /tmp/rear.ZhlcnK9tviL0e8q
rmdir: removing directory, `/tmp/rear.ZhlcnK9tviL0e8q/outputfs'
rmdir: failed to remove `/tmp/rear.ZhlcnK9tviL0e8q/outputfs': No such file or directory
rmdir: removing directory, `/tmp/rear.ZhlcnK9tviL0e8q'
2015-02-04 15:07:13 End of program reached

# fdisk -l /dev/sda

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 64.4 GB, 64424509440 bytes
255 heads, 63 sectors/track, 7832 cylinders, total 125829120 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     2088959     1043456   83  Linux
/dev/sda2         2088960   125804543    61857792   83  Linux
/dev/sda4               1           1           0+  ee  GPT

Partition table entries are not in disk order

# parted -l
Model: VMware Virtual disk (scsi)
Disk /dev/sda: 64.4GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt_sync_mbr

Number  Start   End     Size    File system  Name     Flags
 1      1049kB  1070MB  1068MB  ext3         primary  boot, legacy_boot
 2      1070MB  64.4GB  63.3GB               primary  lvm

gdha commented at 2015-02-04 15:56:

@roseswe was this an issue for rear-1.16.1 or the latest from OSB? Could you also share the /var/lib/rear/layout/disklayout.conf file and if possible run in debug mode rear -vD recovery so we can see where exactly it fails?

roseswe commented at 2015-02-10 15:54:

We tested both: SLES 11 SP3 version and GIT upstream:
Relax-and-Recover 1.16.1-git201501300930 / 2015-01-30
rear116-1.16-0.7.1 (SLES11)

gdha commented at 2015-02-11 08:51:

@roseswe Could you also share the /var/lib/rear/layout/disklayout.conf file and if possible run in debug mode rear -vD recovery so we can see where exactly it fails?

gdha commented at 2015-02-11 08:53:

@roseswe Do you see an error while running rear -vD savelayout?

roseswe commented at 2015-02-20 10:19:

@roseswe Do you see an error while running rear -vD savelayout?
No returns with 0; log file (360 KB) also looks at a first glance ok.

system: # head disklayout.conf
disk /dev/sda 64424509440 gpt_sync_mbr
part /dev/sda 1068498944 1048576 boot,legacy_boot  /dev/sda1
part /dev/sda 63342379008 1069547520 lvm  /dev/sda2
lvmdev /dev/systemvg /dev/sda2 WyLQQM-fWnH-Oi2x-AIYa-k2JT-1vyk-pcimDI 123715584
lvmgrp /dev/systemvg 4096 15101 61853696
lvmvol /dev/systemvg home_lv 256 2097152

gdha commented at 2015-02-21 11:17:

@roseswe see https://bugzilla.novell.com/show_bug.cgi?id=811408 (the problem seems to be the label gpt_sync_mbr which is not recognized by rear (see script layout/prepare/GNU/Linux/10_include_partition_code.sh)

gdha commented at 2015-03-30 11:45:

@roseswe what does your customer want? If code has to be written then it has to be done on-site I guess as we do not have the required setup for this behavior...

jsmeix commented at 2015-07-13 11:16:

Some background information how that gpt_sync_mbr partition type became created:

It is YaST (more precisely it is libstorage) that creates a gpt_sync_mbr partition type under special conditions.

As far as I see it seems that special conditions are hardcoded in the libstorage source file storage/Disk.cc as follows:

YaST creates a gpt_sync_mbr partition type when a gpt partition type is requested but the system does not use (U)EFI for booting (i.e. when it uses BIOS) and the system is x86 compatible (this means both x86 and x86_64) or the system is ppc but not ppc64le.

As far as I understand it it seems the basic logic behind is that YaST creates a gpt_sync_mbr partition type when a gpt partition type is requested but the system actually won't be able to boot from it.

For example a system with traditional BIOS where a gpt partition is wanted. In this case a gpt_sync_mbr partition type must be actually used (that is basically a gpt partition where the first sector on the disk is a valid MBR) so that BIOS can boot it.

The problem with gpt_sync_mbr partition type is that it is not in compliance with the EFI GPT standard (cf. http://www.rodsbooks.com/gdisk/hybrid.html ) and as a consequence it will cause problems here and there (just as an example rear does not work with it).

If possible I would recommend that on uses
either traditional BIOS with traditional MBR partition type
or (U)EFI with GPT partition type
but not a mix-up of both.

jsmeix commented at 2015-07-13 11:46:

It seems "parted -s /dev/sdX mklabel gpt_sync_mbr" is an undocumented way to use parted - i.e. it seems the gpt_sync_mbr partition type is not officially supported by parted because it is neither documented at http://www.gnu.org/software/parted/manual/parted.html#mklabel nor in "man parted" (I have parted version 3.1 on my SLES12 system).

The more I learn about it the more that gpt_sync_mbr thingy seems to "smell" - more and more it looks as if it is not something that should be officially supported at all - it seems one only gets into trouble with it...

jsmeix commented at 2015-07-13 12:32:

I found out that "parted -s /dev/sdX mklabel gpt_sync_mbr" really is not supported by upstream parted (there is no 'gpt_sync_mbr' in parted-3.2.tar.xz) - it is a SUSE-specific addon via a parted-gpt-mbr-sync.patch see https://build.opensuse.org/package/show/Base:System/parted

jsmeix commented at 2015-07-13 12:36:

It seems that one is the root of all this evil nowadays:
https://bugzilla.opensuse.org/show_bug.cgi?id=220839

gdha commented at 2015-08-06 13:35:

@roseswe @jsmeix how do you want to proceed with this case?

jsmeix commented at 2015-08-28 13:27:

@gdha

(FYI: late reply because I was on vacation.)

Currently there is no need for rear upstream to do something here.

If at all there might be a note added to the rear documentation that currently it does not work for partitions of type 'gpt_sync_mbr' aka. 'hybrid (synced) MBR' .

jsmeix commented at 2015-10-23 11:31:

I will implement support for the SUSE-specific 'gpt_sync_mbr' hack in rear because it is requested by a SUSE customer.

As usual I will try to implement it in a way so that it does not cause regressions eleswhere and finally - after confirmation by SUSE-internal testing (and ideally also by the SUSE customer) that it works - I will do a GitHub submitrequest.

jsmeix commented at 2015-10-26 13:31:

Some intermediate feedback what goes on:

I have now a KVM/QEMU machine with gpt_sync_mbr
and I can reproduce the error exit of the initial comment:

Disk configuration is identical, proceeding with restore.
ERROR: Partition number '' of partition  is not a valid number.
ERROR: Partition  is numbered ''. More than 128 partitions is not supported.

To get a KVM/QEMU machine with gpt_sync_mb
it was simply specifying a 2500GB virtual harddisk
on a "normal" (i.e. BIOS) KVM/QEMU machine.

Fortunately one can create a KVM/QEMU virtual machine
with a much bigger virtual harddisk than what is actually
available as free space on the real harddisk.
I use virt-manager to set it up.
There is a popup notification that warns about that the
virtual harddisk is bigger than the free space on the real harddisk
but one can acknowledge the warning and proceed.

When YaST detects a system with BIOS
and with a disk of more than 2TB
then YaST does the gpt_sync_mb partitioning
automatically.

On my system this gpt_sync_mb partitioning looks as follows:

# parted /dev/sda unit MiB print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 2560000MiB
Sector size (logical/physical): 512B/512B
Partition Table: gpt_sync_mbr
Disk Flags: 
Number Start    End        Size       File system    Name    Flags
 1     1.00MiB  2.00MiB    1.00MiB                   primary bios_grub
 2     2.00MiB  2063MiB    2061MiB    linux-swap(v1) primary
 3     2063MiB  43026MiB   40963MiB   btrfs          primary legacy_boot
 4     43026MiB 2559999MiB 2516973MiB xfs            primary

On that system "rear mkbackup" creates this disklayout.conf
(excerpt)

# egrep '^disk|^part' /tmp/rear.WkRJ6ruTilrLlvE/rootfs/var/lib/rear/layout/disklayout.conf
disk /dev/sda 2684354560000 gpt_sync_mbr
part /dev/sda 1048576 1048576 bios_grub  /dev/sda1
part /dev/sda 2161115136 2097152 none  /dev/sda2
part /dev/sda 42952818688 2163212288 legacy_boot  /dev/sda3
part /dev/sda 2639237480448 45116030976 none  /dev/sda4

In usr/share/rear/layout/prepare/GNU/Linux/10_include_partition_code.sh there is

read part disk size pstart name flags partition junk

which reads this line from disklayout.conf

part /dev/sda 1048576 1048576 bios_grub  /dev/sda1

that results

$disk="/dev/sda"
$size="1048576"
$pstart="1048576"
$name="bios_grub"
$flags="/dev/sda1"
$partition=""

which is obviously wrong for $name $flags and $partition.

In particular because $partition is empty the line

local number=$(get_partition_number "$partition")

in usr/share/rear/layout/prepare/GNU/Linux/10_include_partition_code.sh calls get_partition_number with empty $partition and in usr/share/rear/lib/layout-functions.sh get_partition_number does an error exit in this case.

What I need to fix first and foremost is that disklayout.conf is created correctly.

Currently it seems that "$name" is empty in disklayout.conf is the root cause.

jsmeix commented at 2015-10-26 13:40:

For comparison how parted and disklayout.conf look
on a "msdos" labeled partitioning:

# parted /dev/sda unit MiB print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 25600MiB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 
Number  Start     End       Size      Type     File system     Flags
 1      1.00MiB   1498MiB   1497MiB   primary  linux-swap(v1)  type=82
 2      1498MiB   13790MiB  12292MiB  primary  btrfs           boot, type=83
 3      13790MiB  25600MiB  11810MiB  primary  xfs             type=83
# egrep '^disk|^part' /tmp/rear.dYWXyVLurEQKeha/rootfs/var/lib/rear/layout/disklayout.conf
disk /dev/sda 26843545600 msdos
part /dev/sda 1569718272 1048576 primary none /dev/sda1
part /dev/sda 12889096192 1570766848 primary boot /dev/sda2
part /dev/sda 12383682560 14459863040 primary none /dev/sda3

At least on first glance it also does not look correct because here the flags are all "none" in contrast to what parted reports.

Currently the whole partitioning looks messy for me...

jsmeix commented at 2015-10-27 12:23:

@jhoekx

I have some general questions regarding the partitioning code in rear.

I like to understand the reasoning behind why the partitioning code
is as it is because from my current point of view I think I may
need to change it substantially to make it less "fragile".

Jeroen, I think you are the right one to ask because "git blame usr/share/rear/layout/save/GNU/Linux/20_partition_layout.sh" and "git blame usr/share/rear/layout/prepare/GNU/Linux/10_include_partition_code.sh" shows mostly you as the one who made them.

Jeroen, is this o.k. for you?

jsmeix commented at 2015-11-02 14:48:

The issue should be fixed with my above pull request https://github.com/rear/rear/pull/681

This minimal changes make it work for me.

Nevertheless the whole partitioning code looks fragile from my point of view but I think it is better to not do major changes here but to keep the fix for this particular issue minimal.

jsmeix commented at 2015-11-03 15:26:

Merged pull request #681 which should fix it.

At least it works for me with this partitioning

# parted /dev/sda unit MiB print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 2560000MiB
Sector size (logical/physical): 512B/512B
Partition Table: gpt_sync_mbr
Disk Flags: 
Number  Start     End         Size        File system     Name     Flags
 1      1.00MiB   2.00MiB     1.00MiB                     primary  bios_grub
 2      2.00MiB   2063MiB     2061MiB     linux-swap(v1)  primary
 3      2063MiB   43026MiB    40963MiB    btrfs           primary  legacy_boot
 4      43026MiB  2559999MiB  2516973MiB  xfs             primary

with a 2500GB virtual harddisk on a "normal" (i.e. BIOS) KVM/QEMU machine.


[Export of Github issue for rear/rear.]