#2087 Issue closed: "rear mkrescue" fails for mmcblk0boot0/mmcblk0rpmb partitions (mmcblk/eMMC disk type)

Labels: enhancement, fixed / solved / done, special hardware or VM

fabz5 opened issue at 2019-03-15 16:31:

  • ReaR version ("/usr/sbin/rear -V"):
    Relax-and-Recover 2.4 / 2018-06-21

  • OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"):
    Distributor ID: Debian
    Description: Debian GNU/Linux 9.8 (stretch)
    Release: 9.8
    Codename: stretch

  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):

/etc/rear/site.conf contains:

OUTPUT=ISO
OUTPUT_URL=nfs://mynfsserver/backupdir
BACKUP=NETFS
BACKUP_URL=nfs://mynfsserver/backupdir
  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
    PC (Minis Forum Z83-F)

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86-64 (amd64)

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    UEFI, GRUB (2.02~beta3-5+deb9u1)

  • Storage (lokal disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    local disk (eMMC)

  • Description of the issue (ideally so that others can reproduce it):
    Running "rear -Dv mkbackup" fails with the following output:

Relax-and-Recover 2.4 / 2018-06-21
Using log file: /var/log/rear/rear-testvm02.log
Using backup archive '/tmp/rear.a0gtr0AbNoYvAgO/outputfs/testvm02/backup.tar.gz'
Using UEFI Boot Loader for Linux (USING_UEFI_BOOTLOADER=1)
Creating disk layout
ERROR: Partition number '0' of partition mmcblk0boot0 is not a valid number.
ERROR: Partition number '' of partition mmcblk0rpmb is not a valid number.
ERROR: Partition mmcblk0rpmb is numbered ''. More than 128 partitions is not supported.
Aborting due to an error, check /var/log/rear/rear-testvm02.log for details
Exiting rear mkbackup (PID 15126) and its descendant processes
Running exit tasks
You should also rm -Rf /tmp/rear.a0gtr0AbNoYvAgO
zsh: terminated  rear -Dv mkbackup
  • Workaround, if any:

None found

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
    I attached the /var/log/rear/rear-testvm02.log logfile. Do not get confused by the hostname, this is NOT a VM. :-)
    rear-testvm02.log

jsmeix commented at 2019-03-18 09:09:

@fabz5
what is the output of the command

lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT

on your original system?

In your particular case you have /dev/mmcblk0boot0
and that special kind of device node name
can currently not be handled properly by ReaR.

My current analysis:

In your case it fails in usr/share/rear/layout/save/GNU/Linux/200_partition_layout.sh
therein when it calls the get_partition_number() function as

get_partition_number mmcblk0boot0

The get_partition_number() function calls in your particular case

echo "mmcblk0boot0" | grep -o -E "[0-9]+$"

to get the partition number which results in your case the trailing 0
but 0 is not a normally valid partition number, see the get_partition_number
function code in usr/share/rear/lib/layout-functions.sh
for what kind of device node names it can deal with:

# Function returns partition number of partition block device name
#
# This function should support:
#   /dev/mapper/36001438005deb05d0000e00005c40000p1
#   /dev/mapper/36001438005deb05d0000e00005c40000_part1
#   /dev/sda1
#   /dev/cciss/c0d0p1
#
# Requires: grep v2.5 or higher (option -o)

get_partition_number() {
    local partition=$1
    local number=$(echo "$partition" | grep -o -E "[0-9]+$")

    # Test if $number is a positive integer, if not it is a bug
    [ $number -gt 0 ] 2>/dev/null
    StopIfError "Partition number '$number' of partition $partition is not a valid number."

    # Catch if $number is too big, report it as a bug
    (( $number <= 128 ))
    StopIfError "Partition $partition is numbered '$number'. More than 128 partitions is not supported."

    echo $number
}

This are the excerpts from your
https://github.com/rear/rear/files/2972012/rear-testvm02.log
that show what I described above:

+++ for device in /dev/mapper/*
++++ dmsetup info -c --noheadings -o major,minor system-var
+++ mapper_number=254:1
+++ '[' 179:256 = 254:1 ']'
+++ [[ mmcblk0boot0 =~ ^mapper/ ]]
++++ readlink /dev/mmcblk0boot0
+++ my_dm=
+++ name=mmcblk0boot0
+++ echo /dev/mmcblk0boot0
+++ return 1
++ partition_name=/dev/mmcblk0boot0
++ partition_name=mmcblk0boot0
++ partition_name=mmcblk0boot0
+++ get_partition_number mmcblk0boot0
+++ local partition=mmcblk0boot0
++++ echo mmcblk0boot0
++++ grep -o -E '[0-9]+$'
+++ local number=0
+++ '[' 0 -gt 0 ']'
+++ StopIfError 'Partition number '\''0'\'' of partition mmcblk0boot0 is not a valid number.'
+++ ((  1 != 0  ))
+++ Error 'Partition number '\''0'\'' of partition mmcblk0boot0 is not a valid number.'

jsmeix commented at 2019-03-18 09:24:

http://relax-and-recover.org/documentation/release-notes-2-4 reads

Version 2.1 (June 2017)
...
mmcblk disk types are now supported (issues #1301, #1302)

which points to
https://github.com/rear/rear/issues/1301
and
https://github.com/rear/rear/pull/1302
but it seems mmcblk disk types are currently not yet fully supported...

fabz5 commented at 2019-03-18 10:22:

I also found #1682, which seems related, but obviously does not help with this issue.

Anyhow, calling
lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT
gives:

NAME                           KNAME             PKNAME         TRAN TYPE FSTYPE       SIZE MOUNTPOINT
/dev/mmcblk0                   /dev/mmcblk0                          disk             58.2G 
|-/dev/mmcblk0p1               /dev/mmcblk0p1    /dev/mmcblk0        part vfat         190M /boot/efi
 -/dev/mmcblk0p2               /dev/mmcblk0p2    /dev/mmcblk0        part LVM2_member 58.1G 
    |-/dev/mapper/system-root  /dev/dm-0         /dev/mmcblk0p2      lvm  ext4         1.9G /
    |-/dev/mapper/system-var   /dev/dm-1         /dev/mmcblk0p2      lvm  ext4         1.9G /var
    |-/dev/mapper/system-tmp   /dev/dm-2         /dev/mmcblk0p2      lvm  ext4         976M /tmp
    |-/dev/mapper/system-swap  /dev/dm-3         /dev/mmcblk0p2      lvm  swap         976M [SWAP]
     -/dev/mapper/system-spare /dev/dm-4         /dev/mmcblk0p2      lvm              52.3G 
/dev/mmcblk0boot0              /dev/mmcblk0boot0                     disk                4M 
/dev/mmcblk0boot1              /dev/mmcblk0boot1                     disk                4M 
/dev/mmcblk0rpmb               /dev/mmcblk0rpmb                      disk                4M

Thanks already for your efforts!

jsmeix commented at 2019-03-18 11:15:

@fabz5
I detected the totally different issuse
https://github.com/rear/rear/pull/2088
when I noticed in your
https://github.com/rear/rear/issues/2087#issue-421604286

ERROR: Partition number '0' of partition mmcblk0boot0 is not a valid number.
ERROR: Partition number '' of partition mmcblk0rpmb is not a valid number.
ERROR: Partition mmcblk0rpmb is numbered ''. More than 128 partitions is not supported.
Aborting due to an error, check /var/log/rear/rear-testvm02.log for details

the triple ERROR:messages which is plain wrong.
It should cleanly abort at the first error.

Because I cannot reproduce such a multiple ERROR:message
I would like if you coud try out if my proposed fix in
https://github.com/rear/rear/pull/2088
helps in your case.

To do that edit in your usr/share/rear/lib/_input-output-functions.sh
the Error function:
Insert a sleep 1 at its very end so that it becomes this

function Error () {

    ...

    kill -USR1 $MASTER_PID
    sleep 1
}

Then re-run rear -Dv mkbackup and now it should error out with

ERROR: Partition number '0' of partition mmcblk0boot0 is not a valid number.
Aborting due to an error, check /var/log/rear/rear-testvm02.log for details

fabz5 commented at 2019-03-18 12:09:

I edited the Error function as advised, but to no avail. The 3 error messages remain. To make sure everything goes as expected I also a added another line:

     ...
     kill -USR1 $MASTER_PID
     LogPrintError "sleeping for 1 second"
     sleep 1
}

which results in:

zsh 166 # rear -Dv mkbackup                                
Relax-and-Recover 2.4 / 2018-06-21
Using log file: /var/log/rear/rear-testvm02.log
Using backup archive '/tmp/rear.sONFeWwU2OmFyV2/outputfs/testvm02/backup.tar.gz'
Using UEFI Boot Loader for Linux (USING_UEFI_BOOTLOADER=1)
Creating disk layout
ERROR: Partition number '0' of partition mmcblk0boot0 is not a valid number.
sleeping for 1 second
ERROR: Partition number '' of partition mmcblk0rpmb is not a valid number.
sleeping for 1 second
ERROR: Partition mmcblk0rpmb is numbered ''. More than 128 partitions is not supported.
sleeping for 1 second
Aborting due to an error, check /var/log/rear/rear-testvm02.log for details
Exiting rear mkbackup (PID 8097) and its descendant processes
Running exit tasks
You should also rm -Rf /tmp/rear.sONFeWwU2OmFyV2
zsh: terminated  rear -Dv mkbackup

jsmeix commented at 2019-03-18 12:54:

@fabz5
thank you so much for your prompt
https://github.com/rear/rear/issues/2087#issuecomment-473884096
I think that shows there is another bug in ReaR
https://github.com/rear/rear/issues/2089

Regarding this issue here, i.e. regarding mmcblk/eMMC disks:

I myself never had nor do I have any mmcblk/eMMC disk device
so that I cannot reproduce issues with it or actually help in this area.

Perhaps you could enhance ReaR to support mmcblk/eMMC disks
as far as you need it for your particular use case?

Cf. "How to adapt and enhance Relax-and-Recover" and related scections in
https://en.opensuse.org/SDB:Disaster_Recovery

fabz5 commented at 2019-03-19 10:34:

I'll see what I can do. :-)

jsmeix commented at 2019-03-19 13:29:

@fabz5
I look forward to your GitHub pull request.

Regardless that I do not have any mmcblk/eMMC disk device
I could help you - to some extent - with general coding stuff in ReaR.
To avoid that you do much coding work on your own and then
we will tell you about a lot of things that you should have done differently
I would recommend that you do a GitHub pull request as soon as possible
for you (i.e. when you have some very first step that works a bit for you)
so that we can check early if changes are needed.

In general:

Submit early and submit often, cf. Wikipedia:Release early, release often
http://www.wikipedia.org/wiki/Release_early,_release_often
Do not mix up several independent separated issues in one big all-in-one
GitHub pull request because a pull request cannot be partially accepted.
On the other hand do not split several changes that belong to each other
into several pull requests.

See "Contributing" at
http://relax-and-recover.org/development/

See in particular
"Keeping your forked repo synced with the upstream source" at
https://2buntu.com/articles/1459/keeping-your-forked-repo-synced-with-the-upstream-source/
for an illustrated guide how to work with GitHub in general.
The images therein might be a bit outdated but the general information
how to work with GitHub should still apply.

jsmeix commented at 2019-03-21 08:03:

@fabz5
is this issue soved for you since your https://github.com/rear/rear/pull/2091
is merged?

I.e. does now run "rear mkrescue/mkbackup" run for you without an error
or are there more adaptions and enhancements needed in ReaR
to get mmcblk/eMMC disk devices supported in general
or at least for your particular use case?

jsmeix commented at 2019-03-21 08:07:

@fabz5
how you could try out our current ReaR GitHub master code
where https://github.com/rear/rear/pull/2091 is merged:

Basically "git clone" our current ReaR upstream GitHub master code
into a separated directory and then configure and run ReaR
from within that directory like:

# git clone https://github.com/rear/rear.git

# mv rear rear.github.master

# cd rear.github.master

# vi etc/rear/local.conf

# usr/sbin/rear -D mkbackup

Note the relative paths "etc/rear/" and "usr/sbin/".

fabz5 commented at 2019-03-21 08:49:

The issue is solved, at least concerning mkbackup. Thus it can be closed.
However there is still trouble with the restore in ReaR 2.4 (efibootmgr is called with a wrong disk name). I will try out the current ReaR master to see if this issue persists (from looking at the code, I guess so). If I am right, I will open another issue.

Thanks a lot! :-)

jsmeix commented at 2019-03-21 09:03:

@fabz5
thank your for your confirmation that this particular issue is solved.

When "rear recover" fails provide its debug log file as described
in the section "Debugging issues with Relax-and-Recover" at
https://en.opensuse.org/SDB:Disaster_Recovery

jsmeix commented at 2019-03-22 14:51:

@fabz5
I have a question about what a "eMMC" device actually is.

From your 'lsblk' output in
https://github.com/rear/rear/issues/2087#issuecomment-473851770
and your description in
https://github.com/rear/rear/pull/2091#issue-262630887
I assume that at least your "eMMC" device is not one single disk
but actually your "eMMC" device consists of four disk type block devices:

NAME               KNAME             PKNAME TRAN TYPE FSTYPE  SIZE 
/dev/mmcblk0       /dev/mmcblk0                  disk        58.2G 
/dev/mmcblk0boot0  /dev/mmcblk0boot0             disk           4M 
/dev/mmcblk0boot1  /dev/mmcblk0boot1             disk           4M 
/dev/mmcblk0rpmb   /dev/mmcblk0rpmb              disk           4M

where only the /dev/mmcblk0 disk is what is of interest for ReaR.

Is my assumtion true?

If yes, it seems what is missing in ReaR is a generic method
where the user can explicitly specify to exclude unwanted
whole disks during "rear mkrescue".

It seems currently there is only a generic method in ReaR
how to exclude unwanted components during "rear recover",
see the section "Including/Excluding components" at
https://github.com/rear/rear/blob/master/doc/user-guide/06-layout-configuration.adoc

FYI:
Because of https://github.com/rear/rear/issues/1771
it is currently not possible to exclude less than a whole disk
(e.g. only certain partitions or filesystems on a disk).

fabz5 commented at 2019-03-27 20:49:

Your assumption is true. However please note, that the "eMMC" is not only recognised as four block devices, but that also the special partitions "mmcblk0boot0/1, mmcblk0rpmb" are recognised as partitions ON /dev/mmcblk0. They have the same names as the three additional disk type block devices. In this particular case the exclusion of the three special disk devices would not have been sufficient (they were already excluded anyhow by https://github.com/rear/rear/pull/1682).

In general however the possibility to exclude entire disks from mkrescue/mkbackup would be great.

jsmeix commented at 2019-03-28 11:10:

@fabz5
thank you so much for your explanation.

As far as I understand it now there are not two different kind
of mmcblk*boot* and mmcblk*rpmb thingies - the only ones
that exist are the disk type block devices

NAME               KNAME             PKNAME TRAN TYPE FSTYPE  SIZE 
/dev/mmcblk0boot0  /dev/mmcblk0boot0             disk           4M 
/dev/mmcblk0boot1  /dev/mmcblk0boot1             disk           4M 
/dev/mmcblk0rpmb   /dev/mmcblk0rpmb              disk           4M

but what ReaR wrongly did before https://github.com/rear/rear/pull/2091
is that it recognized those disk type block devices
as if they were part type block devices below PKNAME /dev/mmcblk0
i.e. as if the lsblk output would have been something like

NAME                 KNAME               PKNAME       TYPE   SIZE 
/dev/mmcblk0         /dev/mmcblk0                     disk  58.2G 
|-/dev/mmcblk0boot0  /dev/mmcblk0boot0  /dev/mmcblk0  part     4M 
|-/dev/mmcblk0boot1  /dev/mmcblk0boot1  /dev/mmcblk0  part     4M 
 -/dev/mmcblk0rpmb   /dev/mmcblk0rpmb   /dev/mmcblk0  part     4M

The reason is that currently ReaR does not use the lsblk output
but determines the disk devices layout by more traditional methods
so that it could be another cleanup/enhancement to let ReaR
use the lsblk output to determine the storage devices layout,
cf. https://github.com/rear/rear/issues/1390

fabz5 commented at 2019-04-04 18:05:

You are right about the disktype block devices. Using lsblkseems like a reasonable idea to determine disk device layout indeed.

jsmeix commented at 2019-04-05 07:18:

I even think using lsblk to recreate the storage devices layout
is a nowadays basically mandatory step forward because lsblk
should provide the right dependencies of arbitrary complex layers
of storage devices "out of the box", for an (artificial) example
of a complex layout see https://github.com/rear/rear/issues/2023

In general cf.
https://github.com/rear/rear/issues/1390#issuecomment-406516854
and follow the link therein that links to
https://github.com/rear/rear/issues/791#issuecomment-406513969
and then have a look at the links there

You may also have a look at issues that are linked in
https://github.com/rear/rear/issues/2023#issuecomment-468187071


[Export of Github issue for rear/rear.]