#895 Issue closed: Wrong GRUB version recovered

Labels: enhancement, cleanup, fixed / solved / done

GCChelp opened issue at 2016-06-27 12:49:

  • rear version (/usr/sbin/rear -V): Relax-and-Recover 1.18 / Git

  • OS version (cat /etc/rear/os.conf or lsb_release -a):
    OS_VENDOR=SUSE_LINUX
    OS_VERSION=13.1

    ARCH='Linux-i386'
    OS='GNU/Linux'
    OS_VERSION='13.1'
    OS_VENDOR='SUSE_LINUX'
    OS_VENDOR_VERSION='SUSE_LINUX/13.1'
    OS_VENDOR_ARCH='SUSE_LINUX/i386'

  • rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):
    OUTPUT=USB
    USB_DEVICE=/dev/disk/by-label/REAR-000
    BACKUP=NETFS
    BACKUP_URL=nfs://nfsserver/backups/rear
    BACKUP_OPTIONS="nfsvers=3,nolock"
    EXCLUDE_MOUNTPOINTS=( /home /scratch )
    AUTOEXCLUDE_PATH=( /media /mnt )
    AUTOEXCLUDE_AUTOFS=..
    AUTOEXCLUDE_DISKS=y
    SSH_ROOT_PASSWORD=XXX

  • Brief description of the issue
    "rear recover" restores unconfigured GRUB2, while GRUB legacy was used on the system.
    Result is a unbootable machine after restore.

  • Work-around, if any
    nothing known

GCChelp commented at 2016-06-27 12:57:

rear-testupd.log-bootloader+grub.txt

@jsmeix
Attached the result of egrep -i "bootloader|grub" rear-testupd.log as separate file.
I couldn't find any entries causing the problem on first sight.
Please let me know what to search for if you have further ideas.

jsmeix commented at 2016-06-29 10:02:

I assing it to me because it is about openSUSE 13.1
regardless that I am not at all an expert in bootloader issues
i.e. do not expect too much from me here.

@GCChelp
first and foremost I like to understand
why you use GRUB legacy and not GRUB2?

On my openSUSE 13.1 test system
(cf. https://github.com/rear/rear/issues/870#issuecomment-225876504 )
I get by default GRUB 2 used as bootloader
and with that it "just works" for me.

Therefore I like to understand the reasoning behind
why you use GRUB legacy on your system.

GCChelp commented at 2016-06-29 12:06:

@jsmeix
We use GRUB legacy for legacy reasons...
Have been running openSUSE on this workstations from something like 11.2 on, updating them as required. Now, everything is updated to 13.1, only the bootloader stayed where it was.

I am not an expert in bootloader issues, too. That's why I always postpone the possible project to update GRUB legacy to GRUB 2...

jsmeix commented at 2016-06-29 12:15:

@GCChelp
many thanks for your background information.
It helps me a lot to understand how it can happen
that an old bootloader is used on a new system:

By updating only the files of the system from older ones
(i.e. by updateing the installed RPM packages)
instead of installing the whole system anew from scratch.

This means in general that rear should be prepared
to detact if an old (even outdated) bootloader is still in use.

But there is the general problem to reliably detect the actually
used bootloader, cf. "Disaster recovery does not just work" in
https://en.opensuse.org/SDB:Disaster_Recovery

I will dig into it how rear detects what bootloader is used...

jsmeix commented at 2016-06-29 12:27:

@GCChelp
In your https://github.com/rear/rear/issues/895#issuecomment-228737866 your
"result of egrep -i 'bootloader|grub' rear-testupd.log"
contains:

+++ cat /var/lib/rear/recovery/bootloader
++ BOOTLOADER=GRUB

i.e. it has correctly detected that GRUB (not GRUB2) is used
but later it does contradictingly

++ Print 'Installing GRUB2 boot loader'
++ echo -e 'Installing GRUB2 boot loader'
++ grub_name=grub2

FYI: On my openSUSE 13.1 test system
after "rear -d -D mkbackup" I have in
var/lib/rear/recovery/bootloader

GRUB2

GCChelp commented at 2016-06-29 12:41:

@jsmeix

This means in general that rear should be prepared
to detact if an old (even outdated) bootloader is still in use.

I fully agree!

i.e. it has correctly detected that GRUB (not GRUB2) is used
but later it does contradictingly

I am willing to provide further details or do some tests in our environment.
Just let me know what you suggest.

jsmeix commented at 2016-06-29 12:44:

With probability one (cf. https://en.wikipedia.org/wiki/Almost_surely )
I found the root cause:

Your "result of egrep -i 'bootloader|grub' rear-testupd.log"
contains:

+ source /usr/share/rear/finalize/Linux-i386/21_install_grub.sh
++ ((  USING_UEFI_BOOTLOADER  ))
++ [[ GRUB = \G\R\U\B ]]
+++ type -p grub-probe
+++ type -p grub2-probe
++ [[ -n /sbin/grub2-probe ]]
++ ((  USING_UEFI_BOOTLOADER  ))
2016-06-27 12:20:57.496977540 Including finalize/Linux-i386/22_install_grub2.sh
+ source /usr/share/rear/finalize/Linux-i386/22_install_grub2.sh
++ ((  USING_UEFI_BOOTLOADER  ))
+++ type -p grub-probe
+++ type -p grub2-probe
++ [[ -n /sbin/grub2-probe ]]
++ LogPrint 'Installing GRUB2 boot loader'

I.e. it runs finalize/Linux-i386/21_install_grub.sh
but that gives up because of a wrong test and
lets finalize/Linux-i386/22_install_grub2.sh do the job
which means it tries to install GRUB2.

The wrong test in finalize/Linux-i386/21_install_grub.sh is

# check the BOOTLOADER variable (read by 01_prepare_checks.sh script)
if [[ "$BOOTLOADER" = "GRUB" ]]; then
    if [[ $(type -p grub-probe) || $(type -p grub2-probe) ]]; then
        # grub2 script should handle this instead
        return
    fi
fi

In finalize/Linux-i386/22_install_grub2.sh there is the same test:

# Only for GRUB2 - GRUB Legacy will be handled by its own script
[[ $(type -p grub-probe) || $(type -p grub2-probe) ]] || return

Currently I do not understand how that tests are meant to work.
As usual no comments that tell the reasoning behind
so that I could undrerstand the intent behind that code,
cf. "Code should be easy to understand" in
https://github.com/rear/rear/wiki/Coding-Style

@GCChelp
when you are logged in as root in the recovery system
you can change whatever you want in the recovery system
before you run "rear recover".

In this case please change
/usr/share/rear/finalize/Linux-i386/21_install_grub.sh
and
/usr/share/rear/finalize/Linux-i386/22_install_grub2.sh
before you run "rear recover"
to enforce that 21_install_grub.sh does its job
and that 22_install_grub2.sh does not do anything.

For example (untested) remove the test from 21_install_grub.sh
and in 22_install_grub2.sh add a plain "return" at the beginning
so that 22_install_grub2.sh does nothing.

Then try if afterwards "rear recover" installs GRUB
(and not GRUB2).

jsmeix commented at 2016-06-29 12:57:

"git blame" indicates the above mentioned test in 21_install_grub.sh
is from "Jesper Sander Lindgren" via commit 3f8b22f3
and the matching test in 22_install_grub2.sh
is from @gdha via commit 079de45b

@gdha
could you explain how the

[[ $(type -p grub-probe) || $(type -p grub2-probe) ]]

tests in https://github.com/rear/rear/issues/895#issuecomment-229344628 are meant to work?

From my current point of view the test looks like an
overcomplicated indirection (RFC 1925 item 6a).

I would have expected something straightforward
and simple (KISS) like in 21_install_grub.sh

# do not do anything for GRUB 2 here because
# GRUB 2 is installed via 22_install_grub2.sh
test "GRUB2" = "$BOOTLOADER" && return

and accordingly in 22_install_grub2.sh

# do not do anything for GRUB (i.e. GRUB Legacy) here
# because GRUB Legacy is installed via 21_install_grub.sh
test "GRUB" = "$BOOTLOADER" && return

jsmeix commented at 2016-06-29 13:00:

Ha!
Commit 3f8b22f shows that the test was KISS before.
But currently I have no idea why it was complicated.
There must have been a reason but who knows it?

jsmeix commented at 2016-06-29 13:10:

Grepping in "git log" for "3f8b22f" results:

Merge: 78a6f9f 3f8b22f
Author: Gratien D'haese 
Date:   Thu May 21 18:24:28 2015 +0200
    Merge pull request #589 from sanderu/Grub2_support_for_Linux
 

which leads to https://github.com/rear/rear/pull/589
that contains

Problems found:
The 21_install_grub.sh checked for GRUB2 which is
not part of the first 2048 bytes of a disk - only GRUB
was present - thus the check for grub-probe/grub2-probe.

I am afraid - I still do not understand it - I mean how does

if [[ "$BOOTLOADER" = "GRUB2" ]]; then

check the first 2048 bytes of a disk?

On my openSUSE 13.1 I have
"$BOOTLOADER" = "GRUB2"
and accordingly I think it should work to test for it.

gdha commented at 2016-06-30 06:32:

@jsmeix I believe [[ $(type -p grub-probe) || $(type -p grub2-probe) ]] was meant as a test to verify if grub legacy or grub2 is present on the system. It looks indeed a bit strange, feel free to modify towards you think how it should be.

jsmeix commented at 2016-06-30 12:31:

The whole logic when what bootloader install script
is actually installing its specific bootloader
confuses me.

In
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
I tried to overhaul finalize/Linux-i386/21_install_grub.sh
but I fear I introduced regressions.

I need to test it.

For now
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
is only there FYI so that you can have a look.

@gdha
could you check my comments in
https://github.com/jsmeix/rear/blob/perfer_to_run_GRUB_install_script_as_specified_issue895/usr/share/rear/finalize/Linux-i386/21_install_grub.sh
whether or not my reasoning makes sense?

In particular I have now in finalize/Linux-i386/21_install_grub.sh

# If the BOOTLOADER variable (read by finalize/default/01_prepare_checks.sh)
# is not "GRUB" (which means GRUB Legacy) skip this script (which is only for GRUB Legacy)
# because finalize/Linux-i386/22_install_grub2.sh is for installing GRUB 2
# and finalize/Linux-i386/22_install_elilo.sh is for installing elilo:
test "GRUB" = "$BOOTLOADER" || return

But I wonder what is the intended way what "rear recover"
should do if no BOOTLOADER value exists?

Is then "NOBOOTLOADER=1" the right default?

Or should perhaps "rear recover" install GRUB 2
as fallback if no BOOTLOADER value exists?

jsmeix commented at 2016-06-30 13:10:

I tested
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
on a SLES11-SP4 system that uses GRUB Legacy:

# rpm -qa | grep -i grub
grub-0.97-162.172.1

There it still "just works" (at least for me):

RESCUE f96:~ # cat /var/lib/rear/recovery/bootloader 
GRUB
RESCUE f96:~ # rear -d -D recover
Relax-and-Recover 1.18 / Git
...
Recreated initramfs (mkinitrd).
Installing GRUB Legacy boot loader:
Installed GRUB Legacy boot loader with /boot on disk with MBR booted on 'device (hd0) /dev/sda' with 'root (hd0,1)'.
Updating udev configuration (70-persistent-net.rules)
...
Finished recovering your system. You can explore it under '/mnt/local'.

@GCChelp
can you please use
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
for a test on your openSUSE 13.1 system with GRUB Legacy.

I assume it will not yet work for you because I did not
remove the test for

type -p grub-probe || type -p grub2-probe

but I show now that message

Skip installing GRUB Legacy boot loader
because GRUB 2 is installed
(grub-probe or grub2-probe exist).

I assume you will get that message
but I would appreciate it if you could try it out
and confirm if my assumption is right (or not).

jsmeix commented at 2016-06-30 13:21:

@GCChelp
only FYI if needed how to use something like
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
for a test:

Basically "git clone" it into a directory and
then run rear from within that directory.

This is how I did in on my above mentioned SLES11-SP4 system:

# git clone https://github.com/jsmeix/rear.git
Cloning into 'rear'...
# cd rear
# git branch -a
* master
...
  remotes/origin/perfer_to_run_GRUB_install_script_as_specified_issue895
# git checkout -b perfer_to_run_GRUB_install_script_as_specified_issue895 origin/perfer_to_run_GRUB_install_script_as_specified_issue895
Branch perfer_to_run_GRUB_install_script_as_specified_issue895 set up to track remote branch perfer_to_run_GRUB_install_script_as_specified_issue895 from origin.
Switched to a new branch 'perfer_to_run_GRUB_install_script_as_specified_issue895'
# vi etc/rear/local.conf
...
# grep -v '^#' etc/rear/local.conf
OUTPUT=ISO
BACKUP=NETFS
BACKUP_OPTIONS="nfsvers=3,nolock"
BACKUP_URL=nfs://10.160.4.244/nfs
NETFS_KEEP_OLD_BACKUP_COPY=yes
SSH_ROOT_PASSWORD="rear"
USE_DHCLIENT="yes"
KEEP_BUILD_DIR=""
# usr/sbin/rear -d -D mkbackup
...

jsmeix commented at 2016-06-30 13:38:

Interestingly on my openSUSE 13.1 system I have by default
both GRUB Legacy and GRUB 2 RPMs installed:

# cat /etc/os-release
NAME=openSUSE
VERSION="13.1 (Bottle)"
VERSION_ID="13.1"
PRETTY_NAME="openSUSE 13.1 (Bottle) (x86_64)"
# rpm -qa | grep -i grub
grub2-2.00-39.1.3.x86_64
grub-0.97-194.1.2.x86_64
grub2-i386-pc-2.00-39.1.3.x86_64
grub2-x86_64-efi-2.00-39.1.3.x86_64
grub2-branding-openSUSE-13.1-10.4.13.noarch
# type -p grub-probe
# type -p grub2-probe
/usr/sbin/grub2-probe

This means the test for

type -p grub-probe || type -p grub2-probe

succeeds which results that during "rear recover"

Skip installing GRUB Legacy boot loader
because GRUB 2 is installed
(grub-probe or grub2-probe exist).

jsmeix commented at 2016-06-30 13:45:

I tested
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895
on a SLES12-SP1 system that uses GRUB 2:

# rpm -qa | grep -i grub
grub2-2.02~beta2-69.1.x86_64
grub2-branding-SLE-12-11.3.1.noarch
grub2-i386-pc-2.02~beta2-69.1.x86_64
grub2-snapper-plugin-2.02~beta2-69.1.noarch

There it also still "just works" (at least for me):

RESCUE e229:~ # cat /var/lib/rear/recovery/bootloader
GRUB2
RESCUE e229:~ # rear -d -D recover
Relax-and-Recover 1.18 / Git
...
Recreated initramfs (mkinitrd).
Installing GRUB2 boot loader
Finished recovering your system. You can explore it under '/mnt/local'.

GCChelp commented at 2016-07-01 13:25:

@jsmeix
Thanks a lot for providing the "git clone" information. Would have been hard for me without this!

but I show now that message

Skip installing GRUB Legacy boot loader
because GRUB 2 is installed
(grub-probe or grub2-probe exist).

I assume you will get that message
but I would appreciate it if you could try it out
and confirm if my assumption is right (or not).

You are right, I got

Skip installing GRUB Legacy boot loader because GRUB 2 is installed (grub-probe or grub2-probe exist).
Installing GRUB2 boot loader

BTW:

cat /var/lib/rear/recovery/bootloader

GRUB

jsmeix commented at 2016-07-01 13:51:

@GCChelp
because you do not use GRUB 2 on your system
I assume you no not need /usr/sbin/grub2-probe
on your system so that you could move it away with

# mv /usr/sbin/grub2-probe /usr/sbin/grub2-probe.away

as a dirty hack to make "rear recover" work on
your particular system - at least for now because
I need more time to find out about the reason behind
why that special code is there not to simply trust when
the value in /var/lib/rear/recovery/bootloader is "GRUB"
(i.e. I like to avoid regressions on whatever other Linux
distributions if I simply remove that code).

gdha commented at 2016-07-04 11:41:

@jsmeix Your script version of 21_install_grub.sh looks ok to me. We can give it a try (+1)
The NOBOOTLOADER variable is meant as a kind of being desperate of not knowing what to do when there is no bootloader found. It was (and still is) up the the user to decide then. What else can we do? Make some suggestions, but I wouldn't make a decision on his behalf as some will like it, where others will be angry...

jsmeix commented at 2016-07-04 12:12:

With https://github.com/rear/rear/pull/900 I merged
https://github.com/jsmeix/rear/tree/perfer_to_run_GRUB_install_script_as_specified_issue895

It is still not fully reliably installing the right bootloader:
When GRUB 2 is installed in addition to GRUB Legacy
it still prefers to install GRUB 2 as bootloader
even if the BOOTLOADER variable tells "GRUB"
(which means GRUB Legacy).

I.e. further work is needed...

An idea how to further improve it:

If GRUB 2 and GRUB Legacy are installed and
the BOOTLOADER variable tells "GRUB",
then install GRUB Legacy as bootloader.

In contrast if GRUB Legacy is not installed then
do not install GRUB Legacy as bootloader
in finalize/Linux-i386/21_install_grub.sh
even if the BOOTLOADER variable tells "GRUB".
In this case hope for the best that then GRUB 2 is installed.
In finalize/Linux-i386/22_install_grub2.sh do the following:
If GRUB 2 is installed then install GRUB 2 as bootloader
even if the BOOTLOADER variable tells "GRUB".

GCChelp commented at 2016-07-05 05:42:

@jsmeix
It seems that openSUSE 13.1 installed both, GRUB legacy and GRUB 2 by default.

Concerning the dirty hack, I found another workaround:
If I simply deinstall all GRUB 2 rpms, the machine still is able to boot and rear is finally working.
I could even perform my first successful rear recover with this setup! :-)

Regarding the BOOTLOADER variable and the file /var/lib/rear/recovery/bootloader : Where does this value come from? How is it determined and defined?

jsmeix commented at 2016-07-11 09:09:

@GCChelp
regarding
"openSUSE 13.1 installed both, GRUB legacy and GRUB 2 by default":
Yes, see my
https://github.com/rear/rear/issues/895#issuecomment-229660412

Regarding
"deinstall all GRUB 2 ... successful rear recover":
Many thanks for the positive feedback!

Regarding
"BOOTLOADER ... Where does this value come from?":
This is what I need to learn and understand as a precondition
to further improve that the actually right bootloader gets installed
during "rear recover".

Because of this I like to keep this issue open because
in general it is not really solved regardless that for this
special case here a workaround (deinstall GRUB 2) exists.

stoxxys commented at 2016-07-13 12:44:

I had the same error in openSUSE 42.1:
You need to edit /usr/share/rear/finalize/Linux-i386/21_install_grub.sh

You need to change the bootloader settings part like this and it will work:

...

#skip if another bootloader was installed
if [[ -z "$NOBOOTLOADER" ]] ; then
return
fi

#for UEFI systems with grub legacy with should use efibootmgr instead
[[ ! -z "$USING_UEFI_BOOTLOADER" ]] && return # not empty means UEFI booting
**
#check the BOOTLOADER variable (read by 01_prepare_checks.sh script)

if [[ "$BOOTLOADER" = "GRUB" || "$BOOTLOADER" = "GRUB2" ]]; then
# grub2 script should handle this instead
return
fi

#Only for GRUB Legacy - GRUB2 will be handled by its own script
if [[ -z "$(type -p grub)" ]]; then
return
fi
...

jsmeix commented at 2016-07-13 13:03:

@stoxxys
please provide a GitHub pull request with your changes
so that I can reliably see your exact changes
(a GitHub pull request provides a nice diff)
or at least provide a "diff -u" output preferably
as a uploaded file or at least here enclosed
in <pre> ... </pre>

jsmeix commented at 2016-07-13 13:06:

@stoxxys
what "grub*" RPM packages have you installed on
your openSUSE Leap 42.1 system?

On my openSUSE Leap 42.1 system I have only grub2*
packages

# cat /etc/os-release 
NAME="openSUSE Leap"
VERSION="42.1"
VERSION_ID="42.1"
PRETTY_NAME="openSUSE Leap 42.1 (x86_64)"
...
# rpm -qa | grep -i grub
grub2-i386-pc-2.02~beta2-76.1.x86_64
grub2-2.02~beta2-76.1.x86_64
grub2-x86_64-efi-2.02~beta2-76.1.x86_64
grub2-branding-openSUSE-42.1-6.2.noarch
grub2-snapper-plugin-2.02~beta2-76.1.noarch

and for me "rear recover" just works there.

jsmeix commented at 2016-07-13 14:19:

@stoxxys
your /usr/share/rear/finalize/Linux-i386/21_install_grub.sh
in your https://github.com/rear/rear/issues/895#issuecomment-232343557
is not the current rear master, in particular ist is not
my latest version from https://github.com/rear/rear/pull/900

Please report issues with older rear versions as separated
issues and do not mix up your issues into other issues.

In general in case of issues preferably try to reproduce them
with newest rear master code.

In general regarding how to test with
the currently newest rear GitHub master code:

# git clone https://github.com/rear/rear.git
# cd rear
# vi etc/rear/local.conf
# usr/sbin/rear -d -D mkbackup
and finally test whether "rear -d -D recover" works

gdha commented at 2016-09-13 14:26:

@GCChelp Do you still need some assistance from us or are you good to close it?

GCChelp commented at 2016-09-14 09:53:

@gdha
For me the workaround was fine, no further assistance required. Thanks for asking!

@jsmeix
Do you consider this issue fixed in the next Rear release?

jsmeix commented at 2016-09-15 14:08:

All what is shown as "merged" above will be in the next
Relay-and-Recover release.


[Export of Github issue for rear/rear.]