#866 Issue closed: EFI bootable ISO image: first boot option is broken

Labels: bug, waiting for info, fixed / solved / done

abbbi opened issue at 2016-06-07 10:26:

hi,

with the current Git version I was able to create a SLES12-based EFI boot ISO image, but the first
boot option (Relax and Recover (no Secure Boot)) does not work. The generated grub.cfg
looks like:

menuentry "Relax and Recover (no Secure Boot)" --class gnu-linux --class gnu --class os {
echo 'Loading kernel ...'
linux /isolinux/kernel [..]

The "Secure Boot" option, however, works. The difference is that this one uses "linuxefi" instead
of "linux":

menuentry "Relax and Recover (Secure Boot)" --class gnu-linux --class gnu --class os {
echo 'Loading kernel ...'
linuxefi /isolinux/kernel [..]

jsmeix commented at 2016-06-07 11:47:

Because it is about SLE I assign it to myself, even though
I know basically nothing at all about (U)EFI or Secure Boot.

@gozora
could you please have a look?

gdha commented at 2016-06-07 11:56:

@abbbi "No Secure boot" means booting without UEFI (or via BIOS mode), which will fail for obvious reasons if UEFI is turned on. I don't consider this a bug, but opinions may differ.
@gozora what is your point of view on this question?

abbbi commented at 2016-06-07 11:58:

@gdha Ok, thanks for the clarification. Maybe the naming should be different (No UEFI/UEFI boot or something similar) to make it more obvious.

gozora commented at 2016-06-07 12:08:

Hello Guys,
First of all, my knowledge of EFI secure boot (SB) is limited to theory only. I don't have hands-on experience with it :-(. I can however take a closer look at this technology and how ReaR handles it, but it will take a day or two, as I need to get my dirty hands on some SLES12 installation DVD, coz I don't have any SLES12 installed in my collection so far ...

jsmeix commented at 2016-06-07 12:33:

@gdha
when "No Secure boot" actually means
"booting without UEFI (i.e. via BIOS)"
I will prepare a pull request to get that
currently misleading naming fixed.

@gozora
accordingly I think this is not a SLE12-specific issue
so that you do not need to do a special test with SLE12.
But if you like to try our SLE12 see
https://www.suse.com/products/server/download
how to get a free download of SLES12.

gozora commented at 2016-06-07 12:45:

yes, might be, but I prefer reproducing the issue on HW/SW that is as similar as possible.
Thanks for the link, I'm installing now ;-).

jsmeix commented at 2016-06-07 13:32:

@gdha
see https://github.com/rear/rear/pull/867 whether or not you like it.

jsmeix commented at 2016-06-07 13:35:

I think more meaningful names for the recovery system boot
options are only a first step.

I think the real solution would be when boot options that
cannot work are not shown at all - but that is something
for someone who is much more an expert in grub setup
than I am...

gozora commented at 2016-06-07 20:20:

Hello @ll,
After some struggle with new features of SLES12 (God bless /etc/modprobe.d/10-unsupported-modules.conf and systemd) and severe brain damage, I found out why the secure boot option works (unlike no secure boot).
The thing is that grub2 as installed on SLES12 with secure boot uses the linuxefi module, which provides the commands:

  • linuxefi for loading kernel
  • initrdefi for loading initrd

but for some reason does not include the linux module, which provides the linux and initrd commands to load the kernel and initrd.

Based on this, I'd recommend modifying create_grub2_cfg() to use linuxefi and initrdefi only in case of secure boot. I can create a pull request for this of course, but I'm a bit tight on time these days, so I'm not sure if I'll be able to make it this week...

@abbbi I was not able to reproduce your problem with sysfs and USING_UEFI_BOOTLOADER=0.
It would shed some light on this if you could set shell debug options (set -x at the beginning and set +x at the end) in the file /usr/share/rear/output/ISO/Linux-i386/25_populate_efibootimg.sh, run rear mkrescue, and send me just the debug output.
Would that be possible from your side?
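
The tracing request above can be sketched roughly as follows. This is only an illustration: populate_step is a hypothetical stand-in for the real contents of 25_populate_efibootimg.sh and the paths are made up. Only set -x and set +x are the actual technique being asked for.

```shell
# Hypothetical stand-in for the commands inside the script being debugged.
populate_step() {
    echo "copying $1 to $2"
}

set -x   # from here on the shell prints each command to stderr before running it
populate_step /boot/efi/EFI/BOOT/bootx64.efi /tmp/example/BOOTX64.efi
set +x   # stop tracing again
```

The trace lines go to stderr with a leading "+", so they can be redirected into a file separately from the normal output.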

@gdha, @jsmeix
And maybe one point that jumped out and could come back to bite us later. Once, when I was booting from the rear recovery media, I got a kernel panic due to a problem with the execution of /init. Fortunately it was just due to a missing root=root_disk in grub2. I'll not open an issue for this right now because I need to take a closer look at why it was not a problem until now. It might be that I didn't use btrfs before (yes, shame on me ;-) ). But it could be some misconfiguration on my side (it would not be the first time). Anyhow, just keep this in mind if someone complains about a kernel panic ;-)

abbbi commented at 2016-06-08 07:05:

@gozora the sysfs problem seems to be fixed in the latest Git version. I was using version 1.18, and I already verified with @jsmeix that it works with the latest Git checkout, no troubles here.

jsmeix commented at 2016-06-08 07:55:

@gozora
many thanks for your analysis!

I have a general question about the naming
"UEFI boot" versus "Secure Boot":

As far as I know "UEFI boot" and "Secure Boot"
are different things.

As far as I know UEFI is a precondition for "Secure Boot"
but one can do "plain UEFI boot" without "Secure Boot".

Therefore I am confused with the current naming
of the boot options of the rear recovery system.

I would expect three boot options like

traditional boot (BIOS boot on x86 architecture)
plain UEFI boot (without Secure Boot)
Secure Boot (only on UEFI systems)

Currently I do not understand why there are only two
boot options of the rear recovery system.

From my current point of view it seems something is
somehow mixed up here so that first and foremost
I would like to get it sorted out.

In short, my question is:

What is the fully correct naming of the
booting options of the recovery system?

jsmeix commented at 2016-06-08 08:25:

@gozora
a question regarding your recommendation to
"modify create_grub2_cfg() to use linuxefi and initrdefi
only in case of secure boot."

I don't see how I could currently test in rear whether or not
Secure Boot is used.

As far as I see I can currently only test in rear whether or not
UEFI is used via the USING_UEFI_BOOTLOADER variable.

Do you perhaps mean to use linuxefi and initrdefi only
in case of UEFI boot?

Do you mean to add tests for USING_UEFI_BOOTLOADER
to create_grub2_cfg() so that only those "menuentry" blocks
are output that match the actually used bootloader?

jsmeix commented at 2016-06-08 10:34:

I have dark forebodings about 'linuxefi' and 'initrdefi':

I cannot find them in the GRUB2 upstream sources,
so I assume 'linuxefi' and 'initrdefi' are SUSE-specific stuff.

In this case anything in rear regarding 'linuxefi' and 'initrdefi'
would need to be implemented as a special case
only for SUSE.

Or are 'linuxefi' and 'initrdefi' perhaps also used in other
Linux distributions?

gozora commented at 2016-06-08 11:01:

@jsmeix

> I would expect three boot options like

I agree with you! However, I'm not the author of those entries, so I can't guess their purpose :-(. I guess that this mess comes from HANAs running SLES11. I've seen some installations with both grub and elilo installed and was never able to find out why ...

> I don't see how I could currently test in rear whether or not Secure Boot is used.

You can identify secure boot when uefi_bootloader_basename=shim.efi, so it should be enough to add an if .. else in the function.
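
A minimal sketch of that if .. else idea, assuming the bootloader path is available in a variable such as UEFI_BOOTLOADER. The helper names is_shim_bootloader and loader_commands are hypothetical, and the two example paths are purely illustrative.

```shell
# Hypothetical helper: succeeds when the bootloader file is named shim.efi,
# which is the indicator for Secure Boot suggested in this comment.
is_shim_bootloader() {
    case "$(basename "$1")" in
        shim.efi) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: prints the kernel and initrd commands to put into grub.cfg.
loader_commands() {
    if is_shim_bootloader "$1"; then
        echo "linuxefi initrdefi"
    else
        echo "linux initrd"
    fi
}

loader_commands /boot/efi/EFI/sles/shim.efi       # illustrative path
loader_commands /boot/efi/EFI/debian/grubx64.efi  # illustrative path
```

Whether the file name alone is a reliable indicator on every distribution is an open question here; the sketch only mirrors the check proposed above.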

> I have dark forebodings about 'linuxefi' and 'initrdefi':

Yesterday, while doing some research, I found this post for fedora and this one for ubuntu, both describing linuxefi resp. initrdefi, but the grub2 documentation does not mention them. So maybe we have found ourselves in the middle of a war, secure boot vs no secure boot ...

abbbi commented at 2016-06-08 11:34:

To make things more interesting, I just created a Debian 8 based UEFI image with ReaR (with ebiso installed on the system), and the Secure Boot option does not work here (kernel has invalid signature). The "No Secure Boot" option, however, does.

So:

SLES12: (No Secureboot) fails with "can't find command linux"
Debian: (Secureboot) fails with "invalid signature"

jsmeix commented at 2016-06-08 12:07:

Yes, we appeared right in the middle of a war...

I asked the SUSE grub2 package maintainer
and got this information:

> I am wondering if 'linuxefi' (and 'initrdefi')
> in our grub2 RPM is a SUSE specific thing?
NO. It's a UEFI Secure Boot specific thing.
> I am wondering if loading the kernel
> and initrd in grub.cfg via
> ---------------------------------------------------------
> menuentry ...
>     linuxefi /boot/vmlinuz...
>     initrdefi /boot/initrd...
> ---------------------------------------------------------
> is a SUSE specific way how to configure
> GRUB2 to boot on (U)EFI systems?
We have to support UEFI Secure Boot,
so we have to use linuxefi, as it implements
the needed verification mechanism via the shim-lock
protocol for the image loaded by grub2 ..
Moreover it leverages the kernel UEFI handover
protocol to boot the kernel via its own efistub
and benefits from kernel bug fixes for
firmware issues. ..
> According to your grub2 RPM changelog entry
> ---------------------------------------------------------------
> * Mon Nov 26 2012 mchang@suse.com
> - ship a Secure Boot UEFI compatible bootloader (fate#314485)
> - added secureboot patches which introduces new linuxefi module
>   that is able to perform verifying signed images via exported
>   protocol from shim.
>   ...
>   - grub2-secureboot-add-linuxefi.patch
> ---------------------------------------------------------------
> it seems linuxefi in grub2 is a SUSE specific thing
> that is not in upstream GRUB2.
It's not upstreamed for various reasons ..
Although downstream distros like SUSE
and Red Hat have already used it for a while,
upstream still has a problem accepting it ..
> Background information:
> 
> The reason why I am interested in 'linuxefi' and 'initrdefi'
> is that the Relax-and-Recover (rear) recovery system also
> needs to boot on (U)EFI systems and to do that the
> bootloader of the rear recovery system needs to
> be configured appropriately.
> When grub2 in SUSE needs specific configuration that
> is different compared to upstream GRUB2, then I must
> adapt rear to create its recovery system different
> on SUSE systems.
Please consider it the standard way to boot
in UEFI Secure Boot for now. Until someone
can convince upstream to include it,
no other grub command can be used
in UEFI Secure Boot, as Microsoft only
recognizes linuxefi, since they audit the code
to issue their certificates for the shim loader ...

Accordingly 'linuxefi' (and 'initrdefi') is
the current de-facto standard to boot
in UEFI Secure Boot that is used by
various Linux distributions like SUSE,
Red Hat, and Debian (where the latter
indicates that also Ubuntu uses it).

jsmeix commented at 2016-06-08 12:16:

Currently the rear recovery system offers in any case
both methods to load kernel and initrd via

menuentry ...
    linux /isolinux/kernel ...
    initrd /isolinux/initrd.cgz ...

and via

menuentry ...
    linuxefi /isolinux/kernel ...
    initrdefi /isolinux/initrd.cgz ...

I do not like to exclude one of them because I fear
regressions in this or that use-cases.

Perhaps sometimes only the 'linux'/'initrd' method works,
perhaps sometimes only the 'linuxefi'/'initrdefi' method works.

I do not want to make in rear a decision which method
works in what cases.

Therefore I will change my https://github.com/rear/rear/pull/867
a bit so that the naming is more technically correct.

At least for now I leave it up to the user to find out
what method actually works in his particular
boot environment.

Later - when we know better about all the various details - then
rear might get enhanced so that non-working recovery system
boot menu entries are automatically suppressed.

jsmeix commented at 2016-06-08 12:58:

I changed https://github.com/rear/rear/pull/867 so that
the recovery system boot menu entries now have names
that no longer look so nice but are (hopefully)
technically more correct.

I do not like that very much but I am not at all
a sufficient bootloader expert to implement a
real solution here.

jsmeix commented at 2016-06-08 13:00:

@abbbi regarding https://github.com/rear/rear/issues/866#issuecomment-224563239
many thanks for your tests!
It helps a lot to understand how messy it currently is.

jsmeix commented at 2016-06-08 13:18:

@gozora regarding
"SLES11 ... with both grub and elilo installed"

Currently I don't know any details about that
but I think I vaguely remember I heard about
some complicated reasoning behind it.

In short: This is not because SUSE likes to make
things overcomplicated but because of something
that needed to be solved for customers.

Fortunately on SLE12 GRUB 2 is the only bootloader
but now we have Secure Boot to keep things complicated.

gozora commented at 2016-08-07 17:41:

Ah, me and my memory. I've forgotten about this completely :-(
Fortunately after (another) week spent with grub and UEFI, I might have an idea how to solve this nice and simple ...

gozora commented at 2016-08-08 16:43:

Hi @abbbi
I've just posted pull request #955 that should solve your problem.
Could you please give it a try?

With this patch both entries (secure boot and no secure boot) in the grub menu should be working (assuming you have a correctly signed kernel ;-)

jsmeix commented at 2016-08-09 08:05:

With https://github.com/rear/rear/pull/955 merged
I consider this issue to be fixed, in particular because
@gozora has tested that it works at least for him
on SLES12 SP1, Debian and CentOS,
cf. https://github.com/rear/rear/pull/955#issuecomment-238481175

@abbbi
in general regarding how to test the currently
newest rear GitHub master code:
Basically "git clone" it into a directory and
then run rear from within that directory.

# git clone https://github.com/rear/rear.git
# cd rear
# vi etc/rear/local.conf
# usr/sbin/rear -d -D mkbackup

@abbbi
please provide feedback whether or not
the current rear GitHub master code
also works for you.

jsmeix commented at 2016-08-09 08:39:

@gozora I have a question regarding your comment
https://github.com/rear/rear/issues/866#issuecomment-238295932

> both entries (secure boot and no secure boot) in grub menu
> should be working

As far as I understand what @abbbi reported above in
https://github.com/rear/rear/issues/866#issuecomment-224563239

> SLES12: (No Secureboot) fails with "can't find command linux"

it seems in case of (U)EFI on SLES12 the GRUB 2 there
does not know about the linux command - it only
knows the linuxefi command so that

linux /isolinux/kernel ...

cannot load the kernel in case of (U)EFI on SLES12, only

linuxefi /isolinux/kernel ...

can in case of (U)EFI on SLES12.

See also what I wrote in
https://github.com/rear/rear/pull/867#issuecomment-238487352

I am still confused about the meaning of the names
for those GRUB menu entries.

For me the code looks as if it is about "(U)EFI" versus "no (U)EFI"
but the GRUB menu names make it look for the user as if it is
about "Secure Boot" versus "no Secure Boot".

gozora commented at 2016-08-09 08:52:

Actually the naming of the Grub2 commands linux vs linuxefi is quite confusing ...
From what I've understood so far, whether you use linux or linuxefi depends on whether you want secure boot or not.

In other words:

  • linuxefi loads the kernel and checks its signature
  • linux just loads the kernel

So you can use both commands when booting via UEFI (depending on whether you want secure boot or not), but you can use only the linux command when using legacy boot.
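
As a rough sketch of what this means for grub.cfg, the two entries differ only in the loader commands. emit_menuentry below is a hypothetical helper, not ReaR's create_grub2_cfg(); the menu titles and paths are taken from the grub.cfg quoted earlier in this issue, and the kernel arguments are left out.

```shell
# Hypothetical helper that prints one GRUB 2 menu entry.
# $1: menu title, $2: kernel command (linux or linuxefi),
# $3: initrd command (initrd or initrdefi)
emit_menuentry() {
    cat <<EOF
menuentry "$1" --class gnu-linux --class gnu --class os {
    echo 'Loading kernel ...'
    $2 /isolinux/kernel
    $3 /isolinux/initrd.cgz
}
EOF
}

# Without signature check: plain load of kernel and initrd.
emit_menuentry "Relax and Recover (no Secure Boot)" linux initrd
# With signature check: kernel must be correctly signed.
emit_menuentry "Relax and Recover (Secure Boot)" linuxefi initrdefi
```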

gozora commented at 2016-08-09 09:12:

> it seems in case of (U)EFI on SLES12 the GRUB 2 there
> does not know about the linux command - it only
> knows the linuxefi command so that

This is actually caused by an "incomplete bootx64.efi" (missing linux module). This was a tricky bug. If you are interested I can explain it in full detail this afternoon (once I'm off work) ...

jsmeix commented at 2016-08-09 09:22:

@gozora
of course I am always interested in detailed explanations!

gozora commented at 2016-08-09 17:16:

@jsmeix now that I'm home, far from the screams of work, with a beer opened, I'll try to awaken the author in me and, as promised, describe this fix in more detail.

I'll first summarize the assumptions that led me through writing this patch:
Grub2 options (let's call them that for now) behave as follows:

  • linux/initrd just loads kernel/initrd (this is our beloved simplest option inherited from legacy boot times)
  • linuxefi/initrdefi, instead of just loading kernel/initrd, checks whether the kernel signatures are valid and refuses to load a kernel that is not signed or has a wrong signature. This is our Secure boot.

In 25_populate_efibootimg.sh we have the following code on line 15:

cp $v "${UEFI_BOOTLOADER}" $TMP_DIR/mnt/EFI/BOOT/BOOTX64.efi >&2

so it just takes whatever is in $UEFI_BOOTLOADER and copies it to the preparation area that will later be part of the ISO (let us call this the default BOOTX64.efi for now). This operation is useful for ELILO, as you will learn later.

In order to successfully boot a UEFI system, we need to have a boot loader (*.efi). In case of Grub2 we can create our own using grub(2)-mkimage and embed the modules we need to boot (which is IMHO the preferred way), or in case of ELILO we can use the one located under /boot/efi, as done in the previous cp command. This copy approach can be used even for Grub2, but can have nasty side effects in the form of missing Grub2 functionality.

If we go on in 25_populate_efibootimg.sh, nothing really interesting is going on there (except your FIXMEs, which I've noticed and am thinking about ;-)) until line 69:
build_bootx86_efi. This function is intended to build our own BOOTX64.efi (which of course should overwrite the default BOOTX64.efi) that shall be used to boot the ISO.

build_bootx86_efi is defined in uefi-functions.sh
and we had following code in it:
...
[[ ! -d /usr/lib/grub/x86_64-efi ]] && return

$gmkimage $v -O x86_64-efi -c $TMP_DIR/mnt/EFI/BOOT/embedded_grub.cfg -d /usr/lib/grub/x86_64-efi -o $TMP_DIR/mnt/EFI/BOOT/BOOTX64.efi ...

As SLES12 SP1 has its Grub2 modules stored in /usr/lib/grub2/x86_64-efi, the check above always returned early, so grub(2)-mkimage was never run to create BOOTX64.efi and we were left with the default BOOTX64.efi.

I assume that abbbi installed the system with Secure Boot enabled, which caused the default BOOTX64.efi to be built without the linux Grub2 module embedded (which normally provides the linux and initrd options), and as the default BOOTX64.efi was not overwritten by build_bootx86_efi, it caused the effect described in this issue.
Normally Grub2 is capable of loading modules from disk when needed (you might have noticed all those insmods in grub.cfg). Grub2 modules are, however, not part of the ISO (we use embedding instead), so I guess the error message abbbi saw was something like “error: can't find command `linux'”

What I did to correct this issue was just to remove the condition:
[[ ! -d /usr/lib/grub/x86_64-efi ]] && return
and remove -d /usr/lib/grub/x86_64-efi from the grub(2)-mkimage call, as the -d option always defaults to the location where the Grub2 modules are installed.
This allowed the function build_bootx86_efi to overwrite the default BOOTX64.efi in case Grub2 is in use, as well as to keep the default BOOTX64.efi in case of ELILO.
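
To illustrate why the hard-coded directory check skipped the build on SLES12, here is a sketch of a more tolerant lookup that probes both locations mentioned in this comment. This is not what the fix does (the fix simply drops the check and the -d option); find_grub_efi_module_dir is a hypothetical helper, and the optional root prefix exists only so the sketch can run against a throwaway directory tree.

```shell
# Hypothetical helper: prints the first existing Grub2 x86_64-efi module directory.
# $1: optional root prefix (empty for the real filesystem)
find_grub_efi_module_dir() {
    for dir in "$1/usr/lib/grub/x86_64-efi" "$1/usr/lib/grub2/x86_64-efi"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1   # neither location exists
}

# Demonstration against a temporary tree that mimics the SLES12 layout,
# where only the grub2 directory exists.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/usr/lib/grub2/x86_64-efi"
find_grub_efi_module_dir "$demo_root"
rm -rf "$demo_root"
```

A check that only looks at /usr/lib/grub/x86_64-efi would have returned early on such a tree, which is the behavior described above.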

So this is not about Grub2 knowing or not knowing the linux/initrd options; I'd call it just a bad coincidence aka bug ;-). Grub2 options/commands are provided by modules, and modules can either be loaded dynamically (with insmod) or be embedded into the boot loader file (*.efi).
Btw, I've discovered that the commands each module contains are listed in /usr/lib/grub(2)/x86_64-efi/command.lst, which might be useful in the future ...
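
A small sketch of how command.lst could be queried. It assumes each line has the form "command: module", optionally with a leading "*" (for example "linux: linux" or "*acpi: acpi"); that format and the helper name grub_module_for_command are assumptions for illustration. The sample file below is made up so the sketch runs anywhere.

```shell
# Hypothetical helper: prints the module that provides a given Grub2 command.
# $1: command name, $2: path to a command.lst file
grub_module_for_command() {
    awk -v cmd="$1" '
        { name = $1; sub(/^\*/, "", name) }   # drop an optional leading "*"
        name == cmd ":" { print $2 }          # match "command:" exactly
    ' "$2"
}

# Made-up sample in the assumed format.
sample_lst=$(mktemp)
printf '%s\n' '*acpi: acpi' 'initrd: linux' 'linux: linux' \
    'initrdefi: linuxefi' 'linuxefi: linuxefi' > "$sample_lst"

grub_module_for_command linux "$sample_lst"
grub_module_for_command initrdefi "$sample_lst"
rm -f "$sample_lst"
```

On a real system the second argument would be the command.lst under the Grub2 module directory. An empty result means no listed module provides the command.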

If SLES12 used /usr/lib/grub/x86_64-efi for storing Grub2 modules (like Debian or CentOS do), we'd probably never have discovered this.

I hope this has shed a bit of light on this issue.

jsmeix commented at 2016-08-10 15:17:

@gozora
wow!
Many thanks for your awesome explanation.

I think such enlightening background explanations
should be stored somewhere in the actual rear sources
(and not only in a GitHub issue comment)
to make it easier for others to adapt and enhance the code
if needed at any later time when nobody remembers the
exact reasoning behind why the current code is as it is, cf
https://github.com/rear/rear/wiki/Coding-Style

@gdha
what do you think about having such background information
in the actual rear sources?

Personally I would provide it as comments directly in the scripts.
Why not have long explanatory comments directly in the scripts?
I think it is sufficiently easy to skip over them when coding
and having them directly in the scripts hopefully avoids that the
code documentation gets outdated when the code is changed.
Perhaps very long comments could be added after the actual code
at the bottom of the script and in the actual code have only some
kind of references to sections in the code documentation at the end?


[Export of Github issue for rear/rear.]