#2293 PR merged: Use grub-mkstandalone instead of grub-mkimage for UEFI

Labels: enhancement, cleanup, fixed / solved / done

pcahyna opened issue at 2019-12-08 22:25:

Pull Request Details:
  • Type: Bug Fix / Cleanup

  • Impact: Normal

  • Reference to related issue (URL):
    #2199 #2283 #2001 https://github.com/rear/rear/issues/1996#issuecomment-445201012

  • How was this pull request tested?
    the two cases that use the code touched by this PR:

    • creating a rescue ISO on a RHEL 8.1 UEFI system and booting it
    • creating a rescue system using GRUB_RESCUE and booting it using the EFI boot menu
    • repeating both with GRUB2_MODULES=(part_gpt part_msdos fat ext2 normal chain boot configfile linux jfs iso9660 usb usbms usb_keyboard video udf ntfs all_video gzio efi_gop reboot search test echo btrfs lvm)
  • Brief description of the changes in this pull request:
    There have been several cases recently where ReaR on UEFI was broken due to either a wrong assumption about what Grub2 modules to load (#2001) or, in an attempt to fix it, a wrong assumption about where to search for them (see #2199). I believe that such Grub internal details are better handled by Grub itself and we should not hardcode them (making them configurable is only a workaround - the default should work in most scenarios). Therefore this change switches to using grub-mkstandalone for te Grub2 UEFI image generation, which, unlike grub-mkimage, relies less on supplied information about modules. It embeds a ramdisk in the Grub2 image and by defaults includes all modules in it.
    I found that the current method is broken at least on RHEL and GRUB_RESCUE, because the current default set of modules does not include xfs and /boot is XFS by default on RHEL, making it inaccessible from Grub2. This change also fixes this, therefore it is a bugfix in addition to a cleanup.
    I found that the image creation using grub-mkstandalone still needs to be told to load partition modules, otherwise partitions will not be detected out of the box. I therefore added a dynamic detection of all modules needed to access /boot (partitions, filesystems, and others like lvm, should it be needed) using grub-probe, as it is more robust and relies on less assumptions than trying to guess a reasonable set of default modules.
    The change still honors GRUB2_MODULES, but changes it into an array. (Other lists in configuration are specified as arrays already, like MODULES, so having GRUB2_MODULES an a string with blank-separated words is not consistent with the rest of the configuration.) If GRUB2_MODULES is set and nonempty, only the specified modules will be embedded in the standalone image ramdisk and also only those will be loaded. The detection using grub-probe will be switched off.

gozora commented at 2019-12-09 09:09:

Hello @pcahyna,

I just run a simple test and created ReaR rescue system with your code.
On ISO boot, I don't get Grub2 boot menu but first boot entry is booted immediately without any prompt. Can you please fix this behavior ?

Thanks

V.

pcahyna commented at 2019-12-09 09:28:

Hi @gozora ,
thank you for testing, unfortunately I tested this case and it worked fine for me, so I don't know how to reproduce the problem, what is your distribution and Grub2 version?

gozora commented at 2019-12-09 09:30:

Hello @pcahyna,

I've used SLES12SP4 for SAP ...

# cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles_sap:12:sp4"

V.

gozora commented at 2019-12-09 09:34:

Excerpt from debug log:

...
++ cp -v /boot/efi/EFI/sles/grubx64.efi /tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/BOOTX64.efi
'/boot/efi/EFI/sles/grubx64.efi' -> '/tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/BOOTX64.efi'
+++ dirname /boot/efi/EFI/sles/grubx64.efi
++ local uefi_bootloader_dirname=/boot/efi/EFI/sles
++ test -f ''
+++ basename /usr/bin/xorrisofs
++ test ebiso = xorrisofs
+++ type -p grub
++ [[ -n '' ]]
++ create_grub2_cfg
+++ get_root_disk_UUID
++++ grep ' on / '
++++ awk '{print $1}'
++++ xargs blkid -s UUID -o value
++++ mount
+++ echo 7e36992b-b7a6-48a2-89d1-499c93a825ed
++ root_uuid=7e36992b-b7a6-48a2-89d1-499c93a825ed
++ cat
++ test -f ''
++ build_bootx86_efi /tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/BOOTX64.efi /tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/grub.cfg
++ local outfile=/tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/BOOTX64.efi
++ local embedded_config=
++ local gmkstandalone=
++ local gprobe=
++ dirs=()
++ local dirs
++ modules=(${GRUB2_MODULES:+"${GRUB2_MODULES[@]}"})
++ local modules
++ shift
++ [[ -n /tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/grub.cfg ]]
++ embedded_config=/boot/grub/grub.cfg=/tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/grub.cfg
++ shift
++ dirs=(${@:+"$@"})
++ has_binary grub-mkstandalone
++ for bin in '$@'
++ type grub-mkstandalone
/usr/share/rear/lib/_input-output-functions.sh: line 476: type: grub-mkstandalone: not found
++ return 1
++ has_binary grub2-mkstandalone
++ for bin in '$@'
++ type grub2-mkstandalone
++ return 0
++ gmkstandalone=grub2-mkstandalone
++ ((  0  ))
++ test /usr/lib/grub2/x86_64-efi/moddep.lst
++ grub2-mkstandalone -v -O x86_64-efi -o /tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/BOOTX64.efi /boot/grub/grub.cfg=/tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/grub.cfg
grub2-mkstandalone: info: copying `/usr/lib/grub2/x86_64-efi/acpi.mod' -> `/tmp/grub.rKUK6E/boot/grub/x86_64-efi/acpi.mod'.
grub2-mkstandalone: info: copying `/usr/lib/grub2/x86_64-efi/adler32.mod' -> `/tmp/grub.rKUK6E/boot/grub/x86_64-efi/adler32.mod'.
...

# cat /tmp/rear.IEzPq6u2M2oANaW/tmp/mnt/EFI/BOOT/grub.cfg
set default="0"

insmod efi_gop
insmod efi_uga
insmod video_bochs
insmod video_cirrus
insmod all_video

set gfxpayload=keep
insmod gzio
insmod part_gpt
insmod ext2

set timeout=5

search --no-floppy --file /boot/efiboot.img --set
#set root=(cd0)

menuentry "Relax-and-Recover (no Secure Boot)"  --class gnu-linux --class gnu --class os {
     echo 'Loading kernel ...'
     linux /isolinux/kernel root=UUID=7e36992b-b7a6-48a2-89d1-499c93a825ed  selinux=0 biosdevname=1 net.ifnames=0
     echo 'Loading initial ramdisk ...'
     initrd /isolinux/initrd.cgz
}

menuentry "Relax-and-Recover (Secure Boot)"  --class gnu-linux --class gnu --class os {
     echo 'Loading kernel ...'
     linuxefi /isolinux/kernel root=UUID=7e36992b-b7a6-48a2-89d1-499c93a825ed  selinux=0 biosdevname=1 net.ifnames=0
     echo 'Loading initial ramdisk ...'
     initrdefi /isolinux/initrd.cgz
}

menuentry "Reboot" {
     reboot
}

menuentry "Exit to EFI Shell" {
     exit
}

gozora commented at 2019-12-09 09:45:

@pcahyna do not spent your time on this for now!
I've reset my server several times an sometimes menu is shown and sometimes not.
This might indicate possible problem with my firmware, and display of menu might be related to bigger size of bootx64.efi.

I'll run couple of deeper tests in upcoming days and let you know if I find something ...

V.

pcahyna commented at 2019-12-09 09:48:

@gozora , thanks. I suggest that you also try GRUB_RESCUE, which uses the same code.

gozora commented at 2019-12-09 10:18:

@pcahyna I've tested GRUB_RESCUE as you've suggested and got following:

image

I'm pretty sure that kernel and initrd are available but Grub2 somehow can't locate them:

# ls /boot/
System.map-4.12.14-95.24-default  initrd-4.12.14-95.24-default        symvers-4.12.14-95.24-default.gz
boot.readme                       initrd-4.12.14-95.24-default-kdump  sysctl.conf-4.12.14-95.24-default
config-4.12.14-95.24-default      initrd-4.4.176-94.88-default        vmlinux-4.12.14-95.24-default.gz
efi                               rear-initrd.cgz                     vmlinuz
grub2                             rear-kernel                         vmlinuz-4.12.14-95.24-default
initrd                            symtypes-4.12.14-95.24-default.gz

Same ReaR configuration works fine with "old" code ...
It can be related to fact that my /boot/ is located on BTRFS.
Can you maybe extend your tests for /boot on BTRFS ?

Thanks

V.

gozora commented at 2019-12-09 10:23:

@pcahyna I just realized that I'm running my server from BTRFS snapshot ...
Can it be that build_bootx86_efi() (grub-mkstandalone) somehow can't deal with with BTRFS snapshots ?

V.

gozora commented at 2019-12-09 10:28:

@pcahyna just FYI:
My structure looks something like this:

# findmnt /
TARGET SOURCE                              FSTYPE OPTIONS
/      /dev/sda3[/@/.snapshots/1/snapshot] btrfs  rw,relatime,space_cache,subvolid=279,subvol=/@/.snapshots/1/snapshot

# df -h /boot
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        48G  9.2G   39G  20% /

V.

pcahyna commented at 2019-12-09 10:32:

@gozora has the build code loaded a btrfs and LVM module in Grub2? If not, can you manuallly load them (insmod btrfs, insmod lvm) and try the menu again?

pcahyna commented at 2019-12-09 10:34:

P.S. you will see what modules get loaded from the grub2-mkstandalone command from the debug log. (I also added a separate log statement which will show them)

gozora commented at 2019-12-09 10:35:

I assume that all modules were loaded, because I was able to read content /boot directory from Grub2 command line without any problems.

V.

pcahyna commented at 2019-12-09 10:37:

@gozora you are able to read the content, but is the content correct? If yes, I don't see how you are able to read the content but not able to boot a kernel which is part of the very same content.

gozora commented at 2019-12-09 10:55:

I just simply read content of directory in Grub2 cmdline . I dont know what content / directory is the "correct" one for Grub2. grub-mkimage in old code worked well though ...

V.

pcahyna commented at 2019-12-09 11:01:

I mean if you read the content of the directory, do you see the kernel and initrd that you want to load?

jsmeix commented at 2019-12-09 11:08:

From my experience with the SUSE's special btrfs default structure
basically everything that can go wrong will go wrong with it.

From my personal point of view it is just too complicated.
What I mean is it does not behave sufficiently fail-safe in practice.

When all and everything is fully correctly set up for it,
it provides some extradordinary and powerful functionality.

But from my experience things fall apart as soon
as something is not set up fully correctly.

As far as I know (but I am not at all a real expert in this area)
one main reason of confusion is that what looks like /dir in the
normal running system is not /dir in the whole btrfs tree
because in the normal running system '/' is not the root of the btrfs
but a sub-directory of the btrfs tree (actually a btrfs subvolume)
that get mounted as '/' in the normal running system.

With SUSE's special btrfs default structure what is mounted as '/' in the
normal running system is a snapper snapshot sub-directory/subvolume
of the btrfs tree.

And then other sub-directories of the btrfs tree (actually btrfs subvolumes)
get mounted below what is mounted as '/' in the normal running system.

To see what of the btrfs tree is acually mounted one must mount
the root of the btrfs at a separated mountpoint so that one can
inspect the whole btrfs tree starting at its actual root.

For example on my current SLES12-SP4 system with LVM and btrfs
but with BIOS and not with UEFI, it is the same system as in
https://github.com/rear/rear/pull/2291#issuecomment-561224611
I have

# lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT
NAME                        KNAME     PKNAME    TRAN TYPE FSTYPE       SIZE MOUNTPOINT
/dev/sda                    /dev/sda            ata  disk               20G 
`-/dev/sda1                 /dev/sda1 /dev/sda       part LVM2_member   20G 
  |-/dev/mapper/system-swap /dev/dm-0 /dev/sda1      lvm  swap         1.4G [SWAP]
  `-/dev/mapper/system-root /dev/dm-1 /dev/sda1      lvm  btrfs       18.6G /
/dev/sr0                    /dev/sr0            ata  rom  iso9660      3.8G

# findmnt -at btrfs
TARGET                    SOURCE                                             FSTYPE OPTIONS
/                         /dev/mapper/system-root[/@/.snapshots/1/snapshot]  btrfs  rw,relatime,s
|-/usr/local              /dev/mapper/system-root[/@/usr/local]              btrfs  rw,relatime,s
|-/var/opt                /dev/mapper/system-root[/@/var/opt]                btrfs  rw,relatime,s
|-/var/lib/mysql          /dev/mapper/system-root[/@/var/lib/mysql]          btrfs  rw,relatime,s
|-/boot/grub2/i386-pc     /dev/mapper/system-root[/@/boot/grub2/i386-pc]     btrfs  rw,relatime,s
|-/.snapshots             /dev/mapper/system-root[/@/.snapshots]             btrfs  rw,relatime,s
|-/var/crash              /dev/mapper/system-root[/@/var/crash]              btrfs  rw,relatime,s
|-/var/lib/pgsql          /dev/mapper/system-root[/@/var/lib/pgsql]          btrfs  rw,relatime,s
|-/var/spool              /dev/mapper/system-root[/@/var/spool]              btrfs  rw,relatime,s
|-/var/lib/mailman        /dev/mapper/system-root[/@/var/lib/mailman]        btrfs  rw,relatime,s
|-/opt                    /dev/mapper/system-root[/@/opt]                    btrfs  rw,relatime,s
|-/var/cache              /dev/mapper/system-root[/@/var/cache]              btrfs  rw,relatime,s
|-/home                   /dev/mapper/system-root[/@/home]                   btrfs  rw,relatime,s
|-/var/lib/mariadb        /dev/mapper/system-root[/@/var/lib/mariadb]        btrfs  rw,relatime,s
|-/var/tmp                /dev/mapper/system-root[/@/var/tmp]                btrfs  rw,relatime,s
|-/boot/grub2/x86_64-efi  /dev/mapper/system-root[/@/boot/grub2/x86_64-efi]  btrfs  rw,relatime,s
|-/srv                    /dev/mapper/system-root[/@/srv]                    btrfs  rw,relatime,s
|-/var/lib/named          /dev/mapper/system-root[/@/var/lib/named]          btrfs  rw,relatime,s
|-/var/lib/machines       /dev/mapper/system-root[/@/var/lib/machines]       btrfs  rw,relatime,s
|-/var/lib/libvirt/images /dev/mapper/system-root[/@/var/lib/libvirt/images] btrfs  rw,relatime,s
`-/var/log                /dev/mapper/system-root[/@/var/log]                btrfs  rw,relatime,s

# ls -l /boot/grub2
drwxr-xr-x 1 root root    0 Jul 12  2017 backgrounds
-rw-r--r-- 1 root root   15 Nov 21 11:51 device.map
drwxr-xr-x 1 root root   22 Nov 21 11:52 fonts
-rw------- 1 root root 7910 Nov 21 11:52 grub.cfg
-rw-r--r-- 1 root root 1024 Nov 21 11:51 grubenv
drwxr-xr-x 1 root root 6186 Nov 21 11:52 i386-pc
drwxr-xr-x 1 root root  356 Nov 21 11:52 locale
drwxr-xr-x 1 root root    6 Nov 21 11:50 themes
drwxr-xr-x 1 root root    0 Nov 21 11:47 x86_64-efi

# mkdir /tmp/btrfsroot

# mount -t btrfs -o subvolid=0 /dev/mapper/system-root /tmp/btrfsroot

# ls -l /tmp/btrfsroot
drwxr-xr-x 1 root root 72 Nov 21 11:47 @

# ls -l /tmp/btrfsroot/@
drwxr-x--- 1 root root   42 Nov 21 11:54 .snapshots
drwxr-xr-x 1 root root   10 Nov 21 11:47 boot
drwxr-xr-x 1 root root   14 Nov 21 11:47 etc
drwxr-xr-x 1 root root   16 Nov 21 11:51 home
drwxr-xr-x 1 root root    0 Jun 27  2017 opt
drwxr-xr-x 1 root root   12 Nov 21 11:47 srv
drwxrwxrwt 1 root root 1032 Dec  9 11:46 tmp
drwxr-xr-x 1 root root   10 Nov 21 11:47 usr
drwxr-xr-x 1 root root   54 Nov 21 11:47 var

# ls -l /tmp/btrfsroot/@/boot/
drwxr-xr-x 1 root root 34 Nov 21 11:47 grub2

# ls -l /tmp/btrfsroot/@/boot/grub2/
drwxr-xr-x 1 root root 6186 Nov 21 11:52 i386-pc
drwxr-xr-x 1 root root    0 Nov 21 11:47 x86_64-efi

so what looks like /boot/grub2/i386-pc on SLES12 with btrfs
is actually /@/boot/grub2/i386-pc in the whole btrfs tree.

I guess with UEFI the same kind of confusion could happen.

I think what file and directory paths a GRUB2 command "sees"
while it runs inside a normal running SLES with btrfs
is different compared to what file and directory paths
the GRUB2 bootloader "sees" on plain btrfs during booting.

Again:
I am not at all an expert in this area so something of what I wrote above
could be wrong.

But what is true is that file and directory paths look different
from inside a running SLES12 with its mounted btrfs stuff
compared to when looking directly at the plain whole btrfs.

Cf. my fun with it in
https://github.com/rear/rear/issues/1828
see in particular
https://github.com/rear/rear/issues/1828#issuecomment-396913152
which proves things depend on how btrfs stuff is mounted inside SLE12.

gozora commented at 2019-12-09 11:34:

Thanks @jsmeix i coldnt explain it better!
It looks like grub-mkstandalone is doing something differently compared to grub-mkimage ...

V.

gozora commented at 2019-12-09 12:48:

When I'm using "old" code ReaR executes code as follows:

grub2-mkimage -v -O x86_64-efi --config /boot/grub2/rear_embed.cfg -o /boot/efi/EFI/BOOT/rear.efi -p /EFI/BOOT part_gpt part_msdos fat ext2 normal chain boot configfile linux linuxefi multiboot jfs iso9660 usb usbms usb_keyboard video udf ntfs all_video gzio efi_gop reboot search test echo btrfs lvm

However with "new" code, I have only following modules:

2019-12-09 12:30:04.779046992 GRUB2 modules to load: btrfs fat part_gpt

Any idea why only 3 modules are auto-detected? (a site note; on this server I don't have BTRFS located on LVM, so missing modules should not be a showstopper for booting)

Excerpt from debug log:

++ for bin in '$@'
++ type grub2-probe
++ return 0
++ gprobe=grub2-probe
++ '[' -n grub2-probe ']'
++ modules=($( for p in "${dirs[@]}" ; do
                             $gprobe --target=fs "$p"
                             $gprobe --target=partmap "$p" | sed -e 's/^/part_/'
                             $gprobe --target=abstraction "$p"
                         done | sort -u ))
+++ sort -u
+++ for p in '"${dirs[@]}"'
+++ grub2-probe --target=fs /boot
+++ sed -e 's/^/part_/'
+++ grub2-probe --target=partmap /boot
+++ grub2-probe --target=abstraction /boot
+++ for p in '"${dirs[@]}"'
+++ grub2-probe --target=fs /boot/efi
+++ sed -e 's/^/part_/'
+++ grub2-probe --target=partmap /boot/efi
+++ grub2-probe --target=abstraction /boot/efi
++ Log 'GRUB2 modules to load: btrfs fat part_gpt'
++ echo '2019-12-09 12:30:04.779046992 GRUB2 modules to load: btrfs fat part_gpt'
2019-12-09 12:30:04.779046992 GRUB2 modules to load: btrfs fat part_gpt

If I add modules manually by:

GRUB2_MODULES="part_gpt part_msdos fat ext2 normal chain boot configfile linux linuxefi multiboot jfs iso9660 usb usbms usb_keyboard video udf ntfs all_video gzio efi_gop reboot search test echo btrfs lvm"

It still doesn't work though ... (when I type ls I get message ls module is missing ...`

V.

gozora commented at 2019-12-09 12:53:

I can see that bootx64.efi and rear.efi are created differently:

++ build_bootx86_efi /boot/efi/EFI/BOOT/rear.efi /boot/grub2/rear.cfg /boot /boot/efi

++ build_bootx86_efi /tmp/rear.g3pKTRZrSI3pKGk/tmp/mnt/EFI/BOOT/BOOTX64.efi /tmp/rear.g3pKTRZrSI3pKGk/tmp/mnt/EFI/BOOT/grub.cfg

What is the purpose of $3 and $4 when creating rear.efi ?

V.

pcahyna commented at 2019-12-09 12:59:

++ build_bootx86_efi /boot/efi/EFI/BOOT/rear.efi /boot/grub2/rear.cfg /boot /boot/efi

What is the purpose of $3 and $4 when creating rear.efi ?

The purpose is to pass directories/files that must be accessible from Grub. My code makes sure that Grub modules needed to access those are loaded. Without this feature, you would risk not having part_gpt and / or btrfs modules loaded and you would not even see your /boot. (The log snipped you have shown proves that this part is working correctly.)

gozora commented at 2019-12-09 13:01:

@pcahyna I guess I understand the purpose now! I'm in rush so I don't read carefully :-(
When build_bootx86_efi() gets $3 and 4$ it does some kind of auto-detection otherwise you will include all available modules, right?

I guess this is not a good idea, since ESP can be located anywhere and /boot doesn't need to exist at all ... Try to somehow reasonably detect where is ESP located instead of hard-coding it.

It however still doesn't solve the problem that is auto-detection is used rear.efi doesn't boot (at lest on SLES12SP4).

V.

pcahyna commented at 2019-12-09 13:07:

However with "new" code, I have only following modules:

2019-12-09 12:30:04.779046992 GRUB2 modules to load: btrfs fat part_gpt

Any idea why only 3 modules are auto-detected? (a site note; on this server I don't have BTRFS located on LVM, so missing modules should not be a showstopper for booting)

Because you don't need any other modules to access /boot, at least the code thinks so. (As you say, missing lvm module is not a showstopper for booting, so this part was detected correctly.)

Those three are not the only modules that get loaded. Those are additional modules that are determined to be needed to access /boot. Even with no modules here, grub-mkstandalone creates a basically usable system (but it risks not being able to see /boot).

If I add modules manually by:

GRUB2_MODULES="part_gpt part_msdos fat ext2 normal chain boot configfile linux linuxefi multiboot jfs iso9660 usb usbms usb_keyboard video udf ntfs all_video gzio efi_gop reboot search test echo btrfs lvm"

This shows that the problem is probably somewhere else than in a missing module. I suspect a difference in the grub.cfg files.

It still doesn't work though ... (when I type ls I get message ls module is missing ...`

Yes, that is a possible weakness in my treatment of GRUB2_MODULES. When GRUB2_MODULES are given, they not only specify what modules to load, but also what modules to install in the standalone image ramdisk. The ls module is normally not loaded by default, but it is autoloaded when needed. But since it has not been even installed, it can not be autoloaded. Maybe I should use GRUB2_MODULES only to determine what modules to load, and install all modules, or use a separate variable to determine what modules to install.

gozora commented at 2019-12-09 13:09:

This one might help you a bit:

With "old" working code:
image
Notice that ls (hd1,gpt3)/boot returns data from my BTRFS snapshot

Where "new" code:
image

Sees bare root filesystem when ls (hd1,gpt3) is executed, exactly as @jsmeix mentioned in his https://github.com/rear/rear/pull/2293#issuecomment-563186126.

V.

pcahyna commented at 2019-12-09 13:14:

@pcahyna I guess I understand the purpose now! I'm in rush so I don't read carefully :-(
When build_bootx86_efi() gets $3 and 4$ it does some kind of auto-detection otherwise you will include all available modules, right?

It always includes all available modules, but does not load all of them. This autodetection determines additional modules to load. I.e. it does not restrict modules to be loaded, it extends them.

I guess this is not a good idea, since ESP can be located anywhere and /boot doesn't need to exist at all ... Try to somehow reasonably detect where is ESP located instead of hard-coding it.

What is an ESP?

This hardcoding of /boot was already there, I did not change it. See https://github.com/rear/rear/blob/60e19be52c3b93b1c8e70d71f35c9f11189809d0/usr/share/rear/output/default/940_grub2_rescue.sh#L42

It however still doesn't solve the problem that is auto-detection is used rear.efi doesn't boot (at lest on SLES12SP4).

Yes, I suppose it is unrelated to the modules issue. i don't think that the module detection part malfunctioned here. Must be something in the cfg file. Try setting btrfs_relative_path=y which I perhaps overzealously removed in https://github.com/rear/rear/pull/2293/files#diff-7f052550597fa70483fe418e92ef6f8cL164

pcahyna commented at 2019-12-09 13:15:

This one might help you a bit:

Thanks! Please try btrfs_relative_path=y.

pcahyna commented at 2019-12-09 13:19:

ESP = EFI system partition, right? If so, this also has been hardcoded before to /boot/efi, i.e. there is no support for having it located "anywhere".
https://github.com/rear/rear/blob/60e19be52c3b93b1c8e70d71f35c9f11189809d0/usr/share/rear/output/default/940_grub2_rescue.sh#L180

gozora commented at 2019-12-09 13:23:

Hello @pcahyna

btrfs_relative_path=y did the trick, rear.efi is booting now ...

V.

pcahyna commented at 2019-12-09 13:24:

Thanks! I will add it back to the generated .cfg.

gozora commented at 2019-12-09 13:29:

ESP = EFI system partition, right? If so, this also has been hardcoded before to boot/efi, i.e. there is no support for having it located "anywhere".

This is not a good example, code you are referring to is buggy, I'm currently working on PR to fix it, (see https://github.com/gozora/rear/commit/b368b73a6b599f4d81bda45ff5b6a58547911c84)

pcahyna commented at 2019-12-09 13:43:

ESP = EFI system partition, right? If so, this also has been hardcoded before to boot/efi, i.e. there is no support for having it located "anywhere".

This is not a good example, code you are referring to is buggy, I'm currently working on PR to fix it, (see gozora@b368b73)

I was working with what was there - if you introduce a variable for it, I will use it.

Concerning your commit, I think you can use grub2-probe -t device instead of mount | grep "$esp_mountpoint" | awk '{print $1}'. grub2-probe -t disk and grub2-probe -t fs_uuid and grub2-mkrelpath could also be very useful to simplify this code.

gozora commented at 2019-12-09 14:07:

Concerning your commit, I think you can use grub2-probe -t device instead of mount | grep "$esp_mountpoint" | awk '{print $1}'. grub2-probe -t disk and grub2-probe -t fs_uuid and grub2-mkrelpath could also be very useful to simplify this code.

Thanks, but I personally like mount ... better.

V.

pcahyna commented at 2019-12-09 14:12:

The problem is, there is a lot of text-processing magic (which encodes lots of special cases) to remove the partition identifier later down in your code, and grub2-probe -t disk would probably be more robust, if it works correctly.

gozora commented at 2019-12-09 14:21:

That text-processing magic is needed byefibootmgr.
Because efibootmgr requires disk and partition to be entered separately.

efibootmgr --disk ${Disk} --part ${ParNr} ...

V.

pcahyna commented at 2019-12-10 13:35:

@rmetrich , good catch! I rebased the PR to fix this and also to add back btrfs_relative_path=y which should fix @gozora's btrfs case.

pcahyna commented at 2019-12-10 16:54:

That text-processing magic is needed byefibootmgr.
Because efibootmgr requires disk and partition to be entered separately.

efibootmgr --disk ${Disk} --part ${ParNr} ...

I see. You still need the magic to extract ${ParNr}. But this is the simpler part of the code, the most complicated and full of special cases part is the one to extract ${Disk}. And this is the part which would IMHO benefit the most from using a single command like grub2-probe -t disk, if it DTRT (needs to be tested especially on the special cases like multipath, NVMe and MMC of course).

By the way, are those variables going to be used in another file? If not, shouldn't they be lowercase and local?

gozora commented at 2019-12-10 17:09:

Most of that code is borrowed from 670_run_efibootmgr.sh and not written by me ...
But I agree that everything can be written simpler and more elegant ...

V.

pcahyna commented at 2019-12-10 17:15:

If I add modules manually by:

GRUB2_MODULES="part_gpt part_msdos fat ext2 normal chain boot configfile linux linuxefi multiboot jfs iso9660 usb usbms usb_keyboard video udf ntfs all_video gzio efi_gop reboot search test echo btrfs lvm"

It still doesn't work though ... (when I type ls I get message ls module is missing ...`

I dislike this part of the PR. I think that the error that you got is unexpected and unhelpful. The PR changed semantics of GRUB2_MODULES a bit: formerly it determined what modules to load, now it determines also what modules to install into the standalone image ramdisk (and thus are accessible for autoloading). I would like to change it back to determining only what modules to load, and install all the modules, as it is the default for grub-mkstandalone.
For this, however, I would prefer to change the name of the variable to GRUB2_MODULES_LOAD for symmetry with MODULES_LOAD. MODULES_LOAD is an array of kernel modules that should be loaded, so in an analogy, the list of GRUB2 modules to be loaded should be called GRUB2_MODULES_LOAD. This opens the possibility of using GRUB2_MODULES for the list of GRUB2 modules to install (if needed), in an analogy with MODULES (which is a list of kernel modules to install). What do you think?

gozora commented at 2019-12-10 18:05:

@pcahyna don't forget to use reasonable defaults. We don't want users to reconfigure ReaR after upgrade!
Systems that booted fine prior this PR must boot after it as well ...

V.

pcahyna commented at 2019-12-10 18:14:

The autodetection of modules that this PR adds actually moves in this direction, as there will be (hopefully) less cases of broken GRUB2 images due to missing modules and inaccessible /boot.
Concerning GRUB2_MODULES, there has not been a release with it yet, so I hope it is acceptable to change the name, if we agree that GRUB2_MODULES_LOAD is better suitable.

jsmeix commented at 2019-12-11 08:47:

@pcahyna
I like your config variable names cleanup proposal in your
https://github.com/rear/rear/pull/2293#issuecomment-564137081

Normally we cannot change established config variable names
because that would result regressions for users who use the
old name in their etc/rear/local.conf file but in this case all is o.k.
because the GRUB2_MODULES config variable was introduced by
https://github.com/rear/rear/commit/4ac6b69f77fe2952d479f2b441a8b8890d63853e
so that the GRUB2_MODULES config variable was never used
in a released ReaR version.

We could even change an established config variable name
if there is a serious reason for such a change but then we must
either
have backward compatibility code that maps values in the old named variable
into the new named variable (if such mapping is possible at all)
or
we must error out with a meaningful error message
after reading all config files if the old name is used by the user
so that the user is informed.

The question is when backward compatibility code is possible
why the variable name needs to be changed at all
and when backward compatibility code is there
it needs to be maintained ad infinitum
so perhaps the cleaner way is to error out if the old name is used
because it is better to make a painful break than to draw out the pain.

pcahyna commented at 2019-12-11 10:45:

@jsmeix I pushed a commit to introduce GRUB2_MODULES_LOAD and to change the semantics of GRUB2_MODULES to determine what modules are to be installed in the ramdisk. I am not sure that GRUB2_MODULES is needed at all after the introduction of GRUB2_MODULES_LOAD (will there ever be a problem with the default of installing all modules unconditionally)? but I added it for symmetry with MODULES and MODULES_LOAD.

gozora commented at 2019-12-15 20:07:

Unfortunately I've found another problem while running some tests on SLE15.
When I tried to boot rear.efi I got stuck in boot process with following message:
Screenshot from 2019-12-15
20-03-37
I had to add all_video to GRUB2_MODULES_LOAD to fix this problem.
I'm not really sure how this could efficiently solved, looks like hard coding of at least some modules will be required after all ...

V.

pcahyna commented at 2019-12-15 22:15:

Unfortunately I've found another problem while running some tests on SLE15.
When I tried to boot rear.efi I got stuck in boot process with following message:
Screenshot from 2019-12-15
20-03-37
I had to add all_video to GRUB2_MODULES_LOAD to fix this problem.
I'm not really sure how this could efficiently solved, looks like hard coding of at least some modules will be required after all ...

V.

@gozora thanks for the test. The all_video module should not be built into the image, grub-install does not do it either. Rather, there should be a command to load it it the grub config file (look at your distribution grub.cfg). create_grub2_cfg in lib/bootloader-functions.sh does this already:
https://github.com/rear/rear/blob/d8946bcc1cc61086b392e5d85ce956fdcd7d69cf/usr/share/rear/lib/bootloader-functions.sh#L497, so you should not have the problem with the ISO, but 940_grub2_rescue.sh does not use it (the function is not ready to be used this way).
The problem is, we have unified Grub2 image creation between the ISO and GRUB2_RESCUE, but we have not unified the config file creation.

pcahyna commented at 2019-12-15 22:59:

@gozora hope your issue is fixed now. I just copied the graphics-related code from create_grub2_cfg, because sharing the code would mean a bigger refactoring.

gozora commented at 2019-12-19 20:57:

Hello @pcahyna,

Thanks for your update, I'll test it again soon ...

V.

gozora commented at 2020-01-20 20:09:

After running several other tests this PR looks good to me.
If there aren't any further objections I'll merge it soon.

V.

pcahyna commented at 2020-01-20 20:23:

@gozora thanks for testing! Does it fix your btrfs-on-lvm case without the need of setting GRUB2_MODULES(_LOAD) manually?

gozora commented at 2020-01-20 20:58:

Hello @pcahyna, yes btrfs-on-lvm problem is now solved without any explicit Grub module configuration inside ReaR.

V.

jsmeix commented at 2020-01-21 09:49:

@rmetrich
currently you request changes here.
Is your request still valid or is it meanwhile outdated or solved?

gozora commented at 2020-01-23 19:08:

@pcahyna thanks for your work on this PR, it is appreciated!

V.

pcahyna commented at 2020-01-30 14:26:

By the way, are those variables going to be used in another file? If not, shouldn't they be lowercase and local?

Most of that code is borrowed from 670_run_efibootmgr.sh and not written by me ...
But I agree that everything can be written simpler and more elegant ...

#2324


[Export of Github issue for rear/rear.]