#3104 PR merged: Error() instead of BugError() in 850_make_USB_bootable.sh

Labels: enhancement, fixed / solved / done

jsmeix opened issue at 2023-12-11 10:54:

  • Type: Enhancement

  • Impact: Low

  • Reference to related issue (URL):
    https://github.com/rear/rear/issues/1571
    and
    https://github.com/rear/rear/issues/3098
    i.e. false BUG Error in
    output/USB/Linux-i386/850_make_USB_bootable.sh
    because of unsupported OUTPUT_URL for OUTPUT=USB

  • How was this pull request tested?
    Not tested.
    Those code changes should not cause any issue.

  • Description of the changes in this pull request:

In output/USB/Linux-i386/850_make_USB_bootable.sh replaced the

BugError "Filesystem where the booting related files are on $RAW_USB_DEVICE could not be found"

with more explanatory LogPrintError plus Error what the actual reason is,
similar as in the related
https://github.com/rear/rear/pull/3102
because in most cases it is not a bug in ReaR
but an error (e.g. a user configuration error for OUTPUT=USB).

jsmeix commented at 2023-12-12 16:28:

For documentation here a copy of
a comment from @pcahyna directly at the code changes
which got lost because I changed that code lines via
https://github.com/rear/rear/commit/5d86ec547e03fc8fa7d635b46f642682662004da

Error "Unsupported filesystem $usb_filesystem for the booting related files on $RAW_USB_DEVICE"

This part is a bit misleading.
The purpose of the code is to determine the
filesystem on `$RAW_USB_DEVICE`.
But the code does not use `RAW_USB_DEVICE` at all
except in messages and comments.
It uses `$BUILD_DIR/outputfs` instead.
So, it may well happen that `$RAW_USB_DEVICE` is ext3-formatted,
but there is a XFS filesystem mounted at `$BUILD_DIR/outputfs`
(because of some `OUTPUT_URL` confusion that makes `RAW_USB_DEVICE`
 out of sync with `OUTPUT_URL`).
In that case, the user, when seeing the error message,
will check the filesystem on `$RAW_USB_DEVICE`
and will see that it is ext3, and will wonder
what is ReaR complaining about.

Here also copies of my replies for completeness:

My idea was to show both $BUILD_DIR/outputfs and $RAW_USB_DEVICE
because $BUILD_DIR/outputfs is a meaningless directory like

/var/tmp/rear.Bhlh9JjUxW91R0i/outputfs

so $RAW_USB_DEVICE provides some context to the user
that this two messages are about his USB device.

I will show both $BUILD_DIR/outputfs and $RAW_USB_DEVICE

I will not fix here the possible OUTPUT_URL confusion
that makes RAW_USB_DEVICE out of sync with OUTPUT_URL
because that belongs to a separated overhaul task
of our whole OUTPUT=USB code.

jsmeix commented at 2023-12-12 16:36:

With
https://github.com/rear/rear/commit/5d86ec547e03fc8fa7d635b46f642682662004da
and
https://github.com/rear/rear/commit/87ff78560da4852eabffffa6c0d6be49e0de3e7f
the user may now get error messages like

Could not find a filesystem in /proc/mounts for /var/tmp/rear.Bhlh9JjUxW91R0i/outputfs
ERROR: An ext2/3/4 or vfat filesystem must be mounted for the booting related files on /dev/sdb

or

Only ext2/3/4 and vfat are supported for the booting related files on /dev/sdb
ERROR: Unsupported filesystem xfs is mounted at /var/tmp/rear.Bhlh9JjUxW91R0i/outputfs

Of course this is still not perfect
but at least the user can now see that he should check
/var/tmp/rear.Bhlh9JjUxW91R0i/outputfs and /dev/sdb
so he may hopefully see what does not match in his case.

jsmeix commented at 2023-12-13 14:58:

@pcahyna
provided things are OK for you with my latest commits
I would like to merge it tomorrow afternoon

pcahyna commented at 2023-12-13 15:53:

I am looking at the CI error with centos-stream-8-x86_64 . It is of course not related to the change, but still is something that would be good to debug. As console log in the Testing Farm infrastructure is now enabled, we can look at the console output: https://artifacts.dev.testing-farm.io/a24b1484-7fc1-4e2a-8077-e72f9eea6cca/work-backup-and-restorefmaugnmx/console-68975dbe-0cbe-49ae-aa05-47d8fec8b66f.log . It show this error:

Running workflow recover within the ReaR rescue/recovery system
Running PRE_RECOVERY_SCRIPT 'mkdir /tmp/mnt; mount /dev/nvme0n1p1 /tmp/mnt/; modprobe brd rd_nr=1 rd_size=2097152; dd if=/tmp/mnt//var/lib/rear/output/rear-ip-172-31-16-70.iso of=/dev/ram0; umount /tmp/mnt/;'
/etc/scripts/system-setup: line 255:   545 Killed                  rear $rear_options recover
'eval rear $rear_options recover' results exit code 137
'rear recover' failed with exit code 137
1) View Relax-and-Recover log file(s)
2) Login at the rescue system

It does not tell us what happened, but it at least shows where. I suspect that the fact that we don't see what happened is due to the kernel not printing its log messages, which is due to
https://github.com/rear/rear/blob/b1e9f5e1cbc735ed90526ff7fa54b922fcfa60e7/usr/share/rear/skel/default/etc/scripts/boot#L4
that was added 92d3a15be54137ae0f00f550c5e9e58d68142617 with the commit message Added dmesg -n1 to rescue system without explaining why.
Having kernel log messages on the console would be helpful for debugging filesystem-related errors like #2777 (see #3058 for explanation : when mount fails, the mount commands display only the unhelpful message wrong fs type, bad option, bad superblock on ..., missing codepage or helper program, or other error, and the real cause of the error like XFS (...): logbuf size must be greater than or equal to log stripe size is printed by the kernel to the message log).

Anyway, since the error happens when loading the ISO into a ramdisk (the whole ISO gets loaded into a ramdisk in this test procedure), I suspect that we are out of memory at this point. The failed test shows

Memory: 2052384K/3503836K available

and a successful test ( CentOS-Stream-9 ) shows

Memory: 2931284K/3942236K available

so indeed, successful test seems to have run on a machine with more memory than the failed test.

jsmeix commented at 2023-12-14 07:29:

@pcahyna
thank you for your review!
As always it notably improved things.

jsmeix commented at 2023-12-14 07:40:

FYI:
Intentionally I have '$BUILD_DIR/outputfs'
last in those messages because that is a
meaningless and longish technical term like
'/var/tmp/rear.XXXXXXXXXXXXXXX/outputfs'
so having such stuff at the end helps to make
those messages easier to read and comprehend.


[Export of Github issue for rear/rear.]