#2608 PR merged
: Fix setting boot path in case of UEFI partition (ESP) on MD RAID¶
Labels: enhancement
, bug
, cleanup
, fixed / solved / done
pcahyna opened issue at 2021-04-28 18:09:¶
Pull Request Details:¶
-
Type: Bug Fix
-
Impact: High
-
Reference to related issue (URL): #2595
-
How was this pull request tested?
Backup & restore on a RHEL 8 host with UEFI but without RAID and on a CentOS 7 host with UEFI and RAID (ESP on RAID). Examination of logs to check that correct efibootmgr commands were executed. -
Brief description of the changes in this pull request:
Use dependencies when determining partitions for UEFI
The ESP may be located on a RAID device. In this case, we need to
determine the physical RAID components and call efibootmgr on them
(after restoring - during the finalize step), because efibootmgr does
not know how to tell the firmware to boot from RAID. Moreover, the
current code tries to parse a device like /dev/md125
into a disk
called /dev/md
and partition 125
.
Fix by using storage dependencies: RAID components (the actual ESP copies) can be found as dependencies of the ESP block device. For consistency, use dependencies also to locate the ESP block device itself (the old method is preserved as a fallback). Similar strategy is used by the GRUB2 installation code (finalize/Linux-i386/660_install_grub2.sh), which is able to install GRUB2 on the RAID components correctly.
In addition, clean up usr/share/rear/finalize/Linux-i386/670_run_efibootmgr.sh a bit and add more logging to ease analysis when something goes wrong in this area.
TODO: finalize/Linux-i386/610_EFISTUB_run_efibootmgr.sh needs a similar fix.
Fixes #2595
pcahyna commented at 2021-04-29 18:58:¶
@gdha hello, should it have the enhancement label, when it Fixes #2595, which has the bug label?
pcahyna commented at 2021-04-29 18:59:¶
@rmetrich hello, please review, as you reported the problem.
rmetrich commented at 2021-04-30 10:51:¶
Hi @pcahyna I'm all good with your code, thanks a lot for the effort.
jsmeix commented at 2021-04-30 11:25:¶
I agree that ReaR is expected to restore a software RAID also on UEFI
systems
so the issue is a missing expected functionality a.k.a. "bug".
On the other hand there are zillions of other "reasonably expected"
functionalitries
that are currently missing in ReaR so such issues are also
enhancements
in particular when "reasonably expected" code for that is just not
there.
For me a plain "bug" is only when code that is there does not work as intended.
I.e. it is also not a plain "bug" when code that is there does work as
intended
but the current intention is not what others "reasonably expect" so such
code
is "buggy" and needs some enhancement to behave as "reasonably
expected".
jsmeix commented at 2021-04-30 11:29:¶
@pcahyna
thank you so much for your additional cleanup work like your recent
https://github.com/rear/rear/pull/2608/commits/8228246d16b1374be5cc25fb7a1244c7102b804a
I do very much appreciate it!
It helps so much to get ReaR's code step by step into a better state
where others can easily understand WHY and WHAT and HOW things are done
in ReaR
so that others can easily adapt and enhance ReaR in the future.
pcahyna commented at 2021-04-30 12:37:¶
@jsmeix, glad you like it! You know,
https://despair.com/products/quality
I can also replace the non-standard case identifiers like Dev
and
ParNr
by something that conforms to the style guide, if you think that
it makes sense to do it in the same PR.
jsmeix commented at 2021-04-30 13:10:¶
@pcahyna
yes, feel free to mercilessly replace old non-standard identifiers with
better names
in particular if needed with more explanatory names
cf. "Code must be easy to read" in
https://github.com/rear/rear/wiki/Coding-Style
I appreciate it!
jsmeix commented at 2021-04-30 13:31:¶
I wish you all a relaxed and recovering weekend!
jsmeix commented at 2021-04-30 13:40:¶
Some "famous words" before weekend regarding
https://despair.com/products/quality
The race for quality has no finish line- so technically,
it's more like a death march.
Yes, we all will be dead at some time.
But if we made code as intended by
https://github.com/rear/rear/wiki/Coding-Style
our code can evolve (it may not evolve but it can evolve)
so from our code future code can be derived (this may not happen but it
can happen)
in contrast to code that (amost) nobody understands which will be
completely removed
and replaced by other completely new code from scratch
so code that is hard to understand dies out
while code that is easy to understand can have descendants.
jsmeix commented at 2021-05-03 12:48:¶
It seems there is a dilemma where in the code
the Creating EFI Boot Manager entries...
info
should be done.
Current finalize/Linux-i386/670_run_efibootmgr.sh code in this pull request is (excerpts)
LogPrint "Creating EFI Boot Manager entries..."
...
if [ "$esp_mountpoint" = "/" ] ; then
esp_mountpoint="$TARGET_FS_ROOT/boot/efi"
Log "Mountpoint of $TARGET_FS_ROOT/$UEFI_BOOTLOADER not found, trying $esp_mountpoint"
fi
...
test -d "$esp_mountpoint" || return 0
so it could tell the user Creating EFI Boot Manager entries...
but then it exits silently and does nothing.
I think when test -d "$esp_mountpoint"
fails is must no longer exit
silently
so a matching message should be added like
if ! test -d "$esp_mountpoint" ; then
LogPrint "Skip creating EFI Boot Manager entries (no ESP mountpoint found)"
fi
Or my
https://github.com/rear/rear/pull/2608/files#r622957244
was wrong and nothing should be shown to the user about
Mountpoint of /mnt/local/UEFI_BOOTLOADER not found, trying /mnt/local/boot/efi
and the LogPrint "Creating EFI Boot Manager entries..."
should be moved after the test -d "$esp_mountpoint" || return 0
.
pcahyna commented at 2021-05-03 13:35:¶
I think when
test -d "$esp_mountpoint"
fails is must no longer exit silently
This part caught my attention as well. Is it even wise to return 0 in
this case? At this point we know that a UEFI bootloader should be used,
so skipping the rest of the file is almost certainly a serious error.
(Not so much related to the subject of the PR anymore though. For now, I
can add the other LogPrint as you suggest.)
pcahyna commented at 2021-05-03 14:07:¶
For now, I can add the other LogPrint as you suggest
Done (I changed the message a bit.)
jsmeix commented at 2021-05-03 15:07:¶
@pcahyna
regarding your
https://github.com/rear/rear/pull/2608#issuecomment-831263524
At this point we know that a UEFI bootloader should be used,
so skipping the rest of the file is almost certainly a serious error.
Then show it as an error to the user so he is informed something went
wrong
(e.g. use wording like failed to ...
or cannot ...
that tells
something went wrong)
and the user knows he may have to fix something manually after "rear
recover"
and leave the script with return 1
(the Source
function doesn't
abort)
but do not error out of "rear recover", cf.
finalize/Linux-i386/660_install_grub2.sh
https://github.com/rear/rear/blob/master/usr/share/rear/finalize/Linux-i386/660_install_grub2.sh
pcahyna commented at 2021-05-03 15:26:¶
Then show it as an error to the user so he is informed something went wrong
(e.g. use wording likefailed to ...
orcannot ...
that tells something went wrong)
and the user knows he may have to fix something manually after "rear recover"
and leave the script withreturn 1
I can do that, but I feel this deviates a lot from the subject of this PR (the questionable return 0 is inherited from the original code).
pcahyna commented at 2021-05-03 20:57:¶
@jsmeix I believe that I addressed all your comments except for the
return 0
in some places where it had existed before my changes. Please
review again. If everything is ok, I will force push to squash some
commits together.
pcahyna commented at 2021-05-03 20:59:¶
@pcahyna
yes, feel free to mercilessly replace old non-standard identifiers with better names
in particular if needed with more explanatory names
cf. "Code must be easy to read" in
https://github.com/rear/rear/wiki/Coding-StyleI appreciate it!
One of the last commits does that now.
jsmeix commented at 2021-05-04 12:01:¶
@pcahyna
thank you for all your work here - in particular for your additional
cleanup work!
jsmeix commented at 2021-05-04 12:02:¶
@rear/contributors
if there are no objections I would like to merge it tomorrow afternoon.
jsmeix commented at 2021-05-04 12:15:¶
@pcahyna
the return 0
when there is
no ESP mountpoint directory $esp_mountpoint
is from me via
https://github.com/rear/rear/commit/cb6fe3be719c1361b22d0484cd940c681752a973
I am sure I could not think about if return 1
could be better for this
particular case
because at that time I did not know your reasoning why return 1
is
right here in your
https://github.com/rear/rear/pull/2608#issuecomment-831263524
pcahyna commented at 2021-05-04 12:51:¶
@jsmeix are there any guidelines for return codes from scripts? ReaR
does not seem to care that much, actually. There is just
Debug "Source function: 'source $source_file' returns $source_return_code"
in the Source()
function. So it looks like returning 1 when problems
are discovered is not enough to get them noticed by the user and one
should complain very loudly (LogPrintError
, probably).
jsmeix commented at 2021-05-04 13:09:¶
@pcahyna
yes, what you describe in your
https://github.com/rear/rear/pull/2608#issuecomment-831916709
is exactly the current state in ReaR's code.
See also
https://github.com/rear/rear/pull/1970
So currently there is no difference in user experience between
leaving a script with return 0
versus return 1
.
But for those who read the code of the script it makes clear
if things were expected and all is well with leaving the script at that
place
versus an error-like issue happened why the script cannot be continued
(which is not a fatal error to Error
out of the whole
rear <workflow>
).
It would take some longer time until all scripts are fixed that
currently
exit with non-zero exit code by accident when the last command
in the script exits with non-zero exit code which is the reason why
there is no user alert when a script exits with non-zero exit code.
Currently it is intentionally only a Debug
message because
some users inspect the ReaR log and raise issues here
for all what looks like an error to them ;-)
See
https://github.com/rear/rear/issues/2416
and as an example the therein mentioned
https://github.com/rear/rear/issues/2473
and
https://github.com/rear/rear/issues/2479
pcahyna commented at 2021-05-04 16:31:¶
@jsmeix please check if you approve the latest change, esp. the wording of the error messages. If yes, I consider this PR functionally complete and I will rebase and force-push the branch to squash related commits together.
jsmeix commented at 2021-05-05 07:18:¶
@pcahyna
thank you for your final adaptions and enhancements.
Even if those kind of smaller things might look unimportant or
nitpicking
I appreciate them so much because at least for me such smaller things
are important
because they can make a bigger difference how others can understand what
goes on.
I dared to do some further tiny adaptions and enhancements regarding
such smaller things
via my recent commits here - I hope that is OK for you @pcahyna
Now all looks very well to me
and I would like to merge it today afternoon
unless there are objections from other @rear/contributors
jsmeix commented at 2021-05-05 07:35:¶
@pcahyna
no need to rebase and force-push the branch to squash related commits
together.
Just keep the individual development commits as is.
The merge commit will automatically be the one that has all of them in one commit.
I even prefer when all the individual development commits are kept as
is.
It preserves the development history in the master branch
i.e. I perfer to not remove information unless there is a serious reason
to do so.
In my local git clone https://github.com/rear/rear.git
I often use
git log --format="%ae %H %ad%n%s :%n%b%n" --graph | fmt -w 120 -u -t | less
to show the whole git history in the master branch and there
all the individual development commits often help me
to better understand in retrospect why things were changed
without the need to dig out the GitHub issue and pull request
and read through all the details there.
Having all the individual development commits in the master branch
history
is often even crucial to understand in retrospect what was actually
changed
because the merge commit message is often only rather meaningless, e.g.:
* | | | gratien.dhaese@gmail.com 285c0c7660caca184ea53dbec5525a69af1fdcc9 Tue Jan 26 14:44:39 2021 +0100
|\ \ \ \ Merge pull request #2544 from pcahyna/netbackup-vxpbx_exchanged :
| | | | | Changes for NetBackup (NBU) support
| | | | |
| * | | | pcahyna@redhat.com 9a0fcbfd0dcfecdfce5921e24a89e3918d962467 Wed Dec 16 11:08:14 2020 +0100
| | | | | Changes for NetBackup (NBU) support :
| | | | | Add /usr/openv/tmp directory to the NBU skeleton
| | | | |
| | | | |
| * | | | pcahyna@redhat.com 0c6909ed2c37dfa92f87b0714b62af7cae4720c9 Wed Dec 16 10:40:55 2020 +0100
| |/ / / Changes for NetBackup (NBU) support :
| | | | Copy NetBackup PBX related files to the rescue system and start
| | | | vxpbx_exchanged on boot
The merge commit message Changes for NetBackup (NBU) support
alone
tells nothing what was changed.
pcahyna commented at 2021-05-05 08:35:¶
@jsmeix
your newest commits are ok. But I don't entirely agree with
no need to rebase and force-push the branch to squash related commits together.
Just keep the individual development commits as is.
There are commits marked as squash! and fixup! and those were intended to be squashed with others. My intent was to preserve the history, but not all of the commits, there were commits that merely amend earlier commits and the history would IMO look better with them squashed into earlier commits. The squash! and fixup! log messages will look particularly ugly in the history because they were not intended to be part of the final history.
jsmeix commented at 2021-05-05 08:44:¶
@pcahyna
feel free to squash your commits as you like
provided the commit messages of the commits you finally keep
still show sufficient information so that others can understand in
retrospect
what was changed without the need to dig out the GitHub issue
and pull request and read through all the details there.
In particular we need that understanding in retrospect
when we make the doc/rear-release-notes.txt
https://github.com/rear/rear/blob/master/doc/rear-release-notes.txt
cf. "compile the information for the release notes" in
https://github.com/rear/rear/wiki/Release-process
pcahyna commented at 2021-05-05 11:28:¶
@jsmeix rebase done.
jsmeix commented at 2021-05-05 12:04:¶
@pcahyna
thank you.
I will merge it soon today.
pcahyna commented at 2021-05-05 19:08:¶
Oops. I accidentally made the file executable in 27de4cd95dabe86f531f132108744459b2dbcfd0 @jsmeix can you please revert the mode change, or should I submit a new PR reverting it?
jsmeix commented at 2021-05-07 09:44:¶
@pcahyna regarding executable
finalize/Linux-i386/670_run_efibootmgr.sh
you could simply submit it with mode -rw-r--r--
as "by the way"
addon
via your next PR without mentioning it in a commit message
(but you can mention it in a separated PR comment).
[Export of Github issue for rear/rear.]