#2218 PR merged
: Do not keep the build dir when used noninteractively¶
Labels: cleanup
, fixed / solved / done
pcahyna opened issue at 2019-08-22 09:59:¶
Print helpful messages about keeping/not keeping the build directory in case of errors.
Relax-and-Recover (ReaR) Pull Request Template¶
Please fill in the following items before submitting a new pull request:
Pull Request Details:¶
-
Type: Bug Fix
-
Impact: High
-
Reference to related issue (URL):
#2121, https://bugzilla.redhat.com/show_bug.cgi?id=1693608 -
How was this pull request tested?
yum -y install rear
echo 'COPY_AS_IS+=( /usr/bin/xterm )' >> /etc/rear/local.conf
yum -y install xterm
rpm -e --nodeps libXaw.x86_64 # don't do it on a workstation
This makes ReaR complain about broken xterm and abort. To test it interactively:
rear mkrescue
Result:
Broken symlink '/usr/lib/modules/3.10.0-1077.el7.x86_64/build' in recovery system because 'readlink' cannot determine its link target
Broken symlink '/usr/lib/modules/3.10.0-1077.el7.x86_64/source' in recovery system because 'readlink' cannot determine its link target
There are binaries or libraries in the ReaR recovery system that need additional libraries
/bin/xterm requires additional libraries (fatal error)
libXaw.so.7 => not found
Build area kept for investigation in /tmp/rear.YiON7zQALRvXj6Q, remove it when not needed
ReaR recovery system in '/tmp/rear.YiON7zQALRvXj6Q/rootfs' needs additional libraries, check /root/rear/var/log/rear/rear-ci-vm-10-0-137-193.log for details
ERROR: ReaR recovery system in '/tmp/rear.YiON7zQALRvXj6Q/rootfs' not usable (required libraries are missing)
Some latest log messages since the last called script 990_verify_rootfs.sh:
stty is /bin/stty
parted is /bin/parted
partprobe is /bin/partprobe
wipefs is /bin/wipefs
mkfs is /bin/mkfs
mkfs.xfs is /bin/mkfs.xfs
xfs_admin is /bin/xfs_admin
ldconfig is /bin/ldconfig
Aborting due to an error, check /root/rear/var/log/rear/rear-ci-vm-10-0-137-193.log for details
Terminated
To test noninteractively:
crontab -e
#add
#*/5 * * * * /usr/sbin/rear mkrescue
and wait several minutes. Mail from cron will read:
There are binaries or libraries in the ReaR recovery system that need additional
libraries
Build area /tmp/rear.pHJTicEpfOCMZVf will be removed
To preserve it for investigation set KEEP_BUILD_DIR=1 or run ReaR with -d
/bin/xterm requires additional libraries (fatal error)
libXaw.so.7 => not found
ERROR: ReaR recovery system in '/tmp/rear.pHJTicEpfOCMZVf/rootfs' not usable
Aborting due to an error, check /var/log/rear/rear-ci-vm-10-0-137-193.log for details
- Brief description of the changes in this pull request:
The rear package in Fedora and derived distributions drops a file to /etc/cron.d which runsrear mkrescue
.
When the recovery system is broken (an example is #2144),rear mkrescue
will fail and leave the build area behind, without giving a hint that the build area should be removed (because this hit is printed only when running rear with the -v flag).
This change:- does not disable the removal of the build area when used noninteractively - instead it gives a hint how to keep the build area should one really want it
- when the build area is kept, print a more prominent message, even when run without -v.
pcahyna commented at 2019-08-22 10:11:¶
The description of KEEP_BUILD_DIR from #2121 should be updated together with this change.
pcahyna commented at 2019-08-28 10:53:¶
The description of KEEP_BUILD_DIR from #2121 should be updated together with this change.
Done.
pcahyna commented at 2019-08-29 09:45:¶
according to a review by @xjezda00, I introduced a new variable
KEEP_BUILD_DIR_ON_ERRORS
which can be used to restore the previous
behaviour of keeping the build dir on errors always by setting it to
yes
.
pcahyna commented at 2019-08-29 10:45:¶
see https://bugzilla.redhat.com/show_bug.cgi?id=1693608#c4
jsmeix commented at 2019-08-29 11:16:¶
I am still not in the office and will not be for some more weeks
so I cannot actually work on ReaR.
jsmeix commented at 2019-08-29 11:40:¶
Offhandedly I think I did it wrong to set KEEP_BUILD_DIR=1 in a
script
because this is against what the user specified and I am in general
against it
when ReaR works in an automated way against the user because I want
the user to always have the final power.
So if the user wants to keep the build directory for whatever reason
he can run ReaR with the -D option or he can specify
KEEP_BUILD_DIR=1
in his etc/rear/local.conf file.
So from my current point of view I think the right, clean, and simple
solution is
to remove the KEEP_BUILD_DIR=1 settings from the
990_verify_rootfs.sh script.
Furthermore that I had not added a comment that explains why I thought
it would be right to set KEEP_BUILD_DIR=1 in 990_verify_rootfs.sh
(which is against
https://github.com/rear/rear/wiki/Coding-Style)
basically proves that I had done that without a real good reason.
Finally that the current attempts to solve it result more and more
complicated code indicate that the actual root cause is "something
else"
(i.e. the actual root cause are my wrong hardcoded KEEP_BUILD_DIR=1
settings in the 990_verify_rootfs.sh script).
pcahyna commented at 2019-08-29 11:53:¶
Offhandedly I think I did it wrong to set KEEP_BUILD_DIR=1 in a script
I found it strange as well. Just removing it would be definitely a
simplification, but it would be less helpful for debugging of errors, as
@xjezda00 pointed out in
https://bugzilla.redhat.com/show_bug.cgi?id=1693608#c4
. I would recommend keeping the error message part of the PR if this
solution is accepted:
https://github.com/rear/rear/pull/2218/files#diff-14e72ffdba2836896118ed562369b421R27
jsmeix commented at 2019-08-29 11:53:¶
@schlomo
I added you as reviewer here because I would like to get your opinion
about the generic issue that lays behind here:
I had added a hardcoded KEEP_BUILD_DIR=1 in 990_verify_rootfs.sh
when that script detects an error in the ReaR recovery system so that
the user could directy inspect the contents of the recovery system
without the need to re-run "rear mkrescue" with the -D option
or with KEEP_BUILD_DIR=1 explicitly set by the user.
But this way it works in case of an error against what the user had
specified when he runs it without -D or without KEEP_BUILD_DIR=1.
jsmeix commented at 2019-08-29 12:19:¶
@pcahyna
more clear messages that help the user to better understand what goes
on
are much appreciated.
By the way:
Another generic improvement could be to make KEEP_BUILD_DIR a ternary
variable
(but that requires changes at all places where KEEP_BUILD_IR is
used)
so that the user gets real final power to enforce "yes" or "no" as he
wants.
Currently KEEP_BUILD_DIR="no" behaves same as KEEP_BUILD_DIR=1
because of the "non-empty means yes" boolean behaviour in ReaR, cf.
https://github.com/rear/rear/blob/master/usr/share/rear/lib/framework-functions.sh#L124
which is confusing for the user and currently we cannot decide in our
code
if the user actually wants "no" when KEEP_BUILD_DIR is empty.
I.e. currently we cannot distinguish between the default value
and an enforced user setting.
pcahyna commented at 2019-08-29 12:23:¶
@pcahyna
more clear messages that help the user to better understand what goes on
are much appreciated.By the way:
Another generic improvement could be to make KEEP_BUILD_DIR a ternary variable
(but that requires changes at all places where KEEP_BUILD_IR is used)
so that the user gets real final power to enforce "yes" or "no" as he wants.Currently KEEP_BUILD_DIR="no" behaves same as KEEP_BUILD_DIR=1
because of the "non-empty means yes" boolean behaviour in ReaR, cf.
https://github.com/rear/rear/blob/master/usr/share/rear/lib/framework-functions.sh#L124
which is confusing for the user and currently we cannot decide in our code
if the user actually wants "no" when KEEP_BUILD_DIR is empty.
I.e. currently we cannot distinguish between the default value
and an enforced user setting.
That was my thinking as well. I wanted to use the empty value as meaning "not set by user", but then I realized that it already means "false". Unfortunately, switching it to ternary means a compatibility break. By the way, your reasoning that "non-empty means yes" is confusing for the user is valid, but applies to most of the variables, because this boolean (not ternary) semantics is very common in ReaR.
jsmeix commented at 2019-08-29 12:37:¶
@pcahyna
I don't think that switching it to ternary means a compatibility break
as long as "empty" means "no" by default/fallback
(and "yes" if the -D option is used).
But an explicit KEEP_BUILD_DIR="no" would enforce
that BUILD_DIR is never kept.
This way we would be able to set KEEP_BUILD_DIR="yes" in a script
but only if it is not already set to "no".
And then we might even still set it in 990_verify_rootfs.sh as
follows
(provided we come to the conclusion it is really a good idea to do that)
if ERROR_CONDITION ; then
# Keep BUILD_DIR for easier debugging of the error
# unless the user had explicitly specified he never wants that:
is_false "$KEEP_BUILD_DIR" || KEEP_BUILD_DIR="yes"
Error "..."
pcahyna commented at 2019-08-29 14:14:¶
@jsmeix Indeed, this we do not break compatibility in the usual case,
where the user sets KEEP_BUILD_DIR to a string accepted by is_true
to indicate yes
. (Potential problem is, anything that is not false and
is not empty is now treated as true, which would not be compatible with
the change. (A particularly silly case is if the user sets
KEEP_BUILD_DIR to 0 or "no" to mean "yes".))
Still, your snippet does not address the concern that by default, automated runs of ReaR clutter /tmp on errors. We could do
if ! is_false "$KEEP_BUILD_DIR" && tty -s; then
KEEP_BUILD_DIR="yes"
but it would be not possible to get the previous behaviour (keep the build dir on error only) - @xjezda00 's concern.
if ! is_false "$KEEP_BUILD_DIR" && { tty -s || test "$KEEP_BUILD_DIR" == errors; }; then
KEEP_BUILD_DIR="yes"
would do it (there would be a special value "errors" which would indicate, keep it only on errors).
pcahyna commented at 2019-08-29 15:30:¶
What to do with scripts that check the value of KEEP_BUILD_DIR before 990_verify_rootfs.sh modifies it? Such as build/OPALPBA/Linux-i386/391_list_executable_dependencies.sh ( https://github.com/rear/rear/pull/1659/files#diff-94b2089c3ecae157af07001369db4783 ), which is now maybe the only one, but in the future there may be more of them.
jsmeix commented at 2019-08-30 17:15:¶
@pcahyna
I would assume that users who set KEEP_BUILD_DIR to a strange value
have to expect strange things ;-)
When someone sets it to "no" and now it behaves this way,
I think this change is rather a fix than a regression.
When someone sets it to 'X' that neither is_true
nor is_false
he wold get the default behaviour which is a minor regression for him.
Of course we would document such a change in the release notes
so the users are informed.
I think we cannot keep 100.0% backward compatibility in any case
when there is a good reason for a change with minimal regressions.
We did several such changes in the past, cf. the various
"backward incompatible changes" parts in our release notes
https://github.com/rear/rear/blob/master/doc/rear-release-notes.txt
I like variables that have in the end a boolean 'yes/no' meaning
but also support enhanced ways to specify more exactly how things are
meant
so I like it when KEEP_BUILD_DIR="errors" means "yes but only in case
of errors".
I do not see a problem with scripts that check KEEP_BUILD_DIR.
For example if is_true $KEEP_BUILD_DIR; then
in
https://github.com/rear/rear/pull/1659/files#diff-94b2089c3ecae157af07001369db4783
looks fine to me in particular when KEEP_BUILD_DIR gets ternary
semantics.
pcahyna commented at 2019-08-30 17:22:¶
I like variables that have in the end a boolean 'yes/no' meaning
but also support enhanced ways to specify more exactly how things are meant
so I like it when KEEP_BUILD_DIR="errors" means "yes but only in case of errors".
Ok, I will update the PR to introduce this semantics instead of
KEEP_BUILD_DIR_ON_ERRORS
.
I do not see a problem with scripts that check KEEP_BUILD_DIR.
For exampleif is_true $KEEP_BUILD_DIR; then
in
https://github.com/rear/rear/pull/1659/files#diff-94b2089c3ecae157af07001369db4783
looks fine to me in particular when KEEP_BUILD_DIR gets ternary semantics.
The problem is that this script checks the value, then 990_verify_rootfs.sh changes it to something else (it is sourced after).
jsmeix commented at 2019-08-30 17:52:¶
@pcahyna
you are right.
I had thought 990_verify_rootfs.sh sets KEEP_BUILD_DIR=1
only when it errors out.
But now I see there are two cases where 990_verify_rootfs.sh
sets KEEP_BUILD_DIR=1 but does not error out later:
if contains_visible_char "$broken_binaries" ; then
LogPrintError "There are binaries or libraries in the ReaR recovery system that need additional libraries"
KEEP_BUILD_DIR=1
...
LogPrintError "$binary requires additional libraries (specified as non-fatal)"
...
LogPrintError "$binary requires additional libraries"
In those two cases fatal_missing_library="yes"
is not set.
This needs to be corrected because there is no reason
to automatically keep BUILD_DIR when there is no error.
jsmeix commented at 2019-08-30 18:01:¶
I mean that in those two cases where fatal_missing_library="yes"
is (correctly) not set, it needs to be corrected that
KEEP_BUILD_DIR=1
is set.
E.g. the KEEP_BUILD_DIR=1 at the beginning of the if
clause
must be removed and at each place where fatal_missing_library="yes"
is
set
also KEEP_BUILD_DIR=1 must be set.
Alternatively KEEP_BUILD_DIR=1 can be set directly before the BugError
and Error calls
at the end of the script.
pcahyna commented at 2019-09-02 12:27:¶
@jsmeix I had something else in mind than the problem you have found.
The issue is that
build/OPALPBA/Linux-i386/391_list_executable_dependencies.sh
gets
sourced, KEEP_BUILD_DIR
is not true, then 990_verify_rootfs.sh
gets
sourced and can set KEEP_BUILD_DIR
to true. But
build/OPALPBA/Linux-i386/391_list_executable_dependencies.sh
behaved
as if it were not true, i.e. the behaviour was inconsistent with the
final value.
pcahyna commented at 2019-09-03 10:36:¶
@jsmeix I pushed a commit to address your last comment.
pcahyna commented at 2019-09-03 12:21:¶
@jsmeix I pushed a commit to implement what I proposed (changes to default.conf to be done).
jsmeix commented at 2019-09-03 12:31:¶
As far as I understand the code in
build/OPALPBA/Linux-i386/391_list_executable_dependencies.sh
the intent is to do that when the user had set KEEP_BUILD_DIR.
@OliverO2 may explain his intent behind here in more detail.
In contrast we set KEEP_BUILD_DIR automatically in
990_verify_rootfs.sh only when it errors out because
here the intent is to make debugging easier for the user.
So I think that both intents are independent of each other
so that the default behaviour does not need to be the same
for different intents.
For me a crucial point is "final power to the user" so that an
explicit user setting KEEP_BUILD_DIR="no" must be obeyed
which means we must not automatically set KEEP_BUILD_DIR=1
i.e. ReaR must not work against what its user has specified.
pcahyna commented at 2019-09-03 12:53:¶
@jsmeix
For me a crucial point is "final power to the user" so that an
explicit user setting KEEP_BUILD_DIR="no" must be obeyed
which means we must not automatically set KEEP_BUILD_DIR=1
i.e. ReaR must not work against what its user has specified.
this should be the effect of the last changes. I rebased the last commit to include updates to the description in default.conf, so it is ready to be reviewed, and, after testing, merged.
jsmeix commented at 2019-09-03 15:10:¶
@rear/contributors
I am still not in the office and will not be for some more weeks
so I cannot actually test anything in ReaR.
Therefore I would appreciate it when one of you
could also review it and merge it if it is o.k.
gdha commented at 2019-09-13 06:31:¶
@rear/contributors for me it is ok and for you?
rmetrich commented at 2019-09-13 07:35:¶
@gdha It's ok for me
jsmeix commented at 2019-09-17 11:15:¶
@gdha @rmetrich
thank you for your review.
[Export of Github issue for rear/rear.]