#534 Issue closed: 20_recreate_hpraid.sh creates "set -e ... StopIfError" which cannot work

Labels: bug, fixed / solved / done

jsmeix opened issue at 2015-01-21 13:56:

In general a bash script like

#! /bin/sh
set -e
do_something
if ! test "$?" = "0"
then echo "error message"
fi

cannot work because "set -e" exits the script when do_something results non-zero exit code so that the subsequent condition is never reached in this case.

In scripts with "set -e" it has to be as follows:

#! /bin/sh
set -e
do_something || echo "error message"

In particular I suggest to do in 20_recreate_hpraid.sh something like this (but not yet tested by me if I call 'Error' correctly here):

# Unload CCISS module to make sure nothing is using it
rmmod cciss || Error 1 "CCISS failed to unload, something is still using it !"

jsmeix commented at 2015-01-21 13:59:

I general when "set -e" is active no "...IfError" function will work because all the "...IfError" functions test

if (( $? != 0 )); then

jsmeix commented at 2015-01-21 14:03:

In my recovery system I did:

RESCUE f74:~ # find /var/lib/rear /usr/share/rear | xargs grep -l 'set -e' 2>/dev/null
/var/lib/rear/layout/diskrestore.sh
/var/lib/rear/layout/hpraid.sh
/usr/share/rear/verify/DP/default/50_select_dp_restore.sh
/usr/share/rear/verify/DP/default/45_request_gui_restore.sh
/usr/share/rear/layout/prepare/default/54_generate_device_code.sh
/usr/share/rear/layout/prepare/default/20_recreate_hpraid.sh
RESCUE f74:~ # grep 'IfError' /var/lib/rear/layout/diskrestore.sh /var/lib/rear/layout/hpraid.sh /usr/share/rear/verify/DP/default/50_select_dp_restore.sh /usr/share/rear/ve
rify/DP/default/45_request_gui_restore.sh /usr/share/rear/layout/prepare/default/54_generate_device_code.sh /usr/share/rear/layout/prepare/default/20_recreate_hpraid.sh
/var/lib/rear/layout/hpraid.sh:StopIfError "CCISS failed to unload, something is still using it !"
/usr/share/rear/layout/prepare/default/20_recreate_hpraid.sh:StopIfError "CCISS failed to unload, something is still using it !"
RESCUE f74:~ # 

Accordingly it seems 20_recreate_hpraid.sh is the only case where "set -e" is used together with a "...IfError" function.

gdha commented at 2015-01-21 15:48:

@jsmeix thanks for the nice debugging session

jsmeix commented at 2015-01-27 14:33:

Gratien,
I think your commit ae3a930bf26f82e5739c7799e1be0d79444dc19e
cannot work because it also does the same

set -e
...
rmmod cciss
if (( $? != 0 )); then

but when "set -e" is set and "rmmod cciss" fails, the script is exited (with the exit code of "rmmod cciss") so that "if (( $? != 0 )); then" is never reached.

jsmeix commented at 2015-01-28 13:35:

For doumentation what I tested on SLES12:

user@host$ ./test_set-e.sh
Error: no foobar
Unspecific error
ls: cannot access foobar: No such file or directory

user@host$ echo $?
2

user@host$ cat test_set-e.sh
#! /bin/bash
#
function ErrorMsg
{ if test -n "$1"
  then echo "Error: $1"
       return 0
  fi
  echo 'Unspecific error'
  ls foobar
  return 99
}
# From "help set" (using GNU bash, version 4.2.47):
#   -e Exit immediately if a command exits with a non-zero status
set -e
# From "man bash":
#   A simple command is a sequence of optional variable assignments
#     followed by blank-separated words and redirections,
#     and terminated by a control operator.
#   A pipeline is a sequence of one or more commands
#      separated by one of the control operators | or |&.
#   A list is a sequence of one or more pipelines
#     separated by one of the operators ;, &, &&, or ||, and
#     optionally terminated by one of ;, &, or .
# Now it is questionable what "command" means in "help set".
# If "command" in "help set" means "simple command" or even "pipeline"
# then the following "command" would exit immediately
# before the subsequent "echo" or the "ErrorMsg" call
echo 'Hello' | grep -q 'foobar' && echo 'Success' || ErrorMsg 'no foobar'
# so that no subsequent line is reached
echo 'Hello' | grep -q 'foobar' && echo 'Success' || ErrorMsg
echo 'Finished'
# but in practice (using GNU bash, version 4.2.47) the complete output is
#   Error: no foobar
#   Unspecific error
#   ls: cannot access foobar: No such file or directory
# so that we can conclude "command" in "help set" means "list" and
# because the exit code is 2 from "ls foobar" (not 99 from "return 99")
# the "set -e" behaviour also happens within function calls (as expected)
# which means functions must be implemented to work correctly also
# when they are called from a "set -e" environment.

jsmeix commented at 2015-01-29 13:37:

@gdha

See my above testing script.

I like to check whether or not functions in rear are implemented to work correctly also when they are called from a "set -e" environment.

Again (as for "my_udevesettle" in https://github.com/rear/rear/issues/533) I find two "Error" function implementations:

A simple one in usr/share/rear/skel/default/etc/scripts/system-setup.d/00-functions.sh and a more complicated one in usr/share/rear/lib/_input-output-functions.sh

I find this really confusing.

I guess as in https://github.com/rear/rear/issues/533 the one in 00-functions.sh is only meant for the startup script so that the one in usr/share/rear/lib/_input-output-functions.sh is used here?

How can I find out what implementation of a function is used in a particular rear script?

Another question:

Exist already a function that implements waiting until the user did something? I would like to have such a function for debugging. I would add it as a breakpoint in a script so that the script waits there and I can analyze the curent state and when I am finished I do something to let the script continue. For an example see the script in "How does a backup restore script look like?" in
https://en.opensuse.org/SDB:Disaster_Recovery

It has some debugging triggers where it can stop until a particular file is manually deleted so that one can inspect the recreated system at those states (excerpt):

# wait so that one can inspect the system before the actual work is done here:
test -n "$backup_wait_before_restore" && touch $backup_wait_before_restore
while test -e "$backup_wait_before_restore"
do echo "waiting until $backup_wait_before_restore is removed (sleeping 10 seconds)"
   sleep 10
done
# start the actual work:

jsmeix commented at 2020-07-01 13:45:

Via https://github.com/rear/rear/commit/daf35e235d0770c663ff8dba866dddec76586a27
I added an explanatory comment in lib/_input-output-functions.sh
that using the ...IfError functions can result unexpected behaviour in certain cases.


[Export of Github issue for rear/rear.]