#3171 PR merged: Make OS detection from /etc/os-release more robust

Labels: enhancement, bug, cleanup, fixed / solved / done

lzaoral opened issue at 2024-02-29 16:28:

Relax-and-Recover (ReaR) Pull Request Template

Please fill in the following items before submitting a new pull request:

Pull Request Details:
  • Type: Bug Fix

  • Impact: High

  • Reference to related issue (URL): https://github.com/rear/rear/issues/3149

  • How was this pull request tested? rear dump and then rescue on x86_64 Fedora Rawhide and RHEL 9.4

  • Description of the changes in this pull request - Notable changes:

    1. Search for rhel when detecting RHEL

      The redhat substring is present only in auxiliary variables and URLs in /etc/os-release both on Fedora and RHEL. Therefore, the grep worked only by accident on RHEL and was broken on Fedora because it wrongly classified Fedora as RHEL because its BUG_REPORT_URL refers to Red Hat's Bugzilla.

      On the other hand, rhel is the actual ID of RHEL and this string is
      not present in Fedora's /etc/os-release at all.

      Resolves: https://github.com/rear/rear/issues/3149#issuecomment-1964290116

    2. Always source a script in SourceStage at most once

      sort -u makes sure that each relevant script is listed and sourced
      only and that duplicates get discarded.

      Otherwise, some scripts could have been executed repeatedly (see the last
      linked comment). Especially, when OS_VENDOR and OS_MASTER_VENDOR have the
      same value which is the case at least on Fedora.

      Related: https://github.com/rear/rear/pull/3171#pullrequestreview-2053278291
      Resolves: https://github.com/rear/rear/issues/3149#issuecomment-1935586311

lzaoral commented at 2024-03-01 09:28:

Thank you for the analysis, @jsmeix! I've pushed changes that rename Arch to Arch_Linux and completely remove RedHatEnterpriseServer. I still have to test these changes on an actual RHEL machine so I'll let you know ASAP if the recovery is still ok.

lzaoral commented at 2024-03-01 10:04:

Good news is that the changes recovered the machine successfully. However, I've just remembered that ReaR uses OS_VENDOR for the name of the recreated EFI boot entry so we will not get Fedora for all Fedora-like distributions.

@pcahyna What do you think?

rvdkraan commented at 2024-03-01 10:35:

Hi
Clould this OS detection fix also solve the root problem for #3170 ?
Which is in my opinion also related to OS detection.

lzaoral commented at 2024-03-01 11:48:

Hey @rvdkraan! Unfortunately, this PR will not fix your issues because both the OS_VENDOR and OS_MASTER_VENDOR variables from your report were already correct.

jsmeix commented at 2024-03-06 13:02:

@pcahyna
could you please have a look here (as time permits), cf.
https://github.com/rear/rear/pull/3171#issuecomment-1972887597

pcahyna commented at 2024-03-07 11:18:

@jsmeix sorry for the delay, I will have a look.

jsmeix commented at 2024-03-20 12:38:

Some further thoughts regarding the
get_var_from_file() function:

Because at least at first glance it looks rather confusing
what $0 $1 $2 actually is when looking at the
get_var_from_file() function code, better tell in a comment
how get_var_from_file() needs to be called,
e.g. like

# get_var_from_file file_name variable_name

I think it would be good to BugError()
when $file_name is no readable file
and/or when $variable_name is empty.

Perhaps it is somehow possible to let
get_var_from_file() provide a useful exit code.
Currently it always results exit code 0 because
echo something is (basically) always successful:

# function get_var_from_file() { bash -c 'source "$0"; echo "${!1}"' "$1" "$2"; }

# get_var_from_file /etc/os-release ID && echo OK || echo FAIL
opensuse-leap
OK

# get_var_from_file qqq ID && echo OK || echo FAIL
qqq: qqq: No such file or directory

OK

# get_var_from_file /etc/os-release QQQ && echo OK || echo FAIL

OK

# cat my-invalid-assignments
if then
VAR="value"

# get_var_from_file my-invalid-assignments VAR && echo OK || echo FAIL
my-invalid-assignments: line 1: syntax error near unexpected token `then'
my-invalid-assignments: line 1: `if then'

OK

Therefore I suggest a better implementation
of the actual get_var_from_file() code body:

# function get_var_from_file() { bash -c 'source "$0" && echo "${!1}" || false' "$1" "$2" ; }

# get_var_from_file my-invalid-assignments VAR && echo OK || echo FAIL
my-invalid-assignments: line 1: syntax error near unexpected token `then'
my-invalid-assignments: line 1: `if then'
FAIL

# get_var_from_file /etc/os-release QQQ && echo OK || echo FAIL

OK

# get_var_from_file qqq ID && echo OK || echo FAIL
qqq: qqq: No such file or directory
FAIL

# get_var_from_file /etc/os-release ID && echo OK || echo FAIL
opensuse-leap
OK

An offhanded proposal how to let get_var_from_file()
return non-zero exit code when $variable_name is not assigned:

# function get_var_from_file() { bash -c 'source "$0" || false ; test -v "$1" && echo "${!1}" || false' "$1" "$2" ; }

# get_var_from_file /etc/os-release ID && echo OK || echo FAIL
opensuse-leap
OK

# get_var_from_file /etc/os-release QQQ && echo OK || echo FAIL
FAIL

# get_var_from_file qqq ID && echo OK || echo FAIL
qqq: qqq: No such file or directory
FAIL

# get_var_from_file my-invalid-assignments VAR && echo OK || echo FAIL
my-invalid-assignments: line 1: syntax error near unexpected token `then'
my-invalid-assignments: line 1: `if then'
FAIL

jsmeix commented at 2024-03-20 12:41:

@lzaoral
I am afraid - again it gets long and longer.
Hopefully you don't mind too much
when I make such kind of pedantic comments?

lzaoral commented at 2024-03-21 11:25:

Hopefully you don't mind too much when I make such kind of pedantic comments?

@jsmeix Not at all, both yours and @pcahyna's suggestions are very constructive!

lzaoral commented at 2024-03-21 12:06:

@jsmeix A bit shorter approach is to use the -u shell option. Also, source "$0" || false ; ... will not halt the execution immediately in your second implementation.

This should return 0 if the sourcing succeeded and the given variable is set. Otherwise, it returns 1 (the final || return 1 is necessary because bash returns 127 on unbound variables).

$ function get() { bash -c 'source "$0" || exit 1; set -u; echo "${!1}"' "$1" "$2" || return 1 }
$ get /etc/os-release ID; echo $?
fedora
0
$ get /etc/osrelease ID; echo $?
/etc/osrelease: line 1: /etc/osrelease: No such file or directory
1
$ get /etc/os-release IDD; echo $?
/etc/os-release: line 1: !1: unbound variable
1

On the other hand, if it would be ok to fail on any problems in the sourced file as well, bash -euc ... may be enough.

jsmeix commented at 2024-03-22 13:16:

I like bash -euc very much because it is simple
and it makes useful use of the "bash in between"
which was up to now more a workaround for source
that has a problem with readonly variables,
cf. our "late but helpful party guest" ;-) comment
https://github.com/rear/rear/pull/3171#discussion_r1521750947

So I played around with bash -euc and
this seems to behave best at least for my tests:

# function get_var_from_file() { bash -eu -c 'source "$0" && echo "${!1}"' "$1" "$2" ; }

FYI: Intentionally I separated -eu form -c to make it
more clear that there is different meaning of arguments
and I used source ... && echo to make it more clear
that successful source is a condition for echo
(instead of having this condition as an implicit result of -e).

During my tests how it behaves in error cases
I was thinking about that it could be helpful for debugging
when also within the bash inside that function
ReaR's outer DEBUGSCRIPTS setting would be supported,
so I enhanced it further to

# function get_var_from_file() { test "$DEBUGSCRIPTS" && local debug="-x" ; bash $debug -eu -c 'source "$0" && echo "${!1}"' "$1" "$2" ; }

That seems to behave well at least for me on command line

# unset myvar ; myvar="$( get_var_from_file qqq VAR && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
qqq: qqq: No such file or directory
FAIL with 1
declare -- myvar=""

# unset myvar ; myvar="$( get_var_from_file my-invalid-assignments VAR && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
my-invalid-assignments: line 1: syntax error near unexpected token `then'
my-invalid-assignments: line 1: `if then'
FAIL with 1
declare -- myvar=""

# unset myvar ; myvar="$( get_var_from_file my-shell-assignments QQQ && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
my-shell-assignments: !1: unbound variable
FAIL with 1
declare -- myvar=""

# unset myvar ; myvar="$( get_var_from_file my-shell-assignments VAR && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
OK
declare -- myvar="value"

# DEBUGSCRIPTS=1

# unset myvar ; myvar="$( get_var_from_file qqq VAR && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
+ source qqq
qqq: qqq: No such file or directory
FAIL with 1
declare -- myvar=""

# unset myvar ; myvar="$( get_var_from_file my-invalid-assignments VAR && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
+ source my-invalid-assignments
my-invalid-assignments: line 1: syntax error near unexpected token `then'
my-invalid-assignments: line 1: `if then'
FAIL with 1
declare -- myvar=""

# unset myvar ; myvar="$( get_var_from_file my-shell-assignments QQQ && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
+ source my-shell-assignments
++ VAR=value
++ STRING='lvalue = rvalue'
++ ARR=('first element = 1' 'second element = 2' 'last element = 3')
my-shell-assignments: !1: unbound variable
FAIL with 1
declare -- myvar=""

# unset myvar ; myvar="$( get_var_from_file my-shell-assignments VAR && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
+ source my-shell-assignments
++ VAR=value
++ STRING='lvalue = rvalue'
++ ARR=('first element = 1' 'second element = 2' 'last element = 3')
+ echo value
OK
declare -- myvar="value"

The last two examples (i.e. when 'source' succeeds with DEBUGSCRIPTS=1)
show that the whole sourced file gets output via 'set -x'
which might be unwanted e.g. when that file is huge and
which might expose (sensitive) variable values in ReaR's log file
(where stderr is redirected within ReaR's runtime environment).

So we must also test it within ReaR's runtime environment.

Perhaps it is best to "Keep It Simple and Secure"
and omit such transitive DEBUGSCRIPTS setting?

For completeness:
Tests with added EMPTY and BLANK variables and
with STRING (correct) and ARR (wrong) results
(with DEBUGSCRIPTS=1):

# cat my-shell-assignments 
VAR="value"
EMPTY=
BLANK=' '
STRING="lvalue \
= \
rvalue"
ARR=( 'first element = 1'
      'second element = 2'
      'last element = 3' )

# unset myvar ; myvar="$( get_var_from_file my-shell-assignments EMPTY && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
+ source my-shell-assignments
++ VAR=value
++ EMPTY=
++ BLANK=' '
++ STRING='lvalue = rvalue'
++ ARR=('first element = 1' 'second element = 2' 'last element = 3')
+ echo ''
OK
declare -- myvar=""

# unset myvar ; myvar="$( get_var_from_file my-shell-assignments BLANK && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
+ source my-shell-assignments
++ VAR=value
++ EMPTY=
++ BLANK=' '
++ STRING='lvalue = rvalue'
++ ARR=('first element = 1' 'second element = 2' 'last element = 3')
+ echo ' '
OK
declare -- myvar=" "

# unset myvar ; myvar="$( get_var_from_file my-shell-assignments STRING && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
+ source my-shell-assignments
++ VAR=value
++ EMPTY=
++ BLANK=' '
++ STRING='lvalue = rvalue'
++ ARR=('first element = 1' 'second element = 2' 'last element = 3')
+ echo 'lvalue = rvalue'
OK
declare -- myvar="lvalue = rvalue"

# unset myvar ; myvar="$( get_var_from_file my-shell-assignments ARR && echo OK 1>&2 || echo FAIL with $? 1>&2  )" ; declare -p myvar
+ source my-shell-assignments
++ VAR=value
++ EMPTY=
++ BLANK=' '
++ STRING='lvalue = rvalue'
++ ARR=('first element = 1' 'second element = 2' 'last element = 3')
+ echo 'first element = 1'
OK
declare -- myvar="first element = 1"

pcahyna commented at 2024-03-22 16:54:

Perhaps it is best to "Keep It Simple and Secure"
and omit such transitive DEBUGSCRIPTS setting?

I think so, I would not bother with it.

More important question is whether it is ok to fail on any problems in the sourced file as well. I think that the sourced file may use variable assignments that are not intended to work correctly with set -eu. For example FOO="$BAR" where BAR is unset. For this reason I think I would prefer to add set -u or set -eu after the source instead of having -eu as flags on the bash invocation.

jsmeix commented at 2024-03-25 13:15:

Regarding
https://github.com/rear/rear/pull/3171#issuecomment-2015507343

I do fully agree.

Meanwhile
(after I had some "deep thought" about it over the weekend)
I consider hardcoded "transitive DEBUG options setting"
a horrible idea regarding security and privacy.

Not only in this specific case but in general:
I.e. when ReaR scripts run with 'set -x'
it does not mean that programs which are called by ReaR
also run with whatever program-specific debug settings.

Reasoning
with this specific case as an example:

For example (as far as I know) some networking config files
could contain confidential data (like WLAN passwords)
so when ReaR needs some other value from such a file
the whole file (except bash comments)
would appear in the log (in DEBUGSCRIPTS mode)
so the confidential values would leak out => security issue.

Furthermore when whole user config files (except bash comments)
appear in the log (in DEBUGSCRIPTS mode)
ReaR would cross a privacy boundary:
From what belongs to ReaR (e.g. the value that ReaR needs
from a user config file) to what belongs only to the user
i.e. his whole config file with all his values => privacy issue.

If at all "transitive DEBUG options setting" might be enabled
only on explicit user request via '--expose-secrets'.

But in this specific case "transitive DEBUGSCRIPTS setting"
is not needed in practice because also without 'set -x'
the stderr messages tell sufficiently what the problem is
so that the user can inspect his config file on his own
what the actual reason inside his config file is.
Also when a user reports an issue to us
we can show him by the stderr messages what the problem is
but we neither need to know all his config values
nor should we know all his config values
to help him sufficiently.

jsmeix commented at 2024-03-25 13:24:

The current implementation

# function get_var_from_file() { bash -c 'source "$0"; echo "${!1}"' "$1" "$2" ; }

does not yet work correctly
when there is stdout output of the sourced file

# cat my-assignment 
if echo HELLO ; then
  VAR="value"
fi

# unset myvar ; myvar="$( get_var_from_file my-assignment VAR )" && echo OK 1>&2 || echo FAIL with $? 1>&2 ; declare -p myvar
OK
declare -- myvar="HELLO
value"

# unset VAR ; source my-assignment && echo OK 1>&2 || echo FAIL with $? 1>&2 ; declare -p VAR
HELLO
OK
declare -- VAR="value"

This one fixes it for me:

# function get_var_from_file() { bash -c 'source "$0" 1>/dev/null ; echo "${!1}"' "$1" "$2" ; }

# unset myvar ; myvar="$( get_var_from_file my-assignment VAR )" && echo OK 1>&2 || echo FAIL with $? 1>&2 ; declare -p myvar
OK
declare -- myvar="value"

The fix here (by added 1>/dev/null to the 'source' call)
is only meant as an offhanded proposal to fix
when there is stdout output of the sourced file.

It is not meant as a proposal how to deal with errors
(via set -e or set -eu or ... || exit 1 or ...).

pcahyna commented at 2024-03-25 14:22:

Yes, you are right; nit: 1 is superfluous, >/dev/null is enough.

jsmeix commented at 2024-03-25 14:52:

A 'nit' reply only for the fun of it:
I prefer 1> over > because I prefer explicit code.
In particular explicit 1> would make it easier
in theory to search the code where stdout is redirected.
In practice this does not work because also > is used
so in the end it does not matter if 1> or > is used
and I am fine with both.

pcahyna commented at 2024-03-25 14:57:

@jsmeix I happen to disagree here because everyone knows > /dev/null and is used to it, so if I see 1> /dev/null, I have to think about what is the 1 doing there, while > /dev/null is immediately obvious. Related: > /dev/null is more visually recognizable from let's say 7> /dev/null, and we use numbered redirections often in ReaR.

jsmeix commented at 2024-03-26 07:05:

Sigh - endless problems - here the next one:

We cannot fail when 'source' fails because
'source' returns the status of the last command
executed in the sourced file so

# function get_var_from_file() { bash -c 'source "$0" >/dev/null || exit 1 ; set -u ; echo "${!1}"' "$1" "$2" || return 1 ; }

# cat my-assignment 
echo HELLO && VAR="hello"
ls QQQ && VAR="qqq"

# unset myvar ; myvar="$( get_var_from_file my-assignment VAR )" && echo OK 1>&2 || echo FAIL with $? 1>&2 ; declare -p myvar
ls: cannot access 'QQQ': No such file or directory
FAIL with 1
declare -- myvar=""

# unset VAR ; source my-assignment && echo OK 1>&2 || echo FAIL with $? 1>&2 ; declare -p VAR
HELLO
ls: cannot access 'QQQ': No such file or directory
FAIL with 2
declare -- VAR="hello"

jsmeix commented at 2024-03-26 07:13:

What seems to behave correctly
(at least for me with what I test here)
is

# function get_var_from_file() { bash -c 'source "$0" >/dev/null ; set -u ; echo "${!1}"' "$1" "$2" || return 1 ; }

# cat my-assignment
echo HELLO && VAR="hello"
ls QQQ && VAR="qqq"

# unset myvar ; myvar="$( get_var_from_file my-assignment VAR )" && echo OK || echo FAIL with $? ; declare -p myvar
ls: cannot access 'QQQ': No such file or directory
OK
declare -- myvar="hello"

# unset myvar ; myvar="$( get_var_from_file my-assignment qqq )" && echo OK || echo FAIL with $? ; declare -p myvar
ls: cannot access 'QQQ': No such file or directory
my-assignment: !1: unbound variable
FAIL with 1
declare -- myvar=""

# unset myvar ; myvar="$( get_var_from_file qqq VAR )" && echo OK || echo FAIL with $? ; declare -p myvar
qqq: qqq: No such file or directory
qqq: !1: unbound variable
FAIL with 1
declare -- myvar=""

# cat my-assignment
if then
echo HELLO && VAR="hello"
ls QQQ && VAR="qqq"

# unset myvar ; myvar="$( get_var_from_file my-assignment VAR )" && echo OK || echo FAIL with $? ; declare -p myvar
my-assignment: line 1: syntax error near unexpected token `then'
my-assignment: line 1: `if then'
my-assignment: !1: unbound variable
FAIL with 1
declare -- myvar=""

# cat my-assignment
echo HELLO && VAR="hello"
if then
ls QQQ && VAR="qqq"

# unset myvar ; myvar="$( get_var_from_file my-assignment VAR )" && echo OK || echo FAIL with $? ; declare -p myvar
my-assignment: line 2: syntax error near unexpected token `then'
my-assignment: line 2: `if then'
OK
declare -- myvar="hello"

# cat my-assignment
echo HELLO && VAR="hello"
ls QQQ && VAR="qqq"
if then

# unset myvar ; myvar="$( get_var_from_file my-assignment VAR )" && echo OK || echo FAIL with $? ; declare -p myvar
ls: cannot access 'QQQ': No such file or directory
my-assignment: line 3: syntax error near unexpected token `then'
my-assignment: line 3: `if then'
OK
declare -- myvar="hello"

pcahyna commented at 2024-03-26 13:36:

@jsmeix

We cannot fail when 'source' fails because
'source' returns the status of the last command
executed in the sourced file so (...)

... so what? Let me quote the os-release(5) manual page:

The format of os-release is a newline-separated list of
environment-like shell-compatible variable assignments. It is possible
to source the configuration from Bourne shell scripts, however, beyond
mere variable assignments, no shell features are supported
(this means
variable expansion is explicitly not supported), allowing applications
to read the file without implementing a shell compatible execution
engine
.

(emphasis mine).

So, why to worry about the exit status from source if the only allowed statements are variable assignment statements? (ls in your example is thus not allowed.) I consider this a non-issue.

jsmeix commented at 2024-03-26 13:58:

@pcahyna
in your
https://github.com/rear/rear/pull/3171#discussion_r1521750947
you wrote that
"the get_var_from_file function is meant to be generic"
which was the reason that I suggested in my
https://github.com/rear/rear/pull/3171#discussion_r1523283636
that "it fits perhaps better in lib/global-functions.sh"

When we have a generic get_var_from_file function
then it must work reasonably well for any file
that can be sourced, in particular for a bash script.

Otherwise when we only want a function to get values
from os-release then we must not have such a function
as a generic function lib/global-functions.sh
but preferably as it was before only a "local" function
called explicitly get_var_from_os_release to make it clear
that this function cannot be used generically
to get a value from any file that can be sourced.

But why not have a generic get_var_from_file function
when it seems it can be implemented reasonably well,
for example as I proposed in my last
https://github.com/rear/rear/pull/3171#issuecomment-2019541083

Or is there something severely wrong with my recent proposal?

I mean: It is certainly possible to make a bash script
where my recent get_var_from_file function fails
but in "sufficiently normal" cases it seems to work OK

# function get_var_from_file() { bash -c 'source "$0" >/dev/null ; set -u ; echo "${!1}"' "$1" "$2" || return 1 ; }

# cat my-assignment 
echo HELLO && VAR="$( echo -n "Today is $( date )" )"
ls QQQ && VAR="qqq"
if then

# unset myvar ; myvar="$( get_var_from_file my-assignment VAR )" && echo OK || echo FAIL with $? ; declare -p myvar ; date
ls: cannot access 'QQQ': No such file or directory
my-assignment: line 3: syntax error near unexpected token `then'
my-assignment: line 3: `if then'
OK
declare -- myvar="Today is Tue 26 Mar 2024 03:06:09 PM CET"
Tue 26 Mar 2024 03:06:09 PM CET

For the fun of it:
I even found a simple bash script where
my recent get_var_from_file function fails,
something like:

poweroff
sleep 120
VAR="value"

;-)

pcahyna commented at 2024-03-26 14:21:

When we have a generic get_var_from_file function
then it must work reasonably well for any file
that can be sourced, in particular for a bash script.

Depends on how "generic" we want it to have. We can define it as being generic enough to be able to read files that have the same format as os-release, i.e. variable assignments but no other shell commands. I believe that files under /etc/sysconfig fall in this category.

But why not have a generic get_var_from_file function
when it seems it can be implemented reasonably well,
for example as I proposed in my last
https://github.com/rear/rear/pull/3171#issuecomment-2019541083
Or is there something severely wrong with my recent proposal?

My only problem with it is that we lose the ability to detect errors when sourcing the file (including file not found or readable), compared to the source ... || exit 1 version.

jsmeix commented at 2024-03-27 08:41:

It also fails for file not found or not readable

# function get_var_from_file() { bash -c 'source "$0" >/dev/null ; set -u ; echo "${!1}"' "$1" "$2" || return 1 ; }

# unset myvar ; myvar="$( get_var_from_file qqq VAR )" && echo OK || echo FAIL with $? ; declare -p myvar
qqq: qqq: No such file or directory
qqq: !1: unbound variable
FAIL with 1
declare -- myvar=""

# unset myvar ; myvar="$( get_var_from_file /etc/shadow VAR )" && echo OK || echo FAIL with $? ; declare -p myvar
/etc/shadow: /etc/shadow: Permission denied
/etc/shadow: !1: unbound variable
FAIL with 1
declare -- myvar=""

but it always fails at 'set -u' because of "unbound variable".

As far as I see it is not possible to
fail at 'source' for file not found or not readable
but not fail at 'source' when the last command in file
results non-zero exit code which is usually '1'

$ source /etc/shadow || echo $?
bash: /etc/shadow: Permission denied
1

$ source qqq || echo $?
bash: qqq: No such file or directory
1

$ cat my-script
grep qqq /etc/fstab

$ source my-script || echo $?
1

so we would have to add special checks
for cases like file not found or not readable.

I think such specific checks are not needed
because stderr shows the root cause why it failed
regardless that actually it failed at 'set -u'.

jsmeix commented at 2024-03-27 11:46:

Bottom line of what I mean is:

As far as I see

function get_var_from_file() {
    bash -c 'source "$0" >/dev/null ; set -u ; echo "${!1}"' "$1" "$2" || return 1
}

provides simple "all or nothing" return code behaviour:

Zero return code means the variable was set in the file and
stdout is the value of the variable (could be empty or blank).

Non-zero return code in all other cases.

So get_var_from_file can be used in a simple way like

if my_var="$( get_var_from_file FILE_NAME VAR_NAME )" ; then
    # Code when VAR_NAME was set in FILE_NAME:
    ...
else
    # Code when the value of VAR_NAME is unknown
    # (e.g. error handling code or fallback code):
    ...
fi

jsmeix commented at 2024-03-27 12:15:

For the fun of craziness:

My above described simple "all or nothing" return code behaviour
does not hold in any case, in particular not for shell variables:

# cat my-assignment
VAR="value"
BASH="my bash"
BASHPID=99999999

# unset myvar ; if myvar="$( get_var_from_file my-assignment BASH )" ; then declare -p myvar ; else echo FAIL with $? ; fi
declare -- myvar="my bash"

# unset myvar ; if myvar="$( get_var_from_file my-assignment BASHPID )" ; then declare -p myvar ; else echo FAIL with $? ; fi
declare -- myvar="6854"

# unset myvar ; if myvar="$( get_var_from_file my-assignment BASH_COMMAND )" ; then declare -p myvar ; else echo FAIL with $? ; fi
declare -- myvar="echo \"\${!1}\""

but BASHPID cannot be set in my-assignment because it is readonly
and BASH_COMMAND was not set in my-assignment.

So for the fun of craziness I further enhanced it:

# function get_var_from_file() { bash -c 'unset $1 || exit 1 ; source "$0" >/dev/null ; set -u ; echo "${!1}"' "$1" "$2" || return 1 ; }

This seems to even behave correctly - as far as I tested:

# unset myvar ; if myvar="$( get_var_from_file my-assignment VAR )" ; then declare -p myvar ; else echo FAIL with $? ; fi
declare -- myvar="value"

# unset myvar ; if myvar="$( get_var_from_file my-assignment BASH )" ; then declare -p myvar ; else echo FAIL with $? ; fi
declare -- myvar="my bash"

# unset myvar ; if myvar="$( get_var_from_file my-assignment BASHPID )" ; then declare -p myvar ; else echo FAIL with $? ; fi
my-assignment: line 0: unset: BASHPID: cannot unset: readonly variable
FAIL with 1

# unset myvar ; if myvar="$( get_var_from_file my-assignment BASH_COMMAND )" ; then declare -p myvar ; else echo FAIL with $? ; fi
my-assignment: !1: unbound variable
FAIL with 1

The idea behind to ensure the above described
simple "all or nothing" return code behaviour
is:

When VAR_NAME cannot be unset before 'source FILE_NAME'
then VAR_NAME cannot be (successfully) set in FILE_NAME
so get_var_from_file must return non-zero return code.

@lzaoral @pcahyna
I know I am here far beyond what is reasonable
but I currently have "just fun" to play around
with even such crazy exceptional corner cases, cf.
https://github.com/rear/rear/pull/3168#issuecomment-1991713382

jsmeix commented at 2024-03-27 12:44:

@pcahyna @lzaoral
when you agree that having a generic function

function get_var_from_file() {
    bash -c 'unset $1 || exit 1 ; source "$0" >/dev/null ; set -u ; echo "${!1}"' "$1" "$2" || return 1
}

is useful in ReaR - in particular when you think
the above implementation is reasonably correct,
would it then help you with this pull request here
when I make a pull request to add that function
to lib/global-functions.sh so that this part
can be excluded from this pull request?

And in case of issues with that function
I am to "git blame" for it ;-)

jsmeix commented at 2024-03-27 12:54:

FYI:
How it behaves for /etc/os-release for me

# for varname in $( cut -s -d '=' -f1 /etc/os-release ) ; \
  do unset myvar ; \
     if myvar="$( get_var_from_file /etc/os-release $varname )" ; \
     then grep "^$varname=" /etc/os-release ; \
          declare -p myvar ; \
          echo ; \
     else echo FAIL with $? ; \
     fi ; \
   done

NAME="openSUSE Leap"
declare -- myvar="openSUSE Leap"

VERSION="15.5"
declare -- myvar="15.5"

ID="opensuse-leap"
declare -- myvar="opensuse-leap"

ID_LIKE="suse opensuse"
declare -- myvar="suse opensuse"

VERSION_ID="15.5"
declare -- myvar="15.5"

PRETTY_NAME="openSUSE Leap 15.5"
declare -- myvar="openSUSE Leap 15.5"

ANSI_COLOR="0;32"
declare -- myvar="0;32"

CPE_NAME="cpe:/o:opensuse:leap:15.5"
declare -- myvar="cpe:/o:opensuse:leap:15.5"

BUG_REPORT_URL="https://bugs.opensuse.org"
declare -- myvar="https://bugs.opensuse.org"

HOME_URL="https://www.opensuse.org/"
declare -- myvar="https://www.opensuse.org/"

DOCUMENTATION_URL="https://en.opensuse.org/Portal:Leap"
declare -- myvar="https://en.opensuse.org/Portal:Leap"

LOGO="distributor-logo-Leap"
declare -- myvar="distributor-logo-Leap"

Same result when I do this inside ReaR
by adding that code at the beginning of
init/default/001_verify_config_arrays.sh

pcahyna commented at 2024-04-09 10:12:

Sorry @jsmeix , we got a bit lost in the discussion - please implement the proposed get_var_from_file.

jsmeix commented at 2024-04-09 11:45:

@pcahyna
my pleasure: https://github.com/rear/rear/pull/3203

gdha commented at 2024-04-26 07:06:

@jsmeix @pcahyna @lzaoral is this issue now also on hold because of #3203 ?

lzaoral commented at 2024-04-29 12:29:

@gdha Not necessarily. If the original approach to parse /etc/os-release

eval "$(grep '^ID=' /etc/os-release)"

is fine by you and @schlomo, I can revert the PR to that state.

jsmeix commented at 2024-04-29 12:54:

@lzaoral
yes, please change things as needed
so that we can move forward
with this specific case here and
again thank you for your patience with us!

https://github.com/rear/rear/pull/3203
and all what is related to that can of worms will take
a lot of time to get sorted out and properly solved.

In particular my last finding that we basically
carelessly "just source" various files at
about 43 places in our scripts scares me
so "something needs to be done".

lzaoral commented at 2024-05-13 15:07:

@jsmeix @gdha I've rebased the PR and removed usage of get_var_from_file.

jsmeix commented at 2024-05-13 15:53:

@lzaoral
in usr/share/rear/lib/config-functions.sh
in the

    # set OS_MASTER_VENDOR in case this is a derived OS and ReaR differentiates it
    case "${OS_VENDOR,,}" in
...
    esac

section
could you keep a simplified version of the comment

# When OS_VENDOR_VERSION contains 'SUSE', set OS_MASTER_VENDOR to 'SUSE'
# but do not set OS_MASTER_VENDOR same as OS_VENDOR (i.e. 'SUSE_LINUX')
# (cf. above: all SUSE distributions ... must be unified to 'SUSE_LINUX')
# because then scripts in a .../SUSE_LINUX/... sub-directoriy and conf/SUSE_LINUX.conf
# get sourced twice by the (buggy) SourceStage function in lib/framework-functions.sh

that tells why it is crucial to
not set OS_MASTER_VENDOR same as OS_VENDOR?

I needed some second and third looks to understand that
OS_MASTER_VENDOR same as OS_VENDOR is carefully avoided
in the current code but that is neither obvious
nor explicitly visible in the code.

For example you may add something like

    # Set OS_MASTER_VENDOR in case this is a derived OS and ReaR differentiates it.
    # Do not set OS_MASTER_VENDOR same as OS_VENDOR
    # e.g. OS_MASTER_VENDOR = OS_VENDOR = Some_Vendor
    # because then scripts in .../Some_Vendor/... sub-directories
    # and conf/Some_Vendor.conf get sourced twice by the (buggy)
    # SourceStage function in lib/framework-functions.sh
    # cf. https://github.com/rear/rear/pull/1241
    case "${OS_VENDOR,,}" in
    ...
        OS_MASTER_VENDOR=...
    ...
    esac

Or - perhaps even better - BugError when both are same like

    # set OS_MASTER_VENDOR in case this is a derived OS and ReaR differentiates it
    case "${OS_VENDOR,,}" in
    ...
        OS_MASTER_VENDOR=...
    ...
    esac
    # OS_MASTER_VENDOR must not be same as OS_VENDOR
    # e.g. OS_MASTER_VENDOR = OS_VENDOR = Some_Vendor
    # because then scripts in .../Some_Vendor/... sub-directories
    # and conf/Some_Vendor.conf get sourced twice by the (buggy)
    # SourceStage function in lib/framework-functions.sh
    # cf. https://github.com/rear/rear/pull/1241#issuecomment-295292372
    test "$OS_MASTER_VENDOR" != "$OS_VENDOR" || BugError "OS_MASTER_VENDOR must not be same as OS_VENDOR '$OS_VENDOR'"

cf. https://github.com/rear/rear/pull/1241#issuecomment-295292372

schlomo commented at 2024-05-14 10:51:

To follow up, in 1ed656d6ab631ef606f967d460c9643c28d21f61 I fixed the problem with Archlinux and Manjaro so that we now get this, IMHO also broken, behaviour on those distros:

❯ tools/run-in-docker -a amd64 arch manj -- "grep 'ID.*=' /etc/os-release || echo NO /etc/os-release FOUND ; echo ; rear dump | grep OS_ || echo REAR ERROR"
Using architecture amd64 instead of default Docker architecture aarch64
********** archlinux                                **********
ID=arch
BUILD_ID=rolling
VERSION_ID=20240101.0.204074

  OS_MASTER_VENDOR=""
  OS_MASTER_VERSION="20240101.0.204074"
  OS_MASTER_VENDOR_ARCH="i386"
  OS_MASTER_VENDOR_VERSION="20240101.0.204074"
  OS_MASTER_VENDOR_VERSION_ARCH="20240101.0.204074/i386"
  OS_VENDOR="Arch"
  OS_VERSION="20240101.0.204074"
  OS_VENDOR_ARCH="Arch/i386"
  OS_VENDOR_VERSION="Arch/20240101.0.204074"
  OS_VENDOR_VERSION_ARCH="Arch/20240101.0.204074/i386"
********** manjarolinux/base                        **********
ID=manjaro
ID_LIKE=arch
BUILD_ID=rolling

  OS_MASTER_VENDOR=""
  OS_MASTER_VERSION="none"
  OS_MASTER_VENDOR_ARCH="i386"
  OS_MASTER_VENDOR_VERSION="none"
  OS_MASTER_VENDOR_VERSION_ARCH="none/i386"
  OS_VENDOR="Arch"
  OS_VERSION="none"
  OS_VENDOR_ARCH="Arch/i386"
  OS_VENDOR_VERSION="Arch/none"
  OS_VENDOR_VERSION_ARCH="Arch/none/i386"
** SCRIPT RUN TIME 16 SECONDS **

Here we have even more broken variables so that I think that fixing the OS detection will be a very good improvement.

Thank you @lzaoral for opening this up!

jsmeix commented at 2024-05-14 13:11:

@schlomo
see my quick first analysis starting at
https://github.com/rear/rear/issues/3149#issuecomment-1934386437

Some results from my offhanded tests:

https://github.com/rear/rear/issues/3149#issuecomment-1934402243
excerpts

Where those variables are used in ReaR scripts:

...

So it seems (according to that quick and dirty test)
those variables are not actually used in ReaR scripts

OS_MASTER_VENDOR_ARCH
OS_MASTER_VENDOR_VERSION
OS_MASTER_VENDOR_VERSION_ARCH
OS_VENDOR_ARCH
OS_VENDOR_VERSION
OS_VENDOR_VERSION_ARCH
VERSION_ID

https://github.com/rear/rear/issues/3149#issuecomment-1934474211
excerpts

What directories we have where the value of such variables
is used as directory name (also a quick and dirty test):

...

So it seems only those values are actually used
as directory names:

Debian
Fedora
IPL
Linux-arm
Linux-i386
Linux-ia64
Linux-ppc64
Linux-ppc64le
Linux-s390
SUSE_LINUX
Ubuntu
i386
ppc64
ppc64le
s390

And some other tests later
https://github.com/rear/rear/issues/3149#issuecomment-1972731026
excerpts

What OS_VENDOR value dependant scripts we have

...

So we neither have scripts for 'Arch'
nor for 'RedHatEnterpriseServer'.

And finally my personal opinion
as a result of my above tests:
https://github.com/rear/rear/issues/3149#issuecomment-1972786350
excerpt

What I think what is needed and what works in practice
(in particular what is maintainable in practice) is
only two semantically different/separated variables:

    one for the Linux distribution
    one for the hardware architecture

Nothing more - in particular no versions.

Why no versions
Testing for version values is almost always plain wrong
because testing for version values is almost always
not testing "the real thing" so a version value test
could too easily have a wrong result.
E.g. assume in version 1 there is /etc/old
which is replaced since version 2 by /etc/new
then /etc/old versus /etc/new should be directly tested
instead of an indirect test of the version that has
a wrong result when e.g. /etc/old is still used
regardless that version 2 is already there.

...

I would mercilessly simplify and clean up that part in ReaR
to have only two variables left

    DISTRIBUTION
    ARCHITECTURE

which matches what we actually have as scripts
for 'Debian' 'Fedora' and 'SUSE_LINUX'
(nothing version dependant there),
...
This is known to be sufficient and works
in practice since many years.

jsmeix commented at 2024-05-15 12:05:

To avoid possible misunderstanding of my above
https://github.com/rear/rear/pull/3171#issuecomment-2110203576

I did not mean we should do such a
"merciless simplification and clean up"
right now.

I meant that we should not spend more time
than really needed right now for current real issues
with all that variables, cf. my
https://github.com/rear/rear/issues/3149#issuecomment-1968483977
excerpt

it is perfectly OK for me when you do only minimal changes
to make things work again for Red Hat and Fedora and
leave the "great cleanup work" to others...

and my
https://github.com/rear/rear/issues/3149#issuecomment-1973347166
excerpt

(as a totally separated subsequent major step)
I will have a look
if I could do such a "great cleanup work"
(i.e. simplify it to only DISTRIBUTION and ARCHITECTURE)
with reasonable effort
(provided time permits and provided I have nothing better to do).

Bottom line:
I would like to keep this pull request limited
to what it was originally meant to fix:
Some specific bug with that variables
for Red Hat and Fedora.

I think it is not the task of @lzaoral
to do some major generic work in ReaR.
Of course if he likes to contribute that
then I would very much appreciate it.
What I mean is that we should not ask contributors
who like to contribute just one specific thing
to do major generic work "by the way".

Or in other words:
Keep Separated Issues Separated "KSIS" ;-)

gdha commented at 2024-06-11 08:45:

@rear/contributors @lzaoral the milestone "2.8" will never be released as we go for the "3.0". However, it may be wise to select milestone "3.1" for this issue, no?

lzaoral commented at 2024-06-11 10:09:

@gdha This should be fixed in 3.0 rather than later because the OS detection in ReaR on Fedora is completely broken.

lzaoral commented at 2024-10-17 07:21:

I've split c4c25839ecef77d9ecfd44a6bd412b371220b23a into a separate PR #3331 since it is needed for RHEL 10.

jsmeix commented at 2024-10-25 07:27:

@lzaoral
because this pull request is meant for ReaR 3.0
I would like to know how far things meanwhile work
for you with current Red Hat and Fedora versions
with what is already merged in current master code
(e.g. https://github.com/rear/rear/pull/3331) ?

In other words:
Is more needed to make ReaR 3.0 work
with current Red Hat and Fedora versions
or is the current master code already "good enough"
(likely not really good but perhaps OK for now)?

Simply put:
Is https://github.com/rear/rear/issues/3149
good enough fixed for now in current master code?

lzaoral commented at 2024-10-25 08:38:

Hey @jsmeix!
The current state is definitely better than it was in January but two major issues raised in https://github.com/rear/rear/issues/3149 are still not resolved. The current state is:

Fixed:

Not fixed:

Of course, I have no abjections against splitting this PR into multiple smaller ones to move this effort further.

edit: fix broken comment link

lzaoral commented at 2024-10-25 08:41:

I've also rebased this PR to resolve the conflicts.

jsmeix commented at 2024-11-11 09:25:

@lzaoral
I am sorry that I could not reply earlier,
too much other stuff filled up my brain :-(

I would much appreciate it if you could
split this PR into separated parts
i.e. one for each separated "Not fixed" issue in
https://github.com/rear/rear/pull/3171#issuecomment-2437219729

I think this could make this issue move forward
because smaller steps are easier to deal with.

But do not split it up into artificial parts
where the partial changes depend on each other,
i.e. keep together what technically belongs together.
But you can split up one same global task into parts
where each change implements/fixes a separated part.

Again sorry that this issue is delayed for such a long time.
For me personally the main reason is that in this case here
from plain looking at only the changes in this PR
I fail to understand what the changes mean in practice.
So in this case a proper review is rather hard for me.
I hope separated parts make the review easier for me.

jsmeix commented at 2024-11-15 07:55:

@pcahyna @lzaoral
it is still approved "bona fide" by me (see my first comment)
so for me personally it is fine when @pcahyna approves it
and merges it as he likes.
I.e. you do not need to get an additional review from me.

There is still @schlomo who requests changes
but currently I cannot find out what it is or was
where he requested changes.
In the GitHub web frontend view at the top right of this
pull request in the "Reviewers" part in the "schlomo" line
at the right the red frame with + and - in it links to
https://github.com/rear/rear/pull/3171/files/4300f05411f263f2b69da95a67c243248e0f52ba
which shows

We went looking everywhere, but couldn’t find those commits.
Sometimes commits can disappear after a force-push.
Head back to the latest changes here
https://github.com/rear/rear/pull/3171/files

jsmeix commented at 2024-11-15 08:04:

One can still access
https://github.com/rear/rear/commit/4300f05411f263f2b69da95a67c243248e0f52ba
so this commit did not really disappear but GitHub shows

This commit does not belong to any branch on this repository,
and may belong to a fork outside of the repository.

and this commit alone does not tell
what it was why @schlomo requested changes.

lzaoral commented at 2024-11-19 10:59:

Sorry everyone for not replying sooner, everyone! I was really busy with other projects. When going through @schlomo's main concerns in https://github.com/rear/rear/pull/3171#pullrequestreview-2053278291, I propose the following:

  1. Move the SUSE/s390x-related change and the rename of Arch to Arch_Linux (suggested by @jsmeix at the bottom of https://github.com/rear/rear/issues/3149#issuecomment-1972731026 to precent confusion with "Architecture") to a separate PR. These two changes are independent of the rest and I'll create this PR ASAP. edit: Created as https://github.com/rear/rear/pull/3344/.
  2. Make only a minimal change to fix detection of Fedora (and move the rest of this PR on-hold). The plain chain of grep commands used currently (without any actual parsing) classifies Fedora as RHEL, because Fedora uses RH bugzilla for bug-tracking:
$ grep -i -E '(centos|redhat|scientific|oracle)' /etc/os-release
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=41
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=41

Amusingly, on actual RHEL the situation is somewhat similar (I've tried 7.9, 9.5 and 10 Beta), primarily URLs with the redhat substring get matched:

$ grep -i -E '(centos|redhat|scientific|oracle)' /etc/os-release
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9"
BUG_REPORT_URL="https://issues.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_BUGZILLA_PRODUCT_VERSION=9.5
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.5"

In both cases, the grepped value (redhat) was never present in the ID (or ID_LIKE) fields of /etc/os-release so this always worked "by accident".

This minimal patch fixes the autodetection (at least) on Fedora 40 (though the original proposed commit is more robust):

diff --git a/usr/share/rear/lib/config-functions.sh b/usr/share/rear/lib/config-functions.sh
index daeeb416..0a7f8b6a 100644
--- a/usr/share/rear/lib/config-functions.sh
+++ b/usr/share/rear/lib/config-functions.sh
@@ -20,7 +20,7 @@ function SetOSVendorAndVersion () {
         # Try to find all the required information from that file
         if [[ -f /etc/os-release ]] ; then
             grep -q -i 'fedora' /etc/os-release && OS_VENDOR=Fedora
-            grep -q -i -E '(centos|redhat|scientific|oracle)' /etc/os-release && OS_VENDOR=RedHatEnterpriseServer
+            grep -q -i -E '(centos|rhel|scientific|oracle)' /etc/os-release && OS_VENDOR=RedHatEnterpriseServer
             grep -q -i 'suse' /etc/os-release && OS_VENDOR=SUSE_LINUX
             grep -q -i 'debian' /etc/os-release && OS_VENDOR=Debian
             grep -q -i -E '(ubuntu|linuxmint)' /etc/os-release && OS_VENDOR=Ubuntu

This way, the actual logic for setting OS_VERSION and OS_MASTER_VERSION stays unchanged.

  1. We still need to decide what to do about:

Duplicated code execution whenever OS_MASTER_VENDOR and OS_VENDOR have the same values

The original idea was to leave OS_MASTER_VENDOR empty for non-derived systems but that was discouraged by @schlomo in https://github.com/rear/rear/pull/3171#pullrequestreview-2053278291 and I see his concerns. If nobody has objections, I'll implement the following suggestion by @schlomo (https://github.com/rear/rear/pull/3171#pullrequestreview-2053278291) instead:

I'm wondering if we maybe should "fix" the problem of duplicate script inclusion via a sort -u in SourceStage instead, even if this is a bit hackish?

Once again my apologies for not moving this further sooner. @jsmeix @gdha @schlomo @pcahyna Your suggestions and comments are welcome. Thank you!

edit: fix typo and add link to https://github.com/rear/rear/pull/3344

jsmeix commented at 2024-11-19 15:15:

I fully support to

Make only a minimal change to fix detection of Fedora
(and move the rest of this PR on-hold).

lzaoral commented at 2024-11-20 15:27:

@schlomo @pcahyna @jsmeix I've rebased the PR to implement the absolutely minimal change to correctly classify Fedora as Fedora (but the current parsing logic is still rather brittle) and to resolve duplicated stage executions when OS_VENDOR and OS_VENDOR_MASTER match. Thank you in advance for the review!

jsmeix commented at 2024-11-22 08:27:

@rear/contributors
provided there are no objections
I would merge it on Monday afternoon

jsmeix commented at 2024-11-25 12:27:

Puh!
So (hopefully) it is DONE now - at least for now
what is needed to release ReaR 2.8.

@lzaoral @pcahyna
please test (as time permits) current GitHub master code
with this merged pull request and ideally when things work
report what you tested successfully in
https://github.com/rear/rear/wiki/Test-Matrix-ReaR-2.8

lzaoral commented at 2024-11-29 13:54:

Thnak you for the reminder, @jsmeix! Hopefully, I'll get to the testing next week.


[Export of Github issue for rear/rear.]