#2930 Issue closed: Ensure config array variables are specified as arrays by the user

Labels: enhancement, fixed / solved / done

jsmeix opened issue at 2023-02-14 10:32:

It can lead to various rather unexpected behaviour
when an array variable is not set as an array
but usually as a string instead.

For example:
In bash assinging a string to an array
assingns the first array element to that string:

#  arr=()

# declare -p arr
declare -a arr=()

# arr=''

# declare -p arr
declare -a arr=([0]="")

But an array with an empty first element is something different
than an empty array which can make big differences in how
things behave, see for example
https://github.com/rear/rear/issues/2911
therein in particular
https://github.com/rear/rear/issues/2911#issuecomment-1387056556
and subsequent comments.

From the bash manpage section "Arrays":

Referencing an array variable without a subscript
is equivalent to referencing the array with a subscript of 0

To avoid that users set config array variables
as something that is not an array we should check
after sourcing the config files that the array variables
that are set in default.conf are still arrays.

See
https://github.com/rear/rear/issues/2925#issuecomment-1429432172

schlomo commented at 2023-02-14 23:27:

I played around with this and found out that the problem is slightly different than what we thought:

Overriding an array variable with a single value actually only changes the first member of the array!

This demo within rear shell shows this clearly:
image

So in the original example this was a problem because the first member of COPY_AS_IS happens to be the ReaR files themselves, which led to the observed error.

To be honest, as much es it pains me, I currently have no idea how to detect this type of misconfiguration properly.

Even comparing the array length is probably a wrong check and it also won't catch cases of accidentially overwriting the first array member.

Maybe a totally silly idea: check for every array variable from default.conf if it appears in /etc/rear/*.conf without a ( in the same line. I can still think about ways to misconfigure ReaR that this check won't catch, but maybe it will help against this very typical user configuration error?

What do you think? Too crazy or worth implementing?

jsmeix commented at 2023-02-16 10:02:

Yes,
setting an array variable without a subscript
changes (only) the first member of the array
as "man bash" tells.

My above example was only meant as an example how
things could even fail when "all looks just empty"
(because not all is really empty in this example)
cf. "Beware of the emptiness" in
https://github.com/rear/rear/wiki/Coding-Style

Basically all we do in ReaR (including all checks)
must follow the general rule to

provide final power to the user

ReaR must never "patronize/infantilize" the user
because Free Software Freedom Rights are sacrosanct
where in this case it is mostly about "freedom 0" in
https://www.gnu.org/philosophy/free-sw.html#four-freedoms

In general software that patronizes/infantilizes
its users treat its users as if they were idiots.

For checks in ReaR it means they inform the user
to help them to avoid accidental mistakes
(like confusing arrays with strings).

But checks in ReaR are normally not meant to enforce
what the user can do, so when a user specified e.g.

COPY_AS_IS=( /some/directory )

then ReaR must obey (because the user has final power)
and (try to) do what is requested by the user.
E.g. assume the user has carefully setup /some/directory
as he needs it so COPY_AS_IS=( /some/directory )
is exactly what he deliberately wants ReaR to do.

When ReaR cannot do what the user requested,
it must error out with a meaningful error message,
cf. "Try hard to care about possible errors" in
https://github.com/rear/rear/wiki/Coding-Style

schlomo commented at 2023-02-16 10:18:

Yes, the check only confirms that the syntax is correct, meaning that the user knew about the fact that this variable is an array and treats it as such.

jsmeix commented at 2023-02-17 08:48:

It is not yet completed, see
https://github.com/rear/rear/pull/2932#issuecomment-1434293424

schlomo commented at 2023-02-17 09:16:

Please have another look.

jsmeix commented at 2023-02-20 13:02:

Ah - I didn't understand where I should have a look but now I found
https://github.com/rear/rear/commit/46d4dd3fe03595cc7b969db66d6ab1a881fcb6eb

Now it looks good to me (from plain looking at the code).

A side note:
While I tried to reverse engineer

declare -p | sed -n -E -e '/^declare -a/s/declare [-arxlu]+ ([A-Za-z0-9_-]+)=.*/\1/p'

I did the following as currently logged in normal user:

$ declare -p | grep '^declare -a'
declare -a BASH_ARGC=()
declare -a BASH_ARGV=()
declare -a BASH_COMPLETION_VERSINFO=([0]="2" [1]="7")
declare -a BASH_LINENO=()
declare -a BASH_SOURCE=()
declare -ar BASH_VERSINFO=([0]="4" [1]="4" [2]="23" [3]="1" [4]="release" [5]="x86_64-suse-linux-gnu")
declare -a DIRSTACK=()
declare -a FUNCNAME
declare -a GROUPS=()
declare -a PIPESTATUS=([0]="0" [1]="0")

$ declare -p | sed -n -E -e '/^declare -a/s/declare [-arxlu]+ ([A-Za-z0-9_-]+)=.*/\1/p'
BASH_ARGC
BASH_ARGV
BASH_COMPLETION_VERSINFO
BASH_LINENO
BASH_SOURCE
BASH_VERSINFO
DIRSTACK
GROUPS
PIPESTATUS

so the unassigned array FUNCNAME gets skipped.

Is it intentional to skip unassigned arrays?
Is perhaps

sed -n -E -e '/^declare -a/s/declare [-arxlu]+ ([A-Za-z0-9_-]+).*/\1/p'

better (i.e. without the '=')?

schlomo commented at 2023-02-20 13:20:

Wow, thanks for taking a deep look at this.

My original thinking was: If it hasn't been assigned then there is no potential for a wrong assignment, therefore it is enough to look at the arrays with content.

jsmeix commented at 2023-02-20 14:00:

It is fully sufficient in practice to only check assigned arrays.
I think we will never have declare -a CONF_VAR in default.conf.
I only noticed it by chance so I reported my finding.


[Export of Github issue for rear/rear.]