#1068 Issue closed: Possibly faulty usage of '[*]' subscript for arrays

Labels: cleanup, won't fix / can't fix / obsolete, minor bug

jsmeix opened issue at 2016-11-15 11:08:

Current ReaR code uses in very most cases
(actually in 682 cases)

${array[@]}

when "the whole array" is meant.

But sometimes (actually in 22 cases)

${array[*]}

is used.

I think I should investigate the array[*] cases and try to find out
whether or not each array[*] usage is actually right.

FYI: I used

find usr/share/rear/* | xargs grep -ho '[[:alnum:]_]*\[@\]'

versus

find usr/share/rear/* | xargs grep -ho '[[:alnum:]_]*\[\*\]'

to find array[@] versus array[*] usage.

Background information:

See "man bash" for the difference:

If subscript is @ or *, the word expands to all members of name.
These subscripts differ only when the word appears within
double quotes. If the word is double-quoted, ${name[*]}
expands to a single word with the value of each array
member separated by the first character of the IFS special
variable, and ${name[@]} expands each element of name
to a separate word.

In general ${array[*]} is problematic and
using ${array[@]} without double-quotes is also problematic.

For example see how using ${array[*]} or "${array[*]}"
or even ${array[@]} results probably unexpected stuff
(unless those results are actually intended) while
only "${array[@]}" results what is usually expected:

# arr1=( 'first1 first2' 'second1 second2' )

# for e in ${arr1[*]} ; do echo "'$e'" ; done
'first1'
'first2'
'second1'
'second2'

# for e in ${arr1[@]} ; do echo "'$e'" ; done
'first1'
'first2'
'second1'
'second2'

# for e in "${arr1[*]}" ; do echo "'$e'" ; done
'first1 first2 second1 second2'

# for e in "${arr1[@]}" ; do echo "'$e'" ; done
'first1 first2'
'second1 second2'

# arr2=( ${arr1[*]} 'third1 third2' )

# for e in "${arr2[@]}" ; do echo "'$e'" ; done
'first1'
'first2'
'second1'
'second2'
'third1 third2'

# arr2=( ${arr1[@]} 'third1 third2' )

# for e in "${arr2[@]}" ; do echo "'$e'" ; done
'first1'
'first2'
'second1'
'second2'
'third1 third2'

# arr2=( "${arr1[*]}" 'third1 third2' )

# for e in "${arr2[@]}" ; do echo "'$e'" ; done
'first1 first2 second1 second2'
'third1 third2'

# arr2=( "${arr1[@]}" 'third1 third2' )

# for e in "${arr2[@]}" ; do echo "'$e'" ; done
'first1 first2'
'second1 second2'
'third1 third2'

jsmeix commented at 2016-11-15 11:12:

I found only 589 cases where "${array[@]}"
with double-quotes is used...

schlomo commented at 2016-11-16 19:46:

Please post here the 22 cases of "${array[*]}", I remember using that in
some very specific cases on purpose.

On 15 November 2016 at 12:12, Johannes Meixner notifications@github.com
wrote:

I found only 589 cases where "${array[@]}"
with double-quotes is used...


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/rear/rear/issues/1068#issuecomment-260614342, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAGMCNiVGl3vWKAkKHPzG7HDP1GgpiK2ks5q-ZOfgaJpZM4KyZsX
.

jsmeix commented at 2016-11-17 08:04:

In current ReaR GitHUb master code
in the usr/share/rear directory:

The array variables where "${array[*]}" is used:

$ find * | xargs grep -ho '[[:alnum:]_]*\[\*\]' | sort -u
[*]
ADDR[*]
array[*]
deps[*]
dev[*]
name[*]
NETWORKING_PREPARATION_COMMANDS[*]
NEW_INITRD_MODULES[*]
OLD_INITRD_MODULES[*]
RESULT_FILES[*]
TSM_FILESPACE_INCLUDED_NUMS[*]
TSM_FILESPACE_NUMS[*]

The files where "${array[*]}" is used:

$ find * | xargs grep -o '[[:alnum:]_]*\[\*\]' | sort -u
finalize/Fedora/i386/170_rebuild_initramfs.sh:NEW_INITRD_MODULES[*]
finalize/SUSE_LINUX/i386/170_rebuild_initramfs.sh:NEW_INITRD_MODULES[*]
finalize/SUSE_LINUX/i386/170_rebuild_initramfs.sh:OLD_INITRD_MODULES[*]
finalize/SUSE_LINUX/ppc64/500_rebuild_initramfs.sh:NEW_INITRD_MODULES[*]
finalize/SUSE_LINUX/ppc64/500_rebuild_initramfs.sh:OLD_INITRD_MODULES[*]
layout/prepare/default/540_generate_device_code.sh:deps[*]
layout/save/GNU/Linux/250_drbd_layout.sh:dev[*]
output/default/950_copy_result_files.sh:array[*]
output/default/950_copy_result_files.sh:RESULT_FILES[*]
rescue/GNU/Linux/310_network_devices.sh:name[*]
rescue/GNU/Linux/310_network_devices.sh:NETWORKING_PREPARATION_COMMANDS[*]
rescue/GNU/Linux/360_teaming.sh:ADDR[*]
verify/TSM/default/400_verify_tsm.sh:[*]
verify/TSM/default/400_verify_tsm.sh:TSM_FILESPACE_INCLUDED_NUMS[*]
verify/TSM/default/400_verify_tsm.sh:TSM_FILESPACE_NUMS[*]

The files with the actual code lines where "${array[*]}" is used:

$ find * | xargs grep '[[:alnum:]_]*\[\*\]'
finalize/Fedora/i386/170_rebuild_initramfs.sh:  NEW_INITRD_MODULES=( $(tr " " "\n" <<< "${NEW_INITRD_MODULES[*]}" | sort | uniq -u) )
finalize/SUSE_LINUX/i386/170_rebuild_initramfs.sh:      NEW_INITRD_MODULES=( $(tr " " "\n" <<< "${NEW_INITRD_MODULES[*]}" | sort | uniq -u) )
finalize/SUSE_LINUX/i386/170_rebuild_initramfs.sh:      sed -i -e '/^INITRD_MODULES/s/^.*$/#&\nINITRD_MODULES="'"${OLD_INITRD_MODULES[*]} ${NEW_INITRD_MODULES[*]}"'"/' $TARGET_FS_ROOT/etc/sysconfig/kernel
finalize/SUSE_LINUX/ppc64/500_rebuild_initramfs.sh:     NEW_INITRD_MODULES=( $(tr " " "\n" <<< "${NEW_INITRD_MODULES[*]}" | sort | uniq -u) )
finalize/SUSE_LINUX/ppc64/500_rebuild_initramfs.sh:     sed -i -e '/^INITRD_MODULES/s/^.*$/#&\nINITRD_MODULES="'"${OLD_INITRD_MODULES[*]} ${NEW_INITRD_MODULES[*]}"'"/' $TARGET_FS_ROOT/etc/sysconfig/kernel
layout/prepare/default/540_generate_device_code.sh:        Debug "deps (${#deps[@]}): ${deps[*]}"
layout/save/GNU/Linux/250_drbd_layout.sh:        for i in ${!dev[*]}; do
output/default/950_copy_result_files.sh:        # FIXME: Verify if usage of $array[*] instead of "${array[@]}" is actually intended here
output/default/950_copy_result_files.sh:        LogPrint "Copying result files '${RESULT_FILES[*]}' to $scheme location"
output/default/950_copy_result_files.sh:        Log "lftp -c open $OUTPUT_URL; mput ${RESULT_FILES[*]}"
output/default/950_copy_result_files.sh:        lftp -c "open $OUTPUT_URL; mput ${RESULT_FILES[*]}" || Error "Problem transferring result files to $OUTPUT_URL"
rescue/GNU/Linux/310_network_devices.sh:if test "${NETWORKING_PREPARATION_COMMANDS[*]}" ; then
rescue/GNU/Linux/310_network_devices.sh:    # For the above 'test' one must have all array members as a single word i.e. "${name[*]}"
rescue/GNU/Linux/360_teaming.sh:    if [[ ${ADDR[*]} ]]
verify/TSM/default/400_verify_tsm.sh:# NOTE: In this script we use '"${TSM_FILESPACE_NUMS[*]}"' (instead of [@]) and it seems to be intentionally
verify/TSM/default/400_verify_tsm.sh:# 2009-10-12 Schlomo as part of a code review to fix all occurences of [*]
verify/TSM/default/400_verify_tsm.sh:read -t $WAIT_SECS -p "(default: ${TSM_FILESPACE_INCLUDED_NUMS[*]}): [$WAIT_SECS secs] " -r TSM_RESTORE_FILESPACE_NUMS 2>&1
verify/TSM/default/400_verify_tsm.sh:   TSM_RESTORE_FILESPACE_NUMS="${TSM_FILESPACE_INCLUDED_NUMS[*]}" # set default on ENTER
verify/TSM/default/400_verify_tsm.sh:   Log "User pressed ENTER, setting default of ${TSM_FILESPACE_INCLUDED_NUMS[*]}"

gdha commented at 2016-11-18 09:11:

@jsmeix Please be extremely careful with changing the array expansions - you might create new problems. Do not change to just change things.

jsmeix commented at 2016-11-18 09:47:

@gdha
don't worry, as long as things are well documented
no ReaR contributor would inadvertently change it
to a wrong behaviour because he (falsely) assumes
he would improve things.

E.g. (shameless self-praise ;-) see
rescue/GNU/Linux/310_network_devices.sh

if test "${NETWORKING_PREPARATION_COMMANDS[*]}" ; then
    # For the above 'test' one must have all array members as a single word i.e. "${name[*]}"
    # (the test should succeed when there is any non-empty array member, not necessarily the first one)
    # while later one must have the array members as separated 'command' words i.e. "${name[@]}".
    echo "# Prepare network devices setup as specified in NETWORKING_PREPARATION_COMMANDS:" >>$network_devices_setup_script
    for command in "${NETWORKING_PREPARATION_COMMANDS[@]}" ; do

in contrast to e.g. in verify/TSM/default/400_verify_tsm.sh
where Schlomo Schapiro added an initial comment about
that issue but - as far as I understand his comment - he was
unable to find out whether or not the "[*]" usage is really
on purpose (and if yes what that purpose actually is).

jsmeix commented at 2016-11-18 09:56:

I change it from "bug" to "minor bug" because
currently things seem to work.

As long as the array members do not contain blanks
there is no difference beween "[@]" and "[*]" usage
but probably there are some cases where things break
if array members contain blanks.

I am waiting for the first user who has blanks
in system file names or system directory names
that are used by ReaR e.g. things like

BACKUP_PROG_INCLUDE=( '/my important stuff/*' )

I never tested if things like that would work.

jsmeix commented at 2017-01-03 08:04:

Nothing is urgent here.
In particular nothing needs to be done for ReaR 2.0.

jsmeix commented at 2017-02-27 14:52:

Using the '[*]' subscript is required when testing arrays:

# arr1=( 'first1 first2' 'second1 second2' )

# test "${arr1[@]:-}" && echo ok || echo empty
-bash: test: first1 first2: unary operator expected
empty

# test "${arr1[*]:-}" && echo ok || echo empty
ok

jsmeix commented at 2017-05-08 11:29:

An even more complicated usage of the '[*]' subscript
plus an interposed 'echo -n' is required when testing arrays
if there is any non-empty array member (not necessarily the first one)
but on the other hand the test should not succeed when there are
only empty or blank members:

# arr1=( '' ' ' )

# test "${arr1[*]}" && echo non-empty || echo empty
non-empty

# test "$( echo -n ${arr1[*]} )" && echo non-empty || echo empty
empty

# echo -n ${arr1[*]} | wc -c
0

# arr1=( '' ' ' 'some thing' ' ' 'something else' )

# test "$( echo -n ${arr1[*]} )" && echo non-empty || echo empty
non-empty

# echo -n ${arr1[*]} | wc -c
25

# echo -n ${arr1[*]} | od -a
# echo -n ${arr1[*]} | od -a
0000000   s   o   m   e  sp   t   h   i   n   g  sp   s   o   m   e   t
0000020   h   i   n   g  sp   e   l   s   e

cf. https://github.com/rear/rear/pull/1334
and the thereby added explanatory comment
in prep/default/950_check_missing_programs.sh
that reads (excerpts):

For the 'test' one must have all array members
as a single word like "${arr[*]}" with double-quotes
because it should detect when there is any non-empty
array member (not necessarily the first one)
But on the other hand the test should not succeed
when there are only empty or blank members
which would falsely succeed when the
array is e.g. something like arr=( '' ' ' )
because then "${arr[*]}" evaluates to "  "
(the empty and blank members separated
by the first character of IFS) and test "${arr[*]}"
results true for any non-empty argument
(e.g. test " " results true).
Therefore 'echo -n' is interposed because
the output of arr=( '' ' ' ) ; echo -n ${arr[*]}
is empty when the array has only empty
or blank array members

Note that there must be no double-quotes in the

echo -n ${arr[*]}

command because otherwise

# arr=( '' '' )

# test "$( echo -n "${arr[*]}" )" && echo non-empty || echo empty
non-empty

# echo -n "${arr[*]}" | od -a
0000000  sp

jsmeix commented at 2019-03-25 09:46:

FYI:

https://github.com/rear/rear/pull/2098
is a good example how much care is needed to use
the specific right syntax for array expansion:

Within a longer string with double quotes ${arr[@]} may expand
to something unexpeced.

E.g. see the following artificial example what happens:

# arr=( 'first element' 'second element' )

# ls "this ${arr[@]} that"
ls: cannot access 'this first element': No such file or directory
ls: cannot access 'second element that': No such file or directory

The "embedded" ${arr[@]} expands to separated words/strings
(which result two separated ls error messages)
plus (see "man bash"):

  Arrays
    ...
    If the double-quoted expansion occurs within a word,
    the expansion of the first parameter is joined with the
    beginning part of the original word, and the expansion
    of the last parameter is joined with the last part of the
    original word.

that results ls is called as

ls 'this first element' 'second element that'

Therefore within a longer string with double quotes
usually ${arr[*]} expands as normally intended.

E.g. in the above artificial example

# ls "this ${arr[*]} that"
ls: cannot access 'this first element second element that': No such file or directory

the ls command is called with one single parameter.

pcahyna commented at 2019-03-25 09:52:

It seems that https://github.com/koalaman/shellcheck/wiki/SC2145 uses a rule of thumb that ${arr[*]} is more likely to be correct than ${arr[@]} if it occurs embedded in a longer string, because ${arr[@]} will split the string into multiple parts and this is unlikely to be desired. This heuristics IMO makes sense.

jsmeix commented at 2019-03-25 13:08:

Yes, I came to the same heuristic conclusion in my
https://github.com/rear/rear/issues/1068#issuecomment-476122565

Therefore within a longer string with double quotes
usually ${arr[*]} expands as normally intended.

FYI:
The printf example in
https://github.com/koalaman/shellcheck/wiki/SC2145
works for me (output looks different but things do not fail)
with bash 3.x and 4.x

# arr=( 'first element' 'second element' )

# set -x

# printf "Arr: %s\n" "this ${arr[*]} that"
+ printf 'Arr: %s\n' 'this first element second element that'
Arr: this first element second element that

# printf "Arr: %s\n" "this ${arr[@]} that"
+ printf 'Arr: %s\n' 'this first element' 'second element that'
Arr: this first element
Arr: second element that

I am looking for a simple example where ${arr[@]} in a longer string
would make things badly fail...


[Export of Github issue for rear/rear.]