#2220 Issue closed
: bash internal overflow because element zero of CLONE_USERS array is "", resulting in rear doing 'getent passwd' following by 'getent group' for all GIDs¶
Labels: bug
, fixed / solved / done
pcahyna opened issue at 2019-08-29 18:07:¶
Relax-and-Recover (ReaR) Issue Template¶
Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):
-
ReaR version ("/usr/sbin/rear -V"):
Relax-and-Recover 2.4 / Git -
OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"):
RedHatEnterpriseServer 7 -
ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
not available -
Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
PC -
System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
x86_64 -
Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
Any/GRUB -
Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
any -
Description of the issue (ideally so that others can reproduce it):
With a huge number of users, groups and users in each group (tens of thousands), ReaR may abort with this in the log:Including rescue/default/900_clone_users_and_groups.sh Cloning users: daemon rpc usbmuxd usbmux vcsa nobody dbus /usr/share/rear/rescue/default/900_clone_users_and_groups.sh: xrealloc: cannot allocate 18446744071562067968 bytes (6635520 bytes allocated)
The underlying problem is:
-
/usr/share/rear/conf/GNU/Linux.conf ends up setting
${CLONE_USERS[0]}
to""
: CLONE_USERS=( "${CLONE_USERS[@]:-}" daemon rpc usbmuxd usbmux vcsa nobody dbus ) -
This then results in the for loop in https://github.com/rear/rear/blob/b7a75c3a39ee8cf0402f83da7fca8df231c7b979/usr/share/rear/rescue/default/900_clone_users_and_groups.sh#L42 doing 'getent passwd' without additional parameters in the first iteration, thus dumping ALL users (
$user
is set to""
) for user in "${CLONE_USERS[@]}" ; do # Skip if the user exists already in the ReaR recovery system: grep -q "^$user:" $ROOTFS_DIR/etc/passwd && continue # Skip if the user does not exist in the current system: if ! passwd_entry="$( getent passwd $user )" ; then Debug "Cannot clone user $user because it does not exist" continue fi -
The script then ends up doing a 'getent group $groupID' where groupID which is every users GID groupID="$( cut -d ':' -f '4' <<<"$passwd_entry" )" if ! group_entry="$( getent group $groupID )" ; then
-
With such a large number of users, groups and users in each groups, the output of the last command is so huge that the assignment to
group_entry
overflows some internal limit of bash.
The problematic case is hard to reproduce as it requires a really large amount of users, groups and users per group. However, the underlying problem (which seems mostly harmless in more limited settings) manifests itself easily:
Cloning users: daemon rpc usbmuxd usbmux vcsa nobody dbus
Note the additional space after :. This space is printed in the first iteration of the loop with
$user
empty.The root cause is the introduction of empty array elements in PR #699 to pacify
set -ue
in bash 3, which can not cope with empty arrays - so those were replaced by arrays with one empty member. (Since there are more arrays which were changed this way in this commit, it is possible that similar errors are more widespread.) See our discussion in https://github.com/rear/rear/commit/c19647fcd95f0c3f4020f39f5fad428e8359ec92#r32912187, https://github.com/rear/rear/commit/c19647fcd95f0c3f4020f39f5fad428e8359ec92#r32911995, https://github.com/rear/rear/commit/c19647fcd95f0c3f4020f39f5fad428e8359ec92#r32912160 -
-
Workaround, if any:
Not sure about workaround, but there are several possible fixes:-
The cleanest would be to revert the parts of PR #699 that introduce those empty array elements. The resulting problem in bash 3 and
set -ue
could be dealt with in the following ways:- by ignoring it, i.e desupport bash 3 together with
set -ue
.
This does not seem to be a big problem, asset -ue
in general (even bash 4) is apparently broken without anybody noticing: https://github.com/rear/rear/pull/1720/files#r317556097 and there are further places that probably broke it more for bash 3 (non-exhaustive list, and some of them may be false positives, i.e. the arrays become non-empty at some point):BACKUP_PROG_INCLUDE
:
https://github.com/rear/rear/commit/eb89244bad4a73eaed3bfcc4ddb9e85658819230#r34830311 https://github.com/rear/rear/blob/b7a75c3a39ee8cf0402f83da7fca8df231c7b979/usr/share/rear/backup/NETFS/default/400_create_include_exclude_files.sh#L3 https://github.com/rear/rear/blob/d8f1571a213a9df272327bb070e8a87f78fc14c3/usr/share/rear/layout/save/default/310_include_exclude.sh#L27BACKUP_PROG_EXCLUDE
: https://github.com/rear/rear/blob/950d14af59b9262525adf2f0f08bd990cb931e8c/usr/share/rear/backup/NETFS/default/400_create_include_exclude_files.sh#L19EXCLUDE_MOUNTPOINTS
: https://github.com/rear/rear/commit/7c3b34d467b9e03ca6c36b7478f6e6ae6f609ba7#r34830344 https://github.com/rear/rear/blob/b7a75c3a39ee8cf0402f83da7fca8df231c7b979/usr/share/rear/backup/NETFS/default/400_create_include_exclude_files.sh#L29EXCLUDE_MD
: https://github.com/rear/rear/blob/d8f1571a213a9df272327bb070e8a87f78fc14c3/usr/share/rear/layout/save/default/310_include_exclude.sh#L35EXCLUDE_VG
: https://github.com/rear/rear/blob/d8f1571a213a9df272327bb070e8a87f78fc14c3/usr/share/rear/layout/save/default/310_include_exclude.sh#L41EXCLUDE_COMPONENTS
: https://github.com/rear/rear/blob/d8f1571a213a9df272327bb070e8a87f78fc14c3/usr/share/rear/layout/save/default/310_include_exclude.sh#L57EXCLUDE_RECREATE
: https://github.com/rear/rear/blob/d8f1571a213a9df272327bb070e8a87f78fc14c3/usr/share/rear/layout/save/default/310_include_exclude.sh#L63RmInArray
: https://github.com/rear/rear/commit/e51bbef1034bc8c18112299ffe81da1572bbbf50#r34792188
- by protecting the array expansion in a bash 3 and
set -ue
compatible way: i.e. instead offor i in "${A[@]}"
writefor i in ${A+"${A[@]:-}"}
. To test for an empty array, write(( ${#A[@]} ))
or[ ${#A[@]} -gt 0 ]
. This is possible, but would require lots of changes which would make the code a bit less readable.
- by ignoring it, i.e desupport bash 3 together with
-
In the present case, a less invasive change would be to replace the appending to the array
CLONE_USERS=( "${CLONE_USERS[@]:-}" daemon rpc usbmuxd usbmux vcsa nobody dbus )
by
+=
, i.e.CLONE_USERS+=( daemon rpc usbmuxd usbmux vcsa nobody dbus )
(I've verified that
+=
works properly even in bash 3 andset -ue
.) This would fix the present case and other cases where the problem lies in appending to arrays, but not those where the array is already initialized to("")
. -
The most basic fix is to compensate for the fake empty element:
for user in "${CLONE_USERS[@]}" ; do [[ -z ${user:-} ]] && continue # Skip if the user exists already in the ReaR recovery system:
and if similar problems are found or newly introduced elsewhere, fix them in a similar way.
It is also the least desirable fix IMHO, because it pollutes the code to work around an artificial element that should not be there in the first place, and one will have to remember to work around it in every new loop that is introduced.
I would to like to discuss which path to take and I will then submit a PR.
-
-
Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
See above for relevant log snippets.
jsmeix commented at 2019-08-30 13:59:¶
@pcahyna
thank you so much for your careful analysis of the underlying root
cause
and for your explanatory and comprehensive description.
Offhandedly I think this is just one more example that shows
why 'set -eu' looks good in theory but does not work well in practice.
What does in particular not work in practice (with reasonable effort)
is 'set -u'.
I experienced that yesterday with a privately made and used bash
script.
While 'set -e' can be quite useful in practice at places where one must
not
blindly proceeed with the next command unless the current command
succeeded
(the best example for that in ReaR is the diskrestore.sh script),
I think 'set -u' is only useful while one is making and testing a
script
because it helps to detect misspelled variable names (for example
as in counter=conuter+1
) but from my experience 'set -u' as a general
setting
while scripts are run does more harm than good.
So my current conclusion is to revert those things (in particular the
dummy
array elements) that I had added to make the code work with 'set -u'
because those added stuff breakes other code at other places.
Furthermore the ReaR code is meant to be adapted and enhanced as
needed
by the user which means our scripts should be as much as possible in
compliance
with "the usual way" how a normal bash script is made nowadays in
practice
to avoid unexpected behaviour for the user when he changed a script.
By the way a related side note:
Our global shopt -s nullglob
setting in usr/sbin/rear
https://github.com/rear/rear/blob/master/usr/sbin/rear#L292
is another example where ReaR scripts do not behave as usually
expected:
Have "fun" with code like grep something my*files
when nothing matches my*files
;-)
pcahyna commented at 2019-08-30 14:47:¶
@jsmeix the analysis of the particular problem was actually done by @hartsjc - thanks! I did only the analysis of the root cause, related issues and possible fixes.
jsmeix commented at 2019-08-30 15:50:¶
@pcahyna
regarding your
... set -ue in general (even bash 4) is apparently broken
without anybody noticing:
https://github.com/rear/rear/pull/1720/files#r317556097
Currently only the 'help' workflow should work with 'set -ue'
all other workflows do not yet work with 'set -ue', cf.
https://github.com/rear/rear/issues/700#issuecomment-158332829
and in
https://github.com/rear/rear/pull/699
2.)
I fixed all what fails because of 'set -ue -o pipefail'
for the "help" workflow as a very first step into that direction.
Since that time nothing more was done, because it is questionable
if it is worth the effort to make the code ready for 'set -eu',
see also the later comments in
https://github.com/rear/rear/issues/700
jsmeix commented at 2019-09-03 14:18:¶
What needs to be fixed in any case in
rescue/default/900_clone_users_and_groups.sh
is the above mentioned
The most basic fix is to compensate for the fake empty element:
for user in "${CLONE_USERS[@]}" ; do
[[ -z ${user:-} ]] && continue
because CLONE_USERS and CLONE_GROUPS are user config variables
and things need to still work if the user had set (by accident) an array
element
that is empty or only blanks
cf.
https://github.com/rear/rear/commit/c19647fcd95f0c3f4020f39f5fad428e8359ec92#r32912521
jsmeix commented at 2019-09-03 14:27:¶
With
https://github.com/rear/rear/commit/36cf20e2d8bc136b334c2a70460b645afb770a24
I skip empty user and group values (in a simple but not save for
set -u
way).
jsmeix commented at 2019-09-03 14:41:¶
@hartsjc
many thanks for your laborious analysis of the issue!
I think I will add a new section
Beware of the emptiness!
to
https://github.com/rear/rear/wiki/Coding-Style
because so many tools behave very different if run with empty argument
for example
grep something $some_file
when $some_file is empty or only blanks
STDIN is used which lets rear hang up in practice because
the user won't provide input
or
ls $some_dir
when $some_dir is empty or only blanks which
lists the current working directory (cf. above shopt -s nullglob
)
which is in practice always wrong because if one wants to list
the current working directory one would not use $some_dir
and so on...
pcahyna commented at 2019-09-03 19:11:¶
What needs to be fixed in any case in
rescue/default/900_clone_users_and_groups.sh
is the above mentionedThe most basic fix is to compensate for the fake empty element: for user in "${CLONE_USERS[@]}" ; do [[ -z ${user:-} ]] && continue
because CLONE_USERS and CLONE_GROUPS are user config variables
and things need to still work if the user had set (by accident) an array element
that is empty or only blanks
cf. c19647f#r32912521
I would not say it is a good idea to clutter the code with such tests
for things that user might do wrong with variables. What if users
accidentally sets it to "-h" or "-V" for example, I doubt that "getent"
will work as expected in this case. If this is really a concern, I
rather propose a shared function to validate arrays.
I will send a PR for appending using +=, which is IMO the preferable
way, as it is simple yet clean (and does not interfere with your fix).
PR #2223
hartsjc commented at 2019-09-03 20:16:¶
@hartsjc
many thanks for your laborious analysis of the issue!
Glad to be of help.
jsmeix commented at 2019-09-03 20:38:¶
@pcahyna
I think there is a practical difference between when the user sets
a non-empty wrong value where it is visible that this cannot work
e.g. the value $( rm -rf * )
;-)
compared to an empty or empty looking value where one may
assume that what one cannot see cannot have a (bad) effect,
so I think "beware of the emptiness" is reasonable and helpful
in practice while it would be over the top if we tried to make ReaR
safe against any possible nonsense values.
jsmeix commented at 2019-09-05 12:36:¶
Via
https://github.com/rear/rear/commit/b8b7e466ed85891eff67e6653fb2239f41655f82
I replaced in the comments in default.conf
ARRAY=( "${ARRAY[@]}" additional elements
)
with the simpler and more fail safe
ARRAY+=( additional elements )
according to
https://github.com/rear/rear/issues/2220#issue-487102052
I've verified that += works properly even in bash 3 and set -ue
jsmeix commented at 2019-09-05 12:49:¶
@pcahyna
do you think it would be a good idea to replace
ARRAY=( "${ARRAY[@]}" additional elements )
with
ARRAY+=( additional elements )
everywhere in the ReaR code?
I did a quick find . | xargs grep '=(.*[@]'
in ReaR master code
which indicates there are more than 400 places where we use
ARRAY=( "${ARRAY[@]}" additional elements )
pcahyna commented at 2019-09-05 14:56:¶
@jsmeix yes, I don't see a reason why not to append using +=, do you know why it has not been used in ReaR?
pcahyna commented at 2019-09-05 15:05:¶
I think I will add a new section
Beware of the emptiness!
to https://github.com/rear/rear/wiki/Coding-Style
because so many tools behave very different if run with empty argument
for example
grep something $some_file
when $some_file is empty or only blanks
STDIN is used which lets rear hang up in practice because
the user won't provide input
or
ls $some_dir
when $some_dir is empty or only blanks which
lists the current working directory (cf. aboveshopt -s nullglob
)
which is in practice always wrong because if one wants to list
the current working directory one would not use $some_dir
and so on...
grep something "$some_file"
and ls "$some_dir"
behave properly, i.e.
fail. Empty parameter is not the same as a missing parameter, which
indeed often has a special meaning. The problem with bash that it makes
way too easy to treat undefined variables as missing values, instead of
as empty values. One more reason to quote variable expansion properly
(one should do it anyway because of blanks in values etc.). I think it
is therefore more productive to focus on quoting parameter expansion
rather than on inserting more and more code to guard against empty or
blank values.
jsmeix commented at 2019-09-10 13:25:¶
@pcahyna
I do not know why +=
was not used in ReaR.
I guess perhaps because at the time when ReaR started
+=
might have been not yet supported in bash?
I would very much appreciate it if you could find an automated command
(e.g. via sed
or whatever tool you like) that replaces
ARRAY=( "${ARRAY[@]}" additional elements )
with
ARRAY+=( additional elements )
everywhere in the ReaR code without changing "false positives"
i.e. without creating regressions.
You are right that proper quoting is mandatory for fail safe code
but in practice ... :-(
FYI
an example where I intentionally do not quote:
# Ensure $foo is a single non empty and non blank word
# (no quoting because test " " returns zero exit code):
test $foo ...
Perhaps there is a better way to do that?
pcahyna commented at 2020-05-25 12:58:¶
Our global
shopt -s nullglob
setting in usr/sbin/rear
https://github.com/rear/rear/blob/master/usr/sbin/rear#L292
is another example where ReaR scripts do not behave as usually expected:
Have "fun" with code likegrep something my*files
when nothing matchesmy*files
;-)
@jsmeix Would it work to use shopt -s failglob
instead of
shopt -s nullglob
?
jsmeix commented at 2020-05-26 06:14:¶
I think in practice (i.e. with reasonable effort)
we cannot change our global bash globbing settings
because we cannot know at which places in all of the scripts
the nullglob
behaviour is required to make things work
because usually there are no comments or other signs
how we could detect places that need nullglob
.
An example how nullglob
behaviour lets things work as intened
while failglob
would let things completely fail in a bad way:
# ( shopt -s nullglob ; set -x ; tar -cvf /tmp/testy.tar /etc/issue /optional/stuff* /etc/fstab && echo OK || echo Error $? )
+ tar -cvf /tmp/testy.tar /etc/issue /etc/fstab
tar: Removing leading `/' from member names
/etc/issue
tar: Removing leading `/' from hard link targets
/etc/fstab
+ echo OK
OK
where nullglob
gracefully skips non-existent optional files
versus
# ( shopt -s failglob ; set -x ; tar -cvf /tmp/testy.tar /etc/issue /optional/stuff* /etc/fstab && echo OK || echo Error $? )
-bash: no match: /optional/stuff*
where the whole command is not at all executed.
But how could one autodetect pieces of code like
tar -cvf /tmp/testy.tar /etc/issue /optional/stuff* /etc/fstab && echo OK || echo Error $?
that require nullglob
behaviour ?
Searching for any bash globbing patterns in the code would be possible
but I fear there are way too many places where bash globbing patterns
are used in the code so I think in practice (i.e. with reasonable
effort)
we cannot change our global bash globbing settings.
[Export of Github issue for rear/rear.]