#904 Issue closed: Speedup REAR by changing compression level?

Labels: enhancement, fixed / solved / done

stoxxys opened issue at 2016-07-06 14:28:

I am looking for a possibility to speedup the backup process by changing the compression level of gzip. It works fine but it is really slow caused by the usage of only one CPU.

I tried to use only tar. It works fine with an acceptable speed but I want to use a low compression to save diskspace.The default compressionlevel is 6. I think compressionlevel 4 would be a good deal.

Does anyone know how I can edit these compression settings in REAR?

OS= openSUSE 42.1
REAR-Version= 1.18

jsmeix commented at 2016-07-06 15:14:

SUSE issues belog to me.
I will have a look...

gdha commented at 2016-07-07 16:04:

@jsmeix Have a look at http://www.it3.be/2013/09/16/NETFS-compression-tests/

stoxxys commented at 2016-07-08 06:17:

Thanks for the information gdha. I already know that information. Other formats like bzip and xz are much slower than gzip.
I am looking for a way to change the gzip compression level. I want to use gzip.

jsmeix commented at 2016-07-12 14:23:

According to
https://www.gnu.org/software/tar/manual/html_node/gzip.html#SEC134
the tar -I option can override the defaults like

tar -c -I 'gzip -9 -n -c' -f archive.tar.gz directory

but in /etc/rear/local.conf

BACKUP_PROG_COMPRESS_OPTIONS="-I 'gzip -9 -n -c'"

does not work because in
usr/share/rear/backup/NETFS/default/50_make_backup.sh
it is used as

case "$(basename ${BACKUP_PROG})" in
    # tar compatible programs here
    (tar)
    ...
        $BACKUP_PROG $TAR_OPTIONS ...  \
        ...
        ... $BACKUP_PROG_COMPRESS_OPTIONS \

which leads to a quoting hell where I failed to escape
from it with reasonable effort.

According to
http://superuser.com/questions/360966/how-do-i-use-a-bash-variable-string-containing-quotes-in-a-command
and other such postings the best way out of quoting hell is

Putting commands (or parts of commands) into variables
and then getting them back out intact is complicated.
...
Usually, the best way to do this sort of thing is using
an array instead of a simple text variable:
FLAGS=(--archive --exclude="foo bar.txt")
rsync "${FLAGS[@]}" dir1 dir2

Example:

# BACKUP_PROG_COMPRESS_OPTIONS=(-I 'gzip -9 -n -c')
# set -x
# tar -c -v "${BACKUP_PROG_COMPRESS_OPTIONS[@]}" -f archive.tar.gz directory
+ tar -c -v -I 'gzip -9 -n -c' -f archive.tar.gz directory

As time permits I will play around with it...

jsmeix commented at 2016-07-12 14:31:

Wow!

It seems (at least on SLE11 and SLE12) that bash is
tolerant regarding array versus simple text variable.

In particular one can generally use ${VAR[@]}
also to get the value of a simple text variable:

# BACKUP_PROG_COMPRESS_OPTIONS=foo
# echo "'$BACKUP_PROG_COMPRESS_OPTIONS'"
'foo'
# echo "'${BACKUP_PROG_COMPRESS_OPTIONS[@]}'"
'foo'

This way it seems I can change
usr/share/rear/backup/NETFS/default/50_make_backup.sh
in a backward compatible way so that it still works when
BACKUP_PROG_COMPRESS_OPTIONS is defined
as a simple text variable but also when it is defined as
an array.

jsmeix commented at 2016-07-12 15:08:

It seems basically with simple replacing of all

$BACKUP_PROG_COMPRESS_OPTIONS

with

"${BACKUP_PROG_COMPRESS_OPTIONS[@]}"

in usr/share/rear/backup/NETFS/default/50_make_backup.sh
and usr/share/rear/restore/NETFS/default/40_restore_backup.sh
makes it work for me with things like

BACKUP_PROG_COMPRESS_OPTIONS=(-I 'gzip -9 -n -c')

in /etc/rear/local.conf

@stoxxys
please test if it also works for you this way
and provide feedback.

schlomo commented at 2016-07-12 17:46:

@jsmeix this is actually standard behavior for Bash and the reason why I recommend to implement all variables like this as an array. It doesn't cost us anything and it enables the users to provide more complex values.

jsmeix commented at 2016-07-13 08:58:

@schlomo
many thanks for your information!

It means that I can implement it in this way
without fear of backward incompatible regressions.

jsmeix commented at 2016-07-13 09:54:

I verified that https://github.com/rear/rear/issues/904#issuecomment-232065858
also works on SLE10 which has bash version 3.1.17
(SLE11 has bash version 3.2.51 and SLE12 has 4.2.47).

jsmeix commented at 2016-07-13 13:51:

With https://github.com/rear/rear/pull/912
this enhancement should be implemented.

Note that tar version >= 1.27 is required for
using it with command options (like 'gzip -9') in

BACKUP_PROG_COMPRESS_OPTIONS=( -I 'gzip -9 -n -c' )

because only
since tar version 1.27 tar supports passing command
line arguments to external commands, see
/usr/share/rear/conf/default.conf

jsmeix commented at 2016-07-14 14:42:

FYI
some statistics how the gzip compression level from 1 to 9
makes a difference on my particular SLES12-SP1 test system:

I specified in etc/rear/local.conf from

BACKUP_PROG_COMPRESS_OPTIONS=( -I 'gzip -1 -n -c' )

to

BACKUP_PROG_COMPRESS_OPTIONS=( -I 'gzip -9 -n -c' )

and got for the gzip compression level from 1 to 9
those backup.tar.gz sizes and the time and speed
to make each (according to the last line in backup.log):

-1 : 1286M backup.tar.gz in 129.191 s at 10.4 MB/s
-2 : 1270M backup.tar.gz in 131.495 s at 10.1 MB/s
-3 : 1256M backup.tar.gz in 139.401 s at 9.4 MB/s
-4 : 1232M backup.tar.gz in 140.011 s at 9.2 MB/s
-5 : 1217M backup.tar.gz in 165.636 s at 7.7 MB/s
-6 : 1210M backup.tar.gz in 174.397 s at 7.3 MB/s
-7 : 1208M backup.tar.gz in 194.586 s at 6.5 MB/s
-8 : 1206M backup.tar.gz in 271.525 s at 4.7 MB/s
-9 : 1205M backup.tar.gz in 383.796 s at 3.3 MB/s

Interestingly for me gzip compression level 9
needs almost three times more time but results
only about 7% less backup.tar.gz size
compared to gzip compression level 1.

@stoxxys
what are your results with that new feature?

stoxxys commented at 2016-07-26 11:57:

I have tested the new feature an it works great!

Thanks a lot

jsmeix commented at 2023-01-19 12:58:

Changing a string variable into an array
works usually backward compatible as in
https://github.com/rear/rear/issues/904#issuecomment-232065858
but in special cases it is not backward compatible:

In particular one cannot simply use

BACKUP_PROG_COMPRESS_OPTIONS=""

instead of the right

BACKUP_PROG_COMPRESS_OPTIONS=()

because during "rear recover" (with BACKUP=NETFS with usual 'tar')
the former (i.e. BACKUP_PROG_COMPRESS_OPTIONS="")
makes tar ... -C /mnt/local/ untar the whole backup.tar
into the '/root' directory inside the ReaR recovery system, see
https://github.com/rear/rear/issues/2911#issuecomment-1387056556


[Export of Github issue for rear/rear.]