#1269 Issue closed: 400_autoresize_disks.sh incorrectly calculates $new_size with mawk (Debian)¶
Labels: enhancement, cleanup, fixed / solved / done, minor bug
gozora opened issue at 2017-03-28 18:10:¶
Relax-and-Recover (ReaR) Issue Template¶
- rear version (/usr/sbin/rear -V): Relax-and-Recover 2.00 / Git
- OS version (cat /etc/rear/os.conf or lsb_release -a): Debian GNU/Linux 8.7 (jessie)
- rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):
BACKUP=NETFS
OUTPUT=ISO
OUTPUT_URL=nfs://beta/mnt/rear/iso
BACKUP_URL=nfs://beta/mnt/rear
EXCLUDE_BACKUP=( ${EXCLUDE_BACKUP[@]} fs:/crash fs:/usr/sap fs:/oracle )
BACKUP_PROG_EXCLUDE=( ${BACKUP_PROG_EXCLUDE[@]} '/mnt/*' '/media/*' )
ISO_MKISOFS_BIN=/usr/bin/ebiso
GRUB_RESCUE=n
USE_STATIC_NETWORKING=y
NETWORKING_PREPARATION_COMMANDS=(ifconfig eth1 inet 192.168.0.23)
- Are you using legacy BIOS or UEFI boot? UEFI
- Brief description of the issue:
During rear recover, the partition size is incorrectly calculated with mawk.
The following code is executed by 400_autoresize_disks.sh:
new_size=$(echo "$partition_size $resizeable_space $available_space" | awk '{ printf "%d", ($1/$2)*$3; }')
With gawk values are correct:
echo '8333033472 8333033472 39473643520' | awk '{ printf "%d", ($1/$2)*$3; }'
39473643520
With mawk, however, values are capped at 0x7FFFFFFF (2147483647):
echo '8333033472 8333033472 39473643520' | awk '{ printf "%d", ($1/$2)*$3; }'
2147483647
- Work-around, if any:
Change of awk conversion specification format:
- new_size=$(echo "$partition_size $resizeable_space $available_space" | awk '{ printf "%d", ($1/$2)*$3; }')
+ new_size=$(echo "$partition_size $resizeable_space $available_space" | awk '{ printf "%.0f", ($1/$2)*$3; }')
Or maybe use bash arithmetic expansion ?
- new_size=$(echo "$partition_size $resizeable_space $available_space" | awk '{ printf "%d", ($1/$2)*$3; }')
+ new_size=$(( ($partition_size/$resizeable_space)*$available_space ))
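For comparison, here is a minimal sketch of the first work-around wrapped in a function. The name `scale_partition_size` is hypothetical and not part of ReaR. The only point is that `%.0f` prints the floating point result directly, so it does not go through the `%d` conversion that mawk caps.

```shell
#!/bin/bash
# Hypothetical helper (not ReaR code): scale a partition size proportionally.
# Arguments: partition_size resizeable_space available_space (all in bytes).
scale_partition_size() {
    # "%.0f" prints the floating point result rounded to an integer,
    # avoiding the "%d" conversion that mawk limits to 2147483647.
    echo "$1 $2 $3" | awk '{ printf "%.0f", ($1/$2)*$3 }'
}

scale_partition_size 8333033472 8333033472 39473643520
echo
```

Note that this stays exact only while the values fit into the 53-bit mantissa of a double, so it trades one limit for a much larger one rather than removing it.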
jsmeix commented at 2017-03-29 07:27:¶
I do very very much appreciate it whenever
needless external tool calls are replaced
by native bash 3.0 functionality to
Keep the Implementation Simple and Straightforward.
Furthermore I personally prefer when spaces
are used when possible to aid readability like
new_size=$(( ( $partition_size / $resizeable_space ) * $available_space ))
which at least helps my elderly eyes ;-)
cf.
https://github.com/rear/rear/wiki/Coding-Style
For me with GNU bash version 3.2.51 on SLES11 32-bit
bash arithmetic evaluation and expansion works with your values
# partition_size=8333033472 ; resizeable_space=8333033472 ; available_space=39473643520 ; new_size=$(( ( $partition_size / $resizeable_space ) * $available_space )) ; echo $new_size
39473643520
According to
http://unix.stackexchange.com/questions/117280/what-is-the-rationale-for-the-bash-shell-not-warning-you-of-arithmetic-overflow
I tried what the maximum value is for my GNU bash version 3.2.51
on SLES11 32-bit and it is 2^63 - 1
# max=$(( 2**63 - 1 )) ; echo $max
9223372036854775807
# max=$(( 2**63 )) ; echo $max
-9223372036854775808
# echo '2^63 - 1' | bc -l
9223372036854775807
# echo '2^63' | bc -l
9223372036854775808
and I get same results with GNU bash version 4.2.47
on openSUSE Leap 42.1 64-bit:
# max=$(( 2**63 - 1 )) ; echo $max
9223372036854775807
# max=$(( 2**63 )) ; echo $max
-9223372036854775808
so I conclude that in practice the maximum value for
bash arithmetic evaluation is 2^63 - 1 = 9223372036854775807
which, counted in bytes, is 8388608 TiB minus one byte, so that we can
use bash arithmetic for disks up to 8388607 TiB.
gdha commented at 2017-03-29 08:08:¶
@gozora choose the bash way 👍
gozora commented at 2017-03-29 08:52:¶
@gdha, @jsmeix
Thanks a lot for your inputs!
I'll prepare PR later today.
V.
jsmeix commented at 2017-03-29 09:00:¶
@gozora
I think in layout/prepare/GNU/Linux/100_include_partition_code.sh
start=$( echo "$start" | awk '{printf "%u", $1+4096-($1%4096);}')
is also a place where mawk could fail because I think
the partition start values in ReaR use bytes as unit
so that on bigger disks it could overflow with mawk.
jsmeix commented at 2017-03-29 09:01:¶
@gozora
thanks a lot for your careful testing
that reveals those generic bugs in ReaR!
gozora commented at 2017-03-29 09:05:¶
> I think in layout/prepare/GNU/Linux/100_include_partition_code.sh
> start=$( echo "$start" | awk '{printf "%u", $1+4096-($1%4096);}')
> is also a place where mawk could fail because I think
> the partition start values in ReaR use bytes as unit
> so that on bigger disks it could overflow with mawk.
Good catch @jsmeix, I'll rewrite this one as well.
> thanks a lot for your careful testing
> that reveals those generic bugs in ReaR!
No problem, it just somehow happens that I keep finding those bugs :-)
jsmeix commented at 2017-03-29 09:19:¶
I fear the whole
start=$( echo "$start" | awk '{printf "%u", $1+4096-($1%4096);}')
is plain wrong - or I do something wrong here:
# start=$(( 4096 * 1234 ))
# echo $start
5054464
# rounded_start=$( echo "$start" | awk '{printf "%u", $1+4096-($1%4096);}')
# echo $rounded_start
5058560
# echo $(( rounded_start / 4096 ))
1235
I.e. even for an exactly matching start value, the current code
recalculates it to a new start + 4096 value.
I guess nobody notices a change of 4096 bytes in practice
but it would be a major bug in ReaR when it does not recreate
the partitioning exactly as it was before when possible.
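To make the concern concrete, here is a sketch of a round-up that leaves already aligned values untouched. It assumes the intent of the existing code is to round up to the next 4096-byte boundary, which the thread does not confirm. The function name `align_up` and its optional second argument are hypothetical, not ReaR code.

```shell
#!/bin/bash
# Hypothetical helper (not ReaR code): round a byte offset up to the next
# multiple of block_size, leaving already aligned values unchanged.
align_up() {
    local start=$1 block_size=${2:-4096}
    echo $(( ( start + block_size - 1 ) / block_size * block_size ))
}

align_up 5054464      # already a multiple of 4096, stays 5054464
align_up 5054465      # rounded up to 5058560
```

Adding `block_size - 1` before the integer division is what keeps an aligned value in place, whereas `start + 4096 - (start % 4096)` always moves it forward by a full block.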
If I am right the above is perhaps the cause of the mysterious
changes of partitioning after "rear recover" that you detected
during your tests with your BACKUP=BLOCKCLONE
in particular for Windows NTFS partitions, cf.
https://github.com/rear/rear/issues/1078#issuecomment-266099227
jsmeix commented at 2017-03-29 09:37:¶
The more I try to understand it the less I understand it.
I cannot reproduce it in practice on my original system:
# parted /dev/sda unit B print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 21474836480B
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start        End           Size          Type     File system     Flags
 1      1048576B     1562378239B   1561329664B   primary  linux-swap(v1)  type=82
 2      1562378240B  21474836479B  19912458240B  primary  ext4            boot, type=83
and after "rear recover"
# parted /dev/sda unit B print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 21474836480B
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start        End           Size          Type     File system     Flags
 1      1048576B     1562378239B   1561329664B   primary  linux-swap(v1)  type=83
 2      1562378240B  21474836479B  19912458240B  primary  ext4            boot, type=83
but when I try the current code with my /dev/sda2 partition
start value it calculates a wrong new start value:
# start=$(( 1048576 + 1561329664 )) ; echo $start ; echo $(( start / 4096 ))
1562378240
381440
# rounded_start=$( echo "$start" | awk '{printf "%u", $1+4096-($1%4096);}')
# echo $rounded_start ; echo $(( rounded_start / 4096 ))
1562382336
381441
Am I mad or what goes on here?
gozora commented at 2017-03-29 10:22:¶
Hehe, looks like a skeleton in the closet.
I'll try to check it as well ...
jsmeix commented at 2017-03-29 10:43:¶
@gozora
only a side note FYI if you work on 100_include_partition_code.sh
and do not like the hardcoded '4096' guess of
"most device's block size":
You may have a look at the current
USB_PARTITION_ALIGN_BLOCK_SIZE
implementation in 300_format_usb_disk.sh
cf.
https://github.com/rear/rear/issues/1201
and
https://github.com/rear/rear/pull/1217
jsmeix commented at 2017-03-29 13:48:¶
@gozora
cf.
https://github.com/rear/rear/issues/1270
i.e. be careful with parted units when you like
to stay backward compatible with older parted.
jsmeix commented at 2017-03-30 08:50:¶
Hooray!
Even with GNU bash, version 3.1.17 on SLES10
bash arithmetic evaluation works up to 2^63 - 1
at least on x86_64:
# cat /etc/issue
Welcome to SUSE Linux Enterprise Server 10 SP4 (x86_64)
# bash --version
GNU bash, version 3.1.17(1)-release (x86_64-suse-linux)
# max=$(( 2**63 - 1 )) ; echo $max
9223372036854775807
# max=$(( 2**63 )) ; echo $max
-9223372036854775808
cf. https://github.com/rear/rear/issues/1269#issuecomment-290006467
Therefore we can use bytes 'B' as unit for parted
(cf.
https://github.com/rear/rear/issues/1270)
up to disk sizes of 9223372036854775807 bytes
i.e. up to 8589934592 GiB minus one byte
or 8388608 TiB minus one byte
or 8192 PiB minus one byte
or 8 EiB minus one byte
which looks sufficiently future-proof
(at least from my current point of view).
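The unit conversions can be checked with bash arithmetic itself. This is only a sketch: the variable name `max_bytes` is chosen here for illustration, and because integer division truncates, each line shows the number of whole units that fit into 2^63 - 1 bytes.

```shell
#!/bin/bash
# Largest value bash arithmetic handled in the tests above (2^63 - 1 bytes).
max_bytes=9223372036854775807

# Integer division truncates, so each line is the number of whole units.
echo "GiB: $(( max_bytes / 2**30 ))"
echo "TiB: $(( max_bytes / 2**40 ))"
echo "PiB: $(( max_bytes / 2**50 ))"
echo "EiB: $(( max_bytes / 2**60 ))"
```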
jsmeix commented at 2017-03-31 07:23:¶
With
https://github.com/rear/rear/pull/1272
merged,
this particular issue is fixed.
Any "grand cleanup" of the code in 100_include_partition_code.sh
can be done later as time permits.
jsmeix commented at 2017-04-03 11:27:¶
An addendum FYI:
Also on a 32-bit SLES10 installation on a 32-bit virtual machine
(we did not find some real 32-bit hardware for that test)
with bash version 3.1.17 arithmetic evaluation works
up to 2^63 - 1
ProBackup-nl commented at 2017-04-16 20:33:¶
@jsmeix For GNU Bash 4.4.12 x86_64 the bash arithmetic resize code fails at least for each situation where $resizeable_space is larger than $partition_size.
( $partition_size / $resizeable_space )
$ echo $(( (2/3)*9999 ))
0
Failing workaround: Mimicking the example by first multiplying fails because bash is not able to correctly multiply 20G with 30G.
$ bash --version
GNU bash, version 3.2.53(1)-release (x86_64-apple-darwin13)
$ echo $(( 20971520000*30971520000 ))
3883808530565693440
$ bash --version
GNU bash, version 4.4.12(1)-release (x86_64-unknown-linux-gnu)
$ echo $(( 20971520000*30971520000 ))
3883808530565693440
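Both failure modes can be seen in one small sketch. The function `scale_size` is hypothetical and is not the fix that went into ReaR. It multiplies first to avoid the truncating division, and refuses to calculate when the product would not fit into a signed 64-bit integer. The resizeable space value in the second call is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not ReaR code): compute
#   partition_size * available_space / resizeable_space
# in bash integers, refusing to calculate when the product would overflow.
scale_size() {
    local partition_size=$1 resizeable_space=$2 available_space=$3
    local max=9223372036854775807
    # The product overflows when partition_size > max / available_space.
    if (( available_space > 0 && partition_size > max / available_space )) ; then
        echo "overflow: $partition_size * $available_space" >&2
        return 1
    fi
    echo $(( partition_size * available_space / resizeable_space ))
}

# Multiplying first gives 6666 instead of 0.
scale_size 2 3 9999
# The 20G * 30G product from above is rejected instead of silently wrapping.
scale_size 20971520000 25000000000 30971520000 || echo "needs a tool for bigger numbers"
```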
jsmeix commented at 2017-04-26 09:40:¶
Summary from what we learned above
and in
https://github.com/rear/rear/issues/1307
When a tool calculates correctly for numbers up to 2^N
one can in practice only use it for input numbers up to 2^(N/2)
when a single multiplication of such numbers should work
because 2^(N/2) * 2^(N/2) = 2^N
This means when bash can calculate up to 2^63 - 1
we can use in practice bash only for input numbers
up to something about 2^31 - 1.
Inputs of 2^32 - 1 do not work in general:
# echo '2^32 - 1' | bc
4294967295
# echo '4294967295 * 4294967295' | bc
18446744065119617025
# echo $(( 4294967295 * 4294967295 ))
-8589934591
Multiplication works in general up to 2^31 - 1
# echo '2^31 - 1' | bc
2147483647
# echo '2147483647 * 2147483647' | bc
4611686014132420609
# echo $(( 2147483647 * 2147483647 ))
4611686014132420609
Some bigger special cases may still work
like (2^31 - 1) * (2^32)
# echo '2^32' | bc
4294967296
# echo '2147483647 * 4294967296' | bc
9223372032559808512
# echo $(( 2147483647 * 4294967296 ))
9223372032559808512
but nothing more
# echo '2^31' | bc
2147483648
# echo '2147483648 * 4294967296' | bc
9223372036854775808
# echo $(( 2147483648 * 4294967296 ))
-9223372036854775808
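The four cases above can also be tested without bc by dividing instead of multiplying. This is a sketch only: `product_fits` is a hypothetical name, and it assumes both operands are positive.

```shell
#!/bin/bash
# Hypothetical helper (not ReaR code): succeed when a * b fits into a signed
# 64-bit integer, for positive a and b, without performing the multiplication.
product_fits() {
    local a=$1 b=$2 max=9223372036854775807
    (( a <= max / b ))
}

for pair in "4294967295 4294967295" "2147483647 2147483647" \
            "2147483647 4294967296" "2147483648 4294967296" ; do
    read -r a b <<< "$pair"
    if product_fits "$a" "$b" ; then
        echo "$a * $b = $(( a * b ))"
    else
        echo "$a * $b overflows"
    fi
done
```

For positive integers, `a <= max / b` holds exactly when `a * b <= max`, so the check never has to compute a value that could wrap around.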
jsmeix commented at 2017-04-26 11:56:¶
With
https://github.com/rear/rear/pull/1332
merged
this issue should (hopefully) be finally solved.
ProBackup-nl commented at 2017-05-10 21:29:¶
@gozora My source OS (Arch Linux) has no bc tool installed (by default).
Despite all your effort for fixing the mawk (Mike Brennan's AWK speedup by using a bytecode compiler) issue, the result is that a broken awk now introduces an additional dependency for ReaR to work properly: bc.
I would love a solution that leaves the old code in place, and extends that for the cases where (m)awk will fail. For instance change 'awk' to explicitly use 'gawk'. And for the users that haven't got gawk installed, fall back to bc for math calculations.
gozora commented at 2017-05-11 06:33:¶
@ProBackup-nl how can we be sure that awk will be installed (by default) on your Arch next time?
V.
ProBackup-nl commented at 2017-05-11 07:41:¶
@gozora We can never be sure. However at the moment the gawk package is required by, for example:
jsmeix commented at 2017-05-11 08:28:¶
Simply put:
One cannot use any kind of 'awk' or any kind of
traditional tool for calculations nowadays.
All those tools have usually certain limitations
that hit us in this or that way when calculating big numbers.
For example using usual floating point arithmetic also leads
to wrong results when calculating big integer numbers.
We need a tool that is known to work for big numbers.
E.g. 2^100 + 1 - 2^100 must result in 1 and never ever 0:
# echo '2^100 + 1 - 2^100' | bc -l
1
# awk 'BEGIN{print 2^52 + 1 - 2^52 }'
1
# awk 'BEGIN{print 2^53 + 1 - 2^53 }' || echo fail
0
I do not even want to know why my 'awk' fails at 2^53
and not at things like 2^31 2^32 2^63 2^64 and why
my 'awk' even does not show its calculation failure
with a non-zero exit code or an error message.
All what matters is that tools like 'awk' cannot be used
because calculations just have to be correct - always.
Accordingly ReaR cannot be used on systems without
a tool that is known to work for big numbers.
FWIW:
I was 'awk' (more precisely 'gawk') package maintainer
some time ago and - guess what - every now and then
on the bug-gawk@gnu.org mailing list issues were reported
by this or that gawk user that "awk calculates wrong"
in this or that particular unexpected way.
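A likely explanation for the 2^53 boundary, assuming the awk in question stores numbers as IEEE 754 double precision floats (the usual case, but not verified for the awk used above): a double has a 53-bit mantissa, so 2^53 + 1 is the first integer it cannot represent. The sketch below contrasts that with bash's 64-bit integer arithmetic. The variable names are chosen for illustration.

```shell
#!/bin/bash
# awk stores numbers as double precision floats, which represent every
# integer exactly only up to 2^53; beyond that, adding 1 can be lost.
awk_result=$( awk 'BEGIN { print 2^53 + 1 - 2^53 }' )

# bash uses 64-bit integers, so the same expression stays exact.
bash_result=$(( 2**53 + 1 - 2**53 ))

echo "awk:  $awk_result"
echo "bash: $bash_result"
```

Rounding is not an error condition in floating point arithmetic, which would also explain why awk reports neither a non-zero exit code nor a message.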
ProBackup-nl commented at 2017-05-11 09:53:¶
A link to a 2013 bug submission by gratien....@gmail.com for the lack of large integer support in mawk / incorrect output of %d format and the quoted response:
This is a known limitation: mawk's format for %d is limited by the format.
The limitation is done to improve performance. You can get more precision using one of the floating formats (and can construct one which prints like a %d, e.g., by putting a ".0" on the end of the format).
Does ReaR need this kind of large integer precision here?
Or can the math in ReaR work with a floating point calculation that is
converted to an integer?
gozora commented at 2017-05-11 10:22:¶
@ProBackup-nl using a different conversion type was discussed already (see
https://github.com/rear/rear/issues/1269#issue-217646048)
and we decided to use bc instead ...
V.
ProBackup-nl commented at 2017-05-11 10:36:¶
In case the decision to use bc is reconsidered, floating point gawk returns the expected output for gratien's example:
# echo '26341277696 26341278720 47819259904' | gawk '{ printf "%.0f", ($1/$2)*$3; }'
47819258045
# gawk -V
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5-p2, GNU MP 6.1.2)
schlomo commented at 2017-05-11 12:51:¶
@ProBackup-nl ReaR will always require some software that might not be installed by default on some Linux distro. Therefore we check for it with REQUIRED_PROGS and also include those packages in the distro packages.
I therefore see absolutely no value in writing code that supplements bc instead of requiring bc to be present.
If you are an Arch Linux user then I would kindly ask you to help ReaR by making sure that the Arch Linux package also includes such required tools.
Specifically to this point about gawk or bc I see that the current PKGBUILD file already includes a hard dependency on gawk which we could easily extend to also include bc:
depends=(lsb-release iproute2 parted cpio openssl gawk)
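The general idea of checking for required programs up front can be sketched as follows. This is not ReaR's actual REQUIRED_PROGS implementation, which is not shown in this thread. The array `required_progs` and the function `missing_progs` are hypothetical names used only for illustration.

```shell
#!/bin/bash
# Hypothetical list for illustration; ReaR's real REQUIRED_PROGS handling
# is not shown in this thread.
required_progs=( bash awk bc )

# Print each program from the arguments that cannot be found in PATH.
missing_progs() {
    local prog
    for prog in "$@" ; do
        command -v "$prog" >/dev/null 2>&1 || echo "$prog"
    done
}

missing=$( missing_progs "${required_progs[@]}" )
if [ -n "$missing" ] ; then
    echo "Missing required programs:" $missing >&2
fi
```

Failing early with a clear message is the alternative to carrying fallback code paths for every tool that might be absent.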
ProBackup-nl commented at 2017-05-11 18:17:¶
@schlomo As long as bc isn't an optional dependency, I don't feel any need.
[Export of Github issue for rear/rear.]