#1859 Issue closed: Verifying md5sums of the files in the recovery system

Labels: enhancement, fixed / solved / done

jsmeix opened issue at 2018-07-12 14:21:

I think it would be good to verify that the files in the recovery system are intact.

The reason behind is that there are obscure rumours
( actually it was a colleague who told me about it ;-)
that errors during loading/unpacking the initrd/initramfs
are somehow reported (if one knows how to look for them)
but at least some errors do not abort the boot process.

Accordingly it could happen that files in the recovery system are corrupt
but the user may not notice the actual error and get any kind
of inexplicable errors when using the recovery system.

I have the dim feeling that this is the root cause of issues like
cf. https://github.com/rear/rear/issues/1724

In particular my colleague told me that in some cases a too big
initrd/initramfs might get only partially loaded/unpacked
(depending on reasons that I do not understand)
that could lead to corrupted and/or missing files
in the recovery system.

To detect such kind of possible file corruptions in the recovery system
I added on my SLES12 test system (a virtual QEMU/KVM machine)
a new build/default/995_md5sums_rootfs.sh script that does basically

local md5sums_file="md5sums.txt"
pushd $ROOTFS_DIR 1>&2
    cat /dev/null >$md5sums_file
    # Do not provide a md5sums.txt in the recovery system if it was not successfully created here.
    # Exclude the md5sums.txt file itself and all .gitignore files:
    find . -xdev -type f | egrep -v '/md5sums\.txt|/\.gitignore' | xargs md5sum >>$md5sums_file || cat /dev/null >$md5sums_file
popd 1>&2

plus I added a section to skel/default/etc/scripts/system-setup
that does first and foremost during recovery system startup
a verification of the md5sums (before some files get changed
by recovery system startup scripts) basically via

if test -s "/md5sums.txt" ; then
    echo -e "\nVerifying md5sums of the files in the Relax-and-Recover rescue system\n"
    # In case of a FAILED md5sum wait some seconds so that the user can read the md5sum output.
    # /etc/issue is excluded because verifying md5sum for it fails - something changes it:
    grep -v 'etc/issue' md5sums.txt | md5sum --quiet --check && echo -e "md5sums are OK\n" || sleep 5
fi

and this way things seem to work already well for me on SLES12.

jsmeix commented at 2018-07-13 10:15:

How long it takes to create and verify md5sums of the files in the recovery system:

With a big recovery system via

MODULES=( 'all_modules' )
FIRMWARE_FILES=( 'yes' )

my (unpacked) recovery system size is 476M (du -hcs /tmp/rear.XXX/rootfs/ output) and
it contains 8256 regular files (find /tmp/rear.XXX/rootfs/ -xdev -type f | wc -l output)
where for 8159 files the md5sum is stored (wc -l /tmp/rear.XXX/rootfs/md5sums.txt output)
and during recovery system startup verification of the md5sums of that files
needs 1.021 seconds (time output for the above md5sum ... --check
line in skel/default/etc/scripts/system-setup) and
it took 3.623 seconds to create the md5sums (difference of the timestamps in
the rear log file from begin to end of the 995_md5sums_rootfs.sh script).

With a small recovery system via

MODULES=( 'loaded_modules' )
FIRMWARE_FILES=( 'no' )

my (unpacked) recovery system size is 164M (du -hcs /tmp/rear.XXX/rootfs/ output) and
it contains 4943 regular files (find /tmp/rear.XXX/rootfs/ -xdev -type f | wc -l output)
where for 4846 files the md5sum is stored (wc -l /tmp/rear.XXX/rootfs/md5sums.txt output)
and during recovery system startup verification of the md5sums of that files
needs 0.327 seconds (time output for the above md5sum ... --check
line in skel/default/etc/scripts/system-setup) and
it took 0.333 seconds to create the md5sums (difference of the timestamps in
the rear log file from begin to end of the 995_md5sums_rootfs.sh script).

I think this is fast enough so that ReaR can in any case
create and verify md5sums of the files in the recovery system.

@rear/contributors
what do you think?

jsmeix commented at 2018-07-16 09:25:

As reference see
https://github.com/rear/rear/issues/1724#issuecomment-405166295
for a reason why @schabrolles thinks it could really help.

jsmeix commented at 2018-07-17 11:55:

With https://github.com/rear/rear/pull/1864 merged
this issue should be done.


[Export of Github issue for rear/rear.]