#2407 Issue closed: build/default/995_md5sums_rootfs.sh fails on filenames with spaces

Labels: enhancement, fixed / solved / done, minor bug

z1atk0 opened issue at 2020-05-25 13:15:

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • ReaR version ("/usr/sbin/rear -V"):
    Relax-and-Recover 2.5 / Git (Included from SEP SESAM)

  • OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"):
    OS_VENDOR=RedHatEnterpriseServer
    OS_VERSION=7

  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
    [root@rz-backup:~]# cat /var/opt/sesam/var/lib/rear/etc/rear/local.conf
    BACKUP=SESAM
    OUTPUT=ISO
    SSH_ROOT_PASSWORD=xxx

  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
    PC

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86 compatible

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    BIOS/GRUB

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    local disk

  • Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT" or "lsblk" as makeshift):

NAME                          KNAME     PKNAME    TRAN TYPE  FSTYPE              SIZE MOUNTPOINT
/dev/sda                      /dev/sda            sas  disk                     68.4G 
|-/dev/sda1                   /dev/sda1 /dev/sda       part  linux_raid_member   512M 
| `-/dev/md0                  /dev/md0  /dev/sda1      raid1 ext3              511.4M /boot
`-/dev/sda2                   /dev/sda2 /dev/sda       part  linux_raid_member  67.9G 
  `-/dev/md1                  /dev/md1  /dev/sda2      raid1 LVM2_member        67.8G 
    |-/dev/mapper/system-root /dev/dm-0 /dev/md1       lvm   ext4                 20G /
    |-/dev/mapper/system-swap /dev/dm-1 /dev/md1       lvm   swap                  4G [SWAP]
    `-/dev/mapper/system-var  /dev/dm-2 /dev/md1       lvm   ext4                 20G /var
/dev/sdb                      /dev/sdb            sas  disk                     68.4G 
|-/dev/sdb1                   /dev/sdb1 /dev/sdb       part  linux_raid_member   512M 
| `-/dev/md0                  /dev/md0  /dev/sdb1      raid1 ext3              511.4M /boot
`-/dev/sdb2                   /dev/sdb2 /dev/sdb       part  linux_raid_member  67.9G 
  `-/dev/md1                  /dev/md1  /dev/sdb2      raid1 LVM2_member        67.8G 
    |-/dev/mapper/system-root /dev/dm-0 /dev/md1       lvm   ext4                 20G /
    |-/dev/mapper/system-swap /dev/dm-1 /dev/md1       lvm   swap                  4G [SWAP]
    `-/dev/mapper/system-var  /dev/dm-2 /dev/md1       lvm   ext4                 20G /var
/dev/sr0                      /dev/sr0            ata  rom                      1024M
  • Description of the issue (ideally so that others can reproduce it):
    Generating the MD5 sums fails on files which have spaces in their names:
[...]
2020-05-25 14:26:34.554068350 Including build/default/995_md5sums_rootfs.sh
2020-05-25 14:26:34.576479009 Creating md5sums for regular files in /tmp/rear.VYppB6yfu20BDTk/rootfs
/tmp/rear.VYppB6yfu20BDTk/rootfs /var/opt/sesam/var/work
md5sum: ./usr/lib/firmware/brcm/brcmfmac43455-sdio.MINIX-NEO: No such file or directory
md5sum: Z83-4.txt: No such file or directory
md5sum: ./usr/lib/firmware/brcm/brcmfmac43430a0-sdio.ONDA-V80: No such file or directory
md5sum: PLUS.txt: No such file or directory
/var/opt/sesam/var/work
[...]
[root@rz-backup:~]# ls -la /usr/lib/firmware/brcm/*\ *
-rw-r--r--. 1 root root  989 Apr  1 05:25 /usr/lib/firmware/brcm/brcmfmac43430a0-sdio.ONDA-V80 PLUS.txt
-rw-r--r--. 1 root root 2510 Apr  1 05:25 /usr/lib/firmware/brcm/brcmfmac43455-sdio.MINIX-NEO Z83-4.txt

Granted, in this specific case it's only unneeded text files ... but still. ;-)

  • Workaround, if any:
    The fix is trivial: add -print0 to the find options, and add -0 to the xargs options.

jsmeix commented at 2020-05-26 06:55:

Ugh!
I was waiting for the first example where $IFS characters appear
in basic system file names - so here we are :-(

firmware/brcm/brcmfmac43430a0-sdio.ONDA-V80 PLUS.txt and
firmware/brcm/brcmfmac43455-sdio.MINIX-NEO Z83-4.txt
belong to the Linux kernel firmware files so recent kernel files
have $IFS characters in their names.

@z1atk0
thank you for your issue report!

FYI:
The root cause behind such issues is described in
https://github.com/rear/rear/issues/1372

jsmeix commented at 2020-05-26 08:29:

The actual fix is not so trivial

-    find . -xdev -type f | egrep -v '/md5sums\.txt|/\.gitignore|~$|/dev/' | xargs md5sum >>$md5sums_file || cat /dev/null >$md5sums_file
+    find . -xdev -type f -print0 | egrep -z -v '/md5sums\.txt|/\.gitignore|~$|/dev/' | xargs -0 md5sum -b >>$md5sums_file || cat /dev/null >$md5sums_file

because each command in the pipe must be explicitly changed
to also work with files that have $IFS characters in their names.

This shows that the default behaviour of usual commands
is to not expect files with $IFS characters in their names
which proves that nowadays $IFS characters in file names
are still not a generally expected or a "just supported" case.

This fix makes things work at least for me on SLES12-SP5.

Currently I don't know if on SLES11 and older RHEL versions
grep supports -z and md5sum supports -b
so the new code might fail there but intentionally that code
behaves fail-safe via || cat /dev/null >$md5sums_file
so if the new code fails on older systems it means the whole
md5sum verification is skipped on systems where it cannot work
just as it happens now (i.e. without the above fix) when there are
files in the recovery system that have $IFS characters in their names.

jsmeix commented at 2020-05-26 08:35:

@z1atk0
could you please verify that with my fix in the above
https://github.com/rear/rear/issues/2407#issuecomment-633885689
the whole md5sum verification also works for you on RHEL 7
in particular that the actual md5sum verification during startup
of your particular ReaR recovery system really works for you,
cf. the code about "Verifying md5sums" in
usr/share/rear/skel/default/etc/scripts/system-setup
https://github.com/rear/rear/blob/master/usr/share/rear/skel/default/etc/scripts/system-setup#L79

z1atk0 commented at 2020-05-26 10:16:

The actual fix is not so trivial
[...]
because each command in the pipe must be explicitly changed
to also work with files that have $IFS characters in their names.
[...]

My bad, sorry, I totally missed the egrep in the middle. :man_shrugging: 😇

If you're worried about -z support on certain systems, then what about eliminating the egrep altogether and let find do the work instead, like:

find . -xdev -type f -regextype posix-egrep ! -regex '/md5sums\.txt|/\.gitignore|~$|/dev/' -print0 | xargs -0 [...]

Or, if you're worried that find might not be new enough to support -regextype, then either modify the regex to be of type "emacs" (whatever that may look like, I have no idea TBH), or use a slightly different shell construct like:

find . -xdev -type f -print0 | xargs -0 -n1 | egrep -v '/md5sums\.txt|/\.gitignore|~$|/dev/' | while read single_file_name; do md5sum "$single_file_name" >>$md5sums_file; done || cat /dev/null >$md5sums_file

Here, the xargs invocation emits one filename per line, so for both egrep and md5sum it's just business as usual, as you can conveniently read and quote the whole filename that way. The downside is that you now have one md5sum invocation per file in a bigger loop.

In any case, I don't think it's necessary to use md5sum -b just because of the filename, as this option only seems to concern the handling of a file's contents, but not its name. This only seems to be relevant on systems that actually distinguish between text & binary files on disk. I found a few interesting links on that specific matter:

As for actually testing your fix, that might be a bit of a problem, especially the actual recovery part of it. I mentioned in my initial report that we're using REAR as bundled by SEP SESAM, and not as a standalone solution. So I can add your fix to the script in question, do an extra BSR backup run, and then check the log file, but ATM I'd be a bit hard-pressed (both time and (virtual) hardware wise) to find resources to do a real from-scratch recovery.

jsmeix commented at 2020-05-26 12:32:

@z1atk0
thank you for your suggestions - I will have a look...

I thought I need md5sum -b because without the -b
I got an error message from md5sum that went away with -b
but right now I can no longer reproduce how it had failed
(I don't remember exactly what I did that caused this failure).

In general regarding testing things in the ReaR recovery system:

Set in your etc/rear/local.conf KEEP_BUILD_DIR="yes"
and see the description about KEEP_BUILD_DIR
in your particular usr/share/rear/conf/default.conf

In our current GitHub master code that is
https://github.com/rear/rear/blob/master/usr/share/rear/conf/default.conf#L142

For example on my openSUSE Leap 15.1 system:

# chroot /tmp/rear.3QcTcmdxtyN1vjV/rootfs/

bash-4.4# md5sum -c --quiet md5sums.txt

bash-4.4# echo $?
0

bash-4.4# exit
exit

z1atk0 commented at 2020-05-26 13:01:

I thought I need md5sum -b because without the -b
I got an error message from md5sum that went away with -b
but right now I can no longer reproduce how it had failed
(I don't remember exactly what I did that caused this failure).

Well, so if you have other reasons than the stream of NUL-terminated filenames, then that's a different story of course. 😉 Coming to think of it, I guess you can probably just leave it in without any ill effects, because on systems without text/binary discrimination it won't make any difference at all (except for having "<SPACE>*" as a separator between MD5 sum and filename instead of "<SPACE><SPACE>"), and on systems where it does matter it will automagically do the right thing. 👍

In general regarding testing things in the ReaR recovery system:
[...]

Okay, so this sounds easy & quick enough - I thought I'd have to go through a full-blown restore to test it, which would have been quite cumbersome. I'll give it a shot and let you know if/how it worked!

z1atk0 commented at 2020-05-26 13:52:

Okay, building looks good (note the relaxing & reassuring absence of error messages 😁) ...

[...]
2020-05-26 15:24:38.958967413 Including build/default/995_md5sums_rootfs.sh
2020-05-26 15:24:38.962614206 Creating md5sums for regular files in /tmp/rear.059UeBAVgkdNabg/rootfs
/tmp/rear.059UeBAVgkdNabg/rootfs /var/opt/sesam/var/work
/var/opt/sesam/var/work
2020-05-26 15:25:05.030945133 Finished running 'build' stage in 147 seconds
[...]

... as does verification (I guess the mismatch in ./root/.bash_history is to be expected if I continue roaming & hacking away on the system while SESAM & REAR are doing their things in the background 😉):

[root@rz-backup:~]# chroot /tmp/rear.059UeBAVgkdNabg/rootfs/
[root@rz-backup:/]# md5sum -c --quiet md5sums.txt; echo $?
./root/.bash_history: FAILED
md5sum: WARNING: 1 computed checksum did NOT match
1
[root@rz-backup:/]# grep "[[:alnum:]] [[:alnum:]]" md5sums.txt 
33ad683f22dc4bd4b398629821d7a3de *./usr/lib/firmware/brcm/brcmfmac43455-sdio.MINIX-NEO Z83-4.txt
74e00528d84534ba68fe92c512d11de5 *./usr/lib/firmware/brcm/brcmfmac43430a0-sdio.ONDA-V80 PLUS.txt

So, from my side this bug appears to be fixed - thanks a lot for the quick response! 😎 👍

jsmeix commented at 2020-05-28 08:26:

I tested the exact fix from
https://github.com/rear/rear/issues/2407#issuecomment-633885689
in SLES11-SP4 where it "just works" so I committed that fix "as is" via
https://github.com/rear/rear/commit/d4001883ac6b4f5e08a4801f3d4ede0e3e86747b

@z1atk0
thank you for your check that it also works on RHEL 7.


[Export of Github issue for rear/rear.]