#2177 Issue closed: rear runs ldd on vdso.so kernel module which segfaults and generates a memory error

Labels: enhancement, cleanup, support / question, fixed / solved / done, external tool

RolfWeilen opened issue at 2019-07-08 10:00:

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • ReaR version ("/usr/sbin/rear -V"):
    Relax-and-Recover 2.5-git.0.a60d619.unknown.changed / 2019-06-07

  • OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"):
    Red Hat Enterprise Linux Server 7.6 (Maipo)

  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):

AUTOEXCLUDE_MULTIPATH=N
OUTPUT=ISO
OUTPUT_URL=null
ISO_DEFAULT=manuel
TIMESYNC=NTP
BACKUP=TSM
TSM_RESULT_SAVE=n
TSM_RESULT_FILE_PATH=
USE_DHCLIENT=y
USE_STATIC_NETWORKING=
ONLY_INCLUDE_VG=(h50ll60vg00)
GRUB_RESCUE=n
WAIT_SECS=31104000
SSH_ROOT_PASSWORD=rear
  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
    KVM / RHV Guest

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86 compatible

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    local disk

  • Description of the issue (ideally so that others can reproduce it):
    This question was already posted: ld-2.17.so killed by SIGSEGV #2056
    I know its not a rear problem. The error in the message.debug appears when rear check the vdso.so with the ldd command. Our SAP System check this messages and ask us for help. The rear itself and restore works without any problem.

rear log:

++ chroot /tmp/rear.LPs6ioTnMixguCJ/rootfs /bin/bash --login -c 'cd /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/vdso && ldd /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/vdso/vdso.so'
ldd: exited with unknown exit code (139)

Error in message.debug

Jul  8 11:47:48 h50ll60 auth.info audispd:node=h50ll60 type=ANOM_ABEND msg=audit(1562579268.632:385404): auid=0 uid=0 gid=0 ses=36704 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=25090 comm="ld-linux-x86-64" reason="memory violation" sig=11

Question: To prevent checking rear this library, can we define a exclude for some libraries?

  • Workaround, if any:

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
    rear-h50ll60.log

jsmeix commented at 2019-07-08 13:40:

@RolfWeilen
normally NON_FATAL_BINARIES_WITH_MISSING_LIBRARY
is meant to exclude things from the ldd test, cf.
https://github.com/rear/rear/blob/master/usr/share/rear/conf/default.conf#L1399

I would think

NON_FATAL_BINARIES_WITH_MISSING_LIBRARY='vdso'

should work to exclude any vdso stuff from being fatal for the ldd test.

Doesn't it?

jsmeix commented at 2019-07-08 14:04:

@RolfWeilen
I do not understand your issue.
From what I see in your rear-h50ll60.log
the ldd segfault was (by accident) ignored
and in the end all went fine as far as I see what you got
as terminal output when you run rear -D mkrescue:

# egrep '+ Print |+ PrintError ' rear-h50ll60.log | cut -d ' ' -f3-
'Using autodetected kernel '\''/boot/vmlinuz-3.10.0-957.12.1.el7.x86_64'\'' as kernel in the recovery system'
'Creating disk layout'
'Excluding Volume Group h50ll60vg01'
'Using guessed bootloader '\''GRUB'\'' (found in first bytes on /dev/vda)'
'Verifying that the entries in /var/lib/rear/layout/disklayout.conf are correct ...'
'Creating root filesystem layout'
'Handling network interface '\''eth0'\'''
'eth0 is a physical device'
'Handled network interface '\''eth0'\'''
'Handling network interface '\''eth1'\'''
'eth1 is a physical device'
'Handled network interface '\''eth1'\'''
'Copying logfile /var/log/rear/rear-h50ll60.log into initramfs as '\''/tmp/rear-h50ll60-partial-2019-07-08T11:47:34+0200.log'\'''
'Copying files and directories'
'Copying binaries and libraries'
'Copying all kernel modules in /lib/modules/3.10.0-957.12.1.el7.x86_64 (MODULES contains '\''all_modules'\'')'
'Copying all files in /lib*/firmware/'
'Skip copying broken symlink '\''/etc/mtab'\'' target '\''/proc/18260/mounts'\'' on /proc/ /sys/ /dev/ or /run/'
'Broken symlink '\''/usr/lib/modules/3.10.0-957.12.1.el7.x86_64/build'\'' in recovery system because '\''readlink'\'' cannot determine its link target'
'Broken symlink '\''/usr/lib/modules/3.10.0-957.12.1.el7.x86_64/source'\'' in recovery system because '\''readlink'\'' cannot determine its link target'
'Testing that the recovery system in /tmp/rear.LPs6ioTnMixguCJ/rootfs contains a usable system'
'Creating recovery/rescue system initramfs/initrd initrd.cgz with gzip default compression'
'Created initrd.cgz with gzip default compression (448972701 bytes) in 43 seconds'
'Making ISO image'
'Wrote ISO image: /var/lib/rear/output/rear-h50ll60.iso (512M)'
'Exiting rear mkrescue (PID 8961) and its descendant processes ...'
'Running exit tasks'
'You should also rm -Rf /tmp/rear.LPs6ioTnMixguCJ'

And in particular what happened while
usr/share/rear/build/default/990_verify_rootfs.sh
was running (excerpt)

+ source /usr/share/rear/build/default/990_verify_rootfs.sh
...
++ for binary in '$( find $ROOTFS_DIR -type f -executable -printf '\''/%P\n'\'' )'
++ grep -q 'not found'
+++ dirname /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/vdso/vdso.so
++ chroot /tmp/rear.LPs6ioTnMixguCJ/rootfs /bin/bash --login -c 'cd /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/vdso && ldd /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/vdso/vdso.so'
ldd: exited with unknown exit code (139)
++ for binary in '$( find $ROOTFS_DIR -type f -executable -printf '\''/%P\n'\'' )'
+++ dirname /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/vdso/vdso32-int80.so
++ chroot /tmp/rear.LPs6ioTnMixguCJ/rootfs /bin/bash --login -c 'cd /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/vdso && ldd /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/vdso/vdso32-int80.so'
++ grep -q 'not found'
++ for binary in '$( find $ROOTFS_DIR -type f -executable -printf '\''/%P\n'\'' )'
...
++ local fatal_missing_library=
++ contains_visible_char ''
...
++ true
+ source_return_code=0
...
+ source /usr/share/rear/build/default/995_md5sums_rootfs.sh

so ldd /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/vdso/vdso.so
segfaults but that does not let ReaR error out but it just proceeds
in this case (by accident because of how that pipe status
is checked by that code).

So where is the actual problem?

jsmeix commented at 2019-07-08 14:09:

@RolfWeilen
I do not know what that message.debug file is.

jsmeix commented at 2019-07-08 14:11:

@RolfWeilen
forget my https://github.com/rear/rear/issues/2177#issuecomment-509238760
now I even see that you wrote

The rear itself and restore works without any problem.

RolfWeilen commented at 2019-07-08 14:12:

Hi
Thanks for your answer. Yes everything is working, as i wrote in the case.
Just the ldd command generates an error to /var/log/messages witch is tracked and controlled by guard system.

Jul  8 15:50:39 h50ll60 auth.info audispd:node=h50ll60 type=ANOM_ABEND msg=audit(1562593839.903:389401): auid=0 uid=0 gid=0 ses=37105 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=8105 comm="ld-linux-x86-64" reason="memory violation" sig=11

jsmeix commented at 2019-07-08 14:17:

Now I think I understand what the actual problem is:

All you like is to not get that error message in that message.debug file
which leads to some false alarm by an automated tool.

For this NON_FATAL_BINARIES_WITH_MISSING_LIBRARY
won't work because it does not skip the segfaulting ldd test
but it only specifies what ldd tests that result a 'not found' thingy
are considered to be not fatal.

So to work around that ldd segfault you need a special
workaround in usr/share/rear/build/default/990_verify_rootfs.sh
for your particular case - i.e. you need to modify that script
as you need it in your particular case, cf.
"Disaster recovery with Relax-and-Recover (ReaR)" in
https://en.opensuse.org/SDB:Disaster_Recovery
that reads (excerpt)

Experienced users and system admins can adapt or extend
the ReaR scripts to make it work for their particular cases
and - if the worst comes to the worst - even temporary
quick and dirty workarounds are relatively easily possible

jsmeix commented at 2019-07-08 14:24:

Perhaps running ldd on kernel modules is in general nonsense
so we may skip that ldd test on kernel modules in general?

jsmeix commented at 2019-07-08 14:29:

@rear/contributors
what do you think about
https://github.com/rear/rear/issues/2177#issuecomment-509247114

RolfWeilen commented at 2019-07-09 05:59:

Hi I have changed the script 990_verify_rootfs.sh as you recomended for our case.
Thanks a lot for this hint.

#for binary in $( find $ROOTFS_DIR -type f -executable -printf '/%P\n' ); do
#SUVA Change
for binary in $( find $ROOTFS_DIR ! -name '*vdso*.so' -type f -executable -printf '/%P\n' ); do

gdha commented at 2019-07-09 06:29:

@rear/contributors
what do you think about
#2177 (comment)
@jsmeix Indeed not a bad idea to exclude kernel modules from the ldd output.

jsmeix commented at 2019-07-09 08:02:

I will exclude kernel modules from the ldd test in
usr/share/rear/build/default/990_verify_rootfs.sh

jsmeix commented at 2019-07-09 08:05:

@RolfWeilen
thank you for your prompt feedback what "quick and dirty workaround"
made things work in your particular case.

jsmeix commented at 2019-07-09 11:50:

@RolfWeilen

FYI:

I was wondering why in your particular case
you have kernel modules in /usr/lib/modules/
instead of the normal directory /lib/modules/.

You may have a look at the output of e.g.

# egrep ' source |vdso|usr/lib/modules' rear-h50ll60.log

of your rear-h50ll60.log file that results in particular (excerpts)

+ source /usr/share/rear/build/default/985_fix_broken_links.sh
/usr/lib/modules/3.10.0-957.12.1.el7.x86_64/weak-updates/kmod-kvdo/uds/uds.ko
/usr/lib/modules/3.10.0-957.12.1.el7.x86_64/weak-updates/kmod-kvdo/vdo/kvdo.ko
/usr/lib/modules/3.10.0-957.12.1.el7.x86_64/build
/usr/lib/modules/3.10.0-957.12.1.el7.x86_64/source
...
mkdir: created directory './usr/lib/modules/3.10.0-948.el7.x86_64'
mkdir: created directory './usr/lib/modules/3.10.0-948.el7.x86_64/extra'
mkdir: created directory './usr/lib/modules/3.10.0-948.el7.x86_64/extra/kmod-kvdo'
mkdir: created directory './usr/lib/modules/3.10.0-948.el7.x86_64/extra/kmod-kvdo/uds'
...
++ PrintError 'Broken symlink '\''/usr/lib/modules/3.10.0-957.12.1.el7.x86_64/build'\'' in recovery system because '\''readlink'\'' cannot determine its link target'
...
++ PrintError 'Broken symlink '\''/usr/lib/modules/3.10.0-957.12.1.el7.x86_64/source'\'' in recovery system because '\''readlink'\'' cannot determine its link target'

I think there could be something wrong on your original system
when you have /usr/lib/modules/3.10.0-957.12.1.el7.x86_64/ there.

RolfWeilen commented at 2019-07-10 08:23:

Hi
On our redhat server's the /lib and /usr/lib seems to be the same. Just a link.
I try to install a redhat from iso image and not from the satelite install server.
Redhat 7.6 and Redhat 8:

# ls -ld /lib
lrwxrwxrwx. 1 root root 7 Aug 30  2018 /lib -> usr/lib

# ls -ld  /lib64
lrwxrwxrwx. 1 root root 9 Aug 30  2018 /lib64 -> usr/lib64

jsmeix commented at 2019-07-10 09:19:

@RolfWeilen
thank you for the info how things are on Red Hat.

A side note:

Let's hope that no Linux distribution uses something new
like lib64/modules or whatever else for kernel modules
(the usual suspect for unexpected fancy deviations is Arch Linux ;-)
because ReaR only looks at /lib/modules/$KERNEL_VERSION
(cf. usr/share/rear/build/GNU/Linux/400_copy_modules.sh and other scripts)
where KERNEL_VERSION="$( uname -r )" by default
(cf. usr/share/rear/conf/default.conf but see also -r in "man rear")
which matches what "man modprobe" reads

modprobe looks in the module directory
/lib/modules/`uname -r` for all the modules

at least on my old SLES11 and also on my
newer SLES15-like openSUSE Leap 15.0.

RolfWeilen commented at 2019-07-10 09:28:

Hi
I installed a redhat server 7.6 from dvd with same result. /lib is a link to /usr/lib.
regards
rolf

jsmeix commented at 2019-07-10 12:41:

With https://github.com/rear/rear/pull/2179 merged
this issue should be avoided.


[Export of Github issue for rear/rear.]