#2279 Issue closed: Do more when COPY_AS_IS misses required shared libraries?

Labels: enhancement, documentation, cleanup, fixed / solved / done

OliverO2 opened issue at 2019-11-19 14:32:

This is to continue a discussion about whether to extend the functionality of COPY_AS_IS with respect to detecting and/or automatically adding missing shared libraries.

This is a starting point of former postings: https://github.com/rear/rear/pull/2278#discussion_r347488398

Answering @jsmeix: https://github.com/rear/rear/pull/2278#discussion_r347813267

To verify my assumption I ask you to do the following test:

I did some tests on a production development system (equipped with SSDs in a RAID 1 configuration). I've used rear mkopalpba as this was where the libraries were missing. I'm reluctant to post complete logs publicly from such a system but I nevertheless tried to capture all the relevant information.

Console output

These common lines appeared at the top of each run and were stripped from output below for better readability:

Cannot include default keyboard mapping (no KEYMAPS_DEFAULT_DIRECTORY specified)
Cannot include keyboard mappings (neither KEYMAPS_DEFAULT_DIRECTORY nor KEYMAPS_DIRECTORIES specified)
Broken symlink '/bin/vim' in recovery system because 'readlink' cannot determine its link target
Symlink '/usr/share/misc/magic' -> '/usr/share/file/magic' refers to a non-existing directory on the recovery system.
It will not be copied by default. You can include '/usr/share/file/magic' via the 'COPY_AS_IS' configuration variable.
Symlink '/var/lib/rear' -> '/var/lib/rear' refers to a non-existing directory on the recovery system.
It will not be copied by default. You can include '/var/lib/rear' via the 'COPY_AS_IS' configuration variable.

Now my experiments with different find configurations in build/default/990_verify_rootfs.sh:

  1. Using find $ROOTFS_DIR -type f -executable -printf '/%P\n':

    rear# time usr/sbin/rear mkopalpba
    
    real    0m19,947s
    user    0m10,081s
    sys     0m3,106s
    
  2. Using find $ROOTFS_DIR -type f -printf '/%P\n':

    rear# time usr/sbin/rear mkopalpba
    There are binaries or libraries in the ReaR recovery system that need additional libraries
    /usr/lib/x86_64-linux-gnu/plymouth/renderers/drm.so requires additional libraries
            libdrm.so.2 => not found
    /usr/lib/x86_64-linux-gnu/plymouth/label.so requires additional libraries
            libpangocairo-1.0.so.0 => not found
            libpango-1.0.so.0 => not found
            libcairo.so.2 => not found
    /usr/lib/x86_64-linux-gnu/plymouth/script.so requires additional libraries
            libply-splash-graphics.so.4 => not found
    ReaR recovery system in '/tmp/rear.Xj27qSSQ3vBsLKx/rootfs' needs additional libraries, check /home/oliver/Repositories/open-source/rear/var/log/rear/rear-foxtrot.log for details
    
    real    0m38,900s
    user    0m25,695s
    sys     0m8,661s
    
  3. Using find $ROOTFS_DIR -type f \( -executable -o -name '*.so' -o -name '*.so.[0-9]*' \) -printf '/%P\n':

    rear# time usr/sbin/rear mkopalpba
    There are binaries or libraries in the ReaR recovery system that need additional libraries
    /usr/lib/x86_64-linux-gnu/plymouth/renderers/drm.so requires additional libraries
            libdrm.so.2 => not found
    /usr/lib/x86_64-linux-gnu/plymouth/label.so requires additional libraries
            libpangocairo-1.0.so.0 => not found
            libpango-1.0.so.0 => not found
            libcairo.so.2 => not found
    /usr/lib/x86_64-linux-gnu/plymouth/script.so requires additional libraries
            libply-splash-graphics.so.4 => not found
    ReaR recovery system in '/tmp/rear.V9PCzDCTxHcYtvk/rootfs' needs additional libraries, check /home/oliver/Repositories/open-source/rear/var/log/rear/rear-foxtrot.log for details
    
    real    0m21,177s
    user    0m10,828s
    sys     0m3,409s
    
  4. Using find $ROOTFS_DIR/bin $ROOTFS_DIR/lib* $ROOTFS_DIR/opt $ROOTFS_DIR/sbin $ROOTFS_DIR/usr/bin $ROOTFS_DIR/usr/lib* $ROOTFS_DIR/usr/local $ROOTFS_DIR/usr/sbin -type f \( -executable -o -name '*.so' -o -name '*.so.[0-9]*' \) -print | sed "s;^$ROOTFS_DIR;;":

    rear# time usr/sbin/rear mkopalpba
    There are binaries or libraries in the ReaR recovery system that need additional libraries
    /usr/lib/x86_64-linux-gnu/plymouth/renderers/drm.so requires additional libraries
            libdrm.so.2 => not found
    /usr/lib/x86_64-linux-gnu/plymouth/label.so requires additional libraries
            libpangocairo-1.0.so.0 => not found
            libpango-1.0.so.0 => not found
            libcairo.so.2 => not found
    /usr/lib/x86_64-linux-gnu/plymouth/script.so requires additional libraries
            libply-splash-graphics.so.4 => not found
    ReaR recovery system in '/tmp/rear.GGJ1oyz8XIuv7U9/rootfs' needs additional libraries, check /home/oliver/Repositories/open-source/rear/var/log/rear/rear-foxtrot.log for details
    
    real    0m20,628s
    user    0m10,856s
    sys     0m3,362s
    

Log excerpts

  • Log excerpt from configuration no. 2: 2.log.txt
  • Log excerpt from configuration no. 4: 4.log.txt

Remarks

  • Checking each and every file (no. 2) takes about 20 additional seconds (+50%) on a seemingly rather fast system.
  • Including just shared libraries into the search required almost no additional time (below measurement precision).
  • Configurations no. 3 and no. 4 limit the check to executables and shared libraries conforming to Linux conventions (cf. Shared Libraries).
  • Configuration no. 4 limits the check to all reasonable directories containing executables according to the Filesystem Hierarchy Standard. (As this uses find with
    multiple starting points, stripping $ROOTFS_DIR from paths could not be done via -printf, so sed was used.)
  • To choose an option, there are some conflicting goals to consider:
    • Fast runtime -> shorter testing cycles for ReaR developers and users -> more contributions, better product quality
    • More complete checking -> catching remaining errors early -> better operational quality
    • Safer checking -> reducing the attack surface -> protecting privacy, avoiding financial losses
  • Keeping in mind that all of the above alternatives would improve the current situation: If we cannot have everything, which goal do we value more?
    We'd have to judge the likelihood and risks involved. Would we go for finding problems in obscure places or would we prefer higher security?
  • In my view, as ReaR is a program that runs regularly with root privileges, I'd favor security over catching errors in obscure situations. Note that chroot doesn't help here as root can easily break out of it (cf. Is chroot a security feature? - Red Hat Customer Portal).

jsmeix commented at 2019-11-19 15:19:

For completeness and easier readability
a copy of the relevant posts about this topic here
in https://github.com/rear/rear/pull/2278

A copy of
https://github.com/rear/rear/pull/2278#discussion_r347445136

Now I am wondering if there could be ever a use case
to have a shared object (library) in the ReaR recovery system
but not also all of its dependant required other shared objects?

I think a shared object (library) can not at all work
without its dependant required other shared objects
so I think

  • either have a shared object with all its required other shared objects
  • or do not have the shared object at all because it is useless

Or could a shared object be useful in any way
without all its required other shared objects?

A copy of
https://github.com/rear/rear/pull/2278#discussion_r347458656

No. It just won't load. So it is a non-functional trap waiting for someone to step into. ;-)

And I'd like to object, as least a bit: While ReaR is not that high on (even developer-oriented) usability, it is not that bad: ReaR has so many little functions included that try to auto-detect and/or auto-correct stuff which would be hard to get right the first time. So it just works in many cases, which makes it get usability points. ReaR also includes considerable effort to spot problems and provide useful information in logs, which makes it get additional points.

So maybe improving COPY_AS_IS to inspect shared libraries would be a good idea. Currently, it lets ldd look at every shell/python/whatever executable script, so why not make it inspect shared libraries? The only problem would be the naming. It's not just .so or .so (as this would include syslog.socket) or lib.so*. You could have:

  • libSegFault.so
  • libacl.so.1.1.0
  • pam_gdm.so

Maybe something like *.so plus .so.[0-9] would do...

A copy of
https://github.com/rear/rear/pull/2278#discussion_r347488398

My reason behinf my question in
https://github.com/rear/rear/pull/2278#discussion_r347445136
is exactly what you propose in your reply
https://github.com/rear/rear/pull/2278#discussion_r347458656

improving COPY_AS_IS to inspect shared libraries would be a good idea

but I would try to go even further and try to
improve COPY_AS_IS to inspect all regular files
because as far as I know ldd won't run amok
for files that it cannot work with
e.g. on my openSUSE Leap 15.0 system

# ldd /etc/fstab 
ldd: warning: you do not have execution permission for `/etc/fstab'
        not a dynamic executable

# ldd /boot/vmlinuz-4.12.14-lp150.12.82-default
ldd: warning: you do not have execution permission for `/boot/vmlinuz-4.12.14-lp150.12.82-default'
        not a dynamic executable

# ldd /dev/sda
ldd: /dev/sda: not regular file

which may result in the end to inspect all regular files
in the recovery system, i.e. to enhance first and foremost
usr/share/rear/build/default/990_verify_rootfs.sh
to check all files if they have all required libraries
actually accessible inside the recovery system.

A copy of
https://github.com/rear/rear/pull/2278#discussion_r347495018

All files would probaby be a bit too much. I've tested a bit and found that this one would get me all shared libraries:

find / -mount \( -name \*.so -o -name \*.so.[0-9]\* \) -print

But some caution from http://man7.org/linux/man-pages/man1/ldd.1.html seems advisable:

    Security

    Be aware that in some circumstances (e.g., where the program speci‐
    fies an ELF interpreter other than ld-linux.so), some versions of ldd
    may attempt to obtain the dependency information by attempting to
    directly execute the program, which may lead to the execution of
    whatever code is defined in the program's ELF interpreter, and per‐
    haps to execution of the program itself. (In glibc versions before
    2.27, the upstream ldd implementation did this for example, although
    most distributions provided a modified version that did not.)

    Thus, you should never employ ldd on an untrusted executable, since
    this may result in the execution of arbitrary code. A safer alterna‐
    tive when dealing with untrusted executables is:

    $ objdump -p /path/to/program | grep NEEDED

    Note, however, that this alternative shows only the direct dependen‐
    cies of the executable, while ldd shows the entire dependency tree of
    the executable.

So if someone configured ReaR to include user-writable directories in COPY_AS_IS, executables lying around there might be executed with root privileges. This could already happen currently, so it would probably be a good idea to limit executable examination to safe paths only.

It is not clear from the above statement whether a shared library would be considered an executable, too (in most respects, it is). So if COPY_AS_IS were to be extended, it would probably be safer to limit automatic examination of shared libraries to safe paths only (such as /lib and /usr/lib) as well.

I still think that extending COPY_AS_IS would be a good idea and might even save you some support headaches later on.

A copy of
https://github.com/rear/rear/pull/2278#discussion_r347813267

I know about the security issues with ldd
which is another reason behind why I think
I should first and foremost enhance 990_verify_rootfs.sh
to inspect all regular files of the recovery system with ldd
because 990_verify_rootfs.sh is unning ldd within the
recovery system build directory via chroot $ROOTFS_DIR ... ldd
so that things happen within a separated chroot environment
and not directly on the original running system.

I think if 990_verify_rootfs.sh would have checked all regular files
with ldd you would have got a notification about the missing
libraries in your particular recovery system.

To verify my assumption I ask you to do the following test:
Change in your build/default/990_verify_rootfs.sh the line

for binary in $( find $ROOTFS_DIR -type f -executable -printf '/%P\n' ) ; do

to

for binary in $( find $ROOTFS_DIR -type f -printf '/%P\n' ) ; do

(i.e. remove the -executable to find all regular files)
and run "rear -D mkrescue" but without your

OPAL_PBA_LIBS+=( /usr/lib/x86_64-linux-gnu/plymouth/*.so )

and attach your rear debug log so that I can have a look
how things behave then in your case.

I tested it on my rather fast computer on plain command line
where testing all regular files in a /tmp/rear.XXXXXX/rootfs
means testing about 16 times more files
(11769 regular files instead of 732 executables)
which needs about 14 times more time
(about one and a half minute instead of 6 seconds):

# ROOTFS_DIR=/tmp/rear.lASNytXC8KICp3f/rootfs

# find $ROOTFS_DIR -type f -executable -printf '/%P\n' | wc -l
732

# find $ROOTFS_DIR -type f -printf '/%P\n' | wc -l
11769

# time for binary in $( find $ROOTFS_DIR -type f -executable -printf '/%P\n' ) ; \
 do chroot $ROOTFS_DIR /bin/bash --login -c "cd $( dirname $binary ) \
 && ldd $binary" </dev/null 2>/dev/null | grep -q 'not found' \
 && broken_binaries="$broken_binaries $binary" ; done

real    0m5.534s
user    0m3.964s
sys     0m3.096s

# time for binary in $( find $ROOTFS_DIR -type f -printf '/%P\n' ) ; \
 do chroot $ROOTFS_DIR /bin/bash --login -c "cd $( dirname $binary ) \
 && ldd $binary" </dev/null 2>/dev/null | grep -q 'not found' \
 && broken_binaries="$broken_binaries $binary" ; done

real    1m17.684s
user    0m55.042s
sys     0m41.441s

I think one or two minutes more time during "rear mkrescue"
do not matter in practice because I assume there is plenty of time
when the admin runs "rear mkrescue" (or even "rear mkbackup")
and it results a more reliably working recovery system
which could save the admin hours of later work
when the recovery system does not work.

The only case I know where time matters for "rear mkrescue"
is the broken by design 'udev' workflow, cf.
https://github.com/rear/rear/issues/840

I also tested "rear mkresue" with a modified 990_verify_rootfs.sh

-for binary in $( find $ROOTFS_DIR -type f -executable -printf '/%P\n' ) ; do
+for binary in $( find $ROOTFS_DIR -type f -printf '/%P\n' ) ; do
...
-    # Redirected stdin for login shell avoids motd welcome message, cf. https://github.com/rear/rear/issues/2120.
-    chroot $ROOTFS_DIR /bin/bash --login -c "cd $( dirname $binary ) && ldd $binary" < /dev/null | grep -q 'not found' && broken_binaries="$broken_binaries $binary"
+    # Redirected stdin for login shell avoids motd welcome message, cf. https://github.com/rear/rear/issues/2120
+    # and redirected stderr avoids tons of ldd warning messages in the log like 'ldd: warning: you do not have execution permission for ...'
+    # cf. https://blog.schlomo.schapiro.org/2015/04/warning-is-waste-of-my-time.html
+    chroot $ROOTFS_DIR /bin/bash --login -c "cd $( dirname $binary ) && ldd $binary" </dev/null 2>/dev/null | grep -q 'not found' && broken_binaries="$broken_binaries $binary"

That modified 990_verify_rootfs.sh needed only about 40 seconds
during "rear mkrescue" in my case.

The reason is that kernel modules are not tested, cf.
https://github.com/rear/rear/blob/master/usr/share/rear/build/default/990_verify_rootfs.sh#L99
and from the 11769 regular files in my recovery system
there are 3935 kernel modules

# find $ROOTFS_DIR -type f | wc -l
11769

# find $ROOTFS_DIR -type f | grep '/lib/modules/' | wc -l
3935

so only about two-thirds of the regular files
are actually tested with ldd.

The whole "rear mkrescue" took one minute and 15 seconds
with testing all regular files that are no kernel modules.

I noticed that I use FIRMWARE_FILES=( 'no' ) (as I always do)
which is not the default.
Without it (i.e. by default) I get about 2000 firmware files
in the recovery system where running ldd also does
not make sense so I also skip ldd for firmware files

-    # so we 'grep' for '/lib/modules/' anywhere in the full path of the binary:
-    grep -q "/lib/modules/" <<<"$binary" && continue
+    # so we 'grep' for '/lib/modules/' anywhere in the full path of the binary.
+    # Also skip the ldd test for firmware files where it also does not make sense:
+    egrep -q '/lib/modules/|/lib.*/firmware/' <<<"$binary" && continue

which saves me about 6 seconds for useless ldd for firmware files.

A copy of
https://github.com/rear/rear/pull/2278#discussion_r347854647

I was thinking about if it could be sufficient
to run ldd only for the regular files in some
"well known" directories in the recovery system.

I did a quick test to see in which directories in the recovery system
ldd found regular files that are dynamic executables
i.e. in which directories in the recovery system there are
files where running ldd makes sense:

# ROOTFS_DIR=/tmp/rear.c3mdOvGKSRIkH8L/rootfs/

# for d in $( for f in $( find $ROOTFS_DIR -type f -printf '/%P\n' ) ; \
 do ldd $f 2>/dev/null | grep -v 'not a dynamic executable' \
 && echo $f is a dynamic executable ; done \
 | grep 'is a dynamic executable' | cut -d ' ' -f1 ) ; \
 do dirname $d ; done | sort -u

/bin
/lib64
/usr/lib/dracut
/usr/lib/dracut/modules.d/99kdump
/usr/lib/perl5/5.26.1/x86_64-linux-thread-multi/CORE
/usr/lib/systemd
/usr/lib/systemd/system-generators
/usr/lib/udev
/usr/lib64
/usr/lib64/rsyslog

Accordingly files where running ldd makes sense
can appear in arbitrary directories in the recovery system
so it is not sufficient to run ldd only for the regular files
in some "well known" directories in the recovery system.

jsmeix commented at 2019-11-19 15:35:

@OliverO2
thank you so much for your valuable input.
It helps a lot and it is much appreciated!

On first glance I like your item 3. most of all

find $ROOTFS_DIR -type f \( -executable -o -name '*.so' -o -name '*.so.[0-9]*' \) -printf '/%P\n':

because it is simple and straightforward
and it is reasonably complete
(libraries without a '.so...' suffix are considered "unreasonable")
and it still runs fast.

By the way:
I don't know if the much simplified directory structure
in the recovery system is still in compliance with the FHS
so by gut feeling I prefer a generic method (as in your item 3.)
over a method that depends on compliance with a standard
(perhaps currently it is in compliance with FHS but
should we depend and rely on that for the future?), cf.
"Better very simple code than oversophisticated (possibly fragile) constructs"
in https://github.com/rear/rear/wiki/Coding-Style

jsmeix commented at 2019-11-19 16:06:

@OliverO2
if you like you could test https://github.com/rear/rear/pull/2280

OliverO2 commented at 2019-11-19 16:10:

@jsmeix
The FHS is pretty broad, so with respect to this issue ReaR is sufficiently compliant. I'd say that find configuration no. 4 would almost never miss a single executable or library. I'd be surprised if any of the supported operating systems or ReaR itself were deviating from the directories covered by no 4.

I would not expect problems own my own if you'd not limit the directory list. But I'd be unhappy if some admin by accident configured something like /home/hugo/testing into COPY_AS_IS, forgets about it later, and then some scary executables make it into that user directory, having code execute with root privileges on the next (automatic) run of rear mkrescue. The common policy for privileged executables is to do only what's absolutely necessary with root privileges and drop privileges as soon as possible.

OliverO2 commented at 2019-11-19 16:11:

@OliverO2
if you like you could test #2280

I'll do.

OliverO2 commented at 2019-11-19 18:53:

Have tested it with mkopalpba, successfully. Made me even find another set of libraries with one additional missing dependency. Pushed another commit to correct this.

jsmeix commented at 2019-11-22 12:39:

With https://github.com/rear/rear/pull/2280 merged
this issue should be fixed for now.

I would like to do a mitigation of the ldd security issue
with things like COPY_AS_IS+=( /home/JohnDoe )
in a separated pull request, cf.
https://github.com/rear/rear/pull/2280#issuecomment-557501226

jsmeix commented at 2019-11-25 16:28:

@OliverO2
I would appreciate it if you could have a look at my subsequent pull request
https://github.com/rear/rear/pull/2286

jsmeix commented at 2019-11-29 13:04:

With https://github.com/rear/rear/pull/2286 merged
the mitigation of the ldd security issue
with things like COPY_AS_IS+=( /home/JohnDoe )
is implemented.

jsmeix commented at 2019-11-29 13:09:

@OliverO2
because since https://github.com/rear/rear/pull/2286
the ldd test is skipped for files with untrused owner
it can now happen that executables with untrused owner
do not work in the recovery system when there are
missing libraries for executables with untrused owner
and "rear mkrescue" would not error out in this case
but the user is at least informed, cf.
https://github.com/rear/rear/pull/2286#issuecomment-559757716

jsmeix commented at 2019-11-29 13:11:

My next step is to use TRUSTED_FILE_OWNERS to check
all files before they are copied into the recovery system
or directly after they were copied into the recovery system
(depending on what works better or is simpler to implement)
which will be done via a separated pull request (as time permits).


[Export of Github issue for rear/rear.]