#2623 Issue closed: Suppressing stderr by default decreases usability

Labels: discuss / RFC, no-issue-activity

OliverO2 opened issue at 2021-06-02 21:55:

I am aware of #2416. It just does not work for me.

Use Case

I invoke rear mkrescue without options. An unexpected problem occurs. I cannot see what went wrong precisely, because relevant error messages are missing in the log file.

Edit: As @pcahyna has correctly pointed out, this does not apply to rear recover which runs in verbose mode by default.

Suggestion

Please do not suppress stderr globally. The extra output in the log file seems way more acceptable than seeing a clean log file when actually something bad had happened.

The only workaround right now would be to always invoke rear with verbose or debug options. This seems like overkill as in most case just the normal stderr would give a reasonable indication what the problem might have been. Also, increased verbosity makes ReaR harder to use as more lines appear on the terminal, burying possible important messages.

When there are spots which routinely pollute the log file with expected output looking like errors, such spots could be cleaned up individually.

pcahyna commented at 2021-06-03 08:33:

I invoke rear recover without options. (...) The only workaround right now would be to always invoke rear with verbose or debug options.

Note that rear recover is verbose by default.

OliverO2 commented at 2021-06-03 09:58:

@pcahyna You are right. This does not apply to rear recover. I have corrected the issue description accordingly.

Seems like I was trained by repeated invocations of rear mkrescue with missing log information that I did not even look into the log file of rear recover as I was not expecting it to contain anything useful.

OliverO2 commented at 2021-06-03 20:56:

A practical example. This example is a bit primitive, because in this case ReaR directly uses an Error statement. When we are not that lucky, the error case would not be checked by ReaR at all, but would surface sometime later when a follow-up error occurs.

  1. With stderr completely suppressed, as is now the case, it looks like this:

    $ sudo usr/sbin/rear mkrescue
    ERROR: Mount command 'sshfs maintenance@backup-hub.example.com:/nonexistent /tmp/rear.jIL8Nfe59j0uBjo/outputfs -o rw,noatime' failed.
    Some latest log messages since the last called script 100_mount_output_path.sh:
    2021-06-03 22:28:12.395891117 Including output/default/100_mount_output_path.sh
    2021-06-03 22:28:12.406362038 Mounting with 'sshfs maintenance@backup-hub.example.com:/nonexistent /tmp/rear.jIL8Nfe59j0uBjo/outputfs -o rw,noatime'
    Aborting due to an error, check /home/oliver/Repositories/open-source/rear/var/log/rear/rear-juliet.log for details
    Terminated
    
    • So a Mount command ... failed. Why that? User not existing? Host name misspelled? Permission denied? Something else?

    • The last action logged was Mounting with..., then afterwards (we do not know whether immediately or several statements later) "an error" occurred.

    • ReaR tells me to check the log file for details, but there is no further information than what is shown above, followed by Exiting rear mkrescue....

  2. With stderr going to the log file, it looks like this:

    $ sudo usr/sbin/rear mkrescue
    ERROR: Mount command 'sshfs maintenance@backup-hub.example.com:/nonexistent /tmp/rear.IeoteOoHosexvDi/outputfs -o rw,noatime' failed.
    Some latest log messages since the last called script 100_mount_output_path.sh:
    2021-06-03 22:26:07.247867501 Including output/default/100_mount_output_path.sh
    2021-06-03 22:26:07.258859399 Mounting with 'sshfs maintenance@backup-hub.example.com:/nonexistent /tmp/rear.IeoteOoHosexvDi/outputfs -o rw,noatime'
    maintenance@backup-hub.example.com:/nonexistent: No such file or directory
    Aborting due to an error, check /home/oliver/Repositories/open-source/rear/var/log/rear/rear-juliet.log for details
    Terminated
    
    • This immediately shows the root cause of the problem:

      maintenance@backup-hub.example.com:/nonexistent: No such file or directory
      
    • The log file contains the above information (including the root cause), plus a stack trace, showing the line numbers of all statements which were executed at the time, followed by Exiting rear mkrescue....

jsmeix commented at 2021-06-07 13:45:

Only a quick initial note:

I implemented
https://github.com/rear/rear/pull/2498
intentionally as I did it because of
https://github.com/rear/rear/issues/2473
therein in particular because of how users experienced it as in
https://github.com/rear/rear/issues/2473#issuecomment-670483657

I do not want to spend any more minute of my precious time with
possible further issue reports where users who call rear or rear -v
misunderstand some message from some program as an error of ReaR.

Since https://github.com/rear/rear/pull/2498
we (i.e. the ReaR developers) control what users see up to rear -v.

Since https://github.com/rear/rear/pull/2498
the user must explicitly request to also get "low level 'debug' messages"
and then he is hopefully experienced enough to not misunderstand
arbitrary messages from programs (where we have no control what
a particular program may show under what specific curcumstances)
as an error of ReaR.

Personally I always use rear -D because I need that during development.

I think users who are interested what goes on "behind ReaR's surface"
must use at least rear -d.

@OliverO2
is there a reason why you can't use rear -d?

Should it perhaps be better documented that
since https://github.com/rear/rear/pull/2498
what goes on "behind ReaR's surface" is intentionally hidden
to make the content in the log less confusing for users
unless 'debug' messaging was explicitly requested?

OliverO2 commented at 2021-06-07 15:14:

  1. I understand that you don't want to answer repeated enquiries about stderr messages which are just part of ReaR's normal operation. The underlying problem should definitely be fixed.

  2. Globally suppressing errors is considered an unsafe development practice. This is based on a long history of failures. If we'd go down that route, we will experience missing substantial error messages over and over again, on the console and in log files. It is impossible to foresee all conditions which might result in an important error. I/O errors and out-of-memory errors, for example, can occur at any time.

  3. I generally invoke rear mkrescue without any options as I need to focus on the task. What's the task?

    • The task is getting a reliable rescue medium.
    • The task is not to learn about ReaR's internal operations.
    • The task is not to debug ReaR.

    So on the console and in the log file, I want substantial information only, not more, not less. Each additional (unimportant) line of output distracts me, wastes my time and risks that I overlook something important.

    Now look what happens:

    • rear mkrescue already outputs three lines, none of which is important for me, but I can live with that: Ignoring non-existing btrfs subvolume listed as mounted: /var/lib/docker/btrfs Symlink '/lib/modules/5.8.0-55-generic/build' -> '/usr/src/linux-headers-5.8.0-55-generic' refers to a non-existing directory on the recovery system. It will not be copied by default. You can include '/usr/src/linux-headers-5.8.0-55-generic' via the 'COPY_AS_IS' configuration variable.

    • rear -d mkrescue outputs 112 lines, all about ReaR's internal considerations, none of which is important for me. This puts an unnecessary cognitive load on me ("What if I miss some important information in between?") and distracts me from my task. As a user, I feel disrespected when I have to bear this: Relax-and-Recover 2.6 / Git Running rear mkrescue (PID 36709 date 2021-06-07 16:49:44) Using log file: /home/oliver/Repositories/open-source/rear/var/log/rear/rear-juliet.log Running workflow mkrescue on the normal/original system Found EFI system partition /dev/sda1 on /boot/efi type vfat Using UEFI Boot Loader for Linux (USING_UEFI_BOOTLOADER=1) Using local PBA image file '/home/oliver/Repositories/open-source/rear/var/lib/rear/TCG-Opal-PBA/juliet/TCG-Opal-PBA-juliet.raw' Using autodetected kernel '/boot/vmlinuz-5.8.0-55-generic' as kernel in the recovery system Creating disk layout Overwriting existing disk layout file /home/oliver/Repositories/open-source/rear/var/lib/rear/layout/disklayout.conf Docker is running, skipping filesystems mounted below Docker Root Dir /var/lib/docker Ignoring non-existing btrfs subvolume listed as mounted: /var/lib/docker/btrfs Disabling excluded components in /home/oliver/Repositories/open-source/rear/var/lib/rear/layout/disklayout.conf Using guessed bootloader 'EFI' (found in first bytes on /dev/sda) Verifying that the entries in /home/oliver/Repositories/open-source/rear/var/lib/rear/layout/disklayout.conf are correct Created disk layout (check the results in /home/oliver/Repositories/open-source/rear/var/lib/rear/layout/disklayout.conf) Creating recovery system root filesystem skeleton layout Skipping 'br-14542ceec047': not bound to any physical interface. Skipping 'br-fb9c38f9ba2f': not bound to any physical interface. Handling network interface 'bridge0' bridge0 is a bridge bridge0 has lower interface eno1 eno1 is a physical device Handled network interface 'bridge0' Skipping 'docker0': not bound to any physical interface. Skipping 'virbr0': not bound to any physical interface. Included current keyboard mapping (via 'dumpkeys -f') No default US keyboard mapping included (no KEYMAPS_DEFAULT_DIRECTORY specified) No support for different keyboard layouts (neither KEYMAPS_DEFAULT_DIRECTORY nor KEYMAPS_DIRECTORIES specified) Trying to find what to use as UEFI bootloader... Trying to find a 'well known file' to be used as UEFI bootloader... Using '/boot/efi/EFI/ubuntu/grubx64.efi' as UEFI bootloader file Copying logfile /home/oliver/Repositories/open-source/rear/var/log/rear/rear-juliet.log into initramfs as '/tmp/rear-juliet-partial-2021-06-07T16:49:57+02:00.log' Copying files and directories Copying binaries and libraries Copying all kernel modules in /lib/modules/5.8.0-55-generic (MODULES contains 'all_modules') Copying all files in /lib*/firmware/ Skip copying broken symlink '/etc/mtab' target '/proc/45618/mounts' on /proc/ /sys/ /dev/ or /run/ Skip copying broken symlink '/etc/resolv.conf' target '/run/systemd/resolve/stub-resolv.conf' on /proc/ /sys/ /dev/ or /run/ Symlink '/lib/modules/5.8.0-55-generic/build' -> '/usr/src/linux-headers-5.8.0-55-generic' refers to a non-existing directory on the recovery system. It will not be copied by default. You can include '/usr/src/linux-headers-5.8.0-55-generic' via the 'COPY_AS_IS' configuration variable. Testing that the recovery system in /tmp/rear.jTWZ3BAx5GJA0kO/rootfs contains a usable system Testing each binary with 'ldd' and look for 'not found' libraries within the recovery system Testing that the existing programs in the PROGS array can be found as executable command within the recovery system Testing that each program in the REQUIRED_PROGS array can be found as executable command within the recovery system Creating recovery/rescue system initramfs/initrd initrd.cgz with gzip default compression Created initrd.cgz with gzip default compression (433981547 bytes) in 41 seconds Using syslinux to create an EFI bootloader Creating 455 MiB raw disk image "rear-juliet.raw" Using syslinux to install a Legacy BIOS bootloader Installing the EFI rescue system to partitions labeled 'Rescue System' Could not detect suitable partitions labeled 'Rescue System' Copying resulting files to sshfs location Saving /home/oliver/Repositories/open-source/rear/var/log/rear/rear-juliet.log as rear-juliet.log to sshfs location Copying result files '/tmp/rear.jTWZ3BAx5GJA0kO/tmp/rear-juliet.raw.gz /tmp/rear.jTWZ3BAx5GJA0kO/tmp/VERSION /tmp/rear.jTWZ3BAx5GJA0kO/tmp/README /tmp/rear.jTWZ3BAx5GJA0kO/tmp/rear-juliet.log' to /tmp/rear.jTWZ3BAx5GJA0kO/outputfs/juliet at sshfs location Exiting rear mkrescue (PID 36709) and its descendant processes ... Running exit tasks

  4. In this case, the correct fix to eliminate misleading messages from the log file is to treat them specifically. It could be as easy as prefixing affected script lines with expect_stderr:

    #!/bin/bash
    
    function Log() {
        echo "LOG: $*" > /dev/tty
    }
    
    function expect_stderr() {
        local stderr_output exit_status
        if [[ "$VERBOSE" ]]; then
            { stderr_output=$("$@" 2>&1 1>&9-) ;} 9>&1
            exit_status=$?
            [[ -n "$stderr_output" ]] && Log "[THIS MESSAGE IS EXPECTED] $stderr_output"
        else
            "$@" 2>/dev/null
            exit_status=$?
        fi
        return $exit_status
    }
    
    function has_binary() {
        for bin in "$@"; do
            expect_stderr type $bin 1>/dev/null && return 0
        done
        return 1
    }
    
    echo "***VERBOSE unset"
    for x in /bin/bash /bin/basxh; do
        has_binary $x && echo has $x || echo no $x
    done
    
    echo ""
    echo "*** VERBOSE=1"
    VERBOSE=1
    for x in /bin/bash /bin/basxh; do
        has_binary $x && echo has $x || echo no $x
    done
    

EDIT: Script (4.) corrected.

github-actions commented at 2021-08-08 02:07:

Stale issue message


[Export of Github issue for rear/rear.]