#838 Issue closed: USB gets suspended

gozora opened issue at 2016-05-15 11:55:

Relax-and-Recover (rear) Issue Template

Please fill in the following items before submitting a new issue:

  • rear version (/usr/sbin/rear -V): Relax-and-Recover 1.18 / Git
  • OS version (cat /etc/rear/os.conf or lsb_release -a): CentOS release 6.7 (Final)
  • rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):
    ONLY_INCLUDE_VG=( "vg00" )
    EXCLUDE_BACKUP=( ${EXCLUDE_BACKUP[@]} fs:/crash fs:/usr/sap fs:/oracle )
    BACKUP_PROG_EXCLUDE=( ${BACKUP_PROG_EXCLUDE[@]} '/mnt/' )
    ISO_MKISOFS_BIN=/usr/bin/ebiso
    BACKUP=NETFS
    OUTPUT=USB
    BACKUP_URL=usb:///dev/disk/by-label/REAR-000
    USB_DEVICE=/dev/disk/by-label/REAR-000
    EXCLUDE_MD=( $(grep -o -E '^md[0-9]+' /proc/mdstat) ) # exclude all md devices
    GRUB_RESCUE=n
    COPY_AS_IS=( ${COPY_AS_IS[@]} /sbin/sysctl /etc/sysctl.conf /sbin/vconfig /sbin/if
    /etc/sysconfig/network /sbin/shutdown.wrap )
  • Brief description of the issue
    USB device gets suspended after plugging in and triggering rear udev
  • Work-around, if any
    set:
    UDEV_SUSPEND=n
    or
    echo on > /sys/devices/<path_to_USB_device>/power/level

Hi guys, during my experimenting with USB I've came across a piece of code, which purpose I don't really understand.
In file /usr/share/rear/lib/udev-workflow.sh we have code which suspends USB device for some reason.
First I thought that if I run rear recover/mkrescue it will resume somehow, but no :-(.
Here is mentioned code from commit 2af02d9

 +    # Suspend USB port
 +    if [[ "$DEVPATH" && "$UDEV_SUSPEND" =~ ^[yY1] ]]; then
 +        path="/sys$DEVPATH"
 +        Log "Trying to syspend USB device at '$path'"
 +        while [[ "$path" != "/sys/devices" && ! -w "$path/power/level" ]]; do
 +            path=$(dirname $path)
 +        done
 +        if [[ -w "$path/power/level" ]]; then
 +            Log "Suspending USB device at '$path'"
 +            echo -n suspend >$path/power/level
 +        fi
 +    fi

The outcome (at lease on my Centos) is that whenever I plug in the USB device and udev triggers rear udev device gets suspended and consecutive backup will fail with obvious message:
Message: Mount command 'mount -o rw,noatime /dev/disk/by-label/REAR-000 /tmp/rear.fntaUA9e1X5YWFc/outputfs' failed. As a workaround I can wake up the device or set variable UDEV_SUSPEND, however I don't think this is how defaults should be set.

Do you have any clue why is suspending implemented like this?
Did I something wrong or is this some kind of regression?

jsmeix commented at 2016-05-17 12:43:

@dagwieers
can you provide background information about
the reasoning behind your "USB suspend support"
in particular the reasoning why you have in default.conf

# Suspend the (USB) device when udev handler has finished ?
UDEV_SUSPEND=y

I am not at all a udev expert but as far as I understand
in etc/udev/rules.d/62-rear-usb.rules

ACTION=="add", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="REAR-000", RUN+="/usr/sbin/rear udev"

the RUN+="/usr/sbin/rear udev" means
according to the following "man udev" excerpt

  RUN{type}
    Add a program to the list of programs to be executed
    after processing all the rules for a specific event
    ...
    This can only be used for very short-running foreground
    tasks. Running an event process for a long period of time
    may block all further events for this or a dependent device.
    Starting daemons or other long-running processes is not
    appropriate for udev; the forked processes, detached or not,
    will be unconditionally killed after the event handling has
    finished.

that the udev workflow might be unconditionally killed
after the event handling has finished.

As far as I understand in usr/share/rear/lib/udev-workflow.sh

    # Run udev workflow
    WORKFLOW_$UDEV_WORKFLOW "${ARGS[@]}"

this runs the WORKFLOW_$UDEV_WORKFLOW function
(by default.conf it is the WORKFLOW_mkrescue function)
as the same process as the udev workflow itself
but that process will be killed by udev after a
short time (as far as I understand "man udev").

From my current non-udev-expert point of view
the whole udev workflow cannot work this way.

gozora commented at 2016-05-17 12:49:

@jsmeix you might not be an udev expert but you are definitely expert in finding things! 👍
That udev kill thing explains a lot, because recently I'v tested EFI USB boot on Debian and rear udev workflow was terminated (after a while) without any further notification (hence it was not suspended at the end) ...

jsmeix commented at 2016-05-17 12:56:

@gozora
to avoid that the rear udev workflow is run whenever
you plug in your "REAR-000" labeled USB stick,
remove the etc/udev/rules.d/62-rear-usb.rules file
(or whatever the name of your rear usb rules file is).

jsmeix commented at 2016-05-17 12:59:

<shameless self-praise mode>
after working 17 years at SUSE where I maily have to deal
with all kind of "strange issue reports" from users and customers
I think I got some experience in finding "smelling" things ;-)
</shameless self-praise mode>

jsmeix commented at 2016-05-17 13:11:

I asked a colleague who is a udev expert
and he explained it to me:

There is a hard 120 seconds timeout in udev
after which any by udev started process is killed.

The reason is that udev proceeds event by event
and as long as there is a process running that was
started by udev for an event, udev cannot proceed
with any subsequent event for that device.

Therefore udev kills any process running that was
started by udev for an event after 120 seconds.

Bottom line is:
udev cannot be used to launch longer running tasks
when a udev event happens.

Accordingly the whole udev workflow in rear
cannot work as it is currently.

jsmeix commented at 2016-05-17 13:21:

@gozora
does it work to avoid that the udev workflow interferes
when you remove the udev rule file that triggers
the rear udev workflow?

I.e. without the automated udev workflow
does it work when you manually do what
you like to do?

If yes, I would like to close this particular issue here
because I submitted the general issue https://github.com/rear/rear/issues/840
that the current udev workflow cannot work at all.

gozora commented at 2016-05-17 13:22:

@gozora
to avoid that the rear udev workflow is run whenever
you plug in your "REAR-000" labeled USB stick,
remove the etc/udev/rules.d/62-rear-usb.rules file
(or whatever the name of your rear usb rules file is).

Yep, that is also an workaround ...

strange that centos runs whole rear udev without interrupting, I'm not so sure but I'd say that it was running longer that 2 minutes ...

gozora commented at 2016-05-17 13:26:

@jsmeix I can try it later today.
I have no problem closing this issue, it was open just to discuss code parts that was not clear to me.

So feel free to close it...

gozora commented at 2016-05-17 13:27:

Oh I can close it by my self! Sweet :-)

jsmeix commented at 2016-05-17 13:53:

Regarding https://github.com/rear/rear/issues/838#issuecomment-219715419

Perhaps the udev kill timeout value is different depending
on the udev version / Linux distribution / whatever ...

But that does not change the generic fault here
that running a rear workflow with a time bomb
is not "the right thing" from my point of view.

By the way:
As far as I see the rear udev rules file is not included
in rear RPM packages (because as far as I see
it does not get installed by rear's "make install")
so that this problematic udev workflow should not
affect users who install rear by a RPM package.

@gozora
I guess you installed rear by copying the GitHub
sources directly so that you got the rear udev rules file.

gozora commented at 2016-05-17 13:56:

@gozora
I guess you installed rear by copying the GitHub
sources directly so that you got the rear udev rules file.

Yes exactly, I'm just doing copy from rear directory. I thought that udev rule is part of installation. If not than it should not affect any user ....


[Export of Github issue for rear/rear.]