#251 Issue closed: ERROR: FindStorageDrivers called but STORAGE_DRIVERS is empty

Labels: enhancement

gdha opened issue at 2013-06-18 13:59:

#-> rear -v mkrescue
Relax-and-Recover 1.14-git201306171045 / 2013-06-17
Using log file: /var/log/rear/rear-witsbebelnx02.log
Creating disk layout
Creating root filesystem layout
ERROR: FindStorageDrivers called but STORAGE_DRIVERS is empty
Aborting due to an error, check /var/log/rear/rear-witsbebelnx02.log for details
Terminated

Going deeper via the logs:

2013-06-18 15:10:17 Including rescue/GNU/Linux/23_storage_and_network_modules.sh
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers/usb': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/kernel/drivers': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/extra': No such file or directory
find: `/lib/modules/3.0.13-0.27-default/weak-updates': No such file or directory
...
2013-06-18 15:10:17 ERROR: FindStorageDrivers called but STORAGE_DRIVERS is empty
=== Stack trace ===
Trace 0: /usr/sbin/rear:249 main
Trace 1: /usr/share/rear/lib/mkrescue-workflow.sh:35 WORKFLOW_mkrescue
Trace 2: /usr/share/rear/lib/framework-functions.sh:79 SourceStage
Trace 3: /usr/share/rear/lib/framework-functions.sh:40 Source
Trace 4: /usr/share/rear/rescue/GNU/Linux/26_storage_drivers.sh:4 source
Trace 5: /usr/share/rear/lib/linux-functions.sh:106 FindStorageDrivers
Trace 6: /usr/share/rear/lib/_input-output-functions.sh:131 StopIfError
Message: FindStorageDrivers called but STORAGE_DRIVERS is empty
===================

OK, why was nothing found?

#-> uname -r
3.0.13-0.27-default
#-> more /boot/grub/menu.lst
# Modified by YaST2. Last modification on Mon Jun 17 13:57:38 CEST 2013
default 0
timeout 8
##YaST - generic_mbr
gfxmenu (hd0,0)/message
##YaST - activate

###Don't change this comment - YaST2 identifier: Original name: linux###
title SUSE Linux Enterprise Desktop 11 SP2 - 3.0.74-0.6.10
    root (hd0,0)
    kernel /vmlinuz-3.0.74-0.6.10-default root=/dev/vg00/lv00 splash=silent  showopts multipath=off crashkernel=256M
    initrd /initrd-3.0.74-0.6.10-default

The system was upgraded, but not yet rebooted. For evidence, run zypper ps on SLES.

dagwieers commented at 2013-06-18 17:17:

The reason for this is that Relax-and-Recover uses the currently running kernel as the basis for its rescue image. As a result, if the running kernel is no longer installed, no drivers are found. So we basically assume that the running kernel is still installed.

We could improve this, however if the logic depends on the bootloader (to find the 'upcoming' kernel to use) it might get very complex very quickly :-/ So the logic should only chime in when it is determined that the current kernel is no longer installed.

A simple workaround in your case would be to execute rear -v -r 3.0.74-0.6.10-default mkrescue.

PS Maybe I should look at how kdump is handling this (might be possible it doesn't handle the case either though).

gdha commented at 2013-06-19 06:50:

I do prefer that users reboot after installing a new kernel (good practice anyway), so there is no need for us to improve rear for this. The possible error is documented now in GitHub :-)

dagwieers commented at 2013-06-19 08:07:

I agree it is better that the kernel used for Relax-and-Recover is the one that is running (or at least one that booted once before). However if it is possible that the current kernel is no longer installed, and Relax-and-Recover bails out because of this, I would prefer the second best option: search for another kernel.

If we object to using a kernel that may never have been booted once, we should at least detect this case and provide a more specific error IMO. Especially if people use "rear mkrescue" from cron, it would be more convenient to indicate that "the currently running kernel is no longer installed on this system".

I would prefer a specific error in prep over adapting the current error-message to hint, because:

  1. an empty STORAGE_DRIVERS could be related to something else (a bug ?)
  2. even if the kernel is not installed, STORAGE_DRIVERS could be non-empty (e.g. out-of-tree driver)

The main question is, what's the best way to determine that the running kernel is not installed ? Should we test for a set of directories that should always be there (all of them):

  • /lib/modules/$(uname -r)/kernel/crypto/
  • /lib/modules/$(uname -r)/kernel/drivers/usb/
  • /lib/modules/$(uname -r)/kernel/lib/
  • /lib/modules/$(uname -r)/modules.dep

Or is there something more appropriate/safe ?
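
The check proposed above could be sketched as a small shell function that requires every marker path to exist. This is only an illustration: the name kernel_tree_complete and its optional root argument are hypothetical and not part of Relax-and-Recover, and modules.dep is looked up directly under /lib/modules/<version>/, which is where depmod writes it.

```shell
#!/bin/bash
# Sketch only: kernel_tree_complete is a hypothetical helper, not a
# Relax-and-Recover function.
# Usage: kernel_tree_complete VERSION [MODULES_ROOT]
# Prints every missing marker path and returns non-zero if any is missing.
kernel_tree_complete() {
    local version="$1"
    local root="${2:-/lib/modules}"
    local base="$root/$version"
    local missing=0
    local path

    # Marker paths taken from the list in this comment; all must exist.
    for path in "$base/kernel/crypto" \
                "$base/kernel/drivers/usb" \
                "$base/kernel/lib" \
                "$base/modules.dep"; do
        if [ ! -e "$path" ]; then
            echo "missing: $path"
            missing=1
        fi
    done
    return "$missing"
}
```

A caller could run kernel_tree_complete "$(uname -r)" early and, on failure, report that the currently running kernel is no longer installed on this system. Whether these four paths are present on every distribution is exactly the open question here.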

BTW I'd like to involve @jhoekx, he's got a good (cross-distro) taste for things like this as well ;-)

jhoekx commented at 2013-06-19 12:00:

:-)

Do we need to check subdirectories? Is /lib/modules/$(uname -r)/kernel not
sufficient?

schlomo commented at 2013-06-19 12:04:

  1. an empty STORAGE_DRIVERS could be related to something else (a bug
    ?)
  2. even if the kernel is not installed, STORAGE_DRIVERS could be
    non-empty (e.g. out-of-tree driver)

A static kernel will also yield an empty STORAGE_DRIVERS.

From a package deployment perspective (everything installed is there
because we want to use it) I could imagine that we should automatically set
-r not to the current kernel but to the greatest *installed* kernel. If that
happens to be the running kernel then nothing changed; if there is a
difference then we use a kernel that at least has all files available and
that would have been used after the next reboot anyway.
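
A minimal sketch of how the "greatest installed kernel" could be picked, assuming GNU sort with version sorting (sort -V) is available. The function name newest_installed_kernel is hypothetical and not part of Relax-and-Recover; directories without a kernel/ subtree are skipped because they may only hold leftovers.

```shell
#!/bin/bash
# Sketch only: newest_installed_kernel is a hypothetical helper.
# Usage: newest_installed_kernel [MODULES_ROOT]
# Prints the highest version below MODULES_ROOT that has a kernel/ subtree.
newest_installed_kernel() {
    local root="${1:-/lib/modules}"
    local dir

    for dir in "$root"/*/; do
        # Skip directories that only hold leftovers (e.g. out-of-tree modules).
        [ -d "${dir}kernel" ] || continue
        basename "$dir"
    done | sort -V | tail -n 1
}
```

The result could then be passed to the -r option used in the workaround earlier in this thread, for example rear -v -r "$(newest_installed_kernel)" mkrescue. The output is empty when no installed kernel is found, so a caller would need to check for that.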

dagwieers commented at 2013-06-19 18:17:

@jhoekx I deliberately do not check for /lib/modules/$(uname -r)/kernel, because if you have any file added (e.g. mvfs.ko or anything out-of-tree, maybe vmware-tools or an hp-driver) the parent directories will still be around and STORAGE_DRIVERS will not be empty; most likely it will then bail out the moment it doesn't find the kernel. But since the name of the kernel is also not standardized...

dagwieers commented at 2013-06-19 18:21:

@schlomo We don't want to automatically pick up the latest installed kernel since it may not properly boot. Sure, the intention is to use it, but we don't want to be confronted with an unbootable DR image if the new kernel trashes the filesystem when it gets booted into the first time. It is safer to use the current kernel, if it exists. As soon as the server is rebooted (successfully) checklayout may notice it and recreate the image.

schlomo commented at 2013-06-19 18:38:

@dagwieers so why not take the latest_installed-1 kernel if the current
kernel is not available? Wouldn't that be better than failing the rear run?

dagwieers commented at 2013-06-19 18:52:

@schlomo Apart from the difficulty of determining what the latest_installed-1 kernel is, if you have had 2 new kernels installed but have not rebooted into either yet, you would still be using a kernel that may not boot as part of the DR. Maybe you got 2 kernels because the previous update had a known defect.

So if the current kernel is not available, my preference would be to stick with whatever kernel/image we had before by failing (likely the last successful run was the current kernel/ramdisk), rather than to deliver something suboptimal. So I would simply add a check and escalate this to the user so cron mails it, or it's obvious from the output on screen.

BTW The reporter's example shows a single new kernel on the system, so there wouldn't even be a latest_installed-1 kernel ;-)

dagwieers commented at 2013-06-19 18:56:

PS Yes, I am flip-flopping ;-) So I agree with Gratien now ! (Let this be proof that even I can change my mind...)

schlomo commented at 2013-06-19 19:30:

Fine with me. Better error messages are always good.

gdha commented at 2013-08-02 08:04:

@rear/owners add a sentence in the release-notes about this (reminder to myself)

gdha commented at 2013-10-02 14:43:

made a note in documentation/release-notes-1-16.md

teissler commented at 2013-11-16 22:58:

I got the error message because I have a kernel compiled without block modules.
I think it should be detected whether the kernel is built with storage drivers as modules or has them built in.

schlomo commented at 2013-11-18 09:53:

Can you suggest some modinfo calls to find out if that is the case? Are
there some modules which are representative of that case?

Also, how could we handle the case where only some drivers are built into
the kernel and others not?
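
One possible approach, sketched under assumptions: instead of modinfo calls, read the modules.builtin file that the kernel build installs under /lib/modules/<version>/ and pick out entries from typical storage directories. The function name builtin_storage_drivers is hypothetical, and the directory list (ata, block, scsi, md) is only a guess at what counts as "storage", not what Relax-and-Recover actually searches.

```shell
#!/bin/bash
# Sketch only: builtin_storage_drivers is a hypothetical helper.
# Usage: builtin_storage_drivers VERSION [MODULES_ROOT]
# Prints the names of built-in drivers found in typical storage directories.
# Returns 1 if modules.builtin cannot be read.
builtin_storage_drivers() {
    local version="$1"
    local root="${2:-/lib/modules}"
    local list="$root/$version/modules.builtin"

    [ -r "$list" ] || return 1
    # Assumption: these directories approximate "storage drivers".
    grep -E '^kernel/drivers/(ata|block|scsi|md)/' "$list" |
        sed -e 's|.*/||' -e 's|\.ko$||'
}
```

A non-empty result would mean that at least some storage drivers are compiled in, so an empty set of loadable storage modules need not be an error. For the mixed case, the built-in names could simply be reported next to the loadable modules that were found.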

teissler commented at 2013-11-18 19:28:

As this issue is closed, I opened a new one: https://github.com/rear/rear/issues/331


[Export of Github issue for rear/rear.]