#766 Issue closed: Rear fails to run on modern Fedora kernels

Labels: bug, fixed / solved / done

astafyev opened issue at 2016-02-11 13:08:

40-start-udev-or-load-modules.sh script assumes that kernel modules have .ko extension. Recent Fedora kernels have .ko.xz extensions so no storage modules are not being loaded at all.
Moreover this script loads usb_storage modules before others and usually there is a slight delay between this module load and attempt to mount /dev/disk/by-label/REAR-000 where backup is located. The more storage modules are loaded the more delay is. But in case no or few modules are loaded the delay is small and udev doesn't have enough time to make corresponding symlink and rear fails with error while trying to mount backup USB media.

gdha commented at 2016-02-11 16:09:

@astafyev thank you for the report - we will check it

gdha commented at 2016-02-12 13:28:

The configuration of my test:

# cat /etc/rear/site.conf 
BACKUP=NETFS
OUTPUT=USB
BACKUP_URL="usb:///dev/disk/by-label/REAR-000"
BACKUP_PROG_COMPRESS_OPTIONS="--bzip2"
BACKUP_PROG_COMPRESS_SUFFIX=".bz2"

I did a complete /backup/recover/ test with fedora23 and USB and it worked out of the box for me.
I did found 2 issues, but these are not related to this.
OTOH, you have a point too. Will spend some time to investigate a bit deeper.

jsmeix commented at 2016-02-12 14:39:

Since udev generates device nodes it is a generic issue
that basically before any initial access of a device node
one would have to do something like

udevadm settle --timeout=20

or use the rear function "my_udevsettle".

@astafyev
can you get a rear log file (preferably where rear was called with '-d' and '-D' debugging options) where it fails for you so that it is more clear in what exact line in which rear script the initial access of the
device node happens because there a "udevadm settle" is probably missing and needs to be inserted.

@gdha
perhaps adding a general "udevadm settle" step in all rear
workflows after all kernel modules have been loaded
could be a single and simple generic solution
(instead of adding "udevadm settle" at various places
before each initial access of a device node).

FYI:
The current scripts where "modprobe " is called:

usr/share/rear/layout/prepare/GNU/Linux/15_include_drbd_code.sh
  modprobe drbd
usr/share/rear/layout/prepare/GNU/Linux/17_include_hpraid_code.sh
  modprobe cciss
usr/share/rear/layout/prepare/GNU/Linux/21_load_multipath.sh
  modprobe dm-multipath
usr/share/rear/layout/prepare/default/20_recreate_hpraid.sh
  modprobe cciss
usr/share/rear/lib/network-functions.sh
  modprobe bonding
usr/share/rear/lib/udev-workflow.sh
  modprobe pcspkr
usr/share/rear/prep/default/32_include_uefi_env.sh
  modprobe efivars
usr/share/rear/rescue/GNU/Linux/31_network_devices.sh
  modprobe bonding
  modprobe 8021q
usr/share/rear/skel/default/etc/scripts/system-setup.d/41-load-special-modules.sh
  modprobe xenblk
usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh
  many modprobe calls for many modules

If modprobe calls result that udev creates new device nodes,
a "udevadm settle" should happen either after the modprobe call or at least before the new device node is accessed for the first time.

@astafyev
perhaps it might help for your particular "rear fails with error while trying to mount backup USB media" issue when you add a line

udevadm settle --timeout=20

as very last line in
usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh but I don't think that helps because "on modern Fedora" systemd/udev should do it and then there are already some "my_udevsettle" calls in usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh

astafyev commented at 2016-02-12 18:51:

@gdha
/usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh script has condition with two branches to use udev or manually load modules.
I suppose that in your case you are routed to first branch that's why you do not see any issues.
I see "Loading storage modules..." during restore so in my case the process goes definitely via second branch i.e. load modules manually:
`

if test -s /etc/modules ; then
    while read module options ; do
        case "$module" in
            (\#*|"") ;;
            (*) modprobe -v $module $options;;
        esac
    done </etc/modules
fi

echo "Loading storage modules..."
for module in $(find /lib/modules/$(uname -r)/kernel/drivers/{scsi,block,ide,message,ata} -type f 2>/dev/null) ; do
    case "$(basename $module .ko)" in
        (nbd) echo "Module nbd excluded from being autoloaded.";;
        (*) modprobe -q $(basename $module .ko);;
    esac
done

`

First part loads modules from /etc/modules and in my case it is usb_storage module and process of USB drive initialization starts here. Second part loads storage modules with

modprobe -q $(basename $module .ko)
command. Since all modules have .ko.xz extension nothing is being loaded.

@jsmeix
You see that in "manual modules loading" branch there is no "my_udevsettle" or anything else that would wait for usb_storage initialization.

gdha commented at 2016-02-13 12:52:

@astafyev I've fixed script /usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh and the following was the effect:

Module                  Size  Used by
sata_qstor             16384  0
sata_promise           20480  0
sata_nv                28672  0
sata_mv                32768  0
sata_inic162x          20480  0
pdc_adma               16384  0
pata_via               16384  0
pata_triflex           16384  0
pata_sl82c105          16384  0
pata_sil680            16384  0
pata_serverworks       16384  0
sata_sis               16384  0
pata_sis               20480  1 sata_sis
pata_sch               16384  0
pata_rdc               16384  0
pata_piccolo           16384  0
pata_pdc202xx_old      16384  0
sata_sil24             20480  0
pata_pdc2027x          16384  0
pata_pcmcia            20480  0
pata_optidma           16384  0
pata_opti              16384  0
pata_oldpiix           16384  0
pata_ns87415           16384  0
pata_ns87410           16384  0
pata_ninja32           16384  0
pata_netcell           16384  0
pata_mpiix             16384  0
pata_marvell           16384  0
pata_jmicron           16384  0
pata_it821x            20480  0
pata_it8213            16384  0
pata_hpt3x3            16384  0
pata_hpt3x2n           16384  0
pata_hpt37x            20480  0
pata_hpt366            16384  0
pata_efar              16384  0
pata_cypress           16384  0
pata_cmd64x            16384  0
pata_cmd640            16384  0
pata_atp867x           16384  0
pata_atiixp            16384  0
pata_artop             16384  0
pata_amd               20480  0
pata_ali               16384  0
sata_vsc               16384  0
sata_uli               16384  0
sata_sil               16384  0
sata_via               20480  0
ahci_platform          16384  0
libahci_platform       16384  1 ahci_platform
sata_sx4               20480  0
acard_ahci             16384  0
sata_svw               16384  0
mptspi                 24576  0
mptsas                 61440  0
mptscsih               40960  2 mptsas,mptspi
mptlan                 20480  0
mptbase                98304  4 mptlan,mptsas,mptspi,mptscsih
zram                   28672  0
virtio_blk             20480  0
umem                   20480  0
sx8                    24576  0
skd                    77824  0
rbd                    69632  0
libceph               237568  1 rbd
pktcdvd                36864  0
osdblk                 16384  0
nvme                   57344  0
null_blk               16384  0
mtip32xx               49152  0
loop                   28672  0
floppy                 69632  0
drbd                  339968  0
lru_cache              16384  1 drbd
libcrc32c              16384  2 drbd,libceph
cciss                 114688  0
brd                    16384  0
aoe                    57344  0
DAC960                 81920  0
pmcraid                53248  0
megaraid               40960  0
iscsi_tcp              20480  0
wd719x                 20480  0
eeprom_93cx6           16384  1 wd719x
vmw_pvscsi             24576  0
ufshcd_pci             16384  0
ufshcd                 49152  1 ufshcd_pci
sym53c8xx              81920  0
stex                   24576  0
st                     57344  0
snic                   90112  0
ses                    20480  0
enclosure              16384  1 ses
scsi_transport_srp     20480  0
qla1280                28672  0
qla4xxx               266240  0
tcm_qla2xxx            32768  0
target_core_mod       331776  1 tcm_qla2xxx
qla2xxx               606208  1 tcm_qla2xxx
pm80xx                163840  0
osst                   57344  0
mvumi                  32768  0
mvsas                  53248  0
mpt3sas               192512  0
mpt2sas               188416  0
raid_class             16384  2 mpt2sas,mpt3sas
megaraid_sas          106496  0
megaraid_mbox          36864  0
megaraid_mm            20480  1 megaraid_mbox
lpfc                  638976  0
isci                  135168  0
libsas                 69632  3 isci,mvsas,pm80xx
ipr                   122880  0
initio                 28672  0
hptiop                 24576  0
hpsa                   86016  0
gdth                   90112  0
fnic                  106496  0
dmx3191d               20480  0
scsi_dh_rdac           16384  0
scsi_dh_hp_sw          16384  0
scsi_dh_emc            16384  0
scsi_dh_alua           20480  0
dc395x                 40960  0
cxgb4i                 40960  0
cxgb4                 217088  1 cxgb4i
cxgb3i                 36864  0
cxgb3                 151552  1 cxgb3i
mdio                   16384  1 cxgb3
libcxgbi               61440  2 cxgb3i,cxgb4i
libiscsi_tcp           20480  4 cxgb3i,cxgb4i,iscsi_tcp,libcxgbi
csiostor              131072  0
ch                     20480  0
bnx2i                  53248  0
fcoe                   28672  0
bnx2fc                 86016  0
cnic                   61440  2 bnx2i,bnx2fc
uio                    20480  1 cnic
libfcoe                61440  3 fcoe,fnic,bnx2fc
libfc                 110592  4 fcoe,fnic,libfcoe,bnx2fc
bfa                   552960  0
scsi_transport_fc      57344  9 bfa,csiostor,fcoe,fnic,lpfc,qla2xxx,libfc,tcm_qla2xxx,bnx2fc
be2iscsi              106496  0
iscsi_boot_sysfs       16384  2 qla4xxx,be2iscsi
libiscsi               53248  8 qla4xxx,libiscsi_tcp,bnx2i,cxgb3i,cxgb4i,be2iscsi,iscsi_tcp,libcxgbi
scsi_transport_iscsi    94208  6 qla4xxx,bnx2i,be2iscsi,iscsi_tcp,libcxgbi,libiscsi
atp870u                36864  0
arcmsr                 45056  0
am53c974               16384  0
esp_scsi               32768  1 am53c974
aic7xxx               122880  0
aic79xx               135168  0
advansys               61440  0
aacraid                94208  0
a100u2w                20480  0
BusLogic               90112  0
3w_xxxx                36864  0
3w_sas                 28672  0
3w_9xxx                45056  0
scsi_transport_spi     32768  6 esp_scsi,mptspi,sym53c8xx,aic79xx,aic7xxx,dmx3191d
scsi_transport_sas     40960  7 isci,mvsas,mpt2sas,mpt3sas,libsas,mptsas,pm80xx
osd                    16384  1 osdblk
libosd                 28672  2 osd,osdblk
8139cp                 28672  0
mii                    16384  1 8139cp
uas                    24576  0
snd_hda_codec_generic    73728  1
usb_storage            65536  1 uas
qxl                    69632  1
drm_kms_helper        118784  1 qxl
snd_hda_intel          36864  0
snd_hda_codec         126976  2 snd_hda_codec_generic,snd_hda_intel
ttm                    90112  1 qxl
virtio_balloon         16384  0
virtio_net             28672  0
snd_hwdep              16384  1 snd_hda_codec
virtio_console         28672  0
virtio_scsi            20480  0
snd_hda_core           61440  3 snd_hda_codec_generic,snd_hda_codec,snd_hda_intel
drm                   335872  4 qxl,ttm,drm_kms_helper
ppdev                  20480  0
snd_pcm               114688  3 snd_hda_codec,snd_hda_intel,snd_hda_core
joydev                 20480  0
snd_timer              32768  1 snd_pcm
snd                    77824  6 snd_hwdep,snd_timer,snd_pcm,snd_hda_codec_generic,snd_hda_codec,snd_hda_intel
ata_generic            16384  0
serio_raw              16384  0
virtio_pci             24576  0
pata_acpi              16384  0
soundcore              16384  1 snd
virtio_ring            16384  6 virtio_blk,virtio_net,virtio_pci,virtio_balloon,virtio_console,virtio_scsi
virtio                 16384  6 virtio_blk,virtio_net,virtio_pci,virtio_balloon,virtio_console,virtio_scsi
i2c_piix4              24576  0
parport_pc             28672  0
pvpanic                16384  0
parport                49152  2 ppdev,parport_pc
acpi_cpufreq           20480  0

It loaded all kernel modules. That is not desired. Fedora23 loaded all modules it needed when I used the old script (with .ko).
The problem is most likely a timing issue you encountered. Perhaps a call to my_udevsettle is enough?

astafyev commented at 2016-02-13 18:41:

@gdha
If loading of all modules are not desired we should remove second branch with corresponding code. Otherwise if we want to keep case when udev is not available we should also support ko.xz modules as well.
I found that /sys/kernel/uevent_helper does not exist and that is why if [ $systemd_version -gt 190 ] || [ -s /etc/udev/rules.d/00-rear.rules ] && [ -w /sys/kernel/uevent_helper ]; then condition fails and goes with second route. Is this file existence mandatory?

gdha commented at 2016-02-15 11:03:

udev seems to have:

  • Tue Jul 20 2010 Harald Hoyer harald@redhat.com 160-7 - turn off hotplug in /sys/kernel/uevent_helper or /proc/sys/kernel/hotplug

That explains probably why /sys/kernel/uevent_helper does not exist anymore on Fedora 23. Systemd is taking over all these tasks. And systemd-udev is starting via the systemd services availbale within rear (see /usr/share/rear/skel/default/usr/lib/systemd/system/systemd-udevd.service).

We should improve 40-start-udev-or-load-modules.sh to skip if systemd-udevd.service is present

gdha commented at 2016-02-15 15:50:

@astafyev tomorrow you would be able to test the updated script. Feedback is welcome

gdha commented at 2016-02-22 15:47:

@astafyev any feedback? Would like to complete 1.18

astafyev commented at 2016-02-23 13:10:

Sorry for delay, was busy last days. With fixed script flow goes with udev branch as expected. However I still get error message when mounting mount /dev/disk/by-label/REAR-000. It seems that udev settle doesn't work. man says that settle waits until queue is empty but queue can be empty (as USB drive is found) while helpers that create device links may still be working. I increased sleep delay from 1 to 5 and it started working. It may be interim solution but there may be configurations when 5 seconds is not enough so we need more elegant solution.

jsmeix commented at 2016-03-07 10:08:

@gdha
in general regarding "sleep" in https://github.com/rear/rear/commit/c04427c62e68d427d8e31e1f88abc68c4ff5e9f5 and in general regarding how to wait for things that are created by udev, I submitted right now https://github.com/rear/rear/issues/791 so that (hopefully) I can implement it properly for a future rear version. For now the current workaround via "sleep some more" should be o.k.


[Export of Github issue for rear/rear.]