#766 Issue closed
: Rear fails to run on modern Fedora kernels¶
Labels: bug
, fixed / solved / done
astafyev opened issue at 2016-02-11 13:08:¶
40-start-udev-or-load-modules.sh script assumes that kernel modules have
.ko extension. Recent Fedora kernels have .ko.xz extensions so no
storage modules are not being loaded at all.
Moreover this script loads usb_storage modules before others and
usually there is a slight delay between this module load and attempt to
mount /dev/disk/by-label/REAR-000 where backup is located. The more
storage modules are loaded the more delay is. But in case no or few
modules are loaded the delay is small and udev doesn't have enough time
to make corresponding symlink and rear fails with error while trying to
mount backup USB media.
gdha commented at 2016-02-11 16:09:¶
@astafyev thank you for the report - we will check it
gdha commented at 2016-02-12 13:28:¶
The configuration of my test:
# cat /etc/rear/site.conf
BACKUP=NETFS
OUTPUT=USB
BACKUP_URL="usb:///dev/disk/by-label/REAR-000"
BACKUP_PROG_COMPRESS_OPTIONS="--bzip2"
BACKUP_PROG_COMPRESS_SUFFIX=".bz2"
I did a complete /backup/recover/ test with fedora23 and USB and it
worked out of the box for me.
I did found 2 issues, but these are not related to this.
OTOH, you have a point too. Will spend some time to investigate a bit
deeper.
jsmeix commented at 2016-02-12 14:39:¶
Since udev generates device nodes it is a generic issue
that basically before any initial access of a device node
one would have to do something like
udevadm settle --timeout=20
or use the rear function "my_udevsettle".
@astafyev
can you get a rear log file (preferably where rear was called with '-d'
and '-D' debugging options) where it fails for you so that it is more
clear in what exact line in which rear script the initial access of
the
device node happens because there a "udevadm settle" is probably missing
and needs to be inserted.
@gdha
perhaps adding a general "udevadm settle" step in all rear
workflows after all kernel modules have been loaded
could be a single and simple generic solution
(instead of adding "udevadm settle" at various places
before each initial access of a device node).
FYI:
The current scripts where "modprobe
usr/share/rear/layout/prepare/GNU/Linux/15_include_drbd_code.sh modprobe drbd usr/share/rear/layout/prepare/GNU/Linux/17_include_hpraid_code.sh modprobe cciss usr/share/rear/layout/prepare/GNU/Linux/21_load_multipath.sh modprobe dm-multipath usr/share/rear/layout/prepare/default/20_recreate_hpraid.sh modprobe cciss usr/share/rear/lib/network-functions.sh modprobe bonding usr/share/rear/lib/udev-workflow.sh modprobe pcspkr usr/share/rear/prep/default/32_include_uefi_env.sh modprobe efivars usr/share/rear/rescue/GNU/Linux/31_network_devices.sh modprobe bonding modprobe 8021q usr/share/rear/skel/default/etc/scripts/system-setup.d/41-load-special-modules.sh modprobe xenblk usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh many modprobe calls for many modules
If modprobe calls result that udev creates new device nodes,
a "udevadm settle" should happen either after the modprobe call or at
least before the new device node is accessed for the first time.
@astafyev
perhaps it might help for your particular "rear fails with error while
trying to mount backup USB media" issue when you add a line
udevadm settle --timeout=20
as very last line in
usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh
but I don't think that helps because "on modern Fedora" systemd/udev
should do it and then there are already some "my_udevsettle" calls in
usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh
astafyev commented at 2016-02-12 18:51:¶
@gdha
/usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh
script has condition with two branches to use udev or manually load
modules.
I suppose that in your case you are routed to first branch that's why
you do not see any issues.
I see "Loading storage modules..." during restore so in my case the
process goes definitely via second branch i.e. load modules manually:
`
if test -s /etc/modules ; then
while read module options ; do
case "$module" in
(\#*|"") ;;
(*) modprobe -v $module $options;;
esac
done </etc/modules
fi
echo "Loading storage modules..."
for module in $(find /lib/modules/$(uname -r)/kernel/drivers/{scsi,block,ide,message,ata} -type f 2>/dev/null) ; do
case "$(basename $module .ko)" in
(nbd) echo "Module nbd excluded from being autoloaded.";;
(*) modprobe -q $(basename $module .ko);;
esac
done
`
First part loads modules from /etc/modules and in my case it is usb_storage module and process of USB drive initialization starts here. Second part loads storage modules with
modprobe -q $(basename $module .ko)
command. Since all modules have .ko.xz extension nothing is being
loaded.
@jsmeix
You see that in "manual modules loading" branch there is no
"my_udevsettle" or anything else that would wait for usb_storage
initialization.
gdha commented at 2016-02-13 12:52:¶
@astafyev I've fixed script
/usr/share/rear/skel/default/etc/scripts/system-setup.d/40-start-udev-or-load-modules.sh
and the following was the effect:
Module Size Used by
sata_qstor 16384 0
sata_promise 20480 0
sata_nv 28672 0
sata_mv 32768 0
sata_inic162x 20480 0
pdc_adma 16384 0
pata_via 16384 0
pata_triflex 16384 0
pata_sl82c105 16384 0
pata_sil680 16384 0
pata_serverworks 16384 0
sata_sis 16384 0
pata_sis 20480 1 sata_sis
pata_sch 16384 0
pata_rdc 16384 0
pata_piccolo 16384 0
pata_pdc202xx_old 16384 0
sata_sil24 20480 0
pata_pdc2027x 16384 0
pata_pcmcia 20480 0
pata_optidma 16384 0
pata_opti 16384 0
pata_oldpiix 16384 0
pata_ns87415 16384 0
pata_ns87410 16384 0
pata_ninja32 16384 0
pata_netcell 16384 0
pata_mpiix 16384 0
pata_marvell 16384 0
pata_jmicron 16384 0
pata_it821x 20480 0
pata_it8213 16384 0
pata_hpt3x3 16384 0
pata_hpt3x2n 16384 0
pata_hpt37x 20480 0
pata_hpt366 16384 0
pata_efar 16384 0
pata_cypress 16384 0
pata_cmd64x 16384 0
pata_cmd640 16384 0
pata_atp867x 16384 0
pata_atiixp 16384 0
pata_artop 16384 0
pata_amd 20480 0
pata_ali 16384 0
sata_vsc 16384 0
sata_uli 16384 0
sata_sil 16384 0
sata_via 20480 0
ahci_platform 16384 0
libahci_platform 16384 1 ahci_platform
sata_sx4 20480 0
acard_ahci 16384 0
sata_svw 16384 0
mptspi 24576 0
mptsas 61440 0
mptscsih 40960 2 mptsas,mptspi
mptlan 20480 0
mptbase 98304 4 mptlan,mptsas,mptspi,mptscsih
zram 28672 0
virtio_blk 20480 0
umem 20480 0
sx8 24576 0
skd 77824 0
rbd 69632 0
libceph 237568 1 rbd
pktcdvd 36864 0
osdblk 16384 0
nvme 57344 0
null_blk 16384 0
mtip32xx 49152 0
loop 28672 0
floppy 69632 0
drbd 339968 0
lru_cache 16384 1 drbd
libcrc32c 16384 2 drbd,libceph
cciss 114688 0
brd 16384 0
aoe 57344 0
DAC960 81920 0
pmcraid 53248 0
megaraid 40960 0
iscsi_tcp 20480 0
wd719x 20480 0
eeprom_93cx6 16384 1 wd719x
vmw_pvscsi 24576 0
ufshcd_pci 16384 0
ufshcd 49152 1 ufshcd_pci
sym53c8xx 81920 0
stex 24576 0
st 57344 0
snic 90112 0
ses 20480 0
enclosure 16384 1 ses
scsi_transport_srp 20480 0
qla1280 28672 0
qla4xxx 266240 0
tcm_qla2xxx 32768 0
target_core_mod 331776 1 tcm_qla2xxx
qla2xxx 606208 1 tcm_qla2xxx
pm80xx 163840 0
osst 57344 0
mvumi 32768 0
mvsas 53248 0
mpt3sas 192512 0
mpt2sas 188416 0
raid_class 16384 2 mpt2sas,mpt3sas
megaraid_sas 106496 0
megaraid_mbox 36864 0
megaraid_mm 20480 1 megaraid_mbox
lpfc 638976 0
isci 135168 0
libsas 69632 3 isci,mvsas,pm80xx
ipr 122880 0
initio 28672 0
hptiop 24576 0
hpsa 86016 0
gdth 90112 0
fnic 106496 0
dmx3191d 20480 0
scsi_dh_rdac 16384 0
scsi_dh_hp_sw 16384 0
scsi_dh_emc 16384 0
scsi_dh_alua 20480 0
dc395x 40960 0
cxgb4i 40960 0
cxgb4 217088 1 cxgb4i
cxgb3i 36864 0
cxgb3 151552 1 cxgb3i
mdio 16384 1 cxgb3
libcxgbi 61440 2 cxgb3i,cxgb4i
libiscsi_tcp 20480 4 cxgb3i,cxgb4i,iscsi_tcp,libcxgbi
csiostor 131072 0
ch 20480 0
bnx2i 53248 0
fcoe 28672 0
bnx2fc 86016 0
cnic 61440 2 bnx2i,bnx2fc
uio 20480 1 cnic
libfcoe 61440 3 fcoe,fnic,bnx2fc
libfc 110592 4 fcoe,fnic,libfcoe,bnx2fc
bfa 552960 0
scsi_transport_fc 57344 9 bfa,csiostor,fcoe,fnic,lpfc,qla2xxx,libfc,tcm_qla2xxx,bnx2fc
be2iscsi 106496 0
iscsi_boot_sysfs 16384 2 qla4xxx,be2iscsi
libiscsi 53248 8 qla4xxx,libiscsi_tcp,bnx2i,cxgb3i,cxgb4i,be2iscsi,iscsi_tcp,libcxgbi
scsi_transport_iscsi 94208 6 qla4xxx,bnx2i,be2iscsi,iscsi_tcp,libcxgbi,libiscsi
atp870u 36864 0
arcmsr 45056 0
am53c974 16384 0
esp_scsi 32768 1 am53c974
aic7xxx 122880 0
aic79xx 135168 0
advansys 61440 0
aacraid 94208 0
a100u2w 20480 0
BusLogic 90112 0
3w_xxxx 36864 0
3w_sas 28672 0
3w_9xxx 45056 0
scsi_transport_spi 32768 6 esp_scsi,mptspi,sym53c8xx,aic79xx,aic7xxx,dmx3191d
scsi_transport_sas 40960 7 isci,mvsas,mpt2sas,mpt3sas,libsas,mptsas,pm80xx
osd 16384 1 osdblk
libosd 28672 2 osd,osdblk
8139cp 28672 0
mii 16384 1 8139cp
uas 24576 0
snd_hda_codec_generic 73728 1
usb_storage 65536 1 uas
qxl 69632 1
drm_kms_helper 118784 1 qxl
snd_hda_intel 36864 0
snd_hda_codec 126976 2 snd_hda_codec_generic,snd_hda_intel
ttm 90112 1 qxl
virtio_balloon 16384 0
virtio_net 28672 0
snd_hwdep 16384 1 snd_hda_codec
virtio_console 28672 0
virtio_scsi 20480 0
snd_hda_core 61440 3 snd_hda_codec_generic,snd_hda_codec,snd_hda_intel
drm 335872 4 qxl,ttm,drm_kms_helper
ppdev 20480 0
snd_pcm 114688 3 snd_hda_codec,snd_hda_intel,snd_hda_core
joydev 20480 0
snd_timer 32768 1 snd_pcm
snd 77824 6 snd_hwdep,snd_timer,snd_pcm,snd_hda_codec_generic,snd_hda_codec,snd_hda_intel
ata_generic 16384 0
serio_raw 16384 0
virtio_pci 24576 0
pata_acpi 16384 0
soundcore 16384 1 snd
virtio_ring 16384 6 virtio_blk,virtio_net,virtio_pci,virtio_balloon,virtio_console,virtio_scsi
virtio 16384 6 virtio_blk,virtio_net,virtio_pci,virtio_balloon,virtio_console,virtio_scsi
i2c_piix4 24576 0
parport_pc 28672 0
pvpanic 16384 0
parport 49152 2 ppdev,parport_pc
acpi_cpufreq 20480 0
It loaded all kernel modules. That is not desired. Fedora23 loaded all
modules it needed when I used the old script (with .ko).
The problem is most likely a timing issue you encountered. Perhaps a
call to my_udevsettle
is enough?
astafyev commented at 2016-02-13 18:41:¶
@gdha
If loading of all modules are not desired we should remove second branch
with corresponding code. Otherwise if we want to keep case when udev is
not available we should also support ko.xz modules as well.
I found that /sys/kernel/uevent_helper
does not exist and that is why
if [ $systemd_version -gt 190 ] || [ -s /etc/udev/rules.d/00-rear.rules ] && [ -w /sys/kernel/uevent_helper ]; then
condition fails and goes with second route. Is this file existence
mandatory?
gdha commented at 2016-02-15 11:03:¶
udev seems to have:
- Tue Jul 20 2010 Harald Hoyer harald@redhat.com 160-7 - turn off hotplug in /sys/kernel/uevent_helper or /proc/sys/kernel/hotplug
That explains probably why /sys/kernel/uevent_helper
does not exist
anymore on Fedora 23. Systemd is taking over all these tasks. And
systemd-udev is starting via the systemd services availbale within rear
(see
/usr/share/rear/skel/default/usr/lib/systemd/system/systemd-udevd.service
).
We should improve 40-start-udev-or-load-modules.sh to skip if systemd-udevd.service is present
gdha commented at 2016-02-15 15:50:¶
@astafyev tomorrow you would be able to test the updated script. Feedback is welcome
gdha commented at 2016-02-22 15:47:¶
@astafyev any feedback? Would like to complete 1.18
astafyev commented at 2016-02-23 13:10:¶
Sorry for delay, was busy last days. With fixed script flow goes with udev branch as expected. However I still get error message when mounting mount /dev/disk/by-label/REAR-000. It seems that udev settle doesn't work. man says that settle waits until queue is empty but queue can be empty (as USB drive is found) while helpers that create device links may still be working. I increased sleep delay from 1 to 5 and it started working. It may be interim solution but there may be configurations when 5 seconds is not enough so we need more elegant solution.
jsmeix commented at 2016-03-07 10:08:¶
@gdha
in general regarding "sleep" in
https://github.com/rear/rear/commit/c04427c62e68d427d8e31e1f88abc68c4ff5e9f5
and in general regarding how to wait for things that are created by
udev, I submitted right now
https://github.com/rear/rear/issues/791
so that (hopefully) I can implement it properly for a future rear
version. For now the current workaround via "sleep some more" should be
o.k.
[Export of Github issue for rear/rear.]