#531 Issue closed: rear git201501071534 recovery system on Fedora 21: systemd at 100% CPU (/sbin/udevd missing)¶
Labels: enhancement, support / question, fixed / solved / done
jsmeix opened issue at 2015-01-20 11:18:¶
I am testing the rear git201501071534 version on Fedora 21:
When I boot the rear recovery system it works, but there are
two processes, "systemd" and "systemd-journal", which each
continuously run at about 50% CPU usage.
The following dirty hack seems to "fix" it for me:
ln -s /bin/true /sbin/udevd
Details:
RESCUE f42:~ # cat /etc/os-release
NAME=Fedora
VERSION="21 (Twenty One)"
ID=fedora
VERSION_ID=21
PRETTY_NAME="Fedora 21 (Twenty One)"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:21"
HOME_URL="https://fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=21
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=21
RESCUE f42:~ # top
...
 PID USER ... %CPU ... COMMAND
  54 root      50.0    systemd-journal
   1 root      43.8    systemd
RESCUE f42:~ # journalctl
...
systemd[1]: Enqueued job udev.service/start as 14149229
systemd[1]: Installed new job udev.service/start as 14149229
systemd[1]: Trying to enqueue job udev.service/start/replace
systemd[1]: Incoming traffic on udev-kernel.socket
systemd[1]: udev-kernel.socket changed running -> listening
systemd[1]: udev-kernel.socket got notified about service death (failed permanently: no)
systemd[1]: Started udev Kernel Device Manager.
systemd[1]: Job udev.service/start finished, result=done
systemd[1]: Starting of udev.service requested but condition failed. Ignoring.
systemd[1]: ConditionPathExists=/sbin/udevd failed for udev.service.
systemd[1]: udev-kernel.socket changed listening -> running
systemd[1]: Enqueued job udev.service/start as 14149223
systemd[1]: Installed new job udev.service/start as 14149223
systemd[1]: Trying to enqueue job udev.service/start/replace
systemd[1]: Incoming traffic on udev-kernel.socket
systemd[1]: udev-kernel.socket changed running -> listening
systemd[1]: udev-kernel.socket got notified about service death (failed permanently: no)
systemd[1]: Started udev Kernel Device Manager.
systemd[1]: Job udev.service/start finished, result=done
systemd[1]: Starting of udev.service requested but condition failed. Ignoring.
systemd[1]: ConditionPathExists=/sbin/udevd failed for udev.service.
...
RESCUE f42:~ # ls /sbin/udevd
ls: cannot access /sbin/udevd: No such file or directory
RESCUE f42:~ # ps auxw | grep udev
root ... /usr/lib/systemd/systemd-udevd
RESCUE f42:~ # ln -s /bin/true /sbin/udevd
RESCUE f42:~ # top
...
 PID USER ... %CPU ... COMMAND
   1 root       0.0    systemd
RESCUE f42:~ # journalctl
...
systemd[1]: udev.service failed.
systemd[1]: Unit udev.service entered failed state.
systemd[1]: Unit udev-kernel.socket entered failed state.
systemd[1]: udev-kernel.socket changed running -> failed
systemd[1]: udev-kernel.socket got notified about service death (failed permanently: yes)
systemd[1]: Failed to start udev Kernel Device Manager.
systemd[1]: Job udev.service/start finished, result=failed
systemd[1]: udev.service changed dead -> failed
systemd[1]: start request repeated too quickly for udev.service
systemd[1]: Starting udev Kernel Device Manager...
systemd[1]: ConditionPathExists=/sbin/udevd succeeded for udev.service.
jsmeix commented at 2015-01-20 11:47:¶
See also https://bugzilla.opensuse.org/show_bug.cgi?id=908854#c9
gdha commented at 2015-01-20 12:49:¶
That must be introduced by the
./skel/default/usr/lib/systemd/system/udev.service
unit file, and I think the guilty entry might be
Restart=on-failure
in the service definition.
[Unit]
Description=udev Kernel Device Manager
Wants=udev-control.socket udev-kernel.socket
After=udev-control.socket udev-kernel.socket
Before=basic.target
DefaultDependencies=no
ConditionPathExists=/sbin/udevd
[Service]
Type=notify
OOMScoreAdjust=-1000
Sockets=udev-control.socket udev-kernel.socket
Restart=on-failure
ExecStart=/sbin/udevd
Perhaps adding RestartSec=10s
would be enough to reduce the CPU load?
jsmeix commented at 2015-01-20 13:51:¶
Still the same CPU load in the recovery system, now with:
RESCUE f42:~ # cat /usr/lib/systemd/system/udev.service
[Unit]
Description=udev Kernel Device Manager
Wants=udev-control.socket udev-kernel.socket
After=udev-control.socket udev-kernel.socket
Before=basic.target
DefaultDependencies=no
ConditionPathExists=/sbin/udevd
[Service]
Type=notify
OOMScoreAdjust=-1000
Sockets=udev-control.socket udev-kernel.socket
Restart=on-failure
RestartSec=10s
ExecStart=/sbin/udevd
RESCUE f42:~ #
But CPU load is not the actual problem - it is only a symptom.
Regarding the actual problem:
It seems in my original Fedora 21 system there is no udevd executable:
[root@f42 ~]# type -a udevd
-bash: type: udevd: not found
[root@f42 ~]# ls -l /sbin/udevd /usr/sbin/udevd /bin/udevd /usr/bin/udevd
ls: cannot access /sbin/udevd: No such file or directory
ls: cannot access /usr/sbin/udevd: No such file or directory
ls: cannot access /bin/udevd: No such file or directory
ls: cannot access /usr/bin/udevd: No such file or directory
Accordingly it does not seem to make sense to have a systemd unit file with "ExecStart=/sbin/udevd" in the rear recovery system, because that can never work.
gdha commented at 2015-01-20 14:45:¶
@jsmeix That is a service file for older Fedora releases (and/or other
distributions) that still refer to udevd. If you comment out the
Restart=on-failure
line, the problem is gone.
jsmeix commented at 2015-01-20 14:53:¶
FYI:
What "ExecStart=..." exist in my recovery system and what executables are missing there:
RESCUE f42:~ # find /usr/lib/systemd/ | xargs grep -h 'ExecStart=' 2>/dev/null | grep -o '/[^ ]*' | sort -u
/bin/agetty
/bin/dbus-daemon
/bin/systemctl
/etc/scripts/boot
/etc/scripts/run-sshd
/etc/scripts/run-syslog
/etc/scripts/system-setup
/lib/systemd/systemd-logger
/lib/systemd/systemd-shutdownd
/sbin/agetty
/sbin/udevd
/usr/bin/udevadm
/usr/lib/systemd/systemd-journald
/usr/lib/systemd/systemd-udevd
RESCUE f42:~ # for e in $( find /usr/lib/systemd/ | xargs grep -h 'ExecStart=' 2>/dev/null | grep -o '/[^ ]*' | sort -u ) ; do ls -l $e | grep 'No such' ; done
ls: cannot access /lib/systemd/systemd-logger: No such file or directory
ls: cannot access /sbin/udevd: No such file or directory
RESCUE f42:~ #
/lib/systemd/systemd-logger is used only in /usr/lib/systemd/system/systemd-logger.service as ExecStart=/lib/systemd/systemd-logger
/sbin/udevd is used only in /usr/lib/systemd/system/udev.service as
ConditionPathExists=/sbin/udevd and ExecStart=/sbin/udevd
I think when the rear recovery system is generated, there should be a test that ensures the executables needed by systemd unit files actually exist.
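A very rough sketch of what such a test could look like (purely illustrative, not existing rear code), assuming it would run as a build-stage script where ROOTFS_DIR points to the recovery system tree, as in the existing build scripts:
# Illustrative only: warn about ExecStart binaries that are referenced by
# systemd unit files in the recovery system but do not exist there.
for unit in "$ROOTFS_DIR"/usr/lib/systemd/system/*.service ; do
    # Strip the "ExecStart*=" part and systemd's special prefix characters (- @ + !).
    for exe in $( grep -h '^ExecStart' "$unit" 2>/dev/null | sed -e 's/^[^=]*=//' -e 's/^[-@+!]*//' | awk '{ print $1 }' ) ; do
        test -x "$ROOTFS_DIR$exe" && continue
        echo "Warning: $unit references $exe which is missing in the recovery system" >&2
    done
done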
gdha commented at 2015-01-20 15:17:¶
@jsmeix Basically I agree with you. However, systemd development is
moving so fast and differs between distributions that it is very hard
for us to keep track of what is used where and what has become obsolete
in the meantime. Remember, this is part of the skel tree.
If we want to change this, we would have to create additional scripts
that figure this out during the recovery image build. Not impossible, but
time-consuming and error-prone, because it is impossible to test all
possible combinations in a limited time period (which means other stuff
may break).
To be honest, I do not care much about some missing executables, because once the system is recovered, rear's work is finished. I care about rear's core functionality: doing a proper recovery.
jsmeix commented at 2015-01-20 15:33:¶
I fully agree that the systemd pieces are, at least currently, such fast-moving targets that others simply cannot keep up with that madness.
Feel free to close this issue.
Nevertheless I am willing to have a look at whether it is possible to add some check scripts that implement very basic tests of whether the rear recovery system is self-consistent.
In that case I would like some initial hints on how to implement this, in particular where such scripts should be located in rear and how to make them run automatically during "rear mkbackup/mkrescue".
jsmeix commented at 2015-01-20 15:48:¶
FYI:
What "ExecStart=..." exist in my recovery system for openSUSE 13.2
and what executables are missing there:
RESCUE f74:~ # find /usr/lib/systemd/ | xargs grep -h 'ExecStart=' 2>/dev/null | grep -o '/[^ ]*' | sort -u
/bin/agetty
/bin/dbus-daemon
/bin/systemctl
/etc/scripts/boot
/etc/scripts/run-sshd
/etc/scripts/run-syslog
/etc/scripts/system-setup
/lib/systemd/systemd-logger
/lib/systemd/systemd-shutdownd
/sbin/agetty
/sbin/udevd
/usr/bin/udevadm
/usr/lib/systemd/systemd-journald
/usr/lib/systemd/systemd-udevd
RESCUE f74:~ # for e in $( find /usr/lib/systemd/ | xargs grep -h 'ExecStart=' 2>/dev/null | grep -o '/[^ ]*' | sort -u ) ; do ls -l $e | grep 'No such' ; done
ls: cannot access /lib/systemd/systemd-logger: No such file or directory
ls: cannot access /lib/systemd/systemd-shutdownd: No such file or directory
RESCUE f74:~ # find /usr/lib/systemd/ | xargs grep '/lib/systemd/systemd-logger' 2>/dev/null
/usr/lib/systemd/system/systemd-logger.service:ExecStart=/lib/systemd/systemd-logger
RESCUE f74:~ # find /usr/lib/systemd/ | xargs grep '/lib/systemd/systemd-shutdownd' 2>/dev/null
/usr/lib/systemd/system/systemd-shutdownd.service:ExecStart=/lib/systemd/systemd-shutdownd
gdha commented at 2015-01-23 15:24:¶
@jsmeix about:
Nevertheless I am willing to have a look at whether it is possible to add some check scripts that implement very basic tests of whether the rear recovery system is self-consistent.
In that case I would like some initial hints on how to implement this, in particular where such scripts should be located in rear and how to make them run automatically during "rear mkbackup/mkrescue".
I would add these scripts somewhere in the build stage (use the -s option to see the script ordering; see also the sketch after the list below):
Source rescue/GNU/Linux/95_cfg2html.sh
Source rescue/GNU/Linux/96_collect_MC_serviceguard_infos.sh
Source build/GNU/Linux/00_create_symlinks.sh
Source build/GNU/Linux/09_create_lib_directories_and_symlinks.sh
Source build/GNU/Linux/10_copy_as_is.sh
Source build/GNU/Linux/11_touch_empty_files.sh
Source build/GNU/Linux/13_create_dotfiles.sh
Source build/GNU/Linux/15_adjust_permissions.sh
Source build/GNU/Linux/16_adjust_sshd_config.sh
Source build/GNU/Linux/39_copy_binaries_libraries.sh
Source build/GNU/Linux/40_copy_modules.sh
Source build/default/50_patch_sshd_config.sh
Source build/GNU/Linux/60_verify_and_adjust_udev.sh
Source build/GNU/Linux/61_verify_and_adjust_udev_systemd.sh
Source build/default/96_remove_encryption_keys.sh
Source build/default/97_add_rear_release.sh
Source build/default/98_verify_rootfs.sh
Source build/default/99_update_os_conf.sh
Source pack/Linux-i386/30_copy_kernel.sh
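For what it is worth, such a check could probably be dropped in as one additional build-stage script; the file name below is only a made-up example, and simulation mode can be used to confirm it would get sourced in the right place:
# Hypothetical file name, for illustration only:
#   /usr/share/rear/build/GNU/Linux/62_verify_systemd_unit_executables.sh
# Check with simulation mode that the build stage would pick it up:
rear -s mkrescue | grep 'build/GNU/Linux/62_'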
jsmeix commented at 2015-01-26 13:12:¶
Right now I found another issue where something is missing
in the recovery system:
When a vfat filesystem should be recreated, "rear recover" fails with
/var/lib/rear/layout/diskrestore.sh: line 156: dosfslabel: command not found
One can manually make it work by adding
PROGS=( ${PROGS[@]} dosfslabel )
in /etc/rear/local.conf, but it would be nicer if "rear mkbackup" itself
detected what is needed in the recovery system and reported an error
if something that is needed for the recovery system cannot be found
in the original system where "rear mkbackup" runs.
gdha commented at 2015-01-26 17:46:¶
Try a 'grep -r dosfslabel .' and you will see it is included for EFI
systems where a vfat partition is required. Do you need it for
something else?
jsmeix commented at 2015-01-27 09:03:¶
Out of curiosity I created on my test system an additional normal data partition (/dev/sda7) with a FAT filesystem on it.
In 23_filesystem_layout.sh there is
(vfat) ... label=$(dosfslabel $device | tail -1)
and in 13_include_filesystem_code.sh there is
(vfat) ... echo "dosfslabel $device $label >&2" >> "$LAYOUT_CODE"
gdha commented at 2015-01-27 11:00:¶
Was the FAT partition mounted? If yes, then you found a defect.
jsmeix commented at 2015-01-27 11:21:¶
It is mounted.
On the original system (it is the SLES12 system that I used in https://github.com/rear/rear/issues/533 for testing MD stuff; just for fun I used the last remaining cylinder of the disk as a separate partition, and because it is so small I put a FAT filesystem on it):
f143:~ # mount -t ext2,ext4,vfat
/dev/md0 on / type ext4 (rw,relatime,stripe=16,data=ordered)
/dev/sda7 on /remainder type vfat (rw,nosuid,nodev,noexec,relatime,gid=100,fmask=0002,dmask=0002,allow_utime=0020,codepage=437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro)
/dev/sda2 on /boot type ext2 (rw,relatime)
f143:~ # mdadm --detail /dev/md0
/dev/md0:
        Version : 1.0
  Creation Time : Mon Jan 26 13:21:59 2015
     Raid Level : raid0
     Array Size : 20945856 (19.98 GiB 21.45 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent
    Update Time : Mon Jan 26 13:21:59 2015
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
     Chunk Size : 32K
           Name : any:0
           UUID : d83d58d6:10642523:9803931f:4969ea21
         Events : 0
    Number   Major   Minor   RaidDevice State
       0       8        5        0      active sync   /dev/sda5
       1       8        6        1      active sync   /dev/sda6
f143:~ # fdisk -l -u=cylinders
Disk /dev/sda: 24 GiB, 25769803776 bytes, 50331648 sectors
Geometry: 255 heads, 63 sectors/track, 3133 cylinders
Units: cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000edd6b
Device     Boot Start   End  Cylinders Size Id Type
/dev/sda1            1   393       392   3G 82 Linux swap / Solaris
/dev/sda2  *       393   524       132   1G 83 Linux
/dev/sda3          524  3134      2610  20G  f W95 Ext'd (LBA)
/dev/sda5          524  1827      1304  10G fd Linux raid autodetect
/dev/sda6         1828  3131      1304  10G fd Linux raid autodetect
/dev/sda7         3132  3132         1   7M  c W95 FAT32 (LBA)
Disk /dev/md0: 20 GiB, 21448556544 bytes, 41891712 sectors
Geometry: 2 heads, 4 sectors/track, 5236464 cylinders
Units: cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 32768 bytes / 65536 bytes
gdha commented at 2015-01-27 11:30:¶
Just added dosfslabel to the conf/GNU/Linux.conf file, which should fix
your vfat dosfslabel issue.
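Presumably that amounts to appending dosfslabel to the PROGS array there; a sketch of the kind of change meant (the exact context in conf/GNU/Linux.conf may differ):
# Sketch only: have dosfslabel copied into the recovery system by default.
PROGS=( "${PROGS[@]}" dosfslabel )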
gdha commented at 2015-01-28 16:15:¶
@jsmeix can this issue be closed?
jsmeix commented at 2015-01-29 10:00:¶
@gdha
I am sorry. I misused this issue when I found that "dosfslabel" is missing in
the recovery system. I should have submitted a new separate issue for
"dosfslabel". The "dosfslabel" part of this issue is fixed now.
This issue is originally about "on Fedora 21: systemd at 100% CPU (/sbin/udevd missing)". In my last test with Fedora 21 it still happened, so I think the initial issue is not yet fixed.
gdha commented at 2015-01-30 15:56:¶
Guess we can add some more lines in here to remove the obsolete
udev.service (a sketch of such lines follows the trace below):
2015-01-30 14:55:25 Including build/GNU/Linux/61_verify_and_adjust_udev_systemd.sh
+ . /usr/share/rear/build/GNU/Linux/61_verify_and_adjust_udev_systemd.sh
++ test -d /tmp/rear.Vd25lI6NQG7yU5K/rootfs/usr/lib/systemd/system
++ Log 'Cleaning up systemd udev socket files'
++ test 1 -gt 0
+++ Stamp
+++ date '+%Y-%m-%d %H:%M:%S '
++ echo '2015-01-30 14:55:25 Cleaning up systemd udev socket files'
2015-01-30 14:55:25 Cleaning up systemd udev socket files
++ my_udev_files=($(find $ROOTFS_DIR/usr/lib/systemd/system/sockets.target.wants -type l -name "*udev*" -printf "%P\n"))
+++ find /tmp/rear.Vd25lI6NQG7yU5K/rootfs/usr/lib/systemd/system/sockets.target.wants -type l -name '*udev*' -printf '%P\n'
++ for m in '"${my_udev_files[@]}"'
++ [[ ! -h /lib/systemd/system/sockets.target.wants/systemd-udevd-control.socket ]]
++ for m in '"${my_udev_files[@]}"'
++ [[ ! -h /lib/systemd/system/sockets.target.wants/systemd-udevd-kernel.socket ]]
++ for m in '"${my_udev_files[@]}"'
++ [[ ! -h /lib/systemd/system/sockets.target.wants/udev-control.socket ]]
++ [[ ! -h /usr/lib/systemd/system/sockets.target.wants/udev-control.socket ]]
++ rm -f /tmp/rear.Vd25lI6NQG7yU5K/rootfs/usr/lib/systemd/system/sockets.target.wants/udev-control.socket
++ for m in '"${my_udev_files[@]}"'
++ [[ ! -h /lib/systemd/system/sockets.target.wants/udev-kernel.socket ]]
++ [[ ! -h /usr/lib/systemd/system/sockets.target.wants/udev-kernel.socket ]]
++ rm -f /tmp/rear.Vd25lI6NQG7yU5K/rootfs/usr/lib/systemd/system/sockets.target.wants/udev-kernel.socket
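A rough sketch of the kind of additional lines meant here (illustrative only, not the actual change), using the same ROOTFS_DIR and Log conventions as in the trace above:
# Illustrative only: if the udev.service shipped in the skel tree points at an
# ExecStart binary that does not exist in the recovery system, remove the unit
# (and its legacy sockets) so systemd cannot loop on restarting it.
udev_unit="$ROOTFS_DIR/usr/lib/systemd/system/udev.service"
if test -f "$udev_unit" ; then
    udevd_binary=$( grep '^ExecStart=' "$udev_unit" | cut -d= -f2 )
    if ! test -x "$ROOTFS_DIR$udevd_binary" ; then
        Log "Removing obsolete udev.service ($udevd_binary does not exist in the recovery system)"
        rm -f "$udev_unit"
        rm -f "$ROOTFS_DIR"/usr/lib/systemd/system/sockets.target.wants/udev-*.socket
    fi
fi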
gdha commented at 2015-02-06 13:22:¶
Did a test and now we have a sane CPU usage again! Fixed.
gdha commented at 2015-02-23 13:49:¶
Added to the release notes, so we can close this issue.
[Export of Github issue for rear/rear.]