#2044 Issue closed: WARNING: Failed to connect to lvmetad. Falling back to device scanning¶
Labels: enhancement, fixed / solved / done
gdha opened issue at 2019-02-16 12:43:¶
- ReaR version ("/usr/sbin/rear -V"): any
- OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"): CentOS/RHEL
- System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device): x86_64
- Description of the issue (ideally so that others can reproduce it): when using LVM we see lots of these warnings in the log file: "WARNING: Failed to connect to lvmetad. Falling back to device scanning"
- Workaround, if any: see https://unix.stackexchange.com/questions/332556/arch-linux-installation-grub-problem - edit your /etc/lvm/lvm.conf and set use_lvmetad = 0 (still needs to be verified; see the sketch below)
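The suggested workaround boils down to disabling lvmetad in the LVM configuration of the environment that prints the warning (here the ReaR rescue system). A minimal sketch of that change, assuming a stock lvm.conf that already contains the setting (the grep/sed invocations are only illustrative):
# Show the current setting (only present in LVM builds with lvmetad support):
grep -n '^[[:space:]]*use_lvmetad' /etc/lvm/lvm.conf
# Disable lvmetad so the LVM tools scan devices directly
# instead of warning about a missing lvmetad:
sed -i -e 's/^[[:space:]]*use_lvmetad[[:space:]]*=.*/    use_lvmetad = 0/' /etc/lvm/lvm.conf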
gozora commented at 2019-02-16 15:11:¶
Might this be related to https://github.com/rear/rear/issues/2035 ?
V.
jsmeix commented at 2019-02-18 08:52:¶
@gdha
I also noticed those warnings and got worried about what might be wrong,
but those warnings are nothing but a perfect example of a useless
"WARNING is a waste of my time", cf.
https://blog.schlomo.schapiro.org/2015/04/warning-is-waste-of-my-time.html
At least for me (using SLES) LVM "just worked" without lvmetad.
I would appreciate it if we could silence these useless warnings.
@gozora
I think @gdha means here etc/lvm/lvm.conf in the recovery system.
In contrast - as far as I understand it -
https://github.com/rear/rear/issues/2035
is about LVM in the recreated system, because there things fail (hang up)
within a chroot $TARGET_FS_ROOT,
so that one would have to edit $TARGET_FS_ROOT/etc/lvm/lvm.conf
to make a difference there.
But we cannot "just change" files in the recreated system because
in general the restored files from the user's backup are sacrosanct.
gozora commented at 2019-02-18 09:02:¶
@jsmeix
My
https://github.com/rear/rear/issues/2044#issuecomment-464354380
was a reaction to
https://unix.stackexchange.com/questions/332556/arch-linux-installation-grub-problem
(provided by @gdha) which says (excerpt):
...
/run/lvm/lvmetad.socket: connect failed: No such file or directory
or
WARNING: failed to connect to lvmetad: No such file or directory. Falling back to internal scanning.
This is because /run is not available inside the chroot. These warnings will not prevent the system from booting, provided that everything has been done correctly, so you may continue with the installation.
So at the end of the day, we might not need to modify anything in the recovered system ($TARGET_FS_ROOT/etc/lvm/lvm.conf) - just plainly mounting /run might help. I guess it is at least worth trying ...
V.
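What is suggested here would amount to making the rescue system's /run visible inside the chroot before any LVM command runs there. A minimal sketch, assuming the recreated system is mounted at $TARGET_FS_ROOT as during "rear recover" (the explicit mount/umount calls are illustrative, not existing ReaR code):
# Make the rescue system's /run (and thus /run/lvm/lvmetad.socket, if any)
# visible inside the chroot:
mkdir -p $TARGET_FS_ROOT/run
mount --bind /run $TARGET_FS_ROOT/run
# LVM commands inside the chroot can now reach lvmetad:
chroot $TARGET_FS_ROOT lvm pvs
# Undo the bind mount afterwards:
umount $TARGET_FS_ROOT/run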
jsmeix commented at 2019-02-18 09:30:¶
Yes,
if LVM can no longer work without lvmetad or certain things in /run
we should bind-mount /run at $TARGET_FS_ROOT/run.
My point was that for me (on SLES) LVM "just worked"
without lvmetad or certain things in /run - same as in
https://unix.stackexchange.com/questions/332556/arch-linux-installation-grub-problem
("These warnings will not prevent the system from booting"),
so these warnings are useless and can be silenced.
But the behaviour in issue
https://github.com/rear/rear/issues/2035
is different because there things do not work and then
the LVM programs should error out (instead of endless waiting)
when they can no longer work without lvmetad or certain things in /run.
gozora commented at 2019-02-18 09:41:¶
@jsmeix
Things might "just work" for you because of older/different version of
LVM ;-).
But the behaviour in issue #2035
is different because there things do not work and then
the LVM programs should error out (instead of endless waiting)
when they can no longer work without lvmetad or certain things in /run.
I don't think there is some kind of endless waiting. There are just too many block devices that LVM would like to scan, and timing out after 10 seconds on each of them might just appear to be endless ...
@gdha, @jsmeix just for comparison, can you post here what versions of LVM you are using on your particular systems?
V.
jsmeix commented at 2019-02-18 11:27:¶
SLES15 and openSUSE Leap 15.0:
# lvm version
LVM version: 2.02.177(2) (2017-12-18)
Library version: 1.03.01 (2017-12-18)
Driver version: 4.37.0
SLES12-SP4
# lvm version
LVM version: 2.02.180(2) (2018-07-19)
Library version: 1.03.01 (2018-07-19)
Driver version: 4.37.0
SLES11-SP4
# lvm version
LVM version: 2.02.98(2) (2012-10-15)
Library version: 1.03.01 (2011-10-15)
Driver version: 4.25.0
gozora commented at 2019-02-18 11:29:¶
Wow, SLES12-SP4 runs a newer version of LVM than SLES15 ...
V.
jsmeix commented at 2019-02-18 11:32:¶
Seems so - I saw it and double checked it - that's what lvm version outputs
(which does not take possible SUSE-specific patches into account) ...
jsmeix commented at 2019-02-22 11:33:¶
With
https://github.com/rear/rear/pull/2047
merged, /proc/ /sys/ /dev/ and /run/ are only bind-mounted into
TARGET_FS_ROOT at the beginning of the finalize stage during "rear recover",
so this won't help for anything that happens during "rear recover"
before its finalize stage - in particular
https://github.com/rear/rear/pull/2047
cannot help when the disk layout gets recreated during "rear recover".
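For orientation, the bind mounts described above could be done with a loop like the following sketch. This only illustrates the general technique and is not the actual code of pull request 2047:
# Bind-mount the kernel/virtual filesystems plus /run into the recreated
# system so that chrooted tools (bootloader installers, LVM, ...) find them:
for dir in /proc /sys /dev /run ; do
    mkdir -p $TARGET_FS_ROOT$dir
    mount --bind $dir $TARGET_FS_ROOT$dir
done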
gdha commented at 2019-03-27 13:11:¶
on Ubuntu 18.04:
RESCUE client:~ # lvm version
LVM version: 2.02.176(2) (2017-11-03)
Library version: 1.02.145 (2017-11-03)
Driver version: 4.37.0
Configuration: ./configure --build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu --libexecdir=${prefix}/lib/x86_64-linux-gnu --runstatedir=/run --disable-maintainer-mode --disable-dependency-tracking --exec-prefix= --bindir=/bin --libdir=/lib/x86_64-linux-gnu --sbindir=/sbin --with-usrlibdir=/usr/lib/x86_64-linux-gnu --with-optimisation=-O2 --with-cache=internal --with-clvmd=corosync --with-cluster=internal --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --with-default-pid-dir=/run --with-default-run-dir=/run/lvm --with-default-locking-dir=/run/lock/lvm --with-thin=internal --with-thin-check=/usr/sbin/thin_check --with-thin-dump=/usr/sbin/thin_dump --with-thin-repair=/usr/sbin/thin_repair --enable-applib --enable-blkid_wiping --enable-cmdlib --enable-cmirrord --enable-dmeventd --enable-dbus-service --enable-lvmetad --enable-lvmlockd-dlm --enable-lvmlockd-sanlock --enable-lvmpolld --enable-notify-dbus --enable-pkgconfig --enable-readline --enable-udev_rules --enable-udev_sync
And I can confirm that by defining the following in /etc/lvm/lvm.conf:
use_lvmetad = 0
the problem disappears.
We can create a script
/usr/share/rear/build/GNU/Linux/640_verify_lvm_conf.sh
to modify this setting (see the sketch below).
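A minimal sketch of what such a build-stage script could look like. This is only an illustration of the idea, not the script that was eventually merged, and it assumes that $ROOTFS_DIR points at the recovery system tree that ReaR populates during its build stage:
# Sketch for build/GNU/Linux/640_verify_lvm_conf.sh (illustrative only):
lvm_conf="$ROOTFS_DIR/etc/lvm/lvm.conf"
if test -f "$lvm_conf" ; then
    # Disable lvmetad in the recovery system so that the LVM tools there
    # fall back to direct device scanning without the lvmetad warning:
    sed -i -e 's/^[[:space:]]*use_lvmetad[[:space:]]*=.*/    use_lvmetad = 0/' "$lvm_conf"
fi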
jsmeix commented at 2019-03-28 10:14:¶
@gdha
I would appreciate such a
usr/share/rear/build/GNU/Linux/640_verify_lvm_conf.sh
script.
On SLES10 and SLES11 lvm version
does not show its Configuration:
in contrast to SLES12:
# lvm version
LVM version: 2.02.180(2) (2018-07-19)
Library version: 1.03.01 (2018-07-19)
Driver version: 4.37.0
Configuration: ./configure --host=x86_64-suse-linux-gnu --build=x86_64-suse-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/lib --localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man --infodir=/usr/share/info --disable-dependency-tracking --prefix=/ --libdir=/lib64 --with-usrlibdir=/usr/lib64 --with-usrsbindir=/usr/sbin --sbindir=/sbin --enable-dmeventd --enable-udev_sync --enable-udev_rules --enable-cmdlib --enable-applib --enable-dmeventd --enable-realtime --enable-pkgconfig --enable-selinux --with-clvmd=corosync --with-cluster=internal --datarootdir=/usr/share --with-default-locking-dir=/run/lock/lvm --enable-cmirrord --enable-lvmetad --with-default-pid-dir=/run --with-default-dm-run-dir=/run --with-default-run-dir=/run/lvm --with-tmpfilesdir=/usr/lib/tmpfiles.d --with-thin=internal --with-device-gid=6 --with-device-mode=0640 --with-device-uid=0 --with-dmeventd-path=/sbin/dmeventd --with-thin-check=/usr/sbin/thin_check --with-thin-dump=/usr/sbin/thin_dump --with-thin-repair=/usr/sbin/thin_repair --with-udev-prefix=/usr/ --enable-blkid-wiping --enable-lvmpolld
and openSUSE Leap 15.0:
# lvm version
LVM version: 2.02.177(2) (2017-12-18)
Library version: 1.03.01 (2017-12-18)
Driver version: 4.37.0
Configuration: ./configure --host=x86_64-suse-linux-gnu --build=x86_64-suse-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/lib --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --disable-dependency-tracking --enable-dmeventd --enable-cmdlib --enable-udev_rules --enable-udev_sync --with-udev-prefix=/usr/ --enable-selinux --enable-pkgconfig --with-usrlibdir=/usr/lib64 --with-usrsbindir=/usr/sbin --with-default-dm-run-dir=/run --with-tmpfilesdir=/usr/lib/tmpfiles.d --with-thin=internal --with-device-gid=6 --with-device-mode=0640 --with-device-uid=0 --with-dmeventd-path=/usr/sbin/dmeventd --with-thin-check=/usr/sbin/thin_check --with-thin-dump=/usr/sbin/thin_dump --with-thin-repair=/usr/sbin/thin_repair --enable-applib --enable-blkid_wiping --enable-cmdlib --enable-lvmetad --enable-lvmpolld --enable-realtime --with-cache=internal --with-default-locking-dir=/run/lock/lvm --with-default-pid-dir=/run --with-default-run-dir=/run/lvm
which both contain "Configuration: ... --enable-lvmetad ...".
On SLES10 and SLES11 /etc/lvm/lvm.conf does not contain lvmetad,
in contrast to SLES12 and openSUSE Leap 15.0 which contain:
# find /etc/lvm/ | xargs grep -i lvmetad
/etc/lvm/lvm.conf: debug_classes = [ "memory", "devices", "activation", "allocation", "lvmetad", "metadata", "cache", "locking", "lvmpolld", "dbus" ]
...
/etc/lvm/lvm.conf: use_lvmetad = 1
On SLES10 and SLES11 there is no lvmetad process running,
in contrast to SLES12 and openSUSE Leap 15.0:
# ps auxw | grep lvmetad
root ... /usr/sbin/lvmetad -f
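Taken together, the three observations above (the build options, the lvm.conf setting, and the running daemon) could be used to decide whether such a script needs to touch lvm.conf at all. A hedged sketch of that detection, using only commands already shown in this thread plus pgrep:
# Only systems whose LVM build knows about lvmetad are affected:
if lvm version | grep -q -- '--enable-lvmetad' \
   && grep -q '^[[:space:]]*use_lvmetad[[:space:]]*=[[:space:]]*1' /etc/lvm/lvm.conf \
   && pgrep -x lvmetad >/dev/null
then
    echo "lvmetad is in use on this system"
else
    echo "lvmetad is not in use - nothing to adjust"
fi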
viper1986 commented at 2019-04-03 10:26:¶
Hello,
I have a similar error in my rear recover.
I perform a restore to another LPAR on PowerVM. OS: SLES 12 PPC64LE.
I tried to modify /etc/lvm/lvm.conf and set use_lvmetad = 0 but it is not working.
lvmetad.socket.txt
use_lvmetad.txt
gdha commented at 2019-04-03 10:34:¶
@viper1986 the issue has nothing to do with the lvmetad warning - a duplicate UUID for a PV was detected:
+++ lvm pvcreate -ff --yes -v --uuid 2qVfcO-mTQp-NzPW-Xphj-ezrY-HMNy-OJN7Fi --norestorefile /dev/mapper/mpathp
Found duplicate PV 7DGexv5mGb83xwSTdIaKDHzahPeHZ87u: using /dev/disk/by-id/dm-name-mpathp-part2 not /dev/disk/by-id/dm-name-mpathi-part2
Using duplicate PV /dev/disk/by-id/dm-name-mpathp-part2 which is last seen, replacing /dev/disk/by-id/dm-name-mpathi-part2
Device /dev/mapper/mpathp not found (or ignored by filtering). Please run with -vvv option for more details
viper1986 commented at 2019-04-03 11:29:¶
I deleted all VGs, LVs and PVs from the rescue system.
But I got:
Device /dev/mapper/mpathp not found (or ignored by filtering). Please run with -vvv option for more details
I ran fdisk and saw that some partitions still exist:
RESCUE dwrdev01:~ # fdisk /dev/mapper/mpathp
Welcome to fdisk (util-linux 2.28).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): p
Disk /dev/mapper/mpathp: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 32768 bytes / 32768 bytes
Disklabel type: dos
Disk identifier: 0x5bc10d00
Device Boot Start End Sectors Size Id Type
/dev/mapper/mpathp-part1 * 4096 419839 415744 203M 6 FAT16
/dev/mapper/mpathp-part2 419848 41943039 41523192 19.8G 8e Linux LVM
Partition 2 does not start on physical sector boundary.
Command (m for help): d
Partition number (1,2, default 2): 1
Partition 1 has been deleted.
Command (m for help): d
Selected partition 2
Partition 2 has been deleted.
Command (m for help): p
Disk /dev/mapper/mpathp: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 32768 bytes / 32768 bytes
Disklabel type: dos
Disk identifier: 0x5bc10d00
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Invalid argument
The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8).
RESCUE dwrdev01:~ # fdisk -l /dev/mapper/mpathp
Disk /dev/mapper/mpathp: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 32768 bytes / 32768 bytes
Disklabel type: dos
Disk identifier: 0x5bc10d00
Now recover works.
But my question is: why does ReaR not delete existing partitions on disks?
jsmeix commented at 2019-04-03 14:34:¶
@viper1986
according to
https://github.com/rear/rear/issues/2044#issuecomment-479434057
your issue has nothing to do with this issue here
so that you would need to report your issue as a new separate issue
via the [New issue] button at
https://github.com/rear/rear/issues
When you restore to another LPAR on PowerVM
that other LPAR must have fully clean storage and be fully
compatible with the LPAR of your original system, cf. the sections
"Fully compatible replacement hardware is needed" and
"Prepare replacement hardware for disaster recovery" in
https://en.opensuse.org/SDB:Disaster_Recovery
Because of
https://github.com/rear/rear/issues/2044#issuecomment-479449884
it seems what was missing in this case was to
"Prepare replacement hardware for disaster recovery".
ReaR does delete existing partitions on disks via parted mklabel,
but that only deletes the plain partitioning data and not any other
kind of remainder data on an already used disk. Removing such remainders
is what wipefs is supposed to do when it is run in reverse order on each
storage object (listed by lsblk as KNAME), starting from higher-level
storage objects (like partitions) down to lower-level storage objects
(like whole disks) - except for cases where only dd helps (see below).
As long as we do not have a "cleanupdisk" script in ReaR,
cf.
https://github.com/rear/rear/issues/799
you should clear out your target disk if it has ever been used before,
to be on the safe side in general against unexpected weird issues
caused by whatever kind of remainder data on an already used disk.
Cf.
https://github.com/rear/rear/issues/2019#issuecomment-476598723
and the subsequent comments therein.
The interesting result in that case was that the only really reliably
working way was to completely zero out the replacement storage
via a "dumb brute force" command like
dd if=/dev/zero of=/dev/whole_disk
i.e. wipefs alone was not sufficient - and only deleting plain partitions
is in general not at all sufficient to remove any kind of remainder data
on an already used disk (for example remainders of RAID
or partition-table signatures and other kinds of "magic strings"
like LVM metadata and whatever else) ...
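The cleanup described above (wipefs on every block device in reverse order, with dd as the brute-force fallback) could look roughly like this sketch. The device name is an example and the loop is not an existing ReaR script:
disk=/dev/sdX   # example target disk - replace with the real device
# Wipe filesystem/RAID/LVM signatures from higher-level storage objects
# (partitions) down to the whole disk, i.e. in reverse lsblk order:
for kname in $( lsblk -nlo KNAME "$disk" | tac ) ; do
    wipefs --all --force /dev/$kname
done
# When stale metadata still confuses the tools, zeroing out the whole disk
# is the "dumb brute force" fallback mentioned above (destructive!):
# dd if=/dev/zero of="$disk" bs=1M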
jsmeix commented at 2019-04-05 12:46:¶
With
https://github.com/rear/rear/pull/2107
merged, this issue here as described in
https://github.com/rear/rear/issues/2044#issue-411069484
should be fixed.
[Export of Github issue for rear/rear.]