#3085 Issue closed: write_protected_candidate_device called for '/sys/block/nvme0c0n1' but '/dev/nvme0c0n1' is no block device
Labels: enhancement, bug
hitrikrtek opened issue at 2023-11-17 08:12:
- ReaR version ("/usr/sbin/rear -V"):
Relax-and-Recover 2.7 / 2022-07-13
- If your ReaR version is not the current version, explain why you can't upgrade:
Latest version with distribution
- OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):
NAME="SLES"
VERSION="15-SP5"
VERSION_ID="15.5"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP5"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp5"
- ReaR configuration files ("cat /etc/rear/local.conf"):
BACKUP=NETFS
OUTPUT=ISO
BACKUP_URL=nfs://172.17.3.1/rear
BACKUP_OPTIONS=
NETFS_KEEP_OLD_BACKUP_COPY=yes
USE_DHCLIENT=
MODULES_LOAD=( )
#BACKUP_PROG_INCLUDE=("{BACKUP_PROG_INCLUDE[@]}" '/boot/grub2/i386-pc' '/boot/grub2/x86_64-efi' '/home' '/opt' '/root' '/srv' '/hana/shared' '/usr' '/var')
BACKUP_PROG_EXCLUDE=("{BACKUP_PROG_EXCLUDE[@]}" '/hana/backup' '/hana/data' '/hana/log' '/hana/shared/HSP/HDB05/backup' '/hana/shared/NSP/HDB50/backup' '/opt/dpsapps/agentsvc/dbs' '/opt/dpsapps/agentsvc/logs' '/tmp')
POST_RECOVERY_SCRIPT=('if snapper --no-dbus -r $TARGET_FS_ROOT get-config | grep -q "^QGROUP.*[0-9]/[0-9]" ; then snapper --no-dbus -r $TARGET_FS_ROOT set-config QGROUP= ; snapper --no-dbus -r $TARGET_FS_ROOT setup-quota && echo snapper setup-quota done || echo snapper setup-quota failed ; else echo snapper setup-quota not used ; fi')
REQUIRED_PROGS=(snapper chattr lsattr ${REQUIRED_PROGS[@]})
COPY_AS_IS=(/usr/lib/snapper/installation-helper /etc/snapper/config-templates/default ${COPY_AS_IS[@]})
EXCLUDE_MOUNTPOINTS=("/mnt")
ISO_MKISOFS_BIN=/usr/bin/ebiso
- Hardware vendor/product (PC or PowerNV BareMetal or ARM) or VM (KVM guest or PowerVM LPAR):
DELL PowerEdge R860
- System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
x86_64
- Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
BIOS Version 1.5.6
grub2-x86_64-efi-2.06-150500.29.8.1
- Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
NVMe in RAID configuration
Backplane 1 on Connector 0 of RAID Controller in SL 1
DeviceDescription Backplane 1 on Connector 0 of RAID Controller in SL 1
DeviceType PCIeSSDBackPlane
FirmwareVersion 7.10
FQDD Enclosure.Internal.0-1:RAID.SL.1-1
InstanceID Enclosure.Internal.0-1:RAID.SL.1-1
MediaType Solid State Drive
PCIExpressGeneration Gen 4
ProductName BP_PSV 0:1
RollupStatus OK
SlotCount 8
WiredOrder 1
Backplane 2 on Connector 0 of RAID Controller in SL 4
DeviceDescription Backplane 2 on Connector 0 of RAID Controller in SL 4
DeviceType PCIeSSDBackPlane
FirmwareVersion 7.10
FQDD Enclosure.Internal.0-2:RAID.SL.4-1
InstanceID Enclosure.Internal.0-2:RAID.SL.4-1
MediaType Solid State Drive
PCIExpressGeneration Gen 4
ProductName BP_PSV 0:2
RollupStatus OK
SlotCount 8
WiredOrder 2
- Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT"):
NAME                          KNAME          PKNAME         TRAN  TYPE FSTYPE      LABEL   SIZE MOUNTPOINT
/dev/sda                      /dev/sda                            disk LVM2_member        23.3T
|-/dev/mapper/vghana-lvusrsap /dev/dm-2      /dev/sda             lvm  xfs                 150G /usr/sap
|-/dev/mapper/vghana-lvdata   /dev/dm-3      /dev/sda             lvm  xfs                   6T /hana/data
|-/dev/mapper/vghana-lvshared /dev/dm-4      /dev/sda             lvm  xfs                   2T /hana/shared
|-/dev/mapper/vghana-lvlog    /dev/dm-5      /dev/sda             lvm  xfs                   4T /hana/log
'-/dev/mapper/vghana-lvbackup /dev/dm-6      /dev/sda             lvm  xfs                   2T /hana/backup
/dev/sdb                      /dev/sdb                      iscsi disk                     100M
/dev/sdc                      /dev/sdc                      iscsi disk                     100M
/dev/sdd                      /dev/sdd                      iscsi disk                       1G
/dev/nvme0n1                  /dev/nvme0n1                  nvme  disk                   447.1G
|-/dev/nvme0n1p1              /dev/nvme0n1p1 /dev/nvme0n1   nvme  part vfat        ESP     250M /boot/efi
|-/dev/nvme0n1p2              /dev/nvme0n1p2 /dev/nvme0n1   nvme  part vfat        OS        2G /boot
'-/dev/nvme0n1p3              /dev/nvme0n1p3 /dev/nvme0n1   nvme  part LVM2_member       444.8G
  |-/dev/mapper/vgroot-root   /dev/dm-0      /dev/nvme0n1p3       lvm  btrfs             442.8G /var
  '-/dev/mapper/vgroot-swap   /dev/dm-1      /dev/nvme0n1p3       lvm  swap                  2G [SWAP]
- Description of the issue (ideally so that others can reproduce it):
When running "rear recover" we get the error shown in the issue title.
- Workaround, if any:
None
- Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
See attached debug log: rear-srv-lx2013.log
jsmeix commented at 2023-11-23 08:22:
@hitrikrtek
what is the output of each of the commands
# echo /sys/block/*
# ls -l /dev/nvme0c0n1
# ls -l /sys/block/nvme0c0n1
# ls -l $( readlink -e /sys/block/nvme0c0n1 )
?
Details:
The following excerpt from your
https://github.com/rear/rear/files/13389016/rear-srv-lx2013.log
shows how it fails:
++ for current_device_path in /sys/block/*
++ current_disk_name=nvme0c0n1
++ is_multipath_path nvme0c0n1
++ test nvme0c0n1
++ is_multipath_used
++ type multipath
++ return 1
++ return 1
++ test -d /sys/block/nvme0c0n1/queue
++ test 0 = 1
++ is_write_protected /sys/block/nvme0c0n1
+++ write_protected_candidate_device /sys/block/nvme0c0n1
+++ local device=/sys/block/nvme0c0n1
+++ [[ /sys/block/nvme0c0n1 == /sys/block/* ]]
++++ get_device_name /sys/block/nvme0c0n1
++++ local name=/sys/block/nvme0c0n1
++++ name=nvme0c0n1
++++ contains_visible_char nvme0c0n1
+++++ tr -d -c '[:graph:]'
++++ test nvme0c0n1
++++ [[ nvme0c0n1 =~ ^mapper/ ]]
++++ [[ -L /dev/nvme0c0n1 ]]
++++ [[ nvme0c0n1 =~ ^dm- ]]
++++ name=nvme0c0n1
++++ echo /dev/nvme0c0n1
++++ [[ -r /dev/nvme0c0n1 ]]
++++ return 1
+++ device=/dev/nvme0c0n1
+++ test -b /dev/nvme0c0n1
+++ BugError 'write_protected_candidate_device called for '\''/sys/block/nvme0c0n1'\'' but '\''/dev/nvme0c0n1'\'' is no block device'
This looks strange: in your case
there is "nvme0c0n1" in /sys/block/ but
test -b /dev/nvme0c0n1
fails. So "nvme0c0n1" appears to be a block device
(because it is listed in /sys/block/) that
either has no matching /dev/nvme0c0n1 device node
or has a matching /dev/nvme0c0n1 device node
that is not a block device.
By "googling" for 'nvme0c0n1' I found
https://github.com/google/cadvisor/issues/3340
Not all /sys/block devices will have a "dev" file
which seems to explain the root cause.
For comparison:
On my laptop with a NVMe disk I have /sys/block/nvme0n1
and I have a matching device node /dev/nvme0n1
which is a block device.
The matching code is in
usr/share/rear/layout/prepare/default/250_compare_disks.sh
which is online for ReaR 2.7 starting at
https://github.com/rear/rear/blob/rear-2.7/usr/share/rear/layout/prepare/default/250_compare_disks.sh#L94
The is_write_protected and write_protected_candidate_device
functions are in
usr/share/rear/lib/write-protect-functions.sh
which is online for ReaR 2.7
https://github.com/rear/rear/blob/rear-2.7/usr/share/rear/lib/write-protect-functions.sh
So it seems we have to enhance the
write_protected_candidate_device function
to ignore it when it is called with a device
where no matching /dev/... device node exists.
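A minimal bash sketch of such a guard, not the actual ReaR change. The helper names sysfs_to_dev_path, has_block_device_node and list_usable_block_devices are hypothetical, and the plain /sys/block/NAME to /dev/NAME mapping is a simplification that ignores the special cases get_device_name handles:

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of ReaR): map /sys/block/NAME to DEV_DIR/NAME.
# Simplification: device-mapper and other special names are not handled here.
sysfs_to_dev_path() {
    local sysfs_path="$1"
    local dev_dir="${2:-/dev}"
    echo "$dev_dir/${sysfs_path##*/}"
}

# Hypothetical guard: succeed only if the derived device node is a block device.
has_block_device_node() {
    test -b "$(sysfs_to_dev_path "$1" "${2:-/dev}")"
}

# Walk a sysfs block directory and print only entries with a block device node,
# silently skipping entries like /sys/block/nvme0c0n1 that have none.
list_usable_block_devices() {
    local sys_block_dir="${1:-/sys/block}"
    local current_device_path
    for current_device_path in "$sys_block_dir"/* ; do
        has_block_device_node "$current_device_path" || continue
        sysfs_to_dev_path "$current_device_path"
    done
}

list_usable_block_devices
```

With a guard like this the caller skips such an entry instead of running into the BugError.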
hitrikrtek commented at 2023-11-23 08:28:
Hi @jsmeix
First of all thank you for your time looking into this.
I've run the commands you requested and here's the output:
# echo /sys/block/*
/sys/block/dm-0 /sys/block/dm-1 /sys/block/dm-2 /sys/block/dm-3 /sys/block/dm-4 /sys/block/dm-5 /sys/block/dm-6 /sys/block/nvme0c0n1 /sys/block/nvme0n1 /sys/block/sda /sys/block/sdb /sys/block/sdc /sys/block/sdd
# ls -l /dev/nvme0c0n1
ls: cannot access '/dev/nvme0c0n1': No such file or directory
# ls -l /sys/block/nvme0c0n1
lrwxrwxrwx 1 root root 0 Nov 21 13:38 /sys/block/nvme0c0n1 -> ../devices/pci0000:00/0000:00:0a.0/0000:01:00.0/nvme/nvme0/nvme0c0n1
# ls -l $( readlink -e /sys/block/nvme0c0n1 )
total 0
-r--r--r-- 1 root root 4096 Nov 23 09:26 alignment_offset
-r--r--r-- 1 root root 4096 Nov 23 09:26 capability
lrwxrwxrwx 1 root root 0 Nov 21 13:38 device -> ../../nvme0
-r--r--r-- 1 root root 4096 Nov 23 09:26 discard_alignment
-r--r--r-- 1 root root 4096 Nov 23 09:26 diskseq
-r--r--r-- 1 root root 4096 Nov 23 09:26 eui
-r--r--r-- 1 root root 4096 Nov 23 09:26 events
-r--r--r-- 1 root root 4096 Nov 23 09:26 events_async
-rw-r--r-- 1 root root 4096 Nov 23 09:26 events_poll_msecs
-r--r--r-- 1 root root 4096 Nov 23 09:26 ext_range
-r--r--r-- 1 root root 4096 Nov 23 09:26 hidden
drwxr-xr-x 2 root root 0 Nov 23 09:26 holders
-r--r--r-- 1 root root 4096 Nov 23 09:26 inflight
drwxr-xr-x 2 root root 0 Nov 23 09:26 integrity
-rw-r--r-- 1 root root 4096 Nov 23 09:26 make-it-fail
drwxr-xr-x 5 root root 0 Nov 23 09:26 mq
-r--r--r-- 1 root root 4096 Nov 23 09:26 nsid
drwxr-xr-x 2 root root 0 Nov 23 09:26 power
drwxr-xr-x 2 root root 0 Nov 23 09:26 queue
-r--r--r-- 1 root root 4096 Nov 23 09:26 range
-r--r--r-- 1 root root 4096 Nov 23 09:26 removable
-r--r--r-- 1 root root 4096 Nov 23 09:26 ro
-r--r--r-- 1 root root 4096 Nov 23 09:26 size
drwxr-xr-x 2 root root 0 Nov 23 09:26 slaves
-r--r--r-- 1 root root 4096 Nov 23 09:26 stat
lrwxrwxrwx 1 root root 0 Nov 21 13:38 subsystem -> ../../../../../../../class/block
drwxr-xr-x 2 root root 0 Nov 23 09:26 trace
-rw-r--r-- 1 root root 4096 Nov 21 13:38 uevent
-r--r--r-- 1 root root 4096 Nov 23 09:26 wwid
jsmeix commented at 2023-11-23 08:31:
# echo /sys/block/*
... /sys/block/nvme0c0n1 /sys/block/nvme0n1 ...
# ls -l /dev/nvme0c0n1
ls: cannot access '/dev/nvme0c0n1': No such file or directory
proves that the root cause is that
Not all /sys/block devices will have a "dev" file
so we have to enhance the
write_protected_candidate_device function
to ignore it when it is called with a device
where no matching /dev/... device node exists.
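The directory listing in the previous comment indeed shows no "dev" entry for nvme0c0n1. As a rough sketch, such entries can be listed as follows. The function name list_entries_without_dev_file is hypothetical, and the optional directory argument exists only so the function can be tried against a test directory:

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the names of entries in a sysfs block directory
# (default /sys/block) that have no "dev" attribute file.
list_entries_without_dev_file() {
    local sys_block_dir="${1:-/sys/block}"
    local entry
    for entry in "$sys_block_dir"/* ; do
        # Skip the unexpanded glob when the directory is empty or missing.
        test -d "$entry" || continue
        test -e "$entry/dev" || echo "${entry##*/}"
    done
}

list_entries_without_dev_file
```

On the system reported here this would be expected to print nvme0c0n1, assuming the other /sys/block entries do have a "dev" file.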
jsmeix commented at 2023-11-23 10:08:
@hitrikrtek
I cannot reproduce your issue
because I don't have a system with /sys/block/nvme0c0n1
or something similar - i.e. where a /sys/block/device
does not have a matching /dev/device.
So as an offhanded untested workaround for now
you may try out how things behave when you replace in
usr/share/rear/lib/write-protect-functions.sh
test -b "$device" || BugError "write_protected_candidate_device called for '$1' but '$device' is no block device"
with
test -b "$device" || DebugPrint "write_protected_candidate_device called for '$1' but '$device' is no block device"
Perhaps with that you get other errors in other functions
in usr/share/rear/lib/write-protect-functions.sh
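One possible way to apply that replacement with sed, as an untested sketch. The function name downgrade_block_device_bugerror is hypothetical, and the installation path in the usage comment is an assumption. Because "rear recover" runs inside the recovery system, the change would have to be in place there, for example by editing the file before recreating the rescue image:

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of ReaR): in the given file, turn the BugError
# on the "is no block device" line into a DebugPrint.
# The original file is kept as FILE.orig.
downgrade_block_device_bugerror() {
    local functions_file="$1"
    test -f "$functions_file" || return 1
    sed -i.orig -e '/is no block device/ s/BugError/DebugPrint/' "$functions_file"
}

# Usage (assumed installation path, run as root):
# downgrade_block_device_bugerror /usr/share/rear/lib/write-protect-functions.sh
```

Only lines containing "is no block device" are touched, so other BugError calls in the file stay as they are.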
Alternatively (and preferred by me if you can) you could test
if my overhauled usr/share/rear/lib/write-protect-functions.sh
https://raw.githubusercontent.com/rear/rear/a403b5fe1c2c58420ba1b77db52283c041e4f7d4/usr/share/rear/lib/write-protect-functions.sh
in
https://github.com/rear/rear/pull/3091
makes things work well for your case.
hitrikrtek commented at 2023-11-23 10:16:
Alright! I've grabbed your updated function and am creating a backup
now. Will report back the result.
Thank you
jsmeix commented at 2023-11-23 12:16:
@hitrikrtek
with
OUTPUT=ISO
you should get nothing automatically set for
WRITE_PROTECTED_IDS and/or WRITE_PROTECTED_FS_LABEL_PATTERNS
after "rear mkrescue/mkbackup" in your
/var/tmp/rear.XXX/rootfs/etc/rear/rescue.conf
When you also have
neither WRITE_PROTECTED_IDS
nor WRITE_PROTECTED_FS_LABEL_PATTERNS
specified in your etc/rear/local.conf file
then all is_write_protected function calls
would always report that nothing is write-protected.
So the is_write_protected function could be
simplified and made more robust against errors
by checking whether WRITE_PROTECTED_IDS
and WRITE_PROTECTED_FS_LABEL_PATTERNS are both empty
and in this case directly returning 1 (i.e. "not write-protected").
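A minimal sketch of that early return, under the assumption that both variables are bash arrays. The names nothing_write_protected and is_write_protected_guarded are hypothetical; the sketch wraps the existing is_write_protected function instead of showing its real implementation:

```shell
#!/usr/bin/env bash

# Assumption: both settings are bash arrays that are empty unless configured.
WRITE_PROTECTED_IDS=()
WRITE_PROTECTED_FS_LABEL_PATTERNS=()

# Hypothetical predicate: true when no write-protection is configured at all.
nothing_write_protected() {
    test ${#WRITE_PROTECTED_IDS[@]} -eq 0 \
        && test ${#WRITE_PROTECTED_FS_LABEL_PATTERNS[@]} -eq 0
}

# Hypothetical wrapper around ReaR's is_write_protected function:
# with nothing configured it returns 1 ("not write-protected")
# without inspecting the device at all.
is_write_protected_guarded() {
    nothing_write_protected && return 1
    is_write_protected "$@"
}
```

With both arrays empty, a call such as is_write_protected_guarded /sys/block/nvme0c0n1 returns 1 before any device node lookup happens, so a missing /dev/nvme0c0n1 cannot trigger an error in that case.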
hitrikrtek commented at 2023-11-23 12:49:
Regarding the updated function: it seems to work now and we were able to do a restore. However, after the restore we are experiencing other issues: our network broke somehow. All disks and configs seem fine. We're investigating, but I don't think it's related to this device. I'll post my findings here if we figure it out.
jsmeix commented at 2023-11-23 12:52:
@hitrikrtek
please do not post things that do not belong to this issue
here but instead create a new separate issue
for each separate problem that you have.
Otherwise everything would get mixed up.
[Export of Github issue for rear/rear.]