#3178 Issue open: rear does not recognize nvme when trying to format it with "rear format"

Labels: enhancement

xwhitebeltx opened issue at 2024-03-12 15:22:

  • ReaR version ("/usr/sbin/rear -V"):
Relax-and-Recover 2.7-git.5396.573f7f01.master / 2024-03-08
  • If your ReaR version is not the current version, explain why you can't upgrade:

  • OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):

PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
OUTPUT=USB
BACKUP=NETFS
BACKUP_URL=usb:///dev/disk/by-label/REAR-000
ONLY_INCLUDE_VG=( "system" )
USB_UEFI_PART_SIZE="1000"
KERNEL_CMDLINE="noip unattended"
USER_INPUT_TIMEOUT=15
USB_RETAIN_BACKUP_NR=1
COPY_AS_IS+=( /usr/local/bin )
  • Hardware vendor/product (PC or PowerNV BareMetal or ARM) or VM (KVM guest or PowerVM LPAR):
HP ProLiant DL385 Gen11
  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
x86 compatible
  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
UEFI
GRUB
  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
2xlocal Micron 960GB NVMe SSD
  • Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT"):
NAME                        KNAME          PKNAME         TRAN   TYPE FSTYPE      LABEL   SIZE MOUNTPOINT
/dev/loop0                  /dev/loop0                           loop squashfs           63.3M /snap/core20/1879
/dev/loop1                  /dev/loop1                           loop squashfs           63.9M /snap/core20/2182
/dev/loop2                  /dev/loop2                           loop squashfs          111.9M /snap/lxd/24322
/dev/loop3                  /dev/loop3                           loop squashfs             87M /snap/lxd/27428
/dev/loop4                  /dev/loop4                           loop squashfs           70.2M /snap/powershell/264
/dev/loop5                  /dev/loop5                           loop squashfs           53.2M /snap/snapd/19122
/dev/loop6                  /dev/loop6                           loop squashfs           39.1M /snap/snapd/21184
/dev/nvme1n1                /dev/nvme1n1                  nvme   disk                   894.3G
|-/dev/nvme1n1p1            /dev/nvme1n1p1 /dev/nvme1n1   nvme   part vfat                  1G /boot/efi
|-/dev/nvme1n1p2            /dev/nvme1n1p2 /dev/nvme1n1   nvme   part ext4                  2G /boot
`-/dev/nvme1n1p3            /dev/nvme1n1p3 /dev/nvme1n1   nvme   part LVM2_member       891.2G
  `-/dev/mapper/system-root /dev/dm-0      /dev/nvme1n1p3        lvm  ext4              891.2G /
/dev/nvme0n1                /dev/nvme0n1                  nvme   disk ext4              894.3G
  • Description of the issue (ideally so that others can reproduce it):
executing "rear -v format -- --efi /dev/nvme0n1" returns this error:

Relax-and-Recover 2.7-git.5396.573f7f01.master / 2024-03-08
Running rear format (PID 24112 date 2024-03-12 15:15:12)
Using log file: /var/log/rear/rear-avpukr.log
Running workflow format on the normal/original system
ERROR:
====================
BUG in /usr/share/rear/format/USB/default/200_check_usb_layout.sh line 28:
Unable to determine raw device for /dev/nvme0n1
--------------------
Please report it at https://github.com/rear/rear/issues
and include all related parts from /var/log/rear/rear-avpukr.log
preferably the whole debug information via 'rear -D format'
====================
Use debug mode '-d' for some debug messages or debugscript mode '-D' for full debug messages with 'set -x' output
Aborting due to an error, check /var/log/rear/rear-avpukr.log for details
Exiting rear format (PID 24112) and its descendant processes ...
Running exit tasks
Terminated
  • Workaround, if any:
nothing that i could find
  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
[rear-avpukr.log](https://github.com/rear/rear/files/14574841/rear-avpukr.log)

You can drag-drop log files into this editor to create an attachment
or paste verbatim text like command output or file content
by including it between a leading and a closing line of
three backticks like this:

verbatim content

gdha commented at 2024-03-18 16:46:

@xwhitebeltx Why do you want to treat a Micron 960GB NVMe SSD as an USB device? I guess you want to make a bootable UEFI device? Remember, a local SSD device is not safe to keep ReaR archives on.
@rear/contributors This is something we haven't foreseen so far (I think, but could be wrong [again])?

pcahyna commented at 2024-03-18 16:53:

@gdha in real use this is indeed unusual. For testing it is useful though to treat NVMe as any other disk device - you can take a machine with a secondary NVMe device, make a backup to it as if it were USB, and boot from it for recovery.
(yes, it is more realistic to test on a machine with a secondary USB disk, or at least a SCSI or SATA disk since it also presents itself as a sd* block device, but a machine with NVMe may be the only thing you have)

schlomo commented at 2024-03-18 17:07:

Any thoughts why we can't use find_device here?
https://github.com/rear/rear/blob/2073e77f1ed213653bbf846d67bef346a9f65da9/usr/share/rear/format/USB/default/200_check_usb_layout.sh#L12-L13

It seems to me like our code simply doesn't handle the nvme use case yet, it assumes a sysfs device path like /devices/pci0000:00/0000:00:10.0/host2/target2:0:1/2:0:1:0/block/sdb/sdb1 whereas in your example we have devices/pci0000:c0/0000:c0:03.3/0000:c5:00.0/nvme/nvme0/nvme0n1

@xwhitebeltx have you tried creating a partition and specifying that instead of the whole disk?

xwhitebeltx commented at 2024-03-18 17:52:

thank you all for your responses!

@gdha - we do as @pcahyna suggested, i have 2 local NVMes:

  1. OS
  2. backup drive i can boot from and restore the OS back to its original state
    in this case the main purpose is not backup of critical data, it is to be able to "reset" the server anytime we want with an automated boot + restore process, in places where PXE is banned.

@schlomo - it works fine when the SSD is SATA, first time i'm trying that on an NVMe,
i have not tried creating a partition, i'll try tomorrow morning and update =)

xwhitebeltx commented at 2024-03-19 07:33:

well, it worked but it ignored the partition(which is totally good for me), the initial state was this:

NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0             7:0    0  63.3M  1 loop /snap/core20/1879
loop1             7:1    0  63.9M  1 loop /snap/core20/2182
loop2             7:2    0 111.9M  1 loop /snap/lxd/24322
loop3             7:3    0    87M  1 loop /snap/lxd/27428
loop4             7:4    0  70.2M  1 loop /snap/powershell/264
loop5             7:5    0  53.2M  1 loop /snap/snapd/19122
loop6             7:6    0  39.1M  1 loop /snap/snapd/21184
nvme1n1         259:0    0 894.3G  0 disk
├─nvme1n1p1     259:2    0     1G  0 part /boot/efi
├─nvme1n1p2     259:3    0     2G  0 part /boot
└─nvme1n1p3     259:4    0 891.2G  0 part
  └─system-root 253:0    0 891.2G  0 lvm  /
nvme0n1         259:1    0 894.3G  0 disk
└─nvme0n1p1     259:6    0 894.3G  0 part

after i executed "rear -v format -- --efi /dev/nvme0n1p1" i got this output:

root@avpukr:~# rear -v format -- --efi /dev/nvme0n1p1
Relax-and-Recover 2.7-git.5396.573f7f01.master / 2024-03-08
Running rear format (PID 222220 date 2024-03-19 06:56:31)
Using log file: /var/log/rear/rear-avpukr.log
Running workflow format on the normal/original system
USB or disk device /dev/nvme0n1p1 is not formatted with ext2/3/4 or btrfs filesystem
Formatting /dev/nvme0n1p1 will remove all currently existing data on that whole device
Type exactly 'Yes' to format /dev/nvme0n1p1 with ext3 filesystem
(default 'No' timeout 15 seconds)
Yes
Repartitioning /dev/nvme0n1
Creating partition table of type gpt on /dev/nvme0n1
Making an EFI bootable device /dev/nvme0n1
Creating EFI system partition /dev/nvme0n11 with size 1000 MiB aligned at 8 MiB
Setting 'esp' flag on EFI partition /dev/nvme0n11
Creating ReaR data partition /dev/nvme0n12 up to 100% of /dev/nvme0n1
Setting 'legacy_boot' flag on ReaR data partition /dev/nvme0n12
Creating vfat filesystem on EFI system partition on /dev/nvme0n1p1
Creating ext3 filesystem with label 'REAR-000' on ReaR data partition /dev/nvme0n1p2
Adjusting filesystem parameters on ReaR data partition /dev/nvme0n1p2
Exiting rear format (PID 222220) and its descendant processes ...
Running exit tasks

and then the disk structure is as following:

NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0             7:0    0  63.3M  1 loop /snap/core20/1879
loop1             7:1    0  63.9M  1 loop /snap/core20/2182
loop2             7:2    0 111.9M  1 loop /snap/lxd/24322
loop3             7:3    0    87M  1 loop /snap/lxd/27428
loop4             7:4    0  70.2M  1 loop /snap/powershell/264
loop5             7:5    0  53.2M  1 loop /snap/snapd/19122
loop6             7:6    0  39.1M  1 loop /snap/snapd/21184
nvme1n1         259:0    0 894.3G  0 disk
├─nvme1n1p1     259:2    0     1G  0 part /boot/efi
├─nvme1n1p2     259:3    0     2G  0 part /boot
└─nvme1n1p3     259:4    0 891.2G  0 part
  └─system-root 253:0    0 891.2G  0 lvm  /
nvme0n1         259:1    0 894.3G  0 disk
├─nvme0n1p1     259:5    0  1000M  0 part
└─nvme0n1p2     259:6    0 893.3G  0 part

i've performed a backup with "rear -v mkbackup"
and i've restored it successfully!

there's one unexpected screen which i'm not sure yet what's happening there and if it's related to rear or not,
actions i took:

  1. reboot
  2. F11 for boot menu
  3. choose the REAR-000 NVMe Drive
  4. this screen pops up:
    image
    and then the rear grub menu, from there it all seems to be working fine

thanks for the workaround! is this something you would consider implementing in future releases? treating NVMe devices as USB?

jsmeix commented at 2024-03-19 08:22:

@xwhitebeltx
that "unexpected screen" was implemeted by me via
https://github.com/rear/rear/commit/a0a6429119bc284fcf93775bd579bcd74d1c3b40
see also
https://github.com/rear/rear/pull/3025#issuecomment-1634068353

Actually there is no new screen.
It is a timeout delay on the normal initial GRUB screen
(without timeout it shows up only for a fraction of a second)
before GRUB replaces it with its boot menu screen.
With the timeout delay you can see and even read
what GRUB shows (in particular possible error messages)
which helps to understand why GRUB fails when it fails
(e.g. it fails when a wrong 'root' device is used).

xwhitebeltx commented at 2024-03-19 09:32:

@jsmeix oh, nice =)
thank you very much,
i'm sorry for the silly question but this is my first time reporting a bug in github,
please tell me should i close it as resolved (workaround provided) or do you want to leave it open until it is fixed?
what is the proper conduct?

jsmeix commented at 2024-03-19 09:38:

@xwhitebeltx
don't worry how you may close it "correctly".
Leave that "problem" to us.

Personally I would like to fix it but unfortunately
too often I don't find the needed time for it,
too much other stuff elsewhere :-(
I would like to keep it open for some time,
hope dies last :-)


[Export of Github issue for rear/rear.]