#2908 Issue closed: ReaR 2.7 on SLES15SP2: inexplicable regular umount failure

Labels: enhancement, support / question, fixed / solved / done, special hardware or VM

thomas-merz opened issue at 2023-01-09 18:02:

Relax-and-Recover (ReaR) Issue Template

  • ReaR version ("/usr/sbin/rear -V"):
    Relax-and-Recover 2.7 / 2022-07-13

  • OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):
    NAME="SLES"
    VERSION="15-SP2"
    VERSION_ID="15.2"
    PRETTY_NAME="SUSE Linux Enterprise Server 15 SP2"
    ID="sles"
    ID_LIKE="suse"
    ANSI_COLOR="0;32"
    CPE_NAME="cpe:/o:suse:sles:15:sp2"

  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):

OUTPUT=ISO
BACKUP=CDM
OUTPUT_URL=file:///rear/iso
OUTPUT_PREFIX="hostname-f"
export TMPDIR="/tmp"
TIMESYNC=NTP
NETFS_KEEP_OLD_BACKUP_COPY=N
EXCLUDE_VG=()
ONLY_INCLUDE_VG=(systemvg binvg hanasharedvg)
WAIT_SECS=120
SKIP_CFG2HTML=Y
USE_CFG2HTML=N
  • Hardware vendor/product (PC or PowerNV BareMetal or ARM) or VM (KVM guest or PowerVM LPAR):
    VMware

  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86

  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    EFI v2.40 by VMware, Inc.

  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    more than one "VMware PVSCSI storage adapter rev 2" with a total of 5 disks

  • Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT"):

NAME                                    KNAME      PKNAME    TRAN TYPE FSTYPE      LABEL   SIZE MOUNTPOINT
/dev/loop0                              /dev/loop0                loop vfat                 96M /var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt
/dev/loop1                              /dev/loop1                loop vfat                 96M /var/tmp/rear.OVhhey7uK3taoxV/tmp/efi_virt
/dev/loop2                              /dev/loop2                loop vfat                 96M /var/tmp/rear.CJOWoaQvuMUvyDv/tmp/efi_virt
/dev/loop3                              /dev/loop3                loop vfat                 96M /var/tmp/rear.IIDh3emTJrYkDY4/tmp/efi_virt
/dev/loop4                              /dev/loop4                loop vfat                 96M /var/tmp/rear.PxS05QU0gUOharc/tmp/efi_virt
/dev/loop5                              /dev/loop5                loop vfat                 96M /var/tmp/rear.XnSh08Zi7LWWAAm/tmp/efi_virt
/dev/loop6                              /dev/loop6                loop vfat                 96M /var/tmp/rear.k4nw099Q7s9UiH6/tmp/efi_virt
/dev/loop7                              /dev/loop7                loop vfat                 96M /var/tmp/rear.l53DsAliWYCFV8c/tmp/efi_virt
/dev/sda                                /dev/sda                  disk                     170G
`-/dev/sda1                             /dev/sda1  /dev/sda       part LVM2_member         170G
  |-/dev/mapper/binvg-usrsap_lv         /dev/dm-6  /dev/sda1      lvm  xfs                  50G /usr/sap
  |-/dev/mapper/binvg-saphome_lv        /dev/dm-7  /dev/sda1      lvm  xfs                   2G /home/sap
  |-/dev/mapper/binvg-uc4home_lv        /dev/dm-8  /dev/sda1      lvm  xfs                   2G /home/uc4
  `-/dev/mapper/binvg-sapinst_lv        /dev/dm-10 /dev/sda1      lvm  xfs                 115G /sapinst
/dev/sdb                                /dev/sdb                  disk                     122G
|-/dev/sdb1                             /dev/sdb1  /dev/sdb       part                       8M
|-/dev/sdb2                             /dev/sdb2  /dev/sdb       part ext3              372.5M /boot
|-/dev/sdb3                             /dev/sdb3  /dev/sdb       part LVM2_member       119.5G
| |-/dev/mapper/systemvg-usr_lv         /dev/dm-0  /dev/sdb3      lvm  xfs                  15G /usr
| |-/dev/mapper/systemvg-swap_lv        /dev/dm-1  /dev/sdb3      lvm  swap                 16G [SWAP]
| |-/dev/mapper/systemvg-root_lv        /dev/dm-2  /dev/sdb3      lvm  xfs                  15G /
| |-/dev/mapper/systemvg-opt_lv         /dev/dm-9  /dev/sdb3      lvm  xfs                   5G /opt
| |-/dev/mapper/systemvg-var_lv         /dev/dm-11 /dev/sdb3      lvm  xfs                  10G /var
| |-/dev/mapper/systemvg-varlogaudit_lv /dev/dm-12 /dev/sdb3      lvm  xfs                  10G /var/log/audit
| |-/dev/mapper/systemvg-varlog_lv      /dev/dm-13 /dev/sdb3      lvm  xfs                  10G /var/log
| |-/dev/mapper/systemvg-vartmp_lv      /dev/dm-14 /dev/sdb3      lvm  xfs                  10G /var/tmp
| |-/dev/mapper/systemvg-tmp_lv         /dev/dm-15 /dev/sdb3      lvm  xfs                  10G /tmp
| `-/dev/mapper/systemvg-home_lv        /dev/dm-16 /dev/sdb3      lvm  xfs                  15G /home
`-/dev/sdb4                             /dev/sdb4  /dev/sdb       part vfat                139M /boot/efi
/dev/sdc                                /dev/sdc                  disk                     1.4T
`-/dev/sdc1                             /dev/sdc1  /dev/sdc       part LVM2_member         1.4T
  `-/dev/mapper/hanadatavg-hanadata_lv  /dev/dm-4  /dev/sdc1      lvm  xfs                 1.4T /hana/data
/dev/sdd                                /dev/sdd                  disk                      80G
`-/dev/sdd1                             /dev/sdd1  /dev/sdd       part LVM2_member          80G
  `-/dev/mapper/hanasharedvg-hanashared_lv
                                        /dev/dm-3  /dev/sdd1      lvm  xfs                  80G /hana/shared
/dev/sde                                /dev/sde                  disk                     700G
`-/dev/sde1                             /dev/sde1  /dev/sde       part LVM2_member         700G
  `-/dev/mapper/hanalogvg-hanalog_lv    /dev/dm-5  /dev/sde1      lvm  xfs                 700G /hana/log
/dev/sr0                                /dev/sr0             ata  rom                     1024M
  • Description of the issue (ideally so that others can reproduce it):

When running ReaR 2.7 on our EFI-boot-enabled VMs, it says

Could not remove build area /var/tmp/rear.k4nw099Q7s9UiH6 (something still exists therein)

and many "dead" mounts are left behind (one for each run):

/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/isofs/boot/efiboot.img (deleted) on /var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt type vfat

...(and so on)...

  • Workaround, if any:
    None; or manual unmount 😞

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):

2023-01-09 17:51:36.478977061 Exiting rear mkrescue (PID 126650) and its descendant processes ...
2023-01-09 17:51:40.004244933 rear,126650 /usr/sbin/rear mkrescue -vvv
                                `-rear,154186 /usr/sbin/rear mkrescue -vvv
                                    `-pstree,154187 -Aplau 126650
2023-01-09 17:51:40.211717375 Running exit tasks
2023-01-09 17:51:40.236870986 Finished rear mkrescue in 165 seconds
2023-01-09 17:51:40.250497716 Removing build area /var/tmp/rear.k4nw099Q7s9UiH6
2023-01-09 17:51:40.346937030 Failed to 'rm -Rf --one-file-system /var/tmp/rear.k4nw099Q7s9UiH6/tmp'
2023-01-09 17:51:40.831854759 Could not remove build area /var/tmp/rear.k4nw099Q7s9UiH6 (something still exists therein)
2023-01-09 17:51:40.862120053 Something is still mounted within the build area
2023-01-09 17:51:40.878799867   /var/tmp/rear.k4nw099Q7s9UiH6/tmp/isofs/boot/efiboot.img (deleted) on /var/tmp/rear.k4nw099Q7s9UiH6/tmp/efi_virt type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
2023-01-09 17:51:40.895534725 You must manually umount it, then you could manually remove the build area
2023-01-09 17:51:40.912931056 To manually remove the build area use (with caution): rm -Rf --one-file-system /var/tmp/rear.k4nw099Q7s9UiH6
2023-01-09 17:51:40.929027396 End of program 'rear' reached

❓ Or do you need the full detailed log file with 166441 lines?

We found this by accident because our monitoring
got so many new filesystems every day for these
/var/tmp/rear.XXXXXX filesystems
after each scheduled "rear mkrescue" run… 😲
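
For monitoring, such leftover mounts can be listed directly from the kernel's mount table instead of being discovered by accident. The following is a minimal sketch, assuming the build area prefix /var/tmp/rear. seen in this report. The function name list_rear_mounts and its optional second argument (an alternative mount table file, useful for testing) are made up for illustration and are not part of ReaR.

```shell
# List mount points below a ReaR build area prefix.
# Hypothetical helper, not part of ReaR.
#   $1: path prefix (default: /var/tmp/rear.)
#   $2: mount table to read (default: /proc/mounts)
list_rear_mounts() {
    local prefix="${1:-/var/tmp/rear.}"
    local mounts_file="${2:-/proc/mounts}"
    # Field 2 of /proc/mounts is the mount point;
    # print it when it starts with the given prefix.
    awk -v p="$prefix" 'index($2, p) == 1 { print $2 }' "$mounts_file"
}
```

Called without arguments it prints one mount point per line. After checking that list by hand, each entry could be passed to umount.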

jsmeix commented at 2023-01-10 07:43:

@thomas-merz
yes, I need a full debug "rear -D mkrescue" logfile
to be able to find out why something is still mounted
within the build area in your particular case.
If that kind of issue would happen in general
we would have noticed it and fixed its root cause.

Additionally I need the information what files there are
below the mountpoint directory that is still mounted
within the build area in your particular case,
i.e. what files and directories there are in
/var/tmp/rear.*/tmp/efi_virt
in your particular case.
I assume something related to UEFI code does not
sufficiently clean up its own temporary files
or it fails to do so but "blindly proceeds" (see below).
Because 'efi_virt' appears only in

output/ISO/Linux-i386/700_create_efibootimg.sh

this script is the "prime suspect".

FYI:
Be grateful that over time we at ReaR upstream added
many basic sanity checks and safeguards in many places
so that ReaR less often blindly proceeds regardless of issues,
according to "Try hard to care about possible errors" in
https://github.com/rear/rear/wiki/Coding-Style

By default bash proceeds with the next command when something failed.
Do not let your code blindly proceed in case of errors...

There are still many places in ReaR where it blindly proceeds
and we fix them one by one and step by step as we notice them.

Regarding this particular case see the critical issue
https://github.com/rear/rear/issues/2611
in particular the initial description how the old blind
"rm -Rf /.../outputfs" could cause a disaster by ReaR

... removes ... backup directory ...
... destroy the backup directories for other machines ...

instead of letting ReaR help the user to protect against disaster
and see the rather tricky implementation of an appropriate fix
https://github.com/rear/rear/pull/2625

In this particular case here it is not BACKUP=NETFS
so no backup should be mounted below /var/tmp/rear...
but at the point when /var/tmp/rear... is cleaned up
it is unknown what there might be still left mounted below
(normally all gets umounted and nothing is left mounted)
so in general the new careful "rm -Rf --one-file-system"
instead of the old blind "rm -Rf" helps to avoid
possible disastrous outcomes - i.e. better safe than sorry.
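
The "do not blindly proceed" idea described above can be sketched as follows. This is not the code from the pull request mentioned above, only an illustration under stated assumptions: the function name remove_build_area and its second argument (a mount table file, present only to make the function testable) are hypothetical.

```shell
# Remove a build area only when nothing is mounted below it.
# Hypothetical helper, not the actual ReaR implementation.
#   $1: build area directory
#   $2: mount table to read (default: /proc/mounts)
remove_build_area() {
    local build_dir="$1"
    local mounts_file="${2:-/proc/mounts}"
    local still_mounted
    # Collect all mount points that lie below the build area.
    still_mounted=$(awk -v d="$build_dir/" 'index($2, d) == 1 { print $2 }' "$mounts_file")
    if [ -n "$still_mounted" ]; then
        echo "Not removing $build_dir, still mounted:" >&2
        echo "$still_mounted" >&2
        return 1
    fi
    # --one-file-system (GNU rm) skips directories that are on another
    # filesystem, so a mount that was overlooked is not emptied by accident.
    rm -Rf --one-file-system "$build_dir"
}
```

The explicit mount check reports the problem before anything is deleted, and --one-file-system stays in place as a second safeguard.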

thomas-merz commented at 2023-01-11 11:28:

Can I send you the full debug "rear -D mkrescue" log file privately rather than via public GitHub? It may contain "sensitive" data that my company doesn't want to be leaked publicly.

thomas-merz commented at 2023-01-11 11:28:

what files there are
below the mountpoint directory that is still mounted

For example for one single mounted filesystem out of many:

find /var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/BOOTX64.efi
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/grub.cfg
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/fonts
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/fonts/unicode.pf2
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ast.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ca.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/da.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/de.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/de_CH.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/en@quot.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/eo.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/es.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/fi.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/fr.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/gl.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/hr.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/hu.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/id.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/it.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ja.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ko.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/lt.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/nb.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/nl.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/pa.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/pl.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/pt.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/pt_BR.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ro.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ru.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/sl.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/sr.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/sv.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/tr.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/uk.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/vi.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/zh_CN.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/zh_TW.mo

jsmeix commented at 2023-01-11 12:02:

@thomas-merz
I think only the part of the log where
output/ISO/Linux-i386/700_create_efibootimg.sh
runs is perhaps already sufficient to show why
"umount $v $TMP_DIR/efiboot.img" fails in that script, cf.
https://github.com/rear/rear/blob/rear-2.7/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh#L47

So please post from the "rear -D mkrescue" log file everything from a line like

+ source /usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh

up to the point where the next script is sourced
i.e. everything up to the next line that starts with

+ source /usr/share/rear/...

which is likely a line like

+ source /usr/share/rear/output/ISO/Linux-i386/800_create_isofs.sh
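
Cutting that part out of a large debug log can be automated. A minimal sketch, assuming the log format shown in this thread, where each top-level script starts with a line beginning with "+ source /usr/share/rear/". The function name extract_script_log is made up for illustration.

```shell
# Print the part of a "rear -D" log that belongs to one script:
# from its "+ source /usr/share/rear/.../<script>" line up to,
# but not including, the next "+ source /usr/share/rear/..." line.
# Hypothetical helper, not part of ReaR.
#   $1: script name, e.g. 700_create_efibootimg.sh
#   $2: log file
extract_script_log() {
    local script="$1"
    local logfile="$2"
    awk -v s="$script" '
        /^\+ source \/usr\/share\/rear\// {
            # A new top-level script starts here:
            # keep printing only if it is the wanted one.
            printing = (index($0, s) > 0)
        }
        printing { print }
    ' "$logfile"
}
```

For example, extract_script_log 700_create_efibootimg.sh /var/log/rear/rear-hostname.log prints only the lines of that one script, which is much easier to review for sensitive data than the full log.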

thomas-merz commented at 2023-01-11 13:49:

… I think only the part …

Here we go:

2023-01-09 18:54:05.744086285 Including output/ISO/Linux-i386/700_create_efibootimg.sh
2023-01-09 18:54:05.758244199 Entering debugscript mode via 'set -x'.
+ source /usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh
++ is_true 1
++ case "$1" in
++ return 0
++ efi_img_sz=($( du --block-size=32M --summarize $TMP_DIR/mnt ))
+++ du --block-size=32M --summarize /var/tmp/rear.106OcLQpa6JeG10/tmp/mnt
++ ((  efi_img_sz += 2  ))
++ dd if=/dev/zero of=/var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img count=3 bs=32M
3+0 records in
3+0 records out
100663296 bytes (101 MB, 96 MiB) copied, 0.0787105 s, 1.3 GB/s
++ mkfs.vfat -v /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img
mkfs.fat 4.1 (2017-01-24)
/var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img has 64 heads and 32 sectors per track,
hidden sectors 0x0000;
logical sector size is 512,
using 0xf8 media descriptor, with 196608 sectors;
drive number 0x80;
filesystem has 2 16-bit FATs and 4 sectors per cluster.
FAT size is 192 sectors, and provides 49047 clusters.
There are 4 reserved sectors.
Root directory contains 512 slots and uses 32 sectors.
Volume ID is 4bdd6f20, no volume label.
++ mkdir -p -v /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
mkdir: created directory '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt'
++ mount -v -o loop -t vfat /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
mount: /dev/loop8 mounted on /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt.
++ cp -v -r /var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/. /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/BOOTX64.efi' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/BOOTX64.efi'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/grub.cfg' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/grub.cfg'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/fonts' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/fonts'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/fonts/unicode.pf2' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/fonts/unicode.pf2'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ast.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ast.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ca.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ca.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/da.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/da.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/de.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/de.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/de_CH.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/de_CH.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/en@quot.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/en@quot.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/eo.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/eo.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/es.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/es.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/fi.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/fi.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/fr.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/fr.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/gl.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/gl.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/hr.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/hr.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/hu.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/hu.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/id.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/id.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/it.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/it.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ja.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ja.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ko.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ko.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/lt.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/lt.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/nb.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/nb.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/nl.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/nl.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/pa.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/pa.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/pl.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/pl.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/pt.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/pt.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/pt_BR.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/pt_BR.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ro.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ro.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ru.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ru.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/sl.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/sl.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/sr.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/sr.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/sv.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/sv.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/tr.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/tr.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/uk.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/uk.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/vi.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/vi.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/zh_CN.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/zh_CN.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/zh_TW.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/zh_TW.mo'
++ umount -v /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img
umount: /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt: target is busy.
++ mv -v -f /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img /var/tmp/rear.106OcLQpa6JeG10/tmp/isofs/boot/efiboot.img
renamed '/var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/isofs/boot/efiboot.img'
+ source_return_code=0
+ test 0 -eq 0
+ cd /var/log/rear
+ test 1
+ Debug 'Leaving debugscript mode (back to previous bash flags and options settings).'
+ test 1
+ Log 'Leaving debugscript mode (back to previous bash flags and options settings).'
+ test -w /var/log/rear/rear-hostname.log
+ echo '2023-01-09 18:54:06.042704848 Leaving debugscript mode (back to previous bash flags and options settings).'
2023-01-09 18:54:06.042704848 Leaving debugscript mode (back to previous bash flags and options settings).
2023-01-09 18:54:06.070822282 Including output/ISO/Linux-i386/800_create_isofs.sh

jsmeix commented at 2023-01-11 14:11:

So this is the crucial excerpt:

mount: /dev/loop8 mounted on /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt.
++ cp -v -r /var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/. /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI'
...
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/zh_TW.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/zh_TW.mo'
++ umount -v /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img
umount: /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt: target is busy.

I wonder why it is busy?
I am not a sufficient expert to understand
what goes on here behind the scenes.
Does perhaps the 'cp' command return "too early"?
I.e. 'cp' returns but the kernel is still busy with device IO?
Does perhaps the 'umount' command require '--lazy' in practice
to avoid that some IO command like 'cp' with subsequent 'umount'
could let 'umount' (sometimes) fail with "target is busy"?

What "man 8 umount" tells about 'busy' and 'lazy'
doesn't make it clear what one should normally do:

Note that a filesystem cannot be unmounted
when it is 'busy' - for example,
when there are open files on it,
or when some process has its working directory there,
or when a swap file on it is in use.
The offending process could even be umount
itself - it opens libc, and libc in its turn
may open for example locale files.
A lazy unmount avoids this problem,
but it may introduce other issues.
See --lazy description
...
-l, --lazy
Lazy unmount.
Detach the filesystem from the file hierarchy now,
and clean up all references to this filesystem
as soon as it is not busy anymore.
A system reboot would be expected in near future
if you’re going to use this option for network filesystem
or local filesystem with submounts.
The recommended use-case for umount -l is to prevent hangs
on shutdown due to an unreachable network share where
a normal umount will hang due to a downed server
or a network partition.
Remounts of the share will not be possible.

@rear/contributors in particular @pcahyna
does one of you perhaps know more details about how
some IO command like 'cp' with subsequent 'umount'
works behind the scenes?

jsmeix commented at 2023-01-11 14:18:

@thomas-merz
could you try out if things behave better in your case
when you change in
output/ISO/Linux-i386/700_create_efibootimg.sh
the line

umount $v $TMP_DIR/efiboot.img

to

umount $v --lazy $TMP_DIR/efiboot.img

Alternatively or additionally when 'umount --lazy'
does not help or does not work sufficiently well
could you try out if things behave OK in your case
when you insert a line with something like sleep 3
before the 'umount' line?
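
Both suggestions (wait a bit, then fall back to --lazy) can be combined into one helper. The following is only a sketch under stated assumptions, not the code that went into ReaR: the function name umount_with_fallback, its arguments, and the UMOUNT_CMD override (which exists only so the logic can be tried without root) are hypothetical.

```shell
# Try a normal umount a few times, then fall back to a lazy umount.
# Hypothetical helper, not the actual ReaR implementation.
#   $1: mount point or device to unmount
#   $2: number of normal attempts (default: 3)
#   $3: seconds to sleep after a failed attempt (default: 1)
# UMOUNT_CMD may name a replacement command for testing.
umount_with_fallback() {
    local target="$1"
    local attempts="${2:-3}"
    local delay="${3:-1}"
    local umount_cmd="${UMOUNT_CMD:-umount}"
    local i=1
    while [ "$i" -le "$attempts" ]; do
        "$umount_cmd" "$target" && return 0
        echo "umount attempt $i of $attempts failed for $target" >&2
        sleep "$delay"
        i=$((i + 1))
    done
    # Last resort: detach the filesystem now and let the kernel
    # clean up once it is no longer busy.
    echo "Falling back to 'umount --lazy' for $target" >&2
    "$umount_cmd" --lazy "$target"
}
```

Trying the normal umount first keeps its guarantee that all data is written when it returns. The lazy variant is used only when the normal one keeps failing.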

thomas-merz commented at 2023-01-11 14:49:

Adding --lazy solved my problem 👍

Before I had some mounts left:

df|grep var.*rear| awk '{print $6}'
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt
/var/tmp/rear.OVhhey7uK3taoxV/tmp/efi_virt
/var/tmp/rear.CJOWoaQvuMUvyDv/tmp/efi_virt
/var/tmp/rear.IIDh3emTJrYkDY4/tmp/efi_virt
/var/tmp/rear.PxS05QU0gUOharc/tmp/efi_virt
/var/tmp/rear.XnSh08Zi7LWWAAm/tmp/efi_virt
/var/tmp/rear.l53DsAliWYCFV8c/tmp/efi_virt
/var/tmp/rear.k4nw099Q7s9UiH6/tmp/efi_virt
/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
/var/tmp/rear.uU7oIrbAheXLldP/tmp/efi_virt
/var/tmp/rear.NqP94taUaVv35hW/tmp/efi_virt
/var/tmp/rear.a0aCW56IS2BeEZP/tmp/efi_virt
/var/tmp/rear.zfQsgOB2se46rFQ/tmp/efi_virt

After manual umount (umount $(df|grep var.*rear| awk '{print $6}')) and running rear mkrescue no such mount was left 😄

jsmeix commented at 2023-01-12 12:35:

Ok.
I will improve
https://github.com/rear/rear/pull/2909
to use 'umount --lazy'.

pcahyna commented at 2023-01-12 14:14:

@rear/contributors in particular @pcahyna does one of you perhaps know more details about how some IO command like 'cp' with subsequent 'umount' works behind the scenes?

I believe that when cp completes, it should not cause umount failures. If there are data to be flushed to the disk, umount should wait for the write to complete even without --lazy.
IOW, I don't understand what's going on. I suspect some process might be using the mount (could be just by having its current working directory in the mounted filesystem).
Ideally, it would be good to debug and understand this (for example by using fuser -m), but I understand that it could be difficult and that using --lazy is an easier solution.

pcahyna commented at 2023-01-12 14:21:

One could change the umount command to something like

umount $v $TMP_DIR/efiboot.img || fuser -m -v  $TMP_DIR/efi_virt

to try to see the reason for umount problems.

thomas-merz commented at 2023-01-12 14:37:

@pcahyna, "one"? Will you do that? Or do you want me to test this? 😉

jsmeix commented at 2023-01-12 14:41:

Regarding 'fuser' I saw
"Why fuser is inferior to lsof" in
https://stackoverflow.com/questions/7878707/how-to-unmount-a-busy-device

I am not at all an expert in this area.
From what I read all that looks more like a pile of long-grown mess
than something where things are working cleanly and consistently.

pcahyna commented at 2023-01-12 14:43:

@thomas-merz I meant that we could change the released ReaR code this way, so that any user facing the issue would get a more meaningful error message.

But if you are able to reproduce the issue reliably, please try this and report the results! I am curious.

jsmeix commented at 2023-01-12 14:43:

@thomas-merz
I would much appreciate it if you could test things
because it fails in your case so your system behaves
as we need to test real failures.

The "one" who then implements it in ReaR is me via
https://github.com/rear/rear/pull/2909

pcahyna commented at 2023-01-12 14:49:

I am not at all an expert in this area. From what I read all that looks more like pile of long grown mess than something where things are working cleanly and consistently.

I agree. The umount command uses the umount(2) or umount2(2) system call behind the scenes. From the manual page you can see that the amount of information the system call can accept (four boolean flags) and return (one of a few integer errno values, without any additional parameters that would help to debug a problem) is really limited.

pcahyna commented at 2023-01-12 14:50:

The latter point also applies to the mount syscall where you basically only know that "something went wrong".

jsmeix commented at 2023-01-12 14:59:

In https://github.com/rear/rear/pull/2909 via
https://github.com/rear/rear/commit/25b532824b57d1e064bee179591c01f907478039
I added 'fuser -v -m $TMP_DIR/efi_virt' output in the log file

jsmeix commented at 2023-01-12 15:07:

Yes,
those funny 'mount' error messages like:

mount: /var/tmp/rear.../outputfs: wrong fs type, bad option,
 bad superblock on /dev/sd..., missing codepage or helper program,
 or other error.

that tells all and nothing.
So nowadays a more modern state of the art

mount: Oops - something went wrong - we apologize

that truly tells nothing would be "more correct" ;-)

jsmeix commented at 2023-01-12 15:34:

Regarding
https://github.com/rear/rear/issues/2908#issuecomment-1380417619

If there are data to be flushed to the disk,
umount should wait for the write to complete

This is exactly what I experienced some time ago.
I did a 'dd' of a huge file into a mounted filesystem on a USB disk
and 'dd' completed rather soon but my subsequent plain normal 'umount'
took a rather long time while the USB disk (a real rotating disk)
made its typical ongoing access noise until all had completed
and finally 'umount' returned.
I even asked a colleague at that time and he told me
that 'umount' is the best way in practice to be sure that
all that needs to be written to a filesystem will have been
written by the kernel when 'umount' returns.
The only thing that is left then is the cache in the disk hardware.
So after 'umount' returned one should still wait a bit before
unplugging a USB disk (which does a hard power-off for the disk)
so that possible dirty caches on the disk itself could clean up.

pcahyna commented at 2023-01-12 15:41:

"Why fuser is inferior to lsof" in
https://stackoverflow.com/questions/7878707/how-to-unmount-a-busy-device

The reasoning there seems to be that you can't use fuser on a lazy-unmounted filesystem (because the filesystem is detached, so no path can point to any file on it), while lsof would still display the path. As you are using fuser before trying umount --lazy, I think this reasoning does not apply and fuser will do the right thing.

pcahyna commented at 2023-01-12 15:44:

Yes, those funny 'mount' error message like:

mount: /var/tmp/rear.../outputfs: wrong fs type, bad option,
 bad superblock on /dev/sd..., missing codepage or helper program,
 or other error.

that tells all and nothing. So nowadays a more modern state of the art

mount: Oops - something went wrong - we apologize

that truly tells nothing would be "more correct" ;-)

Indeed - have fun with reports from users who took the error message at face value and complain that they have a bad superblock.

pcahyna commented at 2023-01-12 15:47:

did a 'dd' of a huge file into a mounted filesystem on a USB disk
and 'dd' completed rather soon but my subsequent plain normal 'umount'
took a rather long time while the USB disk (a real rotating disk)
made its typical ongoing access noise until all had completed
and finally 'umount' returned.

Indeed, so cp should not be causing this (after all, cp should not be doing anything special compared to dd to a file).

thomas-merz commented at 2023-01-12 16:15:

@jsmeix

I would much appreciate it if you could test things because it fails in your case so your system behaves as we need to test real failures.

The "one" who then implements it in ReaR is me via #2909

++ umount -v /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efiboot.img
umount: /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt: target is busy.
++ fuser -m -v /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt
                     USER        PID ACCESS COMMAND
/var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt:
                     root     kernel mount /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt

Which is still mounted:

Filesystem     1K-blocks  Used Available Use% Mounted on
/dev/loop3         98094 12618     85476  13% /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt

pcahyna commented at 2023-01-12 16:40:

@thomas-merz thanks. I experimented a bit and it seems that the "kernel mount" entry is shown by fuser for any mounted filesystem. This indicates that the filesystem is unused, if there are no other entries in the output. Why the unmounting fails is then even more mysterious. Could you please add one more "umount $TMP_DIR/efiboot.img" after the fuser ? Maybe it is just some temporary glitch and a second umount will succeed even without --lazy.
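
A minimal sketch of that check, assuming the verbose fuser output has the shape shown in the transcripts in this thread (a header line, the path, then one row per user). The function name only_kernel_mount is hypothetical and not part of ReaR. It reads fuser output on stdin and succeeds only when the "kernel mount" row is the sole entry:

```shell
# Hypothetical helper (not ReaR code): reads 'fuser -v -m' output on stdin.
# Exit status 0 = only the "kernel mount" entry is listed,
# exit status 1 = at least one other user of the filesystem is listed.
only_kernel_mount() {
    awk '
        $1 == "USER" && $2 == "PID" { next }    # header line
        NF <= 1 { next }                        # blank line, or the path alone on its line
        {
            # Skip the entry that fuser shows for every mounted filesystem
            for (i = 1; i < NF; i++)
                if ($i == "kernel" && $(i + 1) == "mount") next
            others++                            # any other row is a real user
        }
        END { exit (others > 0) }
    '
}
```

It could be used as `fuser -v -m "$TMP_DIR/efi_virt" 2>&1 | only_kernel_mount && echo "no process is using the filesystem"`. The `2>&1` is there because fuser writes everything except the bare PIDs to stderr.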

thomas-merz commented at 2023-01-13 08:37:

@pcahyna
Adding a second umount (without --lazy) will also unmount $TMP_DIR/efiboot.img 👍

Update @pcahyna - sorry, I did make a mistake:
Adding a second umount (without --lazy) will
NOT unmount $TMP_DIR/efiboot.img 👎

jsmeix commented at 2023-01-13 08:44:

I can confirm on my openSUSE Leap 15.4 system
that the "kernel mount" is always there:

# mount -v /dev/sda6 /other ; fuser -v -m /other ; umount -v /other

mount: /dev/sda6 mounted on /other.

                     USER        PID ACCESS COMMAND
/other:              root     kernel mount /other

umount: /other unmounted

I also think it is a timing issue
which is my reasoning behind why in
https://github.com/rear/rear/pull/2909
I do first of all 'sleep 1' before normal 'umount'
https://github.com/rear/rear/blob/25b532824b57d1e064bee179591c01f907478039/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh#L53

jsmeix commented at 2023-01-13 09:20:

I tried to reproduce such a regular umount failure
of a loop mounted file on my openSUSE Leap 15.4 system
several times in an automated row (i.e. via a script)
and for me it always works.

I have /var/tmp in the root filesystem which is

# lsblk -ipo NAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT
NAME                                                      TRAN   TYPE  FSTYPE        SIZE MOUNTPOINT
/dev/sda                                                  sata   disk              465.8G 
|-/dev/sda1                                                      part                  8M 
|-/dev/sda2                                                      part  crypto_LUKS     4G 
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2        crypt swap            4G [SWAP]
|-/dev/sda3                                                      part  crypto_LUKS   200G 
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3        crypt ext4          200G /

In contrast for @thomas-merz /var/tmp is on XFS on LVM
(excerpt from his initial description here)

NAME                                    KNAME      PKNAME    TRAN TYPE FSTYPE      LABEL   SIZE MOUNTPOINT
...
/dev/sdb                                /dev/sdb                  disk                     122G
|-/dev/sdb1                             /dev/sdb1  /dev/sdb       part                       8M
|-/dev/sdb2                             /dev/sdb2  /dev/sdb       part ext3              372.5M /boot
|-/dev/sdb3                             /dev/sdb3  /dev/sdb       part LVM2_member       119.5G
| |-/dev/mapper/systemvg-usr_lv         /dev/dm-0  /dev/sdb3      lvm  xfs                  15G /usr
| |-/dev/mapper/systemvg-swap_lv        /dev/dm-1  /dev/sdb3      lvm  swap                 16G [SWAP]
| |-/dev/mapper/systemvg-root_lv        /dev/dm-2  /dev/sdb3      lvm  xfs                  15G /
| |-/dev/mapper/systemvg-opt_lv         /dev/dm-9  /dev/sdb3      lvm  xfs                   5G /opt
| |-/dev/mapper/systemvg-var_lv         /dev/dm-11 /dev/sdb3      lvm  xfs                  10G /var
| |-/dev/mapper/systemvg-varlogaudit_lv /dev/dm-12 /dev/sdb3      lvm  xfs                  10G /var/log/audit
| |-/dev/mapper/systemvg-varlog_lv      /dev/dm-13 /dev/sdb3      lvm  xfs                  10G /var/log
| |-/dev/mapper/systemvg-vartmp_lv      /dev/dm-14 /dev/sdb3      lvm  xfs                  10G /var/tmp
| |-/dev/mapper/systemvg-tmp_lv         /dev/dm-15 /dev/sdb3      lvm  xfs                  10G /tmp
| `-/dev/mapper/systemvg-home_lv        /dev/dm-16 /dev/sdb3      lvm  xfs                  15G /home

So perhaps (only a blind guess) something in XFS or LVM
causes some tiny delay somewhere so that a loop mounted file
that actually is on XFS or LVM may sometimes show such an
inexplicable regular umount failure?

My disk 'sda' is of lsblk TRAN 'sata'.
In contrast for @thomas-merz 'sdb' there is no lsblk TRAN.
So perhaps his sdb is somewhat "unusually" connected
(his "hardware" is 'VMware')
and that causes some tiny delay somewhere that may
sometimes cause such an inexplicable regular umount failure?

thomas-merz commented at 2023-01-13 10:15:

@jsmeix , our disks are VMware-VMDKs on an IBM-SAN-Storagebox in virtual machines:

# dmesg -T|grep sdb
[Thu Jan  5 02:42:15 2023] sd 0:0:0:0: [sdb] 255852544 512-byte logical blocks: (131 GB/122 GiB)
[Thu Jan  5 02:42:15 2023] sd 0:0:0:0: [sdb] Write Protect is off
[Thu Jan  5 02:42:15 2023] sd 0:0:0:0: [sdb] Mode Sense: 3b 00 00 00
[Thu Jan  5 02:42:15 2023] sd 0:0:0:0: [sdb] Write cache: disabled, read cache: disabled, doesn't support DPO or FUA
[Thu Jan  5 02:42:15 2023]  sdb: sdb1 sdb2 sdb3 sdb4
[Thu Jan  5 02:42:15 2023] sd 0:0:0:0: [sdb] Attached SCSI disk
[Thu Jan  5 02:42:19 2023] EXT4-fs (sdb2): mounting ext3 file system using the ext4 subsystem
[Thu Jan  5 02:42:20 2023] EXT4-fs (sdb2): mounted filesystem with ordered data mode. Opts: acl,user_xattr

Full lsblk:

# lsblk -ipo NAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT /dev/sdb
NAME                                    TRAN TYPE FSTYPE        SIZE MOUNTPOINT
/dev/sdb                                     disk               122G
|-/dev/sdb1                                  part                 8M
|-/dev/sdb2                                  part ext3        372.5M /boot
|-/dev/sdb3                                  part LVM2_member 119.5G
| |-/dev/mapper/systemvg-usr_lv              lvm  xfs            15G /usr
| |-/dev/mapper/systemvg-swap_lv             lvm  swap           16G [SWAP]
| |-/dev/mapper/systemvg-root_lv             lvm  xfs            15G /
| |-/dev/mapper/systemvg-opt_lv              lvm  xfs             5G /opt
| |-/dev/mapper/systemvg-var_lv              lvm  xfs            10G /var
| |-/dev/mapper/systemvg-varlogaudit_lv      lvm  xfs            10G /var/log/audit
| |-/dev/mapper/systemvg-varlog_lv           lvm  xfs            10G /var/log
| |-/dev/mapper/systemvg-vartmp_lv           lvm  xfs            10G /var/tmp
| |-/dev/mapper/systemvg-tmp_lv              lvm  xfs            10G /tmp
| `-/dev/mapper/systemvg-home_lv             lvm  xfs            15G /home
`-/dev/sdb4                                  part vfat          139M /boot/efi

Is there anything "unusual" for you? 🤔

pcahyna commented at 2023-01-13 10:38:

@jsmeix I don't think the file system type of /var/tmp matters (although I am not 100% sure), I believe the mounted file system type is more relevant. And that is vfat.

jsmeix commented at 2023-01-13 10:42:

Everything that is not usual end-user hardware is "unusual" for me
because I only have usual end-user hardware in my homeoffice.
And I have only very limited "usual end-user hardware"
(i.e. only what I actually have in my homeoffice).

"Unusual" does not mean something is wrong or broken.
But it means it is something where others cannot reproduce
how unusual (or special) hardware (or software) behaves.

Same with virtualization software:
I only use KVM/QEMU.
In particular I don't use any proprietary software.

So "VMware-VMDKs on an IBM-SAN-Storagebox"
is "very unusual" for me personally.

Things may change when you have a valid support contract
with SUSE for ReaR in a SUSE Linux Enterprise product
because for SUSE special enterprise hardware and software
is less "unusual" (but of course SUSE cannot have
each and every enterprise hardware and software).

For more information see the section
"SUSE support for Relax-and-Recover" in
https://en.opensuse.org/SDB:Disaster_Recovery

jsmeix commented at 2023-01-13 10:53:

I had tested with VFAT and
it all works for me (100 times in a row):

# dd if=/dev/zero of=/var/tmp/test.img count=64 bs=1M
64+0 records in
64+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.0453405 s, 1.5 GB/s

# mkfs.vfat /var/tmp/test.img
mkfs.fat 4.1 (2017-01-24)

# mkdir /var/tmp/test.mnt

# for i in $( seq 100 ) ; \
  do echo test $i ; \
     mount -v -o loop -t vfat /var/tmp/test.img /var/tmp/test.mnt ; \
     rm -f /var/tmp/test.mnt/* ; \
     dd if=/dev/urandom of=/var/tmp/test.mnt/urandom.data count=60 bs=1M ; \
     umount -v /var/tmp/test.mnt || break ; \
  done

test 1
mount: /dev/loop0 mounted on /var/tmp/test.mnt.
60+0 records in
60+0 records out
62914560 bytes (63 MB, 60 MiB) copied, 0.9974 s, 63.1 MB/s
umount: /var/tmp/test.mnt unmounted
test 2
mount: /dev/loop0 mounted on /var/tmp/test.mnt.
60+0 records in
60+0 records out
62914560 bytes (63 MB, 60 MiB) copied, 1.04459 s, 60.2 MB/s
umount: /var/tmp/test.mnt unmounted
.
.
.
test 100
mount: /dev/loop0 mounted on /var/tmp/test.mnt.
60+0 records in
60+0 records out
62914560 bytes (63 MB, 60 MiB) copied, 1.02499 s, 61.4 MB/s
umount: /var/tmp/test.mnt unmounted

thomas-merz commented at 2023-01-13 11:10:

Things may change when you have a valid support contract
with SUSE for ReaR in a SUSE Linux Enterprise product
because for SUSE special enterprise hardware and software
is less "unusual" (but of course SUSE cannot have
each and every enterprise hardware and software).

We "have a valid support contract with SUSE for ReaR in a SUSE Linux Enterprise product", but we are using ReaR 2.7 from upstream because the one provided by SUSE (2.3) is far too old and we need a more current version with Rubrik integration.

@roseswe already asked a SUSE representative and you @jsmeix on Nov. 29th to adapt ReaR 2.7 into official SUSE repos.

So are we stuck now or can we proceed? 🤷‍♂️

thomas-merz commented at 2023-01-13 11:12:

By "unusual to you" I meant "disabled" or "unsupported" features or Opts.

pcahyna commented at 2023-01-13 12:13:

So, if it fails only at the first attempt, I would advise to add fuser -m -v $TMP_DIR/efi_virt before the first (failing) umount. I agree it is a timing issue and it would be good to know if it is purely in the kernel or if there is some userspace process involved (in the latter case, fuser before failing umount should reveal it). Could you please do this test, @thomas-merz ?

pcahyna commented at 2023-01-13 12:16:

sorry, now I saw the update:

I did make a mistake:
Adding a second umount (without --lazy) will NOT unmount $TMP_DIR/efiboot.img

so, maybe it is not a timing issue after all.

jsmeix commented at 2023-01-13 12:26:

@thomas-merz

providing ReaR 2.7 as new RPM package 'rear27a' for SLE-HA-15
("SUSE Linux Enterprise High Availability Extension version 15")
is already done by me (as the RPM package maintainer at SUSE)
which means currently it is work in progress by others at SUSE
to get that new RPM package rear27a into the SLE-HA-15 product
so that finally it becomes officially available for SUSE customers.

That you use ReaR 2.7 from ReaR upstream is much appreciated and
recommended by me and I support you with that here at ReaR upstream
because this way we can right now proceed to get it working for you
as you need it in your specific case (e.g. fixing issues like this one)
and we at ReaR upstream can learn new (special) things how we could
further improve ReaR so that it works in even more special cases
(there are already lots of things in ReaR to deal with special cases).

Because you reported your issue publicly at ReaR upstream
all ReaR upstream developers can work together on a proper fix.
In particular @pcahyna helps me so much with ReaR issues.

To be able to improve ReaR (in particular for special cases)
we at ReaR upstream depend very much on contributions by users.

What I appreciate most of all is that you @thomas-merz
as an actual user of ReaR (i.e. someone who sits in front
of a system where ReaR is used and who is 'root' there)
directly works together with us at ReaR upstream.
This helps you to get your specific issue fixed
as good as possible and as fast as possible
"nothing is better and faster than ReaR upstream" ;-)
and it helps us at ReaR upstream to understand
a special case to be able to deal with it properly
"nothing is better than direct collaboration with users" :-)

So nothing is stuck.
Let us just proceed here as we currently do.
From my point of view all works perfectly well here.

In the end all will benefit from how we do things here:
First of all you because your specific issue gets fixed right now.
Later when it is fixed generically in ReaR also other users who use
ReaR upstream benefit because they will not be hit by this issue.
Finally Linux distributions benefit when they provide a newer ReaR
version to their users who will benefit last.

jsmeix commented at 2023-01-13 12:31:

Oops!
I also did not notice the update in
https://github.com/rear/rear/issues/2908#issuecomment-1381481497

I assume this happened because GitHub does not send
its usual notification e-mail when a comment was updated
and at least I depend on GitHub's notification e-mails.

thomas-merz commented at 2023-01-13 13:23:

@pcahyna

add fuser -m -v $TMP_DIR/efi_virt before the first (failing) umount.

This little "slow down" by fuser seems to be sufficient to make umount successfully unmount the filesystem:

++ fuser -m -v /var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efi_virt
                     USER        PID ACCESS COMMAND
/var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efi_virt:
                     root     kernel mount /var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efi_virt
++ umount -v /var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efiboot.img
umount: /var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efi_virt (/dev/loop0) unmounted

jsmeix commented at 2023-01-13 13:58:

So it looks again as if some simple
artificial 'sleep 1' delay before 'umount'
should sufficiently avoid such issues
which I do currently in
https://github.com/rear/rear/pull/2909
in the

# Umounting the EFI virtual image:
...

code section in
https://github.com/rear/rear/blob/25b532824b57d1e064bee179591c01f907478039/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh

@thomas-merz
could you try out if using
/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh
as in
https://github.com/rear/rear/blob/25b532824b57d1e064bee179591c01f907478039/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh
makes that umount work sufficiently reliably for you?

You can see the actual code change in
https://github.com/rear/rear/pull/2909/files
(long lines may be shown wrapped in the browser).

You only need the change for
output/ISO/Linux-i386/700_create_efibootimg.sh

thomas-merz commented at 2023-01-13 15:13:

@jsmeix , sleep 1 also works fine:

++ sleep 1
++ umount -v /var/tmp/rear.6CEkscr2qDP8VI5/tmp/efiboot.img
umount: /var/tmp/rear.6CEkscr2qDP8VI5/tmp/efi_virt (/dev/loop0) unmounted
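
The delay-before-umount idea tested above could be sketched as follows. This is an illustration under assumptions, not the code from the pull request: the function name umount_with_delay, the number of attempts, and the default delay are all hypothetical.

```shell
# Hypothetical helper (not the code from the pull request):
# sleep first, then try a normal umount, up to a given number of attempts.
# $1 = mountpoint, $2 = attempts (default 3), $3 = delay in seconds (default 1)
umount_with_delay() {
    local mp="$1"
    local attempts="${2:-3}"
    local delay="${3:-1}"
    local try=1
    while [ "$try" -le "$attempts" ]; do
        # Wait a moment before each attempt
        sleep "$delay"
        if umount -v "$mp"; then
            return 0
        fi
        try=$((try + 1))
    done
    # Still mounted: the caller decides what to do next
    return 1
}
```

A caller could fall back to a lazy umount only when all normal attempts failed, for example `umount_with_delay "$TMP_DIR/efi_virt" || umount -v --lazy "$TMP_DIR/efi_virt"`.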

pcahyna commented at 2023-01-13 19:26:

since there does not seem to be a process blocking it (although we are not 100% sure), maybe the filesystem just needs syncing?
Could you please try adding sync --file-system $TMP_DIR/efi_virt before the first umount attempt (without the sleep and the fuser)?

thomas-merz commented at 2023-01-13 20:32:

I did:

++ sync --file-system /var/tmp/rear.9nhtp3oeToiJMS5/tmp/efi_virt
++ umount -v /var/tmp/rear.9nhtp3oeToiJMS5/tmp/efiboot.img
umount: /var/tmp/rear.9nhtp3oeToiJMS5/tmp/efi_virt: target is busy.

And voila - it's still mounted:

Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0       96M   13M   84M  13% /var/tmp/rear.9nhtp3oeToiJMS5/tmp/efi_virt

👎

pcahyna commented at 2023-01-13 20:48:

@thomas-merz thanks for trying...

jsmeix commented at 2023-01-26 14:18:

An addendum regarding fuser -v -m /path/to/something
see in https://github.com/rear/rear/pull/2909
my recent commit
https://github.com/rear/rear/pull/2909/commits/f770c69ee7e2fadc7137a37b3c5596da0b62aedb
why one should use fuser with its '-M' option

fuser -v -M -m /path/to/something

instead of plain fuser -v -m /path/to/something

For example (on my openSUSE Leap 15.4 system):

# mount -v /dev/sda6 /other
mount: /dev/sda6 mounted on /other.

# fuser -v -m /other
                     USER        PID ACCESS COMMAND
/other:              root     kernel mount /other

# fuser -v -M -m /other
                     USER        PID ACCESS COMMAND
/other:              root     kernel mount /other

# umount -v /other
umount: /other unmounted

# fuser -v -m /other
                     USER        PID ACCESS COMMAND
/other:              root     kernel mount /
                     root          1 .rce. systemd
                     root          2 .rc.. kthreadd
...
[about 300 more lines]
...
                     root      31953 .rc.. kvm-pit/31921
                     johannes  32016 .rce. ssh
                     johannes  32018 .rce. ssh-agent

# fuser -v -M -m /other
Specified filename /other is not a mountpoint.

pcahyna commented at 2023-02-14 11:45:

Hi @jsmeix , good catch with -M, indeed, if the filesystem is not mounted at this point, the output would be very misleading.

Unfortunately, RHEL 6 does not yet have the fuser -M flag, so I suspect some of the SLES or openSUSE variants that you care about will have the same problem.

jsmeix commented at 2023-02-14 12:01:

Hello @pcahyna

if you have a system where 'fuser' does not support '-M'
could you show me what 'fuser -v -M -m /dir' results?
Does it perhaps just ignore '-M' or does it error out?

pcahyna commented at 2023-02-14 12:27:

Hello @pcahyna

if you have a system where 'fuser' does not support '-M' could you show me what 'fuser -v -M -m /dir' results? Does it perhaps just ignore '-M' or does it error out?

# mount
/dev/mapper/vg_ciscoc240m301-lv_root on / type ext4 (rw)
(...)
/dev/sda1 on /boot type ext4 (rw)
# fuser -v -M -m /boot
M: unknown signal; fuser -l lists signals.
# echo $?
1

Weird, huh? That's on RHEL 6.
The manual page says:

fuser [-a|-s|-c] [-4|-6] [-n  space ] [-k [-i] [-signal ] ] [-muvf] name ...

and we get bitten by the -signal part.
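
Where fuser lacks '-M', one possible workaround is to test the directory with the mountpoint utility before calling fuser -m. This is only a sketch: it assumes the mountpoint utility is installed, the function name fuser_if_mounted is hypothetical, and unlike '-M' the check is not atomic, so the filesystem could still get unmounted between the two commands.

```shell
# Hypothetical helper (not ReaR code): run 'fuser -v -m' only when the
# directory is currently a mountpoint, so that fuser does not silently
# report the users of the parent filesystem instead.
# $1 = directory that is expected to be a mountpoint
fuser_if_mounted() {
    local dir="$1"
    if mountpoint -q "$dir"; then
        fuser -v -m "$dir"
    else
        echo "$dir is not a mountpoint, skipping fuser" >&2
        return 1
    fi
}
```

Called as `fuser_if_mounted "$TMP_DIR/efi_virt"`, it avoids the roughly 300 lines of misleading output shown earlier for an already unmounted directory.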

jsmeix commented at 2023-03-20 12:18:

Regarding in
https://github.com/rear/rear/issues/2908#issuecomment-1429605894
what SLES versions support 'fuser ... -M ...':

SLES11 does not support 'fuser ... -M ...'

SLES12 and later support 'fuser ... -M ...'

Since ReaR 2.7 is released I no longer care about SLES11, cf.
https://github.com/rear/rear/pull/2949#discussion_r1124225289

jsmeix commented at 2023-03-22 12:25:

With https://github.com/rear/rear/pull/2909 merged
this issue should be sufficiently avoided.


[Export of Github issue for rear/rear.]