#2908 Issue closed: ReaR 2.7 on SLES15SP2: inexplicable regular umount failure
Labels: enhancement, support / question, fixed / solved / done, special hardware or VM
thomas-merz opened issue at 2023-01-09 18:02:
Relax-and-Recover (ReaR) Issue Template
- ReaR version ("/usr/sbin/rear -V"):
Relax-and-Recover 2.7 / 2022-07-13
- OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):
NAME="SLES"
VERSION="15-SP2"
VERSION_ID="15.2"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP2"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp2"
- ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
OUTPUT=ISO
BACKUP=CDM
OUTPUT_URL=file:///rear/iso
OUTPUT_PREFIX="hostname-f"
export TMPDIR="/tmp"
TIMESYNC=NTP
NETFS_KEEP_OLD_BACKUP_COPY=N
EXCLUDE_VG=()
ONLY_INCLUDE_VG=(systemvg binvg hanasharedvg)
WAIT_SECS=120
SKIP_CFG2HTML=Y
USE_CFG2HTML=N
- Hardware vendor/product (PC or PowerNV BareMetal or ARM) or VM (KVM guest or PowerVM LPAR):
VMware
- System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
x86
- Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
EFI v2.40 by VMware, Inc.
- Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
more than one "VMware PVSCSI storage adapter rev 2" with a total of 5 disks
- Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT"):
NAME KNAME PKNAME TRAN TYPE FSTYPE LABEL SIZE MOUNTPOINT
/dev/loop0 /dev/loop0 loop vfat 96M /var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt
/dev/loop1 /dev/loop1 loop vfat 96M /var/tmp/rear.OVhhey7uK3taoxV/tmp/efi_virt
/dev/loop2 /dev/loop2 loop vfat 96M /var/tmp/rear.CJOWoaQvuMUvyDv/tmp/efi_virt
/dev/loop3 /dev/loop3 loop vfat 96M /var/tmp/rear.IIDh3emTJrYkDY4/tmp/efi_virt
/dev/loop4 /dev/loop4 loop vfat 96M /var/tmp/rear.PxS05QU0gUOharc/tmp/efi_virt
/dev/loop5 /dev/loop5 loop vfat 96M /var/tmp/rear.XnSh08Zi7LWWAAm/tmp/efi_virt
/dev/loop6 /dev/loop6 loop vfat 96M /var/tmp/rear.k4nw099Q7s9UiH6/tmp/efi_virt
/dev/loop7 /dev/loop7 loop vfat 96M /var/tmp/rear.l53DsAliWYCFV8c/tmp/efi_virt
/dev/sda /dev/sda disk 170G
`-/dev/sda1 /dev/sda1 /dev/sda part LVM2_member 170G
|-/dev/mapper/binvg-usrsap_lv /dev/dm-6 /dev/sda1 lvm xfs 50G /usr/sap
|-/dev/mapper/binvg-saphome_lv /dev/dm-7 /dev/sda1 lvm xfs 2G /home/sap
|-/dev/mapper/binvg-uc4home_lv /dev/dm-8 /dev/sda1 lvm xfs 2G /home/uc4
`-/dev/mapper/binvg-sapinst_lv /dev/dm-10 /dev/sda1 lvm xfs 115G /sapinst
/dev/sdb /dev/sdb disk 122G
|-/dev/sdb1 /dev/sdb1 /dev/sdb part 8M
|-/dev/sdb2 /dev/sdb2 /dev/sdb part ext3 372.5M /boot
|-/dev/sdb3 /dev/sdb3 /dev/sdb part LVM2_member 119.5G
| |-/dev/mapper/systemvg-usr_lv /dev/dm-0 /dev/sdb3 lvm xfs 15G /usr
| |-/dev/mapper/systemvg-swap_lv /dev/dm-1 /dev/sdb3 lvm swap 16G [SWAP]
| |-/dev/mapper/systemvg-root_lv /dev/dm-2 /dev/sdb3 lvm xfs 15G /
| |-/dev/mapper/systemvg-opt_lv /dev/dm-9 /dev/sdb3 lvm xfs 5G /opt
| |-/dev/mapper/systemvg-var_lv /dev/dm-11 /dev/sdb3 lvm xfs 10G /var
| |-/dev/mapper/systemvg-varlogaudit_lv /dev/dm-12 /dev/sdb3 lvm xfs 10G /var/log/audit
| |-/dev/mapper/systemvg-varlog_lv /dev/dm-13 /dev/sdb3 lvm xfs 10G /var/log
| |-/dev/mapper/systemvg-vartmp_lv /dev/dm-14 /dev/sdb3 lvm xfs 10G /var/tmp
| |-/dev/mapper/systemvg-tmp_lv /dev/dm-15 /dev/sdb3 lvm xfs 10G /tmp
| `-/dev/mapper/systemvg-home_lv /dev/dm-16 /dev/sdb3 lvm xfs 15G /home
`-/dev/sdb4 /dev/sdb4 /dev/sdb part vfat 139M /boot/efi
/dev/sdc /dev/sdc disk 1.4T
`-/dev/sdc1 /dev/sdc1 /dev/sdc part LVM2_member 1.4T
`-/dev/mapper/hanadatavg-hanadata_lv /dev/dm-4 /dev/sdc1 lvm xfs 1.4T /hana/data
/dev/sdd /dev/sdd disk 80G
`-/dev/sdd1 /dev/sdd1 /dev/sdd part LVM2_member 80G
`-/dev/mapper/hanasharedvg-hanashared_lv /dev/dm-3 /dev/sdd1 lvm xfs 80G /hana/shared
/dev/sde /dev/sde disk 700G
`-/dev/sde1 /dev/sde1 /dev/sde part LVM2_member 700G
`-/dev/mapper/hanalogvg-hanalog_lv /dev/dm-5 /dev/sde1 lvm xfs 700G /hana/log
/dev/sr0 /dev/sr0 ata rom 1024M
- Description of the issue (ideally so that others can reproduce it):
When running ReaR 2.7 on our EFI-boot-enabled VMs, it reports
Could not remove build area /var/tmp/rear.k4nw099Q7s9UiH6 (something still exists therein)
and many "dead" mounts are left behind (one for each run):
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/isofs/boot/efiboot.img (deleted) on /var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt type vfat
...(and so on)...
- Workaround, if any:
None; or manual unmount
- Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
2023-01-09 17:51:36.478977061 Exiting rear mkrescue (PID 126650) and its descendant processes ...
2023-01-09 17:51:40.004244933 rear,126650 /usr/sbin/rear mkrescue -vvv
`-rear,154186 /usr/sbin/rear mkrescue -vvv
`-pstree,154187 -Aplau 126650
2023-01-09 17:51:40.211717375 Running exit tasks
2023-01-09 17:51:40.236870986 Finished rear mkrescue in 165 seconds
2023-01-09 17:51:40.250497716 Removing build area /var/tmp/rear.k4nw099Q7s9UiH6
2023-01-09 17:51:40.346937030 Failed to 'rm -Rf --one-file-system /var/tmp/rear.k4nw099Q7s9UiH6/tmp'
2023-01-09 17:51:40.831854759 Could not remove build area /var/tmp/rear.k4nw099Q7s9UiH6 (something still exists therein)
2023-01-09 17:51:40.862120053 Something is still mounted within the build area
2023-01-09 17:51:40.878799867 /var/tmp/rear.k4nw099Q7s9UiH6/tmp/isofs/boot/efiboot.img (deleted) on /var/tmp/rear.k4nw099Q7s9UiH6/tmp/efi_virt type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
2023-01-09 17:51:40.895534725 You must manually umount it, then you could manually remove the build area
2023-01-09 17:51:40.912931056 To manually remove the build area use (with caution): rm -Rf --one-file-system /var/tmp/rear.k4nw099Q7s9UiH6
2023-01-09 17:51:40.929027396 End of program 'rear' reached
Or do you need the full detailed log file with 166441 lines?
We found this by accident because our monitoring
reported so many new filesystems every day for these
/var/tmp/rear.XXXXXX filesystems
after each scheduled "rear mkrescue" run…
jsmeix commented at 2023-01-10 07:43:
@thomas-merz
yes, I need a full debug "rear -D mkrescue" logfile
to be able to find out why something is still mounted
within the build area in your particular case.
If that kind of issue would happen in general
we would have noticed it and fixed its root cause.
Additionally I need the information what files there are
below the mountpoint directory that is still mounted
within the build area in your particular case,
i.e. what files and directories there are in
/var/tmp/rear.*/tmp/efi_virt
in your particular case.
I assume something related to UEFI code does not
sufficiently clean up its own temporary files
or it fails to do so but "blindly proceeds" (see below).
Because 'efi_virt' appears only in
output/ISO/Linux-i386/700_create_efibootimg.sh
this script is the "prime suspect".
FYI:
Be grateful that over time we at ReaR upstream have added
many basic sanity checks and safeguards in many places
so that ReaR less often blindly proceeds regardless of issues,
according to "Try hard to care about possible errors" in
https://github.com/rear/rear/wiki/Coding-Style
By default bash proceeds with the next command when something failed.
Do not let your code blindly proceed in case of errors...
There are still many places in ReaR where it blindly proceeds
and we fix them one by one and step by step as we notice them.
Regarding this particular case see the critical issue
https://github.com/rear/rear/issues/2611
in particular the initial description how the old blind
"rm -Rf /.../outputfs" could cause a disaster by ReaR
... removes ... backup directory ...
... destroy the backup directories for other machines ...
instead of letting ReaR help the user to protect against disaster
and see the rather tricky implementation of an appropriate fix
https://github.com/rear/rear/pull/2625
In this particular case here it is not BACKUP=NETFS
so no backup should be mounted below /var/tmp/rear...
but at the point when /var/tmp/rear... is cleaned up
it is unknown what there might be still left mounted below
(normally all gets umounted and nothing is left mounted)
so in general the new careful "rm -Rf --one-file-system"
instead of the old blind "rm -Rf" helps to avoid
possible disastrous outcomes - i.e. better safe than sorry.
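The careful cleanup described above (do not remove the build area while something is still mounted inside it, and never cross filesystem boundaries when removing) can be sketched as follows. This is only a minimal illustration and not ReaR's actual code: the function names mounts_below and remove_build_area and the optional mounts-file argument are assumptions made for this example.

```shell
# Hypothetical helper (not ReaR code): list mount points below a directory.
# Reads /proc/self/mounts by default; another file can be passed for testing.
mounts_below() {
    local dir="${1%/}" mounts_file="${2:-/proc/self/mounts}"
    local device mountpoint rest
    while read -r device mountpoint rest ; do
        case "$mountpoint" in
            ("$dir"/*) printf '%s\n' "$mountpoint" ;;
        esac
    done < "$mounts_file"
}

# Hypothetical helper: remove a build area only when nothing is mounted inside it.
remove_build_area() {
    local build_dir="$1" mounts_file="${2:-/proc/self/mounts}"
    if test "$( mounts_below "$build_dir" "$mounts_file" )" ; then
        echo "Something is still mounted within $build_dir, not removing it" >&2
        return 1
    fi
    # --one-file-system (GNU rm) skips directories on other filesystems
    rm -Rf --one-file-system "$build_dir"
}
```

With the stale mounts from this issue, a call like remove_build_area /var/tmp/rear.k4nw099Q7s9UiH6 would refuse to delete anything until the efi_virt mount is gone.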
thomas-merz commented at 2023-01-11 11:28:
Can I send you the full debug "rear -D mkrescue" log file by some means other than public GitHub? It may contain "sensitive" data that my company doesn't want to be leaked publicly.
thomas-merz commented at 2023-01-11 11:28:
what files there are
below the mountpoint directory that is still mounted
For example for one single mounted filesystem out of many:
find /var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/BOOTX64.efi
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/grub.cfg
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/fonts
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/fonts/unicode.pf2
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ast.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ca.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/da.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/de.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/de_CH.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/en@quot.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/eo.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/es.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/fi.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/fr.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/gl.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/hr.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/hu.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/id.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/it.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ja.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ko.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/lt.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/nb.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/nl.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/pa.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/pl.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/pt.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/pt_BR.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ro.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/ru.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/sl.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/sr.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/sv.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/tr.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/uk.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/vi.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/zh_CN.mo
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt/EFI/BOOT/locale/zh_TW.mo
jsmeix commented at 2023-01-11 12:02:
@thomas-merz
I think only the part when
output/ISO/Linux-i386/700_create_efibootimg.sh
runs is perhaps already sufficient to show why in
output/ISO/Linux-i386/700_create_efibootimg.sh
it fails to umount $v $TMP_DIR/efiboot.img
cf.
https://github.com/rear/rear/blob/rear-2.7/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh#L47
So in the "rear -D mkrescue" log file from a line like
+ source /usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh
all until the next script is sourced
i.e. all until the next line that starts with
+ source /usr/share/rear/...
which is likely a line like
+ source /usr/share/rear/output/ISO/Linux-i386/800_create_isofs.sh
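The extraction described above (everything from the "+ source" line of one script up to the next "+ source /usr/share/rear/..." line) could be done with a small awk filter. This is only a sketch: the function name extract_script_section is made up for this example, and it assumes that the debug log marks each sourced script with a line starting with "+ source /usr/share/rear/" as shown above.

```shell
# Hypothetical helper: print the part of a "rear -D" log that belongs to one script.
# $1 = script name (or any unique part of its path), $2 = log file
extract_script_section() {
    local script="$1" logfile="$2"
    awk -v s="$script" '
        /^\+ source \/usr\/share\/rear\// {
            # A script gets sourced: print from here on only if it is the wanted one
            printing = (index($0, s) > 0)
        }
        printing
    ' "$logfile"
}
```

For example, extract_script_section 700_create_efibootimg.sh followed by the path of the log file would print only the lines of that script, which also keeps unrelated and possibly sensitive parts of the log out of a public report.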
thomas-merz commented at 2023-01-11 13:49:
… I think only the part …
Here we go:
2023-01-09 18:54:05.744086285 Including output/ISO/Linux-i386/700_create_efibootimg.sh
2023-01-09 18:54:05.758244199 Entering debugscript mode via 'set -x'.
+ source /usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh
++ is_true 1
++ case "$1" in
++ return 0
++ efi_img_sz=($( du --block-size=32M --summarize $TMP_DIR/mnt ))
+++ du --block-size=32M --summarize /var/tmp/rear.106OcLQpa6JeG10/tmp/mnt
++ (( efi_img_sz += 2 ))
++ dd if=/dev/zero of=/var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img count=3 bs=32M
3+0 records in
3+0 records out
100663296 bytes (101 MB, 96 MiB) copied, 0.0787105 s, 1.3 GB/s
++ mkfs.vfat -v /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img
mkfs.fat 4.1 (2017-01-24)
/var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img has 64 heads and 32 sectors per track,
hidden sectors 0x0000;
logical sector size is 512,
using 0xf8 media descriptor, with 196608 sectors;
drive number 0x80;
filesystem has 2 16-bit FATs and 4 sectors per cluster.
FAT size is 192 sectors, and provides 49047 clusters.
There are 4 reserved sectors.
Root directory contains 512 slots and uses 32 sectors.
Volume ID is 4bdd6f20, no volume label.
++ mkdir -p -v /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
mkdir: created directory '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt'
++ mount -v -o loop -t vfat /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
mount: /dev/loop8 mounted on /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt.
++ cp -v -r /var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/. /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/BOOTX64.efi' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/BOOTX64.efi'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/grub.cfg' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/grub.cfg'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/fonts' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/fonts'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/fonts/unicode.pf2' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/fonts/unicode.pf2'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ast.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ast.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ca.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ca.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/da.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/da.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/de.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/de.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/de_CH.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/de_CH.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/en@quot.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/en@quot.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/eo.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/eo.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/es.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/es.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/fi.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/fi.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/fr.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/fr.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/gl.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/gl.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/hr.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/hr.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/hu.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/hu.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/id.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/id.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/it.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/it.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ja.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ja.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ko.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ko.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/lt.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/lt.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/nb.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/nb.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/nl.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/nl.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/pa.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/pa.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/pl.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/pl.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/pt.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/pt.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/pt_BR.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/pt_BR.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ro.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ro.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/ru.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/ru.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/sl.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/sl.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/sr.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/sr.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/sv.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/sv.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/tr.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/tr.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/uk.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/uk.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/vi.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/vi.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/zh_CN.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/zh_CN.mo'
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/zh_TW.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/zh_TW.mo'
++ umount -v /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img
umount: /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt: target is busy.
++ mv -v -f /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img /var/tmp/rear.106OcLQpa6JeG10/tmp/isofs/boot/efiboot.img
renamed '/var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/isofs/boot/efiboot.img'
+ source_return_code=0
+ test 0 -eq 0
+ cd /var/log/rear
+ test 1
+ Debug 'Leaving debugscript mode (back to previous bash flags and options settings).'
+ test 1
+ Log 'Leaving debugscript mode (back to previous bash flags and options settings).'
+ test -w /var/log/rear/rear-hostname.log
+ echo '2023-01-09 18:54:06.042704848 Leaving debugscript mode (back to previous bash flags and options settings).'
2023-01-09 18:54:06.042704848 Leaving debugscript mode (back to previous bash flags and options settings).
2023-01-09 18:54:06.070822282 Including output/ISO/Linux-i386/800_create_isofs.sh
jsmeix commented at 2023-01-11 14:11:
So this is the crucial excerpt:
mount: /dev/loop8 mounted on /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt.
++ cp -v -r /var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/. /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI'
...
'/var/tmp/rear.106OcLQpa6JeG10/tmp/mnt/./EFI/BOOT/locale/zh_TW.mo' -> '/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt/./EFI/BOOT/locale/zh_TW.mo'
++ umount -v /var/tmp/rear.106OcLQpa6JeG10/tmp/efiboot.img
umount: /var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt: target is busy.
I wonder why it is busy?
I am not a sufficient expert to understand
what goes on here behind the scenes.
Does perhaps the 'cp' command return "too early"?
I.e. 'cp' returns but the kernel is still busy with device IO?
Does perhaps the 'umount' command require '--lazy' in practice
to avoid that some IO command like 'cp' with subsequent 'umount'
could let 'umount' (sometimes) fail with "target is busy"?
What "man 8 umount" tells about 'busy' and 'lazy'
doesn't make it clear what one should normally do:
Note that a filesystem cannot be unmounted
when it is 'busy' - for example,
when there are open files on it,
or when some process has its working directory there,
or when a swap file on it is in use.
The offending process could even be umount
itself - it opens libc, and libc in its turn
may open for example locale files.
A lazy unmount avoids this problem,
but it may introduce other issues.
See --lazy description
...
-l, --lazy
Lazy unmount.
Detach the filesystem from the file hierarchy now,
and clean up all references to this filesystem
as soon as it is not busy anymore.
A system reboot would be expected in near future
if you're going to use this option for network filesystem
or local filesystem with submounts.
The recommended use-case for umount -l is to prevent hangs
on shutdown due to an unreachable network share where
a normal umount will hang due to a downed server
or a network partition.
Remounts of the share will not be possible.
@rear/contributors in particular @pcahyna
does one of you perhaps know more details about how
some IO command like 'cp' with subsequent 'umount'
works behind the scenes?
jsmeix commented at 2023-01-11 14:18:
@thomas-merz
could you try out if things behave better in your case
when you change in
output/ISO/Linux-i386/700_create_efibootimg.sh
the line
umount $v $TMP_DIR/efiboot.img
to
umount $v --lazy $TMP_DIR/efiboot.img
Alternatively or additionally when 'umount --lazy'
does not help or does not work sufficiently well
could you try out if things behave OK in your case
when you insert a line with something like sleep 3
before the 'umount' line?
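The two experiments suggested above (wait a moment and retry, or fall back to 'umount --lazy') could be combined as in the following sketch. It is not the code in 700_create_efibootimg.sh: the function name umount_with_fallback, the number of attempts and the UMOUNT_CMD variable (which only exists here so that the logic can be tried without a real mount) are assumptions made for this example.

```shell
# Hypothetical helper: try a plain umount a few times, then fall back to a lazy umount.
# $1 = what to umount, $2 = number of plain attempts (default 3)
# UMOUNT_CMD can be set to a stand-in command for testing (default: umount)
umount_with_fallback() {
    local target="$1" attempts="${2:-3}"
    local umount_cmd="${UMOUNT_CMD:-umount}"
    local try=1
    while test "$try" -le "$attempts" ; do
        "$umount_cmd" "$target" && return 0
        # Give a possible short-lived user of the mount some time to finish
        sleep 1
        try=$(( try + 1 ))
    done
    echo "Plain umount of $target failed $attempts times, trying 'umount --lazy'" >&2
    "$umount_cmd" --lazy "$target"
}
```

As the quoted man page text says, a lazy umount only detaches the filesystem and cleans up later, so it hides the reason why the target was busy instead of explaining it.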
thomas-merz commented at 2023-01-11 14:49:
Adding --lazy solved my problem
Before I had some mounts left:
df|grep var.*rear| awk '{print $6}'
/var/tmp/rear.Dn01b2YPeaEY3jK/tmp/efi_virt
/var/tmp/rear.OVhhey7uK3taoxV/tmp/efi_virt
/var/tmp/rear.CJOWoaQvuMUvyDv/tmp/efi_virt
/var/tmp/rear.IIDh3emTJrYkDY4/tmp/efi_virt
/var/tmp/rear.PxS05QU0gUOharc/tmp/efi_virt
/var/tmp/rear.XnSh08Zi7LWWAAm/tmp/efi_virt
/var/tmp/rear.l53DsAliWYCFV8c/tmp/efi_virt
/var/tmp/rear.k4nw099Q7s9UiH6/tmp/efi_virt
/var/tmp/rear.106OcLQpa6JeG10/tmp/efi_virt
/var/tmp/rear.uU7oIrbAheXLldP/tmp/efi_virt
/var/tmp/rear.NqP94taUaVv35hW/tmp/efi_virt
/var/tmp/rear.a0aCW56IS2BeEZP/tmp/efi_virt
/var/tmp/rear.zfQsgOB2se46rFQ/tmp/efi_virt
After manual umount (umount $(df|grep var.*rear| awk '{print $6}'))
and running rear mkrescue
no such mount was left
jsmeix commented at 2023-01-12 12:35:
Ok.
I will improve
https://github.com/rear/rear/pull/2909
to use 'umount --lazy'.
pcahyna commented at 2023-01-12 14:14:
@rear/contributors in particular @pcahyna does one of you perhaps know more details about how some IO command like 'cp' with subsequent 'umount' works behind the scenes?
I believe that when cp completes, it should not cause umount failures.
If there are data to be flushed to the disk, umount should wait for the
write to complete even without --lazy.
IOW, I don't understand what's going on. I suspect some process might be
using the mount (could be just by having its current working directory
in the mounted filesystem).
Ideally, it would be good to debug and understand this (for example by
using fuser -m), but I understand that it could be difficult and that
using --lazy is an easier solution.
pcahyna commented at 2023-01-12 14:21:
One could change the umount command to something like
umount $v $TMP_DIR/efiboot.img || fuser -m -v $TMP_DIR/efi_virt
to try to see the reason for umount problems.
thomas-merz commented at 2023-01-12 14:37:
@pcahyna, "one"? Will you do it? Or do you want me to test this?
jsmeix commented at 2023-01-12 14:41:
Regarding 'fuser' I saw
"Why fuser is inferior to lsof" in
https://stackoverflow.com/questions/7878707/how-to-unmount-a-busy-device
I am not at all an expert in this area.
From what I read all that looks more like pile of long grown mess
than something where things are working cleanly and consistently.
pcahyna commented at 2023-01-12 14:43:
@thomas-merz I meant that we could change the released ReaR code this way, so that any user facing the issue would get a more meaningful error message.
But if you are able to reproduce the issue reliably, please try this and report the results! I am curious.
jsmeix commented at 2023-01-12 14:43:
@thomas-merz
I would much appreciate it if you could test things
because it fails in your case so your system behaves
as we need to test real failures.
The "one" who then implements it in ReaR is me via
https://github.com/rear/rear/pull/2909
pcahyna commented at 2023-01-12 14:49:
I am not at all an expert in this area. From what I read all that looks more like pile of long grown mess than something where things are working cleanly and consistently.
I agree. The umount command uses the umount(2) or umount2(2) system
call behind the scenes. From the manual page you can see that the amount
of information the system call can accept (four boolean flags) and
return back (one of the few integer errno values without any additional
parameters that would allow to debug a problem) is really limited.
pcahyna commented at 2023-01-12 14:50:
The latter point also applies to the mount syscall where you basically
only know that "something went wrong".
jsmeix commented at 2023-01-12 14:59:
In
https://github.com/rear/rear/pull/2909
via
https://github.com/rear/rear/commit/25b532824b57d1e064bee179591c01f907478039
I added 'fuser -v -m $TMP_DIR/efi_virt' output in the log file
jsmeix commented at 2023-01-12 15:07:
Yes,
those funny 'mount' error message like:
mount: /var/tmp/rear.../outputfs: wrong fs type, bad option,
bad superblock on /dev/sd..., missing codepage or helper program,
or other error.
that tells all and nothing.
So nowadays a more modern state of the art
mount: Oops - something went wrong - we apologize
that truly tells nothing would be "more correct" ;-)
jsmeix commented at 2023-01-12 15:34:
Regarding
https://github.com/rear/rear/issues/2908#issuecomment-1380417619
If there are data to be flushed to the disk,
umount should wait for the write to complete
This is exactly what I experienced quite some time ago.
I did a 'dd' of a huge file into a mounted filesystem on a USB disk
and 'dd' completed rather soon but my subsequent plain normal 'umount'
took a rather long time while the USB disk (a real rotating disk)
made its typical ongoing access noise until all had completed
and finally 'umount' returned.
I had even asked a colleague at that time and he told me
that 'umount' is the best way in practice to be sure that
everything that needs to be written to a filesystem has been
written by the kernel when 'umount' returns.
The only thing that is left then is the cache in the disk hardware.
So after 'umount' has returned one should still wait a bit before
unplugging a USB disk (which does a hard power-off for the disk)
so that possible dirty caches on the disk itself can be flushed.
pcahyna commented at 2023-01-12 15:41:
"Why fuser is inferior to lsof" in
https://stackoverflow.com/questions/7878707/how-to-unmount-a-busy-device
The reasoning there seems to be that you can't use fuser on a
lazy-unmounted filesystem (because the filesystem is detached, so no
path can point to any file on it), while lsof would still display the
path. As you are using fuser before trying umount --lazy, I think
this reasoning does not apply and fuser will do the right thing.
pcahyna commented at 2023-01-12 15:44:
Yes, those funny 'mount' error message like:
mount: /var/tmp/rear.../outputfs: wrong fs type, bad option, bad superblock on /dev/sd..., missing codepage or helper program, or other error.
that tells all and nothing. So nowadays a more modern state of the art
mount: Oops - something went wrong - we apologize
that truly tells nothing would be "more correct" ;-)
Indeed - have fun with user reports that take the error message at face value and complain that they have a bad superblock.
pcahyna commented at 2023-01-12 15:47:
did a 'dd' of a huge file into a mounted filesystem on a USB disk
and 'dd' completed rather soon but my subsequent plain normal 'umount'
took a rather long time while the USB disk (a real rotating disk)
made its typical ongoing access noise until all had completed
and finally 'umount' returned.
Indeed, so cp should not be causing this (after all, cp should not
be doing anything special compared to dd to a file).
thomas-merz commented at 2023-01-12 16:15:
@jsmeix
I would much appreciate it if you could test things because it fails in your case so your system behaves as we need to test real failures.
The "one" who then implements it in ReaR is me via #2909
++ umount -v /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efiboot.img
umount: /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt: target is busy.
++ fuser -m -v /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt
USER PID ACCESS COMMAND
/var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt:
root kernel mount /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt
Which is still mounted:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/loop3 98094 12618 85476 13% /var/tmp/rear.qhWQgnwk6DLWmgU/tmp/efi_virt
pcahyna commented at 2023-01-12 16:40:
@thomas-merz thanks. I experimented a bit and it seems that the "kernel
mount" entry is shown by fuser for any mounted filesystem. This
indicates that the filesystem is unused, if there are no other entries
in the output. Why the unmounting fails is then even more mysterious.
Could you please add one more "umount $TMP_DIR/efiboot.img" after the
fuser? Maybe it is just some temporary glitch and a second umount
will succeed even without --lazy.
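The observation above (a lone "kernel mount" entry means that no process uses the filesystem) could be checked automatically with a small filter. This is only a sketch under the assumption that 'fuser -v -m' prints its table in the layout shown in this thread, i.e. the columns USER PID ACCESS COMMAND, optionally preceded by a "mountpoint:" field; the function name no_process_uses_mount is made up for this example.

```shell
# Hypothetical helper: read "fuser -v -m MOUNTPOINT" output on stdin.
# Returns 0 when no entry with a numeric PID is listed
# (the "kernel mount" entry has the word "kernel" in the PID column).
no_process_uses_mount() {
    awk '
        { first = 1 }
        # Some fuser versions print "mountpoint:" in front of the first entry
        $1 ~ /:$/ { first = 2 }
        # The PID column follows the USER column
        $(first + 1) ~ /^[0-9]+$/ { users++ }
        END { exit (users > 0 ? 1 : 0) }
    '
}
```

A possible use is fuser -v -m $TMP_DIR/efi_virt 2>&1 | no_process_uses_mount, where 2>&1 is needed because fuser writes most of its output to stderr.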
thomas-merz commented at 2023-01-13 08:37:
@pcahyna
Adding a second umount (without --lazy) will also unmount
$TMP_DIR/efiboot.img
Update @pcahyna - sorry, I made a mistake:
Adding a second umount (without --lazy) will NOT unmount
$TMP_DIR/efiboot.img
jsmeix commented at 2023-01-13 08:44:
I can confirm on my openSUSE Leap 15.4 system
that the "kernel mount" is always there:
# mount -v /dev/sda6 /other ; fuser -v -m /other ; umount -v /other
mount: /dev/sda6 mounted on /other.
USER PID ACCESS COMMAND
/other: root kernel mount /other
umount: /other unmounted
I also think it is a timing issue
which is my reasoning behind why in
https://github.com/rear/rear/pull/2909
I do first of all 'sleep 1' before normal 'umount'
https://github.com/rear/rear/blob/25b532824b57d1e064bee179591c01f907478039/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh#L53
jsmeix commented at 2023-01-13 09:20:
I tried to reproduce such a regular umount failure
of a loop mounted file on my openSUSE Leap 15.4 system
several times in a row in an automated way (i.e. via a script)
and for me it always works.
I have /var/tmp in the root filesystem which is
# lsblk -ipo NAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT
NAME TRAN TYPE FSTYPE SIZE MOUNTPOINT
/dev/sda sata disk 465.8G
|-/dev/sda1 part 8M
|-/dev/sda2 part crypto_LUKS 4G
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2 crypt swap 4G [SWAP]
|-/dev/sda3 part crypto_LUKS 200G
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3 crypt ext4 200G /
In contrast for @thomas-merz /var/tmp is on XFS on LVM
(excerpt from his initial description here)
NAME KNAME PKNAME TRAN TYPE FSTYPE LABEL SIZE MOUNTPOINT
...
/dev/sdb /dev/sdb disk 122G
|-/dev/sdb1 /dev/sdb1 /dev/sdb part 8M
|-/dev/sdb2 /dev/sdb2 /dev/sdb part ext3 372.5M /boot
|-/dev/sdb3 /dev/sdb3 /dev/sdb part LVM2_member 119.5G
| |-/dev/mapper/systemvg-usr_lv /dev/dm-0 /dev/sdb3 lvm xfs 15G /usr
| |-/dev/mapper/systemvg-swap_lv /dev/dm-1 /dev/sdb3 lvm swap 16G [SWAP]
| |-/dev/mapper/systemvg-root_lv /dev/dm-2 /dev/sdb3 lvm xfs 15G /
| |-/dev/mapper/systemvg-opt_lv /dev/dm-9 /dev/sdb3 lvm xfs 5G /opt
| |-/dev/mapper/systemvg-var_lv /dev/dm-11 /dev/sdb3 lvm xfs 10G /var
| |-/dev/mapper/systemvg-varlogaudit_lv /dev/dm-12 /dev/sdb3 lvm xfs 10G /var/log/audit
| |-/dev/mapper/systemvg-varlog_lv /dev/dm-13 /dev/sdb3 lvm xfs 10G /var/log
| |-/dev/mapper/systemvg-vartmp_lv /dev/dm-14 /dev/sdb3 lvm xfs 10G /var/tmp
| |-/dev/mapper/systemvg-tmp_lv /dev/dm-15 /dev/sdb3 lvm xfs 10G /tmp
| `-/dev/mapper/systemvg-home_lv /dev/dm-16 /dev/sdb3 lvm xfs 15G /home
So perhaps (only a blind guess) something in XFS or LVM
causes some tiny delay somewhere so that a loop mounted file
that actually is on XFS or LVM may sometimes show such an
inexplicable regular umount failure?
My disk 'sda' is of lsblk TRAN 'sata'.
In contrast for @thomas-merz 'sdb' there is no lsblk TRAN.
So perhaps his sdb is somewhat "unusually" connected
(his "hardware" is 'VMware')
and that causes some tiny delay somewhere that may
sometimes cause such an inexplicable regular umount failure?
thomas-merz commented at 2023-01-13 10:15:¶
@jsmeix , our disks are VMware-VMDKs on an IBM-SAN-Storagebox in virtual machines:
# dmesg -T|grep sdb
[Thu Jan 5 02:42:15 2023] sd 0:0:0:0: [sdb] 255852544 512-byte logical blocks: (131 GB/122 GiB)
[Thu Jan 5 02:42:15 2023] sd 0:0:0:0: [sdb] Write Protect is off
[Thu Jan 5 02:42:15 2023] sd 0:0:0:0: [sdb] Mode Sense: 3b 00 00 00
[Thu Jan 5 02:42:15 2023] sd 0:0:0:0: [sdb] Write cache: disabled, read cache: disabled, doesn't support DPO or FUA
[Thu Jan 5 02:42:15 2023] sdb: sdb1 sdb2 sdb3 sdb4
[Thu Jan 5 02:42:15 2023] sd 0:0:0:0: [sdb] Attached SCSI disk
[Thu Jan 5 02:42:19 2023] EXT4-fs (sdb2): mounting ext3 file system using the ext4 subsystem
[Thu Jan 5 02:42:20 2023] EXT4-fs (sdb2): mounted filesystem with ordered data mode. Opts: acl,user_xattr
Full lsblk:
# lsblk -ipo NAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT /dev/sdb
NAME TRAN TYPE FSTYPE SIZE MOUNTPOINT
/dev/sdb disk 122G
|-/dev/sdb1 part 8M
|-/dev/sdb2 part ext3 372.5M /boot
|-/dev/sdb3 part LVM2_member 119.5G
| |-/dev/mapper/systemvg-usr_lv lvm xfs 15G /usr
| |-/dev/mapper/systemvg-swap_lv lvm swap 16G [SWAP]
| |-/dev/mapper/systemvg-root_lv lvm xfs 15G /
| |-/dev/mapper/systemvg-opt_lv lvm xfs 5G /opt
| |-/dev/mapper/systemvg-var_lv lvm xfs 10G /var
| |-/dev/mapper/systemvg-varlogaudit_lv lvm xfs 10G /var/log/audit
| |-/dev/mapper/systemvg-varlog_lv lvm xfs 10G /var/log
| |-/dev/mapper/systemvg-vartmp_lv lvm xfs 10G /var/tmp
| |-/dev/mapper/systemvg-tmp_lv lvm xfs 10G /tmp
| `-/dev/mapper/systemvg-home_lv lvm xfs 15G /home
`-/dev/sdb4 part vfat 139M /boot/efi
Is there anything "unusual" for you?
pcahyna commented at 2023-01-13 10:38:¶
@jsmeix I don't think the file system type of /var/tmp matters
(although I am not 100% sure), I believe the mounted file system type is
more relevant. And that is vfat.
jsmeix commented at 2023-01-13 10:42:¶
Everything that is not usual end-user hardware is "unusual" for me
because I only have usual end-user hardware in my homeoffice.
And I have only very limited "usual end-user hardware"
(i.e. only what I actually have in my homeoffice).
"Unusual" does not mean something is wrong or broken.
But it means it is something where others cannot reproduce
how unusual (or special) hardware (or software) behaves.
Same with virtualization software:
I only use KVM/QEMU.
In particular I don't use any proprietary software.
So "VMware-VMDKs on an IBM-SAN-Storagebox"
is "very unusual" for me personally.
Things may change when you have a valid support contract
with SUSE for ReaR in a SUSE Linux Enterprise product
because for SUSE special enterprise hardware and software
is less "unusual" (but of course SUSE cannot have
each and every enterprise hardware and software).
For more information see the section
"SUSE support for Relax-and-Recover" in
https://en.opensuse.org/SDB:Disaster_Recovery
jsmeix commented at 2023-01-13 10:53:¶
I had tested with VFAT and
it all works for me (100 times in a row):
# dd if=/dev/zero of=/var/tmp/test.img count=64 bs=1M
64+0 records in
64+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.0453405 s, 1.5 GB/s
# mkfs.vfat /var/tmp/test.img
mkfs.fat 4.1 (2017-01-24)
# mkdir /var/tmp/test.mnt
# for i in $( seq 100 ) ; \
do echo test $i ; \
mount -v -o loop -t vfat /var/tmp/test.img /var/tmp/test.mnt ; \
rm -f /var/tmp/test.mnt/* ; \
dd if=/dev/urandom of=/var/tmp/test.mnt/urandom.data count=60 bs=1M ; \
umount -v /var/tmp/test.mnt || break ; \
done
test 1
mount: /dev/loop0 mounted on /var/tmp/test.mnt.
60+0 records in
60+0 records out
62914560 bytes (63 MB, 60 MiB) copied, 0.9974 s, 63.1 MB/s
umount: /var/tmp/test.mnt unmounted
test 2
mount: /dev/loop0 mounted on /var/tmp/test.mnt.
60+0 records in
60+0 records out
62914560 bytes (63 MB, 60 MiB) copied, 1.04459 s, 60.2 MB/s
umount: /var/tmp/test.mnt unmounted
.
.
.
test 100
mount: /dev/loop0 mounted on /var/tmp/test.mnt.
60+0 records in
60+0 records out
62914560 bytes (63 MB, 60 MiB) copied, 1.02499 s, 61.4 MB/s
umount: /var/tmp/test.mnt unmounted
thomas-merz commented at 2023-01-13 11:10:¶
Things may change when you have a valid support contract
with SUSE for ReaR in a SUSE Linux Enterprise product
because for SUSE special enterprise hardware and software
is less "unusual" (but of course SUSE cannot have
each and every enterprise hardware and software).
We "have a valid support contract with SUSE for ReaR in a SUSE Linux Enterprise product", but we are using ReaR 2.7 from upstream because the one provided by SUSE (2.3) is far too old: we need a more current version with Rubrik integration.
@roseswe already asked a SUSE representative and you @jsmeix on Nov. 29th to adopt ReaR 2.7 into the official SUSE repos.
So are we stuck now or can we proceed?
thomas-merz commented at 2023-01-13 11:12:¶
By "unusual to you" I meant "disabled" or "unsupported" features or Opts.
pcahyna commented at 2023-01-13 12:13:¶
So, if it fails only at the first attempt, I would advise to add
fuser -m -v $TMP_DIR/efi_virt
before the first (failing) umount.
I agree it is a timing issue and it would be good to know if it is
purely in the kernel or if there is some userspace process involved (in
the latter case, fuser before the failing umount should reveal it). Could
you please do this test, @thomas-merz ?
pcahyna commented at 2023-01-13 12:16:¶
sorry, now I saw the update:
I did make a mistake:
Adding a second umount (without --lazy) will NOT unmount $TMP_DIR/efiboot.img
so, maybe it is not a timing issue after all.
jsmeix commented at 2023-01-13 12:26:¶
@thomas-merz
providing ReaR 2.7 as new RPM package 'rear27a' for SLE-HA-15
("SUSE Linux Enterprise High Availability Extension version 15")
is already done by me (as the RPM package maintainer at SUSE)
which means currently it is work in progress by others at SUSE
to get that new RPM package rear27a into the SLE-HA-15 product
so that finally it becomes officially available for SUSE customers.
That you use ReaR 2.7 from ReaR upstream is much appreciated and
recommended by me and I support you with that here at ReaR upstream
because this way we can right now proceed to get it working for you
as you need it in your specific case (e.g. fixing issues like this one)
and we at ReaR upstream can learn new (special) things how we could
further improve ReaR so that it works in even more special cases
(there are already lots of things in ReaR to deal with special cases).
Because you reported your issue publicly at ReaR upstream
all ReaR upstream developers can work together on a proper fix.
In particular @pcahyna helps me so much with ReaR issues.
To be able to improve ReaR (in particular for special cases)
we at ReaR upstream depend very much on contributions by users.
What I appreciate most of all is that you @thomas-merz
as an actual user of ReaR (i.e. someone who sits in front
of a system where ReaR is used and who is 'root' there)
directly works together with us at ReaR upstream.
This helps you to get your specific issue fixed
as well as possible and as fast as possible
"nothing is better and faster than ReaR upstream" ;-)
and it helps us at ReaR upstream to understand
a special case to be able to deal with it properly
"nothing is better than direct collaboration with users" :-)
So nothing is stuck.
Let us just proceed here as we currently do.
From my point of view all works perfectly well here.
In the end all will benefit from how we do things here:
First of all you because your specific issue gets fixed right now.
Later when it is fixed generically in ReaR also other users who use
ReaR upstream benefit because they will not be hit by this issue.
Finally Linux distributions benefit when they provide a newer ReaR
version to their users who will benefit last.
jsmeix commented at 2023-01-13 12:31:¶
Oops!
I also did not notice the update in
https://github.com/rear/rear/issues/2908#issuecomment-1381481497
I assume this happened because GitHub does not send
its usual notification e-mail when a comment was updated
and at least I depend on GitHub's notification e-mails.
thomas-merz commented at 2023-01-13 13:23:¶
@pcahyna
add
fuser -m -v $TMP_DIR/efi_virt
before the first (failing) umount.
This little "slow down" by fuser seems to be sufficient to make
umount successfully unmount the filesystem:
++ fuser -m -v /var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efi_virt
USER PID ACCESS COMMAND
/var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efi_virt:
root kernel mount /var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efi_virt
++ umount -v /var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efiboot.img
umount: /var/tmp/rear.dm7v6Gulm4z7wnA/tmp/efi_virt (/dev/loop0) unmounted
jsmeix commented at 2023-01-13 13:58:¶
So it looks again as if some simple
artificial 'sleep 1' delay before 'umount'
should sufficiently avoid such issues
which I do currently in
https://github.com/rear/rear/pull/2909
in the
# Umounting the EFI virtual image:
...
code section in
https://github.com/rear/rear/blob/25b532824b57d1e064bee179591c01f907478039/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh
@thomas-merz
could you try out if using
/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh
as in
https://github.com/rear/rear/blob/25b532824b57d1e064bee179591c01f907478039/usr/share/rear/output/ISO/Linux-i386/700_create_efibootimg.sh
makes that umount work sufficiently reliably for you?
You can see the actual code change in
https://github.com/rear/rear/pull/2909/files
(long lines may be shown wrapped in the browser).
You only need the change for
output/ISO/Linux-i386/700_create_efibootimg.sh
thomas-merz commented at 2023-01-13 15:13:¶
@jsmeix , sleep 1 also works fine:
++ sleep 1
++ umount -v /var/tmp/rear.6CEkscr2qDP8VI5/tmp/efiboot.img
umount: /var/tmp/rear.6CEkscr2qDP8VI5/tmp/efi_virt (/dev/loop0) unmounted
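The "delay, then umount" pattern confirmed above can be generalized into a small retry helper. This is only a sketch and not the code from the pull request; the function name retry_after_delay and its arguments are hypothetical:

```bash
# Hypothetical helper, not taken from ReaR:
# wait, then run a command, and repeat until it succeeds
# or the maximum number of attempts is reached.
# Usage: retry_after_delay DELAY_SECONDS MAX_ATTEMPTS COMMAND [ARGS...]
retry_after_delay() {
    local delay="$1" max_attempts="$2" attempt=1
    shift 2
    while [ "$attempt" -le "$max_attempts" ] ; do
        # Sleep before every attempt, including the first one,
        # like the 'sleep 1' before 'umount' discussed above.
        sleep "$delay"
        if "$@" ; then
            return 0
        fi
        echo "Attempt $attempt of $max_attempts failed: $*" >&2
        attempt=$(( attempt + 1 ))
    done
    return 1
}
```

For the case in this issue a call could look like `retry_after_delay 1 3 umount -v "$TMP_DIR/efiboot.img"`. Whether more than one attempt helps at all is an open question here, since a second umount without any delay did not succeed.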
pcahyna commented at 2023-01-13 19:26:¶
since there does not seem to be a process blocking it (although we are
not 100% sure), maybe the filesystem just needs syncing?
Could you please try adding sync --file-system $TMP_DIR/efi_virt
before the first umount attempt (without the sleep and the fuser)?
thomas-merz commented at 2023-01-13 20:32:¶
I did:
++ sync --file-system /var/tmp/rear.9nhtp3oeToiJMS5/tmp/efi_virt
++ umount -v /var/tmp/rear.9nhtp3oeToiJMS5/tmp/efiboot.img
umount: /var/tmp/rear.9nhtp3oeToiJMS5/tmp/efi_virt: target is busy.
And voila - it's still mounted:
Filesystem Size Used Avail Use% Mounted on
/dev/loop0 96M 13M 84M 13% /var/tmp/rear.9nhtp3oeToiJMS5/tmp/efi_virt
pcahyna commented at 2023-01-13 20:48:¶
@thomas-merz thanks for trying...
jsmeix commented at 2023-01-26 14:18:¶
An addendum regarding fuser -v -m /path/to/something
see in
https://github.com/rear/rear/pull/2909
my recent commit
https://github.com/rear/rear/pull/2909/commits/f770c69ee7e2fadc7137a37b3c5596da0b62aedb
why one should use fuser with its '-M' option
fuser -v -M -m /path/to/something
instead of plain fuser -v -m /path/to/something
For example (on my openSUSE Leap 15.4 system):
# mount -v /dev/sda6 /other
mount: /dev/sda6 mounted on /other.
# fuser -v -m /other
USER PID ACCESS COMMAND
/other: root kernel mount /other
# fuser -v -M -m /other
USER PID ACCESS COMMAND
/other: root kernel mount /other
# umount -v /other
umount: /other unmounted
# fuser -v -m /other
USER PID ACCESS COMMAND
/other: root kernel mount /
root 1 .rce. systemd
root 2 .rc.. kthreadd
...
[about 300 more lines]
...
root 31953 .rc.. kvm-pit/31921
johannes 32016 .rce. ssh
johannes 32018 .rce. ssh-agent
# fuser -v -M -m /other
Specified filename /other is not a mountpoint.
pcahyna commented at 2023-02-14 11:45:¶
Hi @jsmeix , good catch with -M, indeed, if the filesystem is not
mounted at this point, the output would be very misleading.
Unfortunately, RHEL 6 does not yet have the fuser -M flag, so I
suspect some of the SLES or openSUSE variants that you care about will
have the same problem.
jsmeix commented at 2023-02-14 12:01:¶
Hello @pcahyna
if you have a system where 'fuser' does not support '-M'
could you show me what 'fuser -v -M -m /dir' results?
Does it perhaps just ignore '-M' or does it error out?
pcahyna commented at 2023-02-14 12:27:¶
Hello @pcahyna
if you have a system where 'fuser' does not support '-M' could you show me what 'fuser -v -M -m /dir' results? Does it perhaps just ignore '-M' or does it error out?
# mount
/dev/mapper/vg_ciscoc240m301-lv_root on / type ext4 (rw)
(...)
/dev/sda1 on /boot type ext4 (rw)
# fuser -v -M -m /boot
M: unknown signal; fuser -l lists signals.
# echo $?
1
Weird, huh? That's on RHEL 6.
The manual page says:
fuser [-a|-s|-c] [-4|-6] [-n space ] [-k [-i] [-signal ] ] [-muvf] name ...
and we get bitten by the -signal part.
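One possible way to get the effect of '-M' on systems where fuser lacks that option is to test for a mountpoint first and only then call plain 'fuser -v -m'. This is a sketch under assumptions and not what ReaR does: the function name fuser_on_mountpoint is hypothetical, and it assumes the 'mountpoint' utility is available on the system:

```bash
# Hypothetical wrapper, not part of ReaR:
# run 'fuser -v -m' only when DIR is an actual mountpoint, so that an
# unmounted directory does not produce the misleading listing for '/'.
fuser_on_mountpoint() {
    local dir="$1"
    if ! mountpoint -q "$dir" 2>/dev/null ; then
        echo "$dir is not a mountpoint, skipping fuser" >&2
        # Return code 2 is chosen arbitrarily for this case.
        return 2
    fi
    # fuser sends its verbose table to stderr, so merge it into stdout.
    fuser -v -m "$dir" 2>&1
}
```

A call like `fuser_on_mountpoint "$TMP_DIR/efi_virt"` would then behave the same regardless of whether the installed fuser knows '-M'.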
jsmeix commented at 2023-03-20 12:18:¶
Regarding in
https://github.com/rear/rear/issues/2908#issuecomment-1429605894
what SLES versions support 'fuser ... -M ...':
SLES11 does not support 'fuser ... -M ...'
SLES12 and later support 'fuser ... -M ...'
Since ReaR 2.7 is released I do no longer care about SLES11, cf.
https://github.com/rear/rear/pull/2949#discussion_r1124225289
jsmeix commented at 2023-03-22 12:25:¶
With
https://github.com/rear/rear/pull/2909
merged
this issue should be sufficiently avoided.
[Export of Github issue for rear/rear.]