#2909 PR merged: Inform the user when it could not umount something

Labels: enhancement, fixed / solved / done

jsmeix opened issue at 2023-01-10 11:48:

  • Type: Enhancement

  • Impact: Normal

  • Reference to related issue (URL):
    https://github.com/rear/rear/issues/2908

  • How was this pull request tested?
    Not tested by me.

  • Brief description of the changes in this pull request:

In output/ISO/Linux-i386/700_create_efibootimg.sh
make umounting the EFI virtual image more fail-safe:
In any case 'sleep 1' after 'cp' before normal 'umount'
and if this fails try 'umount --lazy' and if that also fails
show the user a LogPrintError (and proceed bona fide)
so the user can understand why later at the end
cleanup_build_area_and_end_program() may show

Could not remove build area

when lazy umount could not clean up things until then.

In restore/NETFS/default/400_restore_backup.sh and
same also in restore/YUM/default/410_restore_backup.sh
it calls umount "$BUILD_DIR/outputfs"
because BUILD_DIR/outputfs is needed as mountpoint
to mount something (a medium that contains the backup)
so inform the user about an umounting failure
(if there was something mounted at BUILD_DIR/outputfs)
so he can understand why the subsequent mounting fails
or why several things got mounted stacked one over the other.

jsmeix commented at 2023-01-12 13:56:

Primarily for my own information
here some links about 'umount --lazy' and related things:

https://superuser.com/questions/412109/lazy-umount-or-unmounting-a-busy-disk-in-linux

https://stackoverflow.com/questions/7878707/how-to-unmount-a-busy-device

https://serverfault.com/questions/1101277/how-does-a-lazy-unmount-actually-work

https://unix.stackexchange.com/questions/385645/how-do-i-know-when-lazy-umount-l-completes

https://www.unix.com/linux/119194-umount-l-dangerous.html

https://unix.stackexchange.com/questions/111779/how-to-find-out-easily-whether-a-block-device-or-a-part-of-it-is-mounted-someh

jsmeix commented at 2023-01-12 14:16:

@pcahyna
in my current change here for
output/ISO/Linux-i386/700_create_efibootimg.sh
I call only plain 'umount --lazy' without additional '--force'
so I cannot use the umount_mountpoint_lazy function here
because that calls umount $v -f -l $mountpoint.

I do not like '--lazy' with additional '--force'
because "man umount" tells:

-f, --force
Force an unmount (in case of an unreachable NFS system).
Note that this option does not guarantee that umount command does not hang.

As usual the description is rather terse and does not explain things
so one has to imagine what could be meant.

Currently I understand this as if '--force' may hang up
because in general to enforce something one must try hard
to get that something done which means "don't give up"
but "try hard endlessly" so it may never finish.

In contrast '--lazy' has opposite meaning
so 'umount --lazy' basically always returns and
lets things clean up on its own in the background,
e.g. see
https://serverfault.com/questions/1101277/how-does-a-lazy-unmount-actually-work
that reads (excerpt):

Lazy umount is a regular umount, which happens in background.
In addition is hides the mount entry, which makes it looks
like umount have worked (or Detach the filesystem from
the file hierarchy, how man page says). In really the
unmount will happen when filesystem is not in use anymore.

Thus, for a real umount one should use --force and wait ...

So I think '--lazy' and '--force' contradict each other.

Or what do I misunderstand here?

pcahyna commented at 2023-01-12 14:23:

@jsmeix I am not sure myself. The impression I get from the manual page is that the main motivation for umount options is the case of NFS mount where the server became unreachable. I am not even sure whether -f does anything useful in the case of local filesystems.

schlomo commented at 2023-02-13 20:06:

We still do umount ... || LogPrintError ... so that rear will happily continue even in case of an umount error.

I'm wondering if we maybe should be more strict and actually StopIfError to avoid follow-up problems like reported by @thomas-merz originally? That way he would have noticed the problem via a direct error message from ReaR instead of via some other monitoring noticing lots of new filesystems...

Also, maybe the discussed sleep 1 would also be a good solution instead of a cascae of umount attempts?

I find the fuser really useful in case of umount troubles, great idea!

And finally, maybe umount should be converted into a ReaR function to always have that error handling attached to it?

jsmeix commented at 2023-02-14 10:59:

It is my goal behind to get in the end a
sufficiently simple and reliably working
umounting function in ReaR.

But this will take some time in particular
to get sufficient experience how things behave
for real users out there on their systems.

What "just works" for us on our clean test systems
may terribly fail for users on problematic systems.

E.g. see https://github.com/rear/rear/issues/2908
As far as I can remember we never had an issue
like that with plain and simple 'umount' before.

E.g. see my comment in this pull request code
how 'fuser' must be called to avoid possibly
scaring and misleading output in certain cases.

Bottom line:
RFC 1925 item (8) It is more complicated than you think.

That one should even better read:
In practice it is more complicated than one imagines in theory.

jsmeix commented at 2023-03-20 12:32:

@rear/contributors
I would like to merge it tomorrow afternoon
unless there are objections.

This is only a first step towards a (hopefully) sufficiently
simple and reliably working umounting function in ReaR, cf.
https://github.com/rear/rear/pull/2909#issuecomment-1429530985

pcahyna commented at 2023-03-21 16:00:

jsmeix with the addition of sleep it helps with #2908, right?

jsmeix commented at 2023-03-22 12:12:

Yes, the "sleep 1 and retry" helps in particular with
issues like https://github.com/rear/rear/issues/2908
but in general I prefer to try simple things first,
see my new comments via my recent
https://github.com/rear/rear/pull/2909/commits/d6559072f4bc64c5ba18ff2fdd03b410ad340ace

When normal umount works after simple 'sleep 1'
it avoids in particular possible issues with 'fuser -M'
on older Linux distributions that do not support '-M'.


[Export of Github issue for rear/rear.]