#3109 Issue open: rear should exclude NFS mounts with exclude list instead of tar one-filesystem feature

Labels: enhancement, discuss / RFC, support / question, external tool

bwelterl opened issue at 2023-12-15 09:37:

  • ReaR version ("/usr/sbin/rear -V"):
    rear-2.6-19.el9.x86_64

  • If your ReaR version is not the current version, explain why you can't upgrade:

  • OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):
    Red Hat Enterprise Linux release 9.3 (Plow)

  • Description of the issue (ideally so that others can reproduce it):
    If there is an NFS mount on the system and this mount hangs (for example because the VM serving the NFS export is suspended), the tar command will hang because, even with '--one-file-system', it tries to access the mount point.

  • Workaround, if any:
    Add the mount path to the exclude lists EXCLUDE_MOUNTPOINTS or BACKUP_PROG_EXCLUDE (a sketch follows below).
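
    For example (a minimal sketch; /nfs_mount is a hypothetical path), in /etc/rear/local.conf:

    # keep ReaR away from the mountpoint entirely:
    EXCLUDE_MOUNTPOINTS+=( /nfs_mount )
    # or exclude only its contents from the tar archive:
    BACKUP_PROG_EXCLUDE+=( '/nfs_mount/*' )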

The idea is to detect NFS mount points and add them directly to the exclude list, so that the directory is never accessed. The same should apply to other network filesystems.
I will also check with the tar project whether the tool itself can be improved.

Thank you !

Benoit

jsmeix commented at 2023-12-15 11:28:

@bwelterl
only to be on the safe side:

When 'tar' is called with explicitly specified files or directories,
it will archive all explicitly specified files or directories
regardless of which filesystems they are on.
I.e. '--one-file-system' only means that 'tar' will
not automatically descend into other filesystems.

So if something belonging to that hanging NFS mountpoint
is explicitly specified, then 'tar' will try to archive it.

Therefore, to be on the safe side, check the 'tar' command
that is shown in the ReaR log file to see which files
and directories are explicitly specified to be archived.
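
A minimal illustration of that semantics (a sketch with hypothetical paths,
where /nfs is assumed to be a mountpoint of another filesystem):

# /nfs is archived despite '--one-file-system',
# because it is explicitly listed on the command line:
tar --one-file-system -cf backup.tar /data /nfs
# here tar does not descend into mountpoints below /data:
tar --one-file-system -cf backup.tar /data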

bwelterl commented at 2023-12-15 11:43:

Hi,

Even with '--one-file-system', tar will send a request to the NFS server, even when nothing under the mount point is explicitly included.

There is no explicit inclusion of anything on the NFS mountpoint in the tar command, for example:

tar --warning=no-xdev --sparse --block-number --totals \
    --verbose --no-wildcards-match-slash --one-file-system \
    --ignore-failed-read --anchored --xattrs \
    --xattrs-include=security.capability \
    --xattrs-include=security.selinux --acls --gzip \
    -X /tmp/rear.Flepiaf/tmp/backup-exclude.txt \
    -C / -c -f - \
    /var/crash /home /local /opt / /tmp /var /boot/efi \
    /boot /var/log/rear/rear-server.log

A simple tar of / with a mount point in /nfs_mount will show the issue.

The idea is to improve the robustness of the backup.

Just to be sure, do you confirm that mounts are excluded by the '--one-file-system' parameter of tar and not by something else?

Thank you

Benoit

jsmeix commented at 2023-12-15 13:39:

I can reproduce it.

On a VM with IP 192.168.122.142 I export '/nfs/',
which contains only a single regular file '/nfs/nfshello'.

On my laptop (which is the host of that VM) I do:

# mkdir /tmp/test-nfs-tar

# echo localhello >/tmp/test-nfs-tar/localhello

# mkdir /tmp/test-nfs-tar/nfs.mountpoint

# mount -v -t nfs -o nfsvers=3,nolock 192.168.122.142:/nfs /tmp/test-nfs-tar/nfs.mountpoint
mount.nfs: timeout set for Fri Dec 15 14:13:17 2023
mount.nfs: trying text-based options 'nfsvers=3,nolock,addr=192.168.122.142'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 192.168.122.142 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 192.168.122.142 prog 100005 vers 3 prot UDP port 20048

linux-h9wr:~ # mount | grep nfs
...
192.168.122.142:/nfs on /tmp/test-nfs-tar/nfs.mountpoint type nfs (rw,relatime,vers=3,rsize=262144,wsize=262144,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.122.142,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=192.168.122.142)

# find /tmp/test-nfs-tar
/tmp/test-nfs-tar
/tmp/test-nfs-tar/localhello
/tmp/test-nfs-tar/nfs.mountpoint
/tmp/test-nfs-tar/nfs.mountpoint/nfshello

# tar -cvf test-nfs-tar.tar --one-file-system /tmp/test-nfs-tar
tar: Removing leading `/' from member names
/tmp/test-nfs-tar/
/tmp/test-nfs-tar/localhello
/tmp/test-nfs-tar/nfs.mountpoint/
tar: /tmp/test-nfs-tar/nfs.mountpoint/: file is on a different filesystem; not dumped

# tar -tvf test-nfs-tar.tar
drwxr-xr-x root/root         0 2023-12-15 14:11 tmp/test-nfs-tar/
-rw-r--r-- root/root        11 2023-12-15 14:10 tmp/test-nfs-tar/localhello
drwxr-xr-x root/root         0 2023-12-15 14:03 tmp/test-nfs-tar/nfs.mountpoint/

I paused the VM
and then I get on my laptop:

# tar -cvf test-nfs-tar.2.tar --one-file-system /tmp/test-nfs-tar
tar: Removing leading `/' from member names
/tmp/test-nfs-tar/
/tmp/test-nfs-tar/localhello

where it "just hangs" - meanwhile since several minutes
(I don't know if that will ever time out / error out).

I resumed the VM
and then I get on my laptop:

/tmp/test-nfs-tar/nfs.mountpoint/
tar: /tmp/test-nfs-tar/nfs.mountpoint/: file is on a different filesystem; not dumped

# echo $?
0

i.e. the hanging 'tar' finished successfully.

A second test with 'tar excludes' on my laptop:

# tar -cvf test-nfs-tar.2.tar --one-file-system --exclude='nfs.mountpoint' /tmp/test-nfs-tar
tar: Removing leading `/' from member names
/tmp/test-nfs-tar/
/tmp/test-nfs-tar/localhello

# tar -tvf test-nfs-tar.2.tar
drwxr-xr-x root/root         0 2023-12-15 14:11 tmp/test-nfs-tar/
-rw-r--r-- root/root        11 2023-12-15 14:10 tmp/test-nfs-tar/localhello

I paused the VM again,
and with 'tar excludes' I get this on my laptop:

# tar -cvf test-nfs-tar.2.tar --one-file-system --exclude='nfs.mountpoint' /tmp/test-nfs-tar
tar: Removing leading `/' from member names
/tmp/test-nfs-tar/
/tmp/test-nfs-tar/localhello

i.e. with 'tar excludes' nothing hangs.

The difference between 'tar --one-file-system' without '--exclude'
and with '--exclude' is that without '--exclude' the mountpoint
directory 'tmp/test-nfs-tar/nfs.mountpoint/' is included in the
archive, while with --exclude='nfs.mountpoint' it is excluded.

So excluding NFS mounts with 'tar --exclude'
would also exclude the NFS mountpoint directories,
so that after a restore of such a backup
the NFS mountpoint directories would not be there.

I.e. excluding NFS mounts with 'tar --exclude'
would make the backup incomplete and cause subsequent issues:
mounting the NFS shares would no longer work,
because their mountpoint directories are not in the backup
and therefore cannot be restored.

I think it could be rather annoying when, after "rear recover",
all NFS mountpoint directories are missing
and each needs to be manually recreated,
possibly after some lengthy investigation to find out
what all the NFS mountpoint directory paths must be,
e.g. by searching system log files for error messages like

mount.nfs: mount point /tmp/test-nfs-tar/nfs.mountpoint does not exist

bwelterl commented at 2023-12-15 13:45:

Hello Johannes,

Thanks for the test on your side.
Good point that the mount point will not be recreated.

If we add detection of mount points to the logic, we can also recreate them during recovery.

But I admit this adds complexity... It's easier to have a working filesystem, or to have tar not send requests to the NFS server.

Thank you

Benoit

jsmeix commented at 2023-12-15 13:50:

You are right:
what got excluded by ReaR could be recreated during recovery.

jsmeix commented at 2023-12-15 13:58:

Perhaps ReaR should in general store and recreate
all mountpoint directories regardless of the backup?

I am thinking about something similar
(i.e. with the same basic idea behind)
as what we already do with FHS directories via
prep/default/400_save_directories.sh
and
restore/default/900_create_missing_directories.sh

Hmm ... in prep/default/400_save_directories.sh I see

# First save directories that are used as mountpoints:

so it seems we do that already, so that
"just excluding" NFS mounts with 'tar --exclude'
may even "just work" thanks to existing old magic in ReaR?

jsmeix commented at 2023-12-15 14:04:

Likely not - because - as far as I see - the comments in
prep/default/400_save_directories.sh
explain why we cannot save NFS mountpoint directories:

# Furthermore automounted NFS filesystems can cause this script to hang up if NFS server fails
# because the below 'stat' command may then wait indefinitely for the NFS server to respond.

so - at least at first glance - it seems
it is impossible to save NFS mountpoint directories
in a fail-safe way (in particular without a possible hang-up),
regardless of whether it is done by 'tar' or otherwise,
because NFS hang-ups happen too deep in the system (kernel),
so userland tools cannot check whether an access would hang
and can only blindly try and hope for the best.

jsmeix commented at 2023-12-15 14:12:

As usual - simple things get complex,
cf. RFC 1925 item (8):

It is more complicated than you think.

For mountpoint directories we may inspect
what 'mount' reports, cf. above:

linux-h9wr:~ # mount | grep nfs
...
192.168.122.142:/nfs on /tmp/test-nfs-tar/nfs.mountpoint type nfs (rw,relatime,vers=3,rsize=262144,wsize=262144,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.122.142,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=192.168.122.142)

so we know at least the mountpoint directory names
and could recreate them with default owner, permissions, etc.,
which should normally be OK for mountpoint directories.
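
A minimal sketch of that idea (not actual ReaR code; reading
/proc/mounts does not touch the mounted filesystems themselves):

# collect network filesystem mountpoints and emit 'mkdir' commands
# that could be replayed during "rear recover" to recreate them:
while read -r device mountpoint fstype options rest ; do
    case $fstype in
        (nfs|nfs4|cifs)
            echo "mkdir -p '$mountpoint'"
            ;;
    esac
done </proc/mounts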

jsmeix commented at 2023-12-15 14:46:

Oh - surprisingly 'stat' on an NFS mountpoint directory
seems to work, at least for me with SLES15-SP5,
even when the NFS server VM is paused:

# stat -c '%n %a %U %G' /tmp/test-nfs-tar/nfs.mountpoint
/tmp/test-nfs-tar/nfs.mountpoint 755 root root

In contrast

# ls /tmp/test-nfs-tar/nfs.mountpoint

"just hangs" when the NFS server VM is paused (as expected).

pcahyna commented at 2023-12-19 17:05:

Interesting. What about findmnt /tmp/test-nfs-tar/nfs.mountpoint ? It has a better parseable output than mount, and it returns exit code 1 when the argument is not a mountpoint.
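
For illustration, a sketch (assuming util-linux findmnt):

# raw, no-headings list of only the NFS mountpoint paths:
findmnt -rn -t nfs,nfs4 -o TARGET
# exit code test - nonzero when the argument is not a mountpoint:
findmnt /tmp/test-nfs-tar/nfs.mountpoint >/dev/null && echo "is a mountpoint"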

jsmeix commented at 2023-12-20 10:52:

Both

# findmnt /tmp/test-nfs-tar/nfs.mountpoint

and

# mountpoint /tmp/test-nfs-tar/nfs.mountpoint

work at least for me with SLES15-SP5
even when the NFS server VM is paused.

But

# ls -d /tmp/test-nfs-tar/nfs.mountpoint

"just hangs" when the NFS server VM is paused
which is not expected, because 'ls -d' should only
"list directories themselves, not their contents"
according to "man ls".

jsmeix commented at 2023-12-20 10:54:

Oops!
Correction of my above
https://github.com/rear/rear/issues/3109#issuecomment-1857998156

Now also

# stat -c '%n %a %U %G' /tmp/test-nfs-tar/nfs.mountpoint

"just hangs" when the NFS server VM is paused
in contrast to what I had falsely written in
https://github.com/rear/rear/issues/3109#issuecomment-1857998156

pcahyna commented at 2023-12-20 11:03:

Maybe some time-limited cache was in play?

jsmeix commented at 2023-12-20 11:03:

Plain

# stat -c '%n' /tmp/test-nfs-tar/nfs.mountpoint

works at least for me when the NFS server VM is paused.
But this command is rather meaningless, as it shows only
the name '/tmp/test-nfs-tar/nfs.mountpoint', which is already known.

jsmeix commented at 2023-12-20 11:06:

Oops! - Oops!
Another correction of my recent
https://github.com/rear/rear/issues/3109#issuecomment-1864267013

I got some inexplicable behaviour in further tests,
so I unmounted the NFS share and mounted it again.

Now

# stat -c '%n %a %U %G' /tmp/test-nfs-tar/nfs.mountpoint

works again at least for me with SLES15-SP5
even when the NFS server VM is paused.

After some more pause and resume of the NFS server VM

# stat -c '%n %a %U %G' /tmp/test-nfs-tar/nfs.mountpoint

"just hangs" when the NFS server VM is paused.

Lesson learned:
NFS only works predictably as long as all is well.
As soon as something no longer works well with NFS,
the results one gets as a user on a client are arbitrary.

jsmeix commented at 2023-12-20 11:13:

@pcahyna
perhaps some time-limited cache or something else.

What matters is that one cannot rely on NFS
behaving predictably from a user's point of view.
Of course from the kernel's point of view everything
behaves as intended (i.e. as programmed).

gdha commented at 2023-12-22 10:14:

@jsmeix Perhaps we can check for stale NFS mount points and exclude these somehow? I have a script which I use a lot for these kinds of activities - see https://github.com/gdha/lnxutils/blob/master/opt/ncsbin/lnxutils/nfs_stale_handle_check.sh

schlomo commented at 2024-01-13 18:39:

Maybe ReaR NETFS backup should simply exclude by default all network filesystems?

With regard to recreating mountpoints, I'd do this only for stuff mentioned in /etc/fstab and not for other, probably manually created, mounts. I think that trying to do more is way too much complexity and will have a high degree of "doing the wrong thing with the best intentions".
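
A minimal sketch of that /etc/fstab-only approach (not actual ReaR code):

# print the mountpoints of network filesystems listed in /etc/fstab,
# skipping comment lines - only these would be recreated:
awk '$1 !~ /^#/ && $3 ~ /^(nfs|nfs4|cifs)$/ { print $2 }' /etc/fstab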

pcahyna commented at 2024-01-15 13:42:

> Maybe ReaR NETFS backup should simply exclude by default all network filesystems?

I believe it already does, no? --one-file-system excludes everything that is not explicitly included, and network filesystems, I believe, are not explicitly included.

The problem here is that tar hangs while trying to determine whether a directory is a mountpoint.

> With regard to recreating mountpoints I'd do this only for stuff mentioned in /etc/fstab and not for other, probably manually created, mounts. I think that trying to do that is way too much complexity and will have a high degree of "doing the wrong thing with the best intentions"

We need to recreate mountpoints for all filesystems, because the layout contains all the mounted local filesystems and without recreating the mountpoints there would not be anything to mount them to. Or did you mean "recreating filesystems" instead of "recreating mountpoints"?

schlomo commented at 2024-01-15 13:53:

@pcahyna it seems to me that we need to specifically exclude the mountpoints of network filesystems themselves, as that seems to be the root cause of this issue. As @jsmeix was able to show in https://github.com/rear/rear/issues/3109#issuecomment-1857897456, just "looking" at the mountpoint makes the system hang if the NFS server is down, so I'd hope that excluding the mountpoint itself would allow tar to work in this scenario.

I did mean recreating mountpoints, and I think that we should completely ignore manual mounts of network filesystems: we don't need to recreate their mountpoints beyond the standard directories that we always create (e.g. /media or /mnt). My point was not about local filesystems but about network filesystems.

jsmeix commented at 2024-01-15 14:02:

Finally I found time to have a look at
https://github.com/gdha/lnxutils/blob/master/opt/ncsbin/lnxutils/nfs_stale_handle_check.sh

As far as I see, its basic idea is
to run a command that may never return under 'timeout':

timeout ... COMMAND

I searched the ReaR scripts for usage of the 'timeout' command
but, interestingly, I found nothing.

I think there could be several cases where it is not certain
that a command will return (within reasonable time),
so it could in general be a good idea to run
commands that might not return as

timeout ... COMMAND
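
For example, a probe like the following could decide whether a mountpoint
is safe to touch (a sketch, not existing ReaR code; the 5-second limit,
the extra 2 seconds before SIGKILL, and the $mountpoint variable are
made up for illustration):

# 'timeout' sends SIGTERM after 5 seconds and SIGKILL 2 seconds later ('-k 2'),
# so even a 'stat' stuck inside NFS kernel code cannot hang the whole run:
if timeout -k 2 5 stat -c '%a %U %G' "$mountpoint" >/dev/null 2>&1 ; then
    echo "mountpoint '$mountpoint' responds"
else
    echo "mountpoint '$mountpoint' seems unresponsive - excluding it"
fi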

pcahyna commented at 2024-01-15 14:21:

@schlomo I suppose you mean explicit exclusion via BACKUP_PROG_EXCLUDE in addition to the implicit exclusion provided by --one-file-system. How to reliably determine the list of mountpoints, though? I am afraid that any utility we might use for that can access the filesystem and hang, cf. the stat example above.

pcahyna commented at 2024-01-16 11:20:

I believe that not recreating mountpoints by default would not be a good idea. Consider a system where one mounts a network share at /mnt/foo. It is not permanently mounted, so it is not present in /etc/fstab; mounting and unmounting is done manually or by a script. If a ReaR backup is taken at a moment when the share is mounted, the mountpoint (the empty directory itself) would not be backed up, and mounting it after the system is destroyed and recovered from the backup would fail. It would work fine if the backup is taken at a moment when the share is not mounted, and to me it looks wrong to make the content of the backup depend on which filesystems happen to be mounted at that moment (if the filesystems are excluded anyway).

I checked and found that the above should actually not happen, as ReaR saves and restores all mountpoints independently of the backup stage and its exclusions. The code is here: https://github.com/rear/rear/blob/master/usr/share/rear/prep/default/400_save_directories.sh

So, in fact, excluding mountpoints from the backup should be safe, and the script above could even provide inspiration for the implementation (as it also gathers the list of all mountpoints).
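
A sketch of what such an implementation could look like (hypothetical
code; BACKUP_PROG_EXCLUDE is the existing ReaR array):

# gather network filesystem mountpoints from /proc/mounts
# (no access to the filesystems themselves is involved)
# and exclude their contents from the tar archive:
while read -r device mountpoint fstype options rest ; do
    case $fstype in
        (nfs|nfs4|cifs)
            BACKUP_PROG_EXCLUDE+=( "$mountpoint/*" )
            ;;
    esac
done </proc/mounts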

jsmeix commented at 2024-01-16 13:28:

The more I think about this in general,
the more I come to the conclusion
that in general (i.e. barring exceptions)
ReaR should not deal with broken things
or somehow avoid or circumvent broken things.

My main reason is that I think it takes, in practice,
unreasonable and endless effort to implement code
that still works even when things are broken.

The exceptional cases are when there is a simple and
generic way, one that works reliably and stably over time,
to deal with or avoid or circumvent broken things.

Regarding unresponsive NFS mounts:

I am not at all an NFS expert, but I think
a mounted NFS share behaves basically the
same as a mounted filesystem on a local disk
(as far as I know this is the basic idea behind NFS).
When a local disk filesystem becomes unresponsive,
all access hangs (as far as I know without timeout,
usually in read/write syscalls, i.e. in the kernel)
until it responds, or ad infinitum.
I think basically no software implements
dealing with unresponsive filesystem access
(which would basically mean working around the kernel),
so why should ReaR of all tools implement that?

pcahyna commented at 2024-01-16 13:47:

@jsmeix I mostly agree; just a note that an unresponsive local disk is an exceptional situation, while in distributed systems (including the use of NFS) a loss of connectivity is a fairly normal state. Also, I believe that processes waiting for the disk are stuck in the "uninterruptible sleep" state, while processes stuck on NFS can be killed via SIGKILL (cf. the use of timeout above). Local disk access also has timeouts in the SCSI or block layer (maybe both), so it does not wait "ad infinitum".

bwelterl commented at 2024-01-16 13:48:

I must agree with that, @jsmeix. But in this case we are saying that ReaR automatically excludes the remote filesystems, whereas this is implemented only through tar's --one-file-system option, not through a specific exclusion, and that is what leads to the issue.
ReaR should not try to access this filesystem at all.

But I understand this is a corner case and we should not add complexity.

Benoit


[Export of Github issue for rear/rear.]