#3058 PR merged: Skip useless xfs mount options when mounting during recovery

Labels: enhancement, bug, fixed / solved / done

pcahyna opened issue at 2023-10-25 10:47:

Pull Request Details:
  • Type: Bug Fix

  • Impact: Normal

  • Reference to related issue (URL): closes #2777

  • How was this pull request tested?
    Full backup and recovery with `MKFS_XFS_OPTIONS="-d sunit=128 -d swidth=128"` and root on XFS. Before the change, recovery failed to mount the root filesystem during layout restoration.

  • Description of the changes in this pull request:

The mount command displays all mount options for a filesystem, including those that are not explicitly set in fstab, and ReaR saves them to the disk layout. In the case of XFS, some of them can be harmful for mounting the filesystem during layout restoration:

The logbsize=... mount option is purely a performance/memory usage optimization, but it can lead to mount failures, because it must be an integer multiple of the log stripe unit and the log stripe unit can be different in the recreated filesystem from the original filesystem. This was reported for an exotic situation apparently involving an old filesystem created with a version of xfsprogs that accepted values that it does not accept anymore, see GitHub issue #2777. More importantly, it can occur when using MKFS_XFS_OPTIONS, because this will lead to a change of filesystem parameters and the mount options will no longer match. If logbsize is not an integer multiple of the log stripe unit, mount fails with the warning XFS (...): logbuf size must be greater than or equal to log stripe size in the kernel log (and a confusing error message mount: ...: wrong fs type, bad option, bad superblock on ..., missing codepage or helper program, or other error. from the mount command), causing the layout restoration in the recovery process to fail.
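To make the condition above concrete, here is a minimal sketch of the "integer multiple" check as described in this pull request. The function name is hypothetical and not part of ReaR, both arguments are assumed to be plain byte counts, and the actual kernel check may differ in detail.

```shell
# Hypothetical illustration, not ReaR code and not the kernel's exact logic.
# Succeeds when the log buffer size is an integer multiple of the log stripe unit.
# Assumption: both arguments are plain byte counts (no k/m suffixes).
logbsize_matches_log_stripe_unit() {
    local logbsize_bytes="$1" log_sunit_bytes="$2"
    # No log stripe unit given: nothing to compare against in this sketch.
    (( log_sunit_bytes == 0 )) && return 0
    (( logbsize_bytes % log_sunit_bytes == 0 ))
}
```

With made-up numbers, a saved logbsize of 32768 bytes passes against a log stripe unit of 32768 bytes but fails against 65536 bytes, which is the kind of mismatch that can arise when the filesystem is recreated with different parameters.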

Wrong sunit/swidth can cause mount to fail as well, with this in the kernel log: kernel: XFS (...): alignment check failed: sunit/swidth vs. agsize.

Therefore, remove logbsize=... and sunit=.../swidth=... from the XFS mount options before mounting the filesystem.
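The removal step could look roughly like the following sketch. It is not the function added by this pull request: the name strip_valued_mount_options and its calling convention are assumptions made for illustration, and it only handles options of the form name=value.

```shell
# Hypothetical helper, not the ReaR implementation.
# Usage: strip_valued_mount_options "opt1,opt2=val,..." name1 name2 ...
# Prints the option string without any "name=value" entry whose name was listed.
strip_valued_mount_options() {
    local options="$1"
    shift
    local IFS=','              # split and join the option list on commas
    local -a entries kept=()
    local entry name drop
    read -r -a entries <<< "$options"
    for entry in "${entries[@]}"; do
        drop=no
        for name in "$@"; do
            # Only the valued form "name=..." matches; a bare "name" is kept.
            [[ "$entry" == "$name="* ]] && drop=yes
        done
        [[ "$drop" == no ]] && kept+=("$entry")
    done
    echo "${kept[*]}"
}
```

For example, `strip_valued_mount_options "rw,noatime,logbsize=256k,sunit=128,swidth=128,inode64" logbsize sunit swidth` prints `rw,noatime,inode64`.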

(Another possibility would be to remove them already when saving the layout, in layout/save/GNU/Linux/230_filesystem_layout.sh, but I decided to follow the example of btrfs here.)

jsmeix commented at 2023-10-26 08:23:

@pcahyna
I am wondering if the function name remove_mount_options_values
correctly indicates what this function actually does because
from only reading its description it seems this function
removes whole mount options like
nodev,uid=1,rw,gid=1 -> nodev,rw
and not only the values from mount options like
nodev,uid=1,rw,gid=1 -> nodev,uid,rw,gid
so the simpler function name remove_mount_options might be better?

jsmeix commented at 2023-10-26 09:07:

It is an enhancement when ReaR even works in an
"exotic situation apparently involving an old filesystem
created with a version of xfsprogs that accepted values
that it does not accept anymore".

I think in practice that situation is not as exotic as it seems
because (at least how SLES service pack upgrades happen at SUSE)
it could be rather common that upgrades of basic system software
happen via software package upgrades (e.g. RPM package updates)
and (of course) not by reinstalling the system anew.

By the way cf.
https://en.opensuse.org/SDB:Disaster_Recovery

You must continuously validate that the recovery
still works on the replacement hardware in particular
after each change of the basic system.
.
.
.
When you have a working disaster recovery procedure,
do not upgrade ReaR and do not change the basic software
that is used by ReaR (like partitioning tools, filesystem tools,
bootloader tools, ISO image creating tools, and so on).
...
When you have a working disaster recovery procedure running
and you upgrade software that is related to the basic system
or you do other changes in your basic system, you must also
carefully and completely re-validate that your particular
disaster recovery procedure still works for you.

pcahyna commented at 2023-10-26 10:28:

@pcahyna I am wondering if the function name remove_mount_options_values correctly indicates what this function actually does because from only reading its description it seems this function removes whole mount options like nodev,uid=1,rw,gid=1-> nodev,rw

A more descriptive name would be remove_mount_options_with_values, but I felt it was a bit too long.

and not only the values from mount options like nodev,uid=1,rw,gid=1-> nodev,uid,rw,gid so the simpler function name remove_mount_options might be better?

remove_mount_options would suggest that it removes anything, even options without values (rw, nodev and so on), which it is not able to do (I preferred to keep the implementation simpler).

Maybe it would be better to let the function look for both options without values and options with values, and remove them both? (And rename it.) My only worry is that there might be a case where an option without a value means something different than an option with a value. The only case I found where an option can be specified both with and without a value is utf8:

       utf8
           UTF8 is the filesystem safe 8-bit encoding of Unicode that is used
           by the console. It can be enabled for the filesystem with this
           option or disabled with utf8=0, utf8=no or utf8=false. If uni_xlate
           gets set, UTF8 gets disabled.
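A rough sketch of that alternative, which removes an option in either form, might look as follows. The function name remove_mount_options_any_form is hypothetical and nothing like it is part of this pull request; it is only meant to show where the utf8 concern comes from.

```shell
# Hypothetical variant, not ReaR code: remove listed options whether they
# appear bare ("utf8") or with a value ("utf8=0").
# Usage: remove_mount_options_any_form "opt1,opt2=val,..." name1 name2 ...
remove_mount_options_any_form() {
    local options="$1"
    shift
    local IFS=','              # split and join the option list on commas
    local -a entries kept=()
    local entry name drop
    read -r -a entries <<< "$options"
    for entry in "${entries[@]}"; do
        drop=no
        for name in "$@"; do
            # Exact match for the bare option, prefix match for the valued form.
            [[ "$entry" == "$name" || "$entry" == "$name="* ]] && drop=yes
        done
        [[ "$drop" == no ]] && kept+=("$entry")
    done
    echo "${kept[*]}"
}
```

Called with `utf8` as the name, this drops `utf8` and `utf8=0` alike, even though the two forms have opposite meanings, so a caller cannot target just one of them.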

jsmeix commented at 2023-10-26 10:58:

@pcahyna
I didn't want to make a bigger issue with this function name.
Leave it as is - its current name is OK - and what is most important:
It is well described what it does so the name is not a real issue.

pcahyna commented at 2023-10-26 11:10:

It is an enhancement when ReaR even works in an "exotic situation apparently involving an old filesystem created with a version of xfsprogs that accepted values that it does not accept anymore".

I think in practice that situation is not as exotic as it seems because (at least how SLES service pack upgrades happen at SUSE) it could be rather common that upgrades of basic system software happen via software package upgrades (e.g. RPM package updates) and (of course) not by reinstalling the system anew.

Indeed.

I described it as "exotic", because even in older RHEL versions that had more relaxed checks on XFS parameters I was not able to create the problematic combination of parameters. Probably there is some trick that I am missing, given that the bug report is real.

pcahyna commented at 2023-10-30 17:30:

@jsmeix thank you for having a look!


[Export of Github issue for rear/rear.]