#2796 PR closed
: Overhauled layout/compare/default/510_compare_files.sh¶
Labels: enhancement
, cleanup
, won't fix / can't fix / obsolete
jsmeix opened issue at 2022-05-03 12:53:¶
Completely overhauled
layout/compare/default/510_compare_files.sh
according to
https://github.com/rear/rear/pull/2795#issuecomment-1116010676
and
https://github.com/rear/rear/pull/2796#issuecomment-1116126796
jsmeix commented at 2022-05-03 12:54:¶
@pcahyna
this pull request is only meant as a proposal
so you could have a look
how things might be done according to
https://github.com/rear/rear/pull/2795#issuecomment-1116010676
Currently it is totally untested code so there could be stupid bugs.
It is only meant to show some basic ideas.
jsmeix commented at 2022-05-03 13:48:¶
I think
https://github.com/rear/rear/pull/2796/commits/4d4230b435b6ad4693a77b15578910a14f4676e0
can be fixed by first regenerating what FILES_TO_PATCH_PATTERNS
evaluates to
and then appending what CHECK_CONFIG_FILES evaluates to
as it is done in layout/save/default/600_snapshot_files.sh
in
https://github.com/rear/rear/pull/2795
jsmeix commented at 2022-05-03 13:52:¶
I wonder why there is no symlink handling for what CHECK_CONFIG_FILES evaluates to.
I think that whole stuff needs to be overhauled and cleanly
(i.e. more simple and straightforward) implemented like:
-
Evaluate CHECK_CONFIG_FILES and FILES_TO_PATCH_PATTERNS
to a local array of plain files and symlinks. -
Do symlink handling for the plain files and symlinks in that aray.
-
Run "md5sum" for the results.
jsmeix commented at 2022-05-04 09:07:¶
My recent
https://github.com/rear/rear/pull/2796/commits/3d816d982e2133e8a84385178ecbf5c3d61583ff
has almost duplicated code (except log messages and absolute path
handling)
which could be further unified but for now I leave it as is
and test if that works...
pcahyna commented at 2022-05-04 09:22:¶
Do we need any symlink checks for CHECK_CONFIG_FILES
? Those files are
specified explicitly, so shouldn't we respect what was put there and
avoid any post-processing (except expanding directories)?
jsmeix commented at 2022-05-04 10:00:¶
With my latest commit (i.e. with DebugPrint) I get for
CHECK_CONFIG_FILES=( '/etc/fstab' '/etc/crypttab' '/etc/mtab' '/etc/issue' '/etc/QQQ' )
FILES_TO_PATCH_PATTERNS="[e]tc/fstab [e]tc/mtab [e]tc/issue /home/johannes/testdir_symlink QQQQ"
with
# find /home/johannes/testdir* -ls
... drwxr-xr-x ... /home/johannes/testdir
... lrwxrwxrwx ... /home/johannes/testdir/etc_mtab_symlink -> /etc/mtab
... lrwxrwxrwx ... /home/johannes/testdir/broken_symlink -> broken
... -rw-r--r-- ... /home/johannes/testdir/file
... lrwxrwxrwx ... /home/johannes/testdir/file_symlink -> file
... lrwxrwxrwx ... /home/johannes/testdir/etc_issue_symlink -> /etc/issue
... lrwxrwxrwx ... /home/johannes/testdir_symlink -> testdir
after a "rear savelayout"
# usr/sbin/rear -D checklayout
...
Running 'layout/compare' stage ======================
Disk layout has changed
Skip /etc/mtab from CHECK_CONFIG_FILES (symlink with target /proc/28235/mounts in /proc/ /sys/ /dev/ /run/)
Skip /etc/issue from CHECK_CONFIG_FILES (symlink with target /run/issue in /proc/ /sys/ /dev/ /run/)
Skip /etc/QQQ in CHECK_CONFIG_FILES (no regular file matches)
Skip /etc/grub.cfg in CHECK_CONFIG_FILES (no regular file matches)
Skip /etc/grub2.cfg in CHECK_CONFIG_FILES (no regular file matches)
Skip /boot/grub2/grub2.cfg in CHECK_CONFIG_FILES (no regular file matches)
Skip /boot/grub/grub.cfg in CHECK_CONFIG_FILES (no regular file matches)
Skip /etc/mtab from FILES_TO_PATCH_PATTERNS (symlink with target /proc/28289/mounts in /proc/ /sys/ /dev/ /run/)
Skip /etc/issue from FILES_TO_PATCH_PATTERNS (symlink with target /run/issue in /proc/ /sys/ /dev/ /run/)
Skip //home/johannes/testdir_symlink/etc_mtab_symlink from FILES_TO_PATCH_PATTERNS (symlink with target /proc/28307/mounts in /proc/ /sys/ /dev/ /run/)
Skip //home/johannes/testdir_symlink/etc_issue_symlink from FILES_TO_PATCH_PATTERNS (symlink with target /run/issue in /proc/ /sys/ /dev/ /run/)
Skip QQQQ in FILES_TO_PATCH_PATTERNS (no regular file matches)
There are changes related to configuration files and files to be patched
The current md5sum file is /root/rear.github.master/var/lib/rear/layout/config/files.md5sum.checklayout
The old md5sum file is kept as /root/rear.github.master/var/lib/rear/layout/config/files.md5sum.outdated
Saving /root/rear.github.master/var/log/rear/rear-linux-h9wr.log.lockless as /root/rear.github.master/var/log/rear/rear-linux-h9wr.log
Exiting rear checklayout (PID 26051) and its descendant processes ...
jsmeix commented at 2022-05-04 10:02:¶
I think we need symlink checks also for CHECK_CONFIG_FILES
because the user may have specified directories which
contain symlinks with targets in /proc/ /sys/ /dev/ /run/
If a user really wants to check things in /proc/ /sys/ /dev/ /run/
I think he could specify the targets in /proc/ /sys/ /dev/ /run/
jsmeix commented at 2022-05-04 10:07:¶
For the fun of it:
# ls -l /dev/lp0
crw-rw---- 1 root lp 6, 0 May 4 09:11 /dev/lp0
# md5sum /dev/lp0
md5sum: /dev/lp0: No such device or address
# time md5sum /dev/sda8
ac480c812f27dcdeeae89fdb3db3be29 /dev/sda8
real 0m19.408s
user 0m2.486s
sys 0m0.361s
/dev/sda8 is a 1.07GB "playground" partition on my disk.
jsmeix commented at 2022-05-04 10:12:¶
For me CHECK_CONFIG_FILES=( ... '/dev/lp0' '/proc/cmdline' )
seems to work as expected - I get in
var/lib/rear/layout/config/files.md5sum
f8805aa1625aa28d41adbe422a94ebb2 /proc/cmdline
and on the screen there is
Skip /dev/lp0 in CHECK_CONFIG_FILES (no regular file matches)
jsmeix commented at 2022-05-04 10:19:¶
With CHECK_CONFIG_FILES=( ... '/dev/sda8' )
I get a longer delay during "rear savelayout"
because it computes the md5sum for /dev/sda8
but during "rear -D checklayout" I get on the terminal
Skip /dev/sda8 in CHECK_CONFIG_FILES (no regular file matches)
so I think we should exclude non regular files hardcoded in general
at least for now because I assume md5sum comparison
does not make sense in such cases.
If a user reports a special case where that is really needed
we could adapt and enhance things accordingly.
jsmeix commented at 2022-05-04 10:48:¶
The messages
Skip /etc/grub.cfg in CHECK_CONFIG_FILES (no regular file matches)
Skip /etc/grub2.cfg in CHECK_CONFIG_FILES (no regular file matches)
Skip /boot/grub2/grub2.cfg in CHECK_CONFIG_FILES (no regular file matches)
Skip /boot/grub/grub.cfg in CHECK_CONFIG_FILES (no regular file matches)
are caused by this
# find usr/sbin/rear usr/share/rear -type f | xargs grep 'CHECK_CONFIG_FILES+=' | grep grub
usr/share/rear/layout/save/default/450_check_bootloader_files.sh: CHECK_CONFIG_FILES+=( /boot/efi/EFI/*/grub*.cfg )
usr/share/rear/layout/save/default/450_check_bootloader_files.sh: CHECK_CONFIG_FILES+=( /etc/grub.cfg /etc/grub2.cfg /boot/grub2/grub2.cfg /bootgrub/grub.cfg )
usr/share/rear/layout/save/default/450_check_bootloader_files.sh: CHECK_CONFIG_FILES+=( /etc/lilo.conf /etc/yaboot.conf /etc/grub.cfg /etc/grub2.cfg /boot/grub2/grub2.cfg /boot/grub/grub.cfg)
because the patterns in CHECK_CONFIG_FILES must contain bash globbing
characters
like a [] around the first letter to ensure that 'shopt -s nullglob'
removes this file from the listing.
I will fix that right now because it is a bug in any case.
jsmeix commented at 2022-05-04 11:24:¶
Because
https://github.com/rear/rear/pull/2795#issuecomment-1117093699
is important here so I quote the matching excerpt:
I don't think that any modification of
layout/compare/default/510_compare_files.sh
(except for replacing the diff -u by a better comparison like cmp
and using sort to avoid false warnings when the set of files is unchanged,
but the order is changed) is needed.
layout/save/default/600_snapshot_files.sh
appends to the variable CHECK_CONFIG_FILES,
so layout/compare/default/510_compare_files.sh will inherit it
( layout/save is executed before layout/compare ).
pcahyna commented at 2022-05-04 11:55:¶
because the patterns in CHECK_CONFIG_FILES must contain bash globbing characters
like a [] around the first letter to ensure that 'shopt -s nullglob' removes this file from the listing.
I don't think CHECK_CONFIG_FILES
is expected to contain any patterns
that the shell would expand or remove. I believe that when adding to the
variable, one may use shell patterns if one needs to ensure that
nonexistent files are not added, and perhaps one should do it (I don't
see why it is necessary though, the original code skips nonexistent
files silently, so it is not a bug to specify a nonexistent file), but
the variable itself should not contain shell patterns (the pattern
should get expanded already when adding to the variable, not when
expanding the variable itself).
pcahyna commented at 2022-05-04 12:00:¶
@jsmeix thanks for the investigation of the details of file checking
here - I am not sure what should be the strategy with respect to #2795.
On one hand, I am afraid that the original issue is snowballing into
something much bigger. We have a real-world complaint about the lack of
checking after restore, but no-one has complained about the lack of
symlink handling in checklayout
and even the wrong use of diff -u
does not seem to have negative consequences in practice. For this
reason, I would like to merge #2795 soon and worry about the other
stuff later. OTOH, the discussion may make #2795 entirely obsolete,
because we may find that we need an entirely different strategy.
jsmeix commented at 2022-05-04 12:02:¶
That patterns get expanded before adding to CHECK_CONFIG_FILES
is what I implemented in my recent
https://github.com/rear/rear/commit/17ee5f87d742a57f82960b08795b07896969ac8b
and
https://github.com/rear/rear/commit/8c847f7b17d905d769acdf44dcb6d914cbe49c4a
jsmeix commented at 2022-05-04 12:13:¶
No,
https://github.com/rear/rear/pull/2795
is not obsolete.
The strategy you developed there looks exactly right and I like it.
All I intended with this pull request here is a proposal how
layout/save/default/600_snapshot_files.sh
in
https://github.com/rear/rear/pull/2795
might be improved regarding what it results in CHECK_CONFIG_FILES.
My idea is that basically CHECK_CONFIG_FILES should contain
what files_for_md5sum
contains in 510_compare_files.sh
in this proposal here.
Then 510_compare_files.sh could just run md5sum
for the elements "as is" in CHECK_CONFIG_FILES
without any further evaluation or checks because
all that was already done in 600_snapshot_files.sh
jsmeix commented at 2022-05-04 12:56:¶
An addendum to
https://github.com/rear/rear/pull/2796#issuecomment-1117223731
In general I consider it as good coding style
to not add nonexistent things to a variable.
To some extent this belongs to
"Try hard to care about possible errors" in
https://github.com/rear/rear/wiki/Coding-Style
because it avoids possible errors in subsequent code
when variables do not contain nonsense values
than to just rely on that all subsequent code
(including all future added or changed subsequent code)
will be sufficiently robust against nonsense values.
Of course all code should be sufficiently robust
but my point is to not just rely on that.
Furthermore when variables do not contain nonsense values
it makes debugging easier because nonsense values may
lead to false assumptions about the cause of an issue.
pcahyna commented at 2022-05-05 08:05:¶
While it is "nice" to not supply nonexistent filenames to other parts of
the code, the code is designed to cope with that, so I would not worry
too much about it.
I have thought more about the issue and came to the conclusion that
CHECK_CONFIG_FILES
needs to contain all the file names that one has
specified, even if they do not exist. Remember that we need to
eventually fix the current gap in #2795 :
https://github.com/rear/rear/pull/2795#issuecomment-1116010676
"When CHECK_CONFIG_FILES and FILES_TO_PATCH_PATTERNS evaluate to
different files this is a change that must be detected.". Presumably, we
are going to do it by recording the full content of CHECK_CONFIG_FILES
during layout/save
(including nonexistent files) and checking this
list during finalize
at the end of recovery.
If nonexistent files are simply omitted from CHECK_CONFIG_FILES
during
layout/save
, how will we detect that they have appeared at finalize
time? We can regenerate the content of CHECK_CONFIG_FILES
again at
finalize
time (that's how I will solve the issue for
FILES_TO_PATCH_PATTERNS
- see "You are right that if I move to
something else than md5sum -c in finalize, the same code needs to be run
there (finalize/default/060_compare_files.sh) to regenerate
CHECK_CONFIG_FILES" in
https://github.com/rear/rear/pull/2795#issuecomment-1117093699
) but to cover all the places where CHECK_CONFIG_FILES
is appended to
in layout/save
, one would need to rerun them or reimplement in
finalize
. This means running all the
layout/save/default/4*_check_*.sh
again during finalize
to recreate
an equivalent CHECK_CONFIG_FILES
, and I am not sure if the scripts are
ready for that.
(The problem already exists due to the presence of wildcards in some of
the layout/save/default/4*_check_*.sh
scripts, sigh.)
jsmeix commented at 2022-05-05 08:39:¶
The CHECK_CONFIG_FILES+=( ... )
in some 'layout/save' scripts
is a general problem because it modifies a user config variable
which means what there is specified for CHECK_CONFIG_FILES
in 'default.conf' and 'local.conf' is not authoritative for what
is acually used later.
So in workflows that run 'layout/save' we use a different
CHECK_CONFIG_FILES (different than what the user specified)
compared to workflows that do not run 'layout/save'.
My basic idea is to fully re-evaluate during "rear recover"
CHECK_CONFIG_FILES and FILES_TO_PATCH_PATTERNS
and regenerate md5sums and then compare the md5sum results.
After you merged
https://github.com/rear/rear/pull/2795
I will have a look how it behaves with it and then I will
think about how to completely clean up the current
md5sum comparison mess.
I think correct md5sum comparison is crucial for users
because users should be able to rely on our
md5sum comparison to behave consistently.
Issues like
https://github.com/rear/rear/issues/2785
is really bad user experience because it shows we do not
"Try hard to care about possible errors" in this specific area,
cf.
https://github.com/rear/rear/wiki/Coding-Style
ReaR can detect when restored basic system files
do not match the recreated system.
It only needs to do it correctly.
I think with
https://github.com/rear/rear/pull/2795
all needed code will be basically there.
It only needs to be run in a more consistent way.
FYI
what workflows run 'layout/save' (in current master code):
The workflows are from what "rear -v help" shows
('-v' is needed to let it show really all workflows).
# workflows="checklayout dump finalizeonly format layoutonly mkbackup mkbackuponly mkopalpba mkrescue mountonly opaladmin recover restoreonly savelayout shell udev validate"
# for w in $workflows ; do usr/sbin/rear -s $w | grep -q 'layout/save/' && echo $w ; done
checklayout
mkbackup
mkbackuponly
mkrescue
savelayout
pcahyna commented at 2022-05-05 09:11:¶
My basic idea is to fully re-evaluate during "rear recover"
CHECK_CONFIG_FILES and FILES_TO_PATCH_PATTERNS
and regenerate md5sums and then compare the md5sum results.
That's one way to do it for sure, but I am afraid that doing a full
re-evaluation will be difficult. So my proposal is to short-circuit it
by saving the full CHECK_CONFIG_FILES
list (except the addition
resulting from FILES_TO_PATCH_PATTERNS, but including files that did
not exist at the time) into layout and then use it during finalize. But
for this, the list must be static (no wildcard expansions, since they
can result into less than what will be there during finalize).
pcahyna commented at 2022-05-05 09:13:¶
but I am afraid that doing a full re-evaluation will be difficult.
.. because that essentially means running the whole layout/save
stage
before finalize
, in the mounted restored system, and I suspect there
will be problems with that.
jsmeix commented at 2022-05-05 09:30:¶
I won't run the whole layout/save stage before finalize.
I will try to solve the problem with
CHECK_CONFIG_FILES+=( ... ) in some 'layout/save' scripts
differently.
My immediate idea is to have all CHECK_CONFIG_FILES+=( ... )
that exists in scripts directly in default.conf because it does not
matter
to specify more in CHECK_CONFIG_FILES than what actually exists
in particular not when bash globbing patterns are used to ignore
nonexistent things via 'shopt -s nullglob'.
This does not affect the user who can specify mandatory things
for his specific CHECK_CONFIG_FILES in local.conf
without bash globbing patterns.
I do not understand what is wrong with bash globbing patterns in
CHECK_CONFIG_FILES because files that do not exists
cannot be used for md5sum so files that do not exists
should be skipped in any case from the md5sum call to avoid
md5sum: ...: No such file or directory
stderr messages.
I wonder what the difference is between skipping nonexistsnt files
implicitly via globbing patterns with 'shopt -s nullglob'
versus skipping them by some explicit test in the code.
I think the difference is that skipping via globbing happens silently
while skipping by explicit test in the code could show messages
to the user when something that is specified is not there.
pcahyna commented at 2022-05-05 09:41:¶
I wonder what the difference is between skipping nonexistsnt files
implicitly via globbing patterns with 'shopt -s nullglob'
versus skipping them by some explicit test in the code.
This depends on the intended way of solving this mess.
The solution that I had in mind was this pseudocode:
- assemble
CHECK_CONFIG_FILES
duringlayout/save
- save it somewhere under layout
- calculate md5sums
- in
finalize
loadCHECK_CONFIG_FILES
saved in 2. - calculate md5sums
- compare md5sums calculated in 3. and 5.
In this scheme, it matters a lot whether you skip nonexistent files during step 1. (presumably using glob patterns and nullglob) vs. in step 3. immediately before calculating hashes (presumably using an explicit test), because it affects what gets saved in step 2.
If you choose another approach, then it does not matter. Your approach
"to have all CHECK_CONFIG_FILES+=( ... )
that exists in scripts
directly in default.conf" will be correct, I am just afraid it might be
a bit restrictive (there are conditionals in
layout/save/default/4*_check_*.sh
).
jsmeix commented at 2022-05-05 09:56:¶
Please just merge
https://github.com/rear/rear/pull/2795
so we have a good starting point for further development
(which can be also tested by users how far that works for them)
then let's see how far I can get from there.
jsmeix commented at 2022-05-05 10:12:¶
As far as I see on first glance the conditionals in
layout/save/default/450_check_bootloader_files.sh and
layout/save/default/450_check_network_files.sh
are not really mutually exclusive so having all of them
in CHECK_CONFIG_FILES via globbing patterns
should "just ignore" what was done before with conditionals.
By the way:
Globbing patterns do help when used well:
Currently at SUSE we have /boot/grub2/grub.cfg
(at least on my openSUSE Leap 15.3 system)
but nothing in 450_check_bootloader_files.sh matches
while /[b]oot/*/grub*.cfg
would have matched.
So only the conditionals in
layout/save/default/400_check_backup_special_files.sh
are left.
Offhandedly one may think even such special things could be
"simply added with appropriate globbing patters".
But when someone has TSM and/or FDR/Upstream installed
but does not use BACKUP=TSM or BACKUP=FDRUPSTREAM
for ReaR, then he would get things for TSM and/or FDR/Upstream
in CHECK_CONFIG_FILES that is not wanted to be checked
when actually another backup method is used for ReaR.
I guess I can move 400_check_backup_special_files.sh
to a stage that is run in any case - e.g. the init
stage
that is run in any case directly by usr/sbin/rear
via SourceStage "init"
because the init
stage looks right to do such special
config variable adjustments.
pcahyna commented at 2022-05-05 10:19:¶
I think that moving the conditionals that are really needed to init
would work, yes. Preferably they should not depend on the content of the
system, but only on other configuration variables (like BACKUP
).
jsmeix commented at 2022-05-05 10:46:¶
With
https://github.com/rear/rear/pull/2795
merged
this pull request is outdated or obsoleted so I close it.
I keep the branch of this pull request for now as reference
https://github.com/rear/rear/tree/jsmeix-overhauled-510_compare_files
[Export of Github issue for rear/rear.]