#2788 PR closed
: Detect by UUIDs if restored config files do not match what was recreated¶
Labels: enhancement
, won't fix / can't fix / obsolete
jsmeix opened issue at 2022-04-11 13:17:¶
-
Type: Enhancement
-
Impact: High
-
Reference to related issue (URL):
https://github.com/rear/rear/issues/2787
https://github.com/rear/rear/issues/2785 -
How was this pull request tested?
Curently only a "rear mkrescue" test on my homeoffice laptop -
Brief description of the changes in this pull request:
During "rear mkrescue/mkbackup/mkbackuponly/savelayout"
save which UUIDs in disklayout.conf appear
in a config file in CHECK_CONFIG_FILES
which is used for comparison during "rear recover"
if those UUIDs in disklayout.conf still appear
in a restored config file in CHECK_CONFIG_FILES
in the recreated system under /mnt/local
The main reason is to avoid false alarm
when UUIDs in disklayout.conf do not appear in a config file
on the original system - then those UUIDs do not need to
be checked in the recreated system.
jsmeix commented at 2022-04-11 13:36:¶
The default CHECK_CONFIG_FILES is insufficient
(in particular /etc/fstab is missing)
so that needs to be adapted to make it useful
For my test I have in my etc/rear/local.conf
CHECK_CONFIG_FILES+=( /etc )
On my rather slow Intel i3 Laptop with a spinning disk
I did a test after a reeboot so that the file buffer caches are cold
and usr/share/rear/layout/save/default/600_snapshot_files.sh
did run from 2022-04-12 10:42:49.598645397
until 2022-04-12 10:42:55.001197954
so about 5 seconds to check whole /etc/ with cold caches.
I got this /var/tmp/rear.GSxN9PoPwczFS6f/rootfs/etc/rear/rescue.conf
(excerpt):
# The following line was added by layout/save/default/600_snapshot_files.sh
DISKLAYOUT_UUIDS_IN_CONFIG_FILES='259dac9c-f2fd-4181-a351-83603398e465 4a42395e-4f9d-4056-9948-6d5d9d92d990 4c4a923d-1562-4254-a1fa-4e761278c02f 54fc77c5-8ec2-457f-b558-9deda3b843b2 6d8f8998-dd20-412a-bcc2-618eed858662 a6dba0d8-5be8-4970-b1e7-a272ae0cafdd f05af948-6075-40a3-9191-354b0a0a9afc '
The lines with a UUID in disklayout.conf
# grep "uuid=[[:alnum:]]" var/lib/rear/layout/disklayout.conf
fs /dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3 / ext4 uuid=f05af948-6075-40a3-9191-354b0a0a9afc label= blocksize=4096 reserved_blocks=4% max_mounts=-1 check_interval=0d bytes_per_inode=16383 default_mount_options=user_xattr,acl options=rw,relatime
fs /dev/sda4 /nfs ext4 uuid=4c4a923d-1562-4254-a1fa-4e761278c02f label= blocksize=4096 reserved_blocks=5% max_mounts=-1 check_interval=0d bytes_per_inode=16384 default_mount_options=user_xattr,acl options=rw,relatime
fs /dev/sda5 /var/lib/libvirt ext4 uuid=4a42395e-4f9d-4056-9948-6d5d9d92d990 label= blocksize=4096 reserved_blocks=1% max_mounts=-1 check_interval=0d bytes_per_inode=16384 default_mount_options=user_xattr,acl options=rw,relatime
#fs /dev/sda6 /other ext2 uuid=259dac9c-f2fd-4181-a351-83603398e465 label= blocksize=4096 reserved_blocks=4% max_mounts=-1 check_interval=0d bytes_per_inode=16384 default_mount_options=user_xattr,acl options=rw,relatime
swap /dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2 uuid=6d8f8998-dd20-412a-bcc2-618eed858662 label=
crypt /dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3 /dev/sda3 type=luks1 cipher=aes-xts-plain64 key_size=256 hash=sha256 uuid=a6dba0d8-5be8-4970-b1e7-a272ae0cafdd
crypt /dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2 /dev/sda2 type=luks1 cipher=aes-xts-plain64 key_size=256 hash=sha256 uuid=54fc77c5-8ec2-457f-b558-9deda3b843b2
The UUIDs in files in /etc:
# for u in f05af948-6075-40a3-9191-354b0a0a9afc 4c4a923d-1562-4254-a1fa-4e761278c02f 4a42395e-4f9d-4056-9948-6d5d9d92d990 259dac9c-f2fd-4181-a351-83603398e465 6d8f8998-dd20-412a-bcc2-618eed858662 a6dba0d8-5be8-4970-b1e7-a272ae0cafdd 54fc77c5-8ec2-457f-b558-9deda3b843b2 ; do find /etc -type f | xargs grep $u ; echo ; done
/etc/fstab:UUID=f05af948-6075-40a3-9191-354b0a0a9afc / ext4 acl,user_xattr 0 1
/etc/lvm/cache/.cache: "/dev/disk/by-uuid/f05af948-6075-40a3-9191-354b0a0a9afc",
/etc/fstab:UUID=4c4a923d-1562-4254-a1fa-4e761278c02f /nfs ext4 defaults 0 2
/etc/lvm/cache/.cache: "/dev/disk/by-uuid/4c4a923d-1562-4254-a1fa-4e761278c02f",
/etc/fstab:UUID=4a42395e-4f9d-4056-9948-6d5d9d92d990 /var/lib/libvirt ext4 defaults 0 2
/etc/lvm/cache/.cache: "/dev/disk/by-uuid/4a42395e-4f9d-4056-9948-6d5d9d92d990",
/etc/lvm/cache/.cache: "/dev/disk/by-uuid/259dac9c-f2fd-4181-a351-83603398e465",
/etc/fstab:UUID=6d8f8998-dd20-412a-bcc2-618eed858662 swap swap defaults 0 0
/etc/lvm/cache/.cache: "/dev/disk/by-uuid/6d8f8998-dd20-412a-bcc2-618eed858662",
/etc/crypttab:cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3 UUID=a6dba0d8-5be8-4970-b1e7-a272ae0cafdd
/etc/lvm/cache/.cache: "/dev/disk/by-uuid/a6dba0d8-5be8-4970-b1e7-a272ae0cafdd",
/etc/crypttab:cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2 UUID=54fc77c5-8ec2-457f-b558-9deda3b843b2
/etc/lvm/cache/.cache: "/dev/disk/by-uuid/54fc77c5-8ec2-457f-b558-9deda3b843b2",
My block devices:
# lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT,UUID
NAME KNAME PKNAME TRAN TYPE FSTYPE LABEL SIZE MOUNTPOINT UUID
/dev/sda /dev/sda sata disk 465.8G
|-/dev/sda1 /dev/sda1 /dev/sda part 8M
|-/dev/sda2 /dev/sda2 /dev/sda part crypto_LUKS 4G 54fc77c5-8ec2-457f-b558-9deda3b843b2
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2
| /dev/dm-1 /dev/sda2 crypt swap 4G [SWAP] 6d8f8998-dd20-412a-bcc2-618eed858662
|-/dev/sda3 /dev/sda3 /dev/sda part crypto_LUKS 200G a6dba0d8-5be8-4970-b1e7-a272ae0cafdd
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3
| /dev/dm-0 /dev/sda3 crypt ext4 200G / f05af948-6075-40a3-9191-354b0a0a9afc
|-/dev/sda4 /dev/sda4 /dev/sda part ext4 100G /nfs 4c4a923d-1562-4254-a1fa-4e761278c02f
|-/dev/sda5 /dev/sda5 /dev/sda part ext4 150G /var/lib/libvirt 4a42395e-4f9d-4056-9948-6d5d9d92d990
|-/dev/sda6 /dev/sda6 /dev/sda part ext2 8G /other 259dac9c-f2fd-4181-a351-83603398e465
|-/dev/sda7 /dev/sda7 /dev/sda part crypto_LUKS 1G 1b4198c9-d9b0-4c57-b9a3-3433e391e706
`-/dev/sda8 /dev/sda8 /dev/sda part crypto_LUKS 1G 3e874a28-7415-4f8c-9757-b3f28a96c4d2
/dev/sr0 /dev/sr0 sata rom 1024M
This are quick raw results from my very first test.
I did not check any details.
jsmeix commented at 2022-04-11 13:45:¶
I will continue tomorrow with the "rear recover" part (as time permits).
jsmeix commented at 2022-04-12 10:13:¶
I will test now the "rear recover" part...
jsmeix commented at 2022-04-12 12:27:¶
Now "rear recover" reports when a UUID that was in at least one of the
config files
in CHECK_CONFIG_FILES at the time when disklayout.conf was created
is not in at least one of the restored config files in
CHECK_CONFIG_FILES.
I did this "rear mkbackup" and "rear recover" test here
on a different system (now on two same VMs)
than the above so UUIDs are different here.
I tested it by modifying a UUID in
DISKLAYOUT_UUIDS_IN_CONFIG_FILES
in /etc/rear/rescue.conf in the ReaR recovery system before running
"rear recover"
and got on the terminal
Running 'finalize' stage ======================
...
UUID QQQe2NM5J-brmg-6kic-exST-oBwb-GY2h-KREbYJ not found in a restored config file (likely this must be manually corrected)
(I prepended the UUID with QQQ
in /etc/rear/rescue.conf).
jsmeix commented at 2022-04-12 12:29:¶
@pcahyna @rear/contributors
could you please have a look here and tell me what you think about it.
See in particular my explanatory comment about it at the beginning of
usr/share/rear/finalize/GNU/Linux/280_migrate_uuid_tags.sh
jsmeix commented at 2022-04-12 12:55:¶
@bwelterl
if you like you may try out what I have here up to now.
jsmeix commented at 2022-04-13 09:38:¶
Via
https://github.com/rear/rear/pull/2788/commits/21f500abaec9b70d76d484a5c0ba9a8f3d44fbaa
I explained in default.conf that CHECK_CONFIG_FILES is also used
during "rear recover"
to detect if restored config files do not match what was recreated
and
I added /etc/fstab and /etc/crypttab to the default
CHECK_CONFIG_FILES
because they can contain UUIDs that are relevant for "rear recover"
according to what I currently get on my homeoffice laptop:
# lsblk -ipo KNAME,TYPE,FSTYPE,MOUNTPOINT,UUID,PTUUID,PARTUUID
KNAME TYPE FSTYPE MOUNTPOINT UUID PTUUID PARTUUID
/dev/sda disk 9be0015e-cf90-4c6b-80ac-c4ea89832553
/dev/sda1 part 9be0015e-cf90-4c6b-80ac-c4ea89832553 4a795619-730c-4e74-854b-59ea03e4f1ab
/dev/sda2 part crypto_LUKS 54fc77c5-8ec2-457f-b558-9deda3b843b2 9be0015e-cf90-4c6b-80ac-c4ea89832553 ab0c1a1e-729a-4f9d-9961-9a090fc850f4
/dev/sda3 part crypto_LUKS a6dba0d8-5be8-4970-b1e7-a272ae0cafdd 9be0015e-cf90-4c6b-80ac-c4ea89832553 27533ab0-2616-4e49-89c3-8d7158636d30
/dev/sda4 part ext4 /nfs 4c4a923d-1562-4254-a1fa-4e761278c02f 9be0015e-cf90-4c6b-80ac-c4ea89832553 047a5693-3b2f-4586-9d9f-5b4b25680f81
/dev/sda5 part ext4 /var/lib/libvirt 4a42395e-4f9d-4056-9948-6d5d9d92d990 9be0015e-cf90-4c6b-80ac-c4ea89832553 a42060c3-33d5-4986-8a9f-41753ab83aa1
/dev/sda6 part ext2 259dac9c-f2fd-4181-a351-83603398e465 9be0015e-cf90-4c6b-80ac-c4ea89832553 630abca0-e067-44a0-bc97-8d8fac21b067
/dev/sda7 part crypto_LUKS 1b4198c9-d9b0-4c57-b9a3-3433e391e706 9be0015e-cf90-4c6b-80ac-c4ea89832553 1ffae0c8-1212-428c-bf9e-4c3e635763bb
/dev/sda8 part crypto_LUKS 3e874a28-7415-4f8c-9757-b3f28a96c4d2 9be0015e-cf90-4c6b-80ac-c4ea89832553 3af73460-4ed7-4000-b471-7c1eb35b6bc7
/dev/sr0 rom
/dev/dm-0 crypt swap [SWAP] 6d8f8998-dd20-412a-bcc2-618eed858662
/dev/dm-1 crypt ext4 / f05af948-6075-40a3-9191-354b0a0a9afc
# for uuid in $( lsblk -no UUID,PTUUID,PARTUUID ) ; do find /etc -type f | xargs grep -il $uuid ; done | sort -u
/etc/crypttab
/etc/fstab
/etc/lvm/cache/.cache
# for u in $( lsblk -no UUID,PTUUID,PARTUUID ) ; do grep -io $u var/lib/rear/layout/disklayout.conf ; done | sort -u
1b4198c9-d9b0-4c57-b9a3-3433e391e706
259dac9c-f2fd-4181-a351-83603398e465
3e874a28-7415-4f8c-9757-b3f28a96c4d2
4a42395e-4f9d-4056-9948-6d5d9d92d990
4c4a923d-1562-4254-a1fa-4e761278c02f
54fc77c5-8ec2-457f-b558-9deda3b843b2
6d8f8998-dd20-412a-bcc2-618eed858662
a6dba0d8-5be8-4970-b1e7-a272ae0cafdd
f05af948-6075-40a3-9191-354b0a0a9afc
# for uuid in $( for u in $( lsblk -no UUID,PTUUID,PARTUUID ) ; do grep -io $u var/lib/rear/layout/disklayout.conf ; done | sort -u ) ; do find /etc -type f | xargs grep -il $uuid ; done | sort -u
/etc/crypttab
/etc/fstab
/etc/lvm/cache/.cache
I ignored /etc/lvm/cache/.cache
because I think this one
is not actually relevant for "rear recover".
jsmeix commented at 2022-04-13 09:52:¶
From my current point of view this pull request should be now OK
(but I did not yet test what UUIDs are used with LVM and RAID
so some more config files may need to be added to
the default CHECK_CONFIG_FILES).
If there are no objections I would like to merge it next week
(as time permits probably on Wednesday afternoon or a bit later)
so that there will be at least a simple test during "rear recover"
that can detect if restored config files do not match what was
recreated,
cf.
https://github.com/rear/rear/issues/2787#issuecomment-1091793474
jsmeix commented at 2022-04-13 10:21:¶
Automated fixing UUIDs in restored config files
if UUIDs in restored config files do not match the UUIDS in
disklayout.conf
will not be part of this pull request.
Reasons:
In general:
First things first, one step at a time, and one after the other.
In particular:
Currently I have no idea how to implement such automated fixing.
Assume we save /etc/fstab at "rear mkrescue" time in the recovery
system
(so UUIDs in the saved fstab match those in disklayout.conf).
Assume UUIDs in the restored fstab differ.
How could we know what is the right UUID to replace a wrong UUID
without specific context knowledge of the particular file?
One of the UUIDs in disklayout.conf is the right UUID
but what is the matching wrong UUID in fstab?
I think we cannot rely on line numbers.
So I think we need specific context knowledge of the particular file.
But this requires a specific parser for each particular config file
which is not what I would implement.
Conclusion:
From my current point of view I think a test during "rear recover"
that shows such issues to the user is all what we can implement
and continuously maintain in the future with reasonable effort.
A possible further enhancement (not in this pull request):
When we save the config files in CHECK_CONFIG_FILES
at "rear mkrescue" time somewhere in the recovery system,
we could show better info to the user because we could
show which specific config file does not match
and
the user could see the differences in the file which would provide
needed file specific context to the user so it should be much easier
for the user to manually correct such issues, cf.
https://github.com/rear/rear/issues/2785#issuecomment-1091802968
pcahyna commented at 2022-04-13 16:22:¶
@jsmeix sorry for my late reaction, I took a day off yesterday. I will have a look at your changes and comment.
pcahyna commented at 2022-04-14 13:30:¶
@jsmeix
The main reason is to avoid false alarm
when UUIDs in disklayout.conf do not appear in a config file
on the original system - then those UUIDs do not need to
be checked in the recreated system.
can you please give an example of this false alarm? Does this false alarm occur with the current ReaR code?
pcahyna commented at 2022-04-14 13:37:¶
@jsmeix isn't it a mistake that
layout/compare/default/510_compare_files.sh
is not executed after
restoring the backup (the checksums seem to be used only at
checklayout
)? I think it would be enough to detect the problem, and
likely other problems due to inconsistent backups as well.
bwelterl commented at 2022-04-14 13:54:¶
@jsmeix
The main reason is to avoid false alarm
when UUIDs in disklayout.conf do not appear in a config file
on the original system - then those UUIDs do not need to
be checked in the recreated system.can you please give an example of this false alarm? Does this false alarm occur with the current ReaR code?
From the list of files that are checked ([b]oot/{grub.conf,menu.lst,device.map} [e]tc/grub.* [b]oot/grub/{grub.conf,grub.cfg,menu.lst,device.map} [b]oot/grub2/{grub.conf,grub.cfg,menu.lst,device.map} [e]tc/sysconfig/grub [e]tc/sysconfig/bootloader [e]tc/lilo.conf [e]tc/elilo.conf [e]tc/mtab [e]tc/fstab [e]tc/mtools.conf [e]tc/smartd.conf [e]tc/sysconfig/smartmontools [e]tc/sysconfig/rawdevices [e]tc/security/pam_mount.conf.xml [b]oot/efi///grub.cfg),some are restored but does not include automatically the UUID that should be updated (seen the issue with grub.cfg for example).
pcahyna commented at 2022-04-14 14:03:¶
From the list of files that are checked ([b]oot/{grub.conf,menu.lst,device.map} [e]tc/grub.* [b]oot/grub/{grub.conf,grub.cfg,menu.lst,device.map} [b]oot/grub2/{grub.conf,grub.cfg,menu.lst,device.map} [e]tc/sysconfig/grub [e]tc/sysconfig/bootloader [e]tc/lilo.conf [e]tc/elilo.conf [e]tc/mtab [e]tc/fstab [e]tc/mtools.conf [e]tc/smartd.conf [e]tc/sysconfig/smartmontools [e]tc/sysconfig/rawdevices [e]tc/security/pam_mount.conf.xml [b]oot/efi///grub.cfg),some are restored but does not include automatically the UUID that should be updated (seen the issue with grub.cfg for example).
Sure, but does this lead to any alarm currently?
My point is that
The main reason is to avoid false alarm
when UUIDs in disklayout.conf do not appear in a config file
on the original system - then those UUIDs do not need to
be checked in the recreated system.
sounds like an explanation of the reason for this change, but is it the intent really?
jsmeix commented at 2022-04-20 08:45:¶
@pcahyna
regarding your questions in
https://github.com/rear/rear/pull/2788#issuecomment-1099189225
can you please give an example of this false alarm?
Does this false alarm occur with the current ReaR code?
Currently I cannot give an example
that a UUIDs in disklayout.conf does not appear in a config file.
On my homeoffice laptop all UUIDs in disklayout.conf
appear either in /etc/fstab or in /etc/crypttab
but none appears in both files.
There is no false alarm with the current ReaR master code
simply because there is not any alarm in ReaR master code
when UUIDs in disklayout.conf that had been in a config file
do no longer appear in a restored config file.
It is the whole purpose of this pull request to implement
to raise alarm when a UUID in disklayout.conf that had been in a config
file
does no longer appear in a restored config file.
Of course this current pull request does not prove that
a UUID in disklayout.conf that had been in several config files
still appears in all those restored config files.
Assume a UUID in disklayout.conf was in /etc/fstab and in
/etc/crypttab
but in the restored config files it appears only in /etc/fstab but
no longer in /etc/crypttab
then
the test in this current pull request detects the UUID
in the restored /etc/fstab but falsely does not raise alarm
that this UUID is missing in the resored /etc/crypttab
jsmeix commented at 2022-04-20 09:01:¶
This pull request is not meant as some final solution
that proves whether or not all restored config files
match the actually recreated system, cf.
https://github.com/rear/rear/issues/2787#issuecomment-1095029576
I like to do first things first, one step at a time, and one after the
other
in particular to get step by step sufficient understanding of an issue
to be able to develop step by step a reasonable solution in practice.
But if the final solution is the only thing that is acceptable,
someone else would have to implement it completely and properly.
My main reason is that I like to get ReaR 2.7 released
in the next weeks and not postponed indefinitely
and I have many other things to do to get there, cf.
https://github.com/rear/rear/issues/2751
jsmeix commented at 2022-04-20 09:21:¶
@pcahyna
regarding your
https://github.com/rear/rear/pull/2788#issuecomment-1099195961
isn't it a mistake that layout/compare/default/510_compare_files.sh
is not executed after restoring the backup
(the checksums seem to be used only at checklayout)?
I think it would be enough to detect the problem, and likely
other problems due to inconsistent backups as well.
Interesting idea!
We would need to run layout/compare/default/510_compare_files.sh
before any "finalize" script may modify some restored files
so we may run layout/compare/default/510_compare_files.sh
at the end of the "restore" stage e.g. as something like
usr/share/rear/restore/default/992_compare_files.sh
We would still have to enhance what there is
by default in CHECK_CONFIG_FILES
in particular add files liks /etc/fstab and /etc/crypttab
and likely some more of those generic config files.
jsmeix commented at 2022-04-20 09:26:¶
Cf. what I wote in
https://github.com/rear/rear/issues/2787#issuecomment-1092853451
For internal backup methods we may run what checklayout does
during mkbackuponly to avoid that an inconsistent backup
is taken with mkbackuponly after mkrescue.
So it seems the needed code is already there.
It only needs to be called more often
(and enhanced as needed).
pcahyna commented at 2022-04-20 09:28:¶
@jsmeix should I try to implement and test this idea? I think it would be both simpler and more reliable than what you propose in this PR.
jsmeix commented at 2022-04-20 09:30:¶
@pcahyna
I would very much appreciate it if you try to implement and test this
idea!
Thank you so much in advance!
jsmeix commented at 2022-04-20 09:39:¶
Something more related to my
https://github.com/rear/rear/issues/2787#issuecomment-1092853451
When an inconsistent backup was taken before mkrescue ...
I think we may run layout/save/default/600_snapshot_files.sh
at the end of mkbackuponly
and as counterpart
run layout/compare/default/510_compare_files.sh
at the beginning of mkrescue
to detect
when the backup that was taken before mkrescue
is now outdated
(i.e. when files in CHECK_CONFIG_FILES changed meanwhile)
so it could be inconsistent with what mkrescue
stores in
disklayout.conf.
pcahyna commented at 2022-04-20 09:56:¶
Something more related to my #2787 (comment)
When an inconsistent backup was taken before mkrescue ...
I think we may run layout/save/default/600_snapshot_files.sh at the end of
mkbackuponly
and as counterpart run layout/compare/default/510_compare_files.sh at the beginning ofmkrescue
to detect when the backup that was taken beforemkrescue
is now outdated (i.e. when files in CHECK_CONFIG_FILES changed meanwhile) so it could be inconsistent with whatmkrescue
stores in disklayout.conf.
That was the point of
https://github.com/rear/rear/issues/2787#issuecomment-1094988418
. But you would need a different place for information saved by
mkbackuponly
and checked by mkrescue
than for information saved by
mkrescue
and checked by mkbackuponly
. Otherwise you would not know
whether it is the backup or the rescue image that is out of date. (You
certainly don't need to detect and warn in mkrescue
about the
inconsistency with a previous mkrescue
run - only with a previous
mkbackuponly
run.)
jsmeix commented at 2022-04-20 10:45:¶
Only an offhanded idea:
I wonder if the most generic way to detect inconsistencies could be
to save existing var/lib/rear/layout/ and var/lib/rear/recovery/
directories
before a new layout is created e.g. in
layout/save/GNU/Linux/100_create_layout_file.sh via
mv -f $v $VAR_DIR/layout $VAR_DIR/layout.old
mv -f $v $VAR_DIR/recovery $VAR_DIR/recovery.old
and store in a file in the new created $VAR_DIR/layout
what workflow was used to create the new layout.
Then we can compare as we like the old layout with the current one
and do what is appropriate for any sequence of workflows.
Because of COPY_AS_IS=( $SHARE_DIR $VAR_DIR )
in default.conf
also $VAR_DIR/layout.old and $VAR_DIR/recovery.old get included
in the recovery system so we could even compare things
during "rear recover" and do what is appropriate and
also the user could manually compare things in the recovery system
when he needs to fix things manually.
jsmeix commented at 2022-04-20 10:53:¶
Sigh - as always offhanded ideas are incomplete:
A simple
mv -f $v $VAR_DIR/layout $VAR_DIR/layout.old
mv -f $v $VAR_DIR/recovery $VAR_DIR/recovery.old
is insufficient because when the user runs on arbitrary days
rear mkrescue
rear mkbackuponly
rear mkbackuponly
the layout at mkrescue
time is no longer saved.
pcahyna commented at 2022-04-20 10:58:¶
yes - that's why I wrote "But you would need a different place for information saved by mkbackuponly and checked by mkrescue than for information saved by mkrescue and checked by mkbackuponly"
jsmeix commented at 2022-04-20 11:00:¶
I am thinking about a generic way to detect inconsistencies
because there are many possible combinations of the
mkrescue mkbackuponly mkbackup workflows.
pcahyna commented at 2022-04-20 11:06:¶
I believe that one needs to determine what are the things that may become inconsistent and design the detection according to that. In this case, there seem to be two things that may have inconsistencies between them : backup and rescue image. It seems that it is then enough for each one to record what is current when they got created (by mkbackuponly and mkrescue respectively) and, and check what the other has recorded. So far, it does not seem that anything more generic is needed.
jsmeix commented at 2022-04-20 11:16:¶
My problem is this part in layout/save/GNU/Linux/100_create_layout_file.sh
Log "Creating layout directories (when not existing)"
mkdir -p $v $VAR_DIR/layout
mkdir -p $v $VAR_DIR/recovery
mkdir -p $v $VAR_DIR/layout/config
The hardcoded '$VAR_DIR/layout' '$VAR_DIR/recovery'
'$VAR_DIR/layout/config'
indicate this names are also used hardcoded at many other places in our
code.
Yes, I can find them in about 150 lines of code.
So I am thinking about to somehow only appropriately
move/rename those directories when the "layout/save" stage
was not run by 'mkrescue' or 'mkbackup' to keep the names
'$VAR_DIR/layout' '$VAR_DIR/recovery' '$VAR_DIR/layout/config'
only for what is created by 'mkrescue' or 'mkbackup'
i.e. for what will be used for recreating during "rear recover".
Perhaps an additional final script of the "layout/save" stage like
usr/share/rear/layout/save/default/995_move_layout_directories.sh
could do that when the "layout/save" stage
was not run by 'mkrescue' or 'mkbackup'?
jsmeix commented at 2022-04-20 11:35:¶
I think that moving away the layout base directories
'$VAR_DIR/layout' and '$VAR_DIR/recovery'
when the "layout/save" stage was not run by 'mkrescue' or 'mkbackup'
is a good thing because in this case it makes clear that there is
no up-to-date layout in $VAR_DIR on the original system.
The layout information that will be used for recreating
is then only inside the recovery system (e.g. in its ISO image).
Only when the "layout/save" stage was run by 'mkrescue' or 'mkbackup'
there is an up-to-date layout in $VAR_DIR also on the original system.
pcahyna commented at 2022-04-21 13:59:¶
@pcahyna I would very much appreciate it if you try to implement and test this idea! Thank you so much in advance!
I am working on it now.
Do you please know why does
layout/compare/default/510_compare_files.sh
save md5sums into a file
and compares the file with the original, instead of using the
md5sum -c
command, which seems suitable for this purpose?
jsmeix commented at 2022-04-22 06:56:¶
@pcahyna
the actual code in layout/compare/default/510_compare_files.sh
is dated 2011 and I don't know why the code therein is as it is.
In general feel free to also mercilessly clean up things
if you need to adapt or enhance certain code parts.
By the way:
Normally diff
is not the right tool to only compare if files differ
and diff -u
is plain wrong for that - if at all diff -s
or diff -q
may be used
# find usr/share/rear -type f | xargs grep 'diff -u'
usr/share/rear/output/default/940_grub_rescue.sh:
if ! diff -u $grub_conf $TMP_DIR/menu.lst >&2; then
usr/share/rear/output/default/940_grub2_rescue.sh:
if ! diff -u $grub_conf $generated_grub_conf >&2 ; then
usr/share/rear/layout/recreate/default/250_verify_mount.sh:
if diff -u <( df -P $TARGET_FS_ROOT ) <( df -P / ) >/dev/null ; then
usr/share/rear/layout/compare/default/510_compare_files.sh:
if ! diff -u $TMP_DIR/files.md5sum $VAR_DIR/layout/config/files.md5sum ; then
Sigh :-(
To only compare if files differ cmp -s
should be used,
e.g. see layout/compare/default/500_compare_layout.sh and
https://github.com/rear/rear/issues/1657#issuecomment-353353529
And layout/compare/default/500_compare_layout.sh
even shows an example how diff -U0
could be used.
For an md5sum -c
example you may have a look at
skel/default/etc/scripts/system-setup
pcahyna commented at 2022-04-22 09:36:¶
@jsmeix the advantage of diff -u
may be that it shows what is the
actual difference, which is what the user likes to know in this case -
cmp
would not do that.
I tried to use md5sum -c
instead of diff -u
in layout/compare
and
it seems to work, but not sure if I am going to clean this up as part of
this effort, solving the original problem has a higher priority and
there is more to do.
By the way, you proposed to run this check as
usr/share/rear/restore/default/992_compare_files.sh
. Wouldn't it be
more logical to add it to the finalize
stage as one of the first
steps, instead of adding it to the restore
stage as one of the last
steps? To me it seems that such checks belong more to the finalize
stage, but I am not that sure about the separation into stages.
pcahyna commented at 2022-04-22 09:38:¶
I think that moving away the layout base directories '$VAR_DIR/layout' and '$VAR_DIR/recovery' when the "layout/save" stage was not run by 'mkrescue' or 'mkbackup' is a good thing because in this case it makes clear that there is no up-to-date layout in $VAR_DIR on the original system. The layout information that will be used for recreating is then only inside the recovery system (e.g. in its ISO image).
Only when the "layout/save" stage was run by 'mkrescue' or 'mkbackup' there is an up-to-date layout in $VAR_DIR also on the original system.
It seems that it leads to a complicated discussion, so I would say let's focus on detecting inconsistencies during recovery first, and treat the approaches to prevent inconsistencies at backup time separately.
jsmeix commented at 2022-04-22 10:05:¶
I fully agree with focus on detecting inconsistencies during recovery
first,
and treat the approaches to prevent inconsistencies at backup time
separately.
jsmeix commented at 2022-04-22 10:14:¶
My thinking why to run this check at the end of the 'restore' stage
was
that we like to detect inconsistencies because of the restored files
so the check belongs to the 'restore' stage
i.e. it is an additional "was the restore OK?" check.
Furthermore we have the "restoreonly" workflow to restore several
backups, cf.
https://github.com/rear/rear/blob/master/doc/user-guide/11-multiple-backups.adoc
So a check to detect inconsistencies because of restored files
is perhaps better done separatedly for each individual restore.
On the other hand a check to detect inconsistencies because of restored
files
might be perhaps better done after all restores were done but then
it is not clear to the user which backup restore caused what
inconsistencies.
pcahyna commented at 2022-04-22 10:19:¶
By the way : @bwelterl 's original problem (#2785) seems to be the
volume ID of the FAT filesystem used on the EFI system partition. #2546
seems to address this case: it recreates the FAT filesystem using the
original volume ID, so no need to transform the ID using sed
, because
the recreated ID will be correct -> we get rid of one error-prone
transformation of the recreated system. (He was using ReaR 2.6, not the
development version.)
Of course, the problem will now be that even without sed
, the restored
/etc/fstab
will still be out of sync with the recovered system,
because the rescue ISO is not consistent with the backup. This IMO shows
that focusing on the files transformed using sed
is not the right
approach, sed
is not the real problem - the real problem is the
inconsistency and we should detect it regardless of whether sed
is
used on the file or not.
pcahyna commented at 2022-04-22 10:39:¶
So a check to detect inconsistencies because of restored files
is perhaps better done separatedly for each individual restore.
I am afraid that multiple backups will have problem with UUID
adjustements anyway, because the HOWTO for multiple backups instructs
the user to first do rear recover
and then rear restoreonly
, so the
finalize
steps will not run on files restored by the later
restoreonly
, and if there are UUIDs to migrate, they will not get
migrated.
Additionally, if there were UUIDs to migrate during the first recover
,
running the check at the end of each restoreonly
will cause false
alarms, because even if the md5sums matched at the beginning, after the
UUID migration (performed during finalize
) they will not match anymore
(which is fine, but there is no way to know that).
jsmeix commented at 2022-04-22 10:49:¶
Regarding
https://github.com/rear/rear/pull/2788#issuecomment-1106340261
Yes, that is what I meant in my
https://github.com/rear/rear/issues/2785#issuecomment-1091773765
which is why I did not approve
https://github.com/rear/rear/pull/2786
I also tried to explain that with long winded words in my
code comment in this pull request in particular starting at this line
https://github.com/rear/rear/blob/21f500abaec9b70d76d484a5c0ba9a8f3d44fbaa/usr/share/rear/finalize/GNU/Linux/280_migrate_uuid_tags.sh#L23
jsmeix commented at 2022-04-22 11:02:¶
Regarding
https://github.com/rear/rear/pull/2788#issuecomment-1106363648
Multiple backups work well when done correctly as described in
https://github.com/rear/rear/blob/master/doc/user-guide/11-multiple-backups.adoc
which reads (excerpt)
Recovery of that system could be done
by calling in the ReaR recovery/rescue system:
rear -C basic_system recover
rear -C home_backup restoreonly
rear -C opt_backup restoreonly
Note that system recovery with multiple backups requires
that first and foremost the basic system is recovered
where all files must be restored that are needed
to install the bootloader and to boot the basic system
into a normal usable state.
Nowadays systemd usually needs files in the /usr directory
so that in practice in particular all files in the /usr directory
must be restored during the initial basic system recovery
plus whatever else is needed to boot and run the basic system.
Multiple backups cannot be used to split the files of the basic system
into several backups. The files of the basic system must be in
one single backup and that backup must be restored during
the initial recovery of the basic system.
It could happen that all is well for rear -C basic_system recover
but a subsequent rear -C some_other_backup restoreonly
restores files that are inconsistent with the recreaded system
(e.g. because some_other_backup contains an inconsistent etc/fstab).
So a check to detect inconsistencies because of restored files
must be done separatedly for each individual restore to detect
possible
inconsistencies by a subsequent
rear -C some_other_backup restoreonly
where no 'finalize' stage is run.
In general it is a user error to have basic system files (e.g.
etc/fstab)
also in another subsequent backup so when UUIDs in etc/fstab
were modified (which is not the usual case because UUIDs are normally
recreated as they were on the original system) during rear recover
and then the check at the end of a subsequent rear restoreonly
causes alarm it is OK because "something is wrong" then (regardless
that the alarm was raised because of a different technical reason).
jsmeix commented at 2022-04-22 11:13:¶
Oops!
I think it is exactly the other way round!
Because the test does not test only the restored files of the last
restored backup
but the test tests files in a fixed list (e.g. those in
CHECK_CONFIG_FILES)
it means when that test is run also for a subsequent
rear restoreonly
and during the rear recover
before files were modified in 'finalize'
stage
then the second run of that test during subsequent rear restoreonly
will actually cause false alarm for all those modified files.
So that test must not be run at the end of the 'restore' stage
but at the beginning of the 'finalize' stage
to ensure that test is run only once
before resored files may get modified in 'finalize' stage.
This matches the intent of that test which is only to check
inconsistencies in basic system files.
pcahyna commented at 2022-04-22 11:15:¶
It could happen that all is well for
rear -C basic_system recover
but a subsequentrear -C some_other_backup restoreonly
restores files that are inconsistent with the recreaded system (e.g. because some_other_backup contains an inconsistent etc/fstab).
Then the backups are not taken according to the HOWTO that you quoted. We can detect this kind of misconfiguration, but should we?
So a check to detect inconsistencies because of restored files must be done separatedly for each individual restore to detect possible inconsistencies by a subsequent
rear -C some_other_backup restoreonly
where no 'finalize' stage is run.
This would work if we check only inconsistencies in files restored by
this particular restoreonly
run. But I am afraid that there is no
general way of knowing what files were restored by a particular run, so
we will check everything at the end of each restoreonly
run, leading
to the false alarm that I have described.
In general it is a user error to have basic system files (e.g. etc/fstab) also in another subsequent backup so when UUIDs in etc/fstab were modified (which is not the usual case because UUIDs are normally recreated as they were on the original system) during
rear recover
and then the check at the end of a subsequentrear restoreonly
causes alarm it is OK because "something is wrong" then (regardless that the alarm was raised because of a different technical reason).
The trouble is, the user will get this alarm even if all is well and fstab (or any other file with patched IDs) was not included in the subsequently restored backup.
jsmeix commented at 2022-04-22 11:17:¶
@pcahyna
I am sorry - I was confused.
You are right to run the test at the beginning of the 'finalize' stage,
see my
https://github.com/rear/rear/pull/2788#issuecomment-1106410919
pcahyna commented at 2022-04-22 11:18:¶
Regarding #2788 (comment)
Yes, that is what I meant in my #2785 (comment) which is why I did not approve #2786
I also tried to explain that with long winded words in my code comment in this pull request in particular starting at this line
Sorry, it was not meant as a critique of your approach, more of the assumptions in the original bug report. And sorry for commenting on something that you have already addressed, I lost the context of this issue a bit.
jsmeix commented at 2022-04-22 11:23:¶
@pcahyna
no problem - don't worry!
I appreciate all your comments.
I also got lost in overcomplicated thinking
and your comments helped me to find the way out :-)
It proves that such issues are actually rather complicated and need
such discussions so we together can find a solution that really works.
jsmeix commented at 2022-04-27 10:12:¶
This one is obsoleted by https://github.com/rear/rear/pull/2795
jsmeix commented at 2022-04-27 10:14:¶
I keep the branch of this pull request for now as reference
jsmeix-UUIDs-consistency-check
jsmeix commented at 2022-04-27 10:17:¶
@pcahyna
thank you so much for your
https://github.com/rear/rear/pull/2795
which looks much cleaner and simpler and more straightforward
than what I did here.
[Export of Github issue for rear/rear.]