#2788 PR closed: Detect by UUIDs if restored config files do not match what was recreated

Labels: enhancement, won't fix / can't fix / obsolete

jsmeix opened issue at 2022-04-11 13:17:

During "rear mkrescue/mkbackup/mkbackuponly/savelayout"
save which UUIDs in disklayout.conf appear
in a config file in CHECK_CONFIG_FILES
which is used for comparison during "rear recover"
if those UUIDs in disklayout.conf still appear
in a restored config file in CHECK_CONFIG_FILES
in the recreated system under /mnt/local

The main reason is to avoid false alarm
when UUIDs in disklayout.conf do not appear in a config file
on the original system - then those UUIDs do not need to
be checked in the recreated system.

jsmeix commented at 2022-04-11 13:36:

The default CHECK_CONFIG_FILES is insufficient
(in particular /etc/fstab is missing)
so that needs to be adapted to make it useful

For my test I have in my etc/rear/local.conf

CHECK_CONFIG_FILES+=( /etc )

On my rather slow Intel i3 Laptop with a spinning disk
I did a test after a reeboot so that the file buffer caches are cold
and usr/share/rear/layout/save/default/600_snapshot_files.sh
did run from 2022-04-12 10:42:49.598645397
until 2022-04-12 10:42:55.001197954
so about 5 seconds to check whole /etc/ with cold caches.

I got this /var/tmp/rear.GSxN9PoPwczFS6f/rootfs/etc/rear/rescue.conf
(excerpt):

# The following line was added by layout/save/default/600_snapshot_files.sh
DISKLAYOUT_UUIDS_IN_CONFIG_FILES='259dac9c-f2fd-4181-a351-83603398e465 4a42395e-4f9d-4056-9948-6d5d9d92d990 4c4a923d-1562-4254-a1fa-4e761278c02f 54fc77c5-8ec2-457f-b558-9deda3b843b2 6d8f8998-dd20-412a-bcc2-618eed858662 a6dba0d8-5be8-4970-b1e7-a272ae0cafdd f05af948-6075-40a3-9191-354b0a0a9afc '

The lines with a UUID in disklayout.conf

# grep "uuid=[[:alnum:]]" var/lib/rear/layout/disklayout.conf

fs /dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3 / ext4 uuid=f05af948-6075-40a3-9191-354b0a0a9afc label= blocksize=4096 reserved_blocks=4% max_mounts=-1 check_interval=0d bytes_per_inode=16383 default_mount_options=user_xattr,acl options=rw,relatime
fs /dev/sda4 /nfs ext4 uuid=4c4a923d-1562-4254-a1fa-4e761278c02f label= blocksize=4096 reserved_blocks=5% max_mounts=-1 check_interval=0d bytes_per_inode=16384 default_mount_options=user_xattr,acl options=rw,relatime
fs /dev/sda5 /var/lib/libvirt ext4 uuid=4a42395e-4f9d-4056-9948-6d5d9d92d990 label= blocksize=4096 reserved_blocks=1% max_mounts=-1 check_interval=0d bytes_per_inode=16384 default_mount_options=user_xattr,acl options=rw,relatime
#fs /dev/sda6 /other ext2 uuid=259dac9c-f2fd-4181-a351-83603398e465 label= blocksize=4096 reserved_blocks=4% max_mounts=-1 check_interval=0d bytes_per_inode=16384 default_mount_options=user_xattr,acl options=rw,relatime
swap /dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2 uuid=6d8f8998-dd20-412a-bcc2-618eed858662 label=
crypt /dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3 /dev/sda3 type=luks1 cipher=aes-xts-plain64 key_size=256 hash=sha256 uuid=a6dba0d8-5be8-4970-b1e7-a272ae0cafdd 
crypt /dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2 /dev/sda2 type=luks1 cipher=aes-xts-plain64 key_size=256 hash=sha256 uuid=54fc77c5-8ec2-457f-b558-9deda3b843b2

The UUIDs in files in /etc:

# for u in f05af948-6075-40a3-9191-354b0a0a9afc 4c4a923d-1562-4254-a1fa-4e761278c02f 4a42395e-4f9d-4056-9948-6d5d9d92d990 259dac9c-f2fd-4181-a351-83603398e465 6d8f8998-dd20-412a-bcc2-618eed858662 a6dba0d8-5be8-4970-b1e7-a272ae0cafdd 54fc77c5-8ec2-457f-b558-9deda3b843b2 ; do find /etc -type f | xargs grep $u ; echo ; done

/etc/fstab:UUID=f05af948-6075-40a3-9191-354b0a0a9afc  /                 ext4  acl,user_xattr  0  1
/etc/lvm/cache/.cache:          "/dev/disk/by-uuid/f05af948-6075-40a3-9191-354b0a0a9afc",

/etc/fstab:UUID=4c4a923d-1562-4254-a1fa-4e761278c02f  /nfs              ext4  defaults        0  2
/etc/lvm/cache/.cache:          "/dev/disk/by-uuid/4c4a923d-1562-4254-a1fa-4e761278c02f",

/etc/fstab:UUID=4a42395e-4f9d-4056-9948-6d5d9d92d990  /var/lib/libvirt  ext4  defaults        0  2
/etc/lvm/cache/.cache:          "/dev/disk/by-uuid/4a42395e-4f9d-4056-9948-6d5d9d92d990",

/etc/lvm/cache/.cache:          "/dev/disk/by-uuid/259dac9c-f2fd-4181-a351-83603398e465",

/etc/fstab:UUID=6d8f8998-dd20-412a-bcc2-618eed858662  swap              swap  defaults        0  0
/etc/lvm/cache/.cache:          "/dev/disk/by-uuid/6d8f8998-dd20-412a-bcc2-618eed858662",

/etc/crypttab:cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3  UUID=a6dba0d8-5be8-4970-b1e7-a272ae0cafdd
/etc/lvm/cache/.cache:          "/dev/disk/by-uuid/a6dba0d8-5be8-4970-b1e7-a272ae0cafdd",

/etc/crypttab:cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2  UUID=54fc77c5-8ec2-457f-b558-9deda3b843b2
/etc/lvm/cache/.cache:          "/dev/disk/by-uuid/54fc77c5-8ec2-457f-b558-9deda3b843b2",

My block devices:

# lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT,UUID

NAME                                     KNAME     PKNAME    TRAN   TYPE  FSTYPE      LABEL   SIZE MOUNTPOINT       UUID
/dev/sda                                 /dev/sda            sata   disk                    465.8G                  
|-/dev/sda1                              /dev/sda1 /dev/sda         part                        8M                  
|-/dev/sda2                              /dev/sda2 /dev/sda         part  crypto_LUKS           4G                  54fc77c5-8ec2-457f-b558-9deda3b843b2
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part2
|                                        /dev/dm-1 /dev/sda2        crypt swap                  4G [SWAP]           6d8f8998-dd20-412a-bcc2-618eed858662
|-/dev/sda3                              /dev/sda3 /dev/sda         part  crypto_LUKS         200G                  a6dba0d8-5be8-4970-b1e7-a272ae0cafdd
| `-/dev/mapper/cr_ata-TOSHIBA_MQ01ABF050_Y2PLP02CT-part3
|                                        /dev/dm-0 /dev/sda3        crypt ext4                200G /                f05af948-6075-40a3-9191-354b0a0a9afc
|-/dev/sda4                              /dev/sda4 /dev/sda         part  ext4                100G /nfs             4c4a923d-1562-4254-a1fa-4e761278c02f
|-/dev/sda5                              /dev/sda5 /dev/sda         part  ext4                150G /var/lib/libvirt 4a42395e-4f9d-4056-9948-6d5d9d92d990
|-/dev/sda6                              /dev/sda6 /dev/sda         part  ext2                  8G /other           259dac9c-f2fd-4181-a351-83603398e465
|-/dev/sda7                              /dev/sda7 /dev/sda         part  crypto_LUKS           1G                  1b4198c9-d9b0-4c57-b9a3-3433e391e706
`-/dev/sda8                              /dev/sda8 /dev/sda         part  crypto_LUKS           1G                  3e874a28-7415-4f8c-9757-b3f28a96c4d2
/dev/sr0                                 /dev/sr0            sata   rom                      1024M

This are quick raw results from my very first test.
I did not check any details.

jsmeix commented at 2022-04-11 13:45:

I will continue tomorrow with the "rear recover" part (as time permits).

jsmeix commented at 2022-04-12 10:13:

I will test now the "rear recover" part...

jsmeix commented at 2022-04-12 12:27:

Now "rear recover" reports when a UUID that was in at least one of the config files
in CHECK_CONFIG_FILES at the time when disklayout.conf was created
is not in at least one of the restored config files in CHECK_CONFIG_FILES.

I did this "rear mkbackup" and "rear recover" test here
on a different system (now on two same VMs)
than the above so UUIDs are different here.

I tested it by modifying a UUID in DISKLAYOUT_UUIDS_IN_CONFIG_FILES
in /etc/rear/rescue.conf in the ReaR recovery system before running "rear recover"
and got on the terminal

Running 'finalize' stage ======================
...
UUID QQQe2NM5J-brmg-6kic-exST-oBwb-GY2h-KREbYJ not found in a restored config file (likely this must be manually corrected)

(I prepended the UUID with QQQ in /etc/rear/rescue.conf).

jsmeix commented at 2022-04-12 12:29:

@pcahyna @rear/contributors
could you please have a look here and tell me what you think about it.

See in particular my explanatory comment about it at the beginning of
usr/share/rear/finalize/GNU/Linux/280_migrate_uuid_tags.sh

jsmeix commented at 2022-04-12 12:55:

@bwelterl
if you like you may try out what I have here up to now.

jsmeix commented at 2022-04-13 09:38:

Via
https://github.com/rear/rear/pull/2788/commits/21f500abaec9b70d76d484a5c0ba9a8f3d44fbaa
I explained in default.conf that CHECK_CONFIG_FILES is also used during "rear recover"
to detect if restored config files do not match what was recreated
and
I added /etc/fstab and /etc/crypttab to the default CHECK_CONFIG_FILES
because they can contain UUIDs that are relevant for "rear recover"
according to what I currently get on my homeoffice laptop:

# lsblk -ipo KNAME,TYPE,FSTYPE,MOUNTPOINT,UUID,PTUUID,PARTUUID

KNAME     TYPE  FSTYPE      MOUNTPOINT       UUID                                 PTUUID                               PARTUUID
/dev/sda  disk                                                                    9be0015e-cf90-4c6b-80ac-c4ea89832553 
/dev/sda1 part                                                                    9be0015e-cf90-4c6b-80ac-c4ea89832553 4a795619-730c-4e74-854b-59ea03e4f1ab
/dev/sda2 part  crypto_LUKS                  54fc77c5-8ec2-457f-b558-9deda3b843b2 9be0015e-cf90-4c6b-80ac-c4ea89832553 ab0c1a1e-729a-4f9d-9961-9a090fc850f4
/dev/sda3 part  crypto_LUKS                  a6dba0d8-5be8-4970-b1e7-a272ae0cafdd 9be0015e-cf90-4c6b-80ac-c4ea89832553 27533ab0-2616-4e49-89c3-8d7158636d30
/dev/sda4 part  ext4        /nfs             4c4a923d-1562-4254-a1fa-4e761278c02f 9be0015e-cf90-4c6b-80ac-c4ea89832553 047a5693-3b2f-4586-9d9f-5b4b25680f81
/dev/sda5 part  ext4        /var/lib/libvirt 4a42395e-4f9d-4056-9948-6d5d9d92d990 9be0015e-cf90-4c6b-80ac-c4ea89832553 a42060c3-33d5-4986-8a9f-41753ab83aa1
/dev/sda6 part  ext2                         259dac9c-f2fd-4181-a351-83603398e465 9be0015e-cf90-4c6b-80ac-c4ea89832553 630abca0-e067-44a0-bc97-8d8fac21b067
/dev/sda7 part  crypto_LUKS                  1b4198c9-d9b0-4c57-b9a3-3433e391e706 9be0015e-cf90-4c6b-80ac-c4ea89832553 1ffae0c8-1212-428c-bf9e-4c3e635763bb
/dev/sda8 part  crypto_LUKS                  3e874a28-7415-4f8c-9757-b3f28a96c4d2 9be0015e-cf90-4c6b-80ac-c4ea89832553 3af73460-4ed7-4000-b471-7c1eb35b6bc7
/dev/sr0  rom                                                                                                          
/dev/dm-0 crypt swap        [SWAP]           6d8f8998-dd20-412a-bcc2-618eed858662                                      
/dev/dm-1 crypt ext4        /                f05af948-6075-40a3-9191-354b0a0a9afc

# for uuid in $( lsblk -no UUID,PTUUID,PARTUUID ) ; do find /etc -type f | xargs grep -il $uuid ; done | sort -u

/etc/crypttab
/etc/fstab
/etc/lvm/cache/.cache

# for u in $( lsblk -no UUID,PTUUID,PARTUUID ) ; do grep -io $u var/lib/rear/layout/disklayout.conf ; done | sort -u

1b4198c9-d9b0-4c57-b9a3-3433e391e706
259dac9c-f2fd-4181-a351-83603398e465
3e874a28-7415-4f8c-9757-b3f28a96c4d2
4a42395e-4f9d-4056-9948-6d5d9d92d990
4c4a923d-1562-4254-a1fa-4e761278c02f
54fc77c5-8ec2-457f-b558-9deda3b843b2
6d8f8998-dd20-412a-bcc2-618eed858662
a6dba0d8-5be8-4970-b1e7-a272ae0cafdd
f05af948-6075-40a3-9191-354b0a0a9afc

# for uuid in $( for u in $( lsblk -no UUID,PTUUID,PARTUUID ) ; do grep -io $u var/lib/rear/layout/disklayout.conf ; done | sort -u ) ; do find /etc -type f | xargs grep -il $uuid ; done | sort -u

/etc/crypttab
/etc/fstab
/etc/lvm/cache/.cache

I ignored /etc/lvm/cache/.cache because I think this one
is not actually relevant for "rear recover".

jsmeix commented at 2022-04-13 09:52:

From my current point of view this pull request should be now OK
(but I did not yet test what UUIDs are used with LVM and RAID
so some more config files may need to be added to
the default CHECK_CONFIG_FILES).

If there are no objections I would like to merge it next week
(as time permits probably on Wednesday afternoon or a bit later)
so that there will be at least a simple test during "rear recover"
that can detect if restored config files do not match what was recreated,
cf. https://github.com/rear/rear/issues/2787#issuecomment-1091793474

jsmeix commented at 2022-04-13 10:21:

Automated fixing UUIDs in restored config files
if UUIDs in restored config files do not match the UUIDS in disklayout.conf
will not be part of this pull request.

Reasons:

In general:
First things first, one step at a time, and one after the other.

In particular:
Currently I have no idea how to implement such automated fixing.
Assume we save /etc/fstab at "rear mkrescue" time in the recovery system
(so UUIDs in the saved fstab match those in disklayout.conf).
Assume UUIDs in the restored fstab differ.
How could we know what is the right UUID to replace a wrong UUID
without specific context knowledge of the particular file?
One of the UUIDs in disklayout.conf is the right UUID
but what is the matching wrong UUID in fstab?
I think we cannot rely on line numbers.
So I think we need specific context knowledge of the particular file.
But this requires a specific parser for each particular config file
which is not what I would implement.

Conclusion:
From my current point of view I think a test during "rear recover"
that shows such issues to the user is all what we can implement
and continuously maintain in the future with reasonable effort.

A possible further enhancement (not in this pull request):
When we save the config files in CHECK_CONFIG_FILES
at "rear mkrescue" time somewhere in the recovery system,
we could show better info to the user because we could
show which specific config file does not match
and
the user could see the differences in the file which would provide
needed file specific context to the user so it should be much easier
for the user to manually correct such issues, cf.
https://github.com/rear/rear/issues/2785#issuecomment-1091802968

pcahyna commented at 2022-04-13 16:22:

@jsmeix sorry for my late reaction, I took a day off yesterday. I will have a look at your changes and comment.

pcahyna commented at 2022-04-14 13:30:

@jsmeix

The main reason is to avoid false alarm
when UUIDs in disklayout.conf do not appear in a config file
on the original system - then those UUIDs do not need to
be checked in the recreated system.

can you please give an example of this false alarm? Does this false alarm occur with the current ReaR code?

pcahyna commented at 2022-04-14 13:37:

@jsmeix isn't it a mistake that layout/compare/default/510_compare_files.sh is not executed after restoring the backup (the checksums seem to be used only at checklayout)? I think it would be enough to detect the problem, and likely other problems due to inconsistent backups as well.

bwelterl commented at 2022-04-14 13:54:

@jsmeix

The main reason is to avoid false alarm
when UUIDs in disklayout.conf do not appear in a config file
on the original system - then those UUIDs do not need to
be checked in the recreated system.

can you please give an example of this false alarm? Does this false alarm occur with the current ReaR code?

From the list of files that are checked ([b]oot/{grub.conf,menu.lst,device.map} [e]tc/grub.* [b]oot/grub/{grub.conf,grub.cfg,menu.lst,device.map} [b]oot/grub2/{grub.conf,grub.cfg,menu.lst,device.map} [e]tc/sysconfig/grub [e]tc/sysconfig/bootloader [e]tc/lilo.conf [e]tc/elilo.conf [e]tc/mtab [e]tc/fstab [e]tc/mtools.conf [e]tc/smartd.conf [e]tc/sysconfig/smartmontools [e]tc/sysconfig/rawdevices [e]tc/security/pam_mount.conf.xml [b]oot/efi///grub.cfg),some are restored but does not include automatically the UUID that should be updated (seen the issue with grub.cfg for example).

pcahyna commented at 2022-04-14 14:03:

From the list of files that are checked ([b]oot/{grub.conf,menu.lst,device.map} [e]tc/grub.* [b]oot/grub/{grub.conf,grub.cfg,menu.lst,device.map} [b]oot/grub2/{grub.conf,grub.cfg,menu.lst,device.map} [e]tc/sysconfig/grub [e]tc/sysconfig/bootloader [e]tc/lilo.conf [e]tc/elilo.conf [e]tc/mtab [e]tc/fstab [e]tc/mtools.conf [e]tc/smartd.conf [e]tc/sysconfig/smartmontools [e]tc/sysconfig/rawdevices [e]tc/security/pam_mount.conf.xml [b]oot/efi///grub.cfg),some are restored but does not include automatically the UUID that should be updated (seen the issue with grub.cfg for example).

Sure, but does this lead to any alarm currently?

My point is that

The main reason is to avoid false alarm
when UUIDs in disklayout.conf do not appear in a config file
on the original system - then those UUIDs do not need to
be checked in the recreated system.

sounds like an explanation of the reason for this change, but is it the intent really?

jsmeix commented at 2022-04-20 08:45:

@pcahyna
regarding your questions in
https://github.com/rear/rear/pull/2788#issuecomment-1099189225

can you please give an example of this false alarm?
Does this false alarm occur with the current ReaR code?

Currently I cannot give an example
that a UUIDs in disklayout.conf does not appear in a config file.
On my homeoffice laptop all UUIDs in disklayout.conf
appear either in /etc/fstab or in /etc/crypttab
but none appears in both files.

There is no false alarm with the current ReaR master code
simply because there is not any alarm in ReaR master code
when UUIDs in disklayout.conf that had been in a config file
do no longer appear in a restored config file.

It is the whole purpose of this pull request to implement
to raise alarm when a UUID in disklayout.conf that had been in a config file
does no longer appear in a restored config file.

Of course this current pull request does not prove that
a UUID in disklayout.conf that had been in several config files
still appears in all those restored config files.

Assume a UUID in disklayout.conf was in /etc/fstab and in /etc/crypttab
but in the restored config files it appears only in /etc/fstab but
no longer in /etc/crypttab
then
the test in this current pull request detects the UUID
in the restored /etc/fstab but falsely does not raise alarm
that this UUID is missing in the resored /etc/crypttab

jsmeix commented at 2022-04-20 09:01:

This pull request is not meant as some final solution
that proves whether or not all restored config files
match the actually recreated system, cf.
https://github.com/rear/rear/issues/2787#issuecomment-1095029576

I like to do first things first, one step at a time, and one after the other
in particular to get step by step sufficient understanding of an issue
to be able to develop step by step a reasonable solution in practice.

But if the final solution is the only thing that is acceptable,
someone else would have to implement it completely and properly.

My main reason is that I like to get ReaR 2.7 released
in the next weeks and not postponed indefinitely
and I have many other things to do to get there, cf.
https://github.com/rear/rear/issues/2751

jsmeix commented at 2022-04-20 09:21:

@pcahyna
regarding your
https://github.com/rear/rear/pull/2788#issuecomment-1099195961

isn't it a mistake that layout/compare/default/510_compare_files.sh
is not executed after restoring the backup
(the checksums seem to be used only at checklayout)?
I think it would be enough to detect the problem, and likely
other problems due to inconsistent backups as well.

Interesting idea!

We would need to run layout/compare/default/510_compare_files.sh
before any "finalize" script may modify some restored files
so we may run layout/compare/default/510_compare_files.sh
at the end of the "restore" stage e.g. as something like
usr/share/rear/restore/default/992_compare_files.sh

We would still have to enhance what there is
by default in CHECK_CONFIG_FILES
in particular add files liks /etc/fstab and /etc/crypttab
and likely some more of those generic config files.

jsmeix commented at 2022-04-20 09:26:

Cf. what I wote in
https://github.com/rear/rear/issues/2787#issuecomment-1092853451

For internal backup methods we may run what checklayout does
during mkbackuponly to avoid that an inconsistent backup
is taken with mkbackuponly after mkrescue.

So it seems the needed code is already there.
It only needs to be called more often
(and enhanced as needed).

pcahyna commented at 2022-04-20 09:28:

@jsmeix should I try to implement and test this idea? I think it would be both simpler and more reliable than what you propose in this PR.

jsmeix commented at 2022-04-20 09:30:

@pcahyna
I would very much appreciate it if you try to implement and test this idea!
Thank you so much in advance!

jsmeix commented at 2022-04-20 09:39:

Something more related to my
https://github.com/rear/rear/issues/2787#issuecomment-1092853451
When an inconsistent backup was taken before mkrescue ...

I think we may run layout/save/default/600_snapshot_files.sh
at the end of mkbackuponly and as counterpart
run layout/compare/default/510_compare_files.sh
at the beginning of mkrescue to detect
when the backup that was taken before mkrescue is now outdated
(i.e. when files in CHECK_CONFIG_FILES changed meanwhile)
so it could be inconsistent with what mkrescue stores in disklayout.conf.

pcahyna commented at 2022-04-20 09:56:

Something more related to my #2787 (comment) When an inconsistent backup was taken before mkrescue ...

I think we may run layout/save/default/600_snapshot_files.sh at the end of mkbackuponly and as counterpart run layout/compare/default/510_compare_files.sh at the beginning of mkrescue to detect when the backup that was taken before mkrescue is now outdated (i.e. when files in CHECK_CONFIG_FILES changed meanwhile) so it could be inconsistent with what mkrescue stores in disklayout.conf.

That was the point of https://github.com/rear/rear/issues/2787#issuecomment-1094988418 . But you would need a different place for information saved by mkbackuponly and checked by mkrescue than for information saved by mkrescue and checked by mkbackuponly. Otherwise you would not know whether it is the backup or the rescue image that is out of date. (You certainly don't need to detect and warn in mkrescue about the inconsistency with a previous mkrescue run - only with a previous mkbackuponly run.)

jsmeix commented at 2022-04-20 10:45:

Only an offhanded idea:

I wonder if the most generic way to detect inconsistencies could be
to save existing var/lib/rear/layout/ and var/lib/rear/recovery/ directories
before a new layout is created e.g. in
layout/save/GNU/Linux/100_create_layout_file.sh via

mv -f $v $VAR_DIR/layout $VAR_DIR/layout.old
mv -f $v $VAR_DIR/recovery $VAR_DIR/recovery.old

and store in a file in the new created $VAR_DIR/layout
what workflow was used to create the new layout.

Then we can compare as we like the old layout with the current one
and do what is appropriate for any sequence of workflows.

Because of COPY_AS_IS=( $SHARE_DIR $VAR_DIR ) in default.conf
also $VAR_DIR/layout.old and $VAR_DIR/recovery.old get included
in the recovery system so we could even compare things
during "rear recover" and do what is appropriate and
also the user could manually compare things in the recovery system
when he needs to fix things manually.

jsmeix commented at 2022-04-20 10:53:

Sigh - as always offhanded ideas are incomplete:

A simple

mv -f $v $VAR_DIR/layout $VAR_DIR/layout.old
mv -f $v $VAR_DIR/recovery $VAR_DIR/recovery.old

is insufficient because when the user runs on arbitrary days

rear mkrescue
rear mkbackuponly
rear mkbackuponly

the layout at mkrescue time is no longer saved.

pcahyna commented at 2022-04-20 10:58:

yes - that's why I wrote "But you would need a different place for information saved by mkbackuponly and checked by mkrescue than for information saved by mkrescue and checked by mkbackuponly"

jsmeix commented at 2022-04-20 11:00:

I am thinking about a generic way to detect inconsistencies
because there are many possible combinations of the
mkrescue mkbackuponly mkbackup workflows.

pcahyna commented at 2022-04-20 11:06:

I believe that one needs to determine what are the things that may become inconsistent and design the detection according to that. In this case, there seem to be two things that may have inconsistencies between them : backup and rescue image. It seems that it is then enough for each one to record what is current when they got created (by mkbackuponly and mkrescue respectively) and, and check what the other has recorded. So far, it does not seem that anything more generic is needed.

jsmeix commented at 2022-04-20 11:16:

My problem is this part in layout/save/GNU/Linux/100_create_layout_file.sh

Log "Creating layout directories (when not existing)"
mkdir -p $v $VAR_DIR/layout
mkdir -p $v $VAR_DIR/recovery
mkdir -p $v $VAR_DIR/layout/config

The hardcoded '$VAR_DIR/layout' '$VAR_DIR/recovery' '$VAR_DIR/layout/config'
indicate this names are also used hardcoded at many other places in our code.
Yes, I can find them in about 150 lines of code.

So I am thinking about to somehow only appropriately
move/rename those directories when the "layout/save" stage
was not run by 'mkrescue' or 'mkbackup' to keep the names
'$VAR_DIR/layout' '$VAR_DIR/recovery' '$VAR_DIR/layout/config'
only for what is created by 'mkrescue' or 'mkbackup'
i.e. for what will be used for recreating during "rear recover".

Perhaps an additional final script of the "layout/save" stage like
usr/share/rear/layout/save/default/995_move_layout_directories.sh
could do that when the "layout/save" stage
was not run by 'mkrescue' or 'mkbackup'?

jsmeix commented at 2022-04-20 11:35:

I think that moving away the layout base directories
'$VAR_DIR/layout' and '$VAR_DIR/recovery'
when the "layout/save" stage was not run by 'mkrescue' or 'mkbackup'
is a good thing because in this case it makes clear that there is
no up-to-date layout in $VAR_DIR on the original system.
The layout information that will be used for recreating
is then only inside the recovery system (e.g. in its ISO image).

Only when the "layout/save" stage was run by 'mkrescue' or 'mkbackup'
there is an up-to-date layout in $VAR_DIR also on the original system.

pcahyna commented at 2022-04-21 13:59:

@pcahyna I would very much appreciate it if you try to implement and test this idea! Thank you so much in advance!

I am working on it now.

Do you please know why does layout/compare/default/510_compare_files.sh save md5sums into a file and compares the file with the original, instead of using the md5sum -c command, which seems suitable for this purpose?

jsmeix commented at 2022-04-22 06:56:

@pcahyna
the actual code in layout/compare/default/510_compare_files.sh
is dated 2011 and I don't know why the code therein is as it is.

In general feel free to also mercilessly clean up things
if you need to adapt or enhance certain code parts.

By the way:
Normally diff is not the right tool to only compare if files differ
and diff -u is plain wrong for that - if at all diff -s or diff -q may be used

# find usr/share/rear -type f | xargs grep 'diff -u'

usr/share/rear/output/default/940_grub_rescue.sh:
if ! diff -u $grub_conf $TMP_DIR/menu.lst >&2; then

usr/share/rear/output/default/940_grub2_rescue.sh:
    if ! diff -u $grub_conf $generated_grub_conf >&2 ; then

usr/share/rear/layout/recreate/default/250_verify_mount.sh:
if diff -u <( df -P $TARGET_FS_ROOT ) <( df -P / ) >/dev/null ; then

usr/share/rear/layout/compare/default/510_compare_files.sh:
    if ! diff -u $TMP_DIR/files.md5sum $VAR_DIR/layout/config/files.md5sum ; then

Sigh :-(

To only compare if files differ cmp -s should be used,
e.g. see layout/compare/default/500_compare_layout.sh and
https://github.com/rear/rear/issues/1657#issuecomment-353353529

And layout/compare/default/500_compare_layout.sh
even shows an example how diff -U0 could be used.

For an md5sum -c example you may have a look at
skel/default/etc/scripts/system-setup

pcahyna commented at 2022-04-22 09:36:

@jsmeix the advantage of diff -u may be that it shows what is the actual difference, which is what the user likes to know in this case - cmp would not do that.
I tried to use md5sum -c instead of diff -u in layout/compare and it seems to work, but not sure if I am going to clean this up as part of this effort, solving the original problem has a higher priority and there is more to do.
By the way, you proposed to run this check as usr/share/rear/restore/default/992_compare_files.sh. Wouldn't it be more logical to add it to the finalize stage as one of the first steps, instead of adding it to the restore stage as one of the last steps? To me it seems that such checks belong more to the finalize stage, but I am not that sure about the separation into stages.

pcahyna commented at 2022-04-22 09:38:

I think that moving away the layout base directories '$VAR_DIR/layout' and '$VAR_DIR/recovery' when the "layout/save" stage was not run by 'mkrescue' or 'mkbackup' is a good thing because in this case it makes clear that there is no up-to-date layout in $VAR_DIR on the original system. The layout information that will be used for recreating is then only inside the recovery system (e.g. in its ISO image).

Only when the "layout/save" stage was run by 'mkrescue' or 'mkbackup' there is an up-to-date layout in $VAR_DIR also on the original system.

It seems that it leads to a complicated discussion, so I would say let's focus on detecting inconsistencies during recovery first, and treat the approaches to prevent inconsistencies at backup time separately.

jsmeix commented at 2022-04-22 10:05:

I fully agree with focus on detecting inconsistencies during recovery first,
and treat the approaches to prevent inconsistencies at backup time separately.

jsmeix commented at 2022-04-22 10:14:

My thinking why to run this check at the end of the 'restore' stage was
that we like to detect inconsistencies because of the restored files
so the check belongs to the 'restore' stage
i.e. it is an additional "was the restore OK?" check.

Furthermore we have the "restoreonly" workflow to restore several backups, cf.
https://github.com/rear/rear/blob/master/doc/user-guide/11-multiple-backups.adoc
So a check to detect inconsistencies because of restored files
is perhaps better done separatedly for each individual restore.
On the other hand a check to detect inconsistencies because of restored files
might be perhaps better done after all restores were done but then
it is not clear to the user which backup restore caused what inconsistencies.

pcahyna commented at 2022-04-22 10:19:

By the way : @bwelterl 's original problem (#2785) seems to be the volume ID of the FAT filesystem used on the EFI system partition. #2546 seems to address this case: it recreates the FAT filesystem using the original volume ID, so no need to transform the ID using sed, because the recreated ID will be correct -> we get rid of one error-prone transformation of the recreated system. (He was using ReaR 2.6, not the development version.)

Of course, the problem will now be that even without sed, the restored /etc/fstab will still be out of sync with the recovered system, because the rescue ISO is not consistent with the backup. This IMO shows that focusing on the files transformed using sed is not the right approach, sed is not the real problem - the real problem is the inconsistency and we should detect it regardless of whether sed is used on the file or not.

pcahyna commented at 2022-04-22 10:39:

So a check to detect inconsistencies because of restored files
is perhaps better done separatedly for each individual restore.

I am afraid that multiple backups will have problem with UUID adjustements anyway, because the HOWTO for multiple backups instructs the user to first do rear recover and then rear restoreonly, so the finalize steps will not run on files restored by the later restoreonly, and if there are UUIDs to migrate, they will not get migrated.
Additionally, if there were UUIDs to migrate during the first recover, running the check at the end of each restoreonly will cause false alarms, because even if the md5sums matched at the beginning, after the UUID migration (performed during finalize) they will not match anymore (which is fine, but there is no way to know that).

jsmeix commented at 2022-04-22 10:49:

Regarding
https://github.com/rear/rear/pull/2788#issuecomment-1106340261

Yes, that is what I meant in my
https://github.com/rear/rear/issues/2785#issuecomment-1091773765
which is why I did not approve
https://github.com/rear/rear/pull/2786

I also tried to explain that with long winded words in my
code comment in this pull request in particular starting at this line
https://github.com/rear/rear/blob/21f500abaec9b70d76d484a5c0ba9a8f3d44fbaa/usr/share/rear/finalize/GNU/Linux/280_migrate_uuid_tags.sh#L23

jsmeix commented at 2022-04-22 11:02:

Regarding
https://github.com/rear/rear/pull/2788#issuecomment-1106363648

Multiple backups work well when done correctly as described in
https://github.com/rear/rear/blob/master/doc/user-guide/11-multiple-backups.adoc
which reads (excerpt)

Recovery of that system could be done
by calling in the ReaR recovery/rescue system:

rear -C basic_system recover
rear -C home_backup restoreonly
rear -C opt_backup restoreonly

Note that system recovery with multiple backups requires
that first and foremost the basic system is recovered
where all files must be restored that are needed
to install the bootloader and to boot the basic system
into a normal usable state.

Nowadays systemd usually needs files in the /usr directory
so that in practice in particular all files in the /usr directory
must be restored during the initial basic system recovery
plus whatever else is needed to boot and run the basic system.

Multiple backups cannot be used to split the files of the basic system
into several backups. The files of the basic system must be in
one single backup and that backup must be restored during
the initial recovery of the basic system.

It could happen that all is well for rear -C basic_system recover
but a subsequent rear -C some_other_backup restoreonly
restores files that are inconsistent with the recreaded system
(e.g. because some_other_backup contains an inconsistent etc/fstab).

So a check to detect inconsistencies because of restored files
must be done separatedly for each individual restore to detect possible
inconsistencies by a subsequent rear -C some_other_backup restoreonly
where no 'finalize' stage is run.

In general it is a user error to have basic system files (e.g. etc/fstab)
also in another subsequent backup so when UUIDs in etc/fstab
were modified (which is not the usual case because UUIDs are normally
recreated as they were on the original system) during rear recover
and then the check at the end of a subsequent rear restoreonly
causes alarm it is OK because "something is wrong" then (regardless
that the alarm was raised because of a different technical reason).

jsmeix commented at 2022-04-22 11:13:

Oops!

I think it is exactly the other way round!

Because the test does not test only the restored files of the last restored backup
but the test tests files in a fixed list (e.g. those in CHECK_CONFIG_FILES)
it means when that test is run also for a subsequent rear restoreonly
and during the rear recover before files were modified in 'finalize' stage
then the second run of that test during subsequent rear restoreonly
will actually cause false alarm for all those modified files.

So that test must not be run at the end of the 'restore' stage
but at the beginning of the 'finalize' stage
to ensure that test is run only once
before resored files may get modified in 'finalize' stage.

This matches the intent of that test which is only to check
inconsistencies in basic system files.

pcahyna commented at 2022-04-22 11:15:

It could happen that all is well for rear -C basic_system recover but a subsequent rear -C some_other_backup restoreonly restores files that are inconsistent with the recreaded system (e.g. because some_other_backup contains an inconsistent etc/fstab).

Then the backups are not taken according to the HOWTO that you quoted. We can detect this kind of misconfiguration, but should we?

So a check to detect inconsistencies because of restored files must be done separatedly for each individual restore to detect possible inconsistencies by a subsequent rear -C some_other_backup restoreonly where no 'finalize' stage is run.

This would work if we check only inconsistencies in files restored by this particular restoreonly run. But I am afraid that there is no general way of knowing what files were restored by a particular run, so we will check everything at the end of each restoreonly run, leading to the false alarm that I have described.

In general it is a user error to have basic system files (e.g. etc/fstab) also in another subsequent backup so when UUIDs in etc/fstab were modified (which is not the usual case because UUIDs are normally recreated as they were on the original system) during rear recover and then the check at the end of a subsequent rear restoreonly causes alarm it is OK because "something is wrong" then (regardless that the alarm was raised because of a different technical reason).

The trouble is, the user will get this alarm even if all is well and fstab (or any other file with patched IDs) was not included in the subsequently restored backup.

jsmeix commented at 2022-04-22 11:17:

@pcahyna
I am sorry - I was confused.

You are right to run the test at the beginning of the 'finalize' stage, see my
https://github.com/rear/rear/pull/2788#issuecomment-1106410919

pcahyna commented at 2022-04-22 11:18:

Regarding #2788 (comment)

Yes, that is what I meant in my #2785 (comment) which is why I did not approve #2786

I also tried to explain that with long winded words in my code comment in this pull request in particular starting at this line

https://github.com/rear/rear/blob/21f500abaec9b70d76d484a5c0ba9a8f3d44fbaa/usr/share/rear/finalize/GNU/Linux/280_migrate_uuid_tags.sh#L23

Sorry, it was not meant as a critique of your approach, more of the assumptions in the original bug report. And sorry for commenting on something that you have already addressed, I lost the context of this issue a bit.

jsmeix commented at 2022-04-22 11:23:

@pcahyna
no problem - don't worry!
I appreciate all your comments.

I also got lost in overcomplicated thinking
and your comments helped me to find the way out :-)

It proves that such issues are actually rather complicated and need
such discussions so we together can find a solution that really works.

jsmeix commented at 2022-04-27 10:12:

This one is obsoleted by https://github.com/rear/rear/pull/2795

jsmeix commented at 2022-04-27 10:14:

I keep the branch of this pull request for now as reference
jsmeix-UUIDs-consistency-check

jsmeix commented at 2022-04-27 10:17:

@pcahyna
thank you so much for your https://github.com/rear/rear/pull/2795
which looks much cleaner and simpler and more straightforward
than what I did here.


[Export of Github issue for rear/rear.]