#2831 PR merged
: Refactor rsync URL support, fixes rsync OUTPUT_URL¶
Labels: enhancement
, cleanup
, fixed / solved / done
pcahyna opened issue at 2022-06-29 10:04:¶
Pull Request Details:¶
-
Type: Bug Fix
-
Impact: Normal
-
Reference to related issue (URL): closes #2781
-
How was this pull request tested?
- recovery using
OUTPUT=USB
BACKUP=RSYNC
with arsync://...:/...
(rsync over ssh) andrsync://...::/...
(rsyncd)BACKUP_URL
- recovery using
OUTPUT=ISO
BACKUP=RSYNC
with arsync://...:/...
(rsync over ssh)BACKUP_URL
and three values ofOUTPUT_URL
:rsync://...:/...
(rsync over ssh) but different fromBACKUP_URL
,rsync://...::/...
(rsyncd), and undefined (to inheritBACKUP_URL
). Verified that the output ISO is found at the right places on the rsync server. - mkrescue and mkbackuponly to localhost with a
rsync://...:/...
(rsync over ssh)BACKUP_URL
andOUTPUT_URL
, the latter pointing to a different directory than the former. Verified that aftermkrescue
the ISO is underOUTPUT_URL
and notBACKUP_URL
.
- recovery using
-
Brief description of the changes in this pull request:
The code to parse rsync://
URLs was BACKUP_URL
specific.
If one specified BACKUP=RSYNC
and an OUTPUT_URL
different from
BACKUP_URL
,
the OUTPUT_URL
was ignored and the output files went to BACKUP_URL
.
Fix by introducing generic functions for rsync URL parsing and
use them for both BACKUP_URL
and OUTPUT_URL
, as appropriate.
Replace all uses of global RSYNC_* variables derived
from BACKUP_URL
by those functions.
jsmeix commented at 2022-06-29 11:59:¶
@pcahyna
thank you very much for that major cleanup!
Could you please provide me a summary of the advantages
but also the possibly backward incompatible changes
that could happen because of this cleanup?
I would like to add that summary to the
"New features, bigger enhancements, and possibly backward incompatible
changes"
section in the ReaR 2.7 release notes which are currently at
https://github.com/rear/rear.github.com/blob/jsmeix-release-notes-2-7/documentation/release-notes-2-7.md
pcahyna commented at 2022-06-29 15:12:¶
During the work on this PR I have found many other anomalies in the rsync code, most of them are left untouched.
- the handling of
rsync
output URL scheme is under RSYNC (https://github.com/rear/rear/tree/master/usr/share/rear/output/RSYNC/default), which is a backup method, despite the output URL being in principle independent from backup method. This leads to different code being executed forOUTPUT_URL=rsync://
depending on whetherBACKUP=RSYNC
or not. (If not, the output is handled in950_copy_result_files.sh
, which seems to handle the URL differently: https://github.com/rear/rear/blob/8b8ad86c71eaea0ad03d64f5892e3f37d797a5e0/usr/share/rear/output/default/950_copy_result_files.sh#L131 )- as a consequence, if
BACKUP=RSYNC
, the structure of the output files at arsync://
location is different. In other cases,RSYNC_PREFIX
is not respected, but in case ofBACKUP=RSYNC
,RSYNC_PREFIX
is respected. See https://github.com/rear/rear/issues/2781#issuecomment-1160172273 for details.
- as a consequence, if
- Creation of the
backup
prefix directory happens in theoutput
stage (https://github.com/rear/rear/blob/8b8ad86c71eaea0ad03d64f5892e3f37d797a5e0/usr/share/rear/output/RSYNC/default/200_make_prefix_dir.sh#L6). Ifbackup
is executed withoutoutput
(by runningrear mkbackuponly
), it crashes, because the prefix directory does not exist. (This one is fixed in the PR.) BACKUP_RSYNC_OPTIONS
inconsistencies between rsync-over-ssh and rsyncd cases. For rsyncd, they are always added to rsync options, for ssh only when creating the backup. The former actually leads to a bug, where--relative
gets added toBACKUP_RSYNC_OPTIONS
and then.autorelabel
becomes a directory (in https://github.com/rear/rear/blob/8b8ad86c71eaea0ad03d64f5892e3f37d797a5e0/usr/share/rear/backup/RSYNC/GNU/Linux/620_force_autorelabel.sh#L25), leading to selinux relabeling not working (I had to fix SELinux manually in my tests).- Some places seem to assume ssh access unconditionally, especially
500_make_rsync_backup.sh (see the
check_remote_df
,check_remote_du
functions). One instance of similar code got removed in #2797. - There is detection of xattr support on the remote server (using ssh
access)
https://github.com/rear/rear/blob/8b8ad86c71eaea0ad03d64f5892e3f37d797a5e0/usr/share/rear/prep/RSYNC/default/150_check_rsync_protocol_version.sh#L43,
but this does not seem to work: in
cd ${path} && touch .is_xattr_supported && setfattr -n user.comment -v 'File created by ReaR to test if this filesystems suppor extended attributes.' .is_xattr_supported && getfattr -n user.comment .is_xattr_supported 1>/dev/null; find .is_xattr_supported -empty -delete
, the lastfind
clobbers the exit status.
pcahyna commented at 2022-06-29 19:16:¶
Could you please provide me a summary of the advantages
but also the possibly backward incompatible changes
that could happen because of this cleanup?
I suppose you are interested in changes since 2.6, that is this change and the change that triggered #2781 combined. Ok, I will provide a summary.
pcahyna commented at 2022-06-29 19:55:¶
@jsmeix "Rsync OUTPUT_URLs are now properly supported with
BACKUP=RSYNC
. Previously the output went to the location specified by
BACKUP_URL
and OUTPUT_URL
was ignored. One exception was
OUTPUT=PXE
, where the output was uploaded to OUTPUT_URL
in addition
to BACKUP_URL
, but RSYNC_PREFIX
was not respected and the
interpretation of the URL was different - an URL of the form
rsync://[USER@]HOST[:PORT]/PATH
was interpreted as using the rsync
protocol, while in all other cases such URL would be interpreted as
using rsync over ssh. This special handling is now removed - a rsync
OUTPUT_URL
with OUTPUT=PXE
now creates the RSYNC_PREFIX
directory
at the destination and the URL is interpreted as in all other cases. "
Does it make sense? I know it is complicated. Too many special cases, fortunately it is about removing some of them...
jsmeix commented at 2022-06-30 07:56:¶
@pcahyna
thank you for your summary!
Yes, it does make sense.
didacog commented at 2022-07-05 08:16:¶
Hello, seems that you are avoiding OUTPUT_PREFIX_PXE and is not being created when set in the config.
jsmeix commented at 2022-07-05 09:02:¶
@didacog
I will act for now here for @pcahyna
until he has again time to look here
but I am neither a PXE user nor a rsync user
so I can not at all imagine what you have
in your etc/rear/local.conf
Therefore please provide here sufficient information
that I have at least a chance to understand your issue
in particular your complete etc/rear/local.conf
and your complete etc/rear/site.conf if you use one.
I need the actual ReaR config files because
I am not a DRLM user so DRLM config files are "foreign" for me.
Additionally I need a complete ReaR debugscript log file
to have a chance to deduce what happens on your system.
Only FYI:
I want to release ReaR 2.7 on the date specified on
https://github.com/rear/rear/milestones
because I already postponed it rather often and
I can't wait endlessly until every issue is solved
because I am under increasing pressure to release
a new ReaR version (regardless if it is perfect or not).
For SUSE I could fix missing things relatively easy
by SUSE maintenance updates for our SLE-HA customers
and I assume it is same for Red Hat for RHEL customers.
Other users could use our latest master code.
jsmeix commented at 2022-07-05 09:07:¶
Regarding OUTPUT_PREFIX_PXE I see no difference between
the changed code here and the master code:
localhost:~/rear.pcahyna.rsync-url-fix-refactor.pullrequest2831 # find usr/sbin/rear usr/share/rear -type f | xargs grep 'OUTPUT_PREFIX_PXE' >/tmp/OUTPUT_PREFIX_PXE.pcahyna.rsync-url-fix-refactor
localhost:~/rear.github.master # find usr/sbin/rear usr/share/rear -type f | xargs grep 'OUTPUT_PREFIX_PXE' >/tmp/OUTPUT_PREFIX_PXE.rear.github.master
# diff -s /tmp/OUTPUT_PREFIX_PXE.rear.github.master /tmp/OUTPUT_PREFIX_PXE.pcahyna.rsync-url-fix-refactor
Files /tmp/OUTPUT_PREFIX_PXE.rear.github.master and /tmp/OUTPUT_PREFIX_PXE.pcahyna.rsync-url-fix-refactor are identical
so it seems nothing was changed regarding OUTPUT_PREFIX_PXE here.
didacog commented at 2022-07-05 09:27:¶
Regarding OUTPUT_PREFIX_PXE I see no difference between the changed code here and the master code:
localhost:~/rear.pcahyna.rsync-url-fix-refactor.pullrequest2831 # find usr/sbin/rear usr/share/rear -type f | xargs grep 'OUTPUT_PREFIX_PXE' >/tmp/OUTPUT_PREFIX_PXE.pcahyna.rsync-url-fix-refactor localhost:~/rear.github.master # find usr/sbin/rear usr/share/rear -type f | xargs grep 'OUTPUT_PREFIX_PXE' >/tmp/OUTPUT_PREFIX_PXE.rear.github.master # diff -s /tmp/OUTPUT_PREFIX_PXE.rear.github.master /tmp/OUTPUT_PREFIX_PXE.pcahyna.rsync-url-fix-refactor Files /tmp/OUTPUT_PREFIX_PXE.rear.github.master and /tmp/OUTPUT_PREFIX_PXE.pcahyna.rsync-url-fix-refactor are identical
so it seems nothing was changed regarding OUTPUT_PREFIX_PXE here.
@jsmeix
I just figured it out because of this comment:
This special handling is now removed - a rsync OUTPUT_URL with OUTPUT=PXE now creates the RSYNC_PREFIX directory at the destination and the URL is interpreted as in all other cases. "
didacog commented at 2022-07-05 09:32:¶
@didacog I will act for now here for @pcahyna until he has again time to look here but I am neither a PXE user nor a rsync user so I can not at all imagine what you have in your etc/rear/local.conf
Therefore please provide here sufficient information that I have at least a chance to understand your issue in particular your complete etc/rear/local.conf and your complete etc/rear/site.conf if you use one. I need the actual ReaR config files because I am not a DRLM user so DRLM config files are "foreign" for me. Additionally I need a complete ReaR debugscript log file to have a chance to deduce what happens on your system.
Only FYI: I want to release ReaR 2.7 on the date specified on https://github.com/rear/rear/milestones because I already postponed it rather often and I can't wait endlessly until every issue is solved because I am under increasing pressure to release a new ReaR version (regardless if it is perfect or not). For SUSE I could fix missing things relatively easy by SUSE maintenance updates for our SLE-HA customers and I assume it is same for Red Hat for RHEL customers. Other users could use our latest master code.
@jsmeix
I understand, but I think this regression should be fixed before
releasing, or at least an agreement on how to solve it. Before any
change ReaR always created the PXE subfolder in the URL path, and is
important for security reasons. I'm sure nobody wants to share data over
tftp because backup and pxe outputs are in the same place ...
Hopefully we can solve this before July 12th.
Regards,
Didac
pcahyna commented at 2022-07-05 15:32:¶
@jsmeix still here... I don't think that OUTPUT_PREFIX_PXE
is the
problem (it had no special handling in the old code indeed, so it should
not need one now), but there is something else going on.
jsmeix commented at 2022-07-06 07:43:¶
@pcahyna
I appreciate it that you still participate here
but you don't need if you don't like and when you have
currently better things "to do" with "higher priority" ;-)
jsmeix commented at 2022-07-06 07:50:¶
I have a plan how to proceed with this issue:
I would like to merge it tomorrow afternoon
unless there are objections from @pcahyna or @rear/contributors
My reasoning:
In its current state it is another good step forward
towards cleaned up rsync code in ReaR.
And when it is merged it is simpler and easier to get tested
by all users who use our current GitHub master code.
Remaining issues because of our rsync code cleanup in ReaR
can and will be fixed via separated pull requests.
jsmeix commented at 2022-07-06 08:15:¶
In general regarding BACKUP=... versus OUTPUT=... :
The final goal should be that consistently and in general
for all combinations of internal BACKUP methods and OUTPUT methods
that are specified in etc/rear/local.conf like
OUTPUT=output_method
OUTPUT_URL=output_scheme://output/destination
BACKUP=backup_method
BACKUP_URL=backup_scheme://backup/destination
the RESULT_FILES (i.e. the ReaR recovery system files)
get sent to the output/destination
via the output_scheme protocol
and
the backup files (e.g. backup.tar.gz and backup.log)
get sent to the backup/destination
via the backup_scheme protocol.
Optionally (but not strictly required)
at most one of OUTPUT_URL or BACKUP_URL could be left out
and then the one that is specified determines both.
For example the usual way like
OUTPUT=output_method
BACKUP=backup_method
BACKUP_URL=backup_scheme://backup/destination
the RESULT_FILES (i.e. the ReaR recovery system files)
get sent to the backup/destination
via the backup_scheme protocol
and
the backup files (e.g. backup.tar.gz and backup.log)
get sent to the backup/destination
via the backup_scheme protocol.
But also the other way round should work like
OUTPUT=output_method
OUTPUT_URL=output_scheme://output/destination
BACKUP=backup_method
the RESULT_FILES (i.e. the ReaR recovery system files)
get sent to the output/destination
via the output_scheme protocol
and
the backup files (e.g. backup.tar.gz and backup.log)
get sent to the output/destination
via the output_scheme protocol.
External BACKUP methods can behave as they like
(i.e. whatever is reasonable for each particular external BACKUP
method).
pcahyna commented at 2022-07-07 09:51:¶
@jsmeix yes, I agree, the problem is that the current code has special
output URL handling for BACKUP=RSYNC
, which is inconsistent, since
output should be independent on backup. See
This pull request makes the code respect OUTPUT_URL
if it is not
identical to BACKUP_URL
, but does not remove this special handling.
Should it be removed entirely before 2.7 ? Would be good for
consistency, but it would mean more changes and more testing (and also
behavior changes).
pcahyna commented at 2022-07-07 09:58:¶
the problem is (...)
note that this is not the problem reported by @didacog, it is a separate problem (and not a regression, it has been always there).
jsmeix commented at 2022-07-07 10:07:¶
@pcahyna
I would prefer to not add new bigger changes for ReaR 2.7
so we could finally release it "as is" - even when 'rsync'
is currently not yet well cleaned up - but we also have some
more areas where current things in ReaR are not yet clean
(e.g. OUTPUT=USB and ReaR recovery system bootloader setup)
but nevertheless the current state should be sufficiently OK
to be released.
jsmeix commented at 2022-07-07 10:14:¶
By the way only as a side note FYI see my
https://github.com/rear/rear/pull/2577#issuecomment-789696867
and the subsequent comment there
which looks like an inconsistency to me.
I think there should not be two subtle different methods
in ReaR to do the backup via rsync and I still do not understand
the reason behind why there are two different implementations
in ReaR to do the backup via rsync - this all is too confusing.
pcahyna commented at 2022-07-07 10:14:¶
@jsmeix I understand, the only advantage would be to make all related behavior changes at the same time, so that users would not have to adapt their workflows/configs multiple times.
jsmeix commented at 2022-07-07 10:22:¶
Normally users would not have to adapt their workflows/configs multiple
times
because when a user has a working disaster recovery procedure with
ReaR,
then he should not update ReaR ("never change a working system").
In contrast when things do not work sufficiently with an older ReaR
version,
then the user should try out if a newer ReaR version works better.
Cf. the section "Version upgrades with Relax-and-Recover" in
https://en.opensuse.org/SDB:Disaster_Recovery
So we can move forward little step by little step
together with users who test each little step
which is the only way we can move forward
with our manpower at ReaR upstream.
pcahyna commented at 2022-07-07 10:25:¶
By the way only as a side note FYI see my #2577 (comment) and the subsequent comment there which looks like an inconsistency to me.
I think there should not be two subtle different methods in ReaR to do the backup via rsync and I still do not understand the reason behind why there are two different implementations in ReaR to do the backup via rsync - this all is too confusing.
Indeed, that has been very confusing to me as well, but it is yet a
different kind of confusion than the one here. (And I have thought that
the BACKUP=NETFS
and BACKUP_PROG=rsync
does not use rsync over ssh,
but local rsync to a mounted filesystem, which makes more sense: the
rsync invocation reads
$BACKUP_PROG --verbose "${BACKUP_RSYNC_OPTIONS[@]}" --one-file-system --delete \
--exclude-from=$TMP_DIR/backup-exclude.txt --delete-excluded \
$(cat $TMP_DIR/backup-include.txt) "$backuparchive" >&2
and $backuparchive
does not contain anything ssh-transport-related. )
jsmeix commented at 2022-07-07 10:34:¶
@pcahyna
for the fun of it:
It's a great relief to me to see that also you and @gdha seem to be
confused
what BACKUP=NETFS with BACKUP_PROG=rsync versus BACKUP=RSYNC actually
does
so I feel less lonely with my own confusion :-)
jsmeix commented at 2022-07-07 10:40:¶
@rear/contributors
I would like to merge it in a few hours
(at about 15:00 CEST) unless there are objections
didacog commented at 2022-07-07 12:52:¶
@jsmeix please see
https://github.com/rear/rear/issues/2781#issuecomment-1177558312
before merging!
jsmeix commented at 2022-07-07 13:01:¶
@didacog
thank you for debugging and for your fix!
Of course I will wait now until @pcahyna had a look.
So I will not merge it this week.
Perhaps next week if all goes well
(I am not in the office tomorrow).
jsmeix commented at 2022-07-11 11:08:¶
@pcahyna
THANK YOU SO MUCH!
I think I can feel the "URI/URL parsing hell" a bit ;-)
I was in there when I cleaned up the url_* functions :-)
(cf.
https://github.com/rear/rear/issues/856
and others)
jsmeix commented at 2022-07-11 11:11:¶
According to
https://github.com/rear/rear/issues/2781#issuecomment-1180233302
the issue
https://github.com/rear/rear/issues/2781
is fixed with the latest commit here
https://github.com/rear/rear/pull/2831/commits/e7556a13eb95f21b5fb749c089facaf8678f3cd4
so I would like to merge this pull request tomorrow afternoon
unless there are objections by @rear/contributors
pcahyna commented at 2022-07-12 18:25:¶
By the way, I tested the code after the last commit again by backing
up/recovering to/from a rsync://
BACKUP_URL (only rsync over ssh was
tested though) and several possibilities for OUTPUT_URL. All the tests
pass.
[Export of Github issue for rear/rear.]