#2402 PR merged: Updated generated locale filename from "rear.UTF-8" to "en_US.UTF-8" (BACKUP=BORG)

Labels: cleanup, fixed / solved / done, minor bug

gozora opened issue at 2020-05-21 10:57:

Pull Request Details:
  • Type: Bug Fix

  • Impact: Normal

  • Reference to related issue (URL): N/A

  • How was this pull request tested?
    Full backup/restore

  • Brief description of the changes in this pull request:
    During restore with BACKUP=BORG messages as follow could be observed:

=== Borg archives list ===
Location:   backup
Repository: /mnt/rear/borg/node1

[1] HAEnode1_1  Thu, 2020-05-21 10:40:15

[2] Exit

Choose archive to recover from
(timeout 300 seconds)
1
Recovering from backup archive /mnt/rear/borg/node1::HAEnode1_1 on backup
Remote: bash: warning: setlocale: LC_ALL: cannot change locale (rear.UTF-8)
...

Remote: bash: warning: setlocale: LC_ALL: cannot change locale (rear.UTF-8) message is shown when server site does not have rear.UTF-8 available.
Changing locale name to standard en_US.UTF-8 will fix this problem.

One can workaround this message by generating rear.UTF-8 locales on destination host running Borg.

# localedef -f UTF-8 -i en_US "/usr/lib/locale/rear.UTF-8"

flyinggreenfrog commented at 2020-05-25 04:51:

I could reproduce that issue. Fix in this PR looks good to me, and should be merged/accepted imho.

One other workaround would be to disable the forwarding of the LC_ALL variable in /etc/ssh/ssh_config:

# This enables sending locale enviroment variables LC_* LANG, see ssh_config(5).
    SendEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
    SendEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
    SendEnv LC_IDENTIFICATION LC_ALL

e.g. change last line to SendEnv LC_IDENTIFICATION, that is remove LC_ALL.

flyinggreenfrog commented at 2020-05-25 05:59:

Just finished a recover test with that commit, works as expected for me. No Remote: bash: warning: setlocale: LC_ALL: cannot change locale (rear.UTF-8) warning appeared.

gozora commented at 2020-05-25 06:28:

@flyinggreenfrog thaks for you review!

V.

gozora commented at 2020-05-25 06:39:

One other workaround would be to disable the forwarding of the LC_ALL variable in /etc/ssh/ssh_config:

@flyinggreenfrog did you actually try this?

V.

flyinggreenfrog commented at 2020-05-25 06:47:

One other workaround would be to disable the forwarding of the LC_ALL variable in /etc/ssh/ssh_config:

@flyinggreenfrog did you actually try this?

Yes, I tried it and it worked too.

Editing /etc/ssh/ssh_config in the rescue system with removing LC_ALL worked.

And also editing /etc/ssh/ssh_config in the original system, then running rear mkbackup (with the result LC_ALL removed also in the rescue system), worked then of course too.

jsmeix commented at 2020-05-25 07:55:

I have a general question about Borg out of curiosity:

I am not a Borg user so I don't know how Borg actually works.

I wonder why it seems the Borg behaviour depends on the locale.

I would understand that only a backup program user interface
depends on the locale of the user who runs the program
(e.g. different language for user dialog messages).

I would expect that things work when a backup program is called
to make a backup under a different locale than the locale that
another user uses when calling the program to restore the backup.

I.e. I would expect that Borg can be called under a UTF8 locale
to make a backup and later Borg can be called under POSIX locale
to restore that backup.

But it seems the locale when Borg is called to restore must match
the locale that was used when Borg was called to make a backup.
If yes, this looks like a plain wrong behaviour of Borg to me.

The borg_extract function contains a comment
https://github.com/rear/rear/blob/master/usr/share/rear/lib/borg-functions.sh#L142

# Scope of LC_ALL is only within run of `borg extract'.
# This avoids Borg problems with restoring UTF-8 encoded files names in archive
# and should not interfere with remaining stages of rear recover.

which looks scaring and plain wrong inside Borg because
file names must be in any case restored byte-by-byte
exactly same as they have been on the original system
(same as any file content that must be restored byte-by-byte
exactly same as it was on the original system).

How file names (and file contents) are restored must never ever
depend on the locale that is used while the restore program runs,
cf. the section "Non-ASCII characters in file names"
in
https://en.opensuse.org/SDB:Plain_Text_versus_Locale
that contains

For the operating system a filename
is a plain sequence of bytes
without any additional information what
characters are meant by this sequence of bytes

So any backup program must store file names as a plain sequence of bytes
and restore the files with the exact same sequence of bytes as their name.
A reason behind is that e.g. different users can use different locales
and have their own file names in their own different localizations.
After a backup restore operation all file names must appear
exactly same for all the users as they have been before.

flyinggreenfrog commented at 2020-05-25 08:07:

I have a general question about Borg out of curiosity:

I am not a Borg user so I don't know how Borg actually works.

I wonder why it seems the Borg behaviour depends on the locale.

Was that not already discussed/explained in https://github.com/borgbackup/borg/issues/1702#issuecomment-253190651?

jsmeix commented at 2020-05-25 10:40:

I know about
https://github.com/borgbackup/borg/issues/1702#issuecomment-253190651
but this did not explain anything to me.
It even confirms from my point of view that Borg seems
to mess around with things (user's sacrosanct content)
that it should not at all touch.

jsmeix commented at 2020-05-25 10:42:

I cannot review Borg related things in ReaR when the reasoning behind
is some Borg specific exceptional stuff that I do not understand.

jsmeix commented at 2020-05-25 10:45:

If Borg needs dirty hacks in ReaR to make Borg work in ReaR
it is perfectly fine and much appreciated to have such dirty hacks,
cf. "Dirty hacks welcome" in
https://github.com/rear/rear/wiki/Coding-Style

gozora commented at 2020-05-25 13:11:

@jsmeix I'm afraid that in terms of Borg I'm just a simple user, without any deeper background on how and why things work in certain way in Borg. Same as with majority of Linux programs I use, I just want blindly trust that authors knows what they are doing, because getting details on every (or even some) application I'm working with is simply surpasses my available brain and time capacity ;-) ...

V.


[Export of Github issue for rear/rear.]