#3240 PR merged: Improve Bareos integration

Labels: enhancement, external tool

joergsteffens opened issue at 2024-06-07 21:10:

Relax-and-Recover (ReaR) Pull Request Template

Please fill in the following items before submitting a new pull request:

Pull Request Details:
  • Type: Enhancement

  • Impact: Normal

  • Reference to related issue (URL):

  • How was this pull request tested?

    • manually with Debian 11/12 and Bareos 21 and 23.
  • Description of the changes in this pull request:

    • mostly modernizing the Bareos integration, making it more robust.
    • this should only affect the Bareos bconsole integration, not the Bareos bextract integration. I've not tested the bextract integration, but fixed an obvious typo in it, therefore I doubt that it has been working before.
    • "Automatic Recovery" is working with these changes.
  • Questions

    • The restore in the original code did start rear_shell at the end of the restore. I also added an optional execution of rear_shell to 410_restore_backup_bconsole.sh. I could not figure out (by comparing to other integrations) whether this is a good approach or not. A hint about the best practice to finish a restore would be welcome.
    • Is there a function to determine how much data have been written to the disks? I mean something like df, but it should sum up all mounts below /mnt/local
  • Remark:

    • as already discussed with @gdha, we are receiving reports of Bareos/ReaR users getting hit by #2901. While this problem has long been solved in master, the fix is not in the latest release rear-2.7. A new release would be very welcome.
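
Regarding the question above about summing up the data on all mounts below /mnt/local: a minimal sketch, assuming GNU coreutils df (for --output and --block-size) is available. The function name sum_used_below is hypothetical and not part of ReaR. Note that df reports the used space of each filesystem, which includes filesystem overhead, so the result only approximates the amount of restored data.

```shell
# Sum the "used" column for every mount point at or below a given root.
# Reads lines of "<used-bytes> <mount-point>" on stdin, as produced by
# GNU df with --output=used,target after removing the header line.
# sum_used_below is a hypothetical helper, not an existing ReaR function.
sum_used_below() {
    local root="${1%/}" used target total=0
    while read -r used target; do
        case "$target" in
            # Match the root itself and anything mounted below it,
            # but not siblings such as /mnt/localfoo.
            "$root"|"$root"/*) total=$(( total + used )) ;;
        esac
    done
    echo "$total"
}

# Usage (assumes GNU df):
#   df --block-size=1 --output=used,target | tail -n +2 | sum_used_below /mnt/local
```

Reading the df output from stdin keeps the summing logic independent of the system it runs on.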

gdha commented at 2024-06-10 09:39:

@joergsteffens Reviewed the code updates (quite a bit, I must say). The updates seem a serious improvement over the current Bareos code base and are most welcome.
If you do not care about distro's without systemd then I would merge your code updates as is if there are no comments from my fellow @rear/contributors (by the end of this week)?

joergsteffens commented at 2024-06-10 12:48:

@joergsteffens Are you sure that Bareos will start properly on older distro's? (without systemd)

Good point. No, it will not, but the oldest release Bareos supports is bareos-21, and the oldest distros bareos-21 supports are CentOS_7, Debian_10, Fedora_34, RHEL_7, SLE_12_SP5, Univention_4.4, openSUSE_Leap_15.2, xUbuntu_18.04.
All of them support systemd. So from the Bareos point of view this is fine.

But it might be a good idea to add a check for systemd in prep. Is there a function or variable I can query to determine if the target rescue system will contain systemd?

gdha commented at 2024-06-10 13:20:

@joergsteffens If the system where you initiate rear mkrescue on contains systemd then the ReaR ISO image will also contain systemd. ReaR v2.6 and higher have systemd services on-board by default.

jsmeix commented at 2024-06-10 14:11:

In general regarding things like:

Will Bareos start properly on older distro's without systemd?

and

might be a good idea to add a check for systemd in prep

I would recommend against such environmental checks.

Instead I recommend in general
to first and foremost only check "the real thing"
and usually leave environmental things to the user.

In this case here it would mean:

First and foremost only check that
Bareos is working properly as actually needed
and if not clearly tell the user about that
without additional checks what environmental thing
might be the cause that Bareos is not working properly.

Ideally show an original Bareos error message to the user
and leave it to the user to make sense of it what the
actual root cause is in his particular environment.

When this is properly implemented, then as an optional
separated additional step there could be environmental
checks to provide hints to the user what the actual
root cause could be in his particular environment.

In particular in this case here:
The systemd environment in the ReaR recovery system
is rather different (in particular rather minimal)
compared to the systemd environment on the original system.
So a simple check whether or not systemd is running
basically tells nothing whether or not some specific
systemd service will actually work within the
ReaR recovery system environment.
So you may need rather sophisticated checks to really
find out whether or not the root cause is in systemd
when Bareos is not working properly as actually needed.
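
The recommendation above (check "the real thing" and show the original error message) could be sketched as follows. This is only an illustration under assumptions: check_real_thing is a hypothetical helper, and in real ReaR scripts the failure would more likely be reported through ReaR's own Error function than through echo.

```shell
# Run the command that is actually needed and, on failure, pass its original
# output on to the user instead of guessing at environmental causes.
# check_real_thing is a hypothetical name used only for this sketch.
check_real_thing() {
    local output
    # Declared separately from the assignment so the exit status of the
    # command substitution is not masked by "local".
    if ! output=$("$@" 2>&1); then
        echo "ERROR: '$*' failed:" >&2
        echo "$output" >&2
        return 1
    fi
}

# Usage: pass whatever command proves that the backup tool works as needed,
# for example a command that talks to the backup server.
#   check_real_thing some_command_that_must_work || exit 1
```

Optional environmental hints (such as whether systemd is running) could then be added as a separate step after the real check has failed.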

joergsteffens commented at 2024-06-13 15:58:

I think, I covered most/all of the feedback. However the changes are pretty much untested. I'll test them soon, hopefully tomorrow.

jsmeix commented at 2024-06-17 08:24:

Only a tiny generic text style side note:

Usually one should not use "please" in user messages
because computers (i.e. machines) should not beg
the user (i.e. humans) to do something.
When machines talk in some kind of personal tone
to the user it could be perceived as intrusive.

Better simple and straightforward wording,
for example like:

Error "Could not determine this system as Bareos client (no BAREOS_CLIENT specified)"

Here I also omitted to tell the user which config file to use
because the user could not only use local.conf but also
site.conf or via '-C' whatever other config file.

Same for things like

Error "Could not determine which restore job to use. Use BAREOS_RESTORE_JOB to specify it."

Error "Could not determine which restore fileset to use. Use BAREOS_FILESET to configure it."

LogPrint "
Verify that the backup has been restored correctly to '$TARGET_FS_ROOT'
...

schlomo commented at 2024-06-18 04:51:

@joergsteffens thanks a lot for the effort you put into this! let me know when I should take the next look

gdha commented at 2024-06-19 08:52:

@joergsteffens Please make bconsole the default setting in https://github.com/joergsteffens/rear/blob/bareos-improvements/usr/share/rear/conf/default.conf#L2806 (of variable BAREOS_RESTORE_MODE). Schlomo told me that you might drop the bextract restore (as nobody uses it and it has therefore never been tested properly). The ReaR Automated Testing framework also used only the bconsole way of working.
That would also reduce the complexity around the REQUIRED_PROGS_BAREOS section as we could get rid of the two extra variables REQUIRED_PROGS_BAREOS_BEXTRACT and REQUIRED_PROGS_BAREOS_BCONSOLE.

Furthermore, which versions of Bareos are supported with this update? Are these all tested? We could list them up in the Test Matrix of 3.0.

schlomo commented at 2024-06-19 08:54:

Good thoughts, can we state (and maybe check for) a minimum Bareos version that we want to support? And use that to get rid of some tech debt?

joergsteffens commented at 2024-06-19 10:11:

Bareos supports the current and the last 2 major releases, so currently bareos-21, bareos-22 and bareos-23. I've tested Debian 11 with bareos-21 and Debian 12 with bareos-23.

I'll remove the bextract parts with this PR and also do some other improvements soon, in the next days.

schlomo commented at 2024-06-19 10:15:

Thank you @joergsteffens for taking the time to make ReaR and Bareos a really great team!

joergsteffens commented at 2024-06-19 10:17:

One question: is there a better (easy) way to run a program as another user? It looks like for su (and sudo) I would probably also need to include PAM modules and configuration.
Currently I plan to add

chmod u+s,g+s ${ROOTFS_DIR}/bin/bconsole

during build.

The reason is that in a future version bconsole might limit commands when running as root.

schlomo commented at 2024-06-19 10:22:

@joergsteffens I'd suggest using chroot which actually includes the ability to change the user and should be already part of the ReaR rescue system.

And I just noticed that we already copy all users and groups by default, according to https://github.com/rear/rear/blob/c2b9a4fe1fcf634c7312ee4def5a6dcf248fccba/usr/share/rear/conf/default.conf#L1976-L1980
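
The chroot suggestion can be sketched like this, assuming GNU coreutils chroot (which provides --userspec) and a caller running as root. The helper name run_as_user and the user name bareos in the usage line are assumptions for illustration only.

```shell
# Run a command as another user by "changing root" to / with a user spec.
# No PAM stack is involved; the user only has to be resolvable,
# for example through /etc/passwd. Must be called as root.
# run_as_user is a hypothetical helper, not an existing ReaR function.
run_as_user() {
    local user="$1"
    shift
    chroot --userspec="$user" / "$@"
}

# Usage (user name "bareos" is an assumption):
#   run_as_user bareos bconsole
```

Compared to a setuid/setgid bit on the binary, this keeps file permissions in the rescue system unchanged.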

joergsteffens commented at 2024-06-24 10:31:

So, with my latest commits, I hope I've now covered all remarks. Feel free to review it again.

@schlomo thank you for the hint about using chroot. This did work well, but in the end it made the code more complicated, so I decided to use another approach, without running as another user.

joergsteffens commented at 2024-06-24 17:18:

I think, I covered all suggestions.
Having said this, the github action build currently fails in my fork. Error message:

 ********** Copying dist to dist-all/opensuse-leap-15
********** registry.suse.com/suse/sle15             **********
ERROR: Cannot find required programs: ps
Some messages from /var/tmp/rear.hgyQWImzgw7JXIO/tmp/rear.dump.stdout_stderr since the last called script 950_check_missing_programs.sh:
  /rear/usr/share/rear/lib/_input-output-functions.sh: line 581: type: ps: not found
  /rear/usr/share/rear/lib/_input-output-functions.sh: line 581: type: ps: not found

see https://github.com/joergsteffens/rear/actions/runs/9648693605

I don't see how this is related to my changes.
I've seen that there has recently been some github actions cleanup (removing some SUSE distros). Outside of this PR I checked it after rebasing on master. However, a similar error did appear (this time on SLES15.6).

I've not rebased this PR so far, to avoid making the review more complicated.
Just tell me, if I can do a rebase and if you think the error is related to my changes.

joergsteffens commented at 2024-06-24 17:31:

Feel free to merge it!

schlomo commented at 2024-06-24 19:45:

Thank you very - very!!! - much @joergsteffens for this truly great contribution. I took the liberty of aggregating the commit message to cover what IMHO are the relevant changes here.

gdha commented at 2024-06-25 06:07:

@joergsteffens @schlomo I will test this via my ReaR Automated Testing platform to be sure it works as designed, as it is a massive change. Thanks for the hard work @joergsteffens and thank you @schlomo for the excellent and thorough review work.

schlomo commented at 2024-06-25 06:48:

I fixed the SLE15 build: https://github.com/rear/rear/actions/runs/9657102631

gdha commented at 2024-06-25 09:13:

@joergsteffens I performed the first test with an unsupported BAREOS version 19 - IMHO the error output should mention that we are using an unsupported version for the restore job - see https://gist.github.com/gdha/76e2275f04e89b17ae3c3cea155e201a for the debug output, local.conf used and bareos backup script.

schlomo commented at 2024-06-25 09:18:

@gdha nice test! Maybe on some occasion you can also add the automatic Bareos configuration which should pick up the client, restore job and fileset automatically if there is only one for each.
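
The "pick it up automatically if there is only one" rule can be illustrated with a small sketch. The helper pick_single is hypothetical; how the actual Bareos integration obtains and selects the candidates is not shown in this thread.

```shell
# Print the only candidate if exactly one is given; otherwise fail so that
# the caller can require an explicit setting from the user.
# pick_single is a hypothetical name used only for this sketch.
pick_single() {
    [ "$#" -eq 1 ] || return 1
    echo "$1"
}

# Usage (candidate names are made up):
#   BAREOS_CLIENT=$(pick_single client1-fd) || echo "BAREOS_CLIENT must be set" >&2
```

With zero or several candidates the function returns non-zero, which is the case where variables such as BAREOS_CLIENT, BAREOS_RESTORE_JOB or BAREOS_FILESET would have to be set explicitly.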

joergsteffens commented at 2024-06-25 11:53:

> @joergsteffens I performed the first test with an unsupported BAREOS version 19 - IMHO the error output should mention that we are using an unsupported version for the restore job - see https://gist.github.com/gdha/76e2275f04e89b17ae3c3cea155e201a for the debug output, local.conf used and bareos backup script.

I see. I use functionality that has been introduced with bareos-20, see https://github.com/bareos/bareos/pull/574
Relevant is the version of the Bareos Director, not the local File Daemon.

I see 3 options:

  • do nothing, as bareos-19 is not supported by bareos.com anymore
  • add a Director version check and exit with a proper error if the Director is too old (for that, could you (@gdha) please issue the commands .api json and then version on the Director and send me the results? Just to make sure that the version command supports the json output mode.)
  • implement a workaround. This is harder than I first thought. However, the code to skip the restore_job tests if BAREOS_RESTORE_JOB has been set manually and the Director version is less than 20 (or get_available_restore_job_names fails) is easy. Getting all job names and checking against them is also easy.

schlomo commented at 2024-06-25 12:32:

I'm in favor of actively checking for the minimum supported Bareos version and erroring out during the prep stage if ReaR won't work properly.

The general philosophy of ReaR is that if you could create a rescue media then there is a very high probability that you can use it for a successful recovery. Therefore we need to error out in prep for as many known problems as possible.

gdha commented at 2024-06-25 14:04:

@joergsteffens I tested it as follows:

$ cat bareos_test.sh
#
# Bareos version extracting with json api
#
DISPENSABLE_OUTPUT_DEV="null"

source ./bareos-functions.sh
source ./array-functions.sh
source ./_input-output-functions.sh

BAREOS_CLIENT=client-fd

echo "-------------------------------------------"
bcommand_json "version" | jq -c '.result[].version'
echo "-------------------------------------------"

and the output is:

-------------------------------------------
"19.2.7"
-------------------------------------------

I think we could add a short prep script to test the Bareos version and bail out if it is lower than 21?

schlomo commented at 2024-06-25 14:09:

great use case for rear shell :-)

joergsteffens commented at 2024-06-25 15:04:

Makes sense. On pre-releases the version number has a text suffix, like: 24.0.0~pre668.a159a82bd. So it is best to only check the first number before the dot.
I'll do so in a separate PR.
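
A minimal sketch of such a check on the first number before the dot. The function names are hypothetical and this is not the code of the follow-up PR. It assumes the version string is passed without surrounding quotes (for example obtained with jq -r instead of jq -c) and that the minimum major version is supplied by the caller.

```shell
# Extract the major version: everything before the first dot.
# Works for "19.2.7" as well as for "24.0.0~pre668.a159a82bd".
bareos_major_version() {
    echo "${1%%.*}"
}

# Succeed if the version string has at least the required major version.
# Returns 2 if the major part is not a plain number.
# Both function names are hypothetical, used only for this sketch.
bareos_version_at_least() {
    local major
    major=$(bareos_major_version "$1")
    case "$major" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$major" -ge "$2" ]
}

# Usage (minimum version 21 as proposed above):
#   bareos_version_at_least "19.2.7" 21 || echo "Bareos Director too old" >&2
```

Comparing only the major number avoids having to parse pre-release suffixes at all.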


[Export of Github issue for rear/rear.]