#2232 PR closed: Add possibility to restore filesystems in parallel with TSM.

Labels: enhancement, external tool, no-pr-activity

cookie33 opened issue at 2019-09-11 10:16:

Relax-and-Recover (ReaR) Pull Request Template

Pull Request Details:
  • Type: Enhancement

  • Impact: Normal

  • Reference to related issue (URL):

  • How was this pull request tested?

A restore of a SLES12SP3 system was done with this version of the TSM restore script with the parallel mode set to true.

  • Brief description of the changes in this pull request:
  • Add a new parameter to differentiate between the normal (old) serial behaviour and a new parallel mode. The default is the serial behaviour.
  • Make the TSM restore of a filesystem a function and call it either serially or in parallel.

rear-asm4.log
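The serial-versus-parallel pattern described above can be sketched roughly as follows. The variable and function names (TSM_RESTORE_PARALLEL, restore_filesystem) are illustrative, not necessarily those used in the actual PR, and echo stands in for the real dsmc restore call:

```shell
# Illustrative sketch only: names are assumptions, echo replaces dsmc.
TSM_RESTORE_PARALLEL="no"   # default: old serial behaviour

restore_filesystem() {
    local fs="$1"
    # In the real script this would run the dsmc restore for "$fs"
    echo "restoring $fs"
}

filesystems="/ /boot /var /opt"

if [ "$TSM_RESTORE_PARALLEL" = "yes" ] ; then
    for fs in $filesystems ; do
        restore_filesystem "$fs" &   # start each restore in the background
    done
    wait   # block until all background restores have finished
else
    for fs in $filesystems ; do
        restore_filesystem "$fs"     # one restore after the other
    done
fi
```

The `wait` after the parallel loop is what keeps the rest of the recovery from proceeding before all restores are done.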

cookie33 commented at 2019-09-11 10:24:

Extra actions done before the restore worked on sles12sp3:

  • copy /usr/lib64/libsnapper.so* to rescue image after boot from it
  • copy /usr/lib64/libboost.so* to rescue image after boot from it
  • copy /usr/lib64/libbtrfs.so* to rescue image after boot from it
  • copy /usr/lib64/libstdc++* to rescue image after boot from it
  • set LD_LIBRARY_PATH before rear recover to:
/usr/lib/usr/lib64:/opt/tivoli/tsm/client/ba/bin:/opt/tivoli/tsm/client/api/bin64:/opt/tivoli/tsm/client/api/bin:/opt/tivoli/tsm/client/api/bin64/cit/bin
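The manual steps above could be scripted roughly as below. The helper name copy_tsm_libs and the idea of copying from a mounted copy of the original system are assumptions for illustration; the comment does not say where the libraries were copied from:

```shell
# Hypothetical helper: copy the listed libraries into the rescue system.
# Source/destination directories are parameters because the comment does
# not specify where the libraries came from.
copy_tsm_libs() {
    local src="$1" dest="$2" lib
    for lib in libsnapper.so libboost.so libbtrfs.so libstdc++ ; do
        cp -a "$src"/${lib}* "$dest"/ 2>/dev/null || echo "no ${lib}* found in $src"
    done
}

# Then, before starting the recovery (paths as listed in the comment):
#   copy_tsm_libs /path/to/original/usr/lib64 /usr/lib64
#   export LD_LIBRARY_PATH=/usr/lib/usr/lib64:/opt/tivoli/tsm/client/ba/bin:/opt/tivoli/tsm/client/api/bin64:/opt/tivoli/tsm/client/api/bin:/opt/tivoli/tsm/client/api/bin64/cit/bin
#   rear recover
```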

gdha commented at 2019-09-13 06:59:

Extra actions done before the restore worked on sles12sp3:
* copy /usr/lib64/libsnapper.so* to rescue image after boot from it
* copy /usr/lib64/libboost.so* to rescue image after boot from it
* copy /usr/lib64/libbtrfs.so* to rescue image after boot from it
* copy /usr/lib64/libstdc++* to rescue image after boot from it
* set LD_LIBRARY_PATH before rear recover to:
/usr/lib/usr/lib64:/opt/tivoli/tsm/client/ba/bin:/opt/tivoli/tsm/client/api/bin64:/opt/tivoli/tsm/client/api/bin:/opt/tivoli/tsm/client/api/bin64/cit/bin

A few things as remark and/or comments:

  • these libraries were automatically copied to the rescue image, right?
  • could you verify script /usr/share/rear/prep/TSM/default/400_prep_tsm.sh as it defines TSM_LD_LIBRARY_PATH=$TSM_LD_LIBRARY_PATH:$gsk_dir
  • perhaps you could write this variable to the $ROOTFS_DIR//etc/rear/rescue.conf file in above mentioned prep script:
echo "TSM_LD_LIBRARY_PATH=\"$TSM_LD_LIBRARY_PATH:$gsk_dir\"" >> $ROOTFS_DIR//etc/rear/rescue.conf
  • give it a try, and if it works, add it to the PR, as we are not able to test the PR due to lack of HW
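The suggestion above amounts to persisting the extended library path into the rescue system's configuration at prep time, so it is already set when the rescue image boots. A minimal, self-contained sketch (ROOTFS_DIR, TSM_LD_LIBRARY_PATH and gsk_dir are ReaR/script variables; the values below are dummies used only to make the sketch runnable):

```shell
# Dummy stand-ins for the variables available in 400_prep_tsm.sh:
ROOTFS_DIR=$(mktemp -d)
mkdir -p "$ROOTFS_DIR/etc/rear"
TSM_LD_LIBRARY_PATH="/opt/tivoli/tsm/client/ba/bin"
gsk_dir="/usr/local/ibm/gsk8_64/lib64"   # assumed example value

# The suggested line: expand the current values at prep time and write
# the result into the rescue system's rescue.conf.
echo "TSM_LD_LIBRARY_PATH=\"$TSM_LD_LIBRARY_PATH:$gsk_dir\"" >> "$ROOTFS_DIR/etc/rear/rescue.conf"
```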

jsmeix commented at 2019-09-13 12:21:

@schabrolles
could you please review this one, because neither I nor @gdha
has TSM, so we cannot actually review it.

jsmeix commented at 2019-09-13 12:40:

@cookie33
I do not have TSM but out of curiosity
I wonder how the messages look on the terminal
while several dsmc restore processes are running in parallel.
Does that look somewhat confusing or perhaps even messed up?

jsmeix commented at 2019-09-13 12:44:

I think each dsmc restore process needs its own
separate backup_restore_log_file, because otherwise
the error handling no longer works correctly.
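The per-process log idea could look roughly like this: each parallel restore writes to its own log file, and the exit status of each background job is collected individually. Names are illustrative and echo stands in for the real dsmc call:

```shell
# Sketch: one log file and one tracked exit status per parallel restore.
logdir=$(mktemp -d)
pids=""
for fs in boot var opt ; do
    # Real code would run dsmc restore here; echo is a stand-in.
    ( echo "restore of /$fs" ) > "$logdir/restore.$fs.log" 2>&1 &
    pids="$pids $!"
done

failed=0
for pid in $pids ; do
    wait "$pid" || failed=1   # check each restore's exit status separately
done
echo "failed=$failed"
```

With a shared log file, output from concurrent dsmc processes would interleave and per-restore error checking (e.g. grepping the log) could no longer attribute errors to a specific restore.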

schabrolles commented at 2019-09-13 15:20:

@jsmeix, I’m currently on vacation until the end of the month. I will try to do my best in October, but I will be busy with on-site client requests.

jsmeix commented at 2019-09-14 10:49:

@schabrolles
take your time (this is an enhancement for "ReaR future")
and thank you in advance!

I am also out of the office currently, and will be for some more weeks,
so I cannot do much for ReaR.
In particular I cannot try out or test anything for ReaR.
I expect to be back in the office at about the beginning of October.
I also expect that I will first and foremost have to do other stuff with higher priority.

jsmeix commented at 2019-09-14 11:26:

@cookie33 @schabrolles
I am wondering about another possible generic issue with parallel restores.

In the section "Running Multiple Backups and Restores in Parallel" in
https://github.com/rear/rear/blob/master/doc/user-guide/11-multiple-backups.adoc
I wrote in particular (excerpt a bit modified here)

system recovery with multiple backups requires that
first and foremost the basic system is recovered
where all files must be restored that are needed
to ... [get] ... the basic system into a normal usable state

One reason is that in particular the tree of directories
of the basic system must have been restored as a precondition
that subsequent backup restore operations can succeed.

The concern is that subsequent backup restore operations may fail
or restore incorrectly when basic system directories are not yet there.

For example, assume the files in /opt/mystuff/ are in a separate backup.
When the files of the basic system (in this example the /opt/ directory)
are restored in parallel with the separate backup of /opt/mystuff/,
it may happen that the files in /opt/mystuff/ are restored before
the /opt/ directory was restored.

The concern is that it is not clear what the final result is in that case.

Perhaps it fails to restore the files in /opt/mystuff/ when /opt/ is not yet there?

Perhaps it does not fail to restore the files in /opt/mystuff/ when /opt/ is not yet there,
but instead creates the missing /opt/ directory with fallback owner/group/permissions/ACLs/...
that may differ from what /opt/ had on the original system?

So the concern with multiple backup restores in parallel is
how to ensure that the final overall backup restore result
always matches exactly what there was on the original system.
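One way to address this ordering concern, sketched below under the assumption that one backup is known to contain the basic system: restore that backup first and alone, and only start the remaining restores in parallel once the base directory tree exists. restore_backup is a hypothetical stand-in for the real restore call:

```shell
# Hypothetical sketch: serialize the base-system restore, parallelize the rest.
restore_backup() {
    # Real code would restore the named backup; echo is a stand-in.
    echo "restored $1"
}

base_backup="base"
other_backups="opt_mystuff home_data"

restore_backup "$base_backup"   # serial: base directory tree now exists

for b in $other_backups ; do
    restore_backup "$b" &       # safe to run in parallel now
done
wait
```

This preserves the precondition quoted from the user guide (basic system first) while still gaining parallelism for the remaining backups.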

gdha commented at 2020-02-21 12:36:

@schabrolles Could you please review this PR for a moment and give @cookie33 the feedback?

github-actions commented at 2020-06-27 01:33:

Stale pull request message


[Export of Github issue for rear/rear.]