#841 Issue closed: Implement rear recovery system update support (my next SUSE hackweek project)

Labels: enhancement, fixed / solved / done

jsmeix opened issue at 2016-05-17 15:27:

I like to implement a "rear update" workflow
that is initially intended for the following use-case:

In the rear recovery system one can run "rear update"
which can download and install a tar.gz archive
that gets installed into the running rear recovery system.

This way a rear recovery system can be updated
if needed during run-time without the need to
create a new recovery system ISO image
on the original system via "rear mkrescue".

The basic idea behind is to implement the same
kind of functionality as the so called "dud" provides
for SUSE installation systems, cf. "DUD" and "dud" at
https://en.opensuse.org/SDB:Linuxrc

In particular my use-case will be to update the
rear configuration files in a running rear recovery system.

This way I can use one same fixed rear recovery system
for various sufficiently similar systems (i.e. systems
where the same rear recovery system works but
the only differences are in the rear configuration files).

This should in particular mitigate the problem that
UEFI bootable ISO images are much bigger than
traditional BIOS bootable ISO images, cf.
https://github.com/rear/rear/issues/810#issuecomment-205783287

An UEFI bootable ISO image is much bigger (almost 450MiB)
than a traditional BIOS bootable ISO image which is less than
120 MiB for me.

Assume one has 100 servers with identical hardware of type A
and 200 servers with identical hardware of type B, then
(I hope) it is possible to have only one bootable ISO image
for all type A servers and one for all type B servers.

For recovery of a type A server one boots the type A
ISO image on type A replacement hardware and
runs in the rear recovery system first and foremost
something like

rear update server_ID

which downloads a tar.gz archive that matches server_ID
(e.g. from a fixed predefined HTTP or NFS server
that is stored in the currently used local.conf file)
or explicitly something like

rear update http://my_internal_server/server_ID.tar.gz

that contains the rear configuration files that belong
to server_ID.

Afterwards one runs as usual

rear recover

to recover exactly that server_ID machine.

jsmeix commented at 2016-06-22 08:47:

First things first
(or "avoid RFC 1925 item 5"):

For the SUSE Hackweek 14 I will implement only to
download rear configuration files in the recovery system.

See
https://hackweek.suse.com/14/projects/1508

Reason:

A generic recovery system update is complicated
because to be generic it must also work to update
the currently running scripts (i.e. via "rear update"
it must also work to update in particular the currently
running /usr/sbin/rear and also all scripts of the
currently running "update" workflow).

Because I have currently no use-case to update anything
in the recovery system I like to keep it simple and do
first things first and implement only my current use-case.

When I have some specific "download and install" functionality
working in the "recover" workflow, I can later step by step
enhance it as actually needed towards more and more
generic recovery system update functionality.

Outlook:

I think generic recovery system update functionality
is useful:

Example:

Assume one has 100 servers with identical hardware of type A
and matching bootable ISO image for them.

Assume the replacement hardware for type A servers
is step by step changed from identical type A hardware
to new type C hardware that is similar but a bit different
so that the existing type A bootable ISO images do no
longer work.

It would be nice if one could still have one single
recovery system ISO image that works for both
the old type A replacement hardware and also
for the new type C replacement hardware.

Then generic recovery system update functionality
could help to update the running recovery system
during "rear recover" so that it also works even
on the new type C replacement hardware.

didacog commented at 2016-06-22 09:00:

@jsmeix
Regarding updating rear config files at recovery time, with DRLM running the recover workflow:

# rear recover SERVER=X.X.X.X REST_OPTS=-k ID=drlm_cli_name

http://docs.drlm.org/en/latest/Restore.html#step-by-step-client-recover

It downloads rear configuration from DRLM server and start recovery with the latest configuration from DRLM.

Just needed to use a properly configured DRLM server to do this ;)

jsmeix commented at 2016-06-22 09:34:

@didacog
many thanks for your hint!

Thanks to free software I will shamelessly and selfishly
try to re-use your code for my own purposes ;-)

didacog commented at 2016-06-22 09:44:

@jsmeix

You can take a look at: https://github.com/rear/rear/pull/522/commits

This was the code regarding DRLM integration in rear: issue #522

Hope this helps! ;)

jsmeix commented at 2016-06-22 09:56:

Just an untested idea
how to get the recovery more automated:

In https://hackweek.suse.com/14/projects/1508 I wrote

rear recover HOSTNAME

i.e. an explicitly specified HOSTNAME is needed.

But that contradicts the basic idea of rear
that it should be possible to do the recovery
by a (relatively) unexperienced person
so that the experienced admin can relax
after he had set it up.

Therefore I think about a variable in local.conf like

RECOVERY_CONFIG_URL="http://my_internal_server/$MAC_ADDRESS.tar.gz"

or

RECOVERY_CONFIG_URL="http://my_internal_server/$HOSTNAME.tar.gz"

that evaluates in the recovery system to the mac address
or the hostname of the host where "rear recover" runs.

This way (I hope) it is possible that the admin could provide
appropriate rear configs on the HTML server in advance
so that plain "rear recover" works as intended - provided also
things like DHCP setup of the recovery system networking
were appropriately prepared in advance (this cannot work
with random mac addresses or hostnames in the recovery
system).

Next week I will see what I can get easily.

jsmeix commented at 2016-06-28 14:40:

My very first steps for download config in
https://github.com/jsmeix/rear/tree/first_steps_for_download_config_issue841
work for me on SLE12 with ext4:

On the original system with hostname 'e229'
I have in etc/rear/local.conf:

OUTPUT=ISO
BACKUP=NETFS
BACKUP_OPTIONS="nfsvers=3,nolock"
BACKUP_URL=nfs://10.160.4.244/nfs
NETFS_KEEP_OLD_BACKUP_COPY=yes
SSH_ROOT_PASSWORD="rear"
USE_DHCLIENT="yes"
KEEP_BUILD_DIR=""
RECOVERY_CONFIG_URL="http://caps.suse.de/$HOSTNAME.rear_config.tar.gz"
REQUIRED_PROGS=( "${REQUIRED_PROGS[@]}" curl )

On the original system I made a e229.rear_config.tar.gz with

tar -czf e229.rear_config.tar.gz etc/rear/ var/lib/rear/recovery var/lib/rear/layout

that I copied onto my HTTP server caps.suse.de
as /srv/www/htdocs/e229.rear_config.tar.gz

Then in the recovery system it looks as follows:

RESCUE e229:~ # rear -d -D recover
Relax-and-Recover 1.18 / Git
Using log file: /var/log/rear/rear-e229.log
/ ~
etc/rear/
etc/rear/local.conf
var/lib/rear/recovery/
var/lib/rear/recovery/initrd_modules
var/lib/rear/recovery/bootloader
var/lib/rear/recovery/bootdisk
var/lib/rear/recovery/diskbyid_mappings
var/lib/rear/recovery/mountpoint_permissions
var/lib/rear/recovery/storage_drivers
var/lib/rear/recovery/if_inet6
var/lib/rear/recovery/mountpoint_device
var/lib/rear/layout/
var/lib/rear/layout/config/
var/lib/rear/layout/config/files.md5sum
var/lib/rear/layout/config/df.txt
var/lib/rear/layout/disktodo.conf
var/lib/rear/layout/lvm/
var/lib/rear/layout/diskdeps.conf
var/lib/rear/layout/disklayout.conf
~
Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if available.
Started RPC portmapper 'rpcbind'.
RPC portmapper 'rpcbind' available.
Started rpc.statd.
RPC status rpc.statd available.
NOTICE: Will do driver migration
Calculating backup archive size
Backup archive size is 787M     /tmp/rear.44l0OFOzkyMeTJR/outputfs/e229/backup.tar.gz (compressed)
Comparing disks.
Disk configuration is identical, proceeding with restore.
Start system layout restoration.
Creating partitions for disk /dev/sda (msdos)
Creating filesystem of type ext4 with mount point / on /dev/sda2.
/dev/sda2: 2 bytes were erased at offset 0x00000438 (ext4): 53 ef
Mounting filesystem /
Creating swap on /dev/sda1
Disk layout created.
Decrypting disabled
Restoring from '/tmp/rear.44l0OFOzkyMeTJR/outputfs/e229/backup.tar.gz'
Restored 2124 MiB [avg 77698 KiB/sec]OK
Restored 2124 MiB in 29 seconds [avg 75018 KiB/sec]
Restore the Mountpoints (with permissions) from /var/lib/rear/recovery/mountpoint_permissions
Running mkinitrd...
Recreated initramfs (mkinitrd).
Installing GRUB2 boot loader
Finished recovering your system. You can explore it under '/mnt/local'.

and /var/log/rear/rear-e229.log contains

2016-06-28 14:29:09.444520730 Entering debugscripts mode via 'set -x'.
+ source /usr/share/rear/init/default/03_update_recovery_system.sh
++ test recover '!=' recover
++ test http://caps.suse.de/e229.rear_config.tar.gz
++ local update_archive_filename=rear-update.tar.gz
+++ curl --verbose -f -s -S -w '%{http_code}' -o /rear-update.tar.gz http://caps.suse.de/e229.rear_config.tar.gz
* Hostname was NOT found in DNS cache
*   Trying 10.160.4.244...
* Connected to caps.suse.de (10.160.4.244) port 80 (#0)
...
* Connection #0 to host caps.suse.de left intact
++ local http_response_code=200
++ test 200 = 200
++ pushd /
++ tar --verbose -xf rear-update.tar.gz
++ popd

jsmeix commented at 2016-09-28 13:20:

As far as I noticed in https://github.com/rear/rear/issues/943
it seems my very first implementation works sufficiently
well - at least for now - so that I close this issue as "fixed".
For bugs or further enhancements please open new issues.


[Export of Github issue for rear/rear.]