#3068 Issue closed: ISO_RECOVERY_MODE=unattended reboot loop if ISO CD-ROM is first boot device

Labels: enhancement, support / question, fixed / solved / done

GitarPlayer opened issue at 2023-11-06 12:24:

ReaR version ("/usr/sbin/rear -V"):
2.7

If your ReaR version is not the current version, explain why you can't upgrade:

OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):
RHEL8

ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):

OUTPUT=ISO
OUTPUT_URL=nfs://myserver/backups
ISO_DEFAULT=automatic
# ISO_RECOVER_MODE=unattended

Description of the issue (ideally so that others can reproduce it):

Hello,

I am attempting to create a ReaR recovery environment
to perform automated disaster recovery with ISO output and NFS share.
I want to fully automate the recovery.
If I use ISO_DEFAULT when booted from that ISO
it will automatically trigger the recovery
but at the end there is a prompt
on how one wants to proceed
(Enter rear recovery shell, reboot).
So this is not fully automatic.
If I use ISO_RECOVER_MODE=unattended it automatically reboots
but one is caught in an infinite reboot loop
if the ISO CD Rom is the first boot device.

I know I can handle that at the automation layer
with Ansible and the ILO/Vsphere API logic.
But ideally I am looking for something like
ISO_RECOVER_MODE=poweroff,
which would just shut off after recovery.
I know I could always sed the corresponding bash script
after installing the rear package so it does a poweroff
instead of a reboot.
But I would like to avoid that to keep support by Red Hat.

  1. Is there a way to configure this behaviour already?
  2. Or do I need to create a PR for this?
  3. Would it also be merged back to rear 2.6 and repackaged?

Thank you very much for your hard work.
I did around 40 rear recovery across physical,
virtualized, RHEL7 and RHEL8 and all worked like a charm.

jsmeix commented at 2023-11-06 14:09:

My first totally offhanded idea is
whether or not it could be sufficiently simple
to enhance the current hardcoded behaviour of

ISO_RECOVER_MODE="unattended"

by optionally something like

ISO_RECOVER_MODE="unattended=COMMAND"

so that optionally COMMAND could be run after successful 'rear recover'
instead of the current hardcoded 'reboot' that is currently online at
https://github.com/rear/rear/blob/master/usr/share/rear/skel/default/etc/scripts/system-setup#L200

Or alternatively (perhaps easier to implement and more versatile
as a separated variable where its value is not a kernel argument)
an additional new config variable like

ISO_RECOVER_MODE_COMMAND="COMMAND"

or

RECOVERY_REBOOT_COMMAND="COMMAND"

?

GitarPlayer commented at 2023-11-06 18:57:

ISO_RECOVER_MODE_COMMAND="COMMAND sounds like a great suggestion. With that you could even register your client to a server (for example subscription-manager or start a monitoring agent like Zabbix or Splunk). The default value could be reboot for backwards compatibility.

jsmeix commented at 2023-11-07 11:56:

@GitarPlayer
to do some additional things after "rear recover"
had recreated the system (just before "rear recover" exits)
you can use POST_RECOVERY_SCRIPT or POST_RECOVERY_COMMANDS,
see its description in usr/share/rear/conf/default.conf
for ReaR 2.7 online at
https://github.com/rear/rear/blob/rear-2.7/usr/share/rear/conf/default.conf#L3494

What I mean here is only specifically what (simple) command
should be called after "rear recover" had exited
to automatically reboot the recreated system
or alternatively do something else.

jsmeix commented at 2023-11-07 13:30:

After looking at the code in
usr/share/rear/skel/default/etc/scripts/system-setup
I decided (at least for now as far as I understand things)
to implement a simple and generic RECOVERY_REBOOT_COMMAND
in particular because ISO_RECOVER_MODE="unattended=COMMAND"
would be specific for ISO but we also have PXE_RECOVER_MODE
and I do not like to mess around with possibly complicated
kernel command line arguments with options and values like
ISO_RECOVER_MODE="unattended='COMMAND -option1 --option2=value'"
and in general I like to Keep Separated Issues Separated "KSIS"
(i.e. the recover mode versus its "reboot" command),
cf. RFC 1925 item (5)

It is always possible to aglutenate [sic]
multiple separate problems into a single
complex interdependent solution.
In most cases this is a bad idea.

@GitarPlayer
I would appreciate it if you could have a look
at my currently offhanded and untested proposal
https://github.com/rear/rear/pull/3070

Perhaps you could even test it?
You could manually change your
usr/share/rear/skel/default/etc/scripts/system-setup
as shown in
https://github.com/rear/rear/pull/3070/files
but only the RECOVERY_REBOOT_COMMAND changes
(i.e. not the SECRET_OUTPUT_DEV stuff)
and then specify in your etc/rear/local.conf

RECOVERY_REBOOT_COMMAND="poweroff"

redo a "rear mkbackup" and test if "rear recover"
does an automated 'poweroff' in 'unattended' mode.

GitarPlayer commented at 2023-11-07 21:45:

many thanks @jsmeix that worked like a charm.
I did your suggested changes and I think it would be a great addition.

Pardon my ignorance, but what about setting

POST_RECOVERY_COMMANDS+=( 'echo "powering off in 30 seconds" sleep 30 poweroff' )

Of course your proposed solution is much nicer
but this would work just fine as interim solution
until your PR is merged and released to RHEL8?
Or am I missing out something obvious?

jsmeix commented at 2023-11-08 08:46:

@GitarPlayer
thank you so much for your test!

Calling 'poweroff' in POST_RECOVERY_COMMANDS
happens while "rear recover" is still running
so 'poweroff' will terminate this running 'rear' process
which works likely OK in practice because at that state
"rear recover" had done its job (i.e. the system is recreated).

But it is cleaner to let the running 'rear' process finish
regularly on its own which does in particular

LogToSyslog "$PROGRAM $WORKFLOW finished with zero exit code"

see in current GitHub master code
https://github.com/rear/rear/blob/master/usr/sbin/rear#L759
so with 'poweroff' in POST_RECOVERY_COMMANDS
one would no longer get that syslog message
which could make an important difference
for some users in some cases.

GitarPlayer commented at 2023-11-08 14:16:

thank you very much for pointing out the exact difference. LogToSyslog could be very useful for Event Driven Ansible indeed.

pcahyna commented at 2023-11-08 23:08:

Hi @GitarPlayer

I think there are two ways to approach the problem. You can have manual intervention before the start of the recovery, or after it. In the first case, you can remove ISO_DEFAULT=automatic and select the appropriate entry in the ReaR boot menu manually when the CD boots. I believe the default entry chainloads the first hard drive without manual intervention, so this way the reboot loop is avoided.

If after the recovery, you need to change the boot order after the recovery has completed and the system has rebooted. The new RECOVERY_REBOOT_COMMAND="poweroff" will help with the reboot loop, but you still have to change the boot order next time when you power the machine back on. Since we can't change the boot order from inside the system, there is no way around the problem (except maybe putting "eject" into POST_RECOVERY_COMMANDS, but even then I suspect the BIOS will load the CD-ROM tray back before attempting to boot from it).

Note that the problem is specific to BIOS machines. On machines with smarter firmware (UEFI, Open Firmware) we restore the original boot order, which means that next time we will boot into the recovered system and not the recovery medium.

schlomo commented at 2023-11-10 10:40:

Thanks everybody for not doing KEY=VAL=CUSTOM stuff...

I guess you could manage to start a delayed poweroff via POST_RECOVERY_COMMANDS but it is not very intuitive as ReaR tries really hard to collect all subprocesses.

#3070 will provide a nice solution.

pcahyna commented at 2023-11-13 16:35:

KEY=VAL=CUSTOM would be really quite ugly and at the same time inflexible.

jsmeix commented at 2023-11-21 11:50:

With https://github.com/rear/rear/pull/3070 merged
this issue should be solved.


[Export of Github issue for rear/rear.]