#840 Issue closed: udev workflow cannot work as it is currently implemented

Labels: won't fix / can't fix / obsolete, minor bug

jsmeix opened issue at 2016-05-17 13:17:

currently in rear 1.18 the udev workflow
runs a longer running task (by default the mkrescue workflow)
that is triggered by plug-in of a "REAR-000" labeled USB stick
via a udev event rule in etc/udev/rules.d/62-rear-usb.rules

But udev cannot be used to launch longer running tasks
when a udev event happens.

Reason:

There is a hard 120 seconds timeout in udev
after which any process that was started by udev is killed.

The reason is that udev proceeds event by event
and as long as there is a process running that was
started by udev for an event, udev cannot proceed
with any subsequent event for that device.

Therefore udev kills any running process that was
started by udev after 120 seconds.

As a consequence the whole udev workflow in rear
cannot work as it is currently implemented.

This issue was found via https://github.com/rear/rear/issues/838
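
The effect of such a hard limit can be illustrated without udev. The following is only a sketch: it assumes the coreutils "timeout" command is available and uses it as a stand-in for udev's kill, with 1 second standing in for the 120 seconds and "sleep" standing in for a rear workflow.

```shell
# Stand-in values, not udev's: a 1 second limit and a task that needs 5 seconds.
limit_seconds=1
task_seconds=5

# timeout(1) kills the task once the limit has passed, much like udev
# kills a process that it started for an event.
status=0
timeout "$limit_seconds" sleep "$task_seconds" || status=$?

# GNU timeout exits with status 124 when it had to kill the command.
if [ "$status" -eq 124 ] ; then
    echo "task was killed after ${limit_seconds}s and never finished"
fi
```

Whatever the task had written up to that point stays behind in an incomplete state, which is the situation an interrupted workflow on a USB stick would be in.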

jsmeix commented at 2016-05-17 13:41:

Interestingly doc/rear-presentation.adoc contains

  ** Easier workflow (udev): insert, wait, pull (2 min max)

where the "2 min max" seems to indicate that the
hard 120 seconds timeout in udev was known.

But from my point of view a hard kill after 120 seconds
cannot be seriously considered to be an acceptable
constraint for running a rear workflow.

jsmeix commented at 2016-05-17 13:58:

@gdha
@schlomo
I ask for your opinion what to do here because
I don't know what to do with the udev workflow
at least for now for rear 1.19

schlomo commented at 2016-05-17 14:05:

IMHO we should change the design:

  • udev serves as a trigger
  • a systemd service waits for that trigger to run rear mkrescue or rear mkbackup as a task

That way we should get much better control over the entire process.
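
A minimal sketch of that design, not an implementation that exists in rear: the unit name "rear-udev.service" is hypothetical, /usr/sbin/rear is an assumed install path, and the files are written below a scratch directory instead of /etc. It relies on the "systemd" udev tag and the SYSTEMD_WANTS device property described in systemd.device(5).

```shell
# Sketch only: write the example files below a scratch directory, not into /etc.
sketch_root="$( mktemp -d )"
mkdir -p "$sketch_root/etc/udev/rules.d" "$sketch_root/etc/systemd/system"

# udev only tags the device and names the unit it wants;
# it no longer starts a long-running process itself.
cat > "$sketch_root/etc/udev/rules.d/62-rear-usb.rules" <<'EOF'
ACTION=="add", SUBSYSTEM=="block", ENV{ID_FS_LABEL}=="REAR-000", TAG+="systemd", ENV{SYSTEMD_WANTS}+="rear-udev.service"
EOF

# systemd runs the workflow as an ordinary service, outside the udev event handler.
# "rear-udev.service" is a hypothetical name; /usr/sbin/rear is an assumed path.
cat > "$sketch_root/etc/systemd/system/rear-udev.service" <<'EOF'
[Unit]
Description=Run a rear workflow for a newly plugged-in REAR-000 USB device

[Service]
Type=oneshot
ExecStart=/usr/sbin/rear mkrescue
EOF

echo "example files written below $sketch_root"
```

With this split the udev event finishes immediately, and the progress and result of the workflow can be followed with the usual systemd tools for that unit.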

schlomo commented at 2016-05-17 14:06:

BTW, what prevents rear mkrescue from completing in 2 minutes?

jsmeix commented at 2016-05-17 14:19:

Ha!
my colleague who is a udev expert (cf. https://github.com/rear/rear/issues/838#issuecomment-219712673 )
also mentioned to use a "systemd service"
to do longer running tasks.

But ( fortunately! ;-) I am both a noob regarding udev
and a noob regarding systemd so that (seriously)
I really have no clue how I could implement that.

Regarding
"what prevents rear mkrescue from completing in 2 minutes":

a)
Slow USB medium and big UEFI bootable system
cf. https://github.com/rear/rear/issues/810#issuecomment-205783287

A UEFI bootable ISO image is much bigger (almost 450 MiB)
than a traditional BIOS bootable ISO image which is less than
120 MiB for me.

b)
The "mkrescue" workflow is not the only possible workflow
that can be triggered via the udev workflow, e.g. via

UDEV_WORKFLOW=mkbackup

the mkbackup can be triggered via the udev workflow
and then things can really last long when the slow
USB stick is also used to store the backup.

c)
In general running a rear workflow with a time bomb
is probably not "the right thing" that can be used in a
more professional (a.k.a. "enterprise") environment.
On the other hand in such an environment probably
nobody plugs USB sticks into hundreds of servers.

schlomo commented at 2016-05-17 14:53:

I fully agree that we should not rely on a 2 minute time limit whatsoever.

However, I would still recommend to investigate why the UEFI image is 3x bigger than the BIOS image. Sounds very strange to me.

gozora commented at 2016-05-17 15:04:

I guess we have improved this size problem by implementing much more precise measurements in commit 46212bc.
But yes, EFI related images are larger because kernel and initrd need to be copied two times, due to the huge differences in how elilo and grub behave.

schlomo commented at 2016-05-17 15:09:

IMHO we should find a way to have kernel & initrd only once on the boot
media.

gozora commented at 2016-05-17 15:14:

Well, that will be kind of hard, especially with elilo, as it needs to have initrd and kernel on the same partition from which it is launched. Grub on the other hand does not have such limitations...
This of course applies only to EFI booting ...

schlomo commented at 2016-05-17 15:20:

I am not sure if I understand the full context. IMHO

  • we build a rescue image on a USB device
  • the USB device has a single partition with a simple file system like vfat
  • we put kernel and initramfs on that filesystem
  • we write elilo or grub configs that use those kernel and initrd

Or what actually happens? Maybe the "rescue boot from local system disk"
feature makes stuff more complicated?

gozora commented at 2016-05-17 15:29:

I've found a point d) as a continuation of @jsmeix's list

If you are running a server with no DNS servers configured, the following code in 99_sysreqs.sh adds to the runtime until it times out:

ip addr show | grep inet | grep -v 127.0.0. | sed -e "s/ brd.*//" -e "s/inet6//" -e "s/inet//" | while read ip ; do
  DNSname="$( dig +short -x ${ip%/*} )"
  if test -z "$DNSname" ; then
      echo "    ip ${ip%/*} subnet /${ip#*/}"
  else
      echo "    ip ${ip%/*} subnet /${ip#*/} DNS name $( dig +short -x ${ip%/*} )"
  fi
done

Btw. I've uploaded the whole log here. As you can see, Debian looks to be even more eager and terminates the script after approx. 30 s.
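
One way such a lookup could be bounded, as a sketch only and not the code in 99_sysreqs.sh: "describe_ip" is a hypothetical helper that calls dig once per address instead of twice and limits each query with dig's "+time" and "+tries" options. The 1 second limit is an arbitrary assumption.

```shell
# Hypothetical helper: print one "address/prefix" entry with at most one bounded lookup.
describe_ip() {
    local ip="$1" addr prefix dns_name=""
    addr="${ip%/*}"
    prefix="${ip#*/}"
    if command -v dig >/dev/null 2>&1 ; then
        # One attempt with a 1 second timeout per server; discard the result
        # when dig reports a failure (e.g. no server could be reached).
        dns_name="$( dig +short +time=1 +tries=1 -x "$addr" 2>/dev/null )" || dns_name=""
        # Keep only the first line and drop dig's ";;" diagnostic lines.
        dns_name="$( printf '%s\n' "$dns_name" | head -n 1 )"
        case "$dns_name" in
            ";;"*) dns_name="" ;;
        esac
    fi
    if test -z "$dns_name" ; then
        echo "    ip $addr subnet /$prefix"
    else
        echo "    ip $addr subnet /$prefix DNS name $dns_name"
    fi
}

# 192.0.2.0/24 is a documentation-only address range.
describe_ip "192.0.2.10/24"
```

This does not remove the delay entirely, it only caps it per address and per configured DNS server.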

jsmeix commented at 2016-05-17 15:30:

FYI:
Regarding how to mitigate the problem that
UEFI bootable ISO images are much bigger than
traditional BIOS bootable ISO images,
see https://github.com/rear/rear/issues/841
I have that idea in mind since https://github.com/rear/rear/issues/810#issuecomment-205783287

jsmeix commented at 2016-05-17 15:48:

Should I perhaps create a separate issue to
"investigate why UEFI image is much bigger than BIOS image" ?

gozora commented at 2016-05-17 15:57:

@jsmeix no need for investigation, I know it ;-) and I think I've already explained it.
It is all because of elilo and grub.

When you launch grub in EFI, it can conveniently browse the whole ISO content, so we can reach the kernel and initrd wherever they are located.
elilo, on the other hand, can only see the content of the partition/virtual image from which it was launched. For this reason it is desirable to have the kernel and initrd in the same location as elilo.

One more important fact is the version of EFI. Some versions are capable of seeing ONLY the vfat portion of the ISO (the boot image), but other versions can see the content of the whole ISO image...

For these reasons I've decided to rather have two copies of the same data than risk ending up with an unbootable/useless medium.

gdha commented at 2017-04-25 18:39:

@jsmeix Changed the flag from 'bug' to 'minor bug' as the issue has already been open for 1 year, and as nobody complains it cannot be a big blocking issue. When I have time I will look into it - perhaps it is a feature which can be removed?

jsmeix commented at 2017-04-26 10:34:

I close it as "won't fix" because as nobody complains
it doesn't hurt so that we can leave it as is.


[Export of Github issue for rear/rear.]