#1208 Issue closed: ebiso segmentation fault when executed on ReaR restored XFS filesystem

Labels: enhancement, bug, documentation, fixed / solved / done

dilli1 opened issue at 2017-02-24 07:19:

Relax-and-Recover (rear) Issue Template

Please fill in the following items before submitting a new issue (quick response is not guaranteed with free support):

  • rear version (/usr/sbin/rear -V): 2.00
  • OS version (cat /etc/rear/os.conf or lsb_release -a):
  • rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):
  • Are you using legacy BIOS or UEFI boot? both
  • Brief description of the issue:

Hello Everybody,

we made our first footsteps with REAR (Relax-and-Recover 2.00). When generating a bootable ISO the ebiso generates a "Segmentation Fault" error message.

Even after upgrading rear and ebiso the newest releases the problem still occurs.:
ebiso 0.2.3
rear 2.00

2017-02-22 14:04:36 Including output/ISO/Linux-i386/810_prepare_multiple_iso.sh
2017-02-22 14:04:37 Including output/ISO/Linux-i386/820_create_iso_image.sh
2017-02-22 14:04:37 Starting '/usr/bin/ebiso'
2017-02-22 14:04:37 Making ISO image
2017-02-22 14:04:37 Including ISO UEFI boot (as triggered by USING_UEFI_BOOTLOADER=1)
/usr/share/rear/output/ISO/Linux-i386/820_create_iso_image.sh: line 27: 9326 Segmentation fault $ISO_MKISOFS_BIN -R -o $ISO_DIR/$ISO_PREFIX.iso -e boot/efiboot.img .
2017-02-22 14:04:37 ERROR: Could not create ISO image (with /usr/bin/ebiso)
==== Stack trace ====

I've even tried to create the iso manually with ebiso... same error:

/tmp/bootdirfiles/backup # ebiso -R -o /tmp/bootable.iso -e backup/boot.img /tmp/bootdirfiles
Segmentation fault

My SLES release is 12 SP1!

gozora commented at 2017-02-24 08:22:

Hi,

  1. I guess you should rather use https://github.com/gozora/ebiso/issues. (assuming this is fault of ebiso)
  2. can you send output of uname -a from your system
  3. can you paste log from rear -d -D mkrescue
  4. can you send me core file created by ebiso crash

Thanks

V.

gozora commented at 2017-02-24 08:25:

Aaaand a bit more: :-)

  1. how did you install ebiso (sources, RPM, ...)?
  2. where did you get ebiso from? (SUSE, GIT, gozora.sk, ...)

V.

gozora commented at 2017-02-24 08:49:

@dilli1
Quote from you email:

It is strictly forbidden to copy this e-mail or disclose it to third parties

I had to delete your reply sorry for that!

Please either use email without such notice or use standard github interface for reply.

Thanks for understanding

V.

gozora commented at 2017-02-24 09:05:

@dilli1
To create core file following should work:

ulimit -c unlimited && rear mkrescue

gozora commented at 2017-02-24 09:35:

@dilli1,

You have statement in your mail footer:

NOTICE: This e-mail is intended for the addressee only and may contain business secrets or other confidential information. If you have received this message in error please notify the sender immediately and destroy this e-mail. It is strictly forbidden to copy this e-mail or disclose it to third parties. Thank you.

This is public forum, so disclosing this mail content to third parties is quite certain!

So please, if you want to continue in communication here on github, either remove this notice from your email footer or use github.com forms.

Thanks

V.

gozora commented at 2017-02-27 19:07:

Just a short update on this topic.
This issue happens only after rear recover of system and if TMPDIR runs on XFS.

I need to investigate 2 things:

  1. What is the difference between original XFS setup and what ReaR changed during rear recover. I assume that there was some change as @dilli1 confirmed that ebiso worked fine before restore
  2. In spite of XFS parameter change, ebiso should not misbehave, so I need to do a bit of debugging on XFS

V.

gozora commented at 2017-02-27 19:10:

Workaround for this issue can be found here: https://github.com/gozora/ebiso/issues/9#issuecomment-282814996

gozora commented at 2017-02-27 21:10:

I've probably find out where the problem is.
Full story here

From my point of view ebiso is the main trouble maker, but ReaR behavior is not correct as well.

I did not check ReaR code deeper, but I suspect that it does not call xfs_info to store current filesystem attributes during rear mkbackup/mkrescue. So if user have filesystem created with some non-default options (crc, attr, ftype ...) , ReaR just ignores this and creates filesystem with default options.

Next days I'll be writing patch for ebiso and then I can have a look on how could be XFS options correctly recovered.

V.

jsmeix commented at 2017-02-28 09:13:

@gozora
awesome!
I can hardly express in words how impressed I am
about your outstanding debugging.
As always your contributions to ReaR are very very valuable.

Regarding the general underlying problem in ReaR:
In general ReaR may recreate anything (in particular filesystems)
with any kind of possibly subtle but severe differences, cf.
"Deployment via recovery installation" and
"The limitation is what the special ReaR recovery system can do" in
https://en.opensuse.org/SDB:Disaster_Recovery

@dilli1
see also
"Let's face it: Deployment via the recovery installer is a must" in
https://en.opensuse.org/SDB:Disaster_Recovery

@gdha
I think it should be better documented in ReaR
that at least currently

"rear recover" may recreate the system
with possibly subtle but severe differences

so that the users are better informed in advance
about current general limitations in ReaR.

As side note only FYI:
The "Deployment via the recovery installer is a must"
is another main underlying reason for my attempts
to enhance ReaR so that one can also do
the initial system installation with ReaR,
cf. "ReaR system deployment" in
https://github.com/rear/rear/issues/1085#issue-191661559

jsmeix commented at 2017-02-28 09:45:

@gozora
in general regarding how to recreate a filesystem
exactly as it was before:

I think some time ago a colleague told me that it is
in general not possible (or not possible with reasonable effort)
to autodetect all settings for a filesystem.

I think for particular filesystems the admin can do whatever
special tuning settings that cannot be reliably autodetected
in the running system.

Furthermore when ReaR autodetects all filesystem settings
that can be autodetected, it results the problem that then
ReaR will store zillions of filesystem options/attributes
in the 'fs' lines in var/lib/rear/layout/disklayout.conf

I think this results a conflict because ideally the
filesystem options/attributes in disklayout.conf
should only show non-default settings - i.e. what
is intentionally set different compared to the results
of plain "mkfs.FSTYPE /dev/sdXN".

I think when all filesystem options/attributes would be stored
in disklayout.conf the 'fs' entries become a meaningless
"dumb dump" of tons of system values.

And even then some special filesystem options/attributes
would be missing when they cannot be autodetected.

Therefore I think ReaR should provide configuration
variables where the user can specify his particular
wanted special filesystem options/attributes, cf.
"too much secretly working 'magic automatisms' in ReaR" in
https://github.com/rear/rear/pull/1204#issuecomment-282252947

In general FYI:

Another subtle but severe setting in SLES12 (since SP1)
with its default btrfs structure is that some btrfs subvolume
directories (/var/lib/pgsql /var/lib/libvirt/images /var/lib/mariadb)
have the "no copy on write (C)" file attribute set
so that chattr is required in the recovery system, cf.
usr/share/rear/conf/examples/SLE12-SP2-btrfs-example.conf
When that file attribute is not set the system will
probably behave normally for some time after "rear recover"
but the default btrfs behaviour (copy on write) on those
btrfs subvolumes will result that when those databases
are in use their databse files will get totally fragmented, cf.
https://btrfs.wiki.kernel.org/index.php/Gotchas#Fragmentation
so that after some time those databases will beome
unusable slow (delays of seconds up to minutes).

Currently ReaR only supports the "no copy on write (C)"
file attribute on SLES12 for btrfs subvolumes but
currently there is no general support in ReaR for
'chattr' file attributes (search for 'chattr' in the ReaR code).

gozora commented at 2017-02-28 16:08:

@jsmeix

awesome!
I can hardly express in words how impressed I am
about your outstanding debugging.
As always your contributions to ReaR are very very valuable.

Thanks for your kind words!

@gdha, @jsmeix
In general I agree with everything you've say in https://github.com/rear/rear/issues/1208#issuecomment-282991951.
I was thinking to implement attribute capture only for XFS, as it looks to me that missing some of attributes drastically changes its behavior. You can see this issue and https://github.com/rear/rear/issues/1065#issuecomment-260148740

My idea would be to check only for parameters which we had trouble with in the past (crc and ftype for a start) Especially ftype is captious one as it can make application segfault after restore out of the sudden. it is correct that during rear recover user can change diskrestore.sh but it is not much worth if you don't know original values.

In this issue it was only ebiso which segfaulted (mostly because of my bad work) but there can be more applications wrote by programmers like me.

I'll put together some example code and we can discuss it further, if it will be worthy to continue or not.

V.

gozora commented at 2017-02-28 20:30:

Problem with ebiso mentioned in https://github.com/rear/rear/issues/1208#issuecomment-282854316 is corrected in ebiso v0.2.5.

RPM download links can be found here: https://github.com/gozora/ebiso/issues/9#issuecomment-283150790

V.

gozora commented at 2017-03-09 20:35:

First part of this issue is solved with release of ebiso-0.2.5.
Second part of this problem is discussed in #1213, so I guess we can close this one.


[Export of Github issue for rear/rear.]