#1450 PR merged: Allow full migration of /dev/disk/by-id/devices

Labels: enhancement, fixed / solved / done, minor bug

schabrolles opened issue at 2017-08-21 11:38:

Currently, /dev/disk/by-id/ devices are not "always" migrated.

  • It should be taken by finalize/GNU/Linux/260_rename_diskbyid.sh script; but in its actual form, this script only migrate NEW by-id name if the real device (/dev/name) has the same name between source and target system.

example: (diskbyid_mappings file)

virtio-a1bd20bf-f66d-442c-8 vda
virtio-a1bd20bf-f66d-442c-8-part1 vda1
virtio-a1bd20bf-f66d-442c-8-part2 vda2

=> /dev/disk/by-id/virtio-a1bd20bf-f66d-442c-8 will be renamed only if /dev/vda is present on the target system.

  • Some files like /etc/lilo.conf /etc/yaboot.conf /etc/default/grub_installdevice are not part of the list of file to be migrated by finalize/GNU/Linux/260_rename_diskbyid.sh
    /etc/default/grub_installdevice is used in SuSe Linux and use /dev/disk/by-id/ devices to point to the bootloader partition. If we forget to migrate this file, yast bootloader will fail.

Proposal:

1- Add /etc/lilo.conf /etc/yaboot.conf /etc/default/grub_installdevice to FILES variable in finalize/GNU/Linux/260_rename_diskbyid.sh

2- Use $SHARE_DIR/layout/prepare/default/320_apply_mappings.sh to apply device mapping on diskbyid_mappings file to migrate /dev/old_device to /dev/new_device in case of recover on a different system.

Add this section at the beginning of finalize/GNU/Linux/260_rename_diskbyid.sh

# Apply device mapping to replace device in case of migration.
tmp_layout="$LAYOUT_FILE"
LAYOUT_FILE="$OLD_ID_FILE"
source $SHARE_DIR/layout/prepare/default/320_apply_mappings.sh
LAYOUT_FILE="$tmp_layout"

But, in order to migrate real device from diskbyid_mappings file (second column), with 320_apply_mappings.sh, those one should be in absolute PATH.

virtio-a1bd20bf-f66d-442c-8 /dev/vda
virtio-a1bd20bf-f66d-442c-8-part1 /dev/vda1
virtio-a1bd20bf-f66d-442c-8-part2 /dev/vda2

Tested with sles11 and sles12 on POWER

example of output with sles12 (/etc/default/grub_installdevice)

[...]
Restore the Mountpoints (with permissions) from /var/lib/rear/recovery/mountpoint_permissions
Patching /etc/default/grub_installdevice: Replacing [/dev/disk/by-id/virtio-3af70bf3-fa3b-40b8-9-part1] by [/dev/disk/by-id/dm-name-mpatha-part1]
[...]

schlomo commented at 2017-08-22 12:22:

@schabrolles I would put all these functions into layout-functions.sh because they are used in that stage.

schabrolles commented at 2017-08-23 11:13:

@schlomo for your review.
I have put all functions in layout-function.sh as per your request.
Currently, I did not update any other additional script to point to those functions. I think it is better to do it in a separate pull request for better control and limit side effect.

Tested during a disk migration on :

  • sles11 sp4
  • sles12 sp2
  • rhel7.3
  • ubuntu 16.04

schlomo commented at 2017-08-23 15:23:

In general, please try to use positive logic as much as possible. It helps a lot to read code and understand the logic.

Good:

  • check foo AND do something
  • check foo OR die with error (this is the classical assertion)

Bad:

  • check not foo or do something
  • check not foo and die with error

The negated check makes one stop and think what does it actually mean.

jsmeix commented at 2017-08-23 15:31:

FYI:
In general I would avoid the *IfError functions
because they cannot work reliably in all cases
because they test $? but $? can be an unexpected value
(totally in compliance how bash works but unexpected)
in a bit more complicated cases, cf.
https://github.com/rear/rear/issues/1415#issuecomment-315692391
where

StopIfError "USB device '$USB_DEVICE' is already mounted on $(grep -E "^$REAL_USB_DEVICE\\s" /proc/mounts | cut -d' ' -f2 |tail -1)"

did never error out as intended because the pipe after 'grep' always
returns $? with zero value because 'tail -1' results zero return code.

schabrolles commented at 2017-08-23 22:42:

@jsmeix

What about : test -z "$var" && Error "Empty string passed to UdevSymlinkName()"

schlomo commented at 2017-08-24 08:23:

for that type of thing I prefer test "$var" || Error "Empty string passed to UdevSymlinkName()" because it is a positive check with an else clause.

jsmeix commented at 2017-08-24 09:38:

@schabrolles
see my comment
https://github.com/rear/rear/pull/1450#pullrequestreview-58116661
what I personally prefer - i.e.

test "$var" || Error "var must not be empty"

or even

test $var || Error "var must not be empty (or only spaces)"

But technically

test -z "$var" && Error "var must not be empty"

is also right and sufficiently well understandable.

In contrast oversophisticated stuff like

! test -z "$var" || Error "var must not be empty"

is bad because it is needlessly hard to understand
and it could even make simple minds explode ;-)

Ultimately it is your code so that the final decision
is yours what specific coding style you prefer.

jsmeix commented at 2017-08-24 09:50:

I made evil typos in my prevois comment that I fixed now.
The right code is

test "$var" || Error "var must not be empty"

and

test $var || Error "var must not be empty (or only spaces)"

@schabrolles
FYI regarding 'Error' versus 'BugError':

BugError should only be used when the cause is a bug in ReaR.

Error should be used when the cause is not in ReaR itself.

For example when in script 100_prepare_it.sh a variable is set
by ReaR that is needed in a subsequent script 200_do_it.sh
then in script 200_do_it.sh an assertion should be like

# var must have been set in 100_prepare_it.sh:
test $var || BugError "var empty (or only spaces)"

but in 100_prepare_it.sh it would be usually a test like

# ... [ code that sets var ] ...
test $var || Error "Failed to set var (empty or only spaces)"

when the root cause why it failed to set var is not a bug in ReaR
but an error in the environment where that code is run
(i.e. something outside of ReaR).
And when var must be provided by the user it would be

test $var || Error "var must not be empty (or only spaces)"

schabrolles commented at 2017-08-24 17:38:

@jsmeix
Thanks for the precious info (I learn).
This test on $var is made inside a function (provided by layout-function.sh) that need an argument (var=$1), this function is called by other ReaR scripts (like 260_rename_diskbyid.sh).
So, if $var is empty, this mean that something goes wrong in 260_rename_diskbyid.sh (which should have called the function with an argument).
Then it should be considered has a "Bug" ... Am I right ?

If yes, I see this example (in layout-function.sh):

[[ "$name" ]]
BugIfError "Empty string passed to get_device_name"

What so you think about this implementation ?

jsmeix commented at 2017-08-25 10:35:

@schabrolles
you are right - a syntactically wrong function call is a bug in ReaR.
E.g. see the BugError in the UserInput function when things
are hopelessly wrong versus the 'using fallback' or 'ignored'
behaviour when there is a reasonable way to proceed.

jsmeix commented at 2017-08-25 10:47:

@schabrolles
I found

    if test $sed_change -eq 1 ; then

which may show unwanted bash errors
but quoting "$sed_change" does not help here
because (on commandline):

# sed_change=" 1 " ; if test $sed_change -eq 1 ; then echo OK ; fi
OK

# sed_change="X" ; if test $sed_change -eq 1 ; then echo OK ; fi
-bash: test: X: integer expression expected

# sed_change="" ; if test $sed_change -eq 1 ; then echo OK ; fi
-bash: test: -eq: unary operator expected

# unset sed_change ; if test $sed_change -eq 1 ; then echo OK ; fi
-bash: test: -eq: unary operator expected

# sed_change=" 1 " ; if test "$sed_change" -eq 1 ; then echo OK ; fi
OK

# sed_change="X" ; if test "$sed_change" -eq 1 ; then echo OK ; fi
-bash: test: X: integer expression expected

# sed_change="" ; if test "$sed_change" -eq 1 ; then echo OK ; fi
-bash: test: : integer expression expected

# unset sed_change ; if test "$sed_change" -eq 1 ; then echo OK ; fi
-bash: test: : integer expression expected

so that - as far as I know - all what one can do is to
suppress the unwanted STDERR bash messages
that won't help in any way if they appear in the log
(but they may cause user questions that something
could be wrong in our code), so that I suggest simply

    # Avoid stderr if sed_change is not set or empty or not an integer value:
    if test "$sed_change" -eq 1 2>/dev/null ; then

(c.f. what I do in the UserInput function).

jsmeix commented at 2017-08-25 11:01:

I need to correct my wrong
https://github.com/rear/rear/pull/1450#issuecomment-324373208
where I had written

In general I would avoid the *IfError functions
because they cannot work reliably in all cases
because they test $? but $? can be an unexpected value

That reason is nonsense because also

COMMAND || Error "..."

test the exit code (i.e. $?) of COMMAND.

The actual reason why I avoid the *IfError functions
is that they cannot work in case of 'set -e'
cf. https://github.com/rear/rear/issues/700
because in case of 'set -e' the bash directly exits
when a single COMMAND results non-zero exit code
and no subsequent *IfError function is ever called
that could test $? of the previous COMMAND.
For example on commandline:

# ( set -e ; cat qqq ; if (( $? != 0 )) ; then echo ERROR ; fi )
cat: qqq: No such file or directory

# ( set -e ; cat qqq || echo ERROR )
cat: qqq: No such file or directory
ERROR

# ( set -e ; if ! cat qqq ; then echo ERROR ; fi )
cat: qqq: No such file or directory
ERROR

The crucial point is that 'cat qqq || echo ERROR'
or 'if ! cat qqq ; then echo ERROR ; fi'
is a single COMMAND in bash.

schlomo commented at 2017-08-25 11:19:

@schabrolles @jsmeix maybe I have an idea for a much simpler solution to the sed change monitor problem:

Instead of writing to a special file write to STDOUT like this:

$ rm changed
$ date > d ; sed -i -e 's/2017/XXX/w /dev/stdout' d >>changed
$ date > d ; sed -i -e 's/2016/XXX/w /dev/stdout' d >>changed
$ wc -l changed
1 changed
$ cat changed
Fr 25. Aug 13:15:10 CEST XXX
$ if test $(wc -l < changed) -gt 0 ; then echo changed ; fi
changed

This trick also makes the code shorter and removes the need to reset the change tracking file every time. You can collect all the change infos from all the sed calls and then check in the end.

The other thing I noticed is that there is no /g flag in the sed substitution which means that sed will only replace the first occurrence of an ID on a single line. If this is not intentional then better add /g to every substitution.

jsmeix commented at 2017-08-25 12:06:

@schlomo
do I understand it correctly that the actual improvement of

sed -i -e 's/THIS/THAT/gw /dev/stdout' file >>all_changes

over

sed -i -e 's/THIS/THAT/gw current_changes' file

is that appending various sed's STDOUT into all_changes
makes all changes sum up therein while in contrast
sed's own 'w current_changes' cannot append
so that one cannot collect changes of various
sed calls this way in one same file?

schlomo commented at 2017-08-25 12:15:

@jsmeix yes, that is the idea. Saves you from checking the result file after every sed call and counting the lines in this file avoids the problem with the quoting that you mentioned above.

schabrolles commented at 2017-08-25 21:06:

@schlomo thanks for the tip ... it is implemented now

schabrolles commented at 2017-08-30 09:28:

@jsmeix @gdha @schlomo Same for this one... Last review before merging.

jsmeix commented at 2017-08-30 11:55:

@schabrolles
preferably use paired parenthesis for case patterns
cf. "Paired parenthesis" in
https://github.com/rear/rear/wiki/Coding-Style

schabrolles commented at 2017-08-30 19:33:

@schlomo @jsmeix changes applied


[Export of Github issue for rear/rear.]