#1903 PR merged: change the logic of the continue with function is_disk_a_pv during grub2 config

Labels: bug, fixed / solved / done

gdha opened issue at 2018-08-22 13:17:

  • Type: Bug Fix

  • Impact: High

  • Reference to related issue (URL): #1902

  • How was this pull request tested? yes via https://github.com/gdha/rear-automated-testing/issues/65

  • Brief description of the changes in this pull request:
    Recovery fails during grub-config with GRUB2_INSTALL_DEVICES empty
    See script 620_install_grub2.sh

    @schabrolles could you review this in the absence of Johannes?

schabrolles commented at 2018-08-22 13:51:

@gdha

The is_disk_a_pv() looks strange to me.... From my point of view, it does exactly the opposite of what is written into the comments. By looking into the code it should return 0 if the disk is a PV (when disk is known as a PV in LAYOUT FILE) and return 1 when not exist or not a lvmdev.

# Returns 1 if the device is an LVM physical volume
# Returns 0 otherwise or if the device doesn't exists
is_disk_a_pv() {
    disk=$1
    if grep -q "^lvmdev .* ${disk} " $LAYOUT_FILE ; then
        return 0
    else
        return 1
    fi
}

gdha commented at 2018-08-22 14:47:

@schabrolles at line https://github.com/rear/rear/blob/e75b86a7d0cc5d043e0712f0d9b6846797167525/usr/share/rear/lib/layout-functions.sh#L638 I see something different then what you showing above?
Was this function modified lately, but not yet pushed into the master branch perhaps?

schabrolles commented at 2018-08-22 15:15:

@gdha, my mistake... I wasn't looking at master branch (I also need vacation...)

So, this mean is_disk_a_pv() previously return 0 when disk is a PV and 1 if not.
It was not what was written in the comments (opposite) but looked more logic in my mind
is_disk_a_pv return 0 means disk is a PV....

This explain why, we now have a problem and have to reverse the logic....

But I just think that may be we should keep 620_install_grub2.sh with the old logic and rewrite is_disk_a_pv to return 0 if it is a PV and 1 if not ...
What do you think ?

gdha commented at 2018-08-22 15:23:

aha https://github.com/rear/rear/commit/c2602347e47251b0a68c027ce7b89b218a77ed47 shows the @rmetrich did the last modifications in that functions. We need his assistance to get on the same page here!

gdha commented at 2018-08-22 15:40:

@schabrolles @rmetrich it seems the code is only used at:

/usr/share/rear/lib/layout-functions.sh:is_disk_a_pv() {
./usr/share/rear/finalize/Linux-i386/620_install_grub2.sh:    is_disk_a_pv "$disk" && continue
./usr/share/rear/finalize/Linux-i386/610_install_grub.sh:        if is_disk_a_pv "$disk" ; then

so it would be easy to make a decision here, right?

rmetrich commented at 2018-08-22 15:40:

@schabrolles Maybe I shouldn't have changed the code. For sure there is some oddity here.

gdha commented at 2018-08-22 15:42:

@schabrolles @rmetrich Perhaps go back to the original return codes as now it has been reversed and breaking recoveries

rmetrich commented at 2018-08-22 15:44:

sure, all this will be fixed with rear-3, right ;-)

schabrolles commented at 2018-08-23 07:14:

@gdha Maybe we should keep c260234 but restore the old is_disk_a_pv() function.
We should also correct the comments to reflect how the function really behaves (return 0 when is a pv and 1 when it's not) in order to avoid a future mistake.

# Returns 0 if the device is an LVM physical volume
# Returns 1 otherwise or if the device doesn't exists
is_disk_a_pv() {
    disk=$1
    if grep -q "^lvmdev .* ${disk} " $LAYOUT_FILE ; then
        return 0
    else
        return 1
    fi
}

gdha commented at 2018-08-23 09:41:

@schabrolles @rmetrich After reading the content of https://github.com/rear/rear/pull/1897 we might break it again for lvmdev uuid. Therefore, I'm not in favour of going back to the original code.

Perhaphs, it is better to keep the function as it is now and write some good comments in the 2 scripts were it is used?

gdha commented at 2018-08-25 09:14:

@schabrolles @rmetrich If there are no objections I would like to merge it somewhere during the course of next Monday?

jsmeix commented at 2018-09-13 09:26:

I do not like that one particular is_specific_type function has inverse logic
(i.e. returns 1 = bash false if something is of that specific type)
compared to our other is_... functions (that would return 0 = bash true).

Accordingly I would like to align the is_disk_a_pv function
with our other is_... functions and let it return 0 if disk is a PV
and also fix where the is_disk_a_pv function is called.

I fear if I do not align the is_disk_a_pv function with our other is_... functions
we (or our successors after some years) will be hit by this inconsistency.

gdha commented at 2018-09-13 09:43:

@jsmeix The reason of the inversion is in the function itself:

is_disk_a_pv() {
    disk=$1

    # Using awk, select the 'lvmdev' line for which $disk is the device (column 3),
    # cf. https://github.com/rear/rear/pull/1897
    # If exit == 1, then there is such line (so $disk is a PV),
    # otherwise exit with default value '0', which falls through to 'return 0' below.
    awk "\$1 == \"lvmdev\" && \$3 == \"${disk}\" { exit 1 }" "$LAYOUT_FILE" >/dev/null || return 1
    return 0
}

The awk exits with 1 when it hits a PV - I wanted to avoid rethinking this awkward statement.
The function was re-written by @rmetrich

jsmeix commented at 2018-09-13 14:04:

@gdha
I know about that awk-ward code because I also approved
that changes in https://github.com/rear/rear/pull/1897 ;-)

Offhandedly I think the following would result the normal logic

is_disk_a_pv() {
    disk=$1
    # Using awk, select the 'lvmdev' line for which $disk is the device (column 3),
    # cf. https://github.com/rear/rear/pull/1897
    # If awk exit == 1, then there is such line so $disk is a PV
    # and then we let this function return 0
    # cf. https://github.com/rear/rear/pull/1903#issuecomment-420943026
    # otherwise continue and return 1
    awk "\$1 == \"lvmdev\" && \$3 == \"${disk}\" { exit 1 }" "$LAYOUT_FILE" >/dev/null || return 0
    return 1
}

Or do I perhaps misunderstand some special awk-ward behaviour?

E.g. what happens when awk exits with exit code 2 because of a fatal error
or what happens when awk exits with exit code 1 because of a normal error?

I think currently the is_disk_a_pv function also results that $disk is a PV
in case of any error in awk.

@rmetrich
could you explain why you did not use what seems to be more straightforward:

awk "\$1 == \"lvmdev\" && \$3 == \"${disk}\" { exit 0 }" "$LAYOUT_FILE" >/dev/null && return 0
return 1

Probably I misunderstand something here because
I am really not an expert in using awk-ward tools ;-)

rmetrich commented at 2018-09-13 14:10:

Oops, sorry for introducing this powerful awk line :-)
The awk command will exit 1 if the disk is a PV (lvmdev). If no such line is ever hit, then awk will exit 0 which is normal return code for commands.
If you exited prematurely with exit 0 in case of PV, how would you let awk exit with 1 as normal return code? That's just not possible.

jsmeix commented at 2018-09-13 14:18:

@rmetrich
many thanks for your promt explanation - it helps me a lot.

Is that awk line also sufficiently fail safe to let the
is_disk_a_pv function result "the right thing"
even in case of errors in awk?

rmetrich commented at 2018-09-13 14:20:

awk is designed to survive humanity


[Export of Github issue for rear/rear.]