#1563 Issue closed: 'EFI System Partition' name restoration fails (blanks in value)

Labels: bug, cleanup, fixed / solved / done

MarcoS80 opened issue at 2017-11-06 09:13:

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

  • rear version (/usr/sbin/rear -V): Relax-and-Recover 2.2-git.0.b7927e5.unknown.changed / 2017-10-31
  • OS version (cat /etc/rear/os.conf or lsb_release -a): Ubuntu/17.04/i386
  • rear configuration files (cat /etc/rear/site.conf or cat /etc/rear/local.conf):
OUTPUT=ISO
BACKUP=NETFS
BACKUP_URL=iso:///backup
ISO_VOLID="REARISO"
OUTPUT_URL=file:///tmp
MODULES=( 'all_modules' )
  • Are you using legacy BIOS or UEFI boot? UEFI

  • Brief description of the issue:
    The EFI System Partition cannot be created correctly because the name "EFI System Partition" does not have double quotes.

  • Work-around, if any:
    In:
    /usr/share/rear/layout/prepare/GNU/Linux/100_include_partition_code.sh
    the command:

parted -s $device mkpart '$name' ${start}B $end >&2

at lines 190 and 203, should be:

parted -s $device mkpart '"$name"' ${start}B $end >&2

whereas the command:

echo "parted -s $device name $number '$name' >&2"

at line 246 should be:

echo "parted -s $device name $number '\"$name\"' >&2"

gozora commented at 2017-11-06 16:13:

@jsmeix as you are the master in (among other things) quoting, what is your opinion here ?
@MarcoS80 could you create pull request for changes mentioned above?

Thanks

V.

jsmeix commented at 2017-11-07 10:54:

@gozora
yesterday I had a quick look at that code
but I do not yet fully understand what goes on.

In particular I do not understand how those values
could contain blanks (so that special quoting is needed).

I fear a plain wrong value gets used here - I mean
a value that is not meant for that parted parameters.

Because the values in a disklayout.conf line are positional values
a missing value (e.g. when one is empty) lead to false
positional values in a disklayout.conf line which results that
wrong values get used for certain parted parameters
and that shows up as weird errors at unexpected places.

But currently I have no time for a deeper analysis
what actually goes wrong here - be patient...

rmetrich commented at 2018-01-19 13:27:

I hit the issue as well.
Commit
https://github.com/rear/rear/commit/2540c7a2319791b58fa2b9c87fdefbe3c5341b4c
removed the double-quote from the following lines:

-parted -s $device mkpart '"$name"' ${start}B $end >&2
+parted -s $device mkpart '$name' ${start}B $end >&2

and

-parted -s $device mkpart '"$name"' $start_mb $end_mb >&2
+parted -s $device mkpart '$name' $start_mb $end_mb >&2

which is wrong. This is needed for the UEFI partition which is usually named "EFI System Partition".
parted is likely not parsing arguments correctly:

  • parted -s /dev/sda mkpart "EFI System Partition" ... fails
  • parted -s /dev/sda mkpart 'EFI System Partition' ... fails
  • parted -s /dev/sda mkpart EFI\ System\ Partition ... fails

jsmeix commented at 2018-01-23 10:58:

@rmetrich
additionally I wonder about when the $name variable gets evaluated
compared to when the $device variable gets evaluated
because there is a difference between

"'$var'"

and

'"$var"'

as in the following example

# device="/dev/sdX"

# name="EFI System Partition"

# echo parted -s $device mkpart '$name'
parted -s /dev/sdX mkpart $name

# echo parted -s $device mkpart "'$name'"
parted -s /dev/sdX mkpart 'EFI System Partition'

# echo parted -s $device mkpart '"$name"'
parted -s /dev/sdX mkpart "$name"

jsmeix commented at 2018-01-23 11:02:

@OliverO2
what was the reason behind why you changed the quoting of the $name variable in your
https://github.com/rear/rear/commit/2540c7a2319791b58fa2b9c87fdefbe3c5341b4c

jsmeix commented at 2018-01-23 11:10:

@rmetrich
forget my above https://github.com/rear/rear/issues/1563#issuecomment-359755054
I overlooked the outer "..." quotation so that actually
it was before

# echo "parted -s $device mkpart '"$name"' ..."
parted -s /dev/sdX mkpart 'EFI System Partition' ...

versus now

# echo "parted -s $device mkpart '$name' ..."
parted -s /dev/sdX mkpart 'EFI System Partition' ...

but I do not see a difference in the echo output?

jsmeix commented at 2018-01-23 11:44:

As I wrote in https://github.com/rear/rear/issues/1563#issuecomment-342446715
I cannot imagine how partition names with blanks could work at all because
as far as I see in layout/prepare/GNU/Linux/100_include_partition_code.sh
the code is basically as follows:

create_partitions() {
...
    while read part disk size pstart name flags partition junk; do
...
            cat >> "$LAYOUT_CODE" <<EOF
my_udevsettle
parted -s $device mkpart '$name' ${start}B $end >&2
my_udevsettle
EOF
...
    done < <(grep "^part $device " $LAYOUT_FILE)
...
}

For me name values with blanks (even if quoted) in disklayout.conf
cause wrong reading and therefore wrong variable assignments
when I use a disklayout.conf with name values with blanks as in

# grep ^part disklayout.conf
part /dev/sda 14173601792 1048576 partition_name1 none /dev/sda1
part /dev/sda 1048576 14174650368 partition name2 bios_grub /dev/sda2
part /dev/sda 2147483648 14175698944 'partition name3' swap /dev/sda3
part /dev/sda 5151636992 16323182592 "partition name4" none /dev/sda4

# while read part disk size pstart name flags partition junk ; do echo "'$part' '$disk' '$size' '$pstart' name='$name' flags='$flags' partition='$partition' junk='$junk'" ; done < <(grep "^part /dev/sda " disklayout.conf)
'part' '/dev/sda' '14173601792' '1048576' name='partition_name1' flags='none' partition='/dev/sda1' junk=''
'part' '/dev/sda' '1048576' '14174650368' name='partition' flags='name2' partition='bios_grub' junk='/dev/sda2'
'part' '/dev/sda' '2147483648' '14175698944' name=''partition' flags='name3'' partition='swap' junk='/dev/sda3'
'part' '/dev/sda' '5151636992' '16323182592' name='"partition' flags='name4"' partition='none' junk='/dev/sda4'

The only one that works is partition_name1 which does not contian a character from $IFS
cf. https://github.com/rear/rear/issues/1372

rmetrich commented at 2018-01-23 11:48:

@jsmeix There is encoding of partitions with space in their name:

usr/share/rear/layout/save/GNU/Linux/200_partition_layout.sh
type=$(echo "$type" | sed -e 's/ /0x20/g') # replace spaces with 0x20 in name field

jsmeix commented at 2018-01-23 12:15:

@rmetrich
thanks to point that out!

In usr/share/rear/layout/prepare/GNU/Linux/100_include_partition_code.sh
there is the counterpart

        # The 'name' could contain spaces (were replaced with 0%20; need to change this again).
        name=$(echo "$name" | sed -e 's/0x20/ /g')

which was implemented by @gdha via
https://github.com/rear/rear/commit/963dfd7a3968e1506830b42deb88b5f04eaa5e38

jsmeix commented at 2018-01-23 12:23:

Unbelievable!
Duplicated quoting as in

parted -s /dev/sdb unit MiB mkpart '"part name 4"' 7 8

and also the other way round via

parted -s /dev/sdb unit MiB mkpart "'part name 5'" 9 10

seem to be the only way how to make parted work,cf.
https://github.com/rear/rear/commit/316b5f8d5aa5bb96b8aa037266912a59e6cda046

jsmeix commented at 2018-01-23 12:32:

I will do a pull request to fix it again and explain it in the code.

rmetrich commented at 2018-01-23 12:44:

Indeed, there is likely a big issue with parted command-line parsing ...

jsmeix commented at 2018-01-23 12:57:

Unfortunately the parted manual is very silent about quoting.
In
https://www.gnu.org/software/parted/manual/parted.html
the only place is

Command: name number name

Sets the name for the partition number (GPT, Mac, MIPS and PC98 only).
The name can be placed in quotes.

Example:
   (parted) name 2 'Secret Documents'

Set the name of partition 2 to ‘Secret Documents’. 

As far as I understand it this means parted's internal parser
supports single quotes for values with blanks.

When calling parted on the command line it seems
there is a bug in parted that it does not handle a
command line argument argv[N] that contains blanks
as a single value with single quotes for its internal parser.

It seems when calling parted on the command line all
command line arguments get sent to its internal parser
as a single string of all command line arguments.

In this case it explains why when calling parted on the command line
a value with blanks must be provided with duplicated quoting
to get it with the inner quoting caracters to parted's parser
i.e. in the form like in the above example

parted -s /dev/sdb unit MiB mkpart "'part name 5'" 9 10

to get 'part name 5' to parted's parser.

rmetrich commented at 2018-01-23 13:00:

Just tried parted name 2 'Secret Documents' and it doesn't work ...

jsmeix commented at 2018-01-23 13:04:

When you call plain 'parted' and then use it interactively
single quotes work because then you talk directly
to parted's parser:

# parted /dev/sdb
GNU Parted 3.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) unit MiB mkpart 'part name 6' 11 12

In contrast when you call a whole parted command on command line
you need duplicated quoting, the outer quoting for bash, the inner
quoting for parted's internal parser:

# parted -s /dev/sdb unit MiB mkpart "'part name 7'" 13 14

OliverO2 commented at 2018-01-23 13:32:

@jsmeix

what was the reason behind why you changed the quoting of the $name variable in your
2540c7a

It was just a cleanup. Of course I don't know now what I was really thinking at the time but I guess I was just surprised by the unusual double quoting. In addition I might have overlooked that the lines affected were inside a here document:

Here's a little test:

#!/usr/bin/env bash

device='/dev/xd1'
name='EFI System Partition'
start=1024
end=4096

echo "Outside of here document:"
echo '#1' parted -s $device mkpart "$name" ${start}B $end >&2
echo '#2' parted -s $device mkpart '"$name"' ${start}B $end >&2

echo "Inside a here document:"
cat <<EOF | bash
echo '#1' parted -s $device mkpart "$name" ${start}B $end >&2
echo '#2' parted -s $device mkpart '"$name"' ${start}B $end >&2
EOF

Output is:

Outside of here document:
#1 parted -s /dev/xd1 mkpart EFI System Partition 1024B 4096
#2 parted -s /dev/xd1 mkpart "$name" 1024B 4096
Inside a here document:
#1 parted -s /dev/xd1 mkpart EFI System Partition 1024B 4096
#2 parted -s /dev/xd1 mkpart "EFI System Partition" 1024B 4096

Your analysis seems absolutely plausible. Parted obviously has some strange quoting requirements which would be expected with the MS-DOS command interpreter but not here. Parted really should do the necessary translations from arguments internally. (Parted has other issues as well.)

So in a nutshell, my quoting changes for the parted calls should be reverted. Parted would then be called as before with its partition name argument enclosed in double quotes which are not stripped by the shell.

jsmeix commented at 2018-01-23 13:57:

I am thinking about a more fail-safe way how to encode
possible $IFS characters (like spaces) in a GPT partition name
because the current ' ' -> '0x20' -> ' ' re-replacement breaks
when a partition name already contains the (sub)string '0x20'.

In
http://download.intel.com/support/motherboards/server/sb/gpt_white_paper_1_1.pdf
and also in
http://www.jonrajewski.com/data/Presentations/CEIC2013/Partition_Table_Documentation_Compressed.pdf
I found

Each GPT partition has a 36-character Unicode name

GPT allows for each partition to have a 36 character Unicode name

which looks scaring because Unicode is no character encoding
(but e.g. UTF-8 would be a character encoding for Unicode)
so that currently I do not know what byte values are valid
for a GPT partition name (I guess UTF-8 would be valid).

jsmeix commented at 2018-01-23 14:13:

In
https://askubuntu.com/questions/53770/how-can-i-encode-and-decode-percent-encoded-strings-on-the-command-line
I found the bash functions

urlencode() {
    # urlencode 
    local length="${#1}"
    for (( i = 0; i < length; i++ )); do
        local c="${1:i:1}"
        case $c in
            [a-zA-Z0-9.~_-]) printf "$c" ;;
            *) printf '%%%02X' "'$c"
        esac
    done
}

urldecode() {
    # urldecode 
    local url_encoded="${1//+/ }"
    printf '%b' "${url_encoded//%/\\x}"
}

which look promising at least for ASCII strings
(I did not yet test UTF-8 strings).

OliverO2 commented at 2018-01-23 14:14:

GUID Partition Table - Wikipedia says:

Partition name (36 UTF-16LE code units)

What about simply using the &nbsp; HTML entity to encode spaces? If someone had a partition name containing the &nbsp; sequence it probably happened by mistake anyway. In contrast 0x20 could appear as part of something like P10x200G.

OliverO2 commented at 2018-01-23 14:26:

And another thought: There is probably no need to be concerned about GPT-internal encodings. I'd expect command line tools such as parted and gdisk to accept arguments in the current locale's encoding and do necessary translations internally. Anything else would seem like a bug to me.

jsmeix commented at 2018-01-23 15:27:

For the fun of it:
ReaR's current locale is always POSIX/C, see /usr/sbin/rear

# make sure that we use only english
export LC_CTYPE=C LC_ALL=C LANG=C

so that any non-ASCII characters could lead to "interesing effects",
cf. "Character encoding" in
https://github.com/rear/rear/wiki/Coding-Style
and see
https://en.opensuse.org/SDB:Plain_Text_versus_Locale
;-)

FWIW:
This issue here is also about working around a bug
in parted when it parses command line arguments
but that's what ReaR must do,
cf. "Dirty hacks welcome" in
https://github.com/rear/rear/wiki/Coding-Style

jsmeix commented at 2018-01-24 15:37:

I did the pull request https://github.com/rear/rear/pull/1706
regardless that it is not yet tested so that you could have an early look.

jsmeix commented at 2018-01-25 10:30:

FWIW:

On SLES10 with GNU Parted 1.6.25.1
parted does not support providing a GPT partition name
directly in the mkpart command.

With GNU Parted 1.6.25.1 one must do it via a separated 'name' command
as follows:

# parted -s /dev/hdc mklabel gpt

# parted -s /dev/hdc mkpart primary 10 20

# parted -s /dev/hdc print
Disk geometry for /dev/hdc: 0kB - 2147MB
Disk label type: gpt
Number  Start   End     Size    File system  Name                  Flags
1       10MB    20MB    10MB                                       

# parted -s /dev/hdc name 1 "'my first partition'"

# parted -s /dev/hdc print
Disk geometry for /dev/hdc: 0kB - 2147MB
Disk label type: gpt
Number  Start   End     Size    File system  Name                  Flags
1       10MB    20MB    10MB                 my first partition    

jsmeix commented at 2018-01-25 10:59:

On SLES11 SP4 with GNU parted 2.3 it works
to provide a GPT partition name directly in the mkpart command:

# parted -s /dev/sdb mklabel gpt

# parted -s /dev/sdb unit MiB print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdb: 2048MiB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start  End  Size  File system  Name  Flags

# parted -s /dev/sdb unit MiB mkpart "'my first partition'" 10 20

# parted -s /dev/sdb unit MiB print
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdb: 2048MiB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start    End      Size     File system  Name                Flags
 1      10.0MiB  20.0MiB  10.0MiB               my first partition

jsmeix commented at 2018-01-26 15:31:

With https://github.com/rear/rear/pull/1706 merged
this issue should be fixed to a currently reasonable extent
which means: Currently it fails for UTF-8 encoded strings
so that UTF-8 encoded GPT partition names
are not (yet?) supported.

jsmeix commented at 2018-01-26 15:52:

@OliverO2
regarding your
https://github.com/rear/rear/issues/1563#issuecomment-359806641
and how UTF-8 stuff fails when the "current locale" is POSIX/C see
https://github.com/rear/rear/pull/1706#issuecomment-360820779
therein in particular

+++ parted -s /dev/sdb mkpart ''\''UTF-8 name bin<C3><A4>r'\''' 12587008B 23072767B
Error during translation: Invalid or incomplete multibyte or wide character

Have fun with locales!
Cf.
https://en.opensuse.org/SDB:Plain_Text_versus_Locale

OliverO2 commented at 2018-01-29 21:48:

Yes, that's always the problem with arguments being interpreted according to the current locale, when we'd really like some locale-agnostic binary string, which could be converted back and forth in a lossless fashion without caring for its native locale.

jsmeix commented at 2018-01-30 10:54:

Ideally when during "rear mkrescue/mkbackup" that runs in POSIX/C locale
tools report values, those values are already "somehow right" so that
later those same values can be used "as is" again as input for those tools
to set up something in POSIX/C locale - even if those vaules actually
contain non-ASCII bytes.

But I fear in reality things "just break arbitrarily" when non-ASCII bytes
appear in values when tools run in POSIX/C locale.

Therefore I have the dim feeling that in the end we are forced to call
specific tools as needed during "rear mkrescue/mkbackup" in a UTF-8 locale
and accordingly we would need to have UTF-8 support by default in the
recovery system so that those specific tools which need it can also run
in a UTF-8 locale during "rear recover" - but as far as I learned
the whole scripts must still run in POSIX/C locale, cf.
https://github.com/rear/rear/issues/1035

Currently only Borg backup restore is run in a UTF-8 locale,
see prep/BORG/default/200_prep_borg.sh
and restore/BORG/default/400_restore_backup.sh

By the way I found another place in the code where
a dirty hack 'sed' workaround for UTF-8 is done,
in lib/mkrescue-functions.sh - see also
https://github.com/rear/rear/issues/1018#issuecomment-251385721

jsmeix commented at 2018-01-30 11:13:

I wonder if I should implememt a test
that is run during "rear mkrescue/mkbackup"
which tests if there is a non-ASCII byte in disklayout.conf
and errors out perhaps even via BugError?

@gdha @rmetrich @gozora @OliverO2
what do you think?

OliverO2 commented at 2018-01-30 14:29:

If ReaR character set support should be limited to 7-bit only, then yes, I'd favor an early error message during rear mkrescue/mkbackup over a failing rescue operation.

On the other hand, I'd prefer ReaR to move towards supporting an 8-bit character set (the default being UTF-8), which is possible in my view in a backward-compatible manner. Cf. https://github.com/rear/rear/issues/1035#issuecomment-361609619.

jsmeix commented at 2018-01-30 15:22:

@OliverO2
of course an early error exit during "rear mkrescue/mkbackup"
is preferred over a failing "rear recover" operation, cf.
https://github.com/rear/rear/pull/1697#issuecomment-358922509

Only a nitpicking side note:
I think the root of all evil is not 7-bit versus 8-bit character encoding.
I assume any single-byte character encoding would "just work" - only
how characters look at the screen could be wrong - but I would think
that for any single-byte encoding the actual byte values would not change.
I think the root of all evil is single byte encoding versus multibyte encoding.

OliverO2 commented at 2018-01-31 18:51:

@jsmeix

I assume any single-byte character encoding would "just work"

Actually, probably not - at least with parted: See https://github.com/rear/rear/pull/1706#issuecomment-362031250 for an explanation.

OliverO2 commented at 2018-01-31 19:02:

@jsmeix

I think the root of all evil is not 7-bit versus 8-bit character encoding.

Multi-byte encodings are downward compatible with 7-bit ASCII. So 7-bit will always work.

I think the root of all evil is single byte encoding versus multibyte encoding.

Not really. Even if you used a single-byte 8-bit encoding such as ISO 8859.1, a tool like parted operating in the C locale would have no idea of how to convert this into a wide-character string: As soon as it finds character codes > 127, it would fail.

So it's all about proper locale setting, or sticking to 7-bit ASCII.

OliverO2 commented at 2018-01-31 19:08:

@jsmeix
Below is some code you could use to reproduce what parted does:

$ LC_CTYPE=C ./wcs Schön
Error during translation: Invalid or incomplete multibyte or wide character
$ LC_CTYPE=en_US.utf8 ./wcs Schön
"Schön" converted to a wide-character string is "Schön"

Store as wcs.c and use cc -o wcs wcs.c to compile:

#include <wchar.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>
#include <malloc.h>
#include <locale.h>

/* original function from parted's strlist.c, with xrealloc call replaced by realloc call */
static wchar_t*
gettext_to_wchar (const char* str)
{
    int     count;
    wchar_t*    result;
    size_t      status;
    mbstate_t   ps;

    count = strlen (str) + 1;
    result = malloc (count * sizeof (wchar_t));
    if (!result)
        goto error;

    memset(&ps, 0, sizeof (ps));
    status = mbsrtowcs(result, &str, count, &ps);
    if (str != NULL)
        goto error;

    result = realloc (result, (status + 1) * sizeof (wchar_t));
    return result;

error:
    fprintf (stderr, "Error during translation: %s\n", strerror (errno));
    exit (EXIT_FAILURE);
}

int main(int argc, char *argv[]) {
    setlocale(LC_ALL, "");

    if (argc != 2) {
        fprintf(stderr, "usage: %s STRING\n", argv[0]);
        exit(1);
    }

    printf("\"%s\" converted to a wide-character string is \"%ls\"\n", argv[1], gettext_to_wchar(argv[1]));

    return 0;
}

jsmeix commented at 2018-02-01 11:27:

@OliverO2
many thanks for your analysis!

Currently I have no time to have a closer look
how parted and ReaR in C locale interact, cf.
https://github.com/rear/rear/pull/1706#issuecomment-362236133
and after FOSDEM there are already "other important things" waiting...

I think my percent-encoding does not change any byte value
so that I think I got UTF-8 encoded characters as output
from parted regardless that parted runs in C locale
during "rear mkrescue" and that looks wrong to me.
Later when "rear recover" (that also runs in C locale)
feeds that same bytes into parted (under the assumption
that my percent-decode really results the same bytes)
parted fails when it gets its own output bytes as input,
cf. my "Ideally..." wish versus my "...fear..." and my "...dim feeling..." in my above
https://github.com/rear/rear/issues/1563#issuecomment-361557086


[Export of Github issue for rear/rear.]