#2903 PR merged
: Protect against colons in pvdisplay output¶
Labels: enhancement
, fixed / solved / done
pcahyna opened issue at 2023-01-02 15:12:¶
Pull Request Details:¶
-
Type: Bug Fix
-
Impact: Normal
-
Reference to related issue (URL): fixes #1958
-
How was this pull request tested?
- full back-up and recovery on RHEL 8.7 / x86_64 (BIOS) with SATA disk, LVM in default configuration
- full back-up and recovery on RHEL 8.7 / x86_64 (BIOS) with SATA
disk, LVM configured to show
/dev/disk/by-path
device symlinks instead of the usual/dev/sd*
device nodes by using this line in /etc/lvm/lvm.conf:
filter = [ "r|/dev/disk/by-path/.*-usb-|", "a|/dev/disk/by-path/pci-.*-nvme-|", "a|/dev/disk/by-path/pci-.*-scsi-|", "a|/dev/disk/by-path/pci-.*-ata-|", "a|/dev/disk/by-path/pci-.*-sas-|", "a|loop|", "r|.*|" ]
.
This case breaks with the original code already duringrear savelayout
and is fixed by this change.
-
Brief description of the changes in this pull request:
LVM can be configured to show device names under /dev/disk/by-path
in
command output. These names often contain colons that separate fields
like channel and target (for example
/dev/disk/by-path/pci-*-scsi-0:0:1:0-*
, similarly the pci-*
part,
which contains colon-separated PCI bus and device numbers). Since the
"pvdisplay -c
" output also uses colons as field separators and does
not escape embedded colons in any way, embedded colons break parsing of
this output.
As a fix, use the pipe character '|
' as the field separator in
pvdisplay output. (This would break if a PV device has a '|
' in its
name, but this is very much less likely than having a ':
' .)
I did a survey of how similar problems are handled in other tools. Some
Solaris tools (ipadm, dladm, zoneadm) have a special switch (-p
) to
request machine-parseable format. This format has colon-separated
fields, and embedded colons are escaped by backslash to allow parsing by
the shell "read" command (this escaping is the key functionality missing
in the pvdisplay command). zfs has a -H
option which uses an
alternative parseable format: in addition to omitting headers it
separates the fields by a single TAB character. This would be in
principle ideal for our purpose (TABs are unlikely to occur in the
values), but in our case the output is processed via an unquoted "echo"
command, which would not preserve the TAB character. (This is needed to
remove the two leading spaces in the output, not sure what their purpose
is.) I believe that using |
as the separator is the best choice in our
situation.
Also, configure explicitly what fields to output - "pvdisplay -c
"
prints many fields, but I have not found documentation about what fields
is it using exactly, so one had to guess what the output means. Using
"pvdisplay -C
" and selecting the fields explicitly is much clearer.
This also changes the PV size field to match documentation, the comment says that size is in bytes, but it actually was not in bytes. As nothing is actually using the PV size field, this inconsistency has not caused any problem in practice, and no code needs adjusting for the change.
Fix provided by a customer in https://bugzilla.redhat.com/show_bug.cgi?id=2091163 .
jsmeix commented at 2023-01-02 16:09:¶
@pcahyna
while you are at that code
and when you like to do it:
I think when we have '|' as field separator
the whole parsing could be simplified and
made more straightforward as follows
# { echo ' /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0-part1|system|107340627968|7wwpcO-KmNN-qsTE-7sp7-JBJS-vBdC-Zyt1W7' ; \
echo ' /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0-part2 | system2 | 207340627962|2wwpcO-KmNN-qsTE-7sp7-JBJS-vBdC-Zyt1W2 ' ; } \
| tr -d ' ' \
| while IFS='|' read pdev vgrp size uuid ; \
do echo -e "\npdev='$pdev'\nvgrp='$vgrp'\nsize='$size'\nuuid='$uuid'\n" ; \
done
pdev='/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0-part1'
vgrp='system'
size='107340627968'
uuid='7wwpcO-KmNN-qsTE-7sp7-JBJS-vBdC-Zyt1W7'
pdev='/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0-part2'
vgrp='system2'
size='207340627962'
uuid='2wwpcO-KmNN-qsTE-7sp7-JBJS-vBdC-Zyt1W2'
I remove all space characters via tr -d ' '
because space is the field separator in disklayout.conf
according to "Disk layout file syntax" in
https://github.com/rear/rear/blob/master/doc/user-guide/06-layout-configuration.adoc
so spaces in values are forbidden anyway.
The
... | while IFS='|' read ... ; do ...
sets the non-standard IFS value only for the 'read' command
but it keeps the standard IFS value (space, tab, newline)
for the commands in the 'while' body:
# echo 'foo|bar' | while IFS='|' read a b ; do echo "a=$a" ; echo -n "IFS=$IFS" | od -a ; echo "b=$b" ; done
a=foo
0000000 I F S = sp ht nl
0000007
b=bar
pcahyna commented at 2023-01-04 15:32:¶
Hello @jsmeix , thank you for the suggestion, but I must say I am reluctant to make such changes to the layout code, if there is no bug to fix (except the bug fixed by the PR in the current form) .
jsmeix commented at 2023-01-05 09:42:¶
@pcahyna
it is perfectly fine with me to only fix the actual issue here.
I had meant it only as a suggestion for further improvement.
Perhaps I may do that when I have to work on that code again.
[Export of Github issue for rear/rear.]