#198 Issue closed: Recovery fails on partitions creation: Kernel failed to re-read partition table

Labels: enhancement, waiting for info

altring opened issue at 2013-02-06 12:17:

ReaR 1.14 from GitHub
OS to restore: CentOS 6.3 x32
Scenario:
There is a CentOS 6.3 system on Linux software RAID1, created during the system's setup.
disklayout.conf:

disk /dev/sda 320071851520 msdos
part /dev/sda 209715200 1048576 primary boot,raid /dev/sda1
part /dev/sda 13424850778 210763776 primary raid /dev/sda2
part /dev/sda 486472214181 8800698368 primary raid /dev/sda3
disk /dev/sdb 320072933376 msdos
part /dev/sdb 13424806769 1048576 primary raid /dev/sdb1
part /dev/sdb 209715200 8590983168 primary boot,raid /dev/sdb2
part /dev/sdb 486472258190 8800698368 primary raid /dev/sdb3

raid /dev/md2 level=raid1 raid-devices=2 uuid=xxxxxxxxxxx devices=/dev/sda3,/dev/sdb3
raid /dev/md1 level=raid1 raid-devices=2 uuid=xxxxxxxxxxx devices=/dev/sda2,/dev/sdb1
raid /dev/md0 level=raid1 raid-devices=2 uuid=xxxxxxxxxxx devices=/dev/sda1,/dev/sdb2

fs /dev/md2 / ext4 uuid=xxxxxxxxxxx label= blocksize=4096 ... options=rw
fs /dev/md0 /boot ext4 uuid=xxxxxxxxxxx label= blocksize=1024 ... options=rw
swap /dev/md1 uuid=xxxxxxxxxxx label=

The recovery process usually fails at the partition creation stage on /dev/sdb with the message:

Warning: WARNING: the kernel failed to re-read the partition table on /dev/sdb (Device or resource busy). 
As a result, it may not reflect all of your changes until after reboot.

Partition recreation on /dev/sda seems to be OK.

Relax-and-recover.log:

+++ echo -e 'Creating partitions for disk /dev/sdb (msdos)'
+++ parted -s /dev/sdb mklabel msdos
+++ parted -s /dev/sdb mkpart primary 32768B 13424839536B
Warning: The resulting partition is not properly aligned for best performance.
+++ parted -s /dev/sdb set 1 raid on
+++ parted -s /dev/sdb mkpart primary 13424840704B 13634555903B
Warning: The resulting partition is not properly aligned for best performance.
+++ parted -s /dev/sdb set 1 boot on
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sdb (Device or resource busy).  As a result, it may not reflect all of your changes until after reboot.
An error occurred during layout recreation.

# partprobe /dev/sdb gives the same error.

I've tried more than 10 times with other HDDs and workstations - the error can also happen at the start of recreation, after the
+++ parted -s /dev/sdb mklabel msdos
line or after a
... set raid ... on
line.
I couldn't see any pattern.
A couple of times the partition recreation was successful and ReaR restored the data from backup.tar.gz, but the GRUB installation hung with this in the logfile:

Including finalize/Linux-i386/21_install_grub.sh
Installing GRUB bootloader
Probing devices to guess BIOS drives. This may take long time,

There is no sense in waiting a "long time" when you need the system up and running right now.
Anyway, waiting 4 hours gave no result.
I think this is a bug too.

altring commented at 2013-02-07 08:57:

If I mount /dev/md0 to /mnt/md0 I can see the contents of my /boot partition: kernel, initramfs, the "grub" folder with all its contents, and other files.
Attempts to boot from either HDD gave no luck.
When I try:

grub-install /dev/md0
I get:
/dev/md0 does not have any corresponding BIOS drive.
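For reference, grub-install (GRUB Legacy) typically needs a BIOS-visible physical disk rather than an md device, so with a RAID1 /boot it is often pointed at each member disk. A minimal sketch using the member disks from the disklayout above (as the rest of this thread shows, this also requires 0.90 superblocks so that GRUB Legacy can read /boot):

# Install GRUB Legacy into the MBR of each RAID1 member so either disk can boot.
grub-install /dev/sda
grub-install /dev/sdb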

gdha commented at 2013-02-07 12:28:

I found the following page claiming GRUB cannot construct a software RAID
https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Installation_Guide/s1-grub-installing.html

The RH knowledge article goes into more detail, https://access.redhat.com/knowledge/articles/7094, but you need a valid subscription to read it.

altring commented at 2013-02-07 13:12:

Thank you.
Now I'm trying to restore the system with the bootloader on a separate partition, not in RAID.
And rear constantly fails on:

parted -s /dev/sda mklabel msdos

if the disk already contains RAID partitions.
Workaround: remove the RAID partitions before running rear.
I think rear should check for existing partitions on the target disks and, if so, remove them.
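A sketch of what such a pre-cleanup before mklabel might look like, using the device names from this report (a sketch only; it assumes mdadm and wipefs are available in the rescue system):

# Stop any arrays the rescue system has auto-assembled from the old superblocks.
mdadm --stop --scan
# Erase the old RAID superblocks so the members are no longer recognised.
mdadm --zero-superblock /dev/sda1 /dev/sda2 /dev/sda3
# Wipe any remaining filesystem/RAID signatures from the whole disk, if wipefs is available.
wipefs -a /dev/sda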
...
Rear hangs on

Including finalize/Linux-i386/21_install_grub.sh
Installing GRUB bootloader
Probing devices to guess BIOS drives. This may take long time,

altring commented at 2013-02-11 10:42:

I've managed to solve the GRUB problem:
GRUB Legacy doesn't understand RAID metadata > 0.90, and ReaR creates RAID arrays with the default metadata version 1.2, so GRUB failed to install.
ReaR should check the GRUB version and handle the RAID metadata accordingly.
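For reference, a sketch of the corresponding mdadm call for the /boot array, with the member devices taken from the disklayout above (ReaR generates this from disklayout.conf, so its actual invocation may differ):

# Create the /boot RAID1 with the old 0.90 superblock format so GRUB Legacy can read it.
mdadm --create /dev/md0 --level=raid1 --raid-devices=2 --metadata=0.90 /dev/sda1 /dev/sdb2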

altring commented at 2013-02-11 12:53:

What about "re-read partitions table" bug? Will it be solved?

gdha commented at 2013-02-11 14:26:

Found an interesting article http://dione.no-ip.org/AlexisWiki/DebianSqueezeRaid1AndGrub2

@altring Could you shed some light on the details, before and after, concerning the grub version and any other executables you might have changed? To be honest - I couldn't follow all the way... sorry

altring commented at 2013-02-11 14:41:

@gdha
"I 've managed to solve grub problem:"
I've added metadata=0.90 in disklayout.conf in corresponding string (RAID partition with /boot).
Before:
raid /dev/md0 level=raid1 raid-devices=2 uuid=xxxxxxxxxxx devices=/dev/sda1,/dev/sdb2
After:
raid /dev/md0 level=raid1 metadata=0.90 raid-devices=2 uuid=xxxxxxxxxxx devices=/dev/sda1,/dev/sdb2

gdha commented at 2013-02-11 15:07:

@altring Ok, now I'm following again. How can we detect the metadata version? Please show me. If we want to enhance the code we must know this. thx.

altring commented at 2013-02-12 09:05:

Sorry if I broke the logical thread.
The metadata version can be seen with

mdadm --examine --scan

This will give you something like:

ARRAY /dev/md/1 metadata=1.2 UUID=xxxxxxxxxxxxxxxxxxxxx name=localhost.localdomain:1

Please ask for information or additional tests - I'll help with pleasure!

pavoldomin commented at 2013-02-12 09:23:

Just wondering, can grub actually boot md with the metadata version 1.2?

altring commented at 2013-02-12 09:42:

@pavoldomin
GRUB Legacy (<= 0.97) can deal with metadata 0.90;
GRUB 2 (>= 1.98) handles metadata up to 1.2.

gdha commented at 2013-02-12 13:07:

@jhoekx is this something we should add, or should we enhance the current savelayout code to achieve this?

jhoekx commented at 2013-02-12 13:12:

I think we should be able to detect the current metadata version and use that during restores.
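A rough sketch of the kind of detection the layout-saving code could do (not ReaR's actual implementation; it assumes the mdadm --detail output shown later in this thread and uses "..." as a placeholder for the other fields of the raid line):

# For each active array, read the superblock version and record it as metadata=<version>.
for md in /dev/md?*; do
    metadata=$(mdadm --misc --detail "$md" | sed -n 's/^ *Version : //p')
    echo "raid $md metadata=$metadata ..."    # remaining raid fields omitted here
done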

jhoekx commented at 2013-02-12 13:14:

The other error about the partition table not being re-read is likely because the md device is open. Can you test if /proc/mdstat exists before the restore starts?
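In the rescue shell that check is simply (a sketch):

# Any active arrays listed here hold the member disks open, which would
# explain the "kernel failed to re-read the partition table" error.
[ -e /proc/mdstat ] && cat /proc/mdstat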

jhoekx commented at 2013-02-12 13:17:

Is the metadata version visible in mdadm --misc --detail $device or mdadm --detail --scan --config=partitions?

If yes, then we can add it quickly. I don't have a VM with md raid anymore to test...

JCMendes commented at 2013-02-15 22:05:

System: CentOS release 6.3 (Final)
Kernel: 2.6.32-279.22.1.el6.x86_64
parted-2.1-18.el6.x86_64
rear-1.14-1.el6.noarch

Recovery Problem: After command: # rear recover

Intermittently the system stops and the recovery does not continue.

The logs in item 1 show, in the last line:

*** partprobe -s /dev/sda
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy). As a result, it may not reflect all of your changes until after reboot.

Workaround: I edited the restore script and deleted the line partprobe -s /dev/sda; after that, I can go on to "item 5" without errors.

I think the best solution would be for "rear" to use "partx", provided by "util-linux-ng", instead of partprobe, which has some bugs.
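For what it's worth, a sketch of what that substitution might look like (option support varies with the util-linux-ng version, and whether the rescue image ships partx is an assumption):

# Ask the kernel to pick up the freshly written partition table without partprobe.
partx -a /dev/sda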

I hope this helps.

gdha commented at 2013-02-18 20:29:

@JCMendes according to https://bugzilla.redhat.com/show_bug.cgi?id=614357, parted version 2.3 fixes your problem with partprobe

gdha commented at 2013-02-18 20:31:

@altring would it be possible to answer on Jeroen's (jhoekx) questions?

altring commented at 2013-02-20 11:53:

"The other error about the partition table not being re-read is likely because the md device is open. Can you test if /proc/mdstat exists before the restore starts?"
Yes. It exists.

RESCUE localhost:~ # ls -l /proc | grep mdstat
--r--r--r-- 1 root root 0 Dec 31 19:24 mdstat

"Is the metadata version visible in mdadm --misc --detail $device or mdadm --detail --scan --config=partitions?"
In the rescue system after booting from the rescue media, or after the restore completes? Where should I check?

jhoekx commented at 2013-02-20 12:03:

What I was actually hoping for was that you would post the output :-) Could you do that, please?

altring commented at 2013-02-21 14:53:

Please respond.

jhoekx commented at 2013-02-21 14:58:

Can you do cat /proc/mdstat in the rescue system?

For the metadata version I would need the output from the real system.

altring commented at 2013-02-22 08:13:

In the real system:

mdadm --detail --scan --config=partitions
ARRAY /dev/md3 metadata=1.1 name=localhost.localdomain:3 UUID=03709fd5:5575633a:3cddfea3:7b706370
ARRAY /dev/md1 metadata=1.1 name=localhost.localdomain:1 UUID=863dd69d:50129c5c:0e556583:9b769087
ARRAY /dev/md2 metadata=1.1 name=localhost.localdomain:2 UUID=34b4ec11:fd95c035:4ece3f32:b42a1442
ARRAY /dev/md0 metadata=1.0 name=localhost.localdomain:0 UUID=518fe3d0:56b11ba3:cd9be72d:0de5b337

mdadm --misc --detail /dev/md*

/dev/md0:
Version : 1.0
Creation Time : Mon Jan 1 02:09:40 2007
Raid Level : raid1
Array Size : 511988 (500.07 MiB 524.28 MB)
Used Dev Size : 511988 (500.07 MiB 524.28 MB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Update Time : Mon Jan  1 02:09:00 2007
      State : clean

Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

       Name : localhost.localdomain:0
       UUID : 518fe3d0:56b11ba3:cd9be72d:0de5b337
     Events : 19

Number   Major   Minor   RaidDevice State
   0       8        1        0      active sync   /dev/sda1
   1       8       17        1      active sync   /dev/sdb1

/dev/md1:
Version : 1.1
Creation Time : Mon Jan 1 02:09:41 2007
Raid Level : raid1
Array Size : 4193272 (4.00 GiB 4.29 GB)
Used Dev Size : 4193272 (4.00 GiB 4.29 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Update Time : Mon Jan  1 02:09:00 2007
      State : clean

Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

       Name : localhost.localdomain:1
       UUID : 863dd69d:50129c5c:0e556583:9b769087
     Events : 20

Number   Major   Minor   RaidDevice State
   0       8        3        0      active sync   /dev/sda3
   1       8       19        1      active sync   /dev/sdb3

/dev/md2:
Version : 1.1
Creation Time : Mon Jan 1 02:09:42 2007
Raid Level : raid1
Array Size : 51198908 (48.83 GiB 52.43 GB)
Used Dev Size : 51198908 (48.83 GiB 52.43 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Mon Jan  1 02:09:28 2007
      State : active

Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

       Name : localhost.localdomain:2
       UUID : 34b4ec11:fd95c035:4ece3f32:b42a1442
     Events : 141

Number   Major   Minor   RaidDevice State
   0       8        2        0      active sync   /dev/sda2
   1       8       18        1      active sync   /dev/sdb2

/dev/md3:
Version : 1.1
Creation Time : Mon Jan 1 02:09:54 2007
Raid Level : raid1
Array Size : 256660348 (244.77 GiB 262.82 GB)
Used Dev Size : 256660348 (244.77 GiB 262.82 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Mon Jan  1 02:09:26 2007
      State : active, resyncing

Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Resync Status : 36% complete

       Name : localhost.localdomain:3
       UUID : 03709fd5:5575633a:3cddfea3:7b706370
     Events : 237

Number   Major   Minor   RaidDevice State
   0       8        5        0      active sync   /dev/sda5
   1       8       21        1      active sync   /dev/sdb5

The metadata version can be seen in both cases.

altring commented at 2013-02-22 08:24:

Here is "cat /proc/mdstat" output in RESCUE system:

RESCUE localhost # cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
511988 blocks super 1.0 [2/2] [UU]

md2 : active raid1 sdb2[1] sda2[0]
51198908 blocks super 1.1 [2/2] [UU]
bitmap: 1/1 pages [4KB], 65536KB chunk

md3 : active raid1 sda5[0] sdb5[1]
256660348 blocks super 1.1 [2/2] [UU]
[==============>......] resync = 74.7% (191949440/256660348) finish=33.6min speed=32054K/sec
bitmap: 2/2 pages [8KB], 65536KB chunk

md1 : active raid1 sda3[0] sdb3[1]
4193272 blocks super 1.1 [2/2] [UU]

It looks like the rescue system activates the existing RAID partitions on the HDDs.

jhoekx commented at 2013-02-22 08:35:

Ok, thanks. I can add metadata support now.

The MD activation is the reason why partitioning fails. I guess there's a udev rule that automatically enables the MD devices. We'll have to look into how to avoid that...
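A sketch of two ways this could be avoided in the rescue environment (the configuration directive and the udev rule file name are assumptions and depend on the distribution and mdadm version):

# 1. Stop anything udev has already assembled before recreating the partitions.
mdadm --stop --scan

# 2. Keep incremental auto-assembly from kicking in again, e.g. by disabling it
#    in the rescue system's mdadm.conf ...
echo 'AUTO -all' >> /etc/mdadm.conf
# ... or by removing the incremental-assembly udev rule from the rescue image
#    (the exact file name differs between distributions).
rm -f /lib/udev/rules.d/65-md-incremental.rules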

altring commented at 2013-02-22 08:43:

Thank you. Please let me know - I'll test it.

gdha commented at 2013-09-03 11:05:

Are there any updates for this issue? If not, I think it is safe to close this one.

altring commented at 2013-09-03 12:22:

No updates so far.
Should be ok now...

JCMendes commented at 2013-09-03 12:47:

It's working well now.

Regards,

J. Carlos Mendes

gdha commented at 2013-09-06 15:07:

Closing the issue with the customer's agreement.

altring commented at 2013-09-08 06:49:

Thank you guys!


[Export of Github issue for rear/rear.]