#293 Issue closed: CentOS 6.4 - Rescue Boot Failed: VFS: Cannot open root device "(null)" or unknown-block(253,0)

MuninnAndHuginn opened issue at 2013-08-20 18:34:

Using OUTPUT=ISO, BACKUP=RBME on CentOS 6.4 (up-to-date with epel and remi repos) I am unable to boot from the recovery media (burned from the iso dropped on the backup server) or from the grub-installed local kernel/initrd stanza.

I tried adding root=/dev/ram0 to the kernel command line params and that gets me a different error: No filesystem could mount root, tried: iso9660

This is a group development server that is not yet in use (initial testing and recovery verification only). I can mess with whatever settings or reboot as needed. Unfortunately I haven't been able to get the serial output working from this server so I haven't been able to read the entire kernel output during boot. I was able to get the serial output from the boot and there didn't appear to be any errors until the panic.

Thanks in advance for any assistance.

root@iommi /var/log/rear# rear -v -d -D mkbackup

Relax-and-Recover 1.14-git201308130312 / 2013-08-13
Using log file: /var/log/rear/rear-iommi.log
Creating disk layout
Creating root filesystem layout
Copying files and directories
Copying binaries and libraries
Copying kernel modules
Creating initramfs
Making ISO image
Wrote ISO image: /var/lib/rear/output/rear-iommi.iso (46M)
Copying resulting files to nfs location
You should also rm -Rf /tmp/rear.Ske6tMYxYmLfMaS

root@iommi /var/log/rear# echo $?

0

root@iommi /var/log/rear# rear -v dump

Relax-and-Recover 1.14-git201308130312 / 2013-08-13
Using log file: /var/log/rear/rear-iommi.log.lockless
Dumping out configuration and system information
This is a 'Linux-x86_64' system, compatible with 'Linux-i386'.
System definition:
                                    ARCH = Linux-i386
                                      OS = GNU/Linux
                        OS_MASTER_VENDOR = Fedora
                       OS_MASTER_VERSION = 6
                   OS_MASTER_VENDOR_ARCH = Fedora/i386
                OS_MASTER_VENDOR_VERSION = Fedora/6
           OS_MASTER_VENDOR_VERSION_ARCH = Fedora/6/i386
                               OS_VENDOR = CentOS
                              OS_VERSION = 6.4
                          OS_VENDOR_ARCH = CentOS/i386
                       OS_VENDOR_VERSION = CentOS/6.4
                  OS_VENDOR_VERSION_ARCH = CentOS/6.4/i386
Configuration tree:
                         Linux-i386.conf : OK
                          GNU/Linux.conf : OK
                             Fedora.conf : missing/empty
                        Fedora/i386.conf : missing/empty
                           Fedora/6.conf : missing/empty
                      Fedora/6/i386.conf : missing/empty
                             CentOS.conf : missing/empty
                        CentOS/i386.conf : missing/empty
                         CentOS/6.4.conf : missing/empty
                    CentOS/6.4/i386.conf : missing/empty
                               site.conf : missing/empty
                              local.conf : OK
Backup with RBME
                             RBME_BACKUP =
                           RBME_HOSTNAME = iommi
                  BACKUP_INTEGRITY_CHECK =
                         BACKUP_MOUNTCMD =
                          BACKUP_OPTIONS =
                    BACKUP_RSYNC_OPTIONS = --sparse --archive --hard-links --verbose --numeric-ids --stats
                  BACKUP_SELINUX_DISABLE = 1
                        BACKUP_UMOUNTCMD =
                              BACKUP_URL = nfs://10.4.33.53/backup
Output to ISO
                                 ISO_DIR = /var/lib/rear/output
                               ISO_FILES =
                              ISO_IMAGES =
                        ISO_ISOLINUX_BIN =
                            ISO_MAX_SIZE =
                         ISO_MKISOFS_BIN = /usr/bin/mkisofs
                              ISO_PREFIX = rear-iommi
                               ISO_VOLID = RELAXRECOVER
                           RESULT_MAILTO =

/usr/share/rear/lib/validated/CentOS/6.4/i386.txt
Your system is not yet validated. Please carefully check all functions
and create a validation record with 'rear validate'. This will help others
to know about the validation status of Relax-and-Recover on this system.

gdha commented at 2013-08-24 08:59:

Try rear -v -d -D mkrescue. In the temporary directory of (e.g. /tmp/rear.Ske6tMYxYmLfMaS) you'll find the rootfs. Try the chrootcommand (that's one). Secondly, check the initrd.cgz initial ramdisk. Did you check the rear log file for errors?

MuninnAndHuginn commented at 2013-08-24 13:30:

I did examine the temp directory root fs and the generated image both by unpacking and by using lsinitrd.

The logs didn't indicate anything to me when I looked.

I didn't try chrooting into it. I'll try that.

I am suspecting my kernel. I found options to enable small system support. I built a new zImage with them on but haven't tested it yet.

I'll do that on Monday and rebuild, check the logs again, and try the chroot.

MuninnAndHuginn commented at 2013-08-26 13:36:

I can chroot into the tmp rootfs but I couldn't use 'ls' and so I checked and it is not even present. I noticed that it seems other files are missing as well including '/init'. Compared to one of the working "initramfs*.img" under my '/boot', they have one under '/'. The one in the rear rootfs is a broken symbolic link to 'bin/init'.

Using 'lsinitrd' on the generated 'initrd.cgz' file shows the same things missing.

I haven't yet tested my kernel build with small system support but maybe I don't need it if the rootfs is indeed missing files.

Checking the log, there were no error indications, but there were some warnings that seem unrelated.

root@iommi /var/log/rear# cat rear-iommi.log | grep -v "++" | grep -i "warn"

2013-08-20 14:22:05.437008832 WARNING: Could not collect user info for 'usbmux'
2013-08-20 14:22:05.456126253 WARNING: Could not collect group info for 'group'
2013-08-20 14:22:05.509526404 WARNING: Could not collect group info for 'ssh_keys'
2013-08-20 14:22:05.523903066 WARNING: Could not collect group info for 'usbmux'
                /^\t.+ => not found/ { print "WARNING: Dynamic library " $1 " not found" > "/dev/stderr" }
                /^\t.+ => not found/ { print "WARNING: Dynamic library " $1 " not found" > "/dev/stderr" }
                /^\t.+ => not found/ { print "WARNING: Dynamic library " $1 " not found" > "/dev/stderr" }
                /^\t.+ => not found/ { print "WARNING: Dynamic library " $1 " not found" > "/dev/stderr" }
2013-08-20 14:22:15.762668803 WARNING: unmatched external call to '/bin/raw' in etc/udev/rules.d/60-raw.rules
2013-08-20 14:22:15.767086669 WARNING: unmatched external call to '/bin/sleep' in etc/udev/rules.d/60-openct.rules
2013-08-20 14:22:15.770865876 WARNING: unmatched external call to '/etc/init.d/kdump' in etc/udev/rules.d/98-kexec.rules
2013-08-20 14:22:15.775793659 WARNING: unmatched external call to '/path/to/script' in lib/udev/rules.d/65-libsane.rules
2013-08-20 14:22:15.779422770 WARNING: unmatched external call to '/sbin/alsactl' in etc/udev/rules.d/90-alsa.rules
2013-08-20 14:22:15.782939268 WARNING: unmatched external call to '/sbin/crda' in lib/udev/rules.d/85-regulatory.rules
2013-08-20 14:22:15.786437092 WARNING: unmatched external call to '/sbin/hwclock' in lib/udev/rules.d/88-clock.rules
2013-08-20 14:22:15.790104033 WARNING: unmatched external call to '/sbin/mdadm' in lib/udev/rules.d/64-md-raid.rules
2013-08-20 14:22:15.793620740 WARNING: unmatched external call to '/sbin/mdadm' in lib/udev/rules.d/65-md-incremental.rules
2013-08-20 14:22:15.797111101 WARNING: unmatched external call to '/sbin/microcode_ctl' in lib/udev/rules.d/89-microcode.rules
2013-08-20 14:22:15.800625856 WARNING: unmatched external call to '/sbin/modprobe' in etc/udev/rules.d/60-pcmcia.rules
2013-08-20 14:22:15.804117221 WARNING: unmatched external call to '/sbin/modprobe' in lib/udev/rules.d/40-redhat.rules
2013-08-20 14:22:15.807620364 WARNING: unmatched external call to '/sbin/modprobe' in lib/udev/rules.d/80-drivers.rules
2013-08-20 14:22:15.811091263 WARNING: unmatched external call to '/sbin/modprobe' in lib/udev/rules.d/89-microcode.rules
2013-08-20 14:22:15.814590683 WARNING: unmatched external call to '/sbin/partx' in lib/udev/rules.d/40-multipath.rules
2013-08-20 14:22:15.818081880 WARNING: unmatched external call to '/sbin/setregdomain' in lib/udev/rules.d/85-regulatory.rules
2013-08-20 14:22:15.821603233 WARNING: unmatched external call to '/usr/bin/eject' in lib/udev/rules.d/61-mobile-action.rules
2013-08-20 14:22:15.825150125 WARNING: unmatched external call to '/usr/bin/setfacl' in lib/udev/rules.d/70-cups-libusb.rules
2013-08-20 14:22:15.828731952 WARNING: unmatched external call to '/usr/sbin/bluetoothd' in lib/udev/rules.d/97-bluetooth.rules
2013-08-20 14:22:15.832259329 WARNING: unmatched external call to '/usr/sbin/usbmuxd' in lib/udev/rules.d/85-usbmuxd.rules
2013-08-20 14:22:15.835991144 WARNING: unmatched external call to 'check-mtp-device' in lib/udev/rules.d/40-libgphoto2.rules

MuninnAndHuginn commented at 2013-08-26 18:37:

Enabling small systems support in the kernel did not do anything. It was a shot in the dark but at least we know for sure.

MuninnAndHuginn commented at 2013-08-26 19:10:

After the test, I ran another rear -v -d -D mkrescue and am going over the scripts. It looks like the /usr/share/rear/build/GNU/Linux/39_copy_binaries_libraries.sh script should be copying the files that are required. The log shows the following line.

2013-08-26 15:00:36.529312455 Binaries being copied: /bin/awk /bin/bash /bin/cpio /bin/dd /bin/df /bin/dumpkeys /bin/grep /bin/ipcalc /bin/kbd_mode /bin/loadkeys /bin/mv /bin/nano /bin/sort /bin/stty /bin/sync /bin/usleep /lib/udev/ata_id /lib/udev/cdrom_id /lib/udev/edd_id /lib/udev/path_id /lib/udev/usb_id /sbin/agetty /sbin/cryptsetup /sbin/dhclient /sbin/dmeventd /sbin/dmsetup /sbin/fsadm /sbin/grub-install /sbin/ip /sbin/kpartx /sbin/lvm /sbin/mingetty /sbin/mount.nfs /sbin/parted /sbin/pidof /sbin/scsi_id /sbin/sfdisk /sbin/udevadm /sbin/udevd /sbin/umount.nfs /usr/bin/diff /usr/bin/file /usr/bin/getopt /usr/bin/join /usr/bin/scp /usr/bin/sftp /us

This looks to be about what I get and I'm not seeing init or ls in here.

I don't think they are printed from the conf but I see this debug in the log:

2013-08-26 15:00:27.043436308 Including conf/GNU/Linux.conf
+ . /usr/share/rear/conf/GNU/Linux.conf
++ PROGS=(${PROGS[@]} rpc.statd rpcbind bash mknod blkid vol_id udev_volume_id portmap readlink rpcinfo grep cat tac tr reboot halt shutdown killall5 killall chroot tee awk ip ifconfig nslookup route ifenslave ifrename nameif klogd syslog-ng syslogd rsyslogd echo cp date wc cut rm rmdir test init telinit ethtool expand sed mount umount insmod modprobe lsmod true false mingetty rmmod hostname uname sleep logger ps ln dirname basename mkdir tty ping netstat free traceroute less vi pico nano rmmod df ls dmesg du tar gzip netcat top iptraf joe pico getent id ldd strace rsync tail head find md5sum mkfs mkfs.ext2 mkfs.ext3 mkfs.ext4 mkfs.ext4dev mkfs.jfs mkfs.xfs mkfs.reiserfs mkfs.vfat mkfs.btrfs mkreiserfs fsck fsck.ext2 fsck.ext3 fsck.ext4 fsck.ext4dev fsck.xfs fsck.reiserfs reiserfsck fsck.btrfs btrfsck tune2fs tune4fs xfs_admin xfs_db btrfs jfs_tune reiserfstune expr egrep grep fgrep df chmod chown stat mkswap swapon swapoff mknod touch scsi_id lsscsi logd initctl lspci usleep mktemp /bin/true strace which mkfifo seq openvt poweroff chacl getfacl setfacl attr getfattr setfattr mpath_wait strings xargs)

This would suggest that init and ls should be included in what gets copied. They must be getting excluded or they are not where they are expected to be but it doesn't cause an error.

MuninnAndHuginn commented at 2013-08-26 20:50:

I added a debug log to the end of each script that ran for my configuration to print out the PROGS and REQUIRED_PROGS arrays. The lists are correct until... you guessed it... my local.conf is read.

I tried to add a custom program to be included and I broke the PROGS variable. Let that be a lesson to anyone else trying to do that.

Here is an example of what I did in the local.conf and how I corrected the issue.

INCORRECT:

PROGS=(
myFavoriteApp
)

CORRECT:

PROGS=(
"${PROGS[@]}"
myFavoriteApp
)

I didn't notice on my other server because I added this after I had validated the configuration and never actually tested the new image. I'll be re-doing all my images after this.

Marking this as solved.


[Export of Github issue for rear/rear.]