Kernel Panic After Update To CentOS 6.4

After successfully updating three CentOS 6.3 VM guests to 6.4 I decided to update the host as well, and it failed to boot.

Kernel panic – Not syncing: Attempted to kill init!
Pid: 1, comm: init not tainted: 2.6.32-358.2.1.el6.x86_64 #1
Plus a call trace I couldn’t see

Luckily I was able to boot from the previous kernel and get my system back up. I then removed the “358” kernel and all of its related module and devel packages using yum remove, then ran yum update again, since I could only guess that the install somehow hadn’t completed. But it still fails to boot.

Now the kernel panic has also happened on a 6.3 VM guest I upgraded, so it isn’t just the host hardware that is the problem.

Has anyone else seen this? Any ideas where to start troubleshooting?


  1. #1 by Akemi Yagi on March 12th, 2013 - 3:49 pm

    At the time of this writing, CentOS kernel 2.6.32-358.2.1.el6 is not out yet. Where did you get this one from ??? Did you build it yourself?

    Akemi

  2. #2 by Emmett Culley on March 12th, 2013 - 6:05 pm

    I did yum update --enablerepo=epel. I just checked and it appears that kernel was from the updates repo:

    [~]# yum list kernel
    Installed Packages
    kernel.x86_64 2.6.32-279.9.1.el6 @updates
    kernel.x86_64 2.6.32-279.14.1.el6 @updates
    kernel.x86_64 2.6.32-279.19.1.el6 @updates
    kernel.x86_64 2.6.32-279.22.1.el6 @updates
    kernel.x86_64 2.6.32-358.0.1.el6 @updates

    [~]# rpm -qa |grep kernel
    abrt-addon-kerneloops-2.0.8-15.el6.CentOS.x86_64
    kernel-2.6.32-279.19.1.el6.x86_64
    dracut-kernel-004-303.el6.noarch
    kernel-devel-2.6.32-279.14.1.el6.x86_64
    kernel-2.6.32-279.14.1.el6.x86_64
    kernel-devel-2.6.32-279.22.1.el6.x86_64
    kernel-headers-2.6.32-358.0.1.el6.x86_64
    kernel-firmware-2.6.32-358.0.1.el6.noarch
    kernel-2.6.32-358.0.1.el6.x86_64
    kernel-devel-2.6.32-279.19.1.el6.x86_64
    kernel-devel-2.6.32-358.0.1.el6.x86_64
    kernel-2.6.32-279.9.1.el6.x86_64
    libreport-plugin-kerneloops-2.0.9-15.el6.CentOS.x86_64
    kernel-2.6.32-279.22.1.el6.x86_64
    kernel-devel-2.6.32-279.9.1.el6.x86_64

    This is from a VM that succeeded with the update to the “359” kernel. There are three more like that.

    Emmett

  3. #3 by Emmett Culley on March 12th, 2013 - 7:09 pm

    Yes, kernel “358”; the “359” was a typo. And… the kernel panic lines were transcribed from a photo I took of the screen after the failed boot. On second look I see that the version is 2.6.32-358.0.1.el6.x86_64.

    So let’s start again.

    Kernel panic – Not syncing: Attempted to kill init!
    Pid: 1, comm: init not tainted: 2.6.32-358.0.1.el6.x86_64 #1

    This happens after yum upgrade --enablerepo=epel on two of five machines. One of them is the host for the three VMs that succeeded and for the one that failed just as the host did.

    I have a screen shot of that VM’s boot failure, but I don’t know the proper way to include it in a post.

    I’ve uninstalled that kernel and run yum upgrade again; it still fails on that kernel, on both the host and the VM. I suppose the good thing is that it happened on a VM guest that is not critical, so I don’t have to experiment with the host that has four important guests running on it.

    Any ideas?

    Emmett

  4. #4 by Liam O’Toole on March 13th, 2013 - 4:19 pm

    (…)

    I saw this problem on one machine I upgraded from 6.3 to 6.4 recently. When I boot it in verbose mode I see the following messages:

    dracut: /proc/misc: No entry for device-mapper found
    dracut: Failure to communicate with kernel device-mapper driver

    which led me to the following bug report:

    http://bugs.CentOS.org/view.php?idc04

    Just today kernel 2.6.32-358.2.1 became available. The problem is still present, but only on the same one machine.

  5. #5 by Johnny Hughes on March 13th, 2013 - 4:44 pm

    What does this command say:

    rpm -q device-mapper

  6. #6 by Nux! on March 13th, 2013 - 4:52 pm

    Is selinux on? If you boot the new kernel with selinux=0 does the panic happen?
    I had this happen to me recently.

  7. #7 by Johnny Hughes on March 13th, 2013 - 5:17 pm

    This file exists in the kernel:
    /lib/modules/2.6.32-358.2.1.el6.x86_64/kernel/drivers/md/dm-mod.ko

    Somehow it seems that in some machines it is not making it into the initrd.

  8. #8 by Johnny Hughes on March 13th, 2013 - 5:43 pm

    Try this command where the kernel is installed and not booting (when booted into a kernel that works):

    lsinitrd /boot/initramfs-2.6.32-358.2.1.el6.x86_64.img | grep dm-mod

    (if you have one of the other 358 kernels installed, use that version instead of 2.6.32-358.2.1.el6.x86_64)
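The same check can be extended to several modules at once. This is only a sketch: `initramfs_path` and `missing_modules` are hypothetical helper names, not part of dracut, and the path layout is assumed from the command above.

```shell
# Build the initramfs path for a given kernel version, following the
# /boot/initramfs-<version>.img naming seen in this thread.
initramfs_path() {
    printf '/boot/initramfs-%s.img\n' "$1"
}

# Read an lsinitrd-style listing on stdin and print every module name
# given as an argument whose <name>.ko file does not appear in the listing.
missing_modules() {
    listing=$(cat)
    for mod in "$@"; do
        printf '%s\n' "$listing" | grep -q "/${mod}\.ko" || printf '%s\n' "$mod"
    done
}

# Example (as root, booted into a kernel that works):
#   lsinitrd "$(initramfs_path 2.6.32-358.2.1.el6.x86_64)" | missing_modules dm-mod dm-log dm-mirror
```

No output means every named module was found in the image. Module file names use hyphens (dm-mod), so pass them in that form.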

  9. #9 by Liam O’Toole on March 14th, 2013 - 5:09 am

    (…)

    The output is

    device-mapper-1.02.77-9.el6.i686

    Ditto on the machines which are *not* affected.

  10. #10 by Liam O’Toole on March 14th, 2013 - 5:17 am

    (…)

    Here you go:

    $ uname -r && lsinitrd /boot/initramfs-2.6.32-358.2.1.el6.i686.img | grep dm-mod
    2.6.32-279.22.1.el6.i686
    -rwxr--r-- 1 root root 106212 Mar 13 17:11 lib/modules/2.6.32-358.2.1.el6.i686/kernel/drivers/md/dm-mod.ko

    So the module appears to be present in the initrd.

    The only distinguishing feature I can think of in the machine that is failing is that it has a solid-state drive. Relevant?

  11. #11 by Daniel J on March 14th, 2013 - 7:51 am

    You can leave SELinux on in permissive mode using enforcing=0.
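To keep a boot parameter such as enforcing=0 across reboots, it can be appended to the kernel line in grub.conf. A minimal sketch, assuming GRUB legacy as shipped with CentOS 6 and its usual /boot/grub/grub.conf location; `add_kernel_arg` is a hypothetical helper, and it only prints the modified text so it can be reviewed before anything is written.

```shell
# Append one argument to the "kernel" line of the entry for a given kernel
# version. Reads grub.conf text on stdin and prints the result on stdout;
# all other lines pass through unchanged.
add_kernel_arg() {
    awk -v ver="$1" -v arg="$2" '
        $1 == "kernel" && index($0, "vmlinuz-" ver) { $0 = $0 " " arg }
        { print }
    '
}

# Example (preview only, nothing is modified):
#   add_kernel_arg 2.6.32-358.2.1.el6.x86_64 enforcing=0 < /boot/grub/grub.conf
```

For a one-off test it is simpler to edit the kernel line from the GRUB menu at boot time, which changes nothing on disk.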


  12. #12 by Johnny Hughes on March 14th, 2013 - 9:19 am

    Maybe … try this (everything done as root):

    Boot on a kernel that works and do this:

    1. Back up your current initrd:

    cp -a /boot/initramfs-2.6.32-358.2.1.el6.x86_64.img /boot/initramfs-2.6.32-358.2.1.el6.x86_64.img.bak

    2. Go to this directory:

    cd /lib/modules/2.6.32-358.2.1.el6.x86_64/kernel/drivers/md/

    3. Figure out the md modules loaded in the old kernel:

    lsmod | grep dm_

    in my case, that output would be this:

    [root@localhost md]# lsmod | grep dm_
    dm_round_robin 2525 0
    dm_multipath 17756 1 dm_round_robin
    dm_mirror 14133 0
    dm_region_hash 12085 1 dm_mirror
    dm_log 9930 2 dm_mirror,dm_region_hash
    dm_mod 82839 12 dm_multipath,dm_mirror,dm_log

    (note, dm-mod and dm_mod are the same thing)

    4. Do a file listing and make sure all the modules you need to include are there
    (in my case the 6 in column 1):

    ls

    Note: make sure all the modules are listed and you see the file names
    (should be for me: dm-round-robin.ko, dm-multipath.ko, dm-mirror.ko, dm-region-hash.ko, dm-log.ko, dm-mod.ko)

    5. Create a new initrd with all the relevant md modules preloaded (in my case, this command line; preload only the modules you need from your list. Again, you have to do this as root):

    mkinitrd -f --preload=dm_round_robin --preload=dm_multipath --preload=dm_mirror --preload=dm_region_hash --preload=dm_log --preload=dm_mod /boot/initramfs-2.6.32-358.2.1.el6.x86_64.img 2.6.32-358.2.1.el6.x86_64

    Note: The above mkinitrd command (and all the other commands) should be entered all on one line, I am sure it will wrap when posted.

    6. This may not work, because there may need to be some other things loaded that are not, like the disk controller’s kernel module driver, etc. What I think is going on is either something has been removed from this kernel that existed before … OR … something is being mis-detected with this kernel on your machine.
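The list of preload options can also be generated from the lsmod output instead of typed by hand. A rough sketch, assuming the mkinitrd wrapper accepts repeated --preload= options as used above; `preload_flags` is a hypothetical helper name.

```shell
# Read lsmod output on stdin and print one --preload=<module> option for
# each loaded module whose name starts with dm_, separated by spaces.
preload_flags() {
    awk '$1 ~ /^dm_/ { printf "%s--preload=%s", sep, $1; sep = " " }
         END { if (sep != "") print "" }'
}

# Example (as root, booted into a kernel that works; KVER is the kernel
# that fails to boot). The unquoted $(...) is intentional so that each
# option becomes a separate argument:
#   KVER=2.6.32-358.2.1.el6.x86_64
#   mkinitrd -f $(lsmod | preload_flags) "/boot/initramfs-$KVER.img" "$KVER"
```

This only covers the device-mapper modules; a missing disk controller driver would still have to be identified and added separately.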

  13. #13 by Emmett Culley on March 14th, 2013 - 10:03 am

    I figured out that in both failure cases the “yum update” never completed, as I had to run yum-complete-transaction on both. Doing that and reinstalling the 358.0.1 kernel gave the same boot failures.

    Yesterday I did another update which installed the 358.2.1 kernel, which booted. So I guess I’ll attempt to update the host machine.

    I don’t know what happened, but it seems to be resolved.

    Emmett
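An interrupted update can be detected before rebooting. The sketch below assumes that yum on CentOS 6 leaves transaction-all.* journal files in /var/lib/yum for transactions that did not finish, which is what yum-complete-transaction is understood to look for; `pending_yum_transactions` is a hypothetical helper name.

```shell
# Print the number of transaction-all.* journal files in the given
# directory (default: /var/lib/yum). A non-zero count suggests that a
# yum transaction was interrupted.
pending_yum_transactions() {
    dir=${1:-/var/lib/yum}
    count=0
    for f in "$dir"/transaction-all.*; do
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    printf '%s\n' "$count"
}

# Example (as root):
#   if [ "$(pending_yum_transactions)" -gt 0 ]; then
#       yum-complete-transaction
#   fi
```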

  14. #14 by Liam O’Toole on March 14th, 2013 - 10:34 am

    (…)

    OK.

    For me the corresponding directory is
    /lib/modules/2.6.32-358.2.1.el6.i686/kernel/drivers/md.

    In my case I have:

    # lsmod | grep dm_
    dm_mirror 11678 0
    dm_region_hash 9609 1 dm_mirror
    dm_log 8322 2 dm_mirror,dm_region_hash
    dm_mod 66925 8 dm_mirror,dm_log

    Yes, all are present:

    # ls dm*.ko
    dm-bufio.ko   dm-log-userspace.ko  dm-queue-length.ko  dm-service-time.ko
    dm-crypt.ko   dm-memcache.ko       dm-raid45.ko        dm-snapshot.ko
    dm-delay.ko   dm-mirror.ko         dm-raid.ko          dm-thin-pool.ko
    dm-flakey.ko  dm-mod.ko            dm-region-hash.ko   dm-zero.ko
    dm-log.ko     dm-multipath.ko      dm-round-robin.ko

    (Indeed it did wrap.) The command I used was

    mkinitrd -f --preload=dm_mirror --preload=dm_region_hash --preload=dm_log --preload=dm_mod /boot/initramfs-2.6.32-358.2.1.el6.i686.img 2.6.32-358.2.1.el6.i686

    That produced a file of similar size to the original:

    # ls -l /boot/initramfs-2.6.32-358.2.1.el6.i686.img*
    -rw-r--r--. 1 root root 15450686 Mar 14 15:13 /boot/initramfs-2.6.32-358.2.1.el6.i686.img
    -rw-r--r--. 1 root root 15450647 Mar 13 17:11 /boot/initramfs-2.6.32-358.2.1.el6.i686.img.bak

    Unfortunately, when rebooting into kernel 358.2.1 I get the same result as before.

    Thanks for taking the time to look into this. Is it an upstream bug?

  15. #15 by Johnny Hughes on March 14th, 2013 - 11:09 am

    It is actually kind of hard to tell where the bug lies … it is possible that somehow our kernel or other tools are causing some problem, but I would think it is more likely that some driver is not being detected and loaded. If it is in relation to dm-mod, then I would suspect that the driver is for the chipset’s drive controller.

    If you can figure out which driver that is and make it preload, you might be able to boot … then we can figure out why it is not detected.

  16. #16 by Emmett Culley on March 16th, 2013 - 12:45 pm

    Yesterday I upgraded all of the guests (4) and the host to the “358.2.1” kernel. All of the VMs restarted fine, but the host has the same boot failure.

    But I have some new information that might make a difference.

    First: When I first saw this issue on two machines, I had updated the machines to 6.4 while logged in via VNC. Since both of the failures also had incomplete updates and required me to run yum-complete-transaction, I assumed that those failed yum update sessions were the reason for the boot failure, because I assumed the update caused the VNC server to reset, interrupting the yum update session.

    So this time I ran the updates via ssh. All went well, all updates completed, but the host fails to boot on the “358.2.1” kernel.

    Here is the new information. When the host boots on the previous “good” kernel I see the simplified plymouth trail (the tri-color tape that runs along the bottom of the screen during boot). But when it boots from the “bad” kernels I see the “fancy” CentOS splash, with the spinning circle under the CentOS logo.

    In all cases, the VM guests boot with the simplified splash. So I suppose that means the new kernel installation is incorrectly detecting my video hardware.

    Can anybody suggest some changes I can make to the kernel parameters that could mitigate that mis-detection?

    Emmett

  17. #17 by Douglas Lidstone on March 12th, 2013 - 11:39 pm

    I had the same problem, Kernel Panic…

  18. #18 by Douglas Lidstone on March 12th, 2013 - 11:47 pm

    Correction. I have the same problem with the same message. I am running a Lenovo G780 Quad CPU, double threaded per CPU. I have several KVM guests running under the CentOS 6.3 KVM/QEMU setup. Nothing weird. I can reboot to the last good kernel and the VMs, all CentOS 6.3, work.

    uname -a output as follows:

    Linux mechanica.machinari.com 2.6.32-279.22.1.el6.x86_64 #1 SMP Wed Feb 6 03:10:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

  19. #19 by Daniel Fitzgerald on March 15th, 2013 - 8:48 pm

    I’m having this problem on 3 computers. I wanted to install updates, reboot and go home for the weekend… I find posts all over the net about this issue but no fix from Red Hat. What the hell is going on here?

  20. #20 by irfan on April 12th, 2013 - 5:39 am

    I had this issue too but successfully resolved it via the following method, try it.
    1. Open a terminal and type “su -” without quotes and hit enter
    2. Enter password (root user)
    3. Enter “yum update” and hit enter
    4. System will find updates (6.3 to 6.4), just start updating
    5. After the update you have two situations as follows,
    5.1 System started but needs “yum-complete-transaction” (go to step 5.1.1)
    5.2 Kernel panic and system did not start (go to step 6)

    5.1: Solution
    5.1.1. Open a terminal and type “su -” without quotes and hit enter
    5.1.2. Enter password (root user)
    5.1.3. Type “yum-complete-transaction” without quotes and hit enter
    5.1.4 If a question is asked, enter “yes” and hit enter
    5.1.5 Let it complete and then reboot the system
    5.1.6 Now you will get a kernel panic (go to step 6)

    6. Restart the machine
    7. As soon as the startup countdown starts, hit “i” without quotes
    8. Select the OLD kernel and hit enter
    9. System will boot, log in to your user account
    10. Open a terminal and type “su -” without quotes and hit enter
    11. Enter password (root user)
    12. Enter “yum reinstall kernel-2.6.32-358.2.1.el6.x86_64” without quotes and hit enter
    13. It will ask you to confirm, just confirm and it will proceed with the installation
    14. Once the installation completes, reboot

    15. Machine will start without any issue
    Enjoy

  21. #21 by Jacob Vennervald on April 27th, 2013 - 9:20 am

    I have upgraded from 6.3 to 6.4 and I am now on kernel 2.6.32-358.6.1 and it won’t boot.
    I get an error saying “Starting udev: udevd-work[640]: error opening ATTR{/sys/devices/virtual/block/md0/queue/iosched/slice_idle} for writing: No such file or directory” followed by a similar error but with quantum instead of slice_idle. Then finally it gives me an error “fsck.ext3: Unable to resolve ‘UUID=’ [FAILED]”.
    I can’t just boot the old kernel. It doesn’t work either and gives me the error “Unable to resolve…”. I’ve tried to enter the shell by entering the root password and running yum-complete-transaction, but I get an error that it’s already locked by yum-complete-transaction, which tells me that the yum-complete-transaction you are talking about never fully finished. Is there a way I can boot into the system through a rescue disk and then do the reinstall of the kernel?

    Hope you can help me. The kids are screaming for cartoons :)

    /Jacob
