DevOps · K8s · Volleyball · Travel
Explore NY Stream

How to Fix an Unbootable Physical RHEL/CentOS 6.x Server After a Kernel Upgrade

— ny_wk

A physical RHEL/CentOS 6.x server that fails to boot after a kernel upgrade is almost always caused by an LVM filter in /etc/lvm/lvm.conf that hides the root volume group from the new initramfs. The fix is to boot the old kernel from the GRUB menu, open the LVM filter so the root disk is visible, rebuild the new kernel image with dracut, and verify the physical volumes before re-tightening the filter. This guide walks through the complete recovery on bare-metal hardware managed through HP iLO.

This is one of the classic kernel upgrade failures on legacy enterprise Linux: the update succeeds and the package installs cleanly, but the freshly generated initramfs cannot assemble the root logical volume because an over-restrictive LVM filter excludes the underlying block device. The kernel panics during early boot with an "Unable to mount root fs" or "Volume group not found / rootvg not found" message.

Why a kernel upgrade makes a physical server unbootable

When you install a new kernel package on a Linux system, the post-install scripts regenerate the early-boot RAM disk (initramfs) using dracut. On a system that boots from LVM, that RAM disk must contain the right LVM configuration so it can find and activate the root volume group (often called rootvg) before the real root filesystem is mounted.

The trouble comes from the filter line in /etc/lvm/lvm.conf. The filter is a list of accept (a) and reject (r) regular expressions that tells LVM which block devices it is allowed to scan. On servers attached to SAN storage (EMC PowerPath, multipath, cciss RAID controllers), administrators frequently tighten this filter to stop LVM from scanning hundreds of duplicate SAN paths. If that tightened filter does not also explicitly accept the local boot disk, the new initramfs bakes in a configuration that cannot see the root disk at all.
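To make the ordering rule concrete, here is a minimal sketch that mimics how a filter list is evaluated: patterns are tried in order, the first one that matches decides, and a device that matches nothing is accepted. It is an illustration built on grep -E, not LVM's own matcher; the function name filter_verdict and the a:REGEX / r:REGEX rule syntax are made up for this example, and it checks a single path name, whereas LVM considers every alias of a device.

```shell
# Illustration only: mimic LVM's "first matching pattern wins" rule with grep -E.
# Usage: filter_verdict DEVICE RULE...   where each RULE is a:REGEX or r:REGEX
filter_verdict() (
    dev=$1
    shift
    for rule in "$@"; do
        action=${rule%%:*}   # text before the first colon: a or r
        regex=${rule#*:}     # everything after the first colon
        if printf '%s\n' "$dev" | grep -Eq -- "$regex"; then
            if [ "$action" = a ]; then echo accept; else echo reject; fi
            exit 0
        fi
    done
    echo accept   # a device that matches no pattern is accepted
)

# A tightened filter that only knows the boot disk as sda:
filter_verdict /dev/sda2 'a:sda[1-9]$' 'a:emcpower.*' 'r:.*'   # prints: accept
filter_verdict /dev/sdb2 'a:sda[1-9]$' 'a:emcpower.*' 'r:.*'   # prints: reject
```

The second call shows the failure in miniature: once the boot disk is enumerated as sdb, nothing accepts it before the catch-all reject.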

A few real-world triggers make this worse on bare metal:

  • Device renaming: after a reboot the local boot disk can move from /dev/sda to /dev/sdb (or another letter) because SAN LUNs are enumerated in a different order. A filter hard-coded to sda then misses the real root device.
  • Duplicate PV warnings: the same physical volume is visible through several paths, and LVM picks the wrong one.
  • Stale RAM disk: the old kernel still boots because its initramfs was built when the filter was correct; only the new kernel inherits the broken filter.

Because the old kernel usually still works, recovery does not require rescue media in most cases. You boot the previous kernel, fix the filter, rebuild the new image, and reboot.

Important note on RHEL/CentOS 6 (legacy, end of life)

The commands below target Red Hat Enterprise Linux 6 / CentOS 6 (kernel 2.6.32, GRUB Legacy, dracut). RHEL 6 reached the end of Extended Life Cycle Support, and CentOS 6 is fully end of life, so these systems should be migrated to a supported release. The same failure mode and recovery logic still applies on modern systems (RHEL/Rocky/AlmaLinux 8 and 9), with two key differences you should know:

  • Modern releases use GRUB 2, so you press a key (often e or Esc) to edit the boot menu rather than Space, and kernels are listed by grubby --info=ALL.
  • The preferred way to limit LVM device scanning today is the devices/global_filter setting or the LVM devices file (/etc/lvm/devices/system.devices, available from RHEL 8.5 and enabled by default on RHEL 9), not just the legacy filter key.

If you are on a current OS, read the steps for the concepts, then substitute the modern equivalents noted at the end.

Step-by-step: recover the server through HP iLO

The goal is to interrupt boot, select the last known-good kernel, repair the LVM filter, regenerate the new kernel's initramfs, and confirm everything before tightening security again.

1. Get console access and reboot to the GRUB menu

  1. Confirm HP iLO access works and log in to the iLO web interface for the affected server.
  2. Launch the Remote Console (Integrated Remote Console or the HTML5 console).
  3. Reset the server from iLO (Power → Reset, or Cold Boot if it is hung).
  4. Watch the console. When the GRUB boot loader appears and shows the kernel line, press the Space bar within the short 3-5 second window to reveal the kernel selection menu. (On GRUB 2 systems, press Esc or an arrow key instead.)
  5. Use the arrow keys to highlight the previous (old) kernel entry, then press Enter to boot it.
  6. When the system is up, log in as root (or a sudo-capable account).
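Once you are logged in on the old kernel, it can help to confirm which entry GRUB Legacy will pick on the next unattended boot. The sketch below assumes a BIOS system with the config at /boot/grub/grub.conf and a numeric default= line (entries are counted from 0); the function name grub_default_title is hypothetical, and a default=saved setting is not handled.

```shell
# Print the title of the entry GRUB Legacy boots by default.
# CONF defaults to /boot/grub/grub.conf (assumed BIOS layout).
grub_default_title() {
    awk '
        BEGIN { want = 0; n = 0 }
        # default=N selects the Nth title, counting from 0
        /^[ \t]*default[ \t]*=/ { sub(/^[^=]*=[ \t]*/, ""); want = $0 + 0 }
        /^[ \t]*title[ \t]/ {
            if (n == want) { sub(/^[ \t]*title[ \t]+/, ""); print; exit }
            n++
        }
    ' "${1:-/boot/grub/grub.conf}"
}

# Usage on the server:
#   grub_default_title
#   uname -r    # the kernel you are actually running right now
```

If the default title names the new kernel while uname -r shows the old one, the next plain reboot will try the new image again, which is what you want after the rebuild in step 3.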

2. Temporarily open the LVM filter

Edit the LVM configuration so it accepts every device while you rebuild. This guarantees the new initramfs can see the root disk regardless of how it was renamed.

  1. Back up the file first so you can revert cleanly:
    cp -p /etc/lvm/lvm.conf /etc/lvm/lvm.conf.bak
  2. Open it for editing:
    vi /etc/lvm/lvm.conf
  3. Find the filter setting in the devices { } section. Comment out the restrictive line and enable the permissive "accept everything" filter:
    filter = [ "a/.*/" ]
    #filter = [ "a/sda[1-9]$/", "a/cciss/c0d0.*/", "a|/dev/mapper/.*|", "a/emcpower.*/", "a/dm-.*/", "r/.*/" ]
  4. Save and exit (:wq in vi).

Why this is safe as a temporary step: the permissive filter only widens what LVM may scan; it does not change your volume groups or data. You will re-tighten it in step 5 once you know which device the root VG lives on.
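If you would rather script the edit than do it by hand in vi, the following is one possible sketch. It assumes the active filter sits on a single uncommented line; the function name open_lvm_filter is hypothetical. It writes a backup, comments out every active filter line, and puts the permissive filter in front of the first one. If the file has no active filter line, it is left unchanged.

```shell
# Swap the active filter in an lvm.conf for the permissive one,
# keeping the old line as a comment. FILE defaults to /etc/lvm/lvm.conf.
open_lvm_filter() (
    conf=${1:-/etc/lvm/lvm.conf}
    cp -p "$conf" "$conf.bak" || exit 1      # same backup as the manual step
    tmp=$(mktemp) || exit 1
    awk '
        # only uncommented filter lines; "# filter = ..." examples are left alone
        /^[ \t]*filter[ \t]*=/ {
            if (!done) { print "    filter = [ \"a/.*/\" ]"; done = 1 }
            print "    #" $0
            next
        }
        { print }
    ' "$conf.bak" > "$tmp" && cat "$tmp" > "$conf"   # cat keeps owner and mode of $conf
    rc=$?
    rm -f "$tmp"
    exit "$rc"
)

# Usage on the server (as root):
#   open_lvm_filter
#   grep -n 'filter' /etc/lvm/lvm.conf    # review the result before rebuilding
```

Reverting is a matter of copying /etc/lvm/lvm.conf.bak back over the live file.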

3. Identify the new kernel and rebuild its initramfs

List the contents of /boot to find the exact version string of the newly installed kernel. The version suffix changes with every update, so always read it from disk rather than assuming it.

  1. Change into the boot directory and list the kernel files:
    cd /boot
    ls -l vmlinuz-* initramfs-*
  2. Identify the highest/newest version, for example 2.6.32-431.29.2.el6.x86_64 (yours will differ by date and patch level).
  3. Regenerate that kernel's RAM disk with dracut. The -f flag forces an overwrite of the existing image:
    dracut -f /boot/initramfs-2.6.32-431.29.2.el6.x86_64.img 2.6.32-431.29.2.el6.x86_64

The first argument is the output image path and the second is the kernel version the image is built for. They must match. Because the permissive filter is active, the new image will now embed an LVM configuration that can find the root disk.
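Because a mismatch between the two arguments is easy to type, one option is to derive both from the vmlinuz file name. The sketch below only prints the dracut command so you can review it before running it; the function names installed_kernels and rebuild_cmd are hypothetical helpers, not part of dracut.

```shell
# List the kernel versions that have a vmlinuz file in BOOT (default /boot).
installed_kernels() (
    boot=${1:-/boot}
    for k in "$boot"/vmlinuz-*; do
        if [ -e "$k" ]; then
            echo "${k##*/vmlinuz-}"   # strip the directory and the vmlinuz- prefix
        fi
    done
)

# Print the dracut command for VERSION, refusing versions with no kernel in BOOT.
rebuild_cmd() (
    ver=$1
    boot=${2:-/boot}
    if [ ! -e "$boot/vmlinuz-$ver" ]; then
        echo "no $boot/vmlinuz-$ver; check the version string" >&2
        exit 1
    fi
    # image path and version argument are built from the same string
    echo "dracut -f $boot/initramfs-$ver.img $ver"
)

# Usage on the server:
#   installed_kernels
#   rebuild_cmd 2.6.32-431.29.2.el6.x86_64    # review, then run the printed command
```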

4. Reboot into the new kernel

  1. Reboot the server:
    reboot
  2. Let it boot normally. With the rebuilt image, it should come up on the new kernel by default, no menu interaction required.
  3. Confirm the running kernel after login:
    uname -r

5. Verify the physical volumes, then re-tighten the filter

Before locking the filter down again, find out exactly which device the root and system volume groups now use. After a kernel upgrade the boot disk may have shifted (for example sda → sdb), and you must accept the current device in the filter.

  1. List physical volumes and watch for duplicate-PV warnings:
    pvs

A typical output on a SAN-attached box looks like this:

    PV               VG        Fmt   Attr  PSize    PFree
    /dev/emcpowera1  dbvg1     lvm2  a--   249.99g       0
    /dev/emcpowere1  dbvg5     lvm2  a--    65.00g   3.00g
    /dev/sdb2        rootvg    lvm2  a--    29.97g  15.97g
    /dev/sdb3        systemvg  lvm2  a--    29.97g       0
    /dev/sdb4        wilyvg    lvm2  a--   161.00g   1.00g

Note the warning line you may see above the table:

Found duplicate PV xEKfbOLuz...: using /dev/sdb1 not /dev/sdq1

That message confirms a device moved or is visible on multiple paths. In this example the root VG is now on /dev/sdb, not sda. Your re-tightened filter must accept sdb.
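On a host with many PVs it is easier to ask for just the columns you need and pick out one volume group. A small sketch, assuming pvs is invoked with --noheadings -o pv_name,vg_name so each line is "device VG"; the function name pvs_for_vg is hypothetical, and the sample rows are taken from the example output above.

```shell
# Read "PV VG" pairs on stdin and print the PVs that back the named VG.
pvs_for_vg() {
    awk -v vg="$1" '$2 == vg { print $1 }'
}

# Sample data in the shape produced by: pvs --noheadings -o pv_name,vg_name
pvs_for_vg rootvg <<'EOF'
  /dev/emcpowera1 dbvg1
  /dev/emcpowere1 dbvg5
  /dev/sdb2       rootvg
  /dev/sdb3       systemvg
  /dev/sdb4       wilyvg
EOF
# prints: /dev/sdb2

# Usage on the server:
#   pvs --noheadings -o pv_name,vg_name | pvs_for_vg rootvg
```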

  2. Re-open the config: vi /etc/lvm/lvm.conf
  3. Replace the permissive filter with a precise one that accepts the local boot disk(s) carrying rootvg and systemvg plus the SAN devices, and rejects everything else. Adjust device names to match your pvs output:
    filter = [ "a|^/dev/sda[1-9]$|", "a|^/dev/sdb[1-9]$|", "a|^/dev/emcpower.*|", "r|.*|" ]
  4. Save the file.
  5. Rebuild the current kernel's image one more time so the tightened (correct) filter is baked in:
    dracut -f /boot/initramfs-$(uname -r).img $(uname -r)

Why include both sda and sdb: because device order is not guaranteed across reboots on a SAN host, accepting both candidate local disks makes the boot resilient to future renaming, while r|.*| still rejects the noisy duplicate SAN paths.
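Before rebuilding with a tightened filter, it is worth dry-running the accept patterns against the PVs the system currently uses. The sketch below is an approximation using grep -E rather than LVM's own matcher, and the function name uncovered_pvs is hypothetical. It prints every PV that none of the accept patterns match, which are the devices the catch-all reject would hide.

```shell
# Read PV names on stdin (one per line); print each PV that no accept regex matches.
uncovered_pvs() (
    while read -r pv; do
        [ -n "$pv" ] || continue
        hit=no
        for re in "$@"; do
            if printf '%s\n' "$pv" | grep -Eq -- "$re"; then
                hit=yes
                break
            fi
        done
        [ "$hit" = yes ] || echo "$pv"
    done
)

# The example PVs checked against a filter that still only accepts sda:
uncovered_pvs '^/dev/sda[1-9]$' '^/dev/emcpower.*' <<'EOF'
/dev/emcpowera1
/dev/emcpowere1
/dev/sdb2
/dev/sdb3
/dev/sdb4
EOF
# prints /dev/sdb2, /dev/sdb3 and /dev/sdb4, one per line

# Usage on the server:
#   pvs --noheadings -o pv_name | uncovered_pvs '^/dev/sda[1-9]$' '^/dev/sdb[1-9]$' '^/dev/emcpower.*'
```

An empty result means every current PV is accepted by at least one pattern; any output is a device that would disappear from LVM's view after the next rebuild.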

If you forgot the filter and the new kernel won't boot

If you skipped opening the filter and the server is already stuck, you do not have to reinstall. Recover with the same loop:

  1. Through iLO, reboot and press Space (GRUB Legacy) or Esc/e (GRUB 2) at the menu and boot the old, working kernel.
  2. Open the filter to filter = [ "a/.*/" ] as in step 2.
  3. Rebuild the broken new image, substituting its exact version:
    cd /boot
    dracut -f /boot/initramfs-2.6.32-431.20.5.el6.x86_64.img 2.6.32-431.20.5.el6.x86_64
  4. reboot and confirm it comes up on the new kernel, then re-tighten the filter and rebuild once more (step 5).

If both kernels fail to boot, drop to the dracut emergency shell or boot from RHEL/CentOS install media in rescue mode, chroot into the system, and run the same filter edit plus dracut -f there.

Common pitfalls when fixing a kernel upgrade boot failure

  • Wrong version string in dracut. The image path and the kernel-version argument must match exactly, character for character. Copy them from ls /boot, never type from memory.
  • Forgetting -f. Without the force flag, dracut refuses to overwrite an existing image and your changes never take effect.
  • Re-tightening the filter to the old device. If the disk moved from sda to sdb, a filter that only accepts sda recreates the exact failure. Always re-tighten after reading pvs.
  • Editing only lvm.conf but not rebuilding. The running system reads the live file, but boot uses the copy baked into initramfs. You must run dracut -f for the change to survive a reboot.
  • Ignoring duplicate-PV warnings. They are the clearest signal that multipath/SAN paths are confusing LVM; resolve which path is authoritative before locking the filter.
  • Quoting regexes incorrectly. Pick one delimiter and use it consistently. Both a|/dev/...| and a/.../ are valid forms, but a slash-delimited pattern that itself contains slashes (as in some legacy configs) is hard to read and easy to break when edited; pipe delimiters are safer for paths.

Verification: confirm the fix is durable

After the final rebuild, prove the server will survive future reboots, not just this one:

  1. Confirm the intended kernel is running: uname -r
  2. Confirm all expected volume groups are present and active: vgs should list rootvg, systemvg, and your data VGs, and lvs should show their logical volumes as active (an a in the fifth position of the Attr column).
  3. Inspect the new image to confirm the LVM config is inside it:
    lsinitrd /boot/initramfs-$(uname -r).img | grep lvm.conf
  4. Do one more clean reboot from iLO and confirm it boots unattended to the new kernel with the tightened filter, with no GRUB interaction.
  5. Check for boot-time LVM errors: dmesg | grep -i lvm, plus /var/log/messages on RHEL 6 (or journalctl -b on systemd-based releases).

When the server boots cleanly to the new kernel twice in a row with the precise filter in place, the upgrade is complete and the root cause is closed.
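The volume group check can be turned into a quick pass/fail test. A minimal sketch, assuming VG names arrive one per line as produced by vgs --noheadings -o vg_name; the function name missing_vgs is hypothetical, and the expected VG list is whatever applies to your host.

```shell
# Read VG names on stdin; print each expected VG (given as arguments) that is absent.
missing_vgs() (
    found=$(awk '{ print $1 }')   # trim the leading spaces vgs puts before names
    for vg in "$@"; do
        printf '%s\n' "$found" | grep -Fxq -- "$vg" || echo "$vg"
    done
)

# Sample run: systemvg is expected but not reported
printf '  rootvg\n  dbvg1\n' | missing_vgs rootvg systemvg
# prints: systemvg

# Usage on the server:
#   vgs --noheadings -o vg_name | missing_vgs rootvg systemvg dbvg1 dbvg5 wilyvg
```

No output means every expected VG is visible through the tightened filter.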

Modern equivalent (RHEL/Rocky/Alma 8 and 9)

On current releases, apply the same logic with these tools:

  • List and pick kernels with grubby --info=ALL and set the default with grubby --set-default followed by the kernel path (for example /boot/vmlinuz-<version>).
  • Prefer devices/global_filter in lvm.conf, or manage allowed devices with the LVM devices file: lvmdevices --adddev /dev/sdb2.
  • Regenerate the image the same way: dracut -f /boot/initramfs-$(uname -r).img $(uname -r), then optionally dracut --regenerate-all -f.
  • Use persistent device naming (/dev/disk/by-id or WWIDs) and proper multipath configuration so the boot disk never silently renames.
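To see which persistent names a given sdX device currently has, you can resolve the udev symlinks back to the kernel name. The sketch below assumes a Linux host where readlink -f is available and udev populates /dev/disk/by-id; the function name persistent_names is hypothetical.

```shell
# Print the symlinks under BYID (default /dev/disk/by-id) that resolve to DEVICE.
persistent_names() (
    dev=$(readlink -f "$1") || exit 1
    byid=${2:-/dev/disk/by-id}
    for link in "$byid"/*; do
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            echo "$link"
        fi
    done
)

# Usage on the server:
#   persistent_names /dev/sdb2
```

The names it prints stay attached to the same disk even if the kernel enumerates it under a different sdX letter on the next boot.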

Key Takeaways

  • The root cause of most post-kernel-upgrade boot failures on LVM systems is an over-restrictive filter in /etc/lvm/lvm.conf that hides the root disk from the new initramfs.
  • Boot the old kernel from the GRUB menu first; recovery rarely needs rescue media because the previous image still works.
  • Temporarily set filter = [ "a/.*/" ], rebuild with dracut -f, reboot, then re-tighten the filter only after checking pvs.
  • On SAN/multipath hosts the boot disk can move (sda → sdb); accept all real local boot devices in the filter and reject duplicate paths with r|.*|.
  • Always run dracut -f after editing lvm.conf and verify with lsinitrd plus a clean reboot, because boot reads the image, not the live file.

Frequently Asked Questions

Why does my Linux server boot the old kernel but not the new one?

Each kernel has its own initramfs. The old kernel's image was built when the LVM filter was correct, so it still finds the root volume group. The new kernel's image was generated after a restrictive filter was applied, so it cannot see the root disk and panics. Rebuilding the new image with a permissive filter, then re-tightening, resolves it.

How do I rebuild the initramfs after a kernel upgrade?

Use dracut -f /boot/initramfs-<version>.img <version>, where both the image path and the version argument match the exact kernel string from ls /boot. The -f flag forces overwrite. For the running kernel you can use dracut -f /boot/initramfs-$(uname -r).img $(uname -r).

What is the LVM filter and why does it break boot?

The filter in /etc/lvm/lvm.conf is a list of accept/reject regular expressions that controls which block devices LVM is allowed to scan. If it rejects the local disk holding the root volume group, the early-boot initramfs cannot activate rootvg and the system fails to mount its root filesystem.

Will opening the LVM filter to accept everything harm my data?

No. Setting filter = [ "a/.*/" ] only widens what LVM may scan; it does not modify, delete, or re-create any volume group or logical volume. It is a safe temporary step. You should still re-tighten it afterward to avoid duplicate-PV confusion on SAN hosts.
