DevOps · K8s · Volleyball · Travel
Explore NY Stream

VM is still running and cannot be shut down

— ny_wk

When a VMware VM is still running and cannot be shut down from the vSphere Client, you can force it off from the host's command line by finding the virtual machine's process ID and terminating it. On modern ESXi hosts the supported way is esxcli vm process list followed by esxcli vm process kill; on legacy ESX hosts you used kill -9 against the VM's process. This guide walks through both, safely.

The problem: a VMware VM is still running and cannot be shut down

Every administrator eventually meets the VM that ignores every polite request. You click Power Off in the vSphere Client and nothing happens. The guest shows as powered on, the console is black or frozen, and operations like vMotion, snapshot, or reconfigure fail because the machine appears busy. Sometimes the power-off task in vCenter sits at the same percentage indefinitely and never completes.

This usually means the VM's underlying process on the host has hung. Each running virtual machine is backed by a process (historically called the VMX / vmware-vmx process) on the hypervisor. When that process stops responding to the power-off signal, the management layer cannot reap it gracefully, so the VM looks stuck. The fix is to identify that host-side process and terminate it directly.

Important: killing a VM process is a hard stop, equivalent to pulling the power cord. It is safe for the hypervisor, but the guest OS gets no chance to flush data. Treat it as a last resort after the graceful options below have failed.

Before you start: requirements and a quick triage

Before reaching for a force kill, confirm the situation and gather access:

  • Host access. You need SSH (or DCUI/ESXi Shell) access to the specific ESXi host that is actually running the VM, plus root credentials. On legacy ESX, that was the service console.
  • Identify the right host. In a cluster, the VM lives on one host. Killing the process must happen on that host, so note where the VM is registered.
  • Enable SSH if needed. In the vSphere Client, select the host, go to Configure > System > Services, and start the SSH service (it is disabled by default for security).
  • Try the graceful path first. Right-click the VM and choose Power > Reset or Power Off again; if VMware Tools is installed and responsive, Shut Down Guest OS may still work.

If those fail, move to the host command line. The modern esxcli method below is the recommended approach for any supported release of VMware ESXi and vSphere.

Method 1 (recommended): force kill a stuck VM with esxcli on ESXi

This is the supported, VMware-documented procedure on current ESXi hosts. It avoids guessing at raw process IDs and gives you three escalating kill types.

  1. Connect to the host over SSH as root:
    ssh root@esxi-host-ip
  2. List the running VMs and their World IDs. The World ID is what you will pass to the kill command:
    esxcli vm process list
    Each entry shows the VM Display Name, the World ID, the Process ID, the VMX Cartel ID, the UUID, and the path to the .vmx config file. Find the stuck VM by its display name.
  3. Kill the VM by World ID. Always start with the gentlest type and escalate only if it does not work:
    esxcli vm process kill --type=soft --world-id=<WorldID>
    If the VM is still listed after a minute, try a hard kill:
    esxcli vm process kill --type=hard --world-id=<WorldID>
    As a final resort, force:
    esxcli vm process kill --type=force --world-id=<WorldID>
  4. Confirm it is gone. Run esxcli vm process list again; the VM should no longer appear.
  5. Power the VM back on normally from the vSphere Client, or from the CLI once you have its inventory ID (see the section below).

The three kill types matter. soft sends the equivalent of a clean SIGTERM and lets the VMX shut down its threads; hard stops the process immediately; force is the most aggressive and should only be used when soft and hard both fail. Escalating in order minimizes the chance of corrupting the VM's files.
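The lookup and escalation steps above can be sketched as a small POSIX shell helper. This is a minimal sketch, not a VMware-supplied script: the function names find_world_id and kill_vm_escalating and the WAIT_SECS variable are hypothetical, and the parser assumes the usual esxcli vm process list layout in which each VM block has a "World ID:" line before its "Display Name:" line.

```shell
#!/bin/sh
# Sketch only: find_world_id, kill_vm_escalating and WAIT_SECS are
# illustrative names, not part of esxcli.

# Print the World ID of the VM whose display name equals $1 exactly.
# Reads `esxcli vm process list` output on stdin.
find_world_id() {
    awk -v name="$1" '
        /^[ \t]*World ID:/     { wid = $NF }
        /^[ \t]*Display Name:/ {
            sub(/^[ \t]*Display Name:[ \t]*/, "")
            if ($0 == name) print wid
        }
    '
}

# Try soft, then hard, then force. Stop as soon as the VM is no longer
# in the process list. WAIT_SECS (default 60) is the pause between attempts.
kill_vm_escalating() {
    vm_name="$1"
    for kill_type in soft hard force; do
        world_id=$(esxcli vm process list | find_world_id "$vm_name")
        if [ -z "$world_id" ]; then
            echo "$vm_name is not in the process list on this host"
            return 0
        fi
        echo "Trying $kill_type kill on World ID $world_id"
        esxcli vm process kill --type="$kill_type" --world-id="$world_id"
        sleep "${WAIT_SECS:-60}"
    done
    # Final check after the force attempt.
    if esxcli vm process list | find_world_id "$vm_name" | grep -q .; then
        echo "$vm_name is still running after a force kill" >&2
        return 1
    fi
    echo "$vm_name is no longer running"
}

# Example call (run on the host that owns the VM):
# kill_vm_escalating OFF-RS-WEBQA1
```

Matching on the exact display name, rather than a substring, keeps the helper from picking up a neighbor VM with a similar name. Re-reading the World ID before each attempt also means a VM that has already exited is never targeted again.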

Powering the VM back on from the command line

If the vSphere Client is also unresponsive, you can manage power state by inventory VMID using vim-cmd:

  1. List all registered VMs and their numeric VMIDs:
    vim-cmd vmsvc/getallvms
  2. Check the current power state to confirm it is off:
    vim-cmd vmsvc/power.getstate <VMID>
  3. Power it back on:
    vim-cmd vmsvc/power.on <VMID>

Note the distinction: esxcli vm process kill uses the transient World ID of the running process, while vim-cmd vmsvc/* uses the persistent inventory VMID. They are different numbers for the same VM.
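To keep the two identifiers apart in practice, the inventory lookup can be wrapped in its own helper. The following is a sketch under stated assumptions: find_vmid and power_on_by_name are hypothetical names, the VM name is assumed to contain no spaces (so it is the second whitespace-separated column of vim-cmd vmsvc/getallvms), and power.getstate is assumed to print a line containing "Powered off" for a stopped VM.

```shell
#!/bin/sh
# Sketch only: find_vmid and power_on_by_name are illustrative names.

# Print the inventory VMID for an exact VM name.
# Reads `vim-cmd vmsvc/getallvms` output on stdin and skips the header row.
# Assumption: the VM name has no spaces, so it is field 2.
find_vmid() {
    awk -v name="$1" 'NR > 1 && $2 == name { print $1 }'
}

# Power a VM on by name, but only if it currently reports as powered off.
power_on_by_name() {
    vmid=$(vim-cmd vmsvc/getallvms | find_vmid "$1")
    if [ -z "$vmid" ]; then
        echo "No registered VM named $1" >&2
        return 1
    fi
    if vim-cmd vmsvc/power.getstate "$vmid" | grep -q "Powered off"; then
        vim-cmd vmsvc/power.on "$vmid"
    else
        echo "$1 (VMID $vmid) is not powered off; leaving it alone"
    fi
}

# Example call:
# power_on_by_name OFF-RS-WEBQA1
```

Multi-line annotations in the getallvms output can produce extra rows, so it is worth checking the printed VMID by eye before relying on it.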

Method 2 (legacy ESX only): kill the vmware-vmx process

On the old ESX platform (the classic hypervisor with a Linux-based service console), early releases had no esxcli vm process command, so admins killed the process by hand. This method applies only to classic ESX with service-console access, not to ESXi.

  1. Find the process backing the VM by searching for its name:
    ps -auxwww | grep VMNAME
    For example: ps -auxwww | grep OFF-RS-WEBQA1. Look for the vmware-vmx process whose command line references that VM's .vmx file, and note its PID (the second column).
  2. Terminate that PID:
    kill -9 <PID>
    Be precise: a wide grep can match more than one VM, so confirm the PID belongs to the VM you intend to stop before killing it.
  3. Restart the VM from the VI Client or the command line once the process is gone.
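The "be precise" advice in step 2 can be sketched as a filter that matches on the full .vmx path and refuses to act unless exactly one process matches. The names find_vmx_pid and kill_vmx_by_path are hypothetical, and the filter assumes the standard ps aux column layout, where the PID is field 2 and the executable is field 11.

```shell
#!/bin/sh
# Sketch only: find_vmx_pid and kill_vmx_by_path are illustrative names.

# Print the PID of each vmware-vmx process whose command line contains
# the given full .vmx path. Reads `ps -auxwww` output on stdin.
# Checking field 11 (the executable) keeps grep/awk helper processes,
# which also carry the path on their command line, out of the result.
find_vmx_pid() {
    awk -v vmx="$1" '$11 ~ /vmware-vmx$/ && index($0, vmx) { print $2 }'
}

# Kill the VM process only when exactly one PID matches the path.
kill_vmx_by_path() {
    pids=$(ps -auxwww | find_vmx_pid "$1")
    set -- $pids
    if [ "$#" -ne 1 ]; then
        echo "Expected exactly one vmware-vmx match, found $#; not killing anything" >&2
        return 1
    fi
    kill -9 "$1"
}

# Example call, with <path> standing in for the VM's real .vmx path:
# kill_vmx_by_path <path>
```

Passing the whole .vmx path rather than a short name means a VM called WEBQA1 cannot be confused with WEBQA10.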

An older support utility, vm-support, could also list and kill VMs on ESX (vm-support -x to list running VMs with their World IDs, vm-support -X <WorldID> to kill one). It is primarily a diagnostic log-bundle collector, and its kill behavior varies by version, so prefer the esxcli method on anything supported.

Modern-equivalent note: ESX 4.1 was the last classic ESX release, and it reached end of general support in 2014; VMware then consolidated everything onto the thin ESXi hypervisor. If you are still on ESX today, the platform is long past end of life and should be upgraded. On every current host, use Method 1 (esxcli) rather than raw kill -9.

Quick command reference

Goal                       Modern ESXi                                         Legacy ESX
List running VM processes  esxcli vm process list                              ps -auxwww | grep VMNAME
Kill the stuck VM          esxcli vm process kill --type=soft --world-id=<id>  kill -9 <PID>
List inventory VMs         vim-cmd vmsvc/getallvms                             vmware-cmd -l
Check power state          vim-cmd vmsvc/power.getstate <VMID>                 n/a
Power on                   vim-cmd vmsvc/power.on <VMID>                       VI Client

Common pitfalls when a VM cannot be shut down

  • Running the kill on the wrong host. In a cluster, only the host actually running the VM lists its World ID. Killing on a neighbor host does nothing.
  • Confusing World ID with VMID. Passing a vim-cmd VMID to esxcli vm process kill (or vice versa) will fail or target the wrong thing. Use the ID from the matching command.
  • Jumping straight to --type=force. Force can leave a VM in a dirty state. Always try soft, then hard, then force.
  • A too-broad grep. On legacy ESX, grep WEB may match several VMs; verify the exact .vmx path before issuing kill -9.
  • Storage is the real cause. A VM frozen on an unreachable datastore (APD/PDL, a dead NFS mount, or stale SCSI locks) will hang during power-off. If many VMs are stuck at once, investigate storage connectivity first rather than killing each one.
  • Blaming the VM when the management agents are stuck. If hostd or vpxa is the real problem, restart them with /etc/init.d/hostd restart and /etc/init.d/vpxa restart before touching VM processes.

Verification: confirm the VM actually stopped and restarts cleanly

After the kill, verify rather than assume:

  1. Re-run esxcli vm process list (or ps -auxwww | grep VMNAME on ESX) and confirm the VM no longer appears.
  2. In the vSphere Client, refresh the inventory; the VM should show Powered Off and the stuck task should clear.
  3. Power it back on and watch the console boot. If the guest reports an unclean shutdown, that is expected after a hard kill; let it run its file-system check.
  4. Check the host logs for the root cause so it does not recur: /var/log/vmkernel.log and /var/log/hostd.log on ESXi.
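For step 4, a quick way to narrow the logs is to pull only the most recent lines that mention the VM. This is a rough sketch: vm_log_tail and TAIL_COUNT are hypothetical names, and it assumes the VM's name appears in the relevant entries (typically as part of its .vmx path), which is not guaranteed for every vmkernel message.

```shell
#!/bin/sh
# Sketch only: vm_log_tail and TAIL_COUNT are illustrative names.

# Print the last TAIL_COUNT lines (default 20) mentioning the VM name
# in each readable log file given. Unreadable or missing files are skipped.
# Usage: vm_log_tail <vm-name> <log-file>...
vm_log_tail() {
    name="$1"
    shift
    for log in "$@"; do
        [ -r "$log" ] || continue
        echo "== $log =="
        # -F treats the name as a fixed string, not a regular expression.
        grep -F -- "$name" "$log" | tail -n "${TAIL_COUNT:-20}"
    done
}

# Example call on an ESXi host:
# vm_log_tail OFF-RS-WEBQA1 /var/log/vmkernel.log /var/log/hostd.log
```

If nothing relevant shows up under the VM's name, searching the same files for the World ID noted before the kill is a reasonable next step.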

If the same VM keeps hanging, look beyond the symptom: failing storage paths, a corrupt snapshot chain, or an overcommitted host are the usual culprits behind a VM that is still running and cannot be shut down.

Key Takeaways

  • A stuck VM that won't power off almost always means its host-side process has hung; the fix is to terminate that process from the hypervisor CLI.
  • On modern ESXi, use esxcli vm process list then esxcli vm process kill --type=soft|hard|force --world-id=<id>, escalating only as needed.
  • The legacy ESX approach (ps -auxwww | grep then kill -9 <PID>) applies only to the old service-console platform, which is end of life.
  • World ID (running process) and VMID (inventory) are different identifiers; use the one that matches your command.
  • Force-killing is a hard stop with no guest flush; try graceful power-off first and investigate storage if VMs hang repeatedly.

Frequently Asked Questions

How do I force shutdown a VMware VM that won't power off?

SSH into the ESXi host running the VM, run esxcli vm process list to get its World ID, then run esxcli vm process kill --type=soft --world-id=<id>. If it survives, escalate to --type=hard and finally --type=force.

What is the difference between soft, hard, and force kill in esxcli?

soft asks the VMX process to shut down cleanly, hard stops it immediately, and force is the most aggressive option for when the first two fail. Always escalate in that order to reduce the risk of file corruption.

Can I use kill -9 on ESXi like on old ESX?

It is not recommended. ESXi has no Linux service console like classic ESX did, and esxcli vm process kill is the supported, safer equivalent. Reserve raw kill -9 for genuine legacy ESX hosts.

Why does my VM get stuck during power-off in the first place?

Common causes are unreachable storage (APD/PDL, a dead NFS mount, or stale SCSI locks), a corrupt snapshot chain, an unresponsive hostd/vpxa agent, or an overcommitted host. If multiple VMs hang together, check storage and management agents before force-killing.
