vSAN in VCF 9: Preventing ESXi Reboot Hangs with an Orchestrator Precheck Gate

The Cloud Architect

This post covers a narrow operational risk in vSAN-backed VCF environments: an ESXi host reboot can hang when vSAN CMMDS shutdown is delayed under specific network conditions.

Source KB: https://knowledge.broadcom.com/external/article/405500/esxi-shutdown-hung-at-rebootrunhandlersv.html

What the KB is telling you

If you reboot hosts during maintenance, you need a deterministic gate that blocks the reboot whenever the cluster is not in a safe state, for example during network instability or an ongoing storage transition.

Orchestrator action: vSAN reboot precheck gate

Goal: block host reboot workflows unless vSAN conditions are healthy.

Workflow steps (VMware Aria Orchestrator)

  • Create a workflow: 'VCF9 - vSAN Reboot Precheck Gate'
  • Inputs: vcCluster (VC:ClusterComputeResource), maxActiveResync (number, default 0)
  • Step 1: Query vSAN/vCenter health for the cluster. If critical issues exist, fail the workflow.
  • Step 2: Query resync/rebuild activity. If active resync components > maxActiveResync, fail the workflow.
  • Step 3: If checks pass, return PASS and allow downstream 'Enter Maintenance Mode' / 'Reboot Host' workflows to proceed.
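The decision logic behind Steps 1 to 3 can be sketched in a few lines. This is a minimal illustration, not the actual workflow: it assumes the health issues and the resync component count have already been collected by earlier workflow elements, and the names evaluate_reboot_gate, GateResult, critical_issues and active_resync_components are hypothetical. Only maxActiveResync and its default of 0 come from the workflow inputs above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GateResult:
    """Outcome of the precheck: passed flag plus human-readable failure reasons."""
    passed: bool
    reasons: List[str] = field(default_factory=list)


def evaluate_reboot_gate(critical_issues: List[str],
                         active_resync_components: int,
                         max_active_resync: int = 0) -> GateResult:
    """Apply Steps 1-3 to data gathered elsewhere (hypothetical inputs)."""
    reasons: List[str] = []

    # Step 1: any critical health issue blocks the reboot.
    if critical_issues:
        reasons.append("critical health issues: " + ", ".join(critical_issues))

    # Step 2: resync activity above the threshold blocks the reboot.
    if active_resync_components > max_active_resync:
        reasons.append(
            f"active resync components {active_resync_components} "
            f"> maxActiveResync {max_active_resync}"
        )

    # Step 3: PASS only when no check recorded a failure.
    return GateResult(passed=not reasons, reasons=reasons)
```

The sketch collects every failure reason instead of stopping at the first one, so an operator sees the full picture in a single run. The workflow should still fail whenever the reasons list is non-empty.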

Expected outcome

This turns host reboots into a controlled operation: no precheck PASS = no reboot.
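The "no PASS, no reboot" rule can be enforced by wrapping the reboot call itself. The sketch below is an assumption about how a downstream step might consume the gate result: reboot_if_gate_passed and RebootBlocked are hypothetical names, and the reboot callable stands in for whatever 'Reboot Host' workflow or API call your environment uses.

```python
from typing import Callable


class RebootBlocked(Exception):
    """Raised when a reboot is requested without a precheck PASS."""


def reboot_if_gate_passed(gate_result: str, reboot: Callable[[], None]) -> None:
    """Run the reboot callable only when the gate returned exactly "PASS"."""
    # Anything other than the exact string "PASS" counts as a failure,
    # including an empty or missing result.
    if gate_result != "PASS":
        raise RebootBlocked(
            f"precheck gate returned {gate_result!r}; reboot skipped"
        )
    reboot()
```

Treating every non-PASS value as a block keeps the gate fail-closed: a precheck that errors out or returns nothing stops the reboot instead of letting it through.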

