Upgrading AHV Firmware with Ansible Automation Platform
Last updated: March 4, 2026
Why Automate Firmware Upgrades?
Nutanix's Life Cycle Manager (LCM) does a good job of managing firmware and software upgrades through the Prism Central UI. You click through a wizard, LCM inventories available updates, you select what to apply, and it handles the node-by-node rolling upgrade process.
But there are real operational reasons to drive this through Ansible and AAP instead:
Consistent change window enforcement – AAP job templates enforce that upgrades only run during approved maintenance windows via schedules and approval gates
Audit trail – every upgrade run is logged in AAP with who triggered it, what the before/after versions were, and the full output
Pre/post gating – I want automated health checks before and after every firmware change, not manual ones I might skip at 11pm
Multi-cluster coordination – I manage three CE clusters. A single AAP workflow upgrades all three sequentially with health validation between stages
This article shows the playbooks and AAP configuration I use to automate AHV and node firmware upgrades on my CE clusters.
Nutanix Upgrade Concepts β LCM vs Manual
Life Cycle Manager (LCM)
LCM is Nutanix's built-in upgrade orchestration framework, accessible via the Prism Central UI and the LCM REST API (/lcm/v1.r0/...).
LCM handles:
AOS – Nutanix operating system on CVMs
AHV – hypervisor version on each node
Firmware – BIOS, BMC, HBA, NIC, SSD firmware
Prism Central – PC appliance software
NCC – Nutanix Cluster Check health framework
Foundation – imaging tool on nodes
LCM operates by:
Inventory – scans available updates from the Nutanix portal or a dark-site bundle
Pre-upgrade checks – validates cluster health, version compatibility, and disk space
Rolling upgrade – upgrades one node at a time: live-migrates VMs off (for AHV upgrades), upgrades that node, then brings it back
Manual Upgrade (for reference)
Without LCM, each component has to be upgraded by hand from the CVM command line. Manual upgrades are rarely needed but are useful to understand – LCM wraps similar calls internally.
Understanding AHV Firmware Upgrade Scope
When I say "AHV firmware upgrade" in this article, I mean both:
AHV hypervisor version upgrade (e.g., 20230302.x → 20240304.x) – manages the KVM/QEMU layer version across nodes
Node hardware firmware (BIOS, BMC/iDRAC, NIC, SSD) – applied by LCM during a firmware update cycle
Both are managed through the LCM API via Prism Central. The Ansible playbooks in this article call the LCM REST API to:
Run an LCM inventory (discover available updates)
Create an LCM update plan for the target components
Execute the plan
Poll until completion
Run NCC post-upgrade checks
Ansible Automation Platform Setup
I run AAP 2.4 on a small VM (4 vCPU, 8 GB RAM) inside my Nutanix CE cluster – yes, it lives on the same cluster it manages, which is fine for CE but something to think carefully about for production.
AAP Version and Collections Required
No dedicated Nutanix collection exists for LCM – LCM operations use the standard Prism Central REST API via ansible.builtin.uri.
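Since everything goes through ansible.builtin.uri, a minimal call against Prism Central looks like the sketch below. The host and credential variable names (prism_central_host, prism_central_user, prism_central_password) are my own conventions, not anything mandated by the API:

```yaml
# Sketch: query Prism Central's v3 API with basic auth.
- name: Get cluster list from Prism Central
  ansible.builtin.uri:
    url: "https://{{ prism_central_host }}:9440/api/nutanix/v3/clusters/list"
    method: POST
    user: "{{ prism_central_user }}"
    password: "{{ prism_central_password }}"
    force_basic_auth: true
    validate_certs: false        # CE labs typically run self-signed certs
    body_format: json
    body: { kind: cluster }
    status_code: 200
  register: cluster_list
```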
Execution Environment
I created a custom Execution Environment image with Nutanix-specific Python dependencies:
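An execution-environment definition along these lines works; the base image and the exact dependency pins are assumptions you should adjust for your AAP release:

```yaml
# execution-environment.yml – ansible-builder v3 schema (illustrative)
version: 3
images:
  base_image:
    name: registry.redhat.io/ansible-automation-platform-24/ee-minimal-rhel8:latest
dependencies:
  python:
    - requests>=2.28     # HTTP calls to the Prism Central / LCM APIs
  galaxy:
    collections:
      - name: ansible.utils   # extra filters, handy when shaping update specs
```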
Build and push to your private registry, then register in AAP:
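The build-and-push step is standard ansible-builder plus a container push; the registry hostname and tag below are placeholders:

```shell
# Build the EE image from the definition file, then push it to a
# private registry so AAP can pull it.
ansible-builder build -t registry.lab.example.com/nutanix-ee:1.0 \
  -f execution-environment.yml
podman push registry.lab.example.com/nutanix-ee:1.0
```

After the push, register the image in AAP under Administration → Execution Environments so job templates can select it.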
Inventory and Credentials
Inventory
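A minimal inventory for this setup needs one logical "host" for Prism Central (the API target) plus the CVMs for SSH-based checks. Hostnames here are placeholders:

```yaml
# inventory.yml – illustrative layout
all:
  children:
    prism_central:
      hosts:
        pc.lab.example.com:
          prism_central_host: pc.lab.example.com
    cvms:
      hosts:
        cvm-a.lab.example.com:
        cvm-b.lab.example.com:
        cvm-c.lab.example.com:
```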
Credentials in AAP
Create two credential types in AAP:
Prism Central API Credential (Custom type):
CVM SSH Credential (Machine type): username and SSH key or password for the nutanix user on the CVMs
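For the custom Prism Central type, the input and injector configuration could look like this – the field ids and injected variable names are my own choices and must match whatever variable names your playbooks expect:

```yaml
# INPUT CONFIGURATION (first box in the AAP credential type editor):
fields:
  - id: pc_user
    type: string
    label: Prism Central Username
  - id: pc_password
    type: string
    label: Prism Central Password
    secret: true
required:
  - pc_user
  - pc_password
---
# INJECTOR CONFIGURATION (second box):
extra_vars:
  prism_central_user: "{{ pc_user }}"
  prism_central_password: "{{ pc_password }}"
```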
Pre-Upgrade Health Check Playbook
Run this playbook before any firmware or AHV upgrade. It fails immediately if the cluster is not in a healthy state, preventing upgrades on a degraded cluster.
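A sketch of such a health gate – this is illustrative, not the author's exact playbook, and the NCC output parsing is deliberately simple:

```yaml
# pre_upgrade_health.yml – fail fast if the cluster is not healthy.
- name: Pre-upgrade cluster health gate
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Check cluster service state from a CVM
      ansible.builtin.command: cluster status
      delegate_to: "{{ groups['cvms'][0] }}"
      register: cluster_status
      changed_when: false
      # 'Down' appearing in the output indicates a stopped service;
      # tighten this match for your AOS version.
      failed_when: "'Down' in cluster_status.stdout"

    - name: Run the full NCC health check suite
      ansible.builtin.command: ncc health_checks run_all
      delegate_to: "{{ groups['cvms'][0] }}"
      register: ncc_out
      changed_when: false

    - name: Abort if NCC reported failures
      ansible.builtin.assert:
        that: "'FAIL' not in ncc_out.stdout"
        fail_msg: "NCC reported failures – aborting upgrade"
```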
Trigger LCM Firmware Upgrade via Prism Central API
The LCM update flow is:
GET entity list – identify UUIDs and target versions for the components you want to upgrade
POST /lcm/v1.r0/operations/update – create the update plan
Poll the returned task UUID until completion
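A sketch of that flow with ansible.builtin.uri. The request bodies are simplified, the UUIDs and versions are placeholders, and response field names vary between LCM versions, so inspect the actual JSON before hardcoding paths:

```yaml
- name: List LCM entities and their available versions
  ansible.builtin.uri:
    url: "https://{{ prism_central_host }}:9440/lcm/v1.r0/resources/entities/list"
    method: POST
    user: "{{ prism_central_user }}"
    password: "{{ prism_central_password }}"
    force_basic_auth: true
    validate_certs: false
    body_format: json
    body: {}
  register: lcm_entities

- name: Create the LCM update plan (zip-transform spec – illustrative)
  vars:
    target_uuids: ["uuid-placeholder-1", "uuid-placeholder-2"]
    target_versions: ["version-placeholder-1", "version-placeholder-2"]
  ansible.builtin.uri:
    url: "https://{{ prism_central_host }}:9440/lcm/v1.r0/operations/update"
    method: POST
    user: "{{ prism_central_user }}"
    password: "{{ prism_central_password }}"
    force_basic_auth: true
    validate_certs: false
    body_format: json
    body:
      # Pair each UUID with its target version, producing
      # [{'uuid': ..., 'to_version': ...}, ...]
      entity_update_spec_list: >-
        {{ dict(target_uuids | zip(target_versions))
           | dict2items(key_name='uuid', value_name='to_version') }}
  register: lcm_update
  # The response carries a task UUID to poll; its exact field name
  # varies by LCM release, so check lcm_update.json on your version.
```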
Note on the entity_update_spec_list format: the LCM API expects a list of objects with uuid and to_version. The approach above is illustrative; for production use, build the list explicitly with a set_fact loop over known entity UUIDs and target versions instead of relying on zip transforms.
Explicit update spec example (recommended):
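A sketch of the explicit approach – the UUIDs and versions here are placeholders you would fill in from the entity list response:

```yaml
- name: Build the update spec explicitly
  ansible.builtin.set_fact:
    entity_update_spec_list: >-
      {{ entity_update_spec_list | default([])
         + [ {'uuid': item.uuid, 'to_version': item.to_version} ] }}
  loop:
    # Placeholder entries – replace with real entity UUIDs and target
    # versions discovered via /lcm/v1.r0/resources/entities/list.
    - { uuid: "00000000-0000-0000-0000-000000000001", to_version: "20240304.x" }
    - { uuid: "00000000-0000-0000-0000-000000000002", to_version: "x.y.z" }
```

Being explicit costs a few more lines but makes the change request reviewable: the exact components and versions are visible in the playbook diff.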
Poll Upgrade Progress
LCM upgrades for AHV can take 30–90 minutes (node by node, with VMs live-migrated off each node). This playbook polls until the task completes.
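A polling sketch using the v3 tasks endpoint; task_uuid is assumed to have been extracted from the update response in an earlier step:

```yaml
- name: Poll the LCM task until it finishes
  ansible.builtin.uri:
    url: "https://{{ prism_central_host }}:9440/api/nutanix/v3/tasks/{{ task_uuid }}"
    method: GET
    user: "{{ prism_central_user }}"
    password: "{{ prism_central_password }}"
    force_basic_auth: true
    validate_certs: false
  register: lcm_task
  until: lcm_task.json.status in ['SUCCEEDED', 'FAILED']
  retries: 180     # 180 attempts x 60 s delay = 3 h ceiling
  delay: 60

- name: Abort the workflow if the upgrade task failed
  ansible.builtin.assert:
    that: lcm_task.json.status == 'SUCCEEDED'
    fail_msg: "LCM task {{ task_uuid }} ended in {{ lcm_task.json.status }}"
```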
Post-Upgrade Validation Playbook
After LCM completes, validate that:
All CVMs are UP
All node services are running
NCC health check passes
AHV version matches the expected target
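The checks above can be sketched as a single playbook – illustrative only, and the output parsing should be tightened for your NCC/AOS versions:

```yaml
# post_upgrade_validate.yml – validate cluster state after LCM completes.
- name: Post-upgrade validation
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Check CVM and service state
      ansible.builtin.command: cluster status
      delegate_to: "{{ groups['cvms'][0] }}"
      register: cluster_status
      changed_when: false
      failed_when: "'Down' in cluster_status.stdout"

    - name: Run NCC health checks
      ansible.builtin.command: ncc health_checks run_all
      delegate_to: "{{ groups['cvms'][0] }}"
      register: ncc_result
      changed_when: false
      failed_when: "'FAIL' in ncc_result.stdout"

    - name: Read host (AHV) versions from the v3 API
      ansible.builtin.uri:
        url: "https://{{ prism_central_host }}:9440/api/nutanix/v3/hosts/list"
        method: POST
        user: "{{ prism_central_user }}"
        password: "{{ prism_central_password }}"
        force_basic_auth: true
        validate_certs: false
        body_format: json
        body: { kind: host }
      register: hosts_out
      # Compare each host's reported hypervisor version against the
      # expected target; the exact field path varies by AOS release,
      # so inspect hosts_out.json.entities before asserting on it.
```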
Full Pipeline: Job Template Chain in AAP
In AAP, chain the playbooks into a Workflow Job Template with approval gates:
AAP Schedule for Maintenance Windows
Notification on Failure
Rollback Considerations
Nutanix LCM does not have a one-click rollback for AHV or firmware upgrades. Here is what I do instead:
AHV Version Rollback
AHV upgrade is managed by LCM and is typically forward-only once applied. However:
Nutanix Support can provide a downgrade path for AHV – contact them with your ticket before attempting it
For CE, the fastest recovery from a bad AHV upgrade is to re-image the node using Foundation
Pre-Upgrade Snapshot of CVMs
Before triggering LCM, snapshot all CVM VMs as a precaution. Nutanix recommends against CVM snapshots in production, and they are rarely needed, but for a CE lab this is acceptable insurance.
Firmware Rollback
For BMC/BIOS firmware changes:
Nutanix LCM stores the previous firmware version metadata in its internal DB
In some versions, LCM supports rollback for specific components; check lcm/v1.r0/resources/entities/list for rollback_version in the entity metadata
Hardware vendor tools (e.g., iDRAC firmware rollback) remain an option if LCM rollback is not available
What I Learned the Hard Way
1. Never run LCM upgrades with degraded storage
I once triggered a firmware upgrade while a disk was showing Warning in Prism. LCM started but paused midway when the node it was upgrading couldn't confirm RF2 compliance. The cluster entered a partially upgraded state that took an hour to resolve. The pre-upgrade NCC check should catch this – do not skip it.
2. VM live migration must work before AHV upgrade
LCM drains each node before upgrading AHV – it live-migrates all VMs off the node. If your nodes have CPUs of different generations (e.g., mixed NUC 12 and NUC 13), live migration may fail for VMs without CPU masking. Test live migration manually before LCM runs:
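A quick manual test from any CVM – the VM name and host address are placeholders for your environment:

```shell
# Live-migrate a test VM to a specific host, then inspect its placement.
acli vm.migrate testvm host=10.10.10.12
acli vm.get testvm     # check which host the VM landed on
```

If this fails with a CPU feature mismatch, fix the masking before letting LCM drain anything.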
Fix CPU compatibility issues (enable EVC/CPU masking) before triggering any AHV upgrade.
3. LCM task API vs v3 task API
LCM uses https://pc:9440/lcm/v1.r0/ for its own operations, but the task polling endpoint for LCM tasks uses the standard v3 tasks API at https://pc:9440/api/nutanix/v3/tasks/<uuid>. This inconsistency caught me off guard – I was polling the LCM endpoint for task status and getting 404 errors until I found this in the LCM API docs.
4. CE LCM requires internet access for inventory
CE clusters need outbound HTTPS to download.nutanix.com for LCM to discover available updates. In a fully air-gapped CE setup, you need to set up a dark-site LCM bundle. In AAP, make sure your execution environment can reach the Prism Central IP, and that PC can reach Nutanix download servers during inventory.
Next Steps
Tags: Nutanix, AHV, LCM, Life Cycle Manager, Ansible, Ansible Automation Platform, AAP, Firmware Upgrade, AOS Upgrade, Prism Central API, Community Edition