Implementing Hyper-Converged Infrastructure with Nutanix

Last updated: March 4, 2026


Why I Chose HCI for My Home Lab

I used to run a traditional three-tier homelab setup: a separate NAS for storage (TrueNAS), a standalone switch for networking, and a couple of Intel NUC nodes running ESXi. Managing it meant juggling multiple management interfaces – the NAS UI, vCenter, and the switch CLI – and when something went wrong, tracing the problem across three separate systems was time-consuming.

When I discovered Nutanix Community Edition (CE), the appeal was immediately obvious: one platform, one management plane, compute and storage handled together. The operational overhead dropped significantly within the first week.

This article walks through everything I did to implement HCI with Nutanix – from sizing decisions to cluster formation to running production workloads on it.


What Is Hyper-Converged Infrastructure?

Hyper-Converged Infrastructure (HCI) collapses the historically separate compute, storage, and networking tiers into a single software-defined platform running on standard x86 hardware.

In classical HCI:

Resource   | Where It Lives
---------- | ------------------------------------------------
Compute    | x86 CPU + RAM on each node
Storage    | Local NVMe/SSD on each node, pooled via software
Networking | Managed by the HCI layer (overlay or physical)
Management | Single unified control plane

The storage across all nodes is pooled by a distributed file system – in Nutanix, this is NDFS (Nutanix Distributed File System). Each node contributes its local drives to a shared, highly available storage pool. VMs running on any node can access data regardless of which node physically holds the blocks.

Core HCI Principles

  • Converged resource model – no external SAN or NAS required

  • Software-defined resilience – RF2 or RF3 replication (Replication Factor) across nodes instead of hardware RAID

  • Scale-out architecture – add a node, instantly expand both compute and storage

  • Unified management – Prism Element (per cluster) and Prism Central (multi-cluster)


Traditional 3-Tier vs HCI Architecture

With HCI, the storage fabric is internal to the cluster. There is no external SAN, no Fibre Channel, no separate storage admin. The CVM (Controller VM) on each node manages local I/O and participates in the distributed storage fabric.


Nutanix HCI Architecture Components

AOS (Acropolis Operating System)

AOS is the software layer running on each node that provides:

  • Distributed storage fabric (NDFS) – pools local drives across nodes

  • Data redundancy – RF2 (two copies) or RF3 (three copies) of every block

  • Data services – inline deduplication, compression, erasure coding

  • CVM management – each node runs one CVM that handles all local I/O

CVM (Controller VM)

The CVM is an always-running virtual machine on each node. It:

  • Intercepts all I/O from VMs on that node

  • Coordinates with CVMs on other nodes for replication

  • Runs Cassandra (metadata), Stargate (I/O), Curator (MapReduce), and Chronos (scheduling) services

AHV Hypervisor

AHV (KVM-based) runs VMs on each node. The VMs communicate with storage through the CVM via iSCSI or NFS internally. AHV itself has no persistent storage – all state lives in NDFS.

Prism Element (PE)

Per-cluster management UI available at https://<cluster-VIP>:9440. Manages:

  • VMs, snapshots, and clones

  • Storage containers and pools

  • Network configuration and VLANs

  • Cluster health and alerts

Prism Central (PC)

Multi-cluster management layer. Once you deploy PC, you register clusters to it and get:

  • Cross-cluster VM inventory

  • Policy-based governance

  • Flow networking and microsegmentation

  • NCM Self-Service (Calm blueprints)


Planning Your HCI Cluster

Before touching hardware or software, plan these dimensions:

Cluster Size

Nodes    | Use Case
-------- | ---------------------------------------
1 node   | Development only (no HA, no RF2)
3 nodes  | Minimum for production with RF2
4+ nodes | RF3 or multi-failure tolerance
6+ nodes | Erasure coding eligible (2+1 or 4+1)

For my home lab, I run 3 nodes – the minimum for meaningful HA testing.

Replication Factor

  • RF2 – Every block written twice across two nodes. Can tolerate 1 node failure. Requires a minimum of 2 nodes (3 for production).

  • RF3 – Every block written three times. Can tolerate 2 node failures. Requires a minimum of 3 nodes (5 for full RF3 with CVM tolerance).

Workload Profile

Estimate total vCPU, RAM, and storage across all planned VMs. Add 20–30% headroom for CVM overhead and growth. A CVM typically consumes 4–8 vCPU and 12–24 GB RAM per node depending on your AOS version and enabled features.
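As a rough sizing sketch – the node count, raw capacity, and 25% headroom figure below are my own lab assumptions, not Nutanix guidance – the usable-capacity math under RF2 works out like this:

```shell
# Hypothetical sizing: 3 nodes with 2 TB raw flash each, RF2,
# planning to keep ~25% headroom for CVM overhead and growth.
NODES=3
RAW_GB_PER_NODE=2000
RF=2                                     # RF2: every extent is stored twice

RAW_TOTAL=$((NODES * RAW_GB_PER_NODE))   # total raw capacity in GB
USABLE=$((RAW_TOTAL / RF))               # logical capacity after replication
PLANNABLE=$((USABLE * 75 / 100))         # what I actually plan VMs against

echo "raw=${RAW_TOTAL}GB usable=${USABLE}GB plan-for=${PLANNABLE}GB"
```

In other words: three 2 TB nodes give 6 TB raw, but RF2 halves that to 3 TB, and headroom brings the realistic planning number down to roughly 2.25 TB.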


Hardware Requirements and Node Sizing

Nutanix Community Edition (CE) Requirements

CE is the free version for homelab and learning. Per-node requirements:

Component  | Minimum                   | Recommended
---------- | ------------------------- | -------------
CPU        | 8 cores (with VT-x/AMD-V) | 16+ cores
RAM        | 32 GB                     | 64 GB
Boot drive | 32 GB SSD (for AHV)       | 60 GB SSD
SSD tier   | 200 GB SSD                | 400 GB+ NVMe
HDD tier   | 500 GB HDD (optional)     | 1 TB+ HDD
NICs       | 1 GbE (2 recommended)     | 10 GbE

My Node Hardware

I run 3 nodes using Intel NUC 13 Pro units:

NUCs are not officially supported by Nutanix CE but work fine with hardware compatibility tweaks (disable Secure Boot, enable VT-d, configure NIC drivers via Foundation).


Networking Design for HCI

Nutanix nodes use multiple network roles. Even with a single physical NIC, you need to understand the traffic separation:

Traffic Types

Traffic Type        | Description                               | VLAN
------------------- | ----------------------------------------- | ------------------------
AHV Host Management | AHV management IP (acropolis)             | Management VLAN
CVM                 | Controller VM IP (storage fabric + Prism) | Storage/Management VLAN
VM Guest Traffic    | VM workload traffic                       | VM VLANs
IPMI/BMC            | Out-of-band management                    | OOB VLAN (if available)

Required IP Addresses (per 3-node cluster)

Resource           | Count
------------------ | -----------------
AHV host IPs       | 3 (one per node)
CVM IPs            | 3 (one per node)
Cluster Virtual IP | 1
Data Services IP   | 1
Prism Central VM   | 1
Total              | 9 minimum


Installing Nutanix AOS and Foundation

Nutanix uses a tool called Foundation to image and cluster nodes. Foundation is a standalone VM you run on your laptop or a temporary host.

Step 1 – Download Artifacts

From the Nutanix Portal:

  1. AOS Bundle – e.g., nutanix_installer_package-release-euphrates-6.8.1.tar.gz

  2. AHV Hypervisor – bundled inside AOS for CE

  3. Foundation VM – .ova or .qcow2 image

Step 2 – Deploy Foundation VM
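There is no single prescribed way to run Foundation; I imported the qcow2 image onto an existing Linux/KVM box. A sketch with virt-install – the disk path, bridge name, and resource sizing are assumptions from my environment:

```shell
# Hypothetical: import the downloaded Foundation disk image on a KVM host.
# The image path, bridge (br0), and sizing are my local choices.
virt-install \
  --name foundation \
  --memory 8192 --vcpus 4 \
  --disk path=/var/lib/libvirt/images/foundation.qcow2,bus=virtio \
  --import \
  --network bridge=br0,model=virtio \
  --os-variant generic \
  --noautoconsole
```

The Foundation VM just needs L2 reachability to the nodes' management network so discovery and imaging work.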

Step 3 – Boot Nodes to Phoenix

Foundation PXE-boots each node into a minimal imaging environment called Phoenix. To trigger this:

  1. Connect to each node's IPMI/iDRAC

  2. Mount the AHV ISO via virtual media

  3. Power on β€” node boots into Phoenix and is discovered by Foundation

Step 4 – Run Foundation

In the Foundation web UI:

  1. Discover Nodes – Foundation scans the subnet and finds Phoenix-booted nodes

  2. Assign Node Details – set AHV host IP, CVM IP, hostname per node

  3. Cluster Settings – set cluster name, cluster VIP, Data Services VIP, DNS, NTP

  4. Select AOS Bundle – point Foundation to the downloaded AOS .tar.gz

  5. Start – Foundation images all three nodes (takes 45–90 minutes)


Cluster Formation and Configuration

After Foundation completes, log into Prism Element:

Initial Setup Checklist
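Mine boils down to pointing the cluster at DNS and NTP before anything else. From any CVM with ncli – the server addresses below are placeholders for my lab:

```shell
# Placeholder addresses -- substitute your own DNS/NTP infrastructure.
ncli cluster add-to-name-servers servers="10.0.0.53"
ncli cluster add-to-ntp-servers servers="10.0.0.123"

# Confirm what the cluster is actually using:
ncli cluster get-name-servers
ncli cluster get-ntp-servers
```

Bad or missing NTP is a classic source of Cassandra and alerting weirdness later, so I verify it before creating any VMs.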

Verify CVM Services

SSH into any CVM (default: nutanix@<cvm-ip>, password: nutanix/4u) and run:
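A few standard checks – these utilities ship on every CVM:

```shell
cluster status      # per-CVM service states; everything should report UP
svmips              # CVM IPs across the cluster
hostips             # AHV host IPs
ncli cluster info   # cluster name, VIP, and redundancy factor
```

If any service shows DOWN on any node, resolve that before moving on to storage or VM configuration.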


Storage Configuration – Storage Pools, Containers, and vDisks

Storage Pool

A Storage Pool is the total physical storage contribution from all nodes. In most deployments, you have a single storage pool.

Storage Container

A Container is a logical grouping within the storage pool. It is analogous to a datastore in VMware. Configure one container per use case:
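For example, I keep separate containers for general VMs and for images. Created from a CVM with ncli – the container names are my own conventions, and sp-name must match your storage pool (usually the single default pool):

```shell
# Names and the pool name here are assumptions from my lab.
ncli container create name=vmstore sp-name=default-storage-pool
ncli container create name=images  sp-name=default-storage-pool
ncli container list
# Compression, dedup, and erasure coding are toggled per container
# in Prism (Storage -> Containers) after creation.
```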

vDisk

vDisks are virtual disks created within a container. They are created automatically when you create a VM disk. Each vDisk is backed by extents that are RF2/RF3 replicated across node local drives.


Network Configuration – VLANs and VM Networks

In Prism, VM networks map to AHV bridges (virtual switches):

IPAM is optional but useful for lab environments where you want Prism to issue IPs via its own DHCP.
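The same setup is scriptable from a CVM with acli. The VLAN IDs, gateway/prefix, and DHCP pool range below are lab assumptions:

```shell
# Plain VLAN-backed network (IPs handled by my external DHCP):
acli net.create vlan100 vlan=100

# IPAM-enabled network where Prism hands out addresses itself:
acli net.create lab-ipam vlan=200 ip_config=10.0.200.1/24
acli net.add_dhcp_pool lab-ipam start=10.0.200.50 end=10.0.200.99

acli net.list
```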


Deploying Your First Workload

Create a VM from ISO
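Prism's New VM dialog covers this, but the same flow works from any CVM with acli. Everything below – the VM name, sizing, the ubuntu.iso image, the vlan100 network, and the vmstore container – is a placeholder from my lab, and it assumes the image and network already exist:

```shell
# Placeholder names and sizes throughout.
acli vm.create ubuntu01 memory=4G num_vcpus=2
acli vm.disk_create ubuntu01 create_size=50G container=vmstore
acli vm.disk_create ubuntu01 cdrom=true clone_from_image=ubuntu.iso
acli vm.nic_create ubuntu01 network=vlan100
acli vm.on ubuntu01
```

From there, open the VM console in Prism and run the OS installer as usual.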

Upload Images via Image Service

Images are stored in the container you specify and are available to all nodes in the cluster.
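Uploading can also be done from a CVM with acli. The source URL, image name, and container name are placeholders for my lab web server:

```shell
# Placeholder URL and names -- point source_url at wherever your ISOs live.
acli image.create ubuntu.iso \
  source_url=http://10.0.0.10/isos/ubuntu-24.04.iso \
  container=images image_type=kIsoImage

acli image.list
```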

Verify VM Placement and Storage
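A quick CVM-side sanity check – the VM name is a placeholder, and the exact output fields vary a little between AOS versions:

```shell
# Full VM spec: which AHV host it landed on, its disks, and its NICs.
acli vm.get ubuntu01

# Cluster-wide view of all VMs and their power states:
acli vm.list
```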


Scaling the Cluster – Add a Node

One of the strongest HCI capabilities is non-disruptive scale-out. Adding a fourth node expands both compute and storage without any downtime.

Process Overview

Via Prism

Via CLI (acli)

Data migration is automatic. Curator runs a background MapReduce job to rebalance extents across nodes. You can monitor progress:
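Two CVM-side views I use – both are standard AOS tooling, though output formats vary by version:

```shell
# Task-level progress for the expansion and rebalance:
progress_monitor_cli --fetchall

# Curator master diagnostics page (port 2010) shows scan and
# rebalance jobs; links is a text browser preinstalled on the CVM.
links http://localhost:2010
```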


Monitoring and Health Checks

NCC (Nutanix Cluster Check)

NCC is Nutanix's built-in health framework. Run it after initial setup and periodically:
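The basic invocation is below; module names shift a little between NCC versions, so treat the targeted example as illustrative:

```shell
# Full sweep -- this can take several minutes on small hardware:
ncc health_checks run_all

# Targeted modules also work, e.g. network-only checks:
ncc health_checks network_checks run_all
```

The summary at the end lists every check as PASS, INFO, WARN, or FAIL; chase down anything WARN or worse.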

Key Prism Dashboards

Dashboard | What to Watch
--------- | ------------------------------------------------------------------------
Home      | Cluster-level CPU/RAM/storage usage, IOPS, throughput, latency
Hardware  | Node and disk health, SSD/HDD tiering, drive failures
Storage   | Savings (dedup/compression ratio), container usage, vDisk distribution
Alerts    | Critical and warning alerts from NCC and platform monitors

Stargate (I/O) Metrics to Monitor
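Prism graphs the headline numbers, but the raw counters – per-vDisk latency, IOPS, and cache hit rates – live on each Stargate's diagnostics page, which listens on port 2009 of the CVM:

```shell
# From the CVM itself (the 2009 page may be firewalled from outside):
links http://localhost:2009

# Scripted liveness check of the page:
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:2009
```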

Key Alerts to Configure

In Prism → Alerts → Alert Policies, ensure you have email or webhook notifications for:

  • Node down

  • CVM not reachable

  • Disk failure or predicted failure

  • Storage capacity > 75%

  • Cluster not fault tolerant (RF2 degraded)


My Honest Take on Nutanix HCI

After running HCI with Nutanix CE on my three-node NUC cluster for over a year, here's what I actually think:

What HCI Gets Right

Single management plane is the biggest win. I log into Prism for everything – VM operations, storage checks, network config, and alerting. There are no separate admin tools.

Scale-out without drama works as advertised. Adding a node is genuinely non-disruptive. The data rebalancing happens in the background, VMs keep running, and by the next morning the storage distribution is balanced.

RF2 resilience has saved me twice – once when an NVMe drive failed silently (Prism alerted me), once when I powered down a node mid-test without thinking. Both times, VMs kept running on the remaining nodes with no data loss.

Where It Gets Complicated

CVM overhead is non-trivial. On a 64 GB node, the CVM consumes about 12 GB RAM and 8 cores under load. For a workstation homelab budget, that matters. You are paying a real resource tax for the HCI abstraction.

AHV live migration requires compatible CPU generations across nodes. I ran into this when I mixed a NUC 12 node into my NUC 13 cluster: live migration (the vMotion equivalent) failed between the mismatched nodes until I pinned VMs to specific hosts or used CPU masking. Keep hardware generations consistent when possible.

Community Edition is not production-supported. CE is great for learning but does not have Nutanix support SLAs behind it. If you plan to use Nutanix in production, budget for proper NX hardware or a certified HCI partner appliance.

When Nutanix HCI Makes Sense

Scenario                                 | Recommendation
---------------------------------------- | -------------------------------------------------------------------
Small private cloud (3–20 nodes)         | Strong fit – Nutanix HCI excels here
Edge deployments (single-node or 2-node) | Use Nutanix HCI with RF1 or ROBO config
Large-scale public cloud replacement     | Review cost carefully – pure HCI may lose to cloud-native at scale
Learning/homelab                         | CE is ideal – fast to set up, realistic feature set
VMware migration                         | Excellent path – Move tool and AHV compatibility are mature


Tags: Nutanix, HCI, Hyper-Converged Infrastructure, AOS, AHV, Prism, NDFS, Foundation, CVM, home-lab
