Kubernetes Cluster API (CAPI)

Introduction

I manage multiple Kubernetes clusters: a local development cluster, staging on AWS, production on Azure. Before Cluster API (CAPI), each cluster was provisioned with a different tool and a different workflow: eksctl for EKS, az aks create for AKS, kind for local. Updating them involved different CLI flags, different templates, different processes. Cluster API changed all of that.

Cluster API is a Kubernetes sub-project that lets you declaratively manage entire cluster lifecycles (create, scale, upgrade, and delete cluster infrastructure) using the same kubectl apply workflow you already know. You manage clusters the same way you manage deployments.

This article covers CAPI v1.12 (the current release at time of writing).


Core Concept: Management Cluster

The central CAPI concept is the management cluster: a Kubernetes cluster that runs CAPI controllers and stores cluster manifests as CRDs. The management cluster provisions and manages workload clusters (the clusters where your applications actually run).

┌──────────────────────────────────────────────────┐
│             Management Cluster                   │
│                                                  │
│  ┌──────────────────────────────────────────────┐│
│  │  CAPI Controllers (capi-controller-manager)  ││
│  └──────────────────────────────────────────────┘│
│                                                  │
│  Cluster CRD ────────────────────────────────────┼─────► Workload Cluster A (AWS)
│  MachineDeployment CRD ──────────────────────────┼─────► Workload Cluster B (Azure)
│  KubeadmControlPlane CRD ────────────────────────┼─────► Workload Cluster C (vSphere)
│                                                  │
└──────────────────────────────────────────────────┘

One management cluster can manage tens or hundreds of workload clusters.


CAPI CRD Overview

The core CRDs live in API group cluster.x-k8s.io/v1beta1; bootstrap, control-plane, and infrastructure CRDs use the sibling groups bootstrap.cluster.x-k8s.io, controlplane.cluster.x-k8s.io, and infrastructure.cluster.x-k8s.io:

CRD                  Purpose

Cluster              Top-level cluster definition; links control plane + infrastructure
Machine              Single node declaration; immutable, replaced not updated
MachineSet           Maintains a stable set of Machines (like ReplicaSet)
MachineDeployment    Rolling update controller for Machines (like Deployment)
MachinePool          Provider-managed group of machines (e.g., AWS ASG, Azure VMSS)
MachineHealthCheck   Auto-remediation for unhealthy nodes
ClusterClass         Reusable cluster topology template
MachineClass         Deprecated; use ClusterClass topology + variables instead

Infrastructure-provider CRDs (provider-specific, all in the infrastructure.cluster.x-k8s.io group):

  • AWSCluster, AWSMachine, AWSMachineTemplate (from CAPA)

  • AzureCluster, AzureMachine (from CAPZ)

  • VSphereCluster, VSphereMachine (from CAPV)

  • DockerCluster, DockerMachine (from CAPD, for local development)

Bootstrap-provider CRDs:

  • KubeadmConfig, KubeadmConfigTemplate (from CABPK)

Control-plane-provider CRDs:

  • KubeadmControlPlane (from KCP)


Setting Up the Management Cluster

Prerequisites
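You need Docker (for kind), kubectl, kind, clusterctl, and, for AWS later on, clusterawsadm. A sketch of installing clusterctl on Linux; the pinned version here is illustrative, so check the CAPI releases page for the current one:

```shell
# Download the clusterctl binary (version is illustrative)
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.12.0/clusterctl-linux-amd64 -o clusterctl

# Make it executable and put it on the PATH
chmod +x clusterctl && sudo mv clusterctl /usr/local/bin/

# Sanity check
clusterctl version
```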

Bootstrap with kind (Local Development)
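A minimal bootstrap sketch: create a throwaway kind cluster and initialize it as the management cluster with the Docker provider (CAPD). The cluster name is illustrative:

```shell
# Create a local management cluster
kind create cluster --name capi-mgmt

# Enable the ClusterClass feature gate before initializing
export CLUSTER_TOPOLOGY=true

# Install the core CAPI controllers plus the Docker infrastructure provider
clusterctl init --infrastructure docker
```

clusterctl init installs the core, kubeadm bootstrap, and kubeadm control-plane providers automatically; only the infrastructure provider has to be named.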

Initialize with AWS Provider (CAPA)
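For AWS, CAPA needs IAM resources and credentials before clusterctl init. A sketch, assuming AWS credentials are already configured in the environment (region is illustrative):

```shell
export AWS_REGION=us-east-1   # illustrative region

# One-time: create the IAM roles/policies CAPA needs (via CloudFormation)
clusterawsadm bootstrap iam create-cloudformation-stack

# Encode local AWS credentials for the provider's bootstrap secret
export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)

# Install the AWS infrastructure provider into the management cluster
clusterctl init --infrastructure aws
```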


Creating a Workload Cluster

Using clusterctl generate cluster (quick start)
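A sketch of the quick-start flow; the cluster name and Kubernetes version are illustrative:

```shell
# Render a complete cluster manifest from the provider's default template
clusterctl generate cluster prod-cluster \
  --infrastructure aws \
  --kubernetes-version v1.29.0 \
  --control-plane-machine-count 3 \
  --worker-machine-count 3 \
  > prod-cluster.yaml

# Provisioning begins as soon as the manifests land in the management cluster
kubectl apply -f prod-cluster.yaml
```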

The Cluster CRD

The generated YAML contains a Cluster resource linking infrastructure and control plane:
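A representative Cluster resource (names and the pod CIDR are illustrative):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: prod-cluster
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  # Who manages the control plane nodes
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: prod-cluster-control-plane
  # Provider-specific infrastructure (VPC, subnets, load balancer, ...)
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: prod-cluster
```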

KubeadmControlPlane

KubeadmControlPlane manages the control plane nodes as a unit: it handles initial bootstrap, upgrades, and replacement of failed control plane nodes.
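A trimmed example (names and version are illustrative):

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: prod-cluster-control-plane
spec:
  replicas: 3               # odd number for etcd quorum
  version: v1.29.0          # bump this field to trigger an upgrade
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: AWSMachineTemplate
      name: prod-cluster-control-plane
  kubeadmConfigSpec:
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: external
```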

MachineDeployment (Worker Nodes)

MachineDeployment manages worker nodes with the same rolling update model as a Deployment manages pods.
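A representative MachineDeployment (names are illustrative):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: prod-cluster-md-0
spec:
  clusterName: prod-cluster
  replicas: 3
  selector:
    matchLabels: {}
  template:
    spec:
      clusterName: prod-cluster
      version: v1.29.0      # worker Kubernetes version
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: prod-cluster-md-0
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AWSMachineTemplate
        name: prod-cluster-md-0
```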

KubeadmConfigTemplate (Worker Bootstrap)
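KubeadmConfigTemplate stamps out the kubeadm join configuration for each worker Machine. A minimal sketch:

```yaml
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: prod-cluster-md-0
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            cloud-provider: external
```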

Machine: The Immutable Node Unit

Individual Machine objects are created by MachineSet (managed by MachineDeployment). They are immutable: if you need to change the machine type or OS image, you update the MachineDeployment template, and CAPI creates new Machines then deletes the old ones.


MachineHealthCheck (Auto-Remediation)

MachineHealthCheck monitors node health and replaces unhealthy machines automatically.

The maxUnhealthy safeguard is critical: without it, CAPI could attempt to replace the entire cluster if a network partition makes all nodes appear unhealthy.
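A sketch of a health check for the worker pool (names, timeouts, and the 40% threshold are illustrative):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: prod-cluster-workers
spec:
  clusterName: prod-cluster
  maxUnhealthy: 40%           # pause remediation if too many nodes look bad
  nodeStartupTimeout: 10m     # how long a new node may take to become Ready
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: prod-cluster-md-0
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 300s
    - type: Ready
      status: Unknown
      timeout: 300s
```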


MachinePool (Cloud-Native Autoscaling)

MachinePool maps to the cloud provider's native autoscaling group (AWS ASG, Azure VMSS). The cloud provider manages the actual machine instances.
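A sketch using CAPA's AWSMachinePool (backed by an ASG); names are illustrative:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: prod-cluster-mp-0
spec:
  clusterName: prod-cluster
  replicas: 3
  template:
    spec:
      clusterName: prod-cluster
      version: v1.29.0
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: prod-cluster-mp-0
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AWSMachinePool    # maps to an AWS Auto Scaling Group
        name: prod-cluster-mp-0
```

Note that MachinePool is an experimental API; clusterctl enables it via the EXP_MACHINE_POOL feature gate.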


ClusterClass (Topology API)

ClusterClass is the recommended way to create clusters at scale; it defines a reusable cluster topology with variables, similar to Helm values.
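A trimmed ClusterClass sketch; the class name, template names, and the region variable are all illustrative:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: aws-standard
spec:
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: aws-standard-control-plane
    machineInfrastructure:
      ref:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AWSMachineTemplate
        name: aws-standard-control-plane
  infrastructure:
    ref:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: AWSClusterTemplate
      name: aws-standard
  workers:
    machineDeployments:
      - class: default-worker
        template:
          bootstrap:
            ref:
              apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
              kind: KubeadmConfigTemplate
              name: aws-standard-worker
          infrastructure:
            ref:
              apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
              kind: AWSMachineTemplate
              name: aws-standard-worker
  variables:
    - name: region
      required: true
      schema:
        openAPIV3Schema:
          type: string
          default: us-east-1
```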

Create a cluster from the ClusterClass:
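Assuming a ClusterClass named aws-standard with a default-worker class and a region variable (all illustrative), a per-team cluster shrinks to a topology stanza:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: team-a-cluster
spec:
  topology:
    class: aws-standard        # the ClusterClass to instantiate
    version: v1.29.0
    controlPlane:
      replicas: 3
    workers:
      machineDeployments:
        - class: default-worker
          name: md-0
          replicas: 3
    variables:
      - name: region
        value: us-east-1
```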


Cluster Operations

Get kubeconfig for a Workload Cluster
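clusterctl retrieves the kubeconfig secret the bootstrap process created (cluster name is illustrative):

```shell
# Write the workload cluster's kubeconfig to a local file
clusterctl get kubeconfig prod-cluster > prod-cluster.kubeconfig

# Talk to the workload cluster
kubectl --kubeconfig prod-cluster.kubeconfig get nodes
```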

Inspect Cluster Status
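Two useful views from the management cluster:

```shell
# Tree view of the cluster and everything it owns, with readiness conditions
clusterctl describe cluster prod-cluster

# Raw objects and their phases
kubectl get cluster,kubeadmcontrolplane,machinedeployment,machines
```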

Scale Worker Nodes
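MachineDeployment supports the scale subresource, so scaling looks exactly like scaling a Deployment (object name is illustrative):

```shell
kubectl scale machinedeployment prod-cluster-md-0 --replicas=5
```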

Upgrade a Cluster

Update spec.version in KubeadmControlPlane and MachineDeployment; CAPI handles the rest:
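A sketch of the two patches (object names and versions are illustrative). Patch the control plane first, per the Kubernetes version skew policy:

```shell
# 1. Upgrade the control plane
kubectl patch kubeadmcontrolplane prod-cluster-control-plane \
  --type merge -p '{"spec":{"version":"v1.30.0"}}'

# 2. Once the control plane is done, upgrade the workers
kubectl patch machinedeployment prod-cluster-md-0 \
  --type merge -p '{"spec":{"template":{"spec":{"version":"v1.30.0"}}}}'
```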

The MachineDeployment rolling update replaces worker nodes one at a time; with maxUnavailable: 0, each replacement node joins before an old node is drained.

Delete a Cluster

CAPI cascades the deletion: it removes Machines, Secrets, and cloud infrastructure in the correct order.
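A single delete on the Cluster object is all it takes (name is illustrative):

```shell
# The Cluster object's finalizers guarantee ordered teardown of cloud resources
kubectl delete cluster prod-cluster
```

Delete the Cluster object itself rather than its namespace; deleting the namespace directly can strand cloud infrastructure.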


Infrastructure Providers Reference

Provider         Name   API Import Path

AWS              CAPA   sigs.k8s.io/cluster-api-provider-aws
Azure            CAPZ   sigs.k8s.io/cluster-api-provider-azure
GCP              CAPG   sigs.k8s.io/cluster-api-provider-gcp
vSphere          CAPV   sigs.k8s.io/cluster-api-provider-vsphere
Docker (local)   CAPD   sigs.k8s.io/cluster-api/test/infrastructure/docker
OpenStack        CAPO   sigs.k8s.io/cluster-api-provider-openstack

Initialize a Production AWS Management Cluster
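For production, the management cluster should not be a laptop kind cluster. One approach, sketched here with illustrative names, is to use a durable managed cluster (e.g., EKS) as the management cluster:

```shell
# A durable management cluster (name and region are illustrative)
eksctl create cluster --name capi-mgmt --region us-east-1

# Credentials for CAPA, as in the bootstrap section
export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)

# Enable feature gates needed later (ClusterClass, MachinePool)
export CLUSTER_TOPOLOGY=true
export EXP_MACHINE_POOL=true

clusterctl init --infrastructure aws
```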


Deploying the Go Microservice to a CAPI Workload Cluster

Once kubectl get nodes shows healthy nodes on the workload cluster:
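The microservice itself comes from an earlier article in this series; the manifest below is a generic sketch with a hypothetical image name, applied against the workload cluster's kubeconfig:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-microservice
spec:
  replicas: 3
  selector:
    matchLabels:
      app: go-microservice
  template:
    metadata:
      labels:
        app: go-microservice
    spec:
      containers:
        - name: app
          image: registry.example.com/go-microservice:1.0.0  # hypothetical image
          ports:
            - containerPort: 8080
```

Apply it with kubectl --kubeconfig prod-cluster.kubeconfig apply -f deployment.yaml; from here on, the workload cluster behaves like any other Kubernetes cluster.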


What I Learned

  • Immutable Machines change how you think about upgrades: instead of kubectl edit, you update the template and CAPI replaces machines one at a time. This is safer and more auditable.

  • ClusterClass is worth the initial setup cost. Once I defined the class, spinning up a new cluster for a new team takes less than 10 minutes with a 5-line YAML file.

  • MachineHealthCheck maxUnhealthy is a safety valve; always set it. A network partition that makes 80% of your nodes look unhealthy should pause remediation, not trigger a mass replacement.

  • Always upgrade the control plane before workers. CAPI does not enforce this ordering automatically, but the Kubernetes version skew policy requires it.

  • The CAPI management cluster itself needs to be production-grade: backing it up, running its etcd highly available, and having a bootstrap procedure for disaster recovery is important infrastructure work.


Next Steps


Further Reading
