Testing Operators with envtest

Introduction

Testing a Kubernetes controller against a mock is almost useless: the interesting behavior is in how the controller interacts with the Kubernetes API. You need real watch semantics, real owner references, and real status subresource behavior.

envtest gives you exactly that: real kube-apiserver and etcd binaries running locally, no cluster needed. Your tests run the actual controller against the actual API machinery. The one gap is that envtest does not run kube-controller-manager, so built-in controllers such as the garbage collector never act; you can assert that owner references are set, but not that owned objects get collected. This is what the controller-runtime team uses to test controller-runtime itself.

This article covers the test structure from appstack-operator, including both controller integration tests (Ginkgo/Gomega) and pure unit tests for the build functions.


What Is envtest?

envtest (from sigs.k8s.io/controller-runtime/pkg/envtest) runs local kube-apiserver and etcd binaries. Your test process:

  1. Starts a real API server and etcd

  2. Registers your CRDs

  3. Runs your controllers against it

  4. Cleans up when tests finish

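The binaries are managed by setup-envtest, a tool from the controller-runtime project. It downloads a kube-apiserver and etcd for the Kubernetes version you ask for and prints the directory they live in, which envtest reads from the KUBEBUILDER_ASSETS environment variable.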

The Makefile generated by kubebuilder handles this automatically when you run make test.


Setting Up the Test Suite

kubebuilder generates internal/controller/suite_test.go. The scaffold is minimal; a complete setup for appstack-operator also starts a manager and registers the reconciler with it.
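A minimal sketch of such a suite file follows. It assumes the kubebuilder v4 directory layout, a reconciler type named AppStackReconciler with Client and Scheme fields, and an API package at example.com/appstack-operator/api/v1alpha1. Those names are placeholders for illustration and are not taken from the actual repository.

```go
package controller

import (
	"context"
	"path/filepath"
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	"k8s.io/client-go/kubernetes/scheme"
	"k8s.io/client-go/rest"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/envtest"
	logf "sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"

	// Hypothetical module path; replace with your project's API package.
	appsv1alpha1 "example.com/appstack-operator/api/v1alpha1"
)

var (
	cfg       *rest.Config
	k8sClient client.Client
	testEnv   *envtest.Environment
	ctx       context.Context
	cancel    context.CancelFunc
)

// TestControllers is the single `go test` entry point that runs every spec.
func TestControllers(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "Controller Suite")
}

var _ = BeforeSuite(func() {
	logf.SetLogger(zap.New(zap.WriteTo(GinkgoWriter), zap.UseDevMode(true)))
	ctx, cancel = context.WithCancel(context.Background())

	// Point envtest at the CRD YAML produced by `make manifests`.
	testEnv = &envtest.Environment{
		CRDDirectoryPaths:     []string{filepath.Join("..", "..", "config", "crd", "bases")},
		ErrorIfCRDPathMissing: true,
	}

	// Start launches kube-apiserver and etcd and returns a *rest.Config.
	var err error
	cfg, err = testEnv.Start()
	Expect(err).NotTo(HaveOccurred())
	Expect(cfg).NotTo(BeNil())

	Expect(appsv1alpha1.AddToScheme(scheme.Scheme)).To(Succeed())

	// Direct (uncached) client used by the specs themselves.
	k8sClient, err = client.New(cfg, client.Options{Scheme: scheme.Scheme})
	Expect(err).NotTo(HaveOccurred())

	mgr, err := ctrl.NewManager(cfg, ctrl.Options{Scheme: scheme.Scheme})
	Expect(err).NotTo(HaveOccurred())

	// AppStackReconciler and its fields are assumed names.
	Expect((&AppStackReconciler{
		Client: mgr.GetClient(),
		Scheme: mgr.GetScheme(),
	}).SetupWithManager(mgr)).To(Succeed())

	// The manager blocks until ctx is cancelled, so run it in the background.
	go func() {
		defer GinkgoRecover()
		Expect(mgr.Start(ctx)).To(Succeed())
	}()
})

var _ = AfterSuite(func() {
	cancel() // stop the manager first
	Expect(testEnv.Stop()).To(Succeed())
})
```

Cancelling the context before calling testEnv.Stop() lets the manager shut down its watches while the API server is still reachable.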

Key points:

  • testEnv.Start() launches kube-apiserver and etcd and returns a *rest.Config

  • CRDDirectoryPaths points at your generated CRD YAML (the output of make manifests)

  • The manager runs in a goroutine for the test suite's lifetime

  • cancel() in AfterSuite stops the manager cleanly


Writing Controller Integration Tests

Integration tests live in internal/controller/appstack_controller_test.go.
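As an illustration, a spec in that file might look like the sketch below. It relies on the k8sClient and ctx variables from the suite file. The AppStack type, its Image field, the module path, and the assumption that the owned Deployment shares the AppStack's name are all hypothetical.

```go
package controller

import (
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"

	// Hypothetical module path; replace with your project's API package.
	appsv1alpha1 "example.com/appstack-operator/api/v1alpha1"
)

var _ = Describe("AppStack controller", func() {
	It("creates an owned Deployment for a new AppStack", func() {
		stack := &appsv1alpha1.AppStack{
			ObjectMeta: metav1.ObjectMeta{Name: "demo", Namespace: "default"},
			// Image is an assumed spec field.
			Spec: appsv1alpha1.AppStackSpec{Image: "nginx:1.27"},
		}
		Expect(k8sClient.Create(ctx, stack)).To(Succeed())

		// Assumes the controller names the Deployment after the AppStack.
		key := types.NamespacedName{Name: "demo", Namespace: "default"}
		deploy := &appsv1.Deployment{}

		// The reconcile happens asynchronously, so poll until the object exists.
		Eventually(func() error {
			return k8sClient.Get(ctx, key, deploy)
		}, 10*time.Second, 100*time.Millisecond).Should(Succeed())

		// The owner reference is what ties the Deployment to the AppStack.
		Expect(deploy.OwnerReferences).To(HaveLen(1))
		Expect(deploy.OwnerReferences[0].Kind).To(Equal("AppStack"))
	})
})
```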


Testing Reconcile Paths

Why Eventually Instead of Direct Assertions

Tests against a controller are asynchronous. After k8sClient.Create(), the controller's reconcile loop runs independently, so an assertion made immediately after the create races the controller and usually fails.

A 10-second timeout with a 100ms polling interval is a common choice for Eventually. A controller running locally should reconcile within milliseconds; the 10-second window absorbs slow CI environments.

Testing Deletion

Testing Self-Healing

An operator that doesn't re-create deleted resources is just declarative junk. Test it by deleting an owned resource out from under the controller and asserting that it comes back.
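One way to write that spec is sketched below. It uses k8sClient and ctx from the suite file and assumes the reconciler watches the Deployments it owns (for example through Owns(&appsv1.Deployment{})) and names the Deployment after the AppStack. The AppStack type, its Image field, and the module path are hypothetical.

```go
package controller

import (
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"

	// Hypothetical module path; replace with your project's API package.
	appsv1alpha1 "example.com/appstack-operator/api/v1alpha1"
)

var _ = Describe("AppStack self-healing", func() {
	const (
		timeout  = 10 * time.Second
		interval = 100 * time.Millisecond
	)

	It("recreates a Deployment that was deleted externally", func() {
		stack := &appsv1alpha1.AppStack{
			ObjectMeta: metav1.ObjectMeta{Name: "heal-demo", Namespace: "default"},
			Spec:       appsv1alpha1.AppStackSpec{Image: "nginx:1.27"}, // assumed field
		}
		Expect(k8sClient.Create(ctx, stack)).To(Succeed())

		key := types.NamespacedName{Name: "heal-demo", Namespace: "default"}

		// Wait for the first Deployment and remember its UID.
		original := &appsv1.Deployment{}
		Eventually(func() error {
			return k8sClient.Get(ctx, key, original)
		}, timeout, interval).Should(Succeed())
		originalUID := original.UID

		// Simulate someone deleting the owned resource by hand.
		Expect(k8sClient.Delete(ctx, original)).To(Succeed())

		// A new object with a different UID proves it was recreated,
		// not merely still present.
		Eventually(func(g Gomega) {
			recreated := &appsv1.Deployment{}
			g.Expect(k8sClient.Get(ctx, key, recreated)).To(Succeed())
			g.Expect(recreated.UID).NotTo(Equal(originalUID))
		}, timeout, interval).Should(Succeed())
	})
})
```

Comparing UIDs matters here: checking only that the Deployment exists could pass against the original object if the poll ran before the delete took effect.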


Unit Testing the Build Functions

The buildDeployment, buildService, and buildHPA functions are pure functions: given an AppStack, they return a Kubernetes resource. That makes them ideal for table-driven unit tests.

For build functions to be testable from the _test package, export them (or test from the same package). I use the convention of exporting build functions and keeping reconcile logic unexported.
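The table-driven shape can be sketched without any Kubernetes dependencies. In the example below, AppStackSpec and Deployment are simplified, hypothetical stand-ins for the real CRD spec and appsv1.Deployment, and the default of one replica is an assumption. In a real _test.go file the loop would live in a func TestBuildDeployment(t *testing.T) and use t.Run per case.

```go
package main

import "fmt"

// AppStackSpec is a hypothetical, trimmed-down spec for illustration.
type AppStackSpec struct {
	Name     string
	Image    string
	Replicas *int32 // nil means "not set by the user"
}

// Deployment is a stand-in for appsv1.Deployment.
type Deployment struct {
	Name     string
	Image    string
	Replicas int32
	Labels   map[string]string
}

// BuildDeployment is pure: no API calls, same input gives same output.
func BuildDeployment(spec AppStackSpec) Deployment {
	replicas := int32(1) // assumed default when the field is unset
	if spec.Replicas != nil {
		replicas = *spec.Replicas
	}
	return Deployment{
		Name:     spec.Name,
		Image:    spec.Image,
		Replicas: replicas,
		Labels:   map[string]string{"app": spec.Name},
	}
}

func int32Ptr(v int32) *int32 { return &v }

// runCases returns one message per failing case; empty means all passed.
func runCases() []string {
	cases := []struct {
		name         string
		spec         AppStackSpec
		wantReplicas int32
	}{
		{"defaults to one replica", AppStackSpec{Name: "web", Image: "nginx:1.27"}, 1},
		{"honors explicit replicas", AppStackSpec{Name: "web", Image: "nginx:1.27", Replicas: int32Ptr(3)}, 3},
		{"allows scale to zero", AppStackSpec{Name: "web", Image: "nginx:1.27", Replicas: int32Ptr(0)}, 0},
	}

	var failures []string
	for _, tc := range cases {
		got := BuildDeployment(tc.spec)
		if got.Replicas != tc.wantReplicas {
			failures = append(failures,
				fmt.Sprintf("%s: replicas = %d, want %d", tc.name, got.Replicas, tc.wantReplicas))
		}
		if got.Labels["app"] != tc.spec.Name {
			failures = append(failures,
				fmt.Sprintf("%s: app label = %q, want %q", tc.name, got.Labels["app"], tc.spec.Name))
		}
	}
	return failures
}

func main() {
	failures := runCases()
	for _, f := range failures {
		fmt.Println("FAIL:", f)
	}
	if len(failures) == 0 {
		fmt.Println("all cases passed")
	}
}
```

The zero-replica case is worth keeping in the table because a nil check and a zero check are easy to confuse when defaulting pointer fields.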


Running Tests in CI

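The generated Makefile includes a test target, so CI only needs to run make test. In GitHub Actions, a job that checks out the repository, installs Go, and runs make test is enough.

The setup-envtest tool downloads the API server binaries on first run. They're cached between CI runs if you cache the $(LOCALBIN) directory (bin/ in the project root by default).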


What I Test and What I Don't

I test:

  • Happy path: create CR → owned resources are created with the correct spec

  • Update path: change spec → owned resources are updated

  • Deletion: CR deleted → finalizer removed

  • Self-healing: owned resource deleted externally → recreated

  • HPA toggling: disable autoscaling after enabling → HPA is removed

  • Status conditions: correct phase and condition values after creation

I don't test:

  • Kubernetes API behavior (it's tested by upstream)

  • Behavior of the Deployment itself (Kubernetes handles that)

  • make manifests output (it's generated code)

  • Log output (implementation detail, not behavior)

The rule I follow: if deleting the test would leave a behavioral path untested that could fail in production, the test should exist.


Next: RBAC, Deployment, and Production Hardening →
