Testing Operators with envtest
Introduction
Testing a Kubernetes controller against a mock is almost useless: the interesting behavior is in how the controller interacts with the Kubernetes API. You need real watch semantics, real owner references, and real status subresource behavior.
envtest gives you that: a real kube-apiserver and etcd binary running locally, no cluster needed. Your tests run the actual controller against the actual API machinery. This is what the controller-runtime team uses to test controller-runtime itself. One caveat: envtest does not start kube-controller-manager, so built-in controllers such as the garbage collector never run. You can assert that owner references are set correctly, but owned objects are not deleted for you when their owner goes away.
This article covers the test structure from appstack-operator, including both controller integration tests (Ginkgo/Gomega) and pure unit tests for the build functions.
What Is envtest?
envtest (from sigs.k8s.io/controller-runtime/pkg/envtest) runs local kube-apiserver and etcd binaries. Your test process:
Starts a real API server and etcd
Registers your CRDs
Runs your controllers against it
Cleans up when tests finish
The binaries are managed by setup-envtest, a tool from the controller-runtime project that downloads the kube-apiserver and etcd binaries for a chosen Kubernetes version.
The Makefile generated by kubebuilder handles this automatically when you run make test.
Setting Up the Test Suite
kubebuilder generates internal/controller/suite_test.go. The scaffold is minimal; here's what a complete setup looks like for appstack-operator:
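A minimal sketch of such a suite follows. It is an illustration under stated assumptions, not the project's actual file: the module path `example.com/appstack-operator`, the `v1alpha1` API version, and the `AppStackReconciler` type with `Client` and `Scheme` fields are all hypothetical names. The envtest, manager, and Ginkgo calls are the standard controller-runtime ones.

```go
package controller

import (
	"context"
	"path/filepath"
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	"k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/envtest"
	logf "sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"

	// Hypothetical import path; substitute your operator's API package.
	appstackv1alpha1 "example.com/appstack-operator/api/v1alpha1"
)

var (
	testEnv   *envtest.Environment
	k8sClient client.Client
	ctx       context.Context
	cancel    context.CancelFunc
)

// TestControllers is the single `go test` entry point that hands over to Ginkgo.
func TestControllers(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "Controller Suite")
}

var _ = BeforeSuite(func() {
	logf.SetLogger(zap.New(zap.WriteTo(GinkgoWriter), zap.UseDevMode(true)))
	ctx, cancel = context.WithCancel(context.Background())

	// Point envtest at the CRD YAML written by `make manifests`.
	testEnv = &envtest.Environment{
		CRDDirectoryPaths:     []string{filepath.Join("..", "..", "config", "crd", "bases")},
		ErrorIfCRDPathMissing: true,
	}

	// Start launches kube-apiserver and etcd and returns a *rest.Config.
	cfg, err := testEnv.Start()
	Expect(err).NotTo(HaveOccurred())
	Expect(cfg).NotTo(BeNil())

	Expect(appstackv1alpha1.AddToScheme(scheme.Scheme)).To(Succeed())

	// Client the specs use to create and inspect objects.
	k8sClient, err = client.New(cfg, client.Options{Scheme: scheme.Scheme})
	Expect(err).NotTo(HaveOccurred())

	mgr, err := ctrl.NewManager(cfg, ctrl.Options{Scheme: scheme.Scheme})
	Expect(err).NotTo(HaveOccurred())

	// AppStackReconciler and its fields are assumed names.
	Expect((&AppStackReconciler{
		Client: mgr.GetClient(),
		Scheme: mgr.GetScheme(),
	}).SetupWithManager(mgr)).To(Succeed())

	// Run the manager in the background until AfterSuite cancels ctx.
	go func() {
		defer GinkgoRecover()
		Expect(mgr.Start(ctx)).To(Succeed(), "failed to run manager")
	}()
})

var _ = AfterSuite(func() {
	cancel() // stops the manager
	Expect(testEnv.Stop()).To(Succeed())
})
```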
Key points:
testEnv.Start() launches kube-apiserver and etcd and returns a *rest.Config
CRDDirectoryPaths points at your generated CRD YAML (the output of make manifests)
The manager runs in a goroutine for the test suite's lifetime
cancel() in AfterSuite stops the manager cleanly
Writing Controller Integration Tests
Integration tests live in internal/controller/appstack_controller_test.go.
Testing Reconcile Paths
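As an illustration of the happy path, here is a hedged sketch of a spec that creates an AppStack and waits for the owned Deployment. It assumes the suite exposes package-level `k8sClient` and `ctx` variables (as kubebuilder's scaffold does), that the Deployment shares the AppStack's name, and that the spec has an `Image` field. The import path, API version, and field name are hypothetical.

```go
package controller

import (
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Hypothetical import path; substitute your operator's API package.
	appstackv1alpha1 "example.com/appstack-operator/api/v1alpha1"
)

var _ = Describe("AppStack controller", func() {
	It("creates an owned Deployment for a new AppStack", func() {
		stack := &appstackv1alpha1.AppStack{
			ObjectMeta: metav1.ObjectMeta{Name: "happy-path", Namespace: "default"},
			Spec:       appstackv1alpha1.AppStackSpec{Image: "nginx:1.27"}, // Image is an assumed field
		}
		Expect(k8sClient.Create(ctx, stack)).To(Succeed())

		// Assumes the controller names the Deployment after the AppStack.
		key := client.ObjectKeyFromObject(stack)
		deploy := &appsv1.Deployment{}
		Eventually(func() error {
			return k8sClient.Get(ctx, key, deploy)
		}, 10*time.Second, 100*time.Millisecond).Should(Succeed())

		// envtest runs no garbage collector, so assert on the owner
		// reference itself rather than on cascading deletion.
		Expect(deploy.OwnerReferences).To(HaveLen(1))
		Expect(deploy.OwnerReferences[0].Kind).To(Equal("AppStack"))
		Expect(deploy.OwnerReferences[0].Name).To(Equal(stack.Name))
	})
})
```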
Why Eventually Instead of Direct Assertions
Tests against a controller are asynchronous. After k8sClient.Create(), the controller's reconcile loop runs independently. You can't assert immediately after create, because the owned resources may not exist yet when the call returns.
Eventually with a 10-second timeout and 100ms polling interval is standard. The controller running locally should reconcile within milliseconds, but the 10-second window handles slow CI environments.
Testing Deletion
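A deletion spec can check that the controller removes its finalizer, because the API server only drops the object once the finalizer list is empty. The sketch below is an assumption-laden illustration: it presumes package-level `k8sClient` and `ctx` from the suite, a controller that adds a finalizer on first reconcile, and a hypothetical `Image` spec field and import path.

```go
package controller

import (
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Hypothetical import path; substitute your operator's API package.
	appstackv1alpha1 "example.com/appstack-operator/api/v1alpha1"
)

var _ = Describe("AppStack deletion", func() {
	const (
		timeout  = 10 * time.Second
		interval = 100 * time.Millisecond
	)

	It("removes the finalizer so the CR can go away", func() {
		stack := &appstackv1alpha1.AppStack{
			ObjectMeta: metav1.ObjectMeta{Name: "delete-me", Namespace: "default"},
			Spec:       appstackv1alpha1.AppStackSpec{Image: "nginx:1.27"}, // Image is an assumed field
		}
		key := client.ObjectKeyFromObject(stack)
		Expect(k8sClient.Create(ctx, stack)).To(Succeed())

		// Wait until the controller has added its finalizer.
		Eventually(func(g Gomega) {
			fetched := &appstackv1alpha1.AppStack{}
			g.Expect(k8sClient.Get(ctx, key, fetched)).To(Succeed())
			g.Expect(fetched.Finalizers).NotTo(BeEmpty())
		}, timeout, interval).Should(Succeed())

		Expect(k8sClient.Delete(ctx, stack)).To(Succeed())

		// The object disappears only after the finalizer is removed.
		Eventually(func() bool {
			err := k8sClient.Get(ctx, key, &appstackv1alpha1.AppStack{})
			return apierrors.IsNotFound(err)
		}, timeout, interval).Should(BeTrue())
	})
})
```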
Testing Self-Healing
An operator that doesn't re-create deleted resources is just declarative junk. Test it:
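One way to write that test, sketched under assumptions: the suite provides package-level `k8sClient` and `ctx`, the controller watches the Deployments it owns, and the Deployment shares the AppStack's name. The import path, API version, and `Image` field are hypothetical. Comparing UIDs distinguishes a recreated Deployment from the original one.

```go
package controller

import (
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Hypothetical import path; substitute your operator's API package.
	appstackv1alpha1 "example.com/appstack-operator/api/v1alpha1"
)

var _ = Describe("AppStack self-healing", func() {
	const (
		timeout  = 10 * time.Second
		interval = 100 * time.Millisecond
	)

	It("recreates an owned Deployment deleted out of band", func() {
		stack := &appstackv1alpha1.AppStack{
			ObjectMeta: metav1.ObjectMeta{Name: "heal-me", Namespace: "default"},
			Spec:       appstackv1alpha1.AppStackSpec{Image: "nginx:1.27"}, // Image is an assumed field
		}
		Expect(k8sClient.Create(ctx, stack)).To(Succeed())
		key := client.ObjectKeyFromObject(stack) // assumes matching names

		By("waiting for the controller to create the Deployment")
		original := &appsv1.Deployment{}
		Eventually(func() error {
			return k8sClient.Get(ctx, key, original)
		}, timeout, interval).Should(Succeed())
		originalUID := original.UID

		By("deleting it behind the controller's back")
		Expect(k8sClient.Delete(ctx, original)).To(Succeed())

		By("waiting for a replacement with a different UID")
		Eventually(func(g Gomega) {
			recreated := &appsv1.Deployment{}
			g.Expect(k8sClient.Get(ctx, key, recreated)).To(Succeed())
			g.Expect(recreated.UID).NotTo(Equal(originalUID))
		}, timeout, interval).Should(Succeed())
	})
})
```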
Unit Testing the Build Functions
The buildDeployment, buildService, and buildHPA functions are pure functions: given an AppStack, they return a Kubernetes resource. These are ideal for table-driven unit tests:
For build functions to be testable from the _test package, export them (or test from the same package). I use the convention of exporting build functions and keeping reconcile logic unexported.
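The shape of such a test can be sketched without any Kubernetes dependencies. In the example below, `AppStack` and `Deployment` are simplified stand-in structs rather than the real API types, and the defaulting rule inside `buildDeployment` (zero replicas becomes one) is an assumption made up for illustration. Only the table-driven structure is the point.

```go
package main

import "testing"

// AppStack and Deployment are simplified stand-ins for the real API
// types, so the example needs nothing beyond the standard library.
type AppStack struct {
	Name     string
	Image    string
	Replicas int32
}

type Deployment struct {
	Name     string
	Image    string
	Replicas int32
	Labels   map[string]string
}

// buildDeployment is pure: same input, same output, no API calls.
// The "default to one replica" rule is an assumed example rule.
func buildDeployment(s AppStack) Deployment {
	replicas := s.Replicas
	if replicas <= 0 {
		replicas = 1
	}
	return Deployment{
		Name:     s.Name,
		Image:    s.Image,
		Replicas: replicas,
		Labels:   map[string]string{"app": s.Name},
	}
}

func TestBuildDeployment(t *testing.T) {
	tests := []struct {
		name         string
		in           AppStack
		wantReplicas int32
	}{
		{"explicit replicas", AppStack{Name: "web", Image: "nginx:1.27", Replicas: 3}, 3},
		{"defaults to one replica", AppStack{Name: "web", Image: "nginx:1.27"}, 1},
	}

	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			got := buildDeployment(tc.in)
			if got.Replicas != tc.wantReplicas {
				t.Errorf("Replicas = %d, want %d", got.Replicas, tc.wantReplicas)
			}
			if got.Image != tc.in.Image {
				t.Errorf("Image = %q, want %q", got.Image, tc.in.Image)
			}
			if got.Labels["app"] != tc.in.Name {
				t.Errorf("app label = %q, want %q", got.Labels["app"], tc.in.Name)
			}
		})
	}
}
```

Because nothing here touches the API server, these tests need no envtest binaries and run in milliseconds, which makes them a cheap place to cover spec edge cases.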
Running Tests in CI
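The generated Makefile includes a test target. Running make test locates the envtest binaries through setup-envtest, exports their path as KUBEBUILDER_ASSETS, and runs go test.
In GitHub Actions, the workflow only needs a Go toolchain and a step that runs make test; no cluster is required.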
The setup-envtest tool downloads the API server binaries on first run. They're cached between CI runs if you cache the $(LOCALBIN) directory.
What I Test and What I Don't
I test:
Happy path: create CR → owned resources are created with the correct spec
Update path: change spec → owned resources are updated
Deletion: CR deleted → finalizer removed
Self-healing: owned resource deleted externally → recreated
HPA toggling: disable autoscaling after enabling → HPA is removed
Status conditions: correct phase and condition values after creation
I don't test:
Kubernetes API behavior (it's tested by upstream)
Behavior of the Deployment itself (Kubernetes handles that)
make manifests output (it's generated code)
Log output (implementation detail, not behavior)
The rule I follow: if deleting the test would leave a behavioral path untested that could fail in production, the test should exist.