The Controller and Reconcile Loop


Introduction

The controller is where your operator does its work. It watches AppStack resources and reconciles the cluster toward the desired state described in each resource's spec. Every pattern in this article comes from appstack-operator; none of it is theoretical.

The complete controller file lives at internal/controller/appstack_controller.go. Let's build it from scratch.


The Reconciler Struct

The scaffold generates a minimal reconciler struct:

This is the core of your controller. client.Client is the controller-runtime API client — it handles GETs, CREATEs, UPDATEs, DELETEs, and status updates against the Kubernetes API. The embedded Client means you call its methods directly on the reconciler: r.Get(...), r.Create(...).
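The method promotion described above is ordinary Go struct embedding. The sketch below shows it in isolation, assuming a hypothetical two-method Client interface and an in-memory memClient; both are stand-ins invented for illustration so the example runs without a cluster, and the real client.Client has more methods and different signatures.

```go
package main

import "errors"

// Client is a hypothetical, cut-down stand-in for controller-runtime's
// client.Client. It exists only so this sketch runs without a cluster.
type Client interface {
	Get(key string) (string, error)
	Create(key, value string) error
}

// errNotFound stands in for the API server's NotFound error.
var errNotFound = errors.New("not found")

// memClient is an in-memory Client used only for this illustration.
type memClient struct{ objects map[string]string }

func (c *memClient) Get(key string) (string, error) {
	v, ok := c.objects[key]
	if !ok {
		return "", errNotFound
	}
	return v, nil
}

func (c *memClient) Create(key, value string) error {
	if _, exists := c.objects[key]; exists {
		return errors.New("already exists")
	}
	c.objects[key] = value
	return nil
}

// AppStackReconciler embeds Client, so Get and Create are promoted and can
// be called as r.Get(...) and r.Create(...) without naming the field.
type AppStackReconciler struct {
	Client
}

// ensure creates the object only if it does not exist yet.
func (r *AppStackReconciler) ensure(key, value string) error {
	_, err := r.Get(key) // promoted from the embedded Client
	if err == nil {
		return nil // already present, nothing to do
	}
	if !errors.Is(err, errNotFound) {
		return err
	}
	return r.Create(key, value)
}
```

Because the reconciler only depends on the interface, a test can swap in a fake client the same way memClient is used here.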

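Add a Recorder field of type record.EventRecorder so the reconciler can emit Kubernetes events:

Register it in main.go, for example by passing the manager's GetEventRecorderFor(...) result when you construct the reconciler: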


Setting Up Watches

SetupWithManager tells the manager what resources to watch and how to map watch events to reconcile requests:

For(&AppStack{}): Watch AppStack resources. Any change to an AppStack triggers Reconcile() with that AppStack's namespaced name.

Owns(&Deployment{}): Watch Deployment resources owned by an AppStack. If the Deployment is modified or deleted, the owning AppStack is reconciled. Ownership is determined by owner references.

The Owns() calls are what make the operator self-healing. Without them, if someone manually deletes the Service your operator created, the operator would never know.
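To make the mapping from watch events to reconcile requests concrete, here is a small model of it. This is a simplified sketch, not controller-runtime code: Object, OwnerRef, Request, and requestFor are hypothetical stand-ins, and the ownedKinds map plays the role of the kinds registered through Owns().

```go
package main

// OwnerRef is a simplified stand-in for an entry in metadata.ownerReferences.
type OwnerRef struct {
	Kind       string
	Name       string
	Controller bool
}

// Object is a simplified stand-in for a Kubernetes object's metadata.
type Object struct {
	Kind      string
	Namespace string
	Name      string
	Owners    []OwnerRef
}

// Request is the namespaced name that Reconcile receives.
type Request struct {
	Namespace string
	Name      string
}

// requestFor models the For/Owns mapping: an AppStack event enqueues the
// AppStack itself, an event on a registered child kind enqueues its
// controlling AppStack owner, and anything else is ignored.
func requestFor(obj Object, ownedKinds map[string]bool) (Request, bool) {
	if obj.Kind == "AppStack" {
		return Request{Namespace: obj.Namespace, Name: obj.Name}, true
	}
	if !ownedKinds[obj.Kind] {
		return Request{}, false // kind was never registered, so no event arrives
	}
	for _, ref := range obj.Owners {
		if ref.Kind == "AppStack" && ref.Controller {
			// The owner lives in the same namespace as the child.
			return Request{Namespace: obj.Namespace, Name: ref.Name}, true
		}
	}
	return Request{}, false
}
```

With only Deployment registered, a deleted Service that carries the same owner reference produces no request, which is the blind spot described above.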


The Reconcile Function Structure

Every reconcile follows this skeleton:

This is the consistent pattern I use. Each step is responsible for one thing. Errors bubble up cleanly.
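As a rough illustration of that shape, the sketch below models a reconcile that fetches the resource, branches on deletion, then runs one step per child resource and returns the first error. The step order and every name here (Reconciler, namedStep, the hook fields) are assumptions made for the example, and the types are local stand-ins so it runs without controller-runtime.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the API server's NotFound error.
var errNotFound = errors.New("not found")

// AppStack is a pared-down stand-in for the custom resource.
type AppStack struct {
	Name     string
	Deleting bool // true once a deletion timestamp has been set
}

// Result is a stand-in for ctrl.Result; it stays empty in this sketch.
type Result struct{}

// stepFunc is one unit of reconcile work, e.g. "ensure the Deployment".
type stepFunc func(*AppStack) error

// namedStep pairs a step with a label used when wrapping its error.
type namedStep struct {
	Name string
	Run  stepFunc
}

// Reconciler wires hypothetical hooks together. A real reconciler would use
// methods that talk to the API server instead of function fields.
type Reconciler struct {
	Fetch          func(key string) (*AppStack, error)
	HandleDeletion stepFunc
	Steps          []namedStep
}

// Reconcile follows the skeleton: fetch, handle deletion, then run each
// step in order and let the first error bubble up.
func (r *Reconciler) Reconcile(key string) (Result, error) {
	app, err := r.Fetch(key)
	if errors.Is(err, errNotFound) {
		return Result{}, nil // object is gone, nothing to reconcile
	}
	if err != nil {
		return Result{}, fmt.Errorf("fetch %s: %w", key, err)
	}

	if app.Deleting {
		return Result{}, r.HandleDeletion(app)
	}

	for _, s := range r.Steps {
		if err := s.Run(app); err != nil {
			return Result{}, fmt.Errorf("%s: %w", s.Name, err)
		}
	}
	return Result{}, nil
}
```

Keeping each step behind the same signature means a failure in one step never leaves later steps half-run; the next reconcile starts again from the top.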


Fetching the Custom Resource

client.IgnoreNotFound(err) returns nil when the resource no longer exists. This happens when:

  • The resource was deleted and this reconcile was already queued before the deletion was processed

  • A race condition during startup

Returning nil on NotFound is correct. Returning an error would cause the controller to requeue and retry indefinitely for a resource that doesn't exist.

Never assume the object is still valid after a Get returns without error. Always check .DeletionTimestamp immediately after fetching — the object may be in the process of being deleted.
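The fetch step can be sketched as follows. This is a model under stated assumptions rather than the controller's actual code: store, errNotFound, errTimeout, and the local ignoreNotFound helper are stand-ins that imitate the behaviour described for client.IgnoreNotFound, and DeletionTimestamp is a plain *time.Time here.

```go
package main

import (
	"errors"
	"time"
)

var (
	errNotFound = errors.New("appstack not found")  // stand-in for NotFound
	errTimeout  = errors.New("api server timeout") // stand-in for a real failure
)

// ignoreNotFound imitates the documented behaviour of client.IgnoreNotFound:
// NotFound becomes nil, every other error is returned unchanged.
func ignoreNotFound(err error) error {
	if errors.Is(err, errNotFound) {
		return nil
	}
	return err
}

// AppStack is a pared-down stand-in for the custom resource.
type AppStack struct {
	Name              string
	DeletionTimestamp *time.Time // non-nil once deletion has been requested
}

// store is a hypothetical in-memory replacement for the API server.
type store struct {
	objects     map[string]*AppStack
	unavailable bool
}

func (s *store) get(key string) (*AppStack, error) {
	if s.unavailable {
		return nil, errTimeout
	}
	app, ok := s.objects[key]
	if !ok {
		return nil, errNotFound
	}
	return app, nil
}

// fetch returns (nil, false, nil) when the object is gone, so the caller
// can stop without requeueing. For a live object it also reports whether
// deletion is already in progress.
func fetch(s *store, key string) (*AppStack, bool, error) {
	app, err := s.get(key)
	if err != nil {
		return nil, false, ignoreNotFound(err)
	}
	return app, app.DeletionTimestamp != nil, nil
}

// markDeleted simulates the API server accepting a delete request.
func markDeleted(app *AppStack) {
	now := time.Now()
	app.DeletionTimestamp = &now
}
```

Only the timeout case surfaces as an error and triggers a retry; a missing object ends the reconcile, and a deleting object is handed to the deletion path.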


Creating and Updating Owned Resources

The pattern for managing a child resource (Deployment, Service, HPA) is always the same:
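One way to picture that pattern is get, create if missing, update if drifted, otherwise do nothing. The sketch below models it against an in-memory cluster; Deployment, cluster, and ensureDeployment are hypothetical stand-ins with only the fields needed to show drift detection, not the real apps/v1 types.

```go
package main

import "errors"

// errNotFound stands in for the API server's NotFound error.
var errNotFound = errors.New("not found")

// Deployment is a stand-in carrying only the fields this sketch compares.
type Deployment struct {
	Name     string
	Image    string
	Replicas int
}

// cluster is a hypothetical in-memory replacement for the API server.
type cluster struct {
	deployments map[string]Deployment
	writes      int // counts create and update calls
}

func (c *cluster) get(name string) (Deployment, error) {
	d, ok := c.deployments[name]
	if !ok {
		return Deployment{}, errNotFound
	}
	return d, nil
}

func (c *cluster) put(d Deployment) {
	c.deployments[d.Name] = d
	c.writes++
}

// ensureDeployment drives the live object toward the desired one and
// reports which action it took.
func ensureDeployment(c *cluster, desired Deployment) (string, error) {
	current, err := c.get(desired.Name)
	switch {
	case errors.Is(err, errNotFound):
		c.put(desired) // child does not exist yet
		return "created", nil
	case err != nil:
		return "", err
	case current == desired:
		return "unchanged", nil // skip the write when nothing drifted
	default:
		c.put(desired) // spec drifted, bring it back
		return "updated", nil
	}
}
```

Skipping the write when nothing changed matters in a real controller, because every update to an owned object produces another watch event and another reconcile.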

Building the Desired Deployment

Building the Service

Building the HPA


Owner References and Garbage Collection

controllerutil.SetControllerReference(owner, owned, scheme) sets the ownerReferences field on the child resource. When the AppStack is deleted, Kubernetes automatically garbage-collects all owned resources (Deployment, Service, HPA).

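controller: true marks this owner as the managing controller; an object can have at most one such owner. blockOwnerDeletion: true matters under foreground cascading deletion: the owner is not removed until this dependent is gone. With the default background deletion, the owner is removed first and the garbage collector deletes the dependents afterwards. Setting both is appropriate for all owned resources.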

Important: A namespaced owner can only own objects in its own namespace. Cross-namespace ownership is not supported by the Kubernetes garbage collector, so every resource created for an AppStack must live in the same namespace as that AppStack.
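The rules in this section (one controller owner, same namespace, children removed with the owner) can be modelled in a few lines. This is a simplified sketch: setControllerReference and collectGarbage are local stand-ins, not the controllerutil function or the real garbage collector, and they match owners by kind and name where Kubernetes uses UIDs.

```go
package main

import "errors"

// OwnerRef is a simplified stand-in for an entry in metadata.ownerReferences.
type OwnerRef struct {
	Kind, Name         string
	Controller         bool
	BlockOwnerDeletion bool
}

// Object is a simplified stand-in for a namespaced Kubernetes object.
type Object struct {
	Kind, Namespace, Name string
	Owners                []OwnerRef
}

var (
	errAlreadyOwned   = errors.New("child already has a different controller owner")
	errCrossNamespace = errors.New("owner and child are in different namespaces")
)

// setControllerReference models the checks described above for a namespaced
// owner: same namespace, and at most one controller owner per child.
func setControllerReference(owner Object, child *Object) error {
	if owner.Namespace != child.Namespace {
		return errCrossNamespace
	}
	for _, ref := range child.Owners {
		if !ref.Controller {
			continue
		}
		if ref.Kind == owner.Kind && ref.Name == owner.Name {
			return nil // already set, so the call is idempotent
		}
		return errAlreadyOwned
	}
	child.Owners = append(child.Owners, OwnerRef{
		Kind: owner.Kind, Name: owner.Name,
		Controller: true, BlockOwnerDeletion: true,
	})
	return nil
}

// ownedBy reports whether obj lists owner in its owner references.
func ownedBy(obj, owner Object) bool {
	if obj.Namespace != owner.Namespace {
		return false
	}
	for _, ref := range obj.Owners {
		if ref.Kind == owner.Kind && ref.Name == owner.Name {
			return true
		}
	}
	return false
}

// collectGarbage returns the objects that survive once owner is deleted.
func collectGarbage(owner Object, objects []Object) []Object {
	var kept []Object
	for _, obj := range objects {
		if !ownedBy(obj, owner) {
			kept = append(kept, obj)
		}
	}
	return kept
}
```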


Handling Deletion with Finalizers

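Kubernetes finalizers let your controller run cleanup logic before the resource is removed. While a finalizer is present, a delete request only sets metadata.deletionTimestamp; the object disappears once its finalizer list is empty. Without a finalizer, the resource is deleted immediately.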

For AppStack, I add a finalizer to handle cases where cleanup needs to happen in a specific order — for example, waiting for in-flight requests to drain before removing the Service.

If AppStack only owns native Kubernetes resources, owner references handle cleanup automatically — you don't strictly need a finalizer. Finalizers are necessary when you have external resources to clean up (e.g., cloud load balancers, external DNS records, database users).
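The finalizer lifecycle can be sketched as three cases: add the finalizer while the object is live, run cleanup once deletion starts, and remove the finalizer only after cleanup succeeds. Everything named below is hypothetical, including the finalizer string and the drain-based cleanup, and the helpers are local stand-ins rather than controller-runtime's controllerutil functions.

```go
package main

import (
	"errors"
	"time"
)

// cleanupFinalizer is a made-up finalizer name for this sketch.
const cleanupFinalizer = "appstack.example.com/cleanup"

var errDraining = errors.New("requests still draining")

// AppStack is a pared-down stand-in for the custom resource metadata.
type AppStack struct {
	Finalizers        []string
	DeletionTimestamp *time.Time
}

func hasFinalizer(app *AppStack, name string) bool {
	for _, f := range app.Finalizers {
		if f == name {
			return true
		}
	}
	return false
}

func removeFinalizer(app *AppStack, name string) {
	kept := app.Finalizers[:0]
	for _, f := range app.Finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	app.Finalizers = kept
}

// reconcileFinalizer returns the action taken: "added", "released", "retry",
// or "none". On cleanup failure the finalizer stays so deletion is blocked.
func reconcileFinalizer(app *AppStack, cleanup func() error) (string, error) {
	if app.DeletionTimestamp == nil {
		if hasFinalizer(app, cleanupFinalizer) {
			return "none", nil
		}
		app.Finalizers = append(app.Finalizers, cleanupFinalizer)
		return "added", nil
	}
	if !hasFinalizer(app, cleanupFinalizer) {
		return "none", nil // cleanup already done, nothing left to hold
	}
	if err := cleanup(); err != nil {
		return "retry", err
	}
	removeFinalizer(app, cleanupFinalizer)
	return "released", nil
}

// drainCleanup builds a cleanup func that fails while requests are in flight.
func drainCleanup(inFlight *int) func() error {
	return func() error {
		if *inFlight > 0 {
			return errDraining
		}
		return nil
	}
}

// markDeleted simulates the API server accepting a delete request.
func markDeleted(app *AppStack) {
	now := time.Now()
	app.DeletionTimestamp = &now
}
```

In a real controller the add and remove steps would each be followed by an update call so the change is persisted before the reconcile returns.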


ctrl.Result and Requeue Strategies

The return value from Reconcile() controls what the controller does next:

When to use each:

| Return | Use when |
| --- | --- |
| {}, nil | Reconcile completed successfully |
| {}, err | A real error occurred; controller-runtime requeues the request with exponential backoff |
| {RequeueAfter: N}, nil | You need to poll (e.g., waiting for an external resource) |
| {Requeue: true}, nil | You made a change (e.g., added a finalizer) and want to re-enter reconcile right away instead of waiting for a watch event; the request still passes through the work queue's rate limiter |

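The most common mistake is returning {}, err for expected transient conditions (like a resource not being ready yet). Each such return is treated as a failure, so the request is retried with backoff and the logs fill with errors for something that is not actually broken. For known wait conditions, return {RequeueAfter: 10 * time.Second}, nil instead.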
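The choice between these return values can be sketched as a small decision function. The types below are local stand-ins (Result mirrors only the two ctrl.Result fields discussed here), and childState, checkChild, resultFor, and queueAction are hypothetical names; queueAction is a simplified model of how the controller treats each return value, not its actual implementation.

```go
package main

import (
	"errors"
	"time"
)

// Result mirrors the two ctrl.Result fields discussed in this section.
type Result struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// pollInterval is the wait used for known "not ready yet" conditions.
const pollInterval = 10 * time.Second

var errAPI = errors.New("api call failed")

// childState is what a reconcile step learned about a child resource.
type childState struct {
	Err   error
	Ready bool
}

// checkChild is a stand-in for querying the cluster.
func checkChild(apiUp, ready bool) childState {
	if !apiUp {
		return childState{Err: errAPI}
	}
	return childState{Ready: ready}
}

// resultFor picks the return value: errors for real failures, a timed
// requeue for expected waiting, and an empty result when done.
func resultFor(s childState) (Result, error) {
	switch {
	case s.Err != nil:
		return Result{}, s.Err
	case !s.Ready:
		return Result{RequeueAfter: pollInterval}, nil
	default:
		return Result{}, nil
	}
}

// queueAction models what happens to the request after Reconcile returns.
// An error takes precedence over any requeue settings in the Result.
func queueAction(res Result, err error) string {
	switch {
	case err != nil:
		return "retry with backoff"
	case res.RequeueAfter > 0:
		return "retry after delay"
	case res.Requeue:
		return "retry via rate limiter"
	default:
		return "done"
	}
}
```

Separating "what did I observe" from "what do I return" keeps the not-ready case from ever being reported as an error.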


Full Controller Implementation

Here is the complete appstack_controller.go reflecting all the patterns above:


