Part 2: Kubernetes Deployment with Helm Charts

Part of the SRE Playbook series

What You'll Learn: This article covers how I package the GoReliable services into Helm charts: one chart per service, a shared library chart for common templates, and per-environment values files. You'll see how I configure health probes based on how Go exposes them, set resource requests and limits derived from actual profiling data, add horizontal pod autoscaler configuration, and manage database migrations as Helm hooks. By the end, you'll have a Helm chart structure you can adapt for your own Go services.

The Configuration Problem I Kept Hitting

Before I adopted Helm for this project, I was managing Kubernetes manifests as raw YAML. That worked fine for one environment, but the moment I added a staging environment alongside production, I had duplicated YAML files drifting apart. Staging would get a fix, production wouldn't, and I'd spend 30 minutes debugging a deployment difference rather than an actual bug.

Helm solves this by making differences explicit. The base chart defines the shape of the deployment; values files define what varies per environment. When I look at values.staging.yaml, I see exactly what's different from values.production.yaml. There's no guesswork.

For background on Helm fundamentals, the Helm 101 series covers chart structure and templating in depth. This article focuses on patterns specific to Go microservices and reliability.

Chart Structure

I use one Helm chart per service, plus a shared library chart for templates used by all four services.

deployments/helm/
├── charts/
│   └── go-reliable-lib/          # Library chart (not deployed directly)
│       ├── Chart.yaml
│       └── templates/
│           ├── _deployment.tpl
│           ├── _service.tpl
│           ├── _hpa.tpl
│           └── _pdb.tpl
├── api-gateway/
│   ├── Chart.yaml
│   ├── templates/
│   │   ├── deployment.yaml
│   │   ├── service.yaml
│   │   ├── hpa.yaml
│   │   ├── pdb.yaml
│   │   └── configmap.yaml
│   ├── values.yaml              # defaults
│   ├── values.staging.yaml
│   └── values.production.yaml
├── order-service/
│   ├── Chart.yaml
│   ├── templates/
│   │   ├── deployment.yaml
│   │   ├── service.yaml
│   │   ├── hpa.yaml
│   │   ├── pdb.yaml
│   │   ├── configmap.yaml
│   │   └── migration-job.yaml   # Helm pre-upgrade hook
│   ├── values.yaml
│   ├── values.staging.yaml
│   └── values.production.yaml
├── notification-worker/
│   └── ...
└── ml-gateway/
    └── ...

The Deployment Template

The deployment template is the most important piece of each chart. There are three deliberate choices in it that I want to explain:

maxUnavailable: 0. During a rolling update, I never want fewer than the desired number of pods. This means Kubernetes brings up the new version before it brings down the old one. The trade-off is that you briefly run replicaCount + 1 pods. For my workloads, that's acceptable.

ConfigMap checksum annotation. Kubernetes doesn't restart pods when ConfigMaps change. By adding a checksum of the ConfigMap as an annotation, any change to the ConfigMap changes the annotation, which triggers a rolling restart. I discovered this behavior the hard way when an environment variable change didn't take effect and I spent an hour investigating the wrong thing.

Startup probe with high failureThreshold. The startup probe is specifically for the window between "container started" and "application ready to accept traffic". I give it 150 seconds because the Order Service sometimes takes longer to start on first deploy when it's running database migrations. Without the startup probe, the liveness probe would kill the pod before it finished starting.
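Put together in template form, the three choices look roughly like this. This is an abridged illustration, not the repo's actual template; the app.fullname helper, the port number, and the probe paths are assumptions:

```yaml
# templates/deployment.yaml (abridged sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "app.fullname" . }}
spec:
  replicas: {{ .Values.replicaCount }}
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0        # never drop below the desired replica count
  template:
    metadata:
      annotations:
        # any ConfigMap change changes this hash, forcing a rolling restart
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - name: http
              containerPort: 8080
          startupProbe:
            httpGet:
              path: /healthz
              port: http
            periodSeconds: 5
            failureThreshold: 30   # 30 x 5s = 150s budget for slow first starts
          livenessProbe:
            httpGet:
              path: /healthz
              port: http
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /readyz
              port: http
            periodSeconds: 5
```

Until the startup probe succeeds, Kubernetes suppresses the liveness and readiness probes, which is what buys the migration window.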

Health Check Handler

The health check implementation in Go needs to distinguish between liveness and readiness. These are meaningfully different concepts.

The separation matters in production. If I roll out a bad database query that causes timeouts, I want the readiness probe to fail (stop sending traffic) but the liveness probe to succeed (don't restart the pod; I might need to exec into it to debug). Failing liveness on a dependency check is a common mistake that causes cascading restarts.

Resource Requests and Limits

Resource requests and limits are the most commonly misconfigured aspect of Kubernetes deployments I've seen. Either they're not set at all (cluster scheduling chaos), or they're guessed and wildly inaccurate.

I derived my values using Go's built-in profiling tools. I ran each service under realistic load using k6 (covered more in Part 7) and captured pprof profiles.

API Gateway (values.yaml, defaults for development):
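A sketch of what that block looks like (numbers are illustrative; the repo's profiled values aren't reproduced here):

```yaml
resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 200m
    memory: 128Mi
```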

API Gateway (values.production.yaml):
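The production file raises the requests and keeps the CPU limit loose relative to the request (again, illustrative numbers rather than the author's profiled values):

```yaml
resources:
  requests:
    cpu: 250m
    memory: 128Mi
  limits:
    cpu: "1"        # generous: avoid CFS throttling, scale out via HPA instead
    memory: 256Mi   # roughly 2x steady-state RSS; OOM-kill one pod, not the node
```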

A note on CPU limits: I've read arguments both ways. In Kubernetes, CPU limits create CPU throttling via CFS quotas, which can hurt latency even when the node has spare capacity. For my Go services (mostly I/O bound at the gateway, not CPU bound), I set generous CPU limits and rely on HPA to scale horizontally rather than squeezing CPU. For memory limits, I set them tightly because memory limits determine whether the container gets OOM-killed; I'd rather OOM-kill a single pod and have Kubernetes restart it than have a memory leak quietly grow until the node is pressured.

Horizontal Pod Autoscaler
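A sketch of the HPA manifest (the utilization target and replica bounds are illustrative, not the repo's values):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-gateway
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes of low usage before scaling down
```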

The scaleDown.stabilizationWindowSeconds: 300 setting prevents flapping. Without it, a brief traffic dip would scale down pods, then a traffic spike would scale them back up, thrashing the cluster. Five minutes is long enough to absorb normal traffic variance.

Pod Disruption Budget

PDBs prevent Kubernetes from evicting too many pods simultaneously during node drains (maintenance, upgrades). Without a PDB, a node drain could take down all my pods at once.

For production with minReplicas: 2, setting minAvailable: 1 means Kubernetes can evict at most one pod at a time. Combined with maxUnavailable: 0 in the rollout strategy, I have reasonable protection against both rolling updates and voluntary disruptions.
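A minimal PDB for that setup might look like this (the label selector is an assumption based on standard chart labels):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-gateway
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: api-gateway
```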

Database Migration Helm Hook

The Order Service needs database migrations to run before the new deployment starts. I use a Kubernetes Job as a Helm pre-upgrade hook.
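A sketch of such a hook job (the migrate entrypoint and the reuse of the service image are assumptions):

```yaml
# templates/migration-job.yaml (sketch)
apiVersion: batch/v1
kind: Job
metadata:
  # Revision suffix keeps each upgrade's job name unique
  name: {{ .Release.Name }}-migrate-{{ .Release.Revision }}
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          args: ["migrate", "up"]   # assumed migration entrypoint
```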

The hook-delete-policy: before-hook-creation means Helm automatically deletes the old migration job before creating a new one. The hook-succeeded part means a successful job is also cleaned up. This keeps the namespace tidy.

I append .Release.Revision to the job name because job names must be unique; without it, Helm can't create a new job if an old one with the same name still exists.

Per-Environment Values

Here's the general shape of how values differ across environments:
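A sketch of the two override files (contents illustrative, not the repo's actual values):

```yaml
# values.staging.yaml
replicaCount: 1
autoscaling:
  enabled: false
logLevel: debug

# values.production.yaml
replicaCount: 2
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
logLevel: info
```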

The values files don't override the image tag; that's set by the CI pipeline at deploy time via helm upgrade --set image.tag=$GIT_SHA. This keeps image tags out of git (they'd change on every commit and create noisy diffs).

Deploying Manually vs GitOps

At this stage, I can deploy manually:
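For example, from the repo root (the namespace name is an assumption):

```bash
helm upgrade --install order-service deployments/helm/order-service \
  --namespace production \
  -f deployments/helm/order-service/values.production.yaml \
  --set image.tag=$GIT_SHA
```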

But manual deploys don't scale and don't provide the audit trail I want. In Part 3, I wire this up to ArgoCD so every commit to the GitOps repository automatically drives the cluster state, with no manual commands needed.

What I Got Wrong Initially

First attempt: I set memory.limit equal to memory.request for all services. This caused periodic OOM kills when traffic spiked and Go's garbage collector needed temporary headroom. The fix was setting limits to 2-3× the steady-state RSS.

Second attempt: I enabled HPA on the Notification Worker. The worker scales based on message queue depth, not CPU; CPU autoscaling caused it to scale down while there were still thousands of unprocessed messages. In Part 7, I show how to configure HPA with a custom Prometheus metric (NATS consumer pending count) instead.

Third attempt: I initially didn't use a startup probe, just a readiness probe with initialDelaySeconds: 60. This caused issues when Order Service migration took longer than expected on a fresh cluster. The startup probe with high failureThreshold handles variable startup times cleanly.

Where We Are

The services are packaged as Helm charts with:

  • Health probes correctly configured for Go services

  • Resources derived from profiling, not guessing

  • HPA and PDB for production reliability

  • Database migrations as Helm hooks

  • Per-environment values files

In Part 3, I set up ArgoCD to deploy these charts automatically from a GitOps repository, wire up a CI pipeline that builds images and updates the Helm values, and configure secret management with External Secrets Operator.
