Understanding GitOps Core Concepts

The Day Configuration Drift Cost Us $12,000

It was a normal Wednesday until our AWS bill arrived: $12,000 for the month. Usually $3,000.

I started investigating:

# Check production cluster
kubectl get deployments -n production
# api-service: 3 replicas ✓
# worker-service: 50 replicas ← WHAT?!

kubectl describe deployment worker-service -n production
# Replicas: 50
# Last scaled: 2 weeks ago by John

I called John. "Did you scale worker-service to 50 replicas?"

"Oh yeah, we had a spike two weeks ago. I forgot to scale it back down."

The manifests in Git said 5 replicas. The cluster was running 50. Configuration drift for two weeks.

This wouldn't happen with proper GitOps. Let me explain why.

Declarative Infrastructure

Declarative = Describing the desired end state, not the steps to get there.

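Imperative Approach (The Problem)

The imperative style runs one command after another against the live cluster (create, then scale, then patch), so the result depends on every step having run in the right order.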

Problems:

  • Order matters

  • Can't re-run safely

  • No single source of truth

  • Hard to reproduce

Declarative Approach (The Solution)
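As a minimal sketch of the declarative style, the manifest below states that worker-service should run 5 replicas, the number the Git manifests held in the opening story. The label, container name, and image reference are illustrative assumptions, not taken from a real repository.

```yaml
# Declarative: this file states the end state and contains no steps.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker-service
  namespace: production
spec:
  replicas: 5                  # desired count, as in the story's Git manifests
  selector:
    matchLabels:
      app: worker-service      # assumed label
  template:
    metadata:
      labels:
        app: worker-service    # must match the selector above
    spec:
      containers:
        - name: worker         # assumed container name
          image: registry.example.com/worker-service:1.0.0   # hypothetical image
```

Applying the same file a second time with kubectl apply -f leaves an already-matching cluster unchanged, which is what makes the operation safe to repeat.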

Benefits:

  • Can apply multiple times safely (idempotent)

  • Order matters far less (the system converges on the end state)

  • Clear desired state

  • Easy to reproduce

  • Diff-able

Desired State vs Actual State

This is the heart of GitOps.

Desired State = What SHOULD be running (defined in Git)

Actual State = What IS running (in the cluster)


Example: Desired vs Actual

Desired State (Git):
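As an illustration, reusing the worker-service numbers from the opening story, the manifest committed to Git might contain an excerpt like this. Only the fields relevant to the comparison are shown.

```yaml
# Desired state: excerpt of the manifest stored in Git (other fields omitted)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker-service
  namespace: production
spec:
  replicas: 5    # what Git says should be running
```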

Actual State (Cluster):
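The live object can tell a different story. The excerpt below is a sketch of what kubectl get deployment worker-service -n production -o yaml might report in the situation from the opening story; the status values are assumed for illustration.

```yaml
# Actual state: excerpt of the live object in the cluster (other fields omitted)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker-service
  namespace: production
spec:
  replicas: 50   # changed by hand, never committed to Git
status:
  replicas: 50        # assumed: all requested pods exist
  readyReplicas: 50   # assumed: all of them are ready
```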

State comparison:

GitOps agent action: Sync actual → desired

Reconciliation Loops

Reconciliation = Continuously ensuring actual state matches desired state.

This is how GitOps prevents configuration drift.

The Reconciliation Process


Key points:

  1. Continuous loop (ArgoCD polls Git about every 3 minutes by default)

  2. Always comparing desired (Git) vs actual (cluster)

  3. Auto-corrects drift

  4. Self-healing

Real Example: Auto-Healing

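Let's say someone manually scales a deployment, for example with kubectl scale.

What happens next: on its next reconciliation pass, the GitOps agent sees that the live replica count no longer matches Git and syncs the deployment back to the committed value.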

The manual change was automatically reverted. This is GitOps preventing drift.

Reconciliation Configuration
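In ArgoCD, the polling interval is set installation-wide through the timeout.reconciliation key of the argocd-cm ConfigMap. The sketch below assumes ArgoCD runs in the conventional argocd namespace and pins the interval to three minutes; adjust both to your installation.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd                      # assumes the default install namespace
  labels:
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
data:
  # How often ArgoCD re-checks Git for changes
  timeout.reconciliation: 180s
```

Depending on the ArgoCD version, some components may need a restart before a changed value takes effect.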

Git Workflows for GitOps

Git is the source of truth, so how you use Git matters.

Workflow 1: Branch Per Environment

Flow:

  1. Develop on dev branch

  2. Merge dev → staging (PR)

  3. Merge staging → main (PR)

  4. Each branch triggers deployment to its environment
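Step 4 could be wired up with one ArgoCD Application per environment, each tracking its own branch through targetRevision. This is a sketch: the repository URL, the manifests path, and the namespace names are hypothetical, and staging would follow the same pattern with targetRevision: staging.

```yaml
# dev branch -> dev environment
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app-dev
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/my-app-config.git   # hypothetical repo
    targetRevision: dev            # branch to track
    path: manifests                # hypothetical path
  destination:
    server: https://kubernetes.default.svc
    namespace: dev
  syncPolicy:
    automated: {}                  # sync automatically when the branch changes
---
# main branch -> production environment
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app-production
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/my-app-config.git   # hypothetical repo
    targetRevision: main
    path: manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated: {}
```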

Pros:

  • Simple

  • Clear environment separation

  • Easy to promote changes

Cons:

  • Branch management overhead

  • Merge conflicts

  • Hard to compare environments

Workflow 2: Directory Per Environment

Flow:

  1. All environments on main branch

  2. Different directories for different environments

  3. Change dev → test → merge to main → auto-deploy to dev

  4. Manually sync staging/production via ArgoCD UI
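The flow above might map onto ArgoCD Applications like the following sketch. Both track main but point at different directories; dev has an automated sync policy, while production has none, so someone has to trigger the sync. The environments/ directory names and the repository URL are assumptions.

```yaml
# dev: deploys automatically after every merge to main
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app-dev
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/my-app-config.git   # hypothetical repo
    targetRevision: main
    path: environments/dev          # assumed directory layout
  destination:
    server: https://kubernetes.default.svc
    namespace: dev
  syncPolicy:
    automated: {}
---
# production: no syncPolicy.automated, so syncs are started manually
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app-production
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/my-app-config.git   # hypothetical repo
    targetRevision: main
    path: environments/production   # assumed directory layout
  destination:
    server: https://kubernetes.default.svc
    namespace: production
```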

Pros:

  • Single branch

  • Easy to compare environments

  • Clear structure

Cons:

  • All environments share one branch, so history and access control are mixed

  • Need careful ArgoCD app configuration

Workflow 3: Kustomize Overlays (Best Practice)

base/deployment.yaml:
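A base manifest could look like the sketch below. The application name my-app, its label, and the image are placeholders; the point is that the base carries the settings every environment shares.

```yaml
# base/deployment.yaml (illustrative content)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1                  # default that overlays may override
  selector:
    matchLabels:
      app: my-app              # assumed label
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0.0   # hypothetical image
```

For Kustomize to pick this file up, the base directory also needs a kustomization.yaml that lists deployment.yaml under resources.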

overlays/production/replicas-patch.yaml:
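A production patch only needs to name the object and the field it overrides. The sketch below assumes the base Deployment is called my-app and that production should run 5 replicas; it also shows the overlay's kustomization.yaml, which is what ties the patch to the base. The two documents are separate files, shown together for brevity.

```yaml
# overlays/production/replicas-patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app          # assumed; must match the Deployment name in the base
spec:
  replicas: 5           # assumed production value
---
# overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base          # the base directory needs its own kustomization.yaml
patches:
  - path: replicas-patch.yaml
```

Running kubectl kustomize overlays/production prints the merged result, which is a convenient way to check a patch before committing it.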

Pros:

  • DRY (Don't Repeat Yourself)

  • Shared base, environment-specific overrides

  • Easy to manage differences

  • Built into kubectl (kubectl apply -k)

Cons:

  • Learning curve

  • More complex structure

Workflow 4: App Repo + Config Repo (Separation)

Flow:

  1. Developer pushes code to my-app

  2. GitHub Actions builds Docker image: my-app:abc123

  3. GitHub Actions commits to my-app-config: updates image tag

  4. ArgoCD syncs my-app-config → cluster
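One way step 3 can stay small is to keep the image tag in a single place in the config repo. The sketch below assumes my-app-config uses Kustomize and that deployment.yaml references the image as my-app; neither detail comes from the original setup.

```yaml
# kustomization.yaml in my-app-config (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
images:
  - name: my-app        # image name as referenced in deployment.yaml (assumed)
    newTag: abc123      # the only line CI rewrites on each build
```

CI could make that change with kustomize edit set image, or with any tool that rewrites the one line, and then commit the result.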

Pros:

  • Separation of concerns

  • Application developers don't touch manifests

  • Platform team manages manifests

  • Clean audit trail

Cons:

  • Two repos to manage

  • CI needs write access to config repo

  • More complex setup

Pull vs Push Deployments

This is a critical difference between traditional CD and GitOps.

Push-Based Deployment (Traditional CI/CD)


Flow:

  1. Developer pushes code

  2. CI server builds image

  3. CI server pushes deployment to cluster

  4. CI server has cluster credentials

Problems:

  • CI server needs cluster access

  • Cluster credentials stored in CI

  • Security risk

  • Hard to audit

  • No drift detection

  • Push model = CI controls when deployments happen

Pull-Based Deployment (GitOps)


Flow:

  1. Developer pushes code

  2. CI server builds image

  3. CI updates image tag in Git

  4. ArgoCD pulls changes from Git

  5. ArgoCD applies to cluster

Benefits:

  • No cluster credentials in CI

  • ArgoCD lives in cluster

  • Pull model = cluster controls when deployments happen

  • Continuous drift detection

  • Better security

  • Self-healing

Security Comparison

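Push model: cluster credentials are stored in the CI system, outside the cluster, so anyone who compromises CI can deploy.

Pull model: cluster credentials never leave the cluster. ArgoCD runs inside it and only needs read access to Git.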
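On the pull side, the one external credential ArgoCD needs is read access to the config repository, and it lives inside the cluster. Below is a sketch of a declarative repository Secret, assuming an HTTPS repository and a read-only token; the name, URL, and values are placeholders.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-app-config-repo                         # hypothetical name
  namespace: argocd                                # assumes the default install namespace
  labels:
    argocd.argoproj.io/secret-type: repository     # tells ArgoCD this Secret describes a repo
stringData:
  type: git
  url: https://github.com/example-org/my-app-config.git   # hypothetical repo
  username: git                                    # placeholder
  password: "REPLACE_WITH_READ_ONLY_TOKEN"         # placeholder, never commit a real token
```

In practice the token would come from a secret manager rather than from a file committed to Git.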

Drift Detection and Remediation

Configuration drift is when actual state diverges from desired state.

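How Drift Happens

Drift usually starts with a change made directly against the cluster and never committed to Git, like the manual scale-up in the opening story.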

ArgoCD Drift Detection

With selfHeal: true:
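With self-healing on, ArgoCD reverts live changes that diverge from Git without waiting for anyone to notice. A minimal sketch for the worker-service from the opening story; the repository URL and path are hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: worker-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/my-app-config.git   # hypothetical repo
    targetRevision: main
    path: overlays/production      # hypothetical path
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true                  # delete resources that were removed from Git
      selfHeal: true               # revert manual changes made in the cluster
```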

With selfHeal: false:
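With self-healing off, which is ArgoCD's default for automated sync, new commits are still applied, but a manual change in the cluster is left in place: the application is reported as OutOfSync until someone syncs it. The fragment below shows only the relevant part of an Application spec.

```yaml
# Fragment of an ArgoCD Application (source and destination omitted)
spec:
  syncPolicy:
    automated:
      prune: true
      selfHeal: false    # drift is detected and reported, not reverted
```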

Handling Legitimate Drift

Some drift is okay:
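The usual example is a HorizontalPodAutoscaler, which changes the replica count on purpose. The sketch below assumes the worker-service Deployment from the opening story; the replica bounds and CPU target are made-up values.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-service
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker-service
  minReplicas: 3                   # assumed lower bound
  maxReplicas: 10                  # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed target
```

Whatever count the autoscaler picks will differ from a fixed replicas value in Git, and that difference is intended.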

Configuring ArgoCD to ignore HPA-managed replicas:
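ArgoCD does not skip the replica count on its own; the Application has to say which fields to leave out of the comparison. A sketch of the relevant fragment, applying the rule to every Deployment the Application manages:

```yaml
# Fragment of an ArgoCD Application (source and destination omitted)
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas                  # the field the HPA rewrites
  syncPolicy:
    syncOptions:
      - RespectIgnoreDifferences=true     # also leave the ignored field alone during sync
```

Another common option is to drop replicas from the manifest in Git altogether, so there is nothing to compare.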

The GitOps Control Loop

Putting it all together:


The loop:

  1. Developer commits to Git

  2. ArgoCD polls Git (every 3 min by default)

  3. ArgoCD compares desired (Git) vs actual (cluster)

  4. If different: sync cluster to match Git

  5. Update status: Synced/Healthy

  6. Repeat forever

This continuous loop ensures:

  • What is in Git is what is deployed

  • Manual changes are reverted

  • Cluster matches Git

  • Zero drift tolerance

Key Takeaways

  1. Declarative infrastructure

    • Describe WHAT, not HOW

    • Idempotent operations

    • YAML manifests = desired state

  2. Desired state (Git) vs Actual state (cluster)

    • Git = source of truth

    • Cluster = current reality

    • Goal: Keep them in sync

  3. Reconciliation loops

    • Continuous comparison

    • Auto-healing

    • Prevents drift

    • Self-correcting system

  4. Git workflows

    • Branch per environment

    • Directory per environment

    • Kustomize overlays (recommended)

    • Separate app + config repos

  5. Pull vs Push

    • Pull-based = more secure

    • Cluster credentials stay in cluster

    • ArgoCD pulls from Git

    • CI only builds and updates Git

  6. Drift detection

    • Manual changes detected

    • Auto-corrected (if selfHeal enabled)

    • Some drift is okay (HPA, VPA)

    • Configure ignoreDifferences

In the next article, we'll dive deep into ArgoCD architecture: how it's built, what components it has, and how they work together to make GitOps magic happen.

