ArgoCD Architecture and Components

When I Needed to Understand ArgoCD Internals

Our ArgoCD instance was syncing 127 applications across 4 Kubernetes clusters. Everything worked perfectly... until one day it didn't.

# Check ArgoCD status
kubectl get pods -n argocd
# argocd-server: CrashLoopBackOff
# argocd-repo-server: Running
# argocd-application-controller: Running but CPU 400%

# Check logs
kubectl logs -n argocd statefulset/argocd-application-controller
# [ERROR] Failed to sync application: timeout
# [ERROR] Failed to sync application: timeout
# [ERROR] Failed to sync application: timeout
# ... 127 timeouts

The application controller was overwhelmed. But why? What was it actually doing?

I needed to understand ArgoCD's architecture to fix this. Let me share what I learned.

ArgoCD High-Level Architecture

ArgoCD is a Kubernetes-native continuous delivery tool. It runs inside your Kubernetes cluster as a set of pods.

Core components:

  1. API Server - Web UI, CLI, API endpoints

  2. Repository Server - Fetches manifests from Git

  3. Application Controller - Reconciliation loop (sync Git → cluster)

  4. Redis - Cache and message queue

  5. Dex - SSO integration (optional)

Let's dive into each component.

API Server (argocd-server)

The API server is the frontend of ArgoCD.

What It Does

Responsibilities:

  1. Serve Web UI (port 8080)

  2. Handle CLI requests (port 8080)

  3. Expose REST and gRPC APIs (both on port 8080; port 8083 serves metrics)

  4. Authentication (local users, SSO via Dex)

  5. Authorization (RBAC)

  6. Proxy to Kubernetes API

  7. Cache reads from Redis

API Server Pod

What I Learned: API Server Scaling

When you have 100+ applications and 50+ users, you need to scale the API server.

Signs you need to scale:

  • UI is slow

  • CLI commands time out

  • High CPU on argocd-server pod

  • Many users accessing simultaneously
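When those signs show up, adding API server replicas is the usual first step. Here is a minimal sketch, assuming the default install (namespace argocd, Deployment argocd-server). The function name scale_api_server is my own, not part of ArgoCD. It also sets ARGOCD_API_SERVER_REPLICAS, which ArgoCD uses to split its concurrent-login limit across replicas.

```shell
# Sketch only: assumes namespace "argocd" and Deployment "argocd-server".
# scale_api_server is a hypothetical helper name.
scale_api_server() {
  replicas="$1"
  # Add or remove argocd-server pods.
  kubectl -n argocd scale deployment argocd-server --replicas="$replicas"
  # Keep the replica-count hint in step with the real replica count.
  kubectl -n argocd set env deployment/argocd-server \
    ARGOCD_API_SERVER_REPLICAS="$replicas"
}
# Usage: scale_api_server 3
```

If ArgoCD is installed through Helm or manages itself, set the same values in that source of truth instead, or the next sync will revert them.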

Repository Server (argocd-repo-server)

The repo server is the Git interface of ArgoCD.

What It Does

Responsibilities:

  1. Clone Git repositories

  2. Fetch manifests from Git

  3. Run Kustomize build

  4. Render Helm charts

  5. Execute config management plugins

  6. Cache results in Redis

  7. Serve manifests to application controller

Repo Server Pod

What I Learned: Repo Server is CPU-Intensive

Rendering Helm charts and building Kustomize manifests is expensive.

Problem I faced:

  • 127 applications

  • 50+ Helm charts

  • Sync every 3 minutes

  • Repo server: CPU 300-400%

Solution:
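A typical remedy, sketched here for a default install, is to add repo-server replicas and cap how many manifests each replica generates at once. The namespace, the replica count and the limit are assumptions to adapt, and scale_repo_server is a hypothetical helper name.

```shell
# Sketch only: assumes namespace "argocd" and Deployment "argocd-repo-server".
# scale_repo_server is a hypothetical helper name.
scale_repo_server() {
  replicas="$1"   # number of repo-server pods
  limit="$2"      # max concurrent manifest generations per pod
  kubectl -n argocd scale deployment argocd-repo-server --replicas="$replicas"
  # reposerver.parallelism.limit is read from argocd-cmd-params-cm at startup.
  kubectl -n argocd patch configmap argocd-cmd-params-cm --type merge \
    -p "{\"data\":{\"reposerver.parallelism.limit\":\"$limit\"}}"
  # Restart so the pods pick up the new limit.
  kubectl -n argocd rollout restart deployment argocd-repo-server
}
# Usage: scale_repo_server 3 4
```

The limit trades throughput for memory safety: without it, many Helm or Kustomize renders running at once can push a pod into an out-of-memory kill.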

Signs you need to scale:

  • Sync timeouts

  • High CPU on repo-server pod

  • Slow manifest generation

  • Many Helm/Kustomize apps

Application Controller (argocd-application-controller)

The application controller is the heart of ArgoCD. This is where GitOps reconciliation happens.

What It Does

Responsibilities:

  1. Watch Application CRDs

  2. Fetch desired state from Git (via repo server)

  3. Fetch actual state from Kubernetes

  4. Compare desired vs actual

  5. Sync if different (apply manifests)

  6. Monitor application health

  7. Detect drift

  8. Update application status

  9. Trigger sync hooks and waves
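Steps 2 through 5 boil down to a compare-then-act loop. As a deliberately naive illustration, assume the desired and live manifests have been written to two files. This is a byte-for-byte comparison, not ArgoCD's structured, normalized diff, and compare_state and the file names are hypothetical.

```shell
# Toy comparison: prints "Synced" or "OutOfSync", the two sync status
# names ArgoCD uses. Real ArgoCD diffs normalized objects, not raw files.
compare_state() {
  desired="$1"   # manifest rendered from Git
  live="$2"      # manifest read back from the cluster
  if cmp -s "$desired" "$live"; then
    echo "Synced"
  else
    echo "OutOfSync"
  fi
}
# Usage: compare_state desired.yaml live.yaml
```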

Application Controller Pod

Why StatefulSet?

  • Need stable identity

  • Sharding support (multi-replica)

  • Each replica (shard) handles a subset of the managed clusters and the applications deployed to them
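To see the difference on a running install, you can list which components run as Deployments and which as a StatefulSet. This is a small sketch that assumes the argocd namespace; show_workloads is a hypothetical helper name.

```shell
# Sketch only: assumes ArgoCD is installed in namespace "argocd".
# show_workloads is a hypothetical helper name.
show_workloads() {
  kubectl -n argocd get deployments,statefulsets \
    -o custom-columns=KIND:.kind,NAME:.metadata.name,REPLICAS:.spec.replicas
}
# Usage: show_workloads
```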

What I Learned: Application Controller is the Bottleneck

With 127 applications syncing every 3 minutes, the controller was overwhelmed.

Math:
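A back-of-the-envelope version of that arithmetic, using the 127 applications and the 3-minute (180-second) interval from above. The 5 seconds per reconciliation is a made-up figure for illustration, not a measurement, and reconcile_load is a hypothetical helper name.

```shell
# reconcile_load APPS INTERVAL_SECONDS SECONDS_PER_APP
# Prints the reconciliation rate and the average number of reconciliations
# in flight at once (rate x duration). SECONDS_PER_APP is an assumption.
reconcile_load() {
  awk -v apps="$1" -v interval="$2" -v cost="$3" 'BEGIN {
    printf "rate=%.2f/s concurrent=%.2f\n", apps / interval, apps * cost / interval
  }'
}

reconcile_load 127 180 5
# prints: rate=0.71/s concurrent=3.53
```

Under that assumption the controller has three to four reconciliations running at every moment, before counting retries after timeouts, which only add to the pile.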

Solution: Enable sharding
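Enabling sharding means raising the replica count of the controller StatefulSet and repeating that number in the ARGOCD_CONTROLLER_REPLICAS environment variable. Here is a minimal sketch, assuming the default namespace and resource names; enable_controller_sharding is a hypothetical helper name.

```shell
# Sketch only: assumes namespace "argocd" and StatefulSet
# "argocd-application-controller". The helper name is hypothetical.
enable_controller_sharding() {
  replicas="$1"
  # Each controller needs to know the total shard count.
  kubectl -n argocd set env statefulset/argocd-application-controller \
    ARGOCD_CONTROLLER_REPLICAS="$replicas"
  # Then run that many controller pods.
  kubectl -n argocd scale statefulset argocd-application-controller \
    --replicas="$replicas"
}
# Usage: enable_controller_sharding 2
```

Shards are assigned per managed cluster, so with 4 clusters more than 4 replicas would leave some controllers idle.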

How sharding works:
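Each controller replica is responsible for a subset of the managed clusters. The sketch below uses a simple round-robin assignment to show the idea. It is a simplification (ArgoCD's default algorithm hashes a cluster identifier instead), and the cluster names and the assign_shards helper are hypothetical.

```shell
# assign_shards REPLICAS CLUSTER...
# Simplified round-robin: cluster i goes to shard (i mod REPLICAS).
assign_shards() {
  replicas="$1"
  shift
  i=0
  for cluster in "$@"; do
    echo "$cluster -> shard $((i % replicas))"
    i=$((i + 1))
  done
}

assign_shards 2 prod-us prod-eu staging dev
# prints:
# prod-us -> shard 0
# prod-eu -> shard 1
# staging -> shard 0
# dev -> shard 1
```

Because the unit of sharding is the cluster, one very large cluster still lands on a single controller replica.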

Redis

Redis is the cache and message bus for ArgoCD.

What It Stores

Data stored:

  1. Git repository cache (clone results)

  2. Manifest generation cache (Helm/Kustomize output)

  3. Application state cache

  4. Sync operation queue

  5. User session data

  6. RBAC policy cache

Redis Pod

What I Learned: Redis Can Be a Bottleneck

Problem: With many apps and frequent syncs, Redis can hit memory limits.

Solution:
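One straightforward fix is to give the Redis pod more memory. Here is a minimal sketch, assuming the non-HA install where Redis runs as the Deployment argocd-redis. The sizes are placeholders to tune, and set_redis_memory is a hypothetical helper name.

```shell
# Sketch only: assumes namespace "argocd" and Deployment "argocd-redis"
# (the HA install uses different resource names).
# set_redis_memory is a hypothetical helper name.
set_redis_memory() {
  request="$1"   # e.g. 256Mi
  limit="$2"     # e.g. 1Gi
  kubectl -n argocd set resources deployment argocd-redis \
    --requests=memory="$request" --limits=memory="$limit"
}
# Usage: set_redis_memory 256Mi 1Gi
```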

For production with 100+ apps:

  • Use Redis with persistence (enable AOF)

  • Or use external Redis (AWS ElastiCache, etc.)

  • Monitor Redis memory usage

Dex (Optional SSO Server)

Dex is an identity provider that integrates with SSO systems.

What It Does

Supports:

  • OIDC (OpenID Connect)

  • SAML 2.0

  • LDAP

  • GitHub

  • Google

  • GitLab

  • Okta

  • Azure AD

  • etc.

Dex Configuration Example
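Dex connectors are configured under the dex.config key of the argocd-cm ConfigMap. Below is a minimal sketch of a GitHub connector. The client ID and organization are placeholders, and write_dex_patch is a hypothetical helper name. The value $dex.github.clientSecret tells ArgoCD to read the secret from the dex.github.clientSecret key of argocd-secret rather than storing it in the ConfigMap.

```shell
# Prints a merge patch for argocd-cm. The quoted 'EOF' keeps the shell
# from expanding $dex.github.clientSecret. Placeholders are in <angle brackets>.
write_dex_patch() {
  cat <<'EOF'
data:
  dex.config: |
    connectors:
      - type: github
        id: github
        name: GitHub
        config:
          clientID: <your-client-id>
          clientSecret: $dex.github.clientSecret
          orgs:
            - name: <your-github-org>
EOF
}
# Usage (assumes namespace "argocd"):
#   write_dex_patch > dex-patch.yaml
#   kubectl -n argocd patch configmap argocd-cm --type merge --patch-file dex-patch.yaml
```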

Without Dex:

  • Local user accounts only

  • Manually manage users/passwords

  • No SSO integration

With Dex:

  • SSO with corporate identity provider

  • Automatic user provisioning

  • Group/team-based access

How Components Work Together

Let's trace a complete sync operation:

Scenario: Developer Pushes to Git

Step-by-Step Flow

1. Application Controller polls Git (every 3 min)

2. Repo Server fetches from Git

3. Application Controller compares states

4. Application Controller syncs

5. API Server shows status
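You can walk through the same flow by hand with the argocd CLI instead of waiting for the next poll. This is a rough sketch that assumes you are already logged in with argocd login. The application name in the usage line is a placeholder, and trace_sync is a hypothetical helper name.

```shell
# Sketch only: drives one application through the five steps manually.
trace_sync() {
  app="$1"
  argocd app get "$app" --refresh   # steps 1-2: re-fetch desired state from Git
  argocd app diff "$app"            # step 3: desired vs live (non-zero exit if they differ)
  argocd app sync "$app"            # step 4: apply the manifests
  argocd app wait "$app" --health   # step 5: block until the app reports Healthy
}
# Usage: trace_sync my-app
```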

Mermaid Sequence Diagram

Component Communication

Ports:

  • API Server: 8080 (HTTP and gRPC), 8083 (metrics)

  • Repo Server: 8081 (gRPC)

  • Redis: 6379 (TCP)

  • Dex: 5556 (HTTP)

  • Kubernetes API: 6443 (HTTPS)
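To reach the API server from a workstation without an Ingress, a port-forward to its Service is enough. Here is a minimal sketch, assuming the default Service argocd-server, which listens on 443 and forwards to the container's port 8080. forward_ui is a hypothetical helper name.

```shell
# Sketch only: forwards a local port (default 8080) to the argocd-server Service.
forward_ui() {
  local_port="${1:-8080}"
  kubectl -n argocd port-forward svc/argocd-server "$local_port:443"
}
# Usage: forward_ui 8080   # then open https://localhost:8080
```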

Performance Tuning Based on Architecture

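For 1-50 Applications

The default setup is enough.

For 50-200 Applications

Scale to 2-3 replicas.

For 200+ Applications

Scale to 3-5 replicas and enable sharding on the application controller.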
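The three tiers can be expressed as a small lookup. The thresholds are this article's rules of thumb rather than limits enforced by ArgoCD, and suggest_tier is a hypothetical helper name.

```shell
# suggest_tier APP_COUNT
# Maps an application count to the tuning tier used in this article.
suggest_tier() {
  apps="$1"
  if [ "$apps" -le 50 ]; then
    echo "default setup"
  elif [ "$apps" -le 200 ]; then
    echo "2-3 replicas"
  else
    echo "3-5 replicas, enable sharding"
  fi
}

suggest_tier 127
# prints: 2-3 replicas
```

Application count is only a proxy. A handful of very large Helm charts or many managed clusters can justify moving up a tier earlier.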

Troubleshooting Based on Architecture

Slow UI/CLI

Problem: API Server overwhelmed

Check:

Solution: Scale API server replicas

Sync Timeouts

Problem: Repo Server slow or Application Controller overwhelmed

Check:

Solution:

  • Scale Repo Server (if high CPU)

  • Scale Application Controller with sharding (if many apps)

High Memory Usage

Problem: Redis cache too small

Check:

Solution: Increase Redis memory limits
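Each of the checks above comes down to the same two questions for one component: how much CPU and memory is it using, and what do its recent logs say? This sketch assumes the standard app.kubernetes.io/name labels from the upstream manifests and a metrics-server in the cluster (needed for kubectl top). check_component is a hypothetical helper name.

```shell
# check_component NAME
# NAME is one of: argocd-server, argocd-repo-server,
# argocd-application-controller, argocd-redis
check_component() {
  name="$1"
  # Current CPU and memory per pod (requires metrics-server).
  kubectl -n argocd top pod -l "app.kubernetes.io/name=$name"
  # Last 50 log lines from the matching pods.
  kubectl -n argocd logs -l "app.kubernetes.io/name=$name" --tail=50
}
# Usage: check_component argocd-repo-server
```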

Key Takeaways

  1. ArgoCD has 5 core components

    • API Server: Frontend (UI, CLI, API)

    • Repo Server: Git interface (clone, build manifests)

    • Application Controller: Reconciliation engine (sync Git → cluster)

    • Redis: Cache and message bus

    • Dex: SSO integration (optional)

  2. Application Controller is the heart

    • Does the actual GitOps reconciliation

    • Watches Application CRDs

    • Compares desired (Git) vs actual (cluster)

    • Applies changes

  3. Repo Server is CPU-intensive

    • Renders Helm charts

    • Builds Kustomize manifests

    • Scale when you have many Helm/Kustomize apps

  4. Scale based on load

    • 1-50 apps: Default setup

    • 50-200 apps: Scale to 2-3 replicas

    • 200+ apps: Scale to 3-5 replicas, enable sharding

  5. Understand the flow

    • Dev pushes → Git

    • Controller polls → Repo Server

    • Repo Server fetches → Git

    • Controller compares → desired vs actual

    • Controller syncs → Kubernetes

In the next article, we'll get hands-on: installing ArgoCD on a Kubernetes cluster, configuring it, and preparing it for your first GitOps deployment.
