# Progressive Delivery in Kubernetes with Argo Rollouts and ArgoCD

## The Deployment That Went Wrong in 30 Seconds

I deployed a new version of an API service on a Friday afternoon. Green build, passed all tests, reviewed by two engineers.

Within 30 seconds of it hitting production:

```bash
kubectl get pods -n production
# NAME                     READY   STATUS             RESTARTS
# api-7f9c4d-xk2lp         0/1     CrashLoopBackOff   3
# api-7f9c4d-m9p3z         0/1     CrashLoopBackOff   3
# api-7f9c4d-q1n8r         0/1     CrashLoopBackOff   3
# (old pods already terminated)

kubectl logs api-7f9c4d-xk2lp
# Error: Cannot read properties of undefined (reading 'config')
# at ServiceBootstrap.init (/app/bootstrap.js:42:23)
```

A missing environment variable. Every pod was down. 100% of traffic hit dead pods.

The rollback was instant — but the 4 minutes of full outage already fired alerts, paged on-call engineers, and triggered customer-visible errors.

What I needed wasn't faster rollback. I needed a way to discover the problem before 100% of traffic hit it.

**That's progressive delivery.**

## What is Progressive Delivery?

Progressive delivery is a deployment strategy that controls *how much* of your traffic sees a new version, using **automated analysis** to decide whether to proceed or roll back — before the problem affects all users.

It builds on top of Continuous Delivery, adding traffic control and observability checkpoints:

```
Continuous Delivery: Build → Test → Deploy
                                         ↑
                          100% of traffic sees new version immediately

Progressive Delivery: Build → Test → Deploy 5% → Analyze → Deploy 25% → Analyze → 100%
                                        ↑               ↑
                                Limited blast radius   Rollback if metrics degrade
```

The core idea: **release progressively, validate continuously.**

### Why Kubernetes Deployments Aren't Enough

Native Kubernetes `Deployment` does support rolling updates:

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
```

But there's a critical limitation: traffic is routed purely based on pod count. If you have 10 pods and update 1, roughly 10% of requests hit the new version — but you have **no automated way to stop the rollout if error rates spike**. Kubernetes will keep rolling out regardless.

The other problem: Kubernetes `RollingUpdate` doesn't support proper blue-green deployments or traffic-percentage-based canaries. It's a best-effort pod count approximation.

For real progressive delivery in Kubernetes, you need **Argo Rollouts**.

## The Ecosystem: How the Tools Fit Together

{% @mermaid/diagram content="graph TB
subgraph "Git Repositories"
APPREPO[Application Repo<br/>Dockerfile, source code]
GITREPO[Config Repo<br/>Rollout manifests, Helm charts]
end

subgraph "CI Layer"
    CI[GitHub Actions / GitLab CI<br/>Build, test, push image]
end

subgraph "Kubernetes Cluster"
    subgraph "ArgoCD"
        ARGOCD[ArgoCD Application Controller<br/>Watches Config Repo]
    end

    subgraph "Argo Rollouts"
        CTRL[Rollouts Controller<br/>Traffic shifting logic]
        ANALYSIS[AnalysisTemplate<br/>Prometheus / Datadog metrics]
    end

    subgraph "Traffic Management"
        ISTIO[Istio VirtualService<br/>or NGINX Ingress]
    end

    subgraph "Workloads"
        STABLE[Stable ReplicaSet<br/>Current version]
        CANARY[Canary ReplicaSet<br/>New version]
    end
end

APPREPO --> CI
CI -->|Update image tag| GITREPO
GITREPO -->|Watched by| ARGOCD
ARGOCD -->|Applies Rollout manifest| CTRL
CTRL -->|Creates / manages| STABLE
CTRL -->|Creates / manages| CANARY
CTRL -->|Adjusts weights| ISTIO
CTRL -->|Queries metrics| ANALYSIS

style CTRL fill:#9cf,stroke:#333
style ARGOCD fill:#9f9,stroke:#333
style ANALYSIS fill:#f99,stroke:#333" %}

**Responsibilities:**

* **ArgoCD** — Git sync. Applies your `Rollout` manifests to the cluster.
* **Argo Rollouts Controller** — Manages canary/blue-green logic, traffic shifting.
* **Prometheus / Datadog** — Supplies metrics for automated promotion/rollback decisions.
* **Istio / NGINX** — Performs the actual traffic split at the network layer.

ArgoCD and Argo Rollouts are complementary, not competing. ArgoCD handles reconciliation; Argo Rollouts handles the delivery strategy.

## Installing Argo Rollouts

### Install the Controller

```bash
kubectl create namespace argo-rollouts

kubectl apply -n argo-rollouts \
  -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml
```

Verify the controller is running:

```bash
kubectl get pods -n argo-rollouts
# NAME                             READY   STATUS    RESTARTS
# argo-rollouts-5d9c6d9f8b-xp2qk   1/1     Running   0
```

### Install the kubectl Plugin

The `kubectl argo rollouts` plugin gives you a live-updating dashboard in the terminal:

```bash
# macOS
brew install argoproj/tap/kubectl-argo-rollouts

# Linux
curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
chmod +x kubectl-argo-rollouts-linux-amd64
sudo mv kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts
```

Test it:

```bash
kubectl argo rollouts version
# argo-rollouts: v1.7.2
```

### Teach ArgoCD About Rollout Health (Optional but Recommended)

This custom health check teaches ArgoCD how to interpret `Rollout` status (Healthy, Progressing, Suspended, or Degraded), so rollout state is reported correctly in ArgoCD's web interface:

```bash
# Patch the argocd-cm ConfigMap with a custom health check for Rollouts
kubectl patch configmap argocd-cm -n argocd --patch '
data:
  resource.customizations: |
    argoproj.io/Rollout:
      health.lua: |
        hs = {}
        if obj.status ~= nil then
          if obj.status.phase == "Degraded" then
            hs.status = "Degraded"
            hs.message = obj.status.message
            return hs
          end
          if obj.status.phase == "Paused" then
            hs.status = "Suspended"
            hs.message = obj.status.message
            return hs
          end
          if obj.status.currentPodHash == obj.status.stableRS then
            if obj.spec.replicas == obj.status.readyReplicas then
              hs.status = "Healthy"
              return hs
            end
          end
        end
        hs.status = "Progressing"
        return hs
'
```

## Strategy 1: Canary Deployments

A canary deployment sends a small percentage of traffic to the new version, gradually increases it, and validates at each step.

The name comes from "canary in a coal mine" — if the canary dies, you know there's danger before it reaches everyone.

### The Rollout Resource

`Rollout` is an Argo Rollouts CRD that replaces the standard `Deployment`. The spec is nearly identical — you switch `kind: Deployment` to `kind: Rollout` and add a `strategy` section.

```yaml
# rollout-api.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api-service
  namespace: production
spec:
  replicas: 10
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
      - name: api
        image: myregistry/api-service:v2.1.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
  strategy:
    canary:
      steps:
      - setWeight: 5         # Send 5% traffic to canary
      - pause: {duration: 2m}  # Wait 2 minutes, check metrics
      - setWeight: 20
      - pause: {duration: 5m}
      - setWeight: 50
      - pause: {duration: 5m}
      - setWeight: 100       # Full promotion
```

When you update the image tag, Argo Rollouts:

1. Creates a new `ReplicaSet` (canary)
2. Shifts 5% of traffic to it
3. Pauses and waits
4. Continues (or rolls back on failure)

### Traffic Shifting with Istio

Pod-count-based traffic splitting is approximate. For precise percentages, wire Argo Rollouts to Istio:

```yaml
# virtual-service.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: api-service-vsvc
  namespace: production
spec:
  hosts:
  - api-service
  http:
  - name: primary
    route:
    - destination:
        host: api-service
        subset: stable
      weight: 100
    - destination:
        host: api-service
        subset: canary
      weight: 0
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: api-service-destrule
  namespace: production
spec:
  host: api-service
  subsets:
  - name: stable
    labels:
      app: api-service
  - name: canary
    labels:
      app: api-service
```

Reference the VirtualService in your Rollout:

```yaml
strategy:
  canary:
    trafficRouting:
      istio:
        virtualService:
          name: api-service-vsvc
          routes:
          - primary
    steps:
    - setWeight: 5
    - pause: {duration: 2m}
    - setWeight: 20
    - pause: {duration: 5m}
    - setWeight: 50
    - pause: {duration: 5m}
```

Now when the controller sets `weight: 5`, Istio routes exactly 5% — regardless of pod count.
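Under the hood, the controller achieves this by editing the VirtualService's route weights in place. Mid-rollout, at the 5% step, the managed routes would look roughly like this (a sketch of controller-written state, not something you author yourself):

```yaml
# api-service-vsvc as mutated by the Rollouts controller at setWeight: 5
http:
- name: primary
  route:
  - destination:
      host: api-service
      subset: stable
    weight: 95
  - destination:
      host: api-service
      subset: canary
    weight: 5
```

When the rollout completes or aborts, the controller resets the weights to 100/0 in favor of the stable subset.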

### Watching a Canary Rollout Live

```bash
kubectl argo rollouts get rollout api-service -n production --watch
```

Output:

```
Name:            api-service
Namespace:       production
Status:          ॥ Paused
Message:         CanaryPauseStep
Strategy:        Canary
  Step:          2/6
  SetWeight:     5
  ActualWeight:  5

REVISION  STATUS      PODS  STABLE  CANARY
2         ॥ Paused    1/1           ✔
1         Healthy     9/9   ✔
```

### Promoting Manually

If you want to skip a pause and promote immediately:

```bash
# Promote one step
kubectl argo rollouts promote api-service -n production

# Full promotion (skip all remaining steps)
kubectl argo rollouts promote api-service -n production --full
```

### Aborting (Rolling Back)

```bash
kubectl argo rollouts abort api-service -n production
```

The controller routes 100% traffic back to the stable ReplicaSet and marks the rollout as `Degraded`. You can then retry after fixing the issue.

## Strategy 2: Blue-Green Deployments

Blue-green keeps two full environments alive: **blue** (current stable) and **green** (new version). You switch all traffic at once, but green is pre-warmed and validated before the switch.

```
Before cutover:              After cutover:
┌─────────────┐              ┌─────────────┐
│   blue      │ ← 100%       │   blue      │ ← 0%
│   (stable)  │              │   (stable)  │ (kept for fast rollback)
└─────────────┘              └─────────────┘
┌─────────────┐              ┌─────────────┐
│   green     │ ← 0%         │   green     │ ← 100%
│   (new)     │              │   (new)     │
└─────────────┘              └─────────────┘
```

The advantage over canary: no partial state. Users are never split between two versions. Critical for database schema changes where you can't have two different app versions talking to the same DB simultaneously.

### Blue-Green Rollout

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: payment-service
  namespace: production
spec:
  replicas: 5
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
      - name: payment
        image: myregistry/payment-service:v3.0.0
        ports:
        - containerPort: 8080
  strategy:
    blueGreen:
      activeService: payment-service-active      # Production traffic
      previewService: payment-service-preview    # Pre-prod traffic (testing)
      autoPromotionEnabled: false                # Manual promotion required
      scaleDownDelaySeconds: 300                 # Keep blue alive 5 min after cutover
```

Two Services are required:

```yaml
# Active service (production traffic)
apiVersion: v1
kind: Service
metadata:
  name: payment-service-active
  namespace: production
spec:
  selector:
    app: payment-service
  ports:
  - port: 80
    targetPort: 8080
---
# Preview service (for smoke testing the green environment)
apiVersion: v1
kind: Service
metadata:
  name: payment-service-preview
  namespace: production
spec:
  selector:
    app: payment-service
  ports:
  - port: 80
    targetPort: 8080
```
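Both Services start with identical selectors, which looks wrong at first glance. It isn't: at runtime the Rollouts controller injects a `rollouts-pod-template-hash` label into each Service's selector, so that active and preview resolve to different ReplicaSets. Roughly (the hash value below is illustrative):

```yaml
# payment-service-active, as mutated by the controller at runtime
spec:
  selector:
    app: payment-service
    rollouts-pod-template-hash: 7f9c4d8b6   # illustrative hash of the stable pod template
```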

### Blue-Green Flow

{% @mermaid/diagram content="sequenceDiagram
participant Dev as Developer
participant CD as ArgoCD
participant RO as Argo Rollouts
participant Blue as Blue (Stable)
participant Green as Green (New)
participant Traffic as Production Traffic

Dev->>CD: Update image tag in Git
CD->>RO: Sync Rollout manifest
RO->>Green: Create green ReplicaSet
RO->>Green: Route preview Service to green
Note over Green: Green warming up
Dev->>Green: Smoke tests via preview URL
Dev->>RO: kubectl argo rollouts promote payment-service
RO->>Traffic: Switch active Service to green
Note over Blue: Blue kept alive (scaleDownDelay)
RO->>Blue: Scale down blue after 5 min" %}

The key step: you run your smoke tests against `payment-service-preview` before promoting. If anything fails, you just don't promote — blue is still serving 100% of traffic.
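A minimal smoke-test sketch against the preview Service; the endpoint paths and expected statuses are illustrative, not from a real service. Run it from inside the cluster, or locally via `kubectl port-forward svc/payment-service-preview 8080:80 -n production`:

```shell
#!/usr/bin/env sh
# Smoke-test the green environment via the preview Service.
# Target the preview Service, never the active one.
PREVIEW_URL="${PREVIEW_URL:-http://localhost:8080}"

check_endpoint() {
  # Succeeds only if the path returns the expected HTTP status code.
  path="$1"; expected="$2"
  status=$(curl -s -o /dev/null -w '%{http_code}' "${PREVIEW_URL}${path}")
  [ "$status" = "$expected" ]
}

# Illustrative checks -- adapt the paths to your API before using:
# check_endpoint /health 200         || { echo "health check failed"; exit 1; }
# check_endpoint /api/v1/charges 401 || { echo "auth gate failed"; exit 1; }
```

If any check fails, you simply never run `promote`; blue keeps serving 100% of traffic.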

### Promoting the Blue-Green Rollout

```bash
# Check status - green is ready but not active
kubectl argo rollouts get rollout payment-service -n production

# Promote - switches traffic from blue to green
kubectl argo rollouts promote payment-service -n production
```

## Automated Promotion and Rollback with AnalysisRun

Manual promotion works, but the real power is automated analysis: let Prometheus metrics decide whether to proceed or roll back.

### AnalysisTemplate

An `AnalysisTemplate` defines what metrics to query and what constitutes a passing or failing result:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
  namespace: production
spec:
  args:
  - name: service-name
  metrics:
  - name: success-rate
    interval: 1m         # Query every minute
    count: 5             # Run 5 times total
    successCondition: result[0] >= 0.95   # 95% success rate required
    failureLimit: 1      # 1 failure allowed before aborting
    provider:
      prometheus:
        address: http://prometheus.monitoring.svc.cluster.local:9090
        query: |
          sum(
            rate(http_requests_total{
              service="{{args.service-name}}",
              status!~"5.."
            }[2m])
          ) /
          sum(
            rate(http_requests_total{
              service="{{args.service-name}}"
            }[2m])
          )
```

### Wiring Analysis into a Canary Rollout

```yaml
strategy:
  canary:
    analysis:
      templates:
      - templateName: success-rate
      startingStep: 2      # Start analysis at step 2 (after 5% traffic for 2 min)
      args:
      - name: service-name
        value: api-service
    steps:
    - setWeight: 5
    - pause: {duration: 2m}
    - setWeight: 20
    - pause: {duration: 5m}  # Analysis runs here in parallel
    - setWeight: 50
    - pause: {duration: 5m}
```

Now if the success rate drops below 95% during the pause, Argo Rollouts **automatically aborts** and routes traffic back to stable — without anyone having to notice or intervene.

### Background Analysis

You can also run analysis continuously throughout a canary, not just at pause steps:

```yaml
strategy:
  canary:
    analysis:
      templates:
      - templateName: success-rate
      args:
      - name: service-name
        value: api-service
    steps:
    - setWeight: 5
    - pause: {duration: 2m}
    - setWeight: 20
    - pause: {duration: 5m}
    - setWeight: 50
    - pause: {duration: 5m}
```

Background analysis runs from the first step and monitors continuously. A failure at any point triggers automatic rollback.

### Datadog as an Analysis Provider

If you use Datadog instead of Prometheus:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate-datadog
  namespace: production
spec:
  args:
  - name: service-name
  metrics:
  - name: error-rate
    interval: 1m
    count: 5
    successCondition: result <= 0.01   # Less than 1% error rate
    failureLimit: 2
    provider:
      datadog:
        apiVersion: "v2"
        query: |
          sum:trace.web.request.errors{service:{{args.service-name}}}
          /
          sum:trace.web.request.hits{service:{{args.service-name}}}
```

Argo Rollouts supports Prometheus, Datadog, New Relic, CloudWatch, Graphite, and custom web hooks.
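If none of the built-in providers fit, the generic `web` provider can poll any HTTP endpoint and evaluate a field from its JSON response. A minimal sketch; the URL and response shape here are assumptions for illustration, not a real service:

```yaml
metrics:
- name: custom-quality-gate
  interval: 1m
  count: 5
  successCondition: result == true
  provider:
    web:
      # Hypothetical internal endpoint returning {"pass": true|false}
      url: "http://quality-gate.monitoring.svc.cluster.local/verdict?service={{args.service-name}}"
      jsonPath: "{$.pass}"
```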

## End-to-End GitOps Flow with Progressive Delivery

Putting it all together with ArgoCD managing the lifecycle:

### Repository Structure

```
config-repo/
├── apps/
│   ├── api-service/
│   │   ├── rollout.yaml          # Rollout CRD (replaces Deployment)
│   │   ├── service.yaml
│   │   ├── analysis-template.yaml
│   │   └── virtual-service.yaml  # If using Istio
├── argocd/
│   └── api-service-app.yaml      # ArgoCD Application
```

### ArgoCD Application Pointing at the Rollout

```yaml
# argocd/api-service-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/mycompany/config-repo.git
    targetRevision: HEAD
    path: apps/api-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
```

### The Complete Deployment Flow

{% @mermaid/diagram content="sequenceDiagram
participant Dev as Developer
participant AppRepo as App Repo
participant CI as GitHub Actions
participant Registry as Container Registry
participant ConfigRepo as Config Repo
participant ArgoCD as ArgoCD
participant Rollouts as Argo Rollouts
participant Prometheus as Prometheus
participant K8s as Kubernetes

Dev->>AppRepo: git push (new feature)
AppRepo->>CI: Trigger workflow
CI->>CI: Build & unit test
CI->>Registry: Push image (api-service:sha-abc123)
CI->>ConfigRepo: Update rollout.yaml image tag to sha-abc123
ConfigRepo->>ArgoCD: Webhook / poll detects change
ArgoCD->>K8s: Apply Rollout manifest
K8s->>Rollouts: Rollout controller detects new spec
Rollouts->>K8s: Create canary ReplicaSet (1 pod = 5%)
Note over K8s: 5% traffic → canary
Rollouts->>Prometheus: Query success-rate metric
Prometheus-->>Rollouts: result: 0.98 (pass)
Rollouts->>K8s: Scale canary to 20%
Note over K8s: 20% traffic → canary
Rollouts->>Prometheus: Query again
Prometheus-->>Rollouts: result: 0.96 (pass)
Rollouts->>K8s: Scale to 50%, then 100%
Rollouts->>K8s: Mark new RS as stable, scale down old RS
Note over K8s: Deployment complete ✓" %}

### The CI Piece: Updating the Image Tag

The CI pipeline needs to push the new image tag into the config repo. Here's a GitHub Actions job that does it:

```yaml
# .github/workflows/deploy.yml
name: Build and Deploy

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      image_tag: ${{ steps.meta.outputs.tags }}
    steps:
    - uses: actions/checkout@v4

    - name: Generate image tag
      id: meta
      run: echo "tags=myregistry/api-service:sha-$(git rev-parse --short HEAD)" >> $GITHUB_OUTPUT

    - name: Build and push
      run: |
        docker build -t ${{ steps.meta.outputs.tags }} .
        docker push ${{ steps.meta.outputs.tags }}

  update-config:
    needs: build
    runs-on: ubuntu-latest
    steps:
    - name: Checkout config repo
      uses: actions/checkout@v4
      with:
        repository: mycompany/config-repo
        token: ${{ secrets.CONFIG_REPO_TOKEN }}

    - name: Update image tag in rollout.yaml
      run: |
        sed -i "s|image: myregistry/api-service:.*|image: ${{ needs.build.outputs.image_tag }}|" \
          apps/api-service/rollout.yaml
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
        git add apps/api-service/rollout.yaml
        git commit -m "deploy: api-service ${{ needs.build.outputs.image_tag }}"
        git push
```

ArgoCD picks up the config change, applies the updated `Rollout`, and Argo Rollouts takes control from there.

## ArgoCD UI Integration

Once you install the ArgoCD UI plugin for Argo Rollouts, you see the rollout status directly in ArgoCD's application view:

* Canary weight percentage
* Current step
* AnalysisRun status (running / passed / failed)
* ReplicaSet breakdown (stable vs canary)
* Pause/promote/abort buttons

You can promote or abort directly from the UI without touching the terminal — useful for engineers who aren't deep in kubectl.

## Checking Status and Debugging

### Getting Rollout Status

```bash
# Get all rollouts in namespace
kubectl argo rollouts list rollouts -n production

# Live-watch a specific rollout
kubectl argo rollouts get rollout api-service -n production --watch

# Get analysis run details
kubectl get analysisruns -n production
kubectl describe analysisrun api-service-abc123-5 -n production
```

### When a Rollout Gets Stuck

The most common issue: an analysis failure that isn't obvious from the rollout status.

```bash
# Check analysis run
kubectl get analysisrun -n production
# NAME                         STATUS    AGE
# api-service-abc123-5         Failed    3m

# Describe to see why
kubectl describe analysisrun api-service-abc123-5 -n production
# ...
# Message: Metric "success-rate" assessed Failed due to failed (1) > failureLimit (0)
# Last value: 0.89

# Check Prometheus is reachable from within the cluster
kubectl run debug --image=curlimages/curl -it --rm -n production -- \
  curl http://prometheus.monitoring.svc.cluster.local:9090/api/v1/query \
  --data-urlencode 'query=up'
```

Common causes of stuck rollouts:

* Prometheus query returns `NaN` when a service has zero requests (divide-by-zero). Fix: add `or vector(1)` fallback.
* Analysis template references wrong service label.
* `failureLimit: 0` with any transient network error causing analysis to fail. Set `failureLimit: 1` as a minimum.

### Aborting and Retrying

```bash
# Rollout is Degraded after failed analysis - fix the issue first
# Then retry
kubectl argo rollouts retry rollout api-service -n production
```

## Converting Existing Deployments

If you have existing `Deployment` resources, the migration is straightforward:

```yaml
# Before: standard Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-service
  template:
    ...
  strategy:
    type: RollingUpdate

# After: Argo Rollout
apiVersion: argoproj.io/v1alpha1  # Changed
kind: Rollout                      # Changed
metadata:
  name: api-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-service
  template:
    ...                            # Identical
  strategy:
    canary:                        # New section
      steps:
      - setWeight: 20
      - pause: {duration: 5m}
```

Apply the `Rollout` first and wait for its pods to become ready, then delete the old `Deployment`. Because both sets of pods match the Service selector during the overlap, the handover is non-disruptive; deleting the `Deployment` first would leave a window with no ready pods.
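Newer Argo Rollouts releases (v1.0+) also support adopting an existing Deployment by reference via `workloadRef`, which avoids copying the pod template into the Rollout at all. A sketch:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-service
  workloadRef:              # Borrow the pod template from the existing Deployment
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {duration: 5m}
```

With this approach you keep the Deployment around as the template source and scale it down to zero once the Rollout is healthy, rather than deleting it.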

## What Canary Can't Protect You From

Progressive delivery isn't a silver bullet. Things it doesn't catch:

* **Data migration failures** — If your migration breaks halfway through, canary routing doesn't help. Handle with blue-green + pre-migration snapshots.
* **External dependency issues** — If a third-party API your service depends on goes down after promotion, that's not something canary traffic analysis will predict.
* **Insufficient canary load** — Some errors only surface under load levels that small canary percentages don't generate. Consider a longer pause period or a dedicated load test stage before canary.
* **One-time initialization failures** — Some bugs only hit on the first request to a fresh pod. Adjust your `readinessProbe` so unhealthy pods never enter the canary pool.

## What I Learned Running This in Production

**Start simple, then add analysis.** My first Argo Rollouts setup was just weighted steps with manual pauses — no automated analysis. That alone was a huge improvement over raw `kubectl rollout`. I added Prometheus analysis only after I had a reliable metrics setup.

**The readiness probe is your first gate.** A pod that can't pass its readiness probe never joins the canary pool. Put real business-logic checks in your `/health` endpoint — not just "port is open."
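As a sketch of what a business-logic health check can look like (in Node.js, since the service in the opening story was one; the env var names here are illustrative), gating readiness on required configuration would have kept the misconfigured pods out of the pool entirely:

```javascript
// Sketch of a /health handler that gates readiness on required config.
// DB_URL and API_KEY are illustrative names, not from the original post.
const REQUIRED_ENV = ['DB_URL', 'API_KEY'];

function missingConfig(required, env) {
  // Return the subset of required variables that are absent or empty.
  return required.filter((name) => !env[name]);
}

function healthHandler(req, res) {
  const missing = missingConfig(REQUIRED_ENV, process.env);
  if (missing.length > 0) {
    // Readiness probe fails -> pod never joins the canary pool.
    res.writeHead(503, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 'unhealthy', missing }));
    return;
  }
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ status: 'ok' }));
}

module.exports = { missingConfig, healthHandler };
```

Wire `healthHandler` to the `/health` path your `readinessProbe` already hits, and extend it with real dependency checks (database ping, cache connectivity) as they apply.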

**Keep `scaleDownDelaySeconds` generous on blue-green.** The default is 30 seconds. I bumped it to 300 (5 minutes). If you promote and immediately notice something wrong, you can abort within the window and blue is still alive and serving traffic.

**Use `autoPromotionEnabled: false` when starting out.** Manual promotion on blue-green gives you a forcing function to run smoke tests against the preview environment. Once your automated tests cover enough surface area, switch to auto.

**The analysis query must handle zero-traffic cases.** When a service has just started, the request rate is near zero. A Prometheus rate query over 2 minutes might return `NaN`. Add a default:

```yaml
query: |
  (
    sum(rate(http_requests_total{service="api",status!~"5.."}[2m])) /
    sum(rate(http_requests_total{service="api"}[2m]))
  ) or vector(1)
```

`or vector(1)` returns 1.0 (100% success) when no data is available, which lets the rollout proceed without false failures during warmup.

## Summary

Progressive delivery addresses the fundamental problem with traditional deployments: the blast radius is always 100%.

With Argo Rollouts and ArgoCD:

* **Canary** — limit exposure to a percentage of real traffic, analyze, proceed or roll back automatically.
* **Blue-green** — run two full environments, validate the new one before the switch, instant cutover.
* **AnalysisRun** — tie Prometheus/Datadog metrics to automatic promotion/rollback decisions.
* **ArgoCD** — manage all of it declaratively from Git, with full visibility in the UI.

The deployment that crashed my API in 30 seconds? With a 5% canary and a 2-minute analysis window, the crash loop would have been confined to a single canary pod serving 5% of traffic instead of taking down all 10 pods, with an automatic rollback firing before I even got the alert.

That's the goal: **deploy with confidence, not prayers.**

***

## Further Reading

* [Argo Rollouts Documentation](https://argo-rollouts.readthedocs.io)
* [GitOps CI/CD Pipeline Integration](https://blog.htunnthuthu.com/cloud-engineering-and-infrastructure/gitops-101/08-gitops-cicd-pipeline-integration) — How to wire CI to update image tags in Git
* [Advanced ArgoCD Features](https://blog.htunnthuthu.com/cloud-engineering-and-infrastructure/gitops-101/07-advanced-argocd-features) — Sync waves, ApplicationSets, notifications
