Multi-Environment Management

Article 9 of 12 in the Cloud Landing Zone Series

Introduction

Through incident response and root cause analysis, I've learned that proper environment separation is critical for preventing production issues.

Investigating production incidents revealed common patterns:

  • Development code accidentally deployed to production

  • Wrong database connection strings in production configuration

  • Test data mixing with production systems

  • Insufficient access controls between environments

  • Lack of clear environment boundaries

These incidents consistently trace back to weak environment separation. When development, staging, and production environments aren't clearly isolated, mistakes cascade into production impact.

This article shares the environment management patterns I've implemented to create strong boundaries between dev, staging, and production - covering account-level isolation, network segmentation, IAM separation, and promotion workflows that prevent environment confusion.

Environment Separation Strategies

Account-Level Isolation (AWS)

Organization Root
├── Production OU
│   ├── prod-payments-account
│   ├── prod-users-account
│   └── prod-data-account
├── Staging OU
│   ├── staging-payments-account
│   ├── staging-users-account
│   └── staging-data-account
└── Development OU
    ├── dev-shared-account
    └── dev-sandbox-accounts

Benefits:

  • Complete blast radius isolation

  • Different IAM policies per environment

  • Separate cost tracking

  • Production SCPs prevent accidental changes
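
To make the SCP benefit concrete, here is a minimal sketch of a service control policy that could be attached to the Production OU. It only builds the policy JSON and does not call AWS. The break-glass role name (`BreakGlassAdmin`) and the list of denied actions are assumptions chosen for illustration, not values from a real setup.

```python
import json

# Hypothetical role allowed to bypass the guardrail in emergencies (an assumption).
BREAK_GLASS_ROLE_ARN = "arn:aws:iam::*:role/BreakGlassAdmin"

# Illustrative destructive actions to block in production accounts (an assumption).
DENIED_ACTIONS = [
    "rds:DeleteDBInstance",
    "s3:DeleteBucket",
    "ec2:DeleteVpc",
    "cloudtrail:StopLogging",
]


def build_production_scp(denied_actions, exempt_role_arn):
    """Return an SCP document that denies the given actions unless the
    caller is the exempt (break-glass) role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDestructiveActionsInProd",
                "Effect": "Deny",
                "Action": sorted(denied_actions),
                "Resource": "*",
                "Condition": {
                    "ArnNotLike": {"aws:PrincipalArn": exempt_role_arn}
                },
            }
        ],
    }


scp = build_production_scp(DENIED_ACTIONS, BREAK_GLASS_ROLE_ARN)
print(json.dumps(scp, indent=2))
```

Because SCPs set the outer limit on what an account's own IAM policies can allow, a deny like this holds even for administrators inside the production accounts.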

Subscription-Level Isolation (Azure)

Network Isolation Patterns

Dedicated VPCs per Environment

IAM Separation

Environment-Specific Roles

Configuration Management

Environment-Specific Configuration

AWS Systems Manager Parameter Store

Promotion Workflows

CI/CD Pipeline with Environment Promotion

Blue/Green Deployment for Production

Data Management Across Environments

Synthetic Data for Non-Production

Production Data Anonymization

Cost Optimization Per Environment

What I Learned

Lesson 1: Account Isolation Prevents Disasters

Every production incident I've seen could have been prevented with proper account separation.

Action: Production OU with strict SCPs, staging OU, development OU. Never mix.

Lesson 2: Network Isolation is Critical

Dev environments should NEVER have network access to production databases.

Action: Separate VPCs, Transit Gateway route table isolation, no peering between environments.
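
One way to keep this honest is to check the network plan in code before anything is deployed. The sketch below assumes an illustrative CIDR plan (the ranges are not from a real deployment) and treats a route table as a plain list of destination CIDRs. It verifies that environment blocks do not overlap and flags any route that reaches into another environment.

```python
import ipaddress
from itertools import combinations

# Illustrative CIDR plan; the ranges are assumptions.
ENV_CIDRS = {
    "dev": "10.0.0.0/16",
    "staging": "10.1.0.0/16",
    "prod": "10.2.0.0/16",
}


def overlapping_pairs(env_cidrs):
    """Return pairs of environments whose VPC CIDR blocks overlap."""
    nets = {env: ipaddress.ip_network(cidr) for env, cidr in env_cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


def leaked_routes(route_table, own_env, env_cidrs):
    """Return (destination, environment) pairs for routes that reach into
    another environment's CIDR block. A default route (0.0.0.0/0) is
    flagged as well, because it overlaps every block."""
    foreign = {
        env: ipaddress.ip_network(cidr)
        for env, cidr in env_cidrs.items()
        if env != own_env
    }
    leaks = []
    for destination in route_table:
        dest_net = ipaddress.ip_network(destination)
        for env, net in foreign.items():
            if dest_net.overlaps(net):
                leaks.append((destination, env))
    return leaks


dev_routes = ["10.0.0.0/16", "10.2.4.0/24"]
print(overlapping_pairs(ENV_CIDRS))                  # []
print(leaked_routes(dev_routes, "dev", ENV_CIDRS))   # [('10.2.4.0/24', 'prod')]
```

A check like this could run in CI against the planned Transit Gateway route tables, so a dev-to-prod route fails the build rather than reaching an account.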

Lesson 3: IAM Policies Must Differ by Environment

Developers need admin access in dev and read-only access in production, with a break-glass path for emergencies.

Action: Environment-specific IAM roles, MFA required for production, approval workflows.
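
As a rough illustration, the role definitions can be generated per environment so the differences are explicit. This sketch assumes a central identity account whose principals assume the roles and uses the AWS documentation placeholder account ID. The role naming scheme and the choice of `PowerUserAccess` for staging are my assumptions. It builds policy documents only and makes no AWS calls.

```python
import json

# AWS managed policy ARNs; mapping staging to PowerUserAccess is an assumption.
MANAGED_POLICY_BY_ENV = {
    "dev": "arn:aws:iam::aws:policy/AdministratorAccess",
    "staging": "arn:aws:iam::aws:policy/PowerUserAccess",
    "prod": "arn:aws:iam::aws:policy/ReadOnlyAccess",
}


def build_trust_policy(identity_account_id, require_mfa):
    """Trust policy letting principals in the identity account assume the
    role, optionally only when they signed in with MFA."""
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{identity_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if require_mfa:
        statement["Condition"] = {
            "Bool": {"aws:MultiFactorAuthPresent": "true"}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


def developer_role_spec(env, identity_account_id):
    """Describe the developer role for one environment (hypothetical naming)."""
    return {
        "role_name": f"developer-{env}",
        "managed_policy_arn": MANAGED_POLICY_BY_ENV[env],
        "trust_policy": build_trust_policy(
            identity_account_id, require_mfa=(env == "prod")
        ),
    }


# 111122223333 is a placeholder account ID, not a real account.
for env in ("dev", "staging", "prod"):
    print(json.dumps(developer_role_spec(env, "111122223333"), indent=2))
```

The break-glass role and the approval workflow are left out here. They would typically be a separate, closely audited role rather than a variation of the developer role.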

Lesson 4: Configuration as Code Prevents Drift

Keep environment-specific configuration in code rather than applying it by hand.

Action: Terraform locals, SSM Parameter Store, environment-specific tfvars files.
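
The same idea can be sketched outside Terraform: shared defaults plus per-environment overrides, with a strict check on the environment name. All keys, values, and the `/app/env/key` Parameter Store naming convention below are illustrative assumptions.

```python
VALID_ENVS = ("dev", "staging", "prod")

# Shared defaults and per-environment overrides (illustrative values).
DEFAULTS = {"log_level": "INFO", "instance_type": "t3.small", "min_instances": 1}
OVERRIDES = {
    "dev": {"log_level": "DEBUG"},
    "staging": {"min_instances": 2},
    "prod": {"instance_type": "m5.large", "min_instances": 3},
}


def _check_env(env):
    """Reject anything that is not an exact, known environment name."""
    if env not in VALID_ENVS:
        raise ValueError(f"unknown environment: {env!r}")


def resolve_config(env):
    """Merge defaults with the overrides for one environment."""
    _check_env(env)
    return {**DEFAULTS, **OVERRIDES.get(env, {})}


def parameter_path(app, env, key):
    """Build a Parameter Store name such as /payments/prod/db_host."""
    _check_env(env)
    return f"/{app}/{env}/{key}"


print(resolve_config("prod"))
print(parameter_path("payments", "prod", "db_host"))
```

Failing on an unknown environment name, instead of silently falling back to defaults, is the point of the design. A typo such as `production` should stop a deployment rather than yield a half-configured one.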

Lesson 5: Promotion Workflows Ensure Quality

Code must pass tests in dev → staging before reaching production.

Action: CI/CD pipelines with mandatory environment progression, automated testing, manual approval for production.
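
The gate logic behind such a pipeline is small enough to sketch directly. The `Build` record and `can_promote` function below are hypothetical names for illustration. A real pipeline would express the same rules in its own configuration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Mandatory progression order.
PROMOTION_ORDER = ["dev", "staging", "prod"]


@dataclass
class Build:
    version: str
    passed_tests_in: set = field(default_factory=set)
    approved_by: Optional[str] = None  # set by the manual approval step


def can_promote(build, target_env):
    """Return (allowed, reason) for promoting a build to target_env."""
    if target_env not in PROMOTION_ORDER:
        return False, f"unknown environment: {target_env}"
    earlier = PROMOTION_ORDER[: PROMOTION_ORDER.index(target_env)]
    missing = [env for env in earlier if env not in build.passed_tests_in]
    if missing:
        return False, "tests not passed in: " + ", ".join(missing)
    if target_env == "prod" and not build.approved_by:
        return False, "manual approval required for prod"
    return True, "ok"


build = Build(version="1.4.2", passed_tests_in={"dev"})
print(can_promote(build, "staging"))  # (True, 'ok')
print(can_promote(build, "prod"))     # (False, 'tests not passed in: staging')
```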

Lesson 6: Never Copy Production Data to Non-Production

Raw production data in dev or staging is a GDPR, HIPAA, or PCI-DSS violation waiting to happen.

Action: Synthetic data for dev/staging, anonymization if production snapshots required.
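
Both options can be sketched with the standard library. The record fields (`user_id`, `email`, `name`, `country`) are assumptions for illustration. Note that keyed hashing is pseudonymization, which may not meet the bar a given regulation sets for anonymization, so treat this as a starting point only.

```python
import hashlib
import hmac
import random


def pseudonymize(value, secret_key):
    """Replace a value with a keyed hash. The same input always maps to the
    same token, so joins across tables still work."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize_user(record, secret_key):
    """Return a copy of a user record with direct identifiers replaced.
    The name field is dropped entirely."""
    return {
        "user_id": pseudonymize(str(record["user_id"]), secret_key),
        "email": pseudonymize(record["email"], secret_key) + "@example.invalid",
        "country": record["country"],  # coarse attribute kept as-is
    }


def synthetic_users(count, seed=0):
    """Generate fake users with no link to production data. A fixed seed
    makes the data set reproducible across runs."""
    rng = random.Random(seed)
    countries = ["DE", "FR", "US", "JP"]
    return [
        {
            "user_id": i,
            "email": f"user{i}@example.invalid",
            "country": rng.choice(countries),
        }
        for i in range(1, count + 1)
    ]


key = b"store-this-key-outside-non-prod"  # placeholder secret
row = {"user_id": 42, "email": "jane@corp.test", "name": "Jane", "country": "DE"}
print(anonymize_user(row, key))
print(synthetic_users(3))
```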

Lesson 7: Cost Optimization Differs by Environment

Production: always-on. Staging: business hours. Dev: on-demand.

Action: Autoscaling schedules, smaller instance types in non-prod, auto-shutdown after hours.
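
The schedule rule above fits in one function that a shutdown job could call. The business-hours window (Monday to Friday, 08:00 to 18:00 local time) and the `requested` flag for on-demand dev environments are assumptions for illustration.

```python
from datetime import datetime

# Illustrative business hours: Monday-Friday, 08:00-18:00 local time.
BUSINESS_DAYS = range(0, 5)  # datetime.weekday(): Monday is 0
BUSINESS_START_HOUR = 8
BUSINESS_END_HOUR = 18


def should_be_running(env, now, requested=False):
    """Decide whether an environment's compute should be up at `now`."""
    if env == "prod":
        return True  # always-on
    if env == "staging":
        return (
            now.weekday() in BUSINESS_DAYS
            and BUSINESS_START_HOUR <= now.hour < BUSINESS_END_HOUR
        )
    if env == "dev":
        return requested  # on-demand only
    raise ValueError(f"unknown environment: {env!r}")


wednesday_morning = datetime(2024, 1, 3, 10, 0)
saturday_morning = datetime(2024, 1, 6, 10, 0)
print(should_be_running("staging", wednesday_morning))  # True
print(should_be_running("staging", saturday_morning))   # False
print(should_be_running("prod", saturday_morning))      # True
```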


Next: Security Operations and Threat Protection - SOC integration, threat detection, incident response, vulnerability management.
