Network Architecture and Connectivity


Introduction

Network architecture is one of the most critical yet frequently underestimated aspects of cloud landing zones. Through my work on multi-cloud projects, I've seen firsthand how poor network design creates cascading operational problems.

In one complex project involving AWS, Azure, and on-premises infrastructure, I encountered typical network challenges that arise when connectivity is treated as an afterthought:

  • Multiple cloud platforms with no centralized network strategy

  • Numerous point-to-point VPN connections managed manually

  • Overlapping IP address ranges across different VPCs and VNets

  • Lack of network segmentation between environments

  • Complex firewall rules documented in spreadsheets

  • DNS resolution issues across cloud boundaries

These networking problems stemmed from not establishing a clear network architecture before deploying workloads. Through the process of redesigning network architectures and implementing proper hub-and-spoke topologies, I've learned that investing in network design upfront prevents significant operational issues later.

This article shares the network architecture patterns and best practices I've developed through hands-on experience: hub-and-spoke topologies, hybrid connectivity, DNS strategies, network security, and the common pitfalls to avoid when building cloud networks.


Network Design Fundamentals

Before diving into specific technologies, let's establish the core principles that make or break cloud network architecture.

Principle 1: Connectivity Models

There are three primary models for connecting cloud resources:

Model 1: Flat Network (Don't Do This)

Why it fails:

  • ❌ No isolation (dev can accidentally affect prod)

  • ❌ Security nightmare (lateral movement)

  • ❌ No traffic control

  • ❌ Blast radius = entire network

Use case: Never. This is an anti-pattern.

Model 2: Hub-and-Spoke (Most Common)

Advantages:

  • ✅ Centralized management

  • ✅ Scalable (O(n) connections)

  • ✅ Centralized security inspection

  • ✅ Clear separation between environments

Disadvantages:

  • ❌ Hub can become a bottleneck (size it for aggregate traffic)

  • ❌ Additional latency (hop through hub)

  • ❌ More complex to implement initially

Use case: Most landing zones (90%+ of implementations)
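To make the O(n) scaling claim concrete, here is a minimal sketch comparing how many links each topology needs. The function names are illustrative, and the model assumes one link per spoke and one link per pair of networks in a full mesh.

```python
def hub_and_spoke_links(spokes: int) -> int:
    """Each spoke needs exactly one link to the hub, so growth is linear."""
    return spokes


def full_mesh_links(networks: int) -> int:
    """Every pair of networks needs its own link: n * (n - 1) / 2."""
    return networks * (networks - 1) // 2


for n in (5, 20, 50):
    print(
        f"{n:>3} networks: hub-and-spoke={hub_and_spoke_links(n)}, "
        f"full mesh={full_mesh_links(n)}"
    )
```

At 50 networks the full mesh already needs 1,225 links against 50 for hub-and-spoke, which is why point-to-point connections stop being manageable well before that.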

Model 3: Mesh with Transit Gateway/Virtual WAN

Advantages:

  • ✅ Selective spoke-to-spoke connectivity

  • ✅ Advanced routing policies

  • ✅ Highly scalable

  • ✅ Built-in security features

Disadvantages:

  • ❌ Higher cost

  • ❌ More complex routing

  • ❌ Cloud-specific (not easily multi-cloud)

Use case: Large enterprises (100+ accounts), complex routing requirements

Principle 2: Subnetting Strategy

The Pattern:

Why this works:

  • ✅ Consistent pattern across all VPCs

  • ✅ Room for growth (48,000+ IPs unused)

  • ✅ Multi-AZ high availability

  • ✅ Clear subnet purposes

  • ✅ Easy to calculate next available block
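As an illustration of how a consistent pattern can be computed rather than hand-assigned, the sketch below carves a /16 into per-tier, per-AZ subnets with Python's standard `ipaddress` module. The tier names and prefix lengths are assumptions chosen for the example, not a layout taken from this article.

```python
import ipaddress

# Hypothetical tier sizes (prefix lengths); adjust to your own standard.
TIER_PREFIXES = {"public": 24, "private": 20, "data": 22}


def carve_subnets(vpc_cidr: str, az_count: int = 3) -> dict:
    """Return {(tier, az_index): subnet} carved sequentially from the VPC."""
    vpc = ipaddress.ip_network(vpc_cidr)
    cursor = int(vpc.network_address)
    plan = {}
    # Allocate the largest subnets first so every block stays aligned
    # on its own size boundary.
    for tier, prefix in sorted(TIER_PREFIXES.items(), key=lambda kv: kv[1]):
        size = 2 ** (32 - prefix)
        for az in range(az_count):
            subnet = ipaddress.ip_network((cursor, prefix))
            if not subnet.subnet_of(vpc):
                raise ValueError(f"{vpc_cidr} is too small for this layout")
            plan[(tier, az)] = subnet
            cursor += size
    return plan


plan = carve_subnets("10.1.0.0/16")
for (tier, az), subnet in plan.items():
    print(f"{tier:<8} az{az}  {subnet}")

used = sum(s.num_addresses for s in plan.values())
free = ipaddress.ip_network("10.1.0.0/16").num_addresses - used
print(f"unallocated addresses: {free}")
```

With these assumed sizes and three AZs, nine subnets consume 16,128 raw addresses and leave 49,408 unallocated (ignoring the few addresses each provider reserves per subnet). Because the same function runs for every VPC, the third octet layout is identical everywhere.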

Terraform - Automated Subnet Calculation:

Principle 3: IP Address Management (IPAM)

The Problem: Without central IPAM, teams pick random CIDRs:

  • Account 1: 10.0.0.0/16 ✅

  • Account 2: 10.0.0.0/16 ❌ (conflict!)

  • Account 3: 172.16.0.0/16 ✅

  • Account 4: 10.0.0.0/24 ❌ (overlaps with Account 1)
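A conflict check like this is easy to automate before any CIDR is approved. The following is a minimal sketch using the standard `ipaddress` module and the four example accounts above; `find_overlaps` is an illustrative helper name.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs: dict) -> list:
    """Return every pair of names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


accounts = {
    "Account 1": "10.0.0.0/16",
    "Account 2": "10.0.0.0/16",
    "Account 3": "172.16.0.0/16",
    "Account 4": "10.0.0.0/24",
}

conflicts = find_overlaps(accounts)
for a, b in conflicts:
    print(f"CONFLICT: {a} ({accounts[a]}) overlaps {b} ({accounts[b]})")
```

Note that the check reports Account 4 against both Account 1 and Account 2, since a /24 inside 10.0.0.0/16 collides with every network that claims that /16.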

The Solution: Centralized IPAM Registry

AWS IPAM:

Azure IPAM (using Virtual Network Manager):

Manual IPAM Registry (Simple Spreadsheet/Database):

| Account/Subscription | VPC/VNet Name | CIDR Block | Environment | Purpose | Owner |
|---|---|---|---|---|---|
| prod-payment | prod-payment-vpc | 10.1.0.0/16 | Production | Payment API | platform-team |
| stage-payment | stage-payment-vpc | 10.2.0.0/16 | Staging | Payment API | platform-team |
| prod-user | prod-user-vpc | 10.3.0.0/16 | Production | User Service | user-team |
| shared-services | shared-vpc | 10.100.0.0/16 | Platform | DNS, Monitoring | platform-team |
| network-hub | hub-vpc | 10.0.0.0/16 | Platform | Transit Gateway Hub | network-team |

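Even a manual registry can answer "what is the next free block?" mechanically. The sketch below loads the CIDRs from the registry above and walks a supernet until it finds an unallocated /16. The 10.0.0.0/8 supernet and the `next_free_block` helper are assumptions made for the example.

```python
import ipaddress

# CIDR blocks copied from the registry table, keyed by VPC name.
REGISTRY = {
    "prod-payment-vpc": "10.1.0.0/16",
    "stage-payment-vpc": "10.2.0.0/16",
    "prod-user-vpc": "10.3.0.0/16",
    "shared-vpc": "10.100.0.0/16",
    "hub-vpc": "10.0.0.0/16",
}


def next_free_block(supernet: str, allocated, new_prefix: int = 16):
    """Return the first block of the requested size that overlaps nothing."""
    pool = ipaddress.ip_network(supernet)
    used = [ipaddress.ip_network(cidr) for cidr in allocated]
    for candidate in pool.subnets(new_prefix=new_prefix):
        if not any(candidate.overlaps(u) for u in used):
            return candidate
    raise LookupError(f"no free /{new_prefix} left in {supernet}")


print(next_free_block("10.0.0.0/8", REGISTRY.values()))
```

With the five entries above, the first unallocated /16 in 10.0.0.0/8 is 10.4.0.0/16. A real registry would also need to record the allocation atomically so two teams cannot be handed the same block.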

Principle 4: Traffic Flow Patterns

North-South Traffic (Client ↔ Cloud):

  • External users accessing cloud applications

  • Use: Load balancers, API gateways, CDNs

  • Security: WAF, DDoS protection, TLS termination

East-West Traffic (Cloud ↔ Cloud):

  • Inter-VPC/VNet communication

  • Use: Transit Gateway, VPC peering, VNet peering

  • Security: Network segmentation, security groups, NACLs

Hybrid Traffic (Cloud ↔ On-Premises):

  • Cloud to data center connectivity

  • Use: VPN, Direct Connect, ExpressRoute

  • Security: Encryption, dedicated circuits, firewall inspection
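The three patterns can be told apart purely from where each endpoint lives. As a rough sketch, assuming all cloud networks sit in 10.0.0.0/8 and on-premises networks sit in 192.168.0.0/16 (both ranges are hypothetical), a flow can be classified like this:

```python
import ipaddress

# Hypothetical address plan: adjust to your own IPAM registry.
CLOUD_RANGES = [ipaddress.ip_network("10.0.0.0/8")]
ON_PREM_RANGES = [ipaddress.ip_network("192.168.0.0/16")]


def location(ip: str) -> str:
    """Map an IP address to 'cloud', 'on-prem', or 'external'."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in CLOUD_RANGES):
        return "cloud"
    if any(addr in net for net in ON_PREM_RANGES):
        return "on-prem"
    return "external"


def classify(src: str, dst: str) -> str:
    """Name the traffic pattern for a flow between two endpoints."""
    ends = {location(src), location(dst)}
    if ends == {"cloud"}:
        return "east-west"
    if ends == {"cloud", "on-prem"}:
        return "hybrid"
    if ends == {"cloud", "external"}:
        return "north-south"
    return "out of scope"  # neither endpoint is in the cloud


print(classify("203.0.113.10", "10.1.0.5"))   # external client to cloud app
print(classify("10.1.0.5", "10.3.0.9"))       # VPC to VPC
print(classify("10.1.0.5", "192.168.1.20"))   # cloud to data center
```

Tagging flow logs this way makes it easier to check that each pattern actually passes through the controls listed for it.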


Hub-and-Spoke vs Transit Gateway vs Virtual WAN

Let's compare the three main connectivity patterns with real-world examples.

Option 1: Traditional Hub-and-Spoke (VPC/VNet Peering)

Architecture:

AWS Implementation:

Pros:

  • ✅ Simple to understand and implement

  • ✅ Low cost (no additional transit gateway fees)

  • ✅ Direct connectivity between hub and spokes

Cons:

  • ❌ Spoke-to-spoke traffic must route through hub (extra hop)

  • ❌ Manual peering setup for each spoke

  • ❌ Limited by the per-VPC peering quota (50 active peering connections by default, adjustable up to 125)

Use case: Small to medium deployments (<50 accounts)

Option 2: AWS Transit Gateway

Architecture:

AWS Implementation:

Routing Rules Example:

Pros:

  • ✅ Highly scalable (up to 5,000 attachments per transit gateway)

  • ✅ Advanced routing policies (route table separation)

  • ✅ Centralized management

  • ✅ Inter-region peering

  • ✅ Simplified network architecture
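Route table separation is easiest to see in a small model. The sketch below is a plain Python simulation, not AWS API code: each environment gets its own route table, lookups use longest-prefix match, and a destination with no matching route is simply unreachable. The table names and the choice of which routes each table holds are hypothetical; the CIDRs reuse the IPAM registry from earlier.

```python
import ipaddress

# One route table per environment: destination CIDR -> target attachment.
ROUTE_TABLES = {
    "prod": {
        "10.1.0.0/16": "prod-payment-vpc",
        "10.3.0.0/16": "prod-user-vpc",
        "10.100.0.0/16": "shared-vpc",
    },
    "nonprod": {
        "10.2.0.0/16": "stage-payment-vpc",
        "10.100.0.0/16": "shared-vpc",
    },
}


def lookup(route_table: str, dst_ip: str):
    """Return the attachment for dst_ip using longest-prefix match, or None."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in ROUTE_TABLES[route_table].items()
        if dst in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route: traffic is dropped
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(lookup("prod", "10.3.0.9"))        # prod can reach prod
print(lookup("nonprod", "10.100.0.53"))  # everyone can reach shared services
print(lookup("nonprod", "10.1.0.5"))     # staging has no route to prod
```

Because the non-production table has no entry covering 10.1.0.0/16, staging workloads cannot reach production even though both VPCs attach to the same gateway.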

Cons:

  • ❌ Additional cost ($0.05/hour per attachment + data transfer)

  • ❌ More complex to configure initially

  • ❌ AWS-specific (not multi-cloud)

Use case: Medium to large AWS deployments (50+ accounts), complex routing requirements
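To put the attachment fee in perspective, here is a back-of-the-envelope estimate using the $0.05/hour figure quoted above. The 730-hour month is a common billing approximation and an assumption here. Data transfer charges are left out because they depend on traffic volume and region.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def monthly_attachment_cost(
    attachments: int,
    hourly_rate: float = 0.05,
    hours: int = HOURS_PER_MONTH,
) -> float:
    """Fixed monthly fee for attachments only, excluding data transfer."""
    return attachments * hourly_rate * hours


for count in (10, 50, 200):
    print(f"{count:>3} attachments: ${monthly_attachment_cost(count):,.2f}/month")
```

At 50 attachments the fixed fee comes to roughly $1,825 per month before any traffic flows, which is the number to weigh against the operational cost of managing peering by hand.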

Option 3: Azure Virtual WAN

Architecture:

Azure Implementation:

Pros:

  • ✅ Global network (automatic hub-to-hub connectivity)

  • ✅ Integrated security (Azure Firewall)

  • ✅ Simplified management (single pane of glass)

  • ✅ Built-in routing optimization

  • ✅ Supports ExpressRoute and VPN simultaneously

Cons:

  • ❌ Higher cost than traditional hub-spoke

  • ❌ Azure-specific (not multi-cloud)

  • ❌ Some advanced features require Standard SKU

Use case: Azure deployments with global presence, multiple regions


VPC/VNet Design Patterns

Pattern 1: Three-Tier Network Architecture

Classic web application pattern:

Terraform - Three-Tier VPC:

Traffic Flow:

Pattern 2: Microservices VPC

Service mesh architecture:

Key Features:

  • Each microservice in separate subnet

  • Service-to-service communication via private networking

  • Network policies enforce service boundaries

  • Centralized API gateway for external access
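One way to reason about service boundaries is as a default-deny allow-list of caller and callee pairs. The sketch below is a conceptual model only; the service names and permitted pairs are hypothetical, and a real deployment would express the same rules as security groups or network policies.

```python
# Hypothetical allow-list: (caller, callee) pairs that may communicate.
ALLOWED_CALLS = {
    ("api-gateway", "payment-api"),
    ("api-gateway", "user-service"),
    ("payment-api", "user-service"),
}


def is_allowed(src: str, dst: str) -> bool:
    """Default deny: only explicitly listed service pairs may talk."""
    return (src, dst) in ALLOWED_CALLS


print(is_allowed("api-gateway", "payment-api"))   # external entry point
print(is_allowed("user-service", "payment-api"))  # reverse direction is denied
```

Keeping the list directional matters: allowing payment-api to call user-service does not imply the reverse.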

