Status: Accepted
Date: 2026-01-25
We need to deploy services across multiple AWS regions to reduce latency and provide disaster recovery.
- Single-region with manual failover
- Active-Active multi-region with cross-region communication
- Active-Active multi-region with isolated regions
Adopt Active-Active multi-region with isolated regions. Each region is self-contained, with no direct communication between services from different regions.
- Isolation: True disaster recovery, since a failure in one region does not affect the others
- Simplicity: Does not require mesh federation, cross-region service discovery, or conflict resolution
- Latency: Users are routed to the nearest region via Route53 latency-based routing
- Consistency: Each region has its own data, avoiding eventual consistency issues
- Directory structure: environments/{env}/{region}/
- ECR centralized in us-east-1 (cross-region pull)
- Central Grafana in prod/us-east-1 with remote datasources
- Phased rollout: deploy region by region with configurable delays
- Cost: resource multiplication per region
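The directory layout and phased rollout described above can be illustrated with a small planning sketch. This is not the project's actual deployment tooling: the names `RolloutStep`, `region_path`, and `plan_rollout`, the region order, and the 900-second delay are all assumptions made for the example. Only the `environments/{env}/{region}/` layout and the region-by-region order with a configurable delay come from this record.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Regions named in this ADR; the order (primary first) is an assumption.
REGIONS = ["us-east-1", "sa-east-1"]


@dataclass(frozen=True)
class RolloutStep:
    """One step of a phased rollout (hypothetical helper type)."""
    region: str
    path: PurePosixPath   # environments/{env}/{region}
    wait_before_s: int    # delay applied before this step starts


def region_path(env: str, region: str) -> PurePosixPath:
    """Build the per-region directory following environments/{env}/{region}/."""
    return PurePosixPath("environments") / env / region


def plan_rollout(env: str, regions: list[str], delay_s: int) -> list[RolloutStep]:
    """Order regions one after another; every region after the first waits delay_s."""
    return [
        RolloutStep(region, region_path(env, region), 0 if i == 0 else delay_s)
        for i, region in enumerate(regions)
    ]


if __name__ == "__main__":
    # delay_s=900 is an illustrative value, not one taken from the ADR.
    for step in plan_rollout("prod", REGIONS, delay_s=900):
        print(f"wait {step.wait_before_s}s, then apply {step.path}/")
```

Keeping the plan as plain data makes the delay a single parameter and lets the same function drive any environment directory.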
┌─────────────────────────────────────────────────────────────────┐
│                        Isolated Regions                         │
│                                                                 │
│    us-east-1 (Primary)                sa-east-1 (Secondary)     │
│  ┌─────────────────────┐             ┌─────────────────────┐    │
│  │ VPC + EKS + Keycloak│             │ VPC + EKS + Keycloak│    │
│  │ + Observability     │      ✕      │ + Observability     │    │
│  └─────────────────────┘             └─────────────────────┘    │
│             ▲                                   ▲               │
│             │           Route53 (DNS)           │               │
│             └───────────────────────────────────┘               │
└─────────────────────────────────────────────────────────────────┘
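As the diagram shows, the only component shared by the two regions is DNS. A minimal sketch of how the Route53 latency-based records might be expressed is given below. It builds the `ChangeBatch` structure accepted by the Route 53 `ChangeResourceRecordSets` API (one record per region, distinguished by `SetIdentifier` and `Region`) without calling AWS. The hostnames, the CNAME record type, and the 60-second TTL are assumptions for illustration; the real setup may use alias records or different names.

```python
def latency_record(name: str, region: str, target: str, ttl: int = 60) -> dict:
    """One latency-routed record: Route 53 answers with the lowest-latency region."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "CNAME",
            "SetIdentifier": region,  # must be unique among records sharing Name/Type
            "Region": region,         # marks the record as latency-based
            "TTL": ttl,
            "ResourceRecords": [{"Value": target}],
        },
    }


def change_batch(name: str, targets: dict[str, str]) -> dict:
    """Build a ChangeBatch with one latency record per isolated region."""
    return {
        "Comment": "Latency-based routing across isolated regions",
        "Changes": [latency_record(name, region, host) for region, host in targets.items()],
    }


if __name__ == "__main__":
    # Hypothetical hostnames; each points at one region's own ingress.
    batch = change_batch(
        "api.example.com",
        {
            "us-east-1": "ingress.us-east-1.example.com",
            "sa-east-1": "ingress.sa-east-1.example.com",
        },
    )
    for change in batch["Changes"]:
        rrset = change["ResourceRecordSet"]
        print(rrset["Region"], "->", rrset["ResourceRecords"][0]["Value"])
```

The resulting dictionary could be passed as the `ChangeBatch` argument of boto3's `change_resource_record_sets`, together with a `HostedZoneId`. Each record targets only its own region, which matches the rule that services in different regions do not communicate directly.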