Status: Accepted
Date: 2026-01-26
HelpDev's current architecture hosts global resources (Route53, CloudFront, ECR, Central Grafana) in the helpdev-prod account. This couples shared infrastructure to production workloads, which causes:
- Amplified blast radius: Incidents in the prod account affect global services and vice versa
- Billing opacity: Difficult to separate infrastructure costs from workload costs
- Access control: Teams that need ECR access must have access to the prod account
- Limited scalability: Adding new Business Units means sharing prod account resources
- Observability coupling: Central Grafana in the same account as the workloads it observes
AWS best practices (Organizing Your AWS Environment Using Multiple Accounts) recommend a multi-account structure with clear separation between:
- Management Account: Governance and billing
- Shared Services Account: Services shared across accounts
- Workload Accounts: Application environments (HML, Prod, BUs)
Create a dedicated helpdev-org-main account (Shared Services Account) to host all global and shared resources of the organization.
| Resource | Origin | Rationale |
|---|---|---|
| Route53 | helpdev-prod | DNS is truly global, serves all accounts |
| CloudFront | helpdev-prod | CDN serves all environments, centralized billing |
| ECR | helpdev-prod (us-east-1) | All accounts push/pull images |
| Central Grafana | helpdev-prod | Observes all accounts, should not be in workload account |
| ACM Certificates | per account | Centralize wildcard certificate management |
| Transit Gateway | N/A (optional) | Only if cross-account VPC communication is needed |
AWS Organization
├── helpdev-org-main # Shared Services Account (NEW)
│ ├── Route53 (DNS zones, health checks, latency-based routing)
│ ├── CloudFront (CDN distributions)
│ ├── ECR (container registry - us-east-1)
│ ├── Central Grafana (unified observability)
│ ├── ACM (shared wildcard certificates)
│ └── Transit Gateway (optional - cross-account connectivity)
│
├── helpdev-hml # Homologation/staging (unchanged)
│ └── us-east-1: EKS, Keycloak, Observability (Mimir/Loki/Tempo)
│
├── helpdev-prod # Production (simplified)
│ ├── us-east-1: EKS, Keycloak, Observability (Mimir/Loki/Tempo)
│ └── sa-east-1: (future)
│
└── helpdev-prod-bu-* # Business Units (future)
    └── Each BU consumes the shared ECR and the shared Route53
┌─────────────────────────────────────────────────────────────────────────────┐
│ Cross-Account Access │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ helpdev-org-main │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Route53 │ │CloudFront│ │ ECR │ │ Grafana │ │ ACM │ │ │
│ │ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │ │
│ └───────┼────────────┼───────────┼────────────┼────────────┼───────────┘ │
│ │ │ │ │ │ │
│ │ Resource Policies │ Cross-Account IAM Roles │
│ │ + AWS RAM │ + ECR Repository Policy │
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────────────────────┐ │
│ │ helpdev-hml │ │ helpdev-prod │ │ helpdev-prod-bu-* │ │
│ │ │ │ │ │ (future) │ │
│ │ • Pull ECR │ │ • Pull ECR │ │ • Pull ECR │ │
│ │ • DNS lookup │ │ • DNS lookup │ │ • DNS lookup │ │
│ │ • CDN origin │ │ • CDN origin │ │ • CDN origin │ │
│ └───────────────┘ └───────────────┘ └───────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
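
The "ECR Repository Policy" arrow in the diagram above can be made concrete with a small sketch. The helper below only builds the policy document as a Python dictionary; it does not call AWS. The function name `build_pull_policy` and the account IDs are hypothetical placeholders, and granting access to each account's `root` principal is one possible choice, not necessarily the one this ADR will adopt.

```python
import json

# Read-only ECR actions required to pull an image from a repository.
PULL_ACTIONS = [
    "ecr:BatchCheckLayerAvailability",
    "ecr:BatchGetImage",
    "ecr:GetDownloadUrlForLayer",
]


def build_pull_policy(account_ids):
    """Return an ECR repository policy that lets the given accounts pull images.

    `account_ids` are 12-digit AWS account IDs (placeholders in this sketch).
    """
    if not account_ids:
        raise ValueError("at least one account ID is required")
    for account_id in account_ids:
        if not (len(account_id) == 12 and account_id.isascii() and account_id.isdigit()):
            raise ValueError(f"invalid AWS account ID: {account_id!r}")

    # Deduplicate and sort so the generated document is stable across runs.
    principals = [f"arn:aws:iam::{a}:root" for a in sorted(set(account_ids))]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCrossAccountPull",
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": PULL_ACTIONS,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder IDs standing in for helpdev-hml and helpdev-prod.
    print(json.dumps(build_pull_policy(["111111111111", "222222222222"]), indent=2))
```

The repository policy covers only the repository side. Principals in the pulling accounts still need `ecr:GetAuthorizationToken` in their own IAM policies, because that action is not scoped to a repository.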
| Benefit | Impact |
|---|---|
| Isolation | Global services protected from workload incidents |
| Security | Least-privilege: devs access ECR without prod access |
| Billing | Clear separation: infrastructure costs vs workloads |
| Scalability | New BUs/accounts easily consume shared services |
| Governance | Platform team manages shared account; product teams manage their accounts |
| Disaster Recovery | Shared services independent of regional failures |
This decision follows the recommendations of the AWS Well-Architected Framework and the whitepaper Organizing Your AWS Environment Using Multiple Accounts:
"Use a shared services account to host common services that are used by multiple workload accounts, such as directory services, CI/CD pipelines, and container registries."
- ✅ Better isolation and security boundaries between infrastructure and workloads
- ✅ Clearer cost attribution for finance and chargeback
- ✅ Scalable architecture for future Business Units
- ✅ Platform team with autonomy over shared resources
- ✅ Smaller blast radius in case of incidents
- ⚠️ Additional cross-account IAM complexity
- ⚠️ Migration effort for existing resources
- ⚠️ Need to update CI/CD pipelines for new ECR
- ⚠️ Resource policies and AWS RAM configuration
- 🔄 ECR cross-region pull already works in current architecture
- 🔄 Route53 and CloudFront are inherently global
- 🔄 Grafana already does cross-account queries via datasources
- Create the helpdev-org-main account in the AWS Organization
- Configure Terraform backend and directory structure
- Define cross-account IAM roles and policies
- Create ECR in new account
- Configure repository policies for cross-account pull
- Update CI/CD pipelines to push to new account
- Migrate existing images
- Update references in Kubernetes manifests
- Create hosted zones in new account
- Gradually migrate DNS records
- Reconfigure CloudFront distributions
- Update ACM certificates
- Provision Grafana in new account
- Configure cross-account datasources
- Migrate dashboards and alerts
- Update DNS for new endpoint
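
The step "Update references in Kubernetes manifests" above amounts to swapping the registry account in every ECR image URI. A minimal sketch, assuming repositories keep the same names and region after the migration; the function name `rewrite_registry` and the account IDs are hypothetical placeholders:

```python
import re


def rewrite_registry(text, old_account, new_account):
    """Replace the account ID in every ECR registry host found in `text`.

    Only hosts of the form <account>.dkr.ecr.<region>.amazonaws.com/ are
    touched; images from other registries are left unchanged.
    """
    pattern = re.compile(
        rf"\b{re.escape(old_account)}(\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com/)"
    )
    # A function replacement avoids any ambiguity between digits and group refs.
    return pattern.sub(lambda m: new_account + m.group(1), text)


if __name__ == "__main__":
    manifest = (
        "containers:\n"
        "  - name: api\n"
        "    image: 111111111111.dkr.ecr.us-east-1.amazonaws.com/api:1.2.3\n"
        "  - name: proxy\n"
        "    image: nginx:1.27\n"
    )
    # 111111111111 stands in for helpdev-prod, 999999999999 for helpdev-org-main.
    print(rewrite_registry(manifest, "111111111111", "999999999999"))
```

Working on plain text keeps the sketch independent of any YAML library. In practice the same substitution could equally live in a Helm value or a Kustomize image override, whichever the pipelines already use.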
- Rejected: Violates isolation principle and hinders expansion to BUs
- Rejected for v1: Adds unnecessary complexity for current size. May be considered in the future if Transit Gateway and Direct Connect become complex.
- Rejected: Increases storage costs, complicates image management, hinders promotion between environments