Commit d85e75d

Merge pull request #46 from kunduso-org/fix-checkov-scan-findings
Fix Checkov Security Findings and Enhance Documentation
2 parents 0d06453 + 8d00828 commit d85e75d

12 files changed

Lines changed: 567 additions & 103 deletions

README.md

Lines changed: 205 additions & 1 deletion
@@ -1,3 +1,207 @@
[![License: Unlicense](https://img.shields.io/badge/license-Unlicense-white.svg)](https://choosealicense.com/licenses/unlicense/) [![GitHub pull-requests closed](https://img.shields.io/github/issues-pr-closed/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform)](https://github.com/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform/pulls?q=is%3Apr+is%3Aclosed) [![GitHub pull-requests](https://img.shields.io/github/issues-pr/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform)](https://GitHub.com/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform/pull/)
[![GitHub issues-closed](https://img.shields.io/github/issues-closed/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform)](https://github.com/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform/issues?q=is%3Aissue+is%3Aclosed) [![GitHub issues](https://img.shields.io/github/issues/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform)](https://GitHub.com/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform/issues/)
[![terraform-infra-provisioning](https://github.com/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform/actions/workflows/terraform.yml/badge.svg?branch=main)](https://github.com/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform/actions/workflows/terraform.yml) [![checkov-scan](https://github.com/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform/actions/workflows/code-scan.yml/badge.svg?branch=main)](https://github.com/kunduso-org/github-self-hosted-runner-amazon-ec2-terraform/actions/workflows/code-scan.yml)

# GitHub Self-Hosted Runner on Amazon EC2 with Terraform

This repository contains Terraform infrastructure code to deploy scalable, self-hosted GitHub Actions runners on Amazon EC2 instances. The solution provides automated runner provisioning, lifecycle management, and secure deregistration using AWS Auto Scaling Groups, Lambda functions, and CloudWatch logging.

## Features

- **High Availability**: Maintains consistent runner capacity using AWS Auto Scaling Groups with automatic instance replacement across multiple Availability Zones
- **Secure Authentication**: Uses GitHub App authentication for secure API access
- **Automated Lifecycle Management**: Automatic runner registration and deregistration with dual mechanisms (Lambda + systemd service)
- **Automated Deregistration**: Prevents orphaned runners in GitHub organization using lifecycle hooks and Lambda functions
- **Unified Logging**: Centralized CloudWatch logging for complete runner lifecycle tracking
- **Network Security**: Runs in private subnets with NAT Gateway for outbound internet access
- **Encryption**: KMS encryption for secrets, CloudWatch logs, EFS storage, SNS topics, and Lambda functions
- **Performance Optimization**: EFS with tuned NFS parameters and Lambda layer for reduced cold start times
- **Cost Optimization**: EFS storage for shared runner workspace and dependency caching to reduce startup time

## Architecture

The solution deploys:
- **VPC with public/private subnets** across multiple Availability Zones
- **Auto Scaling Group** with EC2 instances running GitHub Actions runners
- **Auto Scaling Lifecycle Hooks** for graceful runner deregistration on instance termination
- **SNS Topic** for lifecycle event notifications with KMS encryption
- **Lambda function** for automated runner deregistration via GitHub API
- **Lambda Layer** with PyJWT and cryptography dependencies for optimized performance
- **Dead Letter Queue** for Lambda error handling and retry mechanisms
- **EFS file system** for shared runner workspace storage with optimized NFS parameters
- **CloudWatch log groups** for unified lifecycle logging with structured format
- **Secrets Manager** for secure GitHub App credentials storage
- **SSM Parameter Store** for runner configuration scripts and deregistration service
- **Systemd Service** for backup deregistration mechanism

37+
## Prerequisites
38+
39+
Before deploying this infrastructure, please ensure the following prerequisites are met:
40+
41+
### AWS Setup
42+
- An AWS account with appropriate permissions to create and manage the resources included in this repository
43+
- An OpenID Connect identity provider created in AWS IAM with a trust relationship to this GitHub repository ([detailed setup guide](https://skundunotes.com/2023/02/28/securely-integrate-aws-credentials-with-github-actions-using-openid-connect/))
44+
- The ARN of the IAM Role stored as a GitHub secret for use in the `terraform.yml` workflow and referred via `${{ secrets.IAM_ROLE }}`.
45+
46+
### GitHub Setup
47+
- A GitHub organization where the self-hosted runners will be registered
48+
- A GitHub App created in the organization with the following permissions:
49+
- Repository permissions: `Actions (Read)`, `Administration (Read)`, `Metadata (Read)`
50+
- Organization permissions: `Self-hosted runners (Write)`
51+
- GitHub App credentials (App ID, Installation ID, and Private Key) stored in AWS Secrets Manager
52+
53+
### Infracost Integration (Optional)
54+
- An `INFRACOST_API_KEY` stored as a GitHub Actions secret for cost estimation
55+
- A GitHub Actions variable `INFRACOST_SCAN_TYPE` set to either `hcl_code` or `tf_plan` depending on the desired scan type
56+
57+
## Usage
58+
59+
This infrastructure is deployed automatically using the GitHub Actions workflow defined in `.github/workflows/terraform.yml`. The workflow provides complete CI/CD automation with security scanning, cost estimation, and infrastructure deployment.
60+
61+
### Automated Deployment Pipeline
62+
63+
The `terraform.yml` workflow includes the following automated stages:
64+
65+
#### 1. **Terraform Validation and Planning**
66+
- **Terraform Format Check**: Ensures code follows canonical formatting
67+
- **Terraform Validation**: Validates configuration syntax and logic
68+
- **Terraform Plan**: Generates execution plan showing proposed changes
69+
- **Plan Output**: Posts detailed plan as PR comment for review
70+
71+
#### 2. **Security and Cost Analysis**
72+
- **Checkov Security Scan**: Identifies security misconfigurations and compliance issues
73+
- **Infracost Analysis**: Provides cost estimates for infrastructure changes
74+
- **Cost Comparison**: Shows cost diff between current and proposed infrastructure
75+
76+
#### 3. **Automated Deployment**
77+
- **Trigger**: Automatically deploys on pushes to `main` branch
78+
- **Authentication**: Uses OIDC for secure, temporary AWS credentials
79+
- **Terraform Apply**: Provisions infrastructure with GitHub App credentials
80+
- **State Management**: Maintains Terraform state in remote backend
81+
### Configuration Steps

#### 1. Configure GitHub Secrets
Set up the following secrets in your GitHub repository:
- `IAM_ROLE`: ARN of the OIDC-assumable IAM role
- `THIS_GITHUB_APP_ID`: GitHub App ID for runner authentication
- `THIS_GITHUB_INSTALLATION_ID`: GitHub App Installation ID
- `THIS_GITHUB_PRIVATE_KEY`: GitHub App private key
- `INFRACOST_API_KEY`: API key for cost estimation (optional)

#### 2. Store GitHub App Credentials in AWS
Create a secret in AWS Secrets Manager with GitHub App credentials:
```json
{
  "app_id": "123456",
  "installation_id": "12345678",
  "private_key": "the-private-key"
}
```

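If the secret is managed by Terraform rather than created by hand, the JSON document above could be produced with `jsonencode`. The sketch below is an assumption about how that might look: the variable names `github_app_id`, `github_installation_id`, and `github_private_key` and the resource name `github_app` are hypothetical, and the repository's own secret definition is not shown in this diff.

```hcl
# Hypothetical sketch; variable and resource names are illustrative.
variable "github_app_id" {
  type = string
}

variable "github_installation_id" {
  type = string
}

variable "github_private_key" {
  type      = string
  sensitive = true # keeps the key out of plan output
}

resource "aws_secretsmanager_secret" "github_app" {
  name = "${var.name}-github-app"
  # A kms_key_id argument could point at a customer-managed key here.
}

# Stores the same three fields shown in the JSON example.
resource "aws_secretsmanager_secret_version" "github_app" {
  secret_id = aws_secretsmanager_secret.github_app.id
  secret_string = jsonencode({
    app_id          = var.github_app_id
    installation_id = var.github_installation_id
    private_key     = var.github_private_key
  })
}
```

Values passed this way still end up in the Terraform state, so the remote backend should be encrypted and access-controlled.
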
### Deployment Process

#### Pull Request Workflow
1. **Create Feature Branch**: Make changes in a feature branch
2. **Open Pull Request**: Triggers validation, security scan, and cost analysis
3. **Review Automation**:
   - Terraform plan posted as PR comment
   - Checkov findings displayed in PR
   - Infracost analysis shows cost impact
4. **Merge to Main**: Triggers automatic deployment

#### Production Deployment
1. **Automatic Trigger**: Merge to `main` branch starts deployment
2. **Secure Authentication**: OIDC provides temporary AWS credentials
3. **Infrastructure Provisioning**: Terraform applies changes to AWS
4. **Validation**: Deployment success confirmed through workflow logs

### Monitoring and Validation

#### Deployment Status
- **Workflow Badge**: Click the terraform-infra-provisioning badge above for real-time status
- **GitHub Actions Logs**: Detailed logs available in the Actions tab
- **Terraform State**: Remote state tracks all deployed resources

#### Runner Validation
- **GitHub Organization**: Verify runners appear in Actions settings
- **CloudWatch Logs**: Monitor registration process in `/{name}/lifecycle` log group
- **Auto Scaling Group**: Check EC2 instances are launching successfully
- **EFS Mount**: Verify shared workspace storage is accessible

## Configuration

### Key Variables
The infrastructure can be customized by modifying the default values in `variables.tf`:

- `region`: AWS region for deployment (default: "us-west-2")
- `name`: Prefix for all resource names (default: "github-self-hosted-runner")
- `github_organization`: GitHub organization name (must be updated)
- `runner_instance_type`: EC2 instance type for runners (default: "t3.medium")
- `runner_min_size`: Minimum number of runners (default: 1)
- `runner_max_size`: Maximum number of runners
- `runner_desired_capacity`: Desired number of runners

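As an illustration of how the variables listed above might be declared, here is a sketch of a few `variables.tf` entries. Only the names and the defaults stated in the list come from the README; the descriptions, types, and the absence of defaults where none is stated are assumptions, and the real file may differ.

```hcl
# Sketch only: types and descriptions are assumed, defaults are those listed above.
variable "region" {
  description = "AWS region for deployment"
  type        = string
  default     = "us-west-2"
}

variable "name" {
  description = "Prefix for all resource names"
  type        = string
  default     = "github-self-hosted-runner"
}

variable "github_organization" {
  description = "GitHub organization name"
  type        = string
  # The README says this must be updated; no value is assumed here.
}

variable "runner_instance_type" {
  description = "EC2 instance type for runners"
  type        = string
  default     = "t3.medium"
}

variable "runner_min_size" {
  description = "Minimum number of runners"
  type        = number
  default     = 1
}

variable "runner_max_size" {
  description = "Maximum number of runners"
  type        = number
  # Default not stated in the README; omitted here.
}

variable "runner_desired_capacity" {
  description = "Desired number of runners"
  type        = number
  # Default not stated in the README; omitted here.
}
```
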
### Logging Structure
The solution provides unified logging with the following structure:
```
/{name}/lifecycle/
├── {instance-id}/registration
├── {instance-id}/execution
└── {instance-id}/deregistration
```

## Security Considerations

- All runners operate in private subnets with no direct internet access
- GitHub App authentication provides scoped, time-limited access tokens
- All secrets are encrypted using customer-managed KMS keys
- CloudWatch logs are encrypted at rest with KMS
- EFS file system uses encryption in transit and at rest
- SNS topics and Lambda functions encrypted with customer-managed KMS keys
- Lambda functions run in VPC with private subnets for enhanced security
- Dead Letter Queue encrypted for secure error message handling
- Security groups restrict network access to necessary ports only
- IAM roles follow least privilege principle with minimal required permissions

## Troubleshooting

### Common Issues
1. **Runner registration failures**: Check GitHub App permissions and credentials in Secrets Manager
2. **Instance launch failures**: Verify VPC configuration and security group rules
3. **Deregistration issues**: Check Lambda function logs in CloudWatch and dead letter queue messages
4. **Network connectivity**: Ensure NAT Gateway is properly configured for private subnet internet access
5. **Lambda deregistration failures**: Check Lambda function logs, VPC configuration, and GitHub API connectivity
6. **EFS mount issues**: Verify NFS security group rules and mount target availability in all AZs
7. **Lifecycle hook timeouts**: Check 5-minute timeout configuration and Lambda function performance metrics
8. **SNS delivery failures**: Verify SNS topic permissions and Lambda subscription configuration

### Monitoring
- CloudWatch logs provide detailed lifecycle tracking with structured format
- Auto Scaling Group metrics show scaling activities and lifecycle hook status
- Lambda function metrics indicate deregistration success rates and error patterns
- Dead Letter Queue metrics show failed Lambda executions requiring investigation
- EFS performance metrics monitor storage throughput and connection counts
- SNS topic metrics track message delivery and failure rates

## Contributing

Contributions are welcome! Please follow these guidelines:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

Please ensure that:
- Code follows Terraform best practices
- All resources include appropriate tags
- Security considerations are addressed
- Documentation is updated for any new features

If you find any issues or have suggestions for improvement, please feel free to open an issue.

## License

This code is released under the Unlicense License. See [LICENSE](LICENSE) for details.

asg.tf

Lines changed: 13 additions & 1 deletion
@@ -70,7 +70,15 @@ resource "aws_iam_policy" "github_runner" {
         Action = [
           "kms:Decrypt"
         ]
-        Resource = aws_kms_key.ssm_parameters.arn
+        Resource = aws_kms_key.encrypt_ssm.arn
+      },
+      {
+        Effect = "Allow"
+        Action = [
+          "kms:Decrypt",
+          "kms:DescribeKey"
+        ]
+        Resource = aws_kms_key.encrypt_efs.arn
       }
     ]
   })
@@ -110,6 +118,10 @@ resource "aws_security_group_rule" "github_runner_egress" {
   cidr_blocks = ["0.0.0.0/0"]
   description = "Allow all outbound traffic"
   security_group_id = aws_security_group.github_runner.id
+  #checkov:skip=CKV_AWS_382: Ensure no security groups allow egress from 0.0.0.0:0 to port -1
+  #Reason: The Amazon EC2 instances require this to download packages
+  #Reason: The instances are sufficiently protected since they're in private subnet
+
 }

 resource "aws_launch_template" "github_runner" {
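
The commit suppresses CKV_AWS_382 rather than narrowing the rule. For comparison, a port-scoped alternative might look like the sketch below. This is not what the commit does: the rule names and the `aws_security_group.efs` reference are hypothetical, and whether HTTPS plus NFS is enough depends on which package sources the runners actually use (some repositories are served over port 80).

```hcl
# Hypothetical alternative to the all-ports egress rule; not part of this commit.

# Outbound HTTPS for the GitHub API, package downloads, and AWS service endpoints.
resource "aws_security_group_rule" "github_runner_egress_https" {
  type              = "egress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"]
  description       = "Allow outbound HTTPS"
  security_group_id = aws_security_group.github_runner.id
}

# Outbound NFS to the EFS mount targets only.
# aws_security_group.efs is an assumed name for the mount target security group.
resource "aws_security_group_rule" "github_runner_egress_nfs" {
  type                     = "egress"
  from_port                = 2049
  to_port                  = 2049
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.efs.id
  description              = "Allow outbound NFS to EFS"
  security_group_id        = aws_security_group.github_runner.id
}
```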

cloudwatch.tf

Lines changed: 54 additions & 2 deletions
@@ -1,6 +1,58 @@
+#https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/kms_key
+resource "aws_kms_key" "cloudwatch_kms_key" {
+  description = "KMS key for github runner"
+  deletion_window_in_days = 7
+  enable_key_rotation = true
+}
+#https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/kms_alias
+resource "aws_kms_alias" "key" {
+  name = "alias/${var.name}"
+  target_key_id = aws_kms_key.cloudwatch_kms_key.id
+}
+#https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/kms_key_policy
+resource "aws_kms_key_policy" "encrypt_cloudwatch" {
+  key_id = aws_kms_key.cloudwatch_kms_key.id
+  policy = jsonencode({
+    Id = "encryption-rest"
+    Statement = [
+      {
+        Action = "kms:*"
+        Effect = "Allow"
+        Principal = {
+          AWS = "${local.principal_root_arn}"
+        }
+        Resource = "*"
+        Sid = "Enable IAM User Permissions"
+      },
+      {
+        Effect : "Allow",
+        Principal : {
+          Service : "${local.principal_logs_arn}"
+        },
+        Action : [
+          "kms:Encrypt*",
+          "kms:Decrypt*",
+          "kms:ReEncrypt*",
+          "kms:GenerateDataKey*",
+          "kms:Describe*"
+        ],
+        Resource : "*",
+        Condition : {
+          ArnEquals : {
+            "kms:EncryptionContext:aws:logs:arn" : [
+              local.gh_runner_lifecycle_log_group_arn
+            ]
+          }
+        }
+      }
+    ]
+    Version = "2012-10-17"
+  })
+}
+
 resource "aws_cloudwatch_log_group" "github_runner_lifecycle" {
-  name = "/github-runner/${var.name}/lifecycle"
-  retention_in_days = 14
+  name = "/${var.name}/lifecycle"
+  retention_in_days = 365
   kms_key_id = aws_kms_key.cloudwatch_kms_key.arn
   tags = {
     Name = "${var.name}-lifecycle-logs"
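
The key policy above references `local.principal_root_arn`, `local.principal_logs_arn`, and `local.gh_runner_lifecycle_log_group_arn`, none of which appear in this diff. The sketch below shows one plausible way such locals could be defined. It is an assumption, not the repository's code; the data source names are illustrative, and `var.region` and `var.name` are the variables described in the README.

```hcl
# Assumed definitions; the real locals are not shown in this diff.
data "aws_caller_identity" "current" {}
data "aws_partition" "current" {}

locals {
  # Account root principal, used in the "Enable IAM User Permissions" statement.
  principal_root_arn = "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:root"

  # Regional CloudWatch Logs service principal.
  principal_logs_arn = "logs.${var.region}.amazonaws.com"

  # ARN of the lifecycle log group, built as a string so the key policy
  # does not depend on the log group resource itself.
  gh_runner_lifecycle_log_group_arn = "arn:${data.aws_partition.current.partition}:logs:${var.region}:${data.aws_caller_identity.current.account_id}:log-group:/${var.name}/lifecycle"
}
```

Building the log group ARN from its parts is a likely reason for using a local here: CloudWatch Logs needs the key policy in place before a log group can be created with `kms_key_id`, so a policy that referenced the log group resource directly would have to wait for a resource that cannot yet be created.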

efs.tf

Lines changed: 50 additions & 1 deletion
@@ -1,7 +1,56 @@
+# KMS key for EFS encryption
+resource "aws_kms_key" "encrypt_efs" {
+  enable_key_rotation = true
+  description = "Key to encrypt EFS file system in ${var.name}."
+  deletion_window_in_days = 7
+}
+
+data "aws_iam_policy_document" "encrypt_efs" {
+  statement {
+    sid = "Enable full access for root account"
+    effect = "Allow"
+    principals {
+      type = "AWS"
+      identifiers = ["${local.principal_root_arn}"]
+    }
+    actions = ["kms:*"]
+    resources = [aws_kms_key.encrypt_efs.arn]
+  }
+
+  statement {
+    sid = "Allow EFS service"
+    effect = "Allow"
+    principals {
+      type = "Service"
+      identifiers = [
+        "elasticfilesystem.amazonaws.com"
+      ]
+    }
+    actions = [
+      "kms:Encrypt",
+      "kms:Decrypt",
+      "kms:ReEncrypt*",
+      "kms:GenerateDataKey*",
+      "kms:DescribeKey"
+    ]
+    resources = [aws_kms_key.encrypt_efs.arn]
+  }
+}
+
+resource "aws_kms_key_policy" "encrypt_efs" {
+  key_id = aws_kms_key.encrypt_efs.id
+  policy = data.aws_iam_policy_document.encrypt_efs.json
+}
+
+resource "aws_kms_alias" "encrypt_efs" {
+  name = "alias/${var.name}-encrypt-efs"
+  target_key_id = aws_kms_key.encrypt_efs.key_id
+}
+
 resource "aws_efs_file_system" "github_runner_work" {
   creation_token = "${var.name}-work-dir"
   encrypted = true
-
+  kms_key_id = aws_kms_key.encrypt_efs.arn
   tags = {
     Name = "${var.name}-work-dir"
   }

github-actions-role.tf

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,6 @@ resource "aws_iam_role" "github_actions_runner" {
           AWS = aws_iam_role.github_runner.arn
         }
         Action = "sts:AssumeRole"
-
       }
     ]
   })
@@ -50,4 +49,5 @@ resource "aws_iam_role_policy_attachment" "github_actions_state" {
 resource "aws_iam_role_policy_attachment" "github_actions_admin" {
   role = aws_iam_role.github_actions_runner.name
   policy_arn = "arn:aws:iam::aws:policy/AdministratorAccess"
+  #checkov:skip=CKV_AWS_274:AdministratorAccess required for GitHub Actions to manage all infrastructure resources
 }
