Py-Chaos-Agent

A Python-based chaos engineering sidecar tool for testing the resilience of containerized applications. Py-Chaos-Agent runs alongside your application containers to inject controlled failures and validate system behavior under stress.

Features

Multiple Failure Modes: CPU stress, memory pressure, process termination, network latency
Flexible Configuration: YAML-based configuration with probability controls
Kubernetes Native: Designed as a sidecar container with proper security contexts
Observable: Prometheus metrics for monitoring chaos experiments
Safe by Default: Self-protection mechanisms and dry-run mode
Infrastructure as Code: Terraform modules for AWS EKS deployment

Quick Start

Local Development with Docker Compose

# Clone the repository
git clone https://github.com/othaime-en/py-chaos-agent.git
cd py-chaos-agent

# Start the target application and chaos agent
# (runs in the foreground; add -d to run in the background)
docker-compose up --build

# View logs
docker-compose logs -f chaos-agent

# Access metrics
curl http://localhost:8000/metrics

# Access target application
curl http://localhost:8080

Kubernetes Deployment

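# Build images (for local testing with kind/minikube)
docker build -t py-chaos-agent:latest -f docker/Dockerfile .
docker build -t target-app:latest -f docker/Dockerfile.target .

# Load both images into the local cluster, for example:
#   kind load docker-image py-chaos-agent:latest
#   minikube image load py-chaos-agent:latest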

# Deploy to Kubernetes
kubectl apply -f k8s/chaos-demo.yaml

# View chaos agent logs
kubectl logs -n chaos-demo -l app=resilient-app -c chaos-agent -f

# View metrics (port-forward blocks, so run curl from a second terminal)
kubectl port-forward -n chaos-demo svc/resilient-app 8000:8000
curl http://localhost:8000/metrics
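
The /metrics endpoint returns the Prometheus text exposition format, which is easy to inspect from a script. The following is a minimal sketch of a parser for that format; the metric names in the sample text (example_injections_total, example_up) are made up for illustration and are not the agent's actual metric names.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition output into {series: value}.

    Simplified: ignores HELP/TYPE comments and optional timestamps.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The series (name plus optional {labels}) ends at the closing
        # brace, or at the first space when there are no labels.
        if "}" in line:
            end = line.rindex("}") + 1
        else:
            end = line.index(" ")
        series = line[:end]
        value = line[end:].split()[0]
        samples[series] = float(value)
    return samples


# Hypothetical sample; real output comes from http://localhost:8000/metrics.
SAMPLE = """\
# HELP example_injections_total Hypothetical counter.
# TYPE example_injections_total counter
example_injections_total{failure_type="cpu"} 3
example_up 1
"""

if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```

To use it against a running agent, fetch the endpoint body (for example with urllib.request.urlopen) and pass the decoded text to parse_metrics.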

Configuration

Configure chaos experiments via config.yaml:

agent:
  interval_seconds: 10 # How often to potentially inject failures
  dry_run: false # Set to true to test without actual injection

failures:
  cpu:
    enabled: true
    duration_seconds: 5
    probability: 0.3 # 30% chance per interval
    cores: 1

  memory:
    enabled: true
    duration_seconds: 8
    probability: 0.2
    mb: 200

  process:
    enabled: true
    target_name: "target-app"
    probability: 0.4

  network:
    enabled: true
    interface: "eth0"
    delay_ms: 300
    duration_seconds: 10
    probability: 0.25

See the Configuration Guide for detailed options.
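
To make the probability settings concrete, here is a minimal sketch of how a per-interval probability check could work. It assumes each enabled failure mode gets its own independent random draw every interval; the agent's actual scheduling logic may differ, and the function names select_failures and chance_of_any are hypothetical.

```python
import random


def select_failures(failures: dict, rng: random.Random) -> list[str]:
    """Return the names of enabled failure modes whose draw succeeded."""
    selected = []
    for name, settings in failures.items():
        if not settings.get("enabled", False):
            continue
        # rng.random() is uniform on [0.0, 1.0), so this comparison is
        # True with probability equal to the configured value.
        if rng.random() < settings.get("probability", 0.0):
            selected.append(name)
    return selected


def chance_of_any(failures: dict) -> float:
    """Chance that at least one enabled mode fires in one interval,
    assuming the draws are independent (an assumption of this sketch)."""
    none_fire = 1.0
    for settings in failures.values():
        if settings.get("enabled", False):
            none_fire *= 1.0 - settings.get("probability", 0.0)
    return 1.0 - none_fire


# Probabilities copied from the sample config.yaml above.
FAILURES = {
    "cpu": {"enabled": True, "probability": 0.3},
    "memory": {"enabled": True, "probability": 0.2},
    "process": {"enabled": True, "probability": 0.4},
    "network": {"enabled": True, "probability": 0.25},
}

if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    print(select_failures(FAILURES, rng))
    print(round(chance_of_any(FAILURES), 3))
```

Under the independence assumption, the sample values give 1 - (0.7 × 0.8 × 0.6 × 0.75) = 0.748, so roughly three out of four 10-second intervals would see at least one injection. Lower the probabilities or raise interval_seconds if that is too aggressive for your environment.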

Architecture

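Py-Chaos-Agent runs as a sidecar container in Kubernetes, sharing the process and network namespaces with your target application. Containers in a pod share a network namespace by default; sharing the process namespace requires shareProcessNamespace: true in the pod spec. This lets the agent see and signal the target's processes and shape its network traffic while remaining isolated from other pods.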

┌─────────────────────────────────────┐
│           Kubernetes Pod            │
├─────────────────┬───────────────────┤
│  Target App     │  Chaos Agent      │
│  (port 8080)    │  (port 8000)      │
│                 │                   │
│  Shares: Process Namespace          │
│          Network Namespace          │
└─────────────────────────────────────┘

See the Architecture Documentation for detailed design.
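
Because the pod's containers share one network namespace, a delay applied to eth0 from the sidecar also affects the target app's traffic. One common way to add such a delay on Linux is tc with the netem qdisc, which requires the NET_ADMIN capability. The sketch below only builds the commands; whether the agent uses tc internally is an assumption, and the helper names are hypothetical.

```python
import shlex


def netem_add_command(interface: str, delay_ms: int) -> list[str]:
    """Build a tc command that adds a fixed delay to all egress traffic."""
    if delay_ms <= 0:
        raise ValueError("delay_ms must be positive")
    return ["tc", "qdisc", "add", "dev", interface,
            "root", "netem", "delay", f"{delay_ms}ms"]


def netem_clear_command(interface: str) -> list[str]:
    """Build the tc command that removes the root qdisc again."""
    return ["tc", "qdisc", "del", "dev", interface, "root"]


if __name__ == "__main__":
    # Values taken from the network section of the sample config.yaml.
    # The commands are printed, not executed; pass the lists to
    # subprocess.run(cmd, check=True) to apply them for real.
    print(shlex.join(netem_add_command("eth0", 300)))
    print(shlex.join(netem_clear_command("eth0")))
```

Building the command as a list rather than a shell string avoids quoting problems when it is handed to subprocess.run. Always pair the add with the matching clear after duration_seconds, otherwise the delay persists on the interface.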

Safety and Ethics

WARNING: This tool is designed for testing environments only.

Only use on systems you own or have explicit permission to test
Never run in production without proper safeguards and approval
Start with dry-run mode to verify behavior
Monitor systems closely during chaos experiments
Have rollback procedures ready

The agent includes self-protection mechanisms to avoid terminating itself, but always exercise caution when running chaos experiments.
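
As an illustration of what self-protection and dry-run mode can look like for process termination, here is a minimal sketch. It assumes the agent matches processes by name and skips its own PID; the real implementation may apply different or additional checks, and pick_targets and describe_actions are hypothetical names.

```python
import os


def pick_targets(processes, target_name, own_pid=None):
    """Return PIDs whose name contains target_name, excluding own_pid.

    processes is an iterable of (pid, name) pairs, for example gathered
    from /proc or a library such as psutil.
    """
    if own_pid is None:
        own_pid = os.getpid()
    return [
        pid
        for pid, name in processes
        if target_name in name and pid != own_pid
    ]


def describe_actions(pids, dry_run):
    """Return log lines; in this sketch no signal is ever sent."""
    prefix = "[dry-run] would terminate" if dry_run else "terminating"
    return [f"{prefix} pid {pid}" for pid in pids]


if __name__ == "__main__":
    # Made-up process table: the agent itself (pid 12) also matches the
    # target name but must never be selected.
    table = [(7, "target-app"), (12, "chaos-agent --target target-app"), (20, "nginx")]
    targets = pick_targets(table, "target-app", own_pid=12)
    for line in describe_actions(targets, dry_run=True):
        print(line)
```

Running with dry_run: true in config.yaml first, as recommended above, lets you confirm which processes would be selected before any signal is sent.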

Development

Prerequisites

Python 3.10+
Docker and Docker Compose
kubectl (for Kubernetes testing)
Terraform (for AWS deployment)

Setup

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt -r requirements-dev.txt

# Run tests
pytest

# Run tests with coverage
pytest --cov=src --cov-report=html

# Lint and format
black src tests
flake8 src tests
mypy src

Running Tests

# Run all tests
pytest

# Run with verbose output
pytest -v

# Run specific test file
pytest tests/test_failures.py

# Run with coverage
pytest --cov=src --cov-report=term-missing

See the Development Guide for contribution guidelines.

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please read the Development Guide before submitting pull requests.

Acknowledgments

Inspired by chaos engineering principles from Netflix's Chaos Monkey and the broader chaos engineering community.

Contact

For questions or feedback, please open an issue on GitHub.
