Observability Module

For Turkish documentation: README.tr.md

Overview

This module provides a complete observability solution for Kafka infrastructure. It automatically installs, configures, and deploys Prometheus, Alertmanager, and Grafana on a central server. Host and service information is read from the Terraform state stored in S3, and the inventories are generated from it automatically.


Features

  • Fully Automated Installation: All observability components (Prometheus, Alertmanager, Grafana) are installed with a single command.
  • Dynamic Inventory and Configuration: Server and service information is automatically fetched from the Terraform state file in S3, and Prometheus and Ansible inventories are generated dynamically.
  • Prometheus Monitoring: Metrics are automatically collected for Kafka Broker, Controller, Connect nodes, and infrastructure servers.
  • Alertmanager for Alert Management: Customizable Prometheus alerts and Alertmanager integration for critical situations.
  • Grafana Visualization: A strong admin password is set automatically at first setup, and two prebuilt dashboards (Kafka Broker and Kafka Connect) are loaded automatically.
  • Node Exporter Automation: node_exporter is automatically installed on all Kafka, Controller, and Connect nodes, and infrastructure metrics are monitored by Prometheus.
  • CI/CD with GitHub Actions: All installation and updates can be automated via GitHub Actions.

Fully Automated Installation with Ansible

In this module, all observability components (Prometheus, Alertmanager, Grafana, Node Exporter) are fully installed and configured automatically using Ansible playbooks and scripts. No manual installation steps are required from the user; the entire infrastructure is ready with a single command.


Directory Structure

  • observability.yml — Ansible playbook: Installs Prometheus, Alertmanager, Grafana
  • deploy.sh — Main script to start the whole process
  • generate-inventory.sh — Script to fetch host info from S3 and generate inventory
  • generate-prometheus-config.sh — Script to generate dynamic configuration for Prometheus
  • prometheus/ — Main Prometheus configuration and alert definitions
  • grafana/ — Dashboard templates and auto-loading
  • alertmanager/ — Alertmanager configuration
  • node-exporter.yml — Ansible playbook to install exporter on all nodes
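The contents of generate-inventory.sh are not shown in this document, so the following is only a minimal sketch of the idea. It assumes the host data from S3 has already been reduced to "group address" pairs, one per line. The function name `render_inventory`, the group names, and the addresses are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: turn "group address" pairs on stdin into an INI-style
# Ansible inventory. The input format is an assumption, not the actual
# format used by generate-inventory.sh.
render_inventory() {
  # Sort by group so that each [group] header is printed exactly once.
  sort -k1,1 -k2,2 | awk '
    $1 != current { if (NR > 1) print ""; print "[" $1 "]"; current = $1 }
    { print $2 }
  '
}

# Example input; the addresses are placeholders.
printf '%s\n' \
  'kafka_broker 10.0.1.10' \
  'kafka_connect 10.0.3.10' \
  'kafka_broker 10.0.1.11' | render_inventory
```

Redirecting the output to a file gives something that can be passed to `ansible-playbook -i`.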

Automated Workflow

  1. Inventory Generation: Fetches up-to-date host info from S3 and generates Ansible inventory with generate-inventory.sh.
  2. Prometheus Configuration: Automatically determines Prometheus targets with generate-prometheus-config.sh.
  3. Node Exporter Installation: Automatically installs node_exporter on all Kafka, Controller, and Connect nodes.
  4. Stack Deploy: Deploys Prometheus, Alertmanager, and Grafana with a single command using deploy.sh.
  5. Grafana Automation: Sets admin password at first launch and automatically loads three dashboards (Kafka Broker, Kafka Connect, Node Exporter).
  6. Alertmanager: Prometheus alert definitions and Alertmanager integration are configured automatically.
  7. CI/CD: The entire process can be automated with GitHub Actions.
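The ordering of the steps above can be sketched as a small driver script. This is not the actual deploy.sh; it is an illustration that assumes the scripts and playbooks from the directory listing are invoked directly. The inventory file name `inventory.ini`, the `DRY_RUN` switch, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a deploy.sh-style driver: run each step in order and
# stop at the first failure. With DRY_RUN=1 the commands are only printed.
run_step() {
  local name="$1"; shift
  echo "==> ${name}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "    (dry run) $*"
    return 0
  fi
  "$@"
}

deploy_all() {
  run_step "Generate inventory" bash generate-inventory.sh || return 1
  run_step "Generate Prometheus config" bash generate-prometheus-config.sh || return 1
  # The inventory file name below is an assumption.
  run_step "Install node_exporter" ansible-playbook -i inventory.ini node-exporter.yml || return 1
  run_step "Deploy monitoring stack" ansible-playbook -i inventory.ini observability.yml || return 1
}

# Print the plan without executing anything.
DRY_RUN=1 deploy_all
```

Stopping at the first failed step keeps a broken inventory from being used by the later playbook runs.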

Components

Prometheus

  • Collects metrics from Kafka Broker, Controller, Connect, and infrastructure servers.
  • Detects critical situations with alert definitions.
  • Configuration file and targets are updated automatically.
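As an illustration of how targets might be written into the Prometheus configuration, here is a minimal sketch that renders one scrape job from a list of hosts. It is not taken from generate-prometheus-config.sh. The function name, job name, and host addresses are hypothetical; 9100 is the default node_exporter port.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: build one Prometheus scrape job from a list of hosts.
# 9100 is node_exporter's default port; job name and hosts are placeholders.
render_scrape_job() {
  local job="$1" port="$2"; shift 2
  printf '  - job_name: "%s"\n' "$job"
  printf '    static_configs:\n'
  printf '      - targets:\n'
  local host
  for host in "$@"; do
    printf '          - "%s:%s"\n' "$host" "$port"
  done
}

# Example: emit a scrape_configs section with a single job.
{
  echo "scrape_configs:"
  render_scrape_job node 9100 10.0.1.10 10.0.1.11
}
```

Generating the section from the same host list as the Ansible inventory keeps the two from drifting apart.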

Alertmanager

  • Centrally manages Prometheus alerts.
  • Email, Slack, etc. notification integrations can be easily added.
  • Alert definitions are managed in the prometheus/alerts/ directory.
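To give a sense of what a file in prometheus/alerts/ could contain, the sketch below writes a single alerting rule. The rule itself (`InstanceDown`, based on the built-in `up` metric) is a common generic example and is not taken from this repository; the group name, duration, and severity label are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: write one example alerting rule file.
write_example_rule() {
  local out_file="$1"
  # The quoted 'EOF' keeps the shell from expanding $labels in the template.
  cat > "$out_file" <<'EOF'
groups:
  - name: availability
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is unreachable"
EOF
}

# Example: render the rule to a temporary file and print it.
rule_file="$(mktemp)"
write_example_rule "$rule_file"
cat "$rule_file"
rm -f "$rule_file"
```

A rule file like this can be validated with `promtool check rules <file>` before Prometheus loads it.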

Grafana

  • Automatically installed with a strong admin password.
  • Two dashboards are loaded automatically at first setup:
    • Kafka Broker Dashboard
    • Kafka Connect Dashboard
  • All metrics and alerts can be visually monitored.
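Automatic dashboard loading in Grafana is typically done through a provisioning file that points at a directory of JSON dashboards. The sketch below writes such a file. Whether this module does it exactly this way is an assumption; the provider name and both paths are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: write a Grafana dashboard provider file so that JSON
# dashboards in a directory are loaded at startup. The provider name and
# paths are assumptions.
write_dashboard_provider() {
  local out_file="$1" dashboards_dir="$2"
  cat > "$out_file" <<EOF
apiVersion: 1
providers:
  - name: "kafka-dashboards"
    type: file
    options:
      path: ${dashboards_dir}
EOF
}

# Example: write to a temporary file and show the result.
tmp_file="$(mktemp)"
write_dashboard_provider "$tmp_file" /var/lib/grafana/dashboards
cat "$tmp_file"
rm -f "$tmp_file"
```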

Security and Best Practices

  • All passwords and admin info are managed via environment variables or Ansible vault.
  • Certificates and sensitive files are not added to version control.
  • All scripts and playbooks are idempotent and safe to re-run.
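When secrets come from environment variables, it helps to fail early if one is missing rather than deploy with an empty password. The following is a minimal sketch of such a check. The variable name `GRAFANA_ADMIN_PASSWORD` is hypothetical; the names this module actually reads are not listed in this document.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: fail fast when a required secret is not exported.
# Only the variable name is reported, never its value.
require_env() {
  local name
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required environment variable: ${name}" >&2
      return 1
    fi
  done
}

# Usage (the variable name is an assumption):
#   require_env GRAFANA_ADMIN_PASSWORD && bash deploy.sh
```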

CI/CD with GitHub Actions

  • All installation and updates can be automated with .github/workflows/ansible_observability.yaml.
  • Automatically triggered by S3 and Terraform state changes.
  • Test, installation, and dashboard loading steps are included in the pipeline.

Quick Start

  1. Set up your AWS and S3 access credentials.
  2. Run the deploy.sh script:
    bash deploy.sh
  3. Once setup is complete, use the server IP and admin password to access Grafana.
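Before running the script, a quick preflight check can confirm that the needed command-line tools are installed. This sketch assumes deploy.sh relies on the AWS CLI and Ansible being on PATH, which is inferred from the workflow and not stated explicitly; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: confirm the tools this workflow is assumed
# to need are on PATH before starting.
check_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: ${tool}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (the tool list is an assumption about what deploy.sh needs):
#   check_tools aws ansible-playbook && bash deploy.sh
```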

Dashboard and Alert Examples

  • Ready JSON dashboard templates under grafana/dashboards/
  • Example Prometheus alert definitions under prometheus/alerts/