8 changes: 4 additions & 4 deletions README.fr.md
@@ -193,7 +193,7 @@ No cloud account yet? `cleancloud demo` shows sample output without
- **AI/ML waste detection across all 3 clouds:** SageMaker endpoints, notebooks, Studio apps, and training jobs; AML Compute clusters and ML instances; Azure ML online endpoints and Azure AI Search services; Vertex AI endpoints, Workbench instances, and training jobs. GPU resources are highlighted as higher-risk review candidates. Native tools don't always indicate what to review; CleanCloud does. Opt-in via `--category ai`
- **Policy-as-code governance:** `cleancloud.yaml` for per-rule configuration, exceptions with expiry dates, cost and confidence thresholds, and tag-based exclusions, versioned alongside your infrastructure. Every exception is an audited approval in git.
- **Policy enforcement (opt-in):** `--fail-on-confidence HIGH` or `--fail-on-cost 500` to enforce waste thresholds in CI/CD on a schedule, managed by platform or FinOps teams
-- **45 selective, high-signal detection rules:** orphaned volumes, idle databases, stopped instances, unused registries, and more, designed to avoid false positives in IaC environments, each with a deterministic cost estimate
+- **46 selective, high-signal detection rules:** orphaned volumes, idle databases, stopped instances, unused registries, and more, designed to avoid false positives in IaC environments, each with a deterministic cost estimate
- **Multi-account scanning (AWS):** scan entire AWS Organizations in one run, using a config file, inline IDs, or auto-discovery via `--org`
- **Multi-subscription scanning (Azure):** scan all Azure subscriptions in parallel, with auto-discovery via Management Group and a per-subscription cost breakdown included
- **Multi-project scanning (GCP):** scan all accessible GCP projects in parallel, with auto-discovery via Application Default Credentials and a per-project cost breakdown included
@@ -278,7 +278,7 @@ Idle AI/ML infrastructure is the source of invisible cloud waste
| Azure AML Compute cluster (GPU) | $600 – $15,000 / month |
| Azure ML compute instance (GPU) | $600 – $15,000+ / month |
| Azure ML online endpoint (GPU) | $200 – $2,600+ / month |
-| Azure AI Search (Standard+) | $261 – $4,028+ / month |
+| Azure AI Search (Basic+) | $261 – $4,028+ / month |
| Azure OpenAI Provisioned deployment (PTU) | $1,460+ / PTU / month |
| Vertex AI Online Prediction endpoint (GPU) | $449 – $23,000+ / month |
| Vertex AI Workbench instance (GPU) | $449 – $8,000+ / month |
@@ -528,7 +528,7 @@ Yes. CleanCloud only needs network access to the API endpoints of your cloud

## What CleanCloud Detects

-45 rules for AWS, Azure, and GCP: conservative, high-signal, designed to avoid false positives in IaC environments.
+46 rules for AWS, Azure, and GCP: conservative, high-signal, designed to avoid false positives in IaC environments.

**AWS:**
- Compute: instances stopped 30+ days (EBS charges continue)
@@ -545,7 +545,7 @@ Yes. CleanCloud only needs network access to the API endpoints of your cloud
- Network: unused public IP addresses, empty Load Balancers (HIGH), empty App Gateways (HIGH), idle VNet Gateways
- Platform: empty App Service Plans (HIGH), idle SQL databases (HIGH), idle App Services, unused Container Registries
- Governance: untagged resources
-- AI/ML *(opt-in: `--category ai`)*: AML compute clusters with non-zero baseline capacity and no activity for 14+ days (GPU clusters flagged HIGH risk, $600–$15K/month); Running Azure ML compute instances with no activity for 14+ days (GPU instances flagged CRITICAL risk, $600–$15K+/month); managed ML online endpoints with no scoring requests for 7+ days (GPU endpoints flagged HIGH/CRITICAL, $200–$2,600+/month); AI Search services (Standard+) with no queries for 30+ days (billed per SKU × replicas × partitions, $261–$4,028+/month); provisioned Azure OpenAI deployments (PTUs) with no API requests for 7+ days (billed ~$1,460/PTU/month on-demand regardless of traffic)
+- AI/ML *(opt-in: `--category ai`)*: AML compute clusters with non-zero baseline capacity and no activity for 14+ days (GPU clusters flagged HIGH risk, $600–$15K/month); Running Azure ML compute instances with no activity for 14+ days (GPU instances flagged CRITICAL risk, $600–$15K+/month); managed ML online endpoints with no scoring requests for 7+ days (GPU endpoints flagged HIGH/CRITICAL, $200–$2,600+/month); AI Search services (Basic+) with no queries for 90+ days (billed per SKU × replicas × partitions, $261–$4,028+/month); provisioned Azure OpenAI deployments (PTUs) with no API requests for 7+ days (billed ~$1,460/PTU/month on-demand regardless of traffic)

**GCP:**
- Compute: VM instances stopped 30+ days (disk charges continue) (HIGH)
8 changes: 4 additions & 4 deletions README.md
@@ -193,7 +193,7 @@ No cloud account yet? `cleancloud demo` shows sample output without any credenti
- **AI/ML waste detection across all 3 clouds:** idle SageMaker endpoints, notebook instances, Studio apps, and long-running training jobs; AML compute clusters and instances; Azure ML online endpoints and AI Search services; Vertex AI endpoints, Workbench instances, and training jobs. GPU-backed resources are highlighted as higher-risk review candidates. Native cost tools don't surface these — CleanCloud does. Opt-in via `--category ai`
- **Policy-as-code governance:** `cleancloud.yaml` for per-rule config, exceptions with expiry dates, cost and confidence thresholds, tag-based exclusions — version-controlled alongside your infrastructure. Every exception is a git-reviewable approval.
- **Governance enforcement (opt-in):** `--fail-on-confidence HIGH` or `--fail-on-cost 500` — enforce waste thresholds in CI/CD on a schedule, owned by platform or FinOps teams
-- **45 curated, high-signal detection rules:** orphaned volumes, idle databases, stopped instances, unused registries, and more, designed to avoid false positives in IaC environments, each with a deterministic cost estimate
+- **46 curated, high-signal detection rules:** orphaned volumes, idle databases, stopped instances, unused registries, and more, designed to avoid false positives in IaC environments, each with a deterministic cost estimate
- **Multi-account scanning (AWS):** scan entire AWS Organizations in one run — config file, inline IDs, or auto-discovery via `--org`
- **Multi-subscription scanning (Azure):** scan all Azure subscriptions in parallel — auto-discovery via Management Group, per-subscription cost breakdown included
- **Multi-project scanning (GCP):** scan all accessible GCP projects in parallel — auto-discovery via Application Default Credentials, per-project cost breakdown included
@@ -278,7 +278,7 @@ Idle AI/ML infrastructure is the fastest-growing source of invisible cloud spend
| Azure AML compute cluster (GPU) | $600 – $15,000 / month |
| Azure ML Compute Instance (GPU) | $600 – $15,000+ / month |
| Azure ML Online Endpoint (GPU-backed) | $200 – $2,600+ / month |
-| Azure AI Search (Standard+) | $261 – $4,028+ / month |
+| Azure AI Search (Basic+) | $261 – $4,028+ / month |
| Azure OpenAI Provisioned Deployment (PTU) | $1,460+ / PTU / month |
| Vertex AI Online Prediction endpoint (GPU) | $449 – $23,000+ / month |
| Vertex AI Workbench instance (GPU) | $449 – $8,000+ / month |
@@ -528,7 +528,7 @@ Yes. CleanCloud only needs network access to your cloud provider's API endpoints

## What CleanCloud Detects

-45 rules across AWS, Azure, and GCP: conservative, high-signal, designed to avoid false positives in IaC environments.
+46 rules across AWS, Azure, and GCP: conservative, high-signal, designed to avoid false positives in IaC environments.

**AWS:**
- Compute: stopped instances 30+ days (EBS charges continue)
@@ -545,7 +545,7 @@ Yes. CleanCloud only needs network access to your cloud provider's API endpoints
- Network: unused public IPs, empty load balancers (HIGH), empty App Gateways (HIGH), idle VNet Gateways
- Platform: empty App Service Plans (HIGH), idle SQL databases (HIGH), idle App Services, unused Container Registries
- Governance: untagged resources
-- AI/ML *(opt-in: `--category ai`)*: idle AML compute clusters with non-zero baseline capacity and no workload activity 14+ days (GPU clusters flagged HIGH risk, $600–$15K/month); idle Compute Instances with no control-plane activity 14+ days (GPU instances flagged CRITICAL risk, $600–$15K+/month); idle ML managed online endpoints with zero scoring requests 7+ days (GPU-backed endpoints flagged HIGH/CRITICAL, $200–$2,600+/month); idle AI Search services (Standard+) with zero queries 30+ days (billed per SKU × replicas × partitions, $261–$4,028+/month); idle Azure OpenAI provisioned deployments (PTUs) with zero API requests 7+ days (billed ~$1,460/PTU/month on-demand regardless of traffic)
+- AI/ML *(opt-in: `--category ai`)*: idle AML compute clusters with non-zero baseline capacity and no workload activity 14+ days (GPU clusters flagged HIGH risk, $600–$15K/month); idle Compute Instances with no control-plane activity 14+ days (GPU instances flagged CRITICAL risk, $600–$15K+/month); idle ML managed online endpoints with zero scoring requests 7+ days (GPU-backed endpoints flagged HIGH/CRITICAL, $200–$2,600+/month); idle AI Search services (Basic+) with zero queries 90+ days (billed per SKU × replicas × partitions, $261–$4,028+/month); idle Azure OpenAI provisioned deployments (PTUs) with zero API requests 7+ days (billed ~$1,460/PTU/month on-demand regardless of traffic)

**GCP:**
- Compute: stopped instances 30+ days (disk charges continue) (HIGH)
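
Note on the AI Search bullet: "billed per SKU × replicas × partitions" can be read as a per-unit price for the SKU multiplied by the number of search units (replicas × partitions). A minimal sketch of that arithmetic follows. The function name `monthly_search_cost` and the unit price in the usage note are hypothetical placeholders, not CleanCloud's actual estimator or real Azure prices.

```python
def monthly_search_cost(unit_price: float, replicas: int, partitions: int) -> float:
    """Estimate a monthly bill as unit price x replicas x partitions.

    unit_price is a hypothetical per-search-unit monthly price for the SKU;
    look up the real figure for your region and tier before relying on it.
    """
    if replicas < 1 or partitions < 1:
        raise ValueError("replicas and partitions must both be at least 1")
    return unit_price * replicas * partitions
```

For example, with a placeholder unit price of 100.0, a service with 3 replicas and 2 partitions would be estimated at `monthly_search_cost(100.0, 3, 2)`, that is, 6 search units at 100.0 each.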
55 changes: 55 additions & 0 deletions cleancloud/doctor/aws.py
@@ -230,6 +230,7 @@ def run_aws_doctor(profile: Optional[str], region: Optional[str] = None) -> None
     info("Permissions required (attach to your IAM role or user):")
     info(" ec2:DescribeVolumes")
     info(" ec2:DescribeSnapshots")
+    info(" ec2:DescribeSnapshotAttribute")
     info(" ec2:DescribeRegions")
     info(" ec2:DescribeAddresses")
     info(" ec2:DescribeNetworkInterfaces")
@@ -239,6 +240,8 @@ def run_aws_doctor(profile: Optional[str], region: Optional[str] = None) -> None
     info(" ec2:DescribeSecurityGroups")
     info(" rds:DescribeDBInstances")
     info(" rds:DescribeDBSnapshots")
+    info(" rds:DescribeDBSnapshotAttributes")
+    info(" cloudtrail:LookupEvents")
     info(" elasticloadbalancing:DescribeLoadBalancers")
     info(" elasticloadbalancing:DescribeTargetGroups")
     info(" logs:DescribeLogGroups")
@@ -409,6 +412,22 @@ def run_aws_doctor(profile: Optional[str], region: Optional[str] = None) -> None
             permissions_failed.append(("ec2:DescribeSnapshots", str(e)))
             warn(f"ec2:DescribeSnapshots - {e}")

+        try:
+            _snaps = ec2.describe_snapshots(OwnerIds=["self"], MaxResults=5).get("Snapshots", [])
+            if _snaps:
+                ec2.describe_snapshot_attribute(
+                    SnapshotId=_snaps[0]["SnapshotId"], Attribute="createVolumePermission"
+                )
+            permissions_tested.append("ec2:DescribeSnapshotAttribute")
+            success("ec2:DescribeSnapshotAttribute")
+        except Exception as e:
+            if "AccessDenied" in str(e) or "not authorized" in str(e).lower():
+                permissions_failed.append(("ec2:DescribeSnapshotAttribute", str(e)))
+                warn(f"ec2:DescribeSnapshotAttribute - {e}")
+            else:
+                permissions_tested.append("ec2:DescribeSnapshotAttribute")
+                success("ec2:DescribeSnapshotAttribute")
+
         try:
             ec2.describe_regions()
             permissions_tested.append("ec2:DescribeRegions")
@@ -483,6 +502,24 @@ def run_aws_doctor(profile: Optional[str], region: Optional[str] = None) -> None
             permissions_failed.append(("rds:DescribeDBSnapshots", str(e)))
             warn(f"rds:DescribeDBSnapshots - {e}")

+        try:
+            _rds_snaps = rds.describe_db_snapshots(MaxRecords=20, SnapshotType="manual").get(
+                "DBSnapshots", []
+            )
+            if _rds_snaps:
+                rds.describe_db_snapshot_attributes(
+                    DBSnapshotIdentifier=_rds_snaps[0]["DBSnapshotIdentifier"]
+                )
+            permissions_tested.append("rds:DescribeDBSnapshotAttributes")
+            success("rds:DescribeDBSnapshotAttributes")
+        except Exception as e:
+            if "AccessDenied" in str(e) or "not authorized" in str(e).lower():
+                permissions_failed.append(("rds:DescribeDBSnapshotAttributes", str(e)))
+                warn(f"rds:DescribeDBSnapshotAttributes - {e}")
+            else:
+                permissions_tested.append("rds:DescribeDBSnapshotAttributes")
+                success("rds:DescribeDBSnapshotAttributes")
+
         # Test ELB permissions
         try:
             elbv2 = session.client("elbv2", region_name=region)
@@ -563,6 +600,24 @@ def run_aws_doctor(profile: Optional[str], region: Optional[str] = None) -> None
             permissions_failed.append(("s3:GetBucketTagging", str(e)))
             warn(f"s3:GetBucketTagging - {e}")

+        # Test CloudTrail permissions (aws.ec2.instance.stopped: stopped-duration probe)
+        try:
+            from datetime import datetime, timedelta
+            from datetime import timezone as _tz
+
+            cloudtrail = session.client("cloudtrail", region_name=region)
+            _now = datetime.now(_tz.utc)
+            cloudtrail.lookup_events(
+                StartTime=_now - timedelta(hours=1),
+                EndTime=_now,
+                MaxResults=1,
+            )
+            permissions_tested.append("cloudtrail:LookupEvents")
+            success("cloudtrail:LookupEvents")
+        except Exception as e:
+            permissions_failed.append(("cloudtrail:LookupEvents", str(e)))
+            warn(f"cloudtrail:LookupEvents - {e}")
+
     except Exception:
         fail("CleanCloud cannot run safely with missing read-only permissions")
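
Note on the new probes: the EC2 and RDS attribute checks repeat the same shape. Each one calls the API, treats an authorization error as a missing permission, and counts any other error (for example, no snapshot to inspect) as evidence that the permission itself is fine. One possible way to factor that out is sketched below. The names `ACCESS_DENIED_CODES`, `is_access_denied`, and `probe` are hypothetical and not part of this PR. The sketch assumes the exceptions are botocore `ClientError` instances, which expose a parsed `response["Error"]["Code"]`, and it falls back to the substring checks used in the diff when that attribute is absent.

```python
# Error codes that AWS services commonly return for IAM authorization failures.
ACCESS_DENIED_CODES = {"AccessDenied", "AccessDeniedException", "UnauthorizedOperation"}


def is_access_denied(exc: Exception) -> bool:
    """Return True when exc looks like an authorization failure."""
    # botocore's ClientError carries a parsed response dict; other exceptions do not.
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        code = response.get("Error", {}).get("Code", "")
        if code in ACCESS_DENIED_CODES:
            return True
    # Fall back to the same substring checks the diff uses.
    text = str(exc)
    return "AccessDenied" in text or "not authorized" in text.lower()


def probe(name, call, tested, failed):
    """Run call(); record name as failed only when the error is an authorization error."""
    try:
        call()
    except Exception as exc:
        if is_access_denied(exc):
            failed.append((name, str(exc)))
            return False
    # Success, or a non-authorization error: the permission is treated as present.
    tested.append(name)
    return True
```

With a helper like this, each probe would reduce to a single `probe("ec2:DescribeSnapshotAttribute", lambda: ..., permissions_tested, permissions_failed)` call, with the `success`/`warn` output driven by the return value. The CloudTrail probe in this PR treats every exception as a failure, so it would need a stricter variant.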
