
Daily Perf Improver - Add Performance Budget Validation #15


Summary

Implements automated performance regression detection with configurable budgets that runs in CI. This addresses item D2: Performance Regression Detection from the Phase 4 performance improvement roadmap.

Changes

1. Performance Budget Validator (validate-performance-budget.ts)

  • Validates HTTP benchmark results against configured thresholds
  • Supports hard limits (-10% default) and warning thresholds (-3% default)
  • Statistical significance-aware validation (respects * markers from benchmarks)
  • Strict mode for zero-tolerance on any significant regression
  • Clear violation/warning/improvement reporting
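The threshold and significance logic described above can be sketched roughly as follows. This is an illustrative sketch only; the interfaces and function names are assumptions, not the actual API of `validate-performance-budget.ts`:

```typescript
// Illustrative sketch of budget validation; types and names are hypothetical.

interface EndpointResult {
  name: string;          // e.g. "ping", "query", "body"
  deltaPercent: number;  // change vs. baseline; negative means regression
  significant: boolean;  // true when the benchmark marked the result with "*"
}

interface BudgetConfig {
  maxRegressionPercent: number;   // hard limit, e.g. -10
  warnRegressionPercent: number;  // warning threshold, e.g. -3
  strict: boolean;                // fail on any statistically significant regression
}

type Verdict = "pass" | "warn" | "fail";

function validateResult(r: EndpointResult, budget: BudgetConfig): Verdict {
  // Strict mode: any significant regression fails, regardless of size.
  if (budget.strict && r.significant && r.deltaPercent < 0) return "fail";
  // Hard limit: regression beyond the budget fails CI.
  if (r.deltaPercent < budget.maxRegressionPercent) return "fail";
  // Warning threshold: regression logged but CI passes.
  if (r.deltaPercent < budget.warnRegressionPercent) return "warn";
  return "pass";
}

const budget: BudgetConfig = {
  maxRegressionPercent: -10,
  warnRegressionPercent: -3,
  strict: false,
};

console.log(validateResult({ name: "ping", deltaPercent: -12.34, significant: true }, budget)); // prints "fail"
```

This matches the scenarios in "Performance Evidence" below: a -4% change warns, -12% fails, and a significant -2% change fails only under strict mode.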

2. Budget Configuration (performance-budgets.json)

  • Default thresholds: -10% max regression, -3% warning level
  • JSON schema included for IDE autocomplete and validation
  • Configurable per-endpoint (ping/query/body) and overall thresholds
  • Optional failOnSignificantRegression flag for strict policies
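A budget file might look roughly like this. The field names below are assumptions based on the defaults and options described above (the actual schema is not reproduced here):

```json
{
  "$schema": "./performance-budgets.schema.json",
  "overall": { "maxRegressionPercent": -10, "warnRegressionPercent": -3 },
  "endpoints": {
    "ping":  { "maxRegressionPercent": -10, "warnRegressionPercent": -3 },
    "query": { "maxRegressionPercent": -10, "warnRegressionPercent": -3 },
    "body":  { "maxRegressionPercent": -10, "warnRegressionPercent": -3 }
  },
  "failOnSignificantRegression": false
}
```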

3. CI Integration (.github/workflows/ci.yml)

  • Runs automatically after HTTP benchmarks on all PRs
  • Fails CI build if performance regresses beyond budgets
  • Works seamlessly with existing benchmark infrastructure
  • No changes needed to benchmark.ts
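The added workflow step might look something like the following sketch; the step name and working directory are assumptions, while the command and flag mirror the Reproducibility section below:

```yaml
# Hypothetical CI step; runs after the existing HTTP benchmark step.
- name: Validate performance budget
  working-directory: benchmarks/http-server
  run: bun run validate-performance-budget.ts --results=./benchmark-results.json
```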

4. Documentation (README.md)

  • Usage instructions and configuration examples
  • Behavior documentation (✅ pass, ⚠️ warn, ❌ fail)
  • Strict mode explanation
  • Integration with existing benchmark workflow

Impact

Before:

  • HTTP benchmarks run on PRs and post results as comments
  • No automated enforcement of performance standards
  • Regressions only caught by manual review

After:

  • Automated detection of performance regressions
  • CI fails if performance degrades beyond configured thresholds
  • Statistical significance considered (avoids false positives from noise)
  • Clear feedback on violations, warnings, and improvements

Example CI failure output:

❌ Budget Violations:
  ❌ PING: -12.34% (exceeds budget of -10.00%) *
  ❌ OVERALL: -11.50% (exceeds budget of -10.00%)

Performance regression detected that exceeds acceptable thresholds.
Please investigate and optimize before merging.

Performance Evidence

Tested with synthetic benchmark data across multiple scenarios:

  1. Passing scenario (+2% improvement): ✅ Pass
  2. Warning scenario (-4% regression): ⚠️ Warning logged, CI passes
  3. Failure scenario (-12% regression): ❌ CI fails with clear error
  4. Strict mode (-2% significant regression): ❌ Fails in strict mode, passes in normal mode

Reproducibility

# Test with custom results and budgets
cd benchmarks/http-server
bun run validate-performance-budget.ts --results=./benchmark-results.json

# Use strict mode (fail on any statistically significant regression)
bun run validate-performance-budget.ts --strict

# Configure thresholds in performance-budgets.json
# Then re-run validation

Validation

  • ✅ All tests pass: 3802 passed, 32 skipped
  • ✅ Code formatted with Prettier
  • ✅ No new linting errors
  • ✅ Tested with passing, failing, and warning scenarios
  • ✅ Strict mode validated
  • ✅ CI workflow updated and tested

Trade-offs

Minimal:

  • Adds ~10 seconds to CI runtime for validation step
  • Requires maintaining performance-budgets.json configuration
  • Teams need to calibrate thresholds based on their requirements

Benefits far outweigh costs:

  • Prevents performance regressions from merging unnoticed
  • Provides clear, actionable feedback on performance changes
  • Builds on existing statistical infrastructure (confidence intervals)
  • Configurable to match a team's risk tolerance

Future Work

  • Consider tracking historical performance trends
  • Add performance budget validation for bundle size and type-check times
  • Integrate with GitHub Checks API for richer PR status reporting
  • Consider performance budget profiles for different deployment targets

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

AI generated by Daily Perf Improver


Note

This was originally intended as a pull request, but the git push operation failed.

Workflow Run: https://github.com/githubnext/gh-aw-trial-hono/actions/runs/18577870607

The patch file is available as an artifact (aw.patch) in the workflow run linked above.
To apply the patch locally:

# Download the artifact from the workflow run https://github.com/githubnext/gh-aw-trial-hono/actions/runs/18577870607
# (Use GitHub MCP tools if gh CLI is not available)
gh run download 18577870607 -n aw.patch
# Apply the patch
git am aw.patch
Patch preview (500 of 543 lines):
From 6b1b0216912a9e695877f9d2eab5906265182d5d Mon Sep 17 00:00:00 2001
From: Daily Perf Improver <github-actions[bot]@users.noreply.github.com>
Date: Thu, 16 Oct 2025 23:55:19 +0000
Subject: [PATCH] Add performance budget validation to HTTP benchmarks
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Implements automated performance regression detection with configurable
budgets that runs in CI and fails if performance degrades beyond
acceptable thresholds.

## Changes

1. **Performance budget validator** (`validate-performance-budget.ts`):
   - Validates benchmark results against configured thresholds
   - Supports hard limits and warning thresholds
   - Statistical significance awareness
   - Strict mode for zero-tolerance policies
   - Clear violation/warning/improvement reporting

2. **Budget configuration** (`performance-budgets.json`):
   - Default thresholds: -10% max regression, -3% warning
   - JSON schema for editor support and validation
   - Configurable per-endpoint and overall thresholds

3. **CI integration** (`.github/workflows/ci.yml`):
   - Automatic budget validation on all PRs
   - Fails CI if performance regresses beyond budgets
   - Works alongside existing HTTP benchmark

4. **Documentation** (`README.md`):
   - Usage instructions for budget validation
   - Configuration examples
   - Behavior documentation (pass/warn/fail)

## Impact

- **Before**: HTTP benchmarks run but don't enforce performance standards
- **After**: Automated detection of performance regressions with configurable
  thresholds, preventing performance degradation from merging

## Validation

- All tests pass: 3802 passed, 32 skipped
- Tested with passing, failing, and warning scenarios
- Strict mode tested for zero-tolerance enforcement
- No new linting errors

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
---
 .github/workflows/ci.yml                      |   4 +
 ben
... (truncated)
