TritonParse v0.4.4 Release 🎉

FindHao released this 23 Apr 03:32
· 3 commits to main since this release

TritonParse Release Notes v0.4.4 (23 commits)

  • Date range: 2026-04-09 to 2026-04-22
  • Scope: Feature release with a new compat_builder module for automated Triton/LLVM compatibility mapping, PyTorch bisection support, AI-powered diff root cause analysis, CLP archive viewer support, and reproducer correctness fixes.

Highlights

  • 🏗️ New compat_builder Module: Brand-new package (~2,085 lines across 8 modules) that automates generating commits.csv files for LLVM bumps in Triton. Uses a state-machine-driven workflow (CompatBuilder) with 7 phases, git bisect-based compatibility probing (build → import → smoke test), AI-powered fix generation via ClaudeCodeClient, CSV management with metadata headers, and a full CLI with --resume, --verify, and --status modes. Includes 200+ tests covering all pure-logic paths. Integrated into the main tritonparse CLI as the compat-build subcommand.

  • 🔍 PyTorch Bisection Support (#377): Extends the bisect module (~1,030 new lines) to bisect PyTorch commits in addition to Triton/LLVM. New TorchBisector class drives git bisect over a PyTorch repo using user-provided test scripts. Includes build infrastructure scripts for CUDA, cuSparseLt, and Magma installation, plus a prepare_build_pytorch.sh that sets up the PyTorch build environment. Accessible via tritonparse bisect --target torch.

  • 🤖 AI-Powered Diff Root Cause Analysis: Adds Phase 2 AI analysis to tritonparse diff --ai. Deterministic diff results from Phase 1 (metadata, IR stats, source mappings, tensor values) are formatted as structured markdown and sent to an LLM, which returns root cause explanations as DiffNote objects. Architecture includes a Triton-expert system prompt, priority-ordered context builder, and three response parsing strategies (JSON, structured markdown, raw text fallback). Supports both single-kernel and trace-level analysis with significance thresholds.

  • 📦 CLP Archive Support in Log Viewer (#382): The web viewer can now load and parse CLP (Compressed Log Processor) archives directly, completing the pipeline started in #326 where structured logging gained CLP output support. Updates DataSourceSelector, WelcomeScreen, and dataLoader.ts to handle CLP file selection and decompression via clp-ffi-js.

  • 🔧 OVERRIDE_TTIR Constexpr Interleaving Fix (#384): Fixes a TypeError that broke all triton-mpp analyze subcommands (ncu, barrier-analysis, plot-sm-occupancy) when kernel signatures interleave constexpr and non-constexpr parameters. The OVERRIDE_TTIR reproducer branch was removing constexpr args from positional lists, shifting remaining args into wrong positions. Fix passes all non-constexpr args as keyword args, eliminating position-dependent binding entirely.

  • 📝 Documentation Overhaul: Moves all GitHub Wiki pages into a version-controlled docs/ directory (~5,000 lines) with automatic wiki sync via GitHub Actions. Updates API signatures, adds documentation for diff, bisect, and compat-build subcommands, fixes outdated environment variable references, and corrects test commands.
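The constexpr interleaving failure described in the highlights can be reproduced in plain Python. The sketch below is illustrative only: kernel, PARAMS, CONSTEXPRS, and RUNTIME_VALUES are hypothetical stand-ins, not TritonParse or Triton code, and it assumes the constexpr values are supplied by keyword in both variants.

```python
# Hypothetical stand-in for a kernel whose signature interleaves a
# constexpr parameter (BLOCK) between runtime parameters.
def kernel(x_ptr, BLOCK, y_ptr, n):
    return (x_ptr, BLOCK, y_ptr, n)

PARAMS = ["x_ptr", "BLOCK", "y_ptr", "n"]             # signature order
CONSTEXPRS = {"BLOCK": 128}                            # assumed constexpr values
RUNTIME_VALUES = {"x_ptr": "x", "y_ptr": "y", "n": 4}  # assumed runtime args


def launch_positional():
    """Buggy pattern: drop constexprs, pass the remainder positionally."""
    positional = [RUNTIME_VALUES[p] for p in PARAMS if p not in CONSTEXPRS]
    # "y" now lands in the BLOCK slot, so BLOCK is bound twice -> TypeError.
    return kernel(*positional, **CONSTEXPRS)


def launch_keyword():
    """Fixed pattern: bind every non-constexpr argument by name."""
    keyword = {p: RUNTIME_VALUES[p] for p in PARAMS if p not in CONSTEXPRS}
    return kernel(**keyword, **CONSTEXPRS)
```

Because keyword binding ignores position, the second variant keeps working no matter where constexpr parameters sit in the signature.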

Changes by Area

🏗️ New compat_builder Module

  • State Machine (state.py): CompatBuildPhase 7-phase enum (INITIALIZING → COMPLETED/FAILED), CompatBuildState dataclass with JSON serialization, CompatStateManager for persistence. 218 lines + 251 lines of tests.
  • Core Builder (builder.py, PR2-01): CompatBuilder orchestrator driving the initialize → find_next_incompatible → record_pair → fix_incompatibility loop. 773 lines + 634 lines of tests.
  • CSV Manager (csv_manager.py, PR2-02): CSVManager and BumpBlock for reading, validating, and writing single-bump CSV files with metadata headers. 261 lines + 413 lines of tests.
  • AI Fixer (ai_fixer.py, PR3-01): AI-powered compatibility fixing following a two-phase (deterministic context + AI) pattern. System prompt encoding LLVM API change patterns, structured context builder, AICompatFixer orchestrator. 442 lines + 346 lines of tests.
  • CLI (cli.py, PR3-02): Four modes: default build, --resume, --verify, --status. AI control flags (--ai/--no-ai, --ai-model) and worktree management. 364 lines + 225 lines of tests.
  • CLI Integration (PR3-03): compat-build subparser wired into the main tritonparse CLI.
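A phase enum combined with a JSON-serializable dataclass is what makes a --resume style workflow possible. The sketch below shows the general pattern only. SketchPhase, SketchState, and their fields are hypothetical; only INITIALIZING, COMPLETED, and FAILED are phase names taken from these notes, and the real module has 7 phases.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class SketchPhase(Enum):
    INITIALIZING = "initializing"
    PROBING = "probing"    # hypothetical intermediate phase
    FIXING = "fixing"      # hypothetical intermediate phase
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class SketchState:
    phase: SketchPhase = SketchPhase.INITIALIZING
    # Hypothetical field: [triton_commit, llvm_commit] pairs recorded so far.
    recorded_pairs: list = field(default_factory=list)

    def to_json(self) -> str:
        data = asdict(self)
        data["phase"] = self.phase.value  # enums are not JSON-serializable
        return json.dumps(data)

    @classmethod
    def from_json(cls, text: str) -> "SketchState":
        data = json.loads(text)
        return cls(
            phase=SketchPhase(data["phase"]),
            recorded_pairs=data["recorded_pairs"],
        )
```

Writing the result of to_json() to disk after each phase transition would let a later run restore the state with from_json() and continue from the saved phase.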

🔍 Bisect Enhancements

  • PyTorch Bisection (#377): New TorchBisector class (142 lines), shell scripts for CUDA/cuSparseLt/Magma installation and PyTorch builds (~644 lines), CLI extension with --target torch. 130 lines of tests.
  • Torch Bisect Script Fixes (#383): Set up CUDA_HOME, install cuSparseLt libraries, and install CI requirements across all bisect build scripts.
  • LLVM Path Comment Fix: Corrected misleading comments in bisect scripts about .llvm-project/ vs llvm-project/ directory layout.
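For readers unfamiliar with bisection, the search that git bisect performs over a linear history can be sketched in a few lines. This is not TorchBisector's implementation: first_bad_commit and is_good are hypothetical names, and the sketch assumes the commits are ordered oldest to newest, the first is known good, and the last is known bad. In a real run, the is_good predicate would correspond to building a commit and running the user-provided test script.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Binary-search for the first bad commit in a linear history.

    Assumes commits[0] is good and commits[-1] is bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]
```

Each step halves the candidate range, so a range of N commits needs about log2(N) builds, which matters when every probe is a full PyTorch build.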

🤖 AI & Diff

  • AI Root Cause Analysis for Diff: diff/fb/ai/ module with system prompt, context builder, and AIDiffAnalyzer orchestrator. --ai flag for both single-kernel and trace-level diff modes. 390 lines + tests (moved to tests/fb/diff/).
  • AI Diff Test Relocation: Moved fb-only AI diff tests from tests/cpu/diff/ to tests/fb/diff/ to prevent ModuleNotFoundError on GitHub CI.
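The three-strategy response parsing mentioned in the highlights (JSON, structured markdown, raw text fallback) follows a common fallback-chain pattern. A minimal sketch of that pattern is below. The function name parse_notes, the dict shape with title and body keys, and the "## " heading convention are all assumptions for illustration; the actual DiffNote fields and prompt format are not described in these notes.

```python
import json


def parse_notes(response: str) -> list[dict]:
    """Try JSON, then markdown headings, then fall back to raw text."""
    # Strategy 1: the response is a JSON list of note objects.
    try:
        data = json.loads(response)
        if isinstance(data, list) and all(isinstance(d, dict) for d in data):
            return data
    except json.JSONDecodeError:
        pass

    # Strategy 2: "## Title" headings, each followed by body lines.
    notes = []
    for line in response.splitlines():
        if line.startswith("## "):
            notes.append({"title": line[3:].strip(), "body": ""})
        elif notes:
            notes[-1]["body"] = (notes[-1]["body"] + "\n" + line).strip()
    if notes:
        return notes

    # Strategy 3: keep the raw text as a single note.
    return [{"title": "analysis", "body": response.strip()}]
```

Ordering the strategies from strictest to loosest means a well-formed response is parsed precisely, while a malformed one still yields something a user can read.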

🔧 Reproducer Fixes

  • OVERRIDE_TTIR Constexpr Fix (#384): Pass non-constexpr args as keyword args in the override branch, preventing TypeError when constexprs are interleaved with positional args. 123 lines of new tests.
  • num_warps_base Extraction: Extract original num_warps from TTGIR ttg.num-warps module attribute during parse phase, storing it as metadata["num_warps_base"]. Fixes warp-specialized kernels reporting inflated warp counts to the reproducer and viewer.
  • Per-Hash Tensor Blob Saving (#380): Tensor blob saving counter changed from global to per-compilation-hash. Each autotuned config saves exactly one set of blobs instead of only the first winner. Benchmark (autotune timing) launches are now always skipped.
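The difference between a global and a per-compilation-hash save counter is easy to see in a small model. The class below is a hypothetical sketch, not TritonParse code: BlobSaveLimiter, should_save, and the is_benchmark_launch flag are assumed names chosen to mirror the behavior described above.

```python
from collections import defaultdict


class BlobSaveLimiter:
    """Allow at most max_per_hash blob saves for each compilation hash."""

    def __init__(self, max_per_hash: int = 1):
        self.max_per_hash = max_per_hash
        self._saved = defaultdict(int)  # compilation hash -> saves so far

    def should_save(self, compilation_hash: str, is_benchmark_launch: bool = False) -> bool:
        # Autotune timing launches never save blobs.
        if is_benchmark_launch:
            return False
        if self._saved[compilation_hash] >= self.max_per_hash:
            return False
        self._saved[compilation_hash] += 1
        return True
```

With a single global counter, the first hash to save would exhaust the budget for every other config; keying the counter by hash gives each config its own budget.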

🌐 Website & Viewer

  • CLP Archive Loading (#382): clp-ffi-js integration for decompressing and parsing CLP archives in the browser-based log viewer.
  • ESLint 10 Upgrade (#378): ESLint v9 → v10, react-hooks canary channel, React 19.2.5, Vite 8.0.7, TypeScript-ESLint 8.58.1.
  • ESLint 10 Lint Fixes (#379): Comprehensive fixes across 10 files for new lint rules: lazy state initialization, useCallback wrapping, extracted utility modules, error cause chaining.
  • Vite Security Bump (#381): Vite 8.0.7 → 8.0.8 (dependabot).

⚙️ Internal Improvements

  • TRITONPARSE_FB_MODE Env Var: Override is_fbcode() detection with TRITONPARSE_FB_MODE=0 (OSS) or =1 (fbcode). Fixes ImportError when running inside fbsource without Meta-internal dependencies.
  • Torch as Hard Dependency: Removed TORCH_INSTALLED conditional flag and 12 guard branches in structured_logging.py. Torch was already a de facto hard dependency.
  • FileCheck Binary Detection: Check package root, AMD backend, and NVIDIA backend paths (not just AMD), matching Triton's own _filecheck.py convention.
  • importlib.resources for Procedure Checks: Fix default_procedure_checks.json loading in PAR archives by switching from Path(__file__).parent to importlib.resources.files().
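The move from Path(__file__).parent to importlib.resources.files() matters because packaged archives may not expose modules as ordinary files on disk. The sketch below shows the loading pattern against a throwaway package created at runtime. The package name demo_pkg, the file checks.json, and both function names are hypothetical and exist only so the example runs on its own.

```python
import importlib
import json
import sys
import tempfile
from importlib.resources import files
from pathlib import Path


def load_packaged_json(package: str, resource: str):
    """Read a JSON resource through the package's loader, not via __file__."""
    text = files(package).joinpath(resource).read_text(encoding="utf-8")
    return json.loads(text)


def demo():
    """Build a temporary package with a JSON data file and load it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("", encoding="utf-8")
        (pkg / "checks.json").write_text('{"checks": ["a"]}', encoding="utf-8")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return load_packaged_json("demo_pkg", "checks.json")
        finally:
            # Undo the temporary import state.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg", None)
```

The same load_packaged_json call works whether the package lives in a directory or inside an archive, as long as the package's loader supports resource access.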

📝 Documentation & CI

  • Wiki → docs/ Migration: 10 wiki pages (5,000+ lines) moved into version-controlled docs/ directory with automatic sync via GitHub Actions.
  • Wiki Sync Regex Fix (#390): Escape literal ) in sed extended regex to fix sync-wiki.yml workflow.

Compatibility Notes

  • torch is now a hard dependency: The TORCH_INSTALLED guard has been removed. Environments without PyTorch installed will fail at import time rather than silently degrading.
  • TRITONPARSE_FB_MODE env var: New escape hatch for users running inside fbsource without full Meta-internal dependencies. Set TRITONPARSE_FB_MODE=0 to force OSS mode.
  • No other breaking changes to the public API.
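As an illustration of how an environment-variable override like TRITONPARSE_FB_MODE can sit in front of auto-detection, consider the sketch below. It is an assumption about the general shape, not the actual is_fbcode() implementation: resolve_fb_mode and the detect callback are hypothetical names, and treating values other than "0" or "1" as "no override" is a choice made for this example.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_fb_mode(
    detect: Callable[[], bool],
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return the override if TRITONPARSE_FB_MODE is "0" or "1", else auto-detect."""
    env = os.environ if environ is None else environ
    override = env.get("TRITONPARSE_FB_MODE")
    if override == "1":
        return True   # force fbcode mode
    if override == "0":
        return False  # force OSS mode
    return detect()   # unset or unrecognized: fall back to detection
```

Checking the override before detection is what makes it an escape hatch: the detection logic never runs when the user has stated the mode explicitly.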