[Bench][Manifest] Mark norm MoE and argreduce benchmarks driven by superAngGao · Pull Request #1589 · tile-ai/TileOPs

superAngGao · 2026-06-16T07:33:54Z

Summary

mark normalization benchmark entries as manifest-driven
mark the remaining MoE/grouped-GEMM benchmark entries as manifest-driven
mark argmax/argmin benchmark entries as manifest-driven
route the 3WG fused MoE experts benchmark roofline through op.eval_roofline() instead of duplicated formulas

This moves implemented benchmark manifest coverage from 108/126 to 124/126. The two remaining implemented gaps are Conv1dFwdOp and Conv1dBiasFwdOp, which are outside the scope of this PR.

Closes #1561
Closes #1562
Closes #1563

Validation

python -m ruff check benchmarks/ops/bench_fused_moe_experts.py
python scripts/validate_manifest.py --levels schema,shape,dtype,bench
PYTHONPATH=$PWD python scripts/manifest_stats.py --format text
python -m pytest --collect-only -q benchmarks/ops/bench_ada_layer_norm.py benchmarks/ops/bench_batch_norm.py benchmarks/ops/bench_fused_add_layer_norm.py benchmarks/ops/bench_fused_add_rms_norm.py benchmarks/ops/bench_group_norm.py benchmarks/ops/bench_instance_norm.py benchmarks/ops/bench_layer_norm.py benchmarks/ops/bench_rms_norm.py benchmarks/ops/bench_fused_moe_experts.py benchmarks/ops/bench_moe_grouped_gemm_nopad.py benchmarks/ops/bench_argreduce.py

gemini-code-assist

Code Review

This pull request refactors MoEExpertsBenchmark to accept an operator parameter and cache roofline evaluation results for FLOPs and memory calculations. Additionally, it updates various operator manifests in moe.yaml, normalization.yaml, and reduction.yaml to enable bench_manifest_driven: true. The feedback suggests initializing the _roofline_cache attribute inside the __init__ constructor rather than as a class attribute to align with idiomatic Python practices and prevent shared state issues.
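To illustrate the shared-state concern, here is a minimal sketch rather than the PR's actual code. The classes SharedCacheBenchmark, InstanceCacheBenchmark, and FakeOp are hypothetical, and the tuple returned by eval_roofline() is an assumption, since the thread does not show its return type. The hazard appears when a class-level cache is a mutable object that instances write into.

```python
class FakeOp:
    """Hypothetical stand-in for an op exposing eval_roofline()."""

    def __init__(self, flops, mem_bytes):
        self.flops = flops
        self.mem_bytes = mem_bytes
        self.calls = 0  # counts how often the roofline is evaluated

    def eval_roofline(self):
        # Assumed return shape (flops, bytes); the real API may differ.
        self.calls += 1
        return (self.flops, self.mem_bytes)


class SharedCacheBenchmark:
    # Class-level dict: every instance reads and writes the same object.
    _roofline_cache = {}

    def __init__(self, op):
        self.op = op

    def roofline(self):
        if "result" not in self._roofline_cache:
            self._roofline_cache["result"] = self.op.eval_roofline()
        return self._roofline_cache["result"]


class InstanceCacheBenchmark:
    def __init__(self, op):
        self.op = op
        # Per-instance cache, created in the constructor.
        self._roofline_cache = None

    def roofline(self):
        if self._roofline_cache is None:
            self._roofline_cache = self.op.eval_roofline()
        return self._roofline_cache
```

With the class-level dict, a second benchmark built for a different op returns the first op's cached numbers. The instance-level version evaluates each op once and keeps the results separate.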

Gabbering

goose skimmed cccada0 — nothing to honk about.

RMLYC

I reviewed this against the stated benchmark-manifest migration plan. The normalization flips look consistent with the existing manifest-driven benchmark files, and I do not see obvious over-engineering in the small custom MoE benchmark adapter.

I found two issues that should be addressed before this is merged:

The argreduce benchmark still skips known large-N manifest workloads at runtime, so marking Argmax/Argmin as bench_manifest_driven: true is premature under the plan gate.
The fused MoE experts benchmark derives pytest ids from manifest dtype entries, but the test still hardcodes bf16. That is fine for current bf16-only workloads, but after this flag is true it will silently mis-benchmark any future fp16 workload.
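The dtype concern in the second point can be sketched as follows. This is an illustration under stated assumptions, not the benchmark's real code: the WORKLOADS rows, the SUPPORTED_DTYPES mapping, and make_params are hypothetical, and the actual manifest schema in moe.yaml may differ. The idea is that the test id and the dtype used by the test both come from the same manifest field, and an unsupported dtype fails loudly instead of being benchmarked as bf16.

```python
# Hypothetical manifest rows; the real schema may differ.
WORKLOADS = [
    {"name": "experts-small", "dtype": "bfloat16"},
    {"name": "experts-large", "dtype": "float16"},
]

# Hypothetical mapping from manifest dtype names to short id suffixes.
SUPPORTED_DTYPES = {"bfloat16": "bf16", "float16": "fp16"}


def make_params(workloads):
    """Return (test_id, dtype) pairs derived from one manifest field."""
    params = []
    for workload in workloads:
        dtype = workload["dtype"]
        if dtype not in SUPPORTED_DTYPES:
            # Fail at collection time rather than silently run another dtype.
            raise ValueError(f"unsupported dtype in manifest: {dtype!r}")
        test_id = f"{workload['name']}-{SUPPORTED_DTYPES[dtype]}"
        params.append((test_id, dtype))
    return params
```

The resulting pairs could then be passed to pytest.mark.parametrize, with the ids taken from the first element and the dtype argument taken from the second, so the two cannot drift apart.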

Validation I checked locally on the PR worktree:

python scripts/validate_manifest.py --levels schema,shape,dtype,bench passes.
PYTHONPATH=$PWD python scripts/manifest_stats.py --format text reports bench_manifest_driven 124/144.
git diff --check 283f41476bb16aa40c07f1fe813f80e2bbcdd09e cccada04a0e0250b7f3058e4ed841b27462c5bb2 passes.

I could not rerun the PR ruff or pytest collect-only commands in this local environment because the active Python env is missing ruff and pytest.

Gabbering

goose skimmed 7cbc3ac — nothing to honk about.

Ibuki-wind

Overall

One manifest-driven flip is premature because a manifest workload set still has expected-failure benchmark cases.

superAngGao · 2026-06-17T11:54:32Z

Thanks for the review. Addressed in e12a854.

Changes made:

Removed bench_manifest_driven: true from FusedAddRMSNormFwdOp while bench_fused_add_rms_norm.py still xfails the llama-3.1-405b-* manifest workloads.
Moved FusedAddRMSNormBenchmark._roofline_cache into instance initialization for consistency with the MoE benchmark adapter cleanup.
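One way to picture the gate discussed in this thread is the small check below. It is a sketch under assumptions: can_mark_manifest_driven and the outcome labels "run", "skip", and "xfail" are hypothetical names, and the project's actual plan gate is not shown in this conversation. The rule it encodes is that an op should only carry bench_manifest_driven: true when every manifest workload runs without a skip or expected failure.

```python
def can_mark_manifest_driven(case_outcomes):
    """Decide whether an op may be flagged as manifest-driven.

    case_outcomes maps a workload id to one of the hypothetical labels
    "run", "skip", or "xfail". Returns (allowed, blocking_ids).
    """
    blocking = sorted(
        workload_id
        for workload_id, outcome in case_outcomes.items()
        if outcome != "run"
    )
    return (not blocking, blocking)
```

For example, a mapping with one "xfail" entry returns False together with that workload id, which matches the reasoning for leaving FusedAddRMSNormFwdOp unflipped.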

Validated in the nightly docker environment (tileops-runner:nightly-tl019-fullstack-no-tileops-ldfix):

python -m ruff check benchmarks/ops/bench_fused_add_rms_norm.py benchmarks/ops/bench_fused_moe_experts.py
python scripts/validate_manifest.py --levels schema,shape,dtype,bench
PYTHONPATH=$PWD python scripts/manifest_stats.py --format text
python -m pytest --collect-only -q benchmarks/ops/bench_fused_add_rms_norm.py benchmarks/ops/bench_fused_moe_experts.py benchmarks/ops/bench_argreduce.py

Results:

ruff passed
manifest validation passed in advisory mode
manifest stats now reports bench_manifest_driven 121/144
collect-only: 21 tests collected

Gabbering

goose skimmed e12a854 — nothing to honk about.

Ibuki-wind

Overall

Approval is blocked on PR metadata, not the code diff. The body still says argmax/argmin are marked manifest-driven and reports 108/126 -> 124/126, but the current head leaves those entries and FusedAddRMSNormFwdOp unflipped, and manifest_stats.py reports bench_manifest_driven 121/144. Update the PR body to describe the final merged state, and edit the latest author reply to outcome-only form such as "Done in e12a854."

[Bench][Manifest] Mark norm MoE and argreduce benchmarks driven

cccada0

superAngGao requested a review from a team June 16, 2026 07:33

github-actions Bot added the bench Benchmark updates label Jun 16, 2026

gemini-code-assist Bot reviewed Jun 16, 2026

Comment thread benchmarks/ops/bench_fused_moe_experts.py Outdated

Gabbering reviewed Jun 16, 2026

superAngGao requested a review from RMLYC June 16, 2026 09:33

RMLYC requested changes Jun 16, 2026

Comment thread tileops/manifest/reduction.yaml Outdated

Comment thread tileops/manifest/reduction.yaml Outdated

Comment thread benchmarks/ops/bench_fused_moe_experts.py Outdated

Address manifest benchmark review comments

7cbc3ac

Gabbering reviewed Jun 16, 2026

superAngGao requested a review from RMLYC June 16, 2026 11:25

Ibuki-wind requested changes Jun 16, 2026

Comment thread tileops/manifest/normalization.yaml Outdated

Address manifest benchmark review feedback

e12a854

Gabbering reviewed Jun 17, 2026

Ibuki-wind requested changes Jun 17, 2026

Ibuki-wind approved these changes Jun 17, 2026

RMLYC approved these changes Jun 18, 2026

lcy-seso merged commit cdb6b0b into tile-ai:main Jun 18, 2026
13 checks passed

lcy-seso merged 3 commits into tile-ai:main from superAngGao:bench/issue-1561-1563/manifest-coverage
