Commit 1b390fb

Added benchmarking for new torchao low precision attention api (#3865)

## Summary

- Added new benchmark for new low precision attention API
- Can set baseline and test models between different backends: (fa2, fa3, fa3_fp8, fa4, fa4_fp8)
- Uses flux.1-schnell model, 4 inference steps, DrawBench prompts
- Has options to control number of prompts, torch.compile usage, warmup_iters, using debug prompts, number of inference steps, rope fusion
- Following the guidelines of #3502

## Example Run

python benchmarks/prototype/attention/eval_flux_model.py --baseline fa3 --test fa3_fp8 --compile
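The baseline-versus-test comparison with warmup iterations described above can be sketched roughly as follows. This is a minimal illustration, not the code in `eval_flux_model.py`: the helper names `time_callable` and `compare`, their parameters, and the returned dictionary keys are all hypothetical, and the callables stand in for whatever runs the model under each backend. For GPU work, a real harness would also need to synchronize the device (for example with `torch.cuda.synchronize()`) before reading the clock.

```python
import time
from statistics import mean
from typing import Callable, Dict

# Backend names as listed in the commit message.
BACKENDS = ("fa2", "fa3", "fa3_fp8", "fa4", "fa4_fp8")


def time_callable(fn: Callable[[], object], warmup_iters: int = 2,
                  timed_iters: int = 5) -> float:
    """Return the mean wall-clock seconds per call of fn (hypothetical helper)."""
    if timed_iters < 1:
        raise ValueError("timed_iters must be at least 1")
    # Warmup calls are run but not timed, so one-off costs
    # (compilation, caching) do not skew the measurement.
    for _ in range(warmup_iters):
        fn()
    samples = []
    for _ in range(timed_iters):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return mean(samples)


def compare(baseline_fn: Callable[[], object], test_fn: Callable[[], object],
            warmup_iters: int = 2, timed_iters: int = 5) -> Dict[str, float]:
    """Time both callables and report the speedup of test over baseline."""
    baseline_s = time_callable(baseline_fn, warmup_iters, timed_iters)
    test_s = time_callable(test_fn, warmup_iters, timed_iters)
    # Guard against a zero reading on a coarse clock.
    speedup = baseline_s / test_s if test_s > 0 else float("inf")
    return {"baseline_s": baseline_s, "test_s": test_s, "speedup": speedup}
```

In this sketch a speedup above 1.0 means the test callable ran faster than the baseline; the actual script may report different metrics.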

1 parent 2ec82b3, commit 1b390fb

1 file changed

benchmarks/prototype/attention
- eval_flux_model.py
