Commit 1b390fb
Added benchmarking for new torchao low-precision attention API (#3865)
## Summary
- Adds a benchmark for the new low-precision attention API.
- The baseline and test models can each be set to one of these backends: fa2, fa3, fa3_fp8, fa4, fa4_fp8.
- Uses the FLUX.1-schnell model with 4 inference steps and DrawBench prompts.
- Provides options to control the number of prompts, torch.compile usage, warmup_iters, use of debug prompts, the number of inference steps, and RoPE fusion.
- Follows the guidelines of #3502.
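
The options listed above could be exposed through a command-line parser along the following lines. This is only a sketch, not the script's actual code: `--baseline`, `--test`, and `--compile` appear in the example run, but every other flag name (`--num_prompts`, `--warmup_iters`, `--debug_prompts`, `--num_inference_steps`, `--fuse_rope`) and every default except the 4 inference steps is an assumption made for illustration.

```python
import argparse

# Backend names are taken from the commit summary.
BACKENDS = ["fa2", "fa3", "fa3_fp8", "fa4", "fa4_fp8"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser resembling the benchmark's CLI (flag names partly assumed)."""
    parser = argparse.ArgumentParser(
        description="Compare two attention backends on an image-generation model."
    )
    # These three flags appear in the commit's example run.
    parser.add_argument("--baseline", choices=BACKENDS, default="fa3")
    parser.add_argument("--test", choices=BACKENDS, default="fa3_fp8")
    parser.add_argument("--compile", action="store_true",
                        help="Wrap the model with torch.compile.")
    # Everything below is hypothetical: the commit lists these options
    # but does not show their flag names or default values.
    parser.add_argument("--num_prompts", type=int, default=None,
                        help="Limit the number of prompts (None = all).")
    parser.add_argument("--warmup_iters", type=int, default=2,
                        help="Untimed iterations run before measurement.")
    parser.add_argument("--debug_prompts", action="store_true",
                        help="Use a small debug prompt set.")
    parser.add_argument("--num_inference_steps", type=int, default=4,
                        help="Denoising steps per image (4 per the summary).")
    parser.add_argument("--fuse_rope", action="store_true",
                        help="Enable RoPE fusion.")
    return parser


if __name__ == "__main__":
    # Parse the same arguments as the example run in the commit message.
    args = build_parser().parse_args(
        ["--baseline", "fa3", "--test", "fa3_fp8", "--compile"]
    )
    print(args.baseline, args.test, args.compile)
```

Restricting `--baseline` and `--test` with `choices` makes an unsupported backend name fail at parse time rather than partway through a run; whether the real script does this is not stated in the commit.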
## Example Run
python benchmarks/prototype/attention/eval_flux_model.py --baseline fa3 --test fa3_fp8 --compile

Parent: 2ec82b3
1 file changed
Lines changed: 460 additions & 0 deletions