Commit a1bb31f
Update base for Update on "Add support for flashinfer quantize kernel option for nvfp4"
Summary:
Added the flashinfer option for better performance on some of the workflows
we are interested in. Also added a numerical equivalence test between the
different nvfp4_quantize_kernel_choice options.
Test Plan:
pytest test/prototype/mx_formats/test_nvfp4_tensor.py -k test_kernel_preference_numerical_equivalence
We'll benchmark the speedup later.
Reviewers:
Subscribers:
Tasks:
Tags:
[ghstack-poisoned]
2 files changed
Lines changed: 10 additions & 1 deletion
File tree
- test
  - prototype/mx_formats
  - quantization/pt2e