Commit 2880f5c

Update on "Add support for flashinfer quantize kernel option for nvfp4"
Summary:
Added the flashinfer option for better performance on some of the workflows we are interested in. Also added a numerical equivalence test between the different nvfp4_quantize_kernel_choice options.

Test Plan:
pytest test/prototype/mx_formats/test_nvfp4_tensor.py -k test_kernel_preference_numerical_equivalence

We'll test the speedup a bit later.

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
2 parents 5cf6e51 + a4e7c33 commit 2880f5c

1 file changed

Lines changed: 1 addition & 0 deletions

torchao/prototype/mx_formats/config.py

@@ -49,6 +49,7 @@ class QuantizeToNVFP4KernelChoice(str, Enum):
 
 torch.serialization.add_safe_globals([QuantizeToNVFP4KernelChoice])
 
+
 # register as pytree constant so we can use dynamo nonstrict trace in torchao.prototype.moe_training.ep
 @register_as_pytree_constant
 class ScaleCalculationMode(Enum):
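
The hunk header shows that QuantizeToNVFP4KernelChoice subclasses both str and Enum. As a rough illustration of what that pattern generally provides (string comparison and plain-string serialization), here is a minimal standard-library sketch. It is not the torchao code: the class name KernelChoiceSketch, the member names, and the helper parse_kernel_choice are hypothetical, chosen only for illustration.

```python
import json
from enum import Enum


class KernelChoiceSketch(str, Enum):
    """Hypothetical stand-in for a kernel-choice enum; member names are assumed."""

    TORCH = "torch"
    FLASHINFER = "flashinfer"


def parse_kernel_choice(value):
    """Accept either an enum member or its string value (hypothetical helper)."""
    return KernelChoiceSketch(value)


# The str mixin lets members compare equal to plain strings.
assert KernelChoiceSketch.FLASHINFER == "flashinfer"

# Members serialize as ordinary JSON strings and can be parsed back by value.
config_json = json.dumps({"kernel": KernelChoiceSketch.FLASHINFER})
round_tripped = parse_kernel_choice(json.loads(config_json)["kernel"])
```

An unknown string passed to parse_kernel_choice raises ValueError, which is one reason a config field might be typed as an enum rather than a free-form string.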
