Commit d48f4a0

improve mslk docs
Summary:
1. clearly call mslk out in main readme
2. clearly call mslk out in `NVFP4DynamicActivationNVFP4WeightConfig`

Test Plan: CI

ghstack-source-id: 416ce8c
ghstack-comment-id: 4057318351
Pull-Request: #4077
1 parent 77be1e0 commit d48f4a0

3 files changed

Lines changed: 26 additions & 1 deletion

README.md

Lines changed: 11 additions & 0 deletions
@@ -110,6 +110,17 @@ pip install torchao
 
 Please see the [torchao compability table](https://github.com/pytorch/ao/issues/2919) for version requirements for dependencies.
 
+### Optional Dependencies
+
+[MSLK](https://github.com/pytorch/MSLK) is an optional runtime dependency that provides accelerated kernels for some of the workflows in torchao. Stable MSLK should be used with stable torchao, and nightly MSLK with nightly torchao.
+```bash
+# Stable
+pip install mslk-cuda==1.0.0
+
+# Nightly
+pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
+```
+
 ## 🔎 Inference
 
 TorchAO delivers substantial performance gains with minimal code changes:
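
The pairing rule in the added README text (stable MSLK with stable torchao, nightly with nightly) could be checked at runtime along the following lines. This is a minimal sketch and not part of the commit. The helper names are hypothetical, the distribution names `torchao`, `mslk-cuda`, and `mslk` are taken from the pip commands above, and the check assumes that nightly wheels carry a PEP 440 `.dev` segment in their version string.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(dist_names: Tuple[str, ...]) -> Optional[str]:
    """Return the version of the first installed distribution, or None."""
    for name in dist_names:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


def is_nightly(version: str) -> bool:
    # Assumption: nightly builds use a ".dev" version segment,
    # e.g. "1.1.0.dev20250101" (a made-up version for illustration).
    return ".dev" in version


def channels_match(torchao_version: str, mslk_version: str) -> bool:
    """True when both packages come from the same channel (stable or nightly)."""
    return is_nightly(torchao_version) == is_nightly(mslk_version)


torchao_version = installed_version(("torchao",))
# The stable wheel is installed as "mslk-cuda", the nightly wheel as "mslk".
mslk_version = installed_version(("mslk-cuda", "mslk"))

if torchao_version is None or mslk_version is None:
    print("torchao and/or MSLK is not installed; nothing to compare")
elif channels_match(torchao_version, mslk_version):
    print("torchao and MSLK appear to come from the same channel")
else:
    print("warning: torchao and MSLK appear to mix stable and nightly builds")
```

The check only compares release channels. It does not verify the exact version requirements, which are tracked in the compatibility table linked from the README.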

docs/source/index.rst

Lines changed: 13 additions & 0 deletions
@@ -67,6 +67,19 @@ Other installation options:
 
 Please see the `torchao compatibility table <https://github.com/pytorch/ao/issues/2919>`__ for version requirements for dependencies.
 
+Optional Dependencies
+^^^^^^^^^^^^^^^^^^^^^
+
+`MSLK <https://github.com/pytorch/MSLK>`__ is an optional runtime dependency that provides accelerated kernels for some of the workflows in torchao. Stable MSLK should be used with stable torchao, and nightly MSLK with nightly torchao.
+
+.. code:: bash
+
+   # Stable
+   pip install mslk-cuda==1.0.0
+
+   # Nightly
+   pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
+
 .. toctree::
    :glob:
    :maxdepth: 1

torchao/prototype/mx_formats/inference_workflow.py

Lines changed: 2 additions & 1 deletion
@@ -204,7 +204,8 @@ class NVFP4DynamicActivationNVFP4WeightConfig(AOBaseConfig):
     set to False.
 
     Configuration parameters:
-    - use_triton_kernel: bool, whether to use fused triton kernel for activation scaling (default: True)
+    - use_triton_kernel: bool, whether to use fused triton kernel for activation scaling (default: True).
+      Requires `MSLK <https://github.com/pytorch/MSLK>`__ to be installed.
     - use_dynamic_per_tensor_scale: bool, whether to dynamically compute per tensor scale (default: True)
     - step: Optional[QuantizationStep], the quantization step for observer-based flow
     - Data: float4_e2m1fn_x2
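
Since the updated docstring states that `use_triton_kernel` requires MSLK, a caller might gate the flag on whether MSLK is importable. The snippet below is a hedged sketch, not code from the commit. `module_available` is a hypothetical helper, and the top-level module name `mslk` is an assumption, because the diff only shows the pip distribution names.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


# Assumption: the MSLK wheels expose a top-level module named "mslk".
use_triton_kernel = module_available("mslk")
print(f"use_triton_kernel={use_triton_kernel}")

# With torchao installed, the flag could then be passed to the config
# (class name and module path taken from the file shown in this diff):
#
# from torchao.prototype.mx_formats.inference_workflow import (
#     NVFP4DynamicActivationNVFP4WeightConfig,
# )
# config = NVFP4DynamicActivationNVFP4WeightConfig(
#     use_triton_kernel=use_triton_kernel,
# )
```

Falling back to `use_triton_kernel=False` when MSLK is missing keeps the workflow usable without the optional dependency, at the cost of not using the fused kernel for activation scaling.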
