Commit fee43b5

Remove tutorials/calibration_flow/ and update stale references (#4257)
Summary: Remove the calibration_flow tutorial folder (static_quant.py, gptq_like.py, awq_like.py), which depended on LinearActivationQuantizedTensor and other deprecated AQT APIs. Update documentation references that pointed to these deleted files.

Test Plan:
- Verified no remaining references to calibration_flow/, static_quant.py, gptq_like.py, or awq_like.py in the codebase
- Doc links updated to point to remaining valid resources

[ghstack-poisoned]
1 parent d8ad36d commit fee43b5

File tree

6 files changed: +4 −835 lines

docs/source/contributing/quantization_overview.rst

Lines changed: 1 addition & 1 deletion
@@ -141,7 +141,7 @@ We'll skip the instruction for now since we haven't seen many use cases for stat
 Other Quantization Flows
 ########################

-For other quantization flow/algorithms that does not fit into any of the above, we also intend to provide examples for common patterns. For example, `GPTQ like quantization flow <https://github.com/pytorch/ao/blob/e283743b3cc4612bb641b88dca3670231724d396/tutorials/calibration_flow/gptq_like.py>`__ that is adopted by `Autoround <https://github.com/pytorch/ao/blob/e283743b3cc4612bb641b88dca3670231724d396/torchao/prototype/autoround/README.md>`__, it uses `MultiTensor <https://gist.github.com/HDCharles/a1b575bbf8875f994af8a01b225e1227>`__ and module hooks to optimize the module.
+For other quantization flows/algorithms that do not fit into any of the above, we also intend to provide examples for common patterns. For example, `Autoround <https://github.com/pytorch/ao/blob/main/torchao/prototype/autoround/README.md>`__ uses `MultiTensor <https://gist.github.com/HDCharles/a1b575bbf8875f994af8a01b225e1227>`__ and module hooks to optimize the module.

 If you are working on a new quantization algorithm/flow and not sure how to implement it in a PyTorch native way, please feel free to open an issue to describe how your algorithm works and we can help advise on the implementation details.

docs/source/eager_tutorials/static_quantization.rst

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ Static quantization refers to using a fixed quantization range for all inputs du

 In static quantization, this fixed quantization range is typically calibrated on similar inputs before quantizing the model. During the calibration phase, we first insert observers into the model to "observe" the distribution of the inputs to be quantized, and use this distribution to decide what scales and zero points to ultimately use when quantizing the model.

-In this tutorial, we walk through an example of how to achieve this in torchao. All code can be found in this `example script <https://github.com/pytorch/ao/tree/main/tutorials/calibration_flow/static_quant.py>`__. Let's start with our toy linear model:
+In this tutorial, we walk through an example of how to achieve this in torchao. Let's start with our toy linear model:

 .. code:: py

@@ -236,4 +236,4 @@ Now, we will see that the linear layers in our model are swapped to our `Quantiz

 >>> m.linear1.qweight # quantized weight tensor with scale and zero_point
 IntxUnpackedToInt8Tensor(...) # actual repr depends on quantization config

-In this tutorial, we walked through a basic example of how to perform integer static quantization in torchao. We also have an example of how to perform the same static quantization in float8. Please see the full `example script <https://github.com/pytorch/ao/tree/main/tutorials/calibration_flow/static_quant.py>`__ for more detail!
+In this tutorial, we walked through a basic example of how to perform integer static quantization in torchao.
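The calibration idea this tutorial describes can be sketched in a few lines: an observer records the running min/max of the calibration inputs, and an affine scale and zero point are derived from that range. This is an illustrative pure-Python sketch only; `MinMaxObserver`, `quantize`, and the int8 range used here are assumptions for the example, not torchao APIs.

```python
# Sketch of observer-based static quantization: calibrate once, then use the
# same scale/zero_point for all future inputs. (Not the torchao API.)

class MinMaxObserver:
    """Tracks the running min/max of every batch it observes."""

    def __init__(self):
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, values):
        self.min_val = min(self.min_val, min(values))
        self.max_val = max(self.max_val, max(values))

    def calculate_qparams(self, qmin=-128, qmax=127):
        # Affine quantization: real_value ≈ scale * (quant_value - zero_point)
        scale = (self.max_val - self.min_val) / (qmax - qmin)
        zero_point = round(qmin - self.min_val / scale)
        return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    # Round to the nearest representable int8 value and clamp to range.
    return [max(qmin, min(qmax, round(v / scale + zero_point))) for v in values]


obs = MinMaxObserver()
for batch in ([0.0, 1.0, 2.5], [-1.0, 3.0]):  # calibration phase
    obs.observe(batch)
scale, zp = obs.calculate_qparams()
q = quantize([3.0, -1.0], scale, zp)          # inference-time quantization
```

After calibration, the observed range [-1.0, 3.0] maps onto the full int8 range, so 3.0 quantizes to 127 and -1.0 to -128.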

torchao/prototype/autoround/README.md

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ This script allows you to apply `Auto-Round` on a given model directly, more con
 `Auto-Round` is a calibration-based quantization algorithm. The flow involves three main steps: 1) insert hooks to the modules you want to quantize, 2) Wrap the calibration data with `MultiTensor` and run the model, 3) Replace the optimized weight with quantized tensors to select the appropriate low-bit kernel.

 > [!NOTE]
-> To learn more about the flow and `MultiTensor`, please refer to [this example](https://github.com/pytorch/ao/blob/main/tutorials/calibration_flow/gptq_like.py).
+> To learn more about the flow and `MultiTensor`, please refer to [MultiTensor](https://gist.github.com/HDCharles/a1b575bbf8875f994af8a01b225e1227).

 #### Step 1: Prepare the Model
 ```python
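The three-step flow the README describes (hooks → run calibration data → swap in quantized weights) can be sketched without torchao at all. Everything below is a hypothetical toy: `Linear`, the hook list, and the plain round-to-nearest step stand in for the real modules, `MultiTensor` wrapping, and the rounding that Auto-Round actually optimizes.

```python
# Toy sketch of a hook-based calibration flow (not Auto-Round or torchao code).

class Linear:
    def __init__(self, weight):
        self.weight = weight     # one float weight per input feature
        self.hooks = []          # step 1: hooks registered on the module

    def forward(self, x):
        for hook in self.hooks:  # hooks observe every calibration input
            hook(self, x)
        return sum(w * xi for w, xi in zip(self.weight, x))


captured = []
layer = Linear([0.5, -1.2, 2.0])
layer.hooks.append(lambda mod, x: captured.append(x))

# Step 2: run the calibration data through the hooked module.
for sample in ([1.0, 0.0, 1.0], [0.0, 2.0, 1.0]):
    layer.forward(sample)

# Step 3: replace the weight with a quantized version. Auto-Round optimizes
# the rounding using the captured activations; plain rounding stands in here.
scale = max(abs(w) for w in layer.weight) / 127
layer.qweight = [round(w / scale) for w in layer.weight]
```

In the real flow, step 3 would also select a low-bit kernel matched to the quantized tensor layout; the sketch only shows where the calibration data feeds into the weight update.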

tutorials/calibration_flow/awq_like.py

Lines changed: 0 additions & 223 deletions
This file was deleted.
