Commit fee43b5

Remove tutorials/calibration_flow/ and update stale references (#4257)
Summary: Remove the calibration_flow tutorial folder (static_quant.py, gptq_like.py, awq_like.py), which depended on LinearActivationQuantizedTensor and other deprecated AQT APIs. Update documentation references that pointed to these deleted files.

Test Plan:
- Verified no remaining references to calibration_flow/, static_quant.py, gptq_like.py, or awq_like.py in the codebase
- Doc links updated to point to remaining valid resources

[ghstack-poisoned]
1 parent d8ad36d commit fee43b5

File tree

6 files changed: +4 −835 lines

docs/source/contributing/quantization_overview.rst

Lines changed: 1 addition & 1 deletion
@@ -141,7 +141,7 @@ We'll skip the instruction for now since we haven't seen many use cases for stat
 Other Quantization Flows
 ########################

-For other quantization flow/algorithms that does not fit into any of the above, we also intend to provide examples for common patterns. For example, `GPTQ like quantization flow <https://github.com/pytorch/ao/blob/e283743b3cc4612bb641b88dca3670231724d396/tutorials/calibration_flow/gptq_like.py>`__ that is adopted by `Autoround <https://github.com/pytorch/ao/blob/e283743b3cc4612bb641b88dca3670231724d396/torchao/prototype/autoround/README.md>`__, it uses `MultiTensor <https://gist.github.com/HDCharles/a1b575bbf8875f994af8a01b225e1227>`__ and module hooks to optimize the module.
+For other quantization flows/algorithms that do not fit into any of the above, we also intend to provide examples for common patterns. For example, `Autoround <https://github.com/pytorch/ao/blob/main/torchao/prototype/autoround/README.md>`__ uses `MultiTensor <https://gist.github.com/HDCharles/a1b575bbf8875f994af8a01b225e1227>`__ and module hooks to optimize the module.

 If you are working on a new quantization algorithm/flow and not sure how to implement it in a PyTorch native way, please feel free to open an issue to describe how your algorithm works and we can help advise on the implementation details.

docs/source/eager_tutorials/static_quantization.rst

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ Static quantization refers to using a fixed quantization range for all inputs du

 In static quantization, this fixed quantization range is typically calibrated on similar inputs before quantizing the model. During the calibration phase, we first insert observers into the model to "observe" the distribution of the inputs to be quantized, and use this distribution to decide what scales and zero points to ultimately use when quantizing the model.

-In this tutorial, we walk through an example of how to achieve this in torchao. All code can be found in this `example script <https://github.com/pytorch/ao/tree/main/tutorials/calibration_flow/static_quant.py>`__. Let's start with our toy linear model:
+In this tutorial, we walk through an example of how to achieve this in torchao. Let's start with our toy linear model:

 .. code:: py

@@ -236,4 +236,4 @@ Now, we will see that the linear layers in our model are swapped to our `Quantiz

 >>> m.linear1.qweight # quantized weight tensor with scale and zero_point
 IntxUnpackedToInt8Tensor(...) # actual repr depends on quantization config

-In this tutorial, we walked through a basic example of how to perform integer static quantization in torchao. We also have an example of how to perform the same static quantization in float8. Please see the full `example script <https://github.com/pytorch/ao/tree/main/tutorials/calibration_flow/static_quant.py>`__ for more detail!
+In this tutorial, we walked through a basic example of how to perform integer static quantization in torchao.
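The calibration idea this tutorial describes can be sketched in a few lines: an observer records the running min/max of the calibration inputs, and an affine scale and zero point are derived from that range. This is an illustrative pure-Python sketch only; `MinMaxObserver`, `quantize`, and the int8 range used here are assumptions for the example, not torchao APIs.

```python
# Sketch of observer-based static quantization: calibrate once, then use the
# same scale/zero_point for all future inputs. (Not the torchao API.)

class MinMaxObserver:
    """Tracks the running min/max of every batch it observes."""

    def __init__(self):
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, values):
        self.min_val = min(self.min_val, min(values))
        self.max_val = max(self.max_val, max(values))

    def calculate_qparams(self, qmin=-128, qmax=127):
        # Affine quantization: real_value ≈ scale * (quant_value - zero_point)
        scale = (self.max_val - self.min_val) / (qmax - qmin)
        zero_point = round(qmin - self.min_val / scale)
        return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    # Round to the nearest representable int8 value and clamp to range.
    return [max(qmin, min(qmax, round(v / scale + zero_point))) for v in values]


obs = MinMaxObserver()
for batch in ([0.0, 1.0, 2.5], [-1.0, 3.0]):  # calibration phase
    obs.observe(batch)
scale, zp = obs.calculate_qparams()
q = quantize([3.0, -1.0], scale, zp)          # inference-time quantization
```

After calibration, the observed range [-1.0, 3.0] maps onto the full int8 range, so 3.0 quantizes to 127 and -1.0 to -128.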

torchao/prototype/autoround/README.md

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ This script allows you to apply `Auto-Round` on a given model directly, more con
 `Auto-Round` is a calibration-based quantization algorithm. The flow involves three main steps: 1) insert hooks to the modules you want to quantize, 2) Wrap the calibration data with `MultiTensor` and run the model, 3) Replace the optimized weight with quantized tensors to select the appropriate low-bit kernel.

 > [!NOTE]
-> To learn more about the flow and `MultiTensor`, please refer to [this example](https://github.com/pytorch/ao/blob/main/tutorials/calibration_flow/gptq_like.py).
+> To learn more about the flow and `MultiTensor`, please refer to [MultiTensor](https://gist.github.com/HDCharles/a1b575bbf8875f994af8a01b225e1227).

 #### Step 1: Prepare the Model
 ```python
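The three-step flow the README describes (hooks → run calibration data → swap in quantized weights) can be sketched without torchao at all. Everything below is a hypothetical toy: `Linear`, the hook list, and the plain round-to-nearest step stand in for the real modules, `MultiTensor` wrapping, and the rounding that Auto-Round actually optimizes.

```python
# Toy sketch of a hook-based calibration flow (not Auto-Round or torchao code).

class Linear:
    def __init__(self, weight):
        self.weight = weight     # one float weight per input feature
        self.hooks = []          # step 1: hooks registered on the module

    def forward(self, x):
        for hook in self.hooks:  # hooks observe every calibration input
            hook(self, x)
        return sum(w * xi for w, xi in zip(self.weight, x))


captured = []
layer = Linear([0.5, -1.2, 2.0])
layer.hooks.append(lambda mod, x: captured.append(x))

# Step 2: run the calibration data through the hooked module.
for sample in ([1.0, 0.0, 1.0], [0.0, 2.0, 1.0]):
    layer.forward(sample)

# Step 3: replace the weight with a quantized version. Auto-Round optimizes
# the rounding using the captured activations; plain rounding stands in here.
scale = max(abs(w) for w in layer.weight) / 127
layer.qweight = [round(w / scale) for w in layer.weight]
```

In the real flow, step 3 would also select a low-bit kernel matched to the quantized tensor layout; the sketch only shows where the calibration data feeds into the weight update.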

tutorials/calibration_flow/awq_like.py

Lines changed: 0 additions & 223 deletions
This file was deleted.
