Add mixed granularity tests for smooth quant #3847
cyxlily wants to merge 9 commits into pytorch:main
Conversation
Signed-off-by: Cui, Lily <lily.cui@intel.com>
Signed-off-by: Cui, Lily <lily.cui@intel.com>
Pull request overview
Adds additional SmoothQuant test coverage to validate behavior when activation and weight quantization granularities are configured independently (mixed granularity), split out from #3524.
Changes:
- Expand the `test_smoothquant_accuracy` parameterization to include mixed activation/weight granularities for the int8 dynamic and int8 static configs.
- Add explicit per-row/per-tensor variants for the int8 dynamic configs alongside the mixed-granularity cases.
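As a rough, stdlib-only sketch of how the mixed-granularity cases can be enumerated for a parameterized test (the `PerRow`/`PerTensor` names mirror torchao's granularity classes, but the strings here are placeholders, not the real objects):

```python
from itertools import product

# Placeholder names standing in for torchao's PerRow()/PerTensor() granularities.
GRANULARITIES = ["PerRow", "PerTensor"]

# All (activation, weight) granularity combinations, including the
# mixed ones (e.g. per-row activations with per-tensor weights).
cases = list(product(GRANULARITIES, repeat=2))
print(cases)
```

Covering the full product rather than only the matching pairs is what exercises the mixed-granularity code paths this PR targets.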
```python
Int8StaticActivationInt8WeightConfig(granularity=PerRow()),
Int8DynamicActivationInt8WeightConfig(version=2, granularity=PerTensor()),
Int8DynamicActivationInt8WeightConfig(
    version=2, granularity=[PerRow(), PerTensor()]
),
```
Actually, why is this a list? I saw the type was changed here:
ao/torchao/quantization/quant_api.py
Line 856 in 9058b58
I feel we should not accept both tuple and list. Can you open a follow-up PR to change this to `Union[Granularity, Tuple[Granularity, Granularity]]` to reduce the complexity here?
Thanks for the comments; I updated the unit tests.
Regarding the follow-up PR: the JSON encoder explicitly raises `NotImplementedError` for tuples: https://github.com/pytorch/ao/blob/main/torchao/core/config.py#L146
So even if users can pass a `Tuple`, it still needs to be stored internally as a list for serialization.
Does it still make sense to change the API from list to tuple?
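To illustrate the constraint being discussed, here is a minimal stdlib-only sketch; `encode_config` is hypothetical and only mimics the tuple-rejecting behavior of the encoder in `torchao/core/config.py`, it is not torchao's actual API:

```python
import json

def encode_config(value):
    """Hypothetical encoder mirroring torchao's config serializer,
    which rejects tuples outright: granularity pairs must be lists."""
    if isinstance(value, tuple):
        raise NotImplementedError("tuples are not serializable; use a list")
    return json.dumps(value)

print(encode_config(["PerRow", "PerTensor"]))   # lists serialize fine
try:
    encode_config(("PerRow", "PerTensor"))      # tuples are rejected
except NotImplementedError as exc:
    print("rejected:", exc)
```

This is why the config stores the pair as a list even if the public API were to accept a tuple.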
Oh I see. In that case we should remove tuple and just use `List`; I've run into that JSON error before as well.
Signed-off-by: Cui, Lily <lily.cui@intel.com>
```python
Int8StaticActivationInt8WeightConfig(granularity=PerRow()),
Int8DynamicActivationInt8WeightConfig(),
Int8DynamicActivationInt8WeightConfig(granularity=PerTensor()),
Int8DynamicActivationInt8WeightConfig(granularity=(PerRow(), PerTensor())),
```
Let's just use list and drop tuple in the next PR.
OK, created the PR to remove tuple: #4347
Split from #3524
Test different activation and weight granularities for smooth quantization.