[Experimental][TorchFX] quantize_pt2e + X86Quantizer introduction #3121
base: develop
Conversation
@daniil-lyakhov, I would suggest creating an example of how to use nncf.torch.quantize_pt2e with a torch.ao.Quantizer. Please open a ticket for this task if one has not been opened yet.
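For reference, a minimal sketch of what such an example might look like; the import path and exact signature of quantize_pt2e below are assumptions based on this PR, not a settled API:

```python
import torch

import nncf
from nncf.experimental.torch.fx import quantize_pt2e  # assumed import location
from torch.ao.quantization.quantizer.x86_inductor_quantizer import (
    X86InductorQuantizer,
    get_default_x86_inductor_quantization_config,
)

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
example_input = torch.randn(1, 3, 32, 32)

# Any torch.ao Quantizer should be pluggable; X86InductorQuantizer is one example.
quantizer = X86InductorQuantizer()
quantizer.set_global(get_default_x86_inductor_quantization_config())

# Capture the model as a torch.fx.GraphModule, then run NNCF PTQ driven by
# the annotations produced by the given quantizer.
captured_model = torch.export.export(model, (example_input,)).module()
quantized_model = quantize_pt2e(captured_model, quantizer, nncf.Dataset([example_input]))
```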
```python
self._quantizer = quantizer

def get_quantization_setup(self, model: torch.fx.GraphModule, nncf_graph: NNCFGraph) -> SingleConfigQuantizerSetup:
    annotated_model = deepcopy(model)
```
Could you briefly explain why you are deep-copying the model here?
That's because the .annotate and .validate methods alter the model's metadata. I believe the get_quantization_setup method should not change the input model in any way.
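An illustrative sketch of the pattern being discussed (hypothetical method body, not the PR's exact code):

```python
from copy import deepcopy

def get_quantization_setup(self, model, nncf_graph):
    # annotate()/validate() write into node metadata in place, so operate on
    # a copy and leave the caller's model untouched.
    annotated_model = deepcopy(model)
    self._quantizer.annotate(annotated_model)
    self._quantizer.validate(annotated_model)
    # ... build the SingleConfigQuantizerSetup from annotated_model ...
```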
```
@@ -0,0 +1,10 @@
# Copyright (c) 2024 Intel Corporation
```
I would suggest moving nncf/experimental/quantization/algorithms/quantizer -> nncf/experimental/quantization/quantizer.
Done. Additionally, I renamed base_quantizer.py -> quantizer.py.
```python
    per_channel = False
else:
    raise nncf.InternalError(f"Unknown qscheme: {qspec.qscheme}")
signed = qspec.dtype is torch.uint8
```
Please double-check this condition. I believe signed=True corresponds to torch.int8.
This line is incorrect, you are right! But this parameter does not affect the quantization parameters, as signedness_to_force is ignored in PTQ MinMax here: https://github.com/openvinotoolkit/nncf/blob/develop/nncf/quantization/algorithms/min_max/torch_fx_backend.py#L230
Fixed
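A minimal sketch of the corrected check (the _is_signed helper is hypothetical, for illustration only):

```python
import torch

def _is_signed(dtype: torch.dtype) -> bool:  # hypothetical helper
    # torch.int8 is the signed 8-bit dtype; torch.uint8 is unsigned.
    return dtype is torch.int8

assert _is_signed(torch.int8)
assert not _is_signed(torch.uint8)
```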
```python
qconfig = QuantizerConfig(mode=mode, signedness_to_force=signed, per_channel=per_channel)
qps = []
# If the input node is a constant and is not at the activations port (0)
if from_n.op == "get_attr" and to_n.args.index(from_n) != 0:
```
As far as I understand, WeightQuantizationInsertionPoint differs from ActivationQuantizationInsertionPoint in how statistics are collected, am I right? Then why did you add the activation port check?
I have refactored it; it now works in the common way. The problem fixed here is that for some models (swin_v2_s, for example) some activations can be constants as well.
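An illustrative, self-contained snippet showing how such constant inputs appear in torch.fx (the model and names are made up for demonstration):

```python
# In torch.fx a constant input appears as a "get_attr" node, and its position
# in the consumer's args distinguishes the activation port (0) from
# weight-like ports.
import torch

class MatMul(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.randn(4, 4))

    def forward(self, x):
        return torch.matmul(x, self.w)  # the constant feeds port 1, not port 0

gm = torch.fx.symbolic_trace(MatMul())
for node in gm.graph.nodes:
    if node.op == "get_attr":
        for user in node.users:
            print(f"{node.name} -> {user.name} at port {user.args.index(node)}")
```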
```python
    q_setup.add_independent_quantization_point(qp)

elif isinstance(qspec, SharedQuantizationSpec):
    pass
```
Will support for SharedQuantizationSpec be added in a follow-up PR? If so, please add a TODO here.
SharedQuantizationSpec is not produced by the X86InductorQuantizer, but it could be produced by a different quantizer. A warning and a TODO string were added.
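A hedged sketch of the warning-and-TODO handling described above (_handle_qspec is a hypothetical name; the adapter's actual code may differ):

```python
from torch.ao.quantization.quantizer import QuantizationSpec, SharedQuantizationSpec

def _handle_qspec(qspec, node):  # hypothetical helper, for illustration
    if isinstance(qspec, QuantizationSpec):
        ...  # translate into an NNCF quantization point
    elif isinstance(qspec, SharedQuantizationSpec):
        # TODO: support SharedQuantizationSpec; X86InductorQuantizer does not
        # emit it, but other torch.ao quantizers may.
        print(f"SharedQuantizationSpec is not supported yet; skipping {node}")
```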
TModel = TypeVar("TModel") | ||
|
||
|
||
class Quantizer: |
Please add a docstring for the class.
Done
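A hedged sketch of what the requested docstring could look like (wording assumed, not taken from the PR):

```python
from abc import ABC, abstractmethod
from typing import TypeVar

TModel = TypeVar("TModel")


class Quantizer(ABC):
    """
    Abstract interface for quantizer adapters: given a model, produces the
    quantization setup that NNCF post-training quantization should apply.
    """

    @abstractmethod
    def get_quantization_setup(self, model: TModel, nncf_graph):
        """Returns a SingleConfigQuantizerSetup for the given model."""
```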
```python
EdgeOrNode = Union[Tuple[torch.fx.Node, torch.fx.Node]]


class TorchAOQuantizerAdapter(NNCFQuantizer):
```
Please add a docstring for the class.
Done
```python
    weights_range_estimator_params: Optional[RangeEstimatorParameters] = None,
):
    """
    :param subset_size: Size of a subset to calculate activations statistics used
```
The docstring for the quantizer parameter is missing.
The gap is filled, thanks
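A hedged sketch of the filled-in docstring entry (the host class and wording are assumed for illustration):

```python
class PostTrainingQuantization:  # hypothetical host class for the sketch
    def __init__(self, quantizer, subset_size: int = 300):
        """
        :param quantizer: Quantizer instance that defines where and how the
            model should be quantized, e.g. an adapted torch.ao quantizer.
        :param subset_size: Size of a subset to calculate activations
            statistics used for quantization.
        """
        self._quantizer = quantizer
        self._subset_size = subset_size
```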
Ticket: #3185

Changes

Introduction of the quantize_pt2e method.

Reason for changes

Related tickets

#2766

Tests

Graph tests: tests/torch/fx/test_quantizer.py