
[Experimental][TorchFX] quantize_pt2e + X86Quantizer introduction #3121

Open · wants to merge 11 commits into develop from dl/fx/experimental_quantization
Conversation


@daniil-lyakhov daniil-lyakhov commented Nov 28, 2024

Changes

Introduction of quantize_pt2e method

Reason for changes

Related tickets

#2766

Tests

graph tests: tests/torch/fx/test_quantizer.py

@github-actions github-actions bot added NNCF PT Pull requests that updates NNCF PyTorch experimental NNCF PTQ Pull requests that updates NNCF PTQ labels Nov 28, 2024
@daniil-lyakhov daniil-lyakhov changed the title Dl/fx/experimental quantization [Experimental][TorchFX] quantize_pt2e + X86Quantizer introduction Nov 28, 2024
@daniil-lyakhov daniil-lyakhov force-pushed the dl/fx/experimental_quantization branch 2 times, most recently from efd3367 to d1941f3 Compare November 28, 2024 17:32
@github-actions github-actions bot added the NNCF Common Pull request that updates NNCF Common label Dec 2, 2024
@daniil-lyakhov daniil-lyakhov force-pushed the dl/fx/experimental_quantization branch 2 times, most recently from aea0bdf to 52e80c8 Compare December 4, 2024 09:59
@daniil-lyakhov daniil-lyakhov marked this pull request as ready for review December 4, 2024 12:17
@daniil-lyakhov daniil-lyakhov requested a review from a team as a code owner December 4, 2024 12:17
@daniil-lyakhov daniil-lyakhov force-pushed the dl/fx/experimental_quantization branch 2 times, most recently from 9178921 to 43bc251 Compare December 5, 2024 12:32
@daniil-lyakhov daniil-lyakhov force-pushed the dl/fx/experimental_quantization branch 3 times, most recently from 7ede33d to 20147ab Compare December 23, 2024 22:15
@daniil-lyakhov daniil-lyakhov force-pushed the dl/fx/experimental_quantization branch from f62e0b8 to 38c82a4 Compare December 29, 2024 18:27
@daniil-lyakhov daniil-lyakhov force-pushed the dl/fx/experimental_quantization branch from 38c82a4 to 206f606 Compare January 7, 2025 16:53
@alexsu52 (Contributor) left a comment:

@daniil-lyakhov, I would suggest creating an example of how to use nncf.torch.quantize_pt2e with a torch.ao.Quantizer. Please open a ticket for this task if one has not been opened yet.

self._quantizer = quantizer

def get_quantization_setup(self, model: torch.fx.GraphModule, nncf_graph: NNCFGraph) -> SingleConfigQuantizerSetup:
    annotated_model = deepcopy(model)
Contributor:

Could you briefly explain why you are deep-copying the model here?

@daniil-lyakhov (Collaborator, Author) replied Jan 8, 2025:

That's because the .annotate and .validate methods alter the model meta. I believe the get_quantization_setup method should not change the input model in any way.
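The non-mutating contract described in this reply can be illustrated with a framework-free sketch; `FakeModel` and `annotate` here are hypothetical stand-ins for a torch.fx.GraphModule and a torch.ao quantizer's annotate method.

```python
from copy import deepcopy


class FakeModel:
    """Hypothetical stand-in for a torch.fx.GraphModule with node metadata."""

    def __init__(self):
        self.meta = {}


def annotate(model):
    # Like a torch.ao Quantizer's annotate: mutates the model meta in place.
    model.meta["quantization_annotation"] = "annotated"
    return model


def get_quantization_setup(model):
    # Work on a deep copy so the caller's model is never modified.
    annotated_model = deepcopy(model)
    annotate(annotated_model)
    return annotated_model.meta


original = FakeModel()
setup = get_quantization_setup(original)
assert "quantization_annotation" in setup  # setup built from annotations
assert original.meta == {}                 # input model left untouched
```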

@@ -0,0 +1,10 @@
# Copyright (c) 2024 Intel Corporation
Contributor:

I would suggest moving nncf/experimental/quantization/algorithms/quantizer -> nncf/experimental/quantization/quantizer

Collaborator (Author):

Done. Additionally, I renamed base_quantizer.py -> quantizer.py

per_channel = False
else:
raise nncf.InternalError(f"Unknown qscheme: {qspec.qscheme}")
signed = qspec.dtype is torch.uint8
Contributor:

Please double-check this condition. I believe signed should be True for torch.int8.

Collaborator (Author):

This line is incorrect, you are right! However, this parameter does not affect the quantization parameters, as signedness_to_force is ignored in the PTQ MinMax here: https://github.com/openvinotoolkit/nncf/blob/develop/nncf/quantization/algorithms/min_max/torch_fx_backend.py#L230

Collaborator (Author):

Fixed
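The agreed fix can be sketched without torch; dtype names stand in for torch.int8/torch.uint8, so this is an illustration of the corrected condition, not the PR's actual code.

```python
def is_signed(dtype_name: str) -> bool:
    # Signed quantization corresponds to int8, not uint8.
    # The buggy line was `signed = qspec.dtype is torch.uint8`;
    # the fix is   `signed = qspec.dtype is torch.int8`.
    return dtype_name == "int8"


assert is_signed("int8") is True
assert is_signed("uint8") is False
```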

qconfig = QuantizerConfig(mode=mode, signedness_to_force=signed, per_channel=per_channel)
qps = []
# If input node is a constant and placed not at activations port (0)
if from_n.op == "get_attr" and to_n.args.index(from_n) != 0:
Contributor:

As far as I understand, WeightQuantizationInsertionPoint differs from ActivationQuantizationInsertionPoint in the method of collecting statistics, am I right? If so, why did you add the activation port check?

Collaborator (Author):

I have refactored it; now it works in the common way. The problem fixed here is that in some models (swin_v2_s, for example) some activations can be constants as well.
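The edge classification being discussed can be sketched in a framework-free way; `FakeNode` is a stand-in for torch.fx.Node, simplified from the quoted PR code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeNode:
    """Minimal stand-in for torch.fx.Node: a name, an op kind, input args."""
    name: str
    op: str
    args: List["FakeNode"] = field(default_factory=list)


def is_weight_edge(from_n: FakeNode, to_n: FakeNode) -> bool:
    # A constant ("get_attr") feeding any port other than the activation
    # port (0) is treated as a weight; a constant arriving on port 0
    # (as happens in e.g. swin_v2_s) is still an activation.
    return from_n.op == "get_attr" and to_n.args.index(from_n) != 0


weight = FakeNode("w", "get_attr")
act = FakeNode("x", "placeholder")
conv = FakeNode("conv", "call_function", args=[act, weight])

assert is_weight_edge(weight, conv) is True   # constant on port 1 -> weight
assert is_weight_edge(act, conv) is False     # activation on port 0
```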

q_setup.add_independent_quantization_point(qp)

elif isinstance(qspec, SharedQuantizationSpec):
pass
Contributor:

Will support for SharedQuantizationSpec be added in a follow-up PR? If so, please add a TODO here.

@daniil-lyakhov (Collaborator, Author) replied Jan 8, 2025:

SharedQuantizationSpec is not produced by the X86InductorQuantizer, but it could be produced by a different quantizer. A warning and a TODO were added.
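The warn-and-skip behavior described here could look like the following sketch; the two spec classes are local stand-ins for torch.ao's QuantizationSpec and SharedQuantizationSpec, so this only illustrates the dispatch shape, not the PR's code.

```python
import warnings


class QuantizationSpec:          # stand-in for torch.ao QuantizationSpec
    pass


class SharedQuantizationSpec:    # stand-in for torch.ao SharedQuantizationSpec
    pass


def handle_qspec(qspec):
    if isinstance(qspec, QuantizationSpec):
        return "quantization_point"
    if isinstance(qspec, SharedQuantizationSpec):
        # TODO: support SharedQuantizationSpec
        warnings.warn("SharedQuantizationSpec is not supported; skipping.")
        return None
    raise RuntimeError(f"Unknown qspec type: {type(qspec)}")


assert handle_qspec(QuantizationSpec()) == "quantization_point"
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    assert handle_qspec(SharedQuantizationSpec()) is None
assert len(caught) == 1
```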

TModel = TypeVar("TModel")


class Quantizer:
Contributor:

Please add a docstring for the class.

Collaborator (Author):

Done

EdgeOrNode = Union[Tuple[torch.fx.Node, torch.fx.Node]]


class TorchAOQuantizerAdapter(NNCFQuantizer):
Contributor:

Please add a docstring for the class.

Collaborator (Author):

Done

weights_range_estimator_params: Optional[RangeEstimatorParameters] = None,
):
"""
:param subset_size: Size of a subset to calculate activations statistics used
Contributor:

The docstring for the quantizer parameter is missing.

Collaborator (Author):

The gap is filled, thanks

@daniil-lyakhov daniil-lyakhov force-pushed the dl/fx/experimental_quantization branch from 206f606 to 983f5fa Compare January 8, 2025 14:19
@daniil-lyakhov daniil-lyakhov force-pushed the dl/fx/experimental_quantization branch from 983f5fa to e9bc7a8 Compare January 8, 2025 14:23
@daniil-lyakhov (Collaborator, Author):

> @daniil-lyakhov, I would suggest creating an example of how to use nncf.torch.quantize_pt2e with a torch.ao.Quantizer. Please open a ticket for this task if one has not been opened yet.

ticket: #3185
