When doing "Post-Training Quantization using MCT", there are many parameters that can be adjusted.
I want to know how I can find the most suitable values for them.
Expected behaviour
No response
Code to reproduce the issue
```python
import math
from dataclasses import dataclass
from enum import Enum


class QuantizationErrorMethod(Enum):
    """
    Method for quantization threshold selection:

    NOCLIPPING - Use min/max values as thresholds.
    MSE - Use mean square error for minimizing quantization noise.
    MAE - Use mean absolute error for minimizing quantization noise.
    KL - Use KL-divergence to make signal distributions as similar as possible.
    Lp - Use Lp-norm to minimize quantization noise.
    HMSE - Use Hessian-based mean squared error for minimizing quantization
        noise. This method uses Hessian scores to give more weight to valuable
        parameters when computing the error induced by quantization.
    """
    NOCLIPPING = 0
    MSE = 1
    MAE = 2
    KL = 4
    LP = 5
    HMSE = 6


@dataclass
class QuantizationConfig:
    """
    A class that encapsulates all the different parameters used by the library
    to quantize a model.

    Examples:
        You can create a quantization configuration to apply to a model. For
        example, to quantize a model's weights and activations using thresholds,
        with weight threshold selection based on MSE and activation threshold
        selection using NOCLIPPING (min/max), while enabling
        relu_bound_to_power_of_2 and weights_bias_correction, you can
        instantiate a quantization configuration like this:

        >>> import model_compression_toolkit as mct
        >>> qc = mct.core.QuantizationConfig(
        ...     activation_error_method=mct.core.QuantizationErrorMethod.NOCLIPPING,
        ...     weights_error_method=mct.core.QuantizationErrorMethod.MSE,
        ...     relu_bound_to_power_of_2=True,
        ...     weights_bias_correction=True)

        The QuantizationConfig instance can then be used in the quantization
        workflow, such as with Keras in the function:
        :func:`~model_compression_toolkit.ptq.keras_post_training_quantization`.
    """
    activation_error_method: QuantizationErrorMethod = QuantizationErrorMethod.MSE
    weights_error_method: QuantizationErrorMethod = QuantizationErrorMethod.MSE
    relu_bound_to_power_of_2: bool = False
    weights_bias_correction: bool = True
    weights_second_moment_correction: bool = False
    input_scaling: bool = False
    softmax_shift: bool = False
    shift_negative_activation_correction: bool = True
    activation_channel_equalization: bool = False
    z_threshold: float = math.inf
    min_threshold: float = MIN_THRESHOLD
    l_p_value: int = 2
    linear_collapsing: bool = True
    residual_collapsing: bool = True
    shift_negative_ratio: float = 0.05
    shift_negative_threshold_recalculation: bool = False
    shift_negative_params_search: bool = False
    concat_threshold_update: bool = False
    activation_bias_correction: bool = False
    activation_bias_correction_threshold: float = 0.0


# Default quantization configuration the library uses.
DEFAULTCONFIG = QuantizationConfig(
    QuantizationErrorMethod.MSE, QuantizationErrorMethod.MSE,
    relu_bound_to_power_of_2=False, weights_bias_correction=True,
    weights_second_moment_correction=False, input_scaling=False,
    softmax_shift=False)
```
Log output
No response
Thank you for the question and for using MCT.
Defining the QuantizationConfig depends on what you're trying to achieve and the model that you are trying to compress.
The simplest approach is to use the default config, which is applied automatically when running PTQ without providing a specific config. This should give you a decent result.
In addition, if you want to enable or disable some of the available features that might be valuable for compressing your own model, you can define and pass your own configuration.
You can find more information about the different features that are available via the quantization config in our documentation.
Feel free to ask about any of the options for more details!
You can also find examples and deep dives into some of the features in our tutorials (see 1, 2). Note that these are written in Keras, but the demonstrated features are available for PyTorch as well.
Issue Type
Performance
Source
source
MCT Version
2.2.2
OS Platform and Distribution
No response
Python version
No response