
Dynamic Quantization for GPT2 model from huggingface. #1401

Open
@mriganktiwari

Description


Hi,

Environment (needed to reproduce): PyTorch 1.4.0

I am trying to use torch.quantization.quantize_dynamic to quantize the pre-trained DistilGPT2 model from Hugging Face.

Most of the transformer blocks in this model are built from Conv1D modules (Hugging Face's own Conv1D class, which acts like a linear layer with transposed weights), so a problem occurs during quantization.

As I understand it, this happens because torch.quantization.quantize_dynamic does not define a default qconfig for these Conv1D layers (see the snippet below), so they are all left un-quantized:

    if qconfig_spec is None:
        if dtype == torch.qint8:
            qconfig_spec = {
                nn.Linear : default_dynamic_qconfig,
                nn.LSTM : default_dynamic_qconfig
            }

Please suggest a solution.
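One possible workaround (a sketch, not an official API) is to convert each Conv1D layer into an equivalent nn.Linear before calling quantize_dynamic, since dynamic quantization does cover nn.Linear. The Conv1D class below is a minimal stand-in for Hugging Face's Conv1D, included only so the example is self-contained; the conversion helper conv1d_to_linear is hypothetical, not part of either library:

```python
import torch
import torch.nn as nn

class Conv1D(nn.Module):
    """Minimal stand-in for Hugging Face's Conv1D: weight is stored
    as (in_features, out_features), i.e. transposed vs. nn.Linear."""
    def __init__(self, nf, nx):
        super().__init__()
        self.nf = nf
        self.weight = nn.Parameter(torch.randn(nx, nf) * 0.02)
        self.bias = nn.Parameter(torch.zeros(nf))

    def forward(self, x):
        size_out = x.size()[:-1] + (self.nf,)
        x = torch.addmm(self.bias, x.view(-1, x.size(-1)), self.weight)
        return x.view(size_out)

def conv1d_to_linear(module):
    """Recursively replace Conv1D children with equivalent nn.Linear
    layers so quantize_dynamic (which covers nn.Linear / nn.LSTM by
    default) can quantize them."""
    for name, child in module.named_children():
        if isinstance(child, Conv1D):
            nx, nf = child.weight.shape
            linear = nn.Linear(nx, nf)
            # nn.Linear stores weight as (out, in); Conv1D as (in, out).
            linear.weight = nn.Parameter(child.weight.t().contiguous())
            linear.bias = nn.Parameter(child.bias.clone())
            setattr(module, name, linear)
        else:
            conv1d_to_linear(child)

model = nn.Sequential(Conv1D(8, 4))
conv1d_to_linear(model)
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
```

After the conversion, qmodel's first submodule is a dynamically quantized Linear. For the real DistilGPT2 you would run the same recursion over the loaded transformers model; whether accuracy holds up after int8 quantization still needs to be checked empirically.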

cc @jerryzh168 @jianyuh
