Pytorch bias quantisation on aarch64 - how best to proceed? #2015
Hi! Following on from issue #1864, I have been doing some further investigation, and it looks like it's relatively straightforward to add quantized bias by making some super-hacky changes to

pytorch/third_party/ideep/mkl-dnn/src/cpu/gemm_x8s8s32x_convolution_utils.cpp

Does that sound sensible? Or is there a better way to proceed?
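The actual patch is not preserved in this thread. As a purely illustrative sketch, this is the general shape such a change could take in the int8 GEMM convolution post-processing: if the bias arrives as int32 in the quantized domain, dequantize it with scale_src * scale_wei before adding it to the rescaled accumulator. All names below are hypothetical, not the real ideep/oneDNN code:

```cpp
// Hypothetical illustration only; not the actual ideep/oneDNN source.
#include <cstddef>
#include <cstdint>

// Dequantize-and-add step for one output element of an int8 GEMM convolution.
// acc is the raw int32 accumulator; bias may be f32 (the supported path) or
// s32 (the "quantized bias" case this issue asks about).
float finish_output(int32_t acc, float scale_src, float scale_wei,
                    const void *bias, bool bias_is_s32, std::size_t oc) {
    const float acc_scale = scale_src * scale_wei;
    // Bring the int32 accumulator back to the f32 domain.
    float dst = static_cast<float>(acc) * acc_scale;
    if (bias_is_s32) {
        // Quantized bias, assumed stored as round(bias_f32 / acc_scale),
        // so it is rescaled with the same factor as the accumulator.
        dst += static_cast<float>(
                       static_cast<const int32_t *>(bias)[oc]) * acc_scale;
    } else {
        dst += static_cast<const float *>(bias)[oc];  // ordinary f32 bias
    }
    return dst;
}
```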
Comments

May we know why you need this implementation? Any background? To use a user-defined bias data type in oneDNN, change the bias memory descriptor:

// auto conv_bias_md = memory::desc({conv_bias_tz}, dt::f32, tag::any);

DNNL_VERBOSE output should be:
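To make that suggestion concrete, here is a minimal sketch, assuming the oneDNN C++ API and taking conv_bias_tz (the bias dimensions) from the snippet above, of switching the bias descriptor between the conventional f32 and a quantized s32 bias:

```cpp
#include "oneapi/dnnl/dnnl.hpp"
using namespace dnnl;
using dt = memory::data_type;
using tag = memory::format_tag;

// Sketch: the bias data type is chosen through the bias memory descriptor.
// Whether an s32 bias is actually accepted by a given convolution
// implementation depends on the oneDNN version and build.
memory::desc make_conv_bias_md(const memory::dims &conv_bias_tz,
                               bool quantized_bias) {
    return memory::desc({conv_bias_tz},
                        quantized_bias ? dt::s32 : dt::f32, tag::any);
}
```

Creating the primitive with DNNL_VERBOSE=1 set in the environment then shows which implementation and data types were actually dispatched.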
@christinaburge Though your observation is correct and one could add such support the way you suggested, it would not really help, for two reasons:

To keep the solution clean, bias should be passed non-quantized. We haven't restricted that for some reason, but we might consider doing so in the next major version to avoid such situations. Hope this helps.
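For contrast, here is a minimal sketch of the non-quantized-bias scheme described above, assuming the oneDNN v3.x C++ API (set_scales_mask): int8 source and weights, f32 bias, and the quantization scales attached as primitive attributes. The function and parameter names are mine, not from the thread:

```cpp
#include "oneapi/dnnl/dnnl.hpp"
using namespace dnnl;

// Build an int8 convolution primitive descriptor with a non-quantized
// (f32) bias; quantization is expressed via scale attributes instead.
convolution_forward::primitive_desc make_int8_conv_pd(
        const engine &eng,
        const memory::desc &src_md,   // u8/s8 activations
        const memory::desc &wei_md,   // s8 weights
        const memory::desc &dst_md,   // e.g. u8 or f32 destination
        const memory::dims &bias_dims,
        const memory::dims &strides,
        const memory::dims &padding) {
    // Bias stays f32: it is applied after the int32 accumulator is rescaled.
    auto bias_md = memory::desc(bias_dims, memory::data_type::f32,
                                memory::format_tag::x);

    primitive_attr attr;
    attr.set_scales_mask(DNNL_ARG_SRC, 0);      // per-tensor source scale
    attr.set_scales_mask(DNNL_ARG_WEIGHTS, 0);  // per-tensor weight scale

    return convolution_forward::primitive_desc(eng,
            prop_kind::forward_inference, algorithm::convolution_direct,
            src_md, wei_md, bias_md, dst_md, strides, padding, padding, attr);
}
```

At execution time the actual scale values are then supplied as f32 memories under DNNL_ARG_ATTR_SCALES | DNNL_ARG_SRC and DNNL_ARG_ATTR_SCALES | DNNL_ARG_WEIGHTS.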
You can find a detailed explanation of the current quantization scheme in oneDNN in the "Quantization and scaling" RFC.
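For reference, that scheme comes down to a simple identity (my paraphrase, not a quote from the RFC): if src_f32 ≈ scale_src · src_s8 and wei_f32 ≈ scale_wei · wei_s8, then

dst_f32 ≈ (scale_src · scale_wei) · conv_s32(src_s8, wei_s8) + bias_f32,

so the bias is most naturally applied in f32 after the int32 accumulator has been rescaled; a pre-quantized bias would have to be stored as bias_f32 / (scale_src · scale_wei), tying it to the scales chosen for the source and weights.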