Hi!
Thanks for this great project! I am trying to use it in a project where the model weights are compressed to 2 bits, similar to the AutoGPTQ approach.
However, when I tried to create a uint2 matmul kernel, as below:
```python
import bitblas
import torch
import os

os.environ["NUMEXPR_MAX_THREADS"] = "32"

M = 1
N = 1024
K = 1024
GROUP_SIZE = 128

matmul_config = bitblas.MatmulConfig(
    M=M,  # M dimension
    N=N,  # N dimension
    K=K,  # K dimension
    A_dtype="float16",  # activation A dtype
    W_dtype="uint2",  # weight W dtype
    accum_dtype="float16",  # accumulation dtype
    out_dtype="float16",  # output dtype
    layout="nt",  # matrix layout, "nt": A is non-transposed, W is transposed
    with_bias=False,  # bias
    # configs for weight-only quantization
    group_size=GROUP_SIZE,  # setting for grouped quantization
    with_scaling=True,  # setting for scaling factor
    with_zeros=True,  # setting for zero points
    zeros_mode="quantized",  # setting for how zeros are calculated
)
matmul = bitblas.Matmul(config=matmul_config)
```
I got the following error:
```
Traceback (most recent call last):
  File "/pub/scratch/xiayao/projects/fmsys/triteia/tests/bitblas_example.py", line 28, in <module>
    matmul = bitblas.Matmul(config=matmul_config)
  File "/pub/scratch/xiayao/mamba/envs/triteia/lib/python3.10/site-packages/bitblas/ops/general_matmul.py", line 291, in __init__
    self.hardware_aware_finetune()
  File "/pub/scratch/xiayao/mamba/envs/triteia/lib/python3.10/site-packages/bitblas/ops/operator.py", line 201, in hardware_aware_finetune
    self.optimized_func = self.apply_fast_tuning(
  File "/pub/scratch/xiayao/mamba/envs/triteia/lib/python3.10/site-packages/bitblas/ops/operator.py", line 173, in apply_fast_tuning
    self.pass_context = best.config.pass_context
AttributeError: 'NoneType' object has no attribute 'config'
```
Is this a known issue, or is this configuration just not supported yet? Any pointers on how to resolve it would be appreciated!
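In case it helps with triage, below is a minimal sketch of the check I plan to run to see whether the failure is specific to `zeros_mode="quantized"`. I'm assuming `"original"` and `"rescale"` are the other accepted `zeros_mode` values; everything else matches the config above.

```python
import bitblas

# Sketch only: try each zeros_mode variant with the same uint2 config to see
# whether only the "quantized" path hits the NoneType error during tuning.
# "original" and "rescale" are assumed to be the other accepted values.
for zeros_mode in ["original", "rescale", "quantized"]:
    config = bitblas.MatmulConfig(
        M=1,
        N=1024,
        K=1024,
        A_dtype="float16",
        W_dtype="uint2",
        accum_dtype="float16",
        out_dtype="float16",
        layout="nt",
        with_bias=False,
        group_size=128,
        with_scaling=True,
        with_zeros=True,
        zeros_mode=zeros_mode,
    )
    try:
        bitblas.Matmul(config=config)
        print(f"zeros_mode={zeros_mode!r}: kernel built OK")
    except Exception as e:  # AttributeError in the trace above
        print(f"zeros_mode={zeros_mode!r}: failed with {type(e).__name__}: {e}")
```

If the other modes build fine, the problem is presumably in the quantized-zeros code path rather than in the uint2 dtype itself.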
Thanks again!