Closed
Description
When running benchmark_inference_latency for bitnet, I got this exception:
    self.bitblas_matmul = self._get_or_create_bitblas_operator(matmul_config, ENABLE_TUNING)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ostix/Bitnet/utils_quant.py", line 81, in _get_or_create_bitblas_operator
    global_operator_cache.save_into_database(BITBLAS_DATABASE_PATH, BITBLAS_TARGET)
  File "/home/ostix/.virtualenvs/optimized-LLM/lib/python3.11/site-packages/bitblas/cache/operator.py", line 55, in save_into_database
    self._save_operator_config_and_artifact(config, op_inst, config_path)
  File "/home/ostix/.virtualenvs/optimized-LLM/lib/python3.11/site-packages/bitblas/cache/operator.py", line 103, in _save_operator_config_and_artifact
    source_file.write(op_inst.get_source())
TypeError: write() argument must be str, not None
When restarting, Python throws the same error but for a different layer, and the log indicates that the previous matmul was loaded from cache (even though it wasn't saved correctly during the previous run).
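As a temporary workaround sketch (assuming BITBLAS_DATABASE_PATH in utils_quant.py points at the on-disk cache directory; I haven't verified this), clearing the cache should at least force the half-saved operator to be rebuilt instead of loaded:

import shutil

# Assumption: BITBLAS_DATABASE_PATH is the same constant passed to
# save_into_database in the traceback and refers to a plain directory of
# cached operator configs/artifacts. Removing it makes the next run
# regenerate the operators instead of loading the entry that was never
# written out correctly.
from utils_quant import BITBLAS_DATABASE_PATH

shutil.rmtree(BITBLAS_DATABASE_PATH, ignore_errors=True)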
After debugging a bit, it seems that the Operator has optimized_func=None, which causes rt_mod to be None and makes get_source() return None.
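For illustration only, the kind of guard that would avoid the crash looks roughly like this; only the source_file.write(op_inst.get_source()) call is confirmed by the traceback, the surrounding structure is assumed:

# Sketch of a None check inside _save_operator_config_and_artifact
# (bitblas/cache/operator.py). op_inst and source_file are the local
# variables visible in the traceback frame; this is not the actual
# library code.
source = op_inst.get_source()
if source is None:
    # optimized_func is None -> rt_mod is None -> get_source() returns None,
    # so skip persisting this operator instead of raising the TypeError.
    print("warning: get_source() returned None, skipping artifact save")
else:
    source_file.write(source)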
Python: 3.11.5
CUDA version: V12.3.107
Bitblas version: 0.0.1.dev3
OS: Linux WSL
What does this error mean, and is it a problem?
Test code executed to debug:
from utils_quant import BitLinear  # BitLinear defined in Bitnet's utils_quant.py

# Creating the layer is enough to hit the BitBLAS operator creation/tuning
# path shown in the traceback above.
my_linear = BitLinear(
    2048, 6000, bias=False,
    weight_bits=1, input_bits=8,
)
Thanks