[Dev] Refactor the range of INT Format to (-max_int_value - 1, max_int_value) (#15)

* rename transparency
* dependabot fix
* update transparency.
* update plugin
* remove redundant transparency
* dsl benchmark scripts
* update submodule.
* remove redundant code.
* remove transparency
* fix propagate map issue
* implement in-register dequantize config
* optimize target
* fix tag.
* fix some issues on ampere game device
* finetune with data distribution.
* fill matmul benchmarking scripts
* refactor use_async_copy to bool value
* support af format
* format fix
* support propagate input transform for dequantization.
* update requirements
* update requirements.txt
* update af4 related tests.
* clean test
* naive support for dynamic zeros
* move to bitdistiller
* implement lop3 with zeros cpp test
* implement fast decoding with zeros
* update zero generation support.
* Bump transformers from 4.29.2 to 4.36.0
  Bumps [transformers](https://github.com/huggingface/transformers) from 4.29.2 to 4.36.0.
  - [Release notes](https://github.com/huggingface/transformers/releases)
  - [Commits](huggingface/transformers@v4.29.2...v4.36.0)
  updated-dependencies:
  - dependency-name: transformers
    dependency-type: direct:production
  Signed-off-by: dependabot[bot] <[email protected]>
* Bump pillow from 9.4.0 to 10.2.0
  Bumps [pillow](https://github.com/python-pillow/Pillow) from 9.4.0 to 10.2.0.
  - [Release notes](https://github.com/python-pillow/Pillow/releases)
  - [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
  - [Commits](python-pillow/Pillow@9.4.0...10.2.0)
  updated-dependencies:
  - dependency-name: pillow
    dependency-type: direct:production
  Signed-off-by: dependabot[bot] <[email protected]>
* Bump tornado from 6.2 to 6.3.3
  Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.2 to 6.3.3.
  - [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
  - [Commits](tornadoweb/tornado@v6.2.0...v6.3.3)
  updated-dependencies:
  - dependency-name: tornado
    dependency-type: direct:production
  Signed-off-by: dependabot[bot] <[email protected]>
* Bump scipy from 1.5.3 to 1.11.1
  Bumps [scipy](https://github.com/scipy/scipy) from 1.5.3 to 1.11.1.
  - [Release notes](https://github.com/scipy/scipy/releases)
  - [Commits](scipy/scipy@v1.5.3...v1.11.1)
  updated-dependencies:
  - dependency-name: scipy
    dependency-type: direct:production
  Signed-off-by: dependabot[bot] <[email protected]>
* Bump jinja2 from 3.1.2 to 3.1.3
  Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
  - [Release notes](https://github.com/pallets/jinja/releases)
  - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
  - [Commits](pallets/jinja@3.1.2...3.1.3)
  updated-dependencies:
  - dependency-name: jinja2
    dependency-type: direct:production
  Signed-off-by: dependabot[bot] <[email protected]>
* Bump pygments from 2.2.0 to 2.15.0
  Bumps [pygments](https://github.com/pygments/pygments) from 2.2.0 to 2.15.0.
  - [Release notes](https://github.com/pygments/pygments/releases)
  - [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
  - [Commits](pygments/pygments@2.2.0...2.15.0)
  updated-dependencies:
  - dependency-name: pygments
    dependency-type: direct:production
  Signed-off-by: dependabot[bot] <[email protected]>
* Bump pygments from 2.13.0 to 2.15.0
  Bumps [pygments](https://github.com/pygments/pygments) from 2.13.0 to 2.15.0.
  - [Release notes](https://github.com/pygments/pygments/releases)
  - [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
  - [Commits](pygments/pygments@2.13.0...2.15.0)
  updated-dependencies:
  - dependency-name: pygments
    dependency-type: direct:production
  Signed-off-by: dependabot[bot] <[email protected]>
* update requirements and matmul.
* support fast decode for int8 related items
* improve pass context
* update benchmark related figures.
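The range change in the title reflects the asymmetry of two's-complement storage: for a b-bit signed INT format, the maximum representable value is 2^(b-1) - 1, while the minimum lies one step further from zero at -max_int_value - 1. A minimal sketch of that relation (illustrative only; the function name is hypothetical, not the repository's code):

```python
def int_format_range(bits: int) -> tuple[int, int]:
    """Two's-complement range for a signed INT format of width `bits`.

    max_int_value = 2**(bits-1) - 1, and the minimum is -max_int_value - 1,
    matching the (-max_int_value - 1, max_int_value) range in the title.
    """
    max_int_value = (1 << (bits - 1)) - 1
    return (-max_int_value - 1, max_int_value)

print(int_format_range(4))  # (-8, 7)
print(int_format_range(8))  # (-128, 127)
```

The asymmetry matters for quantization: clamping symmetrically to ±max_int_value wastes the extra negative code point, while using the full range requires care that negation does not overflow.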
* update benchmark readme
* reorganize readme
* refactor readme
* update benchmark readme
* refactor quant linear for bisect
* update tvm submodule
* fix blockIdx related
* update bitdistiller related.
* update zero type related test
* implement zero types support
* implement zero types support
* fix lop3 permutate issue.
* fix weight executor bug.
* improve typing
* resolve performance related items
* add implementation for dequantization with dynamic symbolic
* fix ladder transform related issues.
* improve ladder permutation for dequantization
* enhance dynamic symbolic for matmul_impl
* improve support for dynamic symbolic
* update tvm dependency
* implement operator cache.
* refactor print to logging
* append setup.py and remove tvm pythonpath dependency.
* update ignore
* improve installation scripts
* update scaling benchmark of 1bit
* int8xint1 lop3 support.
* replace with to_torch_func
* license related fix
* update contributing.md
* autogptq support.
* refactor docs
* refactor
* refactor docs
* typo fix
* implement disk cache
* refactor codegen to get_source
* support get weight shape.
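Several commits above add decoding "with zeros" and distinct zero types/modes. As a rough reference for what weight-only dequantization with a scale and zero point computes (a plain-Python sketch with hypothetical names, not the BitBLAS kernels, which do this in fused GPU code):

```python
def dequantize(q, scale, zero):
    """Reference weight-only dequantization: w = (q - zero) * scale.

    `q` holds unsigned low-bit codes, `zero` shifts them so the stored
    range maps onto a signed interval, and `scale` restores magnitude.
    """
    return [(qi - zero) * scale for qi in q]

# 4-bit codes stored unsigned in [0, 15], with zero point 8:
w = dequantize([0, 8, 15], scale=0.5, zero=8)
print(w)  # [-4.0, 0.0, 3.5]
```

The different "zero modes" in the commits plausibly correspond to where this subtraction happens (on the quantized code versus folded into the rescale), which changes the layout the kernel must propagate.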
* Update dependabot.yml
* Update dependabot.yml
* Update dependabot.yml
* Update dependabot.yml
* Update dependabot.yml
* Update requirements.txt
* Update requirements.txt
* Update requirements.txt
* refactor propagate into transform kind
* Update dependabot.yml
* implement scale and zero layout propagation
* typo fix
* refactor codes
* fix performance issue of dequantize propagate
* refactor print
* fix gemv scale bugs
* refactor ops configs
* improve tensor_adapter
* implement trick wrapper for integration
* code refactor
* SUPPORT.md commit
* spell check
* improve for linting
* overall lint improvements
* Add copyright and license information
* improve contributing
* Fix PYTHONPATH export in installation script and update BitBLAS package
* Update benchmark section in README.md
* Update performance benchmarks and integration details
* Fix typo in README.md
* Refactor index map logging in matmul_analysis.py
* Add .ruff_cache to .gitignore
* Add _tir_u32_to_f4_to_f16 function to quantization module
* Update performance benchmark images
* Update benchmark configurations
* Update benchmark information in README.md
* Refactor code for improved performance and readability
* convolution impl support
* Refactor convolution2d_impl.py and test_auto_normalized_tensorcore.py
* Fix code formatting and remove unnecessary code
* Update TensorCore GEMM Performance Comparison
* Update TensorCore GEMM performance comparison on A100 and RTX4090
* Refactor propagate_inputs method in TensorCorePolicy
* Fix BitBLAS import and remove debug print statements
* Add end-to-end integration with Quantize Inference Kernel for AutoGPTQ and vLLM
* Fix import order and handle exception in benchmark scripts
* Update TVM subproject commit
* Update TileDevice class names in bitblas package
* Update imports in roller module
* Update images
* Update images
* Update end2end_llama_13b_vllm.png
* Update trademark and acknowledgement section
* Update benchmark images for consistent GEMM operations
* Add test case for decoding UInt4 to Float16 with scaling and zeros quantized
* Remove benchmarking code for int4 on a specific target
* Update image files and add new functions for quantization and rasterization
* fix rescale and original lop3.
* Add integration example of FasterTransformers with BitBLAS
* Update integration example of FasterTransformer with BitBLAS
* Update requirements-dev.txt and requirements.txt
* Add LLVM download and extraction functionality
* Update FasterTransformer.gif
* Update BitBLAS version and requirements
* Update BitBLAS import paths and add support for installing and developing TVM
* Add GPU intrinsics module for BitBLAS
* Update requirements-dev.txt and requirements.txt
* Refactor import paths in BitBLAS GPU modules
* Update installation guide in Installation.md
* Refactor MatmulConfig class in matmul.py for improved readability and maintainability
* Refactor MatmulConfig class in matmul.py for improved readability and maintainability
* Refactor MatmulConfig class in matmul.py for improved readability and maintainability
* Update installation guide and QuickStart link in README.md
* Update installation guide and QuickStart link in README.md
* Append Default Schedule Fallback
* Refactor requirements-dev.txt and fix newline issue in arch_base.py
* Fix typo in check_mit_license.sh
* improve the target detection.
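The "fast decoding" and LOP3 commits target packed low-bit weights; on the GPU the unpacking is done with CUDA LOP3 bitwise instructions, but the storage layout itself can be shown as a scalar reference in plain Python (hypothetical helper names; low nibble stored first is an assumption, not a statement about the repository's actual layout):

```python
def pack_int4(vals):
    """Pack pairs of unsigned 4-bit codes into bytes, low nibble first."""
    assert len(vals) % 2 == 0 and all(0 <= v < 16 for v in vals)
    return bytes(vals[i] | (vals[i + 1] << 4) for i in range(0, len(vals), 2))

def unpack_int4(packed):
    """Scalar reference for decoding packed 4-bit codes back out."""
    out = []
    for b in packed:
        out.append(b & 0xF)   # low nibble
        out.append(b >> 4)    # high nibble
    return out

vals = [1, 15, 7, 0]
assert unpack_int4(pack_int4(vals)) == vals
```

A fast GPU decoder replaces this per-element loop with a handful of mask-and-shift (or LOP3 lookup-table) operations over whole 32-bit words, which is why the commits pair decoding with specific weight permutations.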
* Improve target detection and fix typos in code
* Fix auto-inline spacing issue in MatmulTensorizationMMAWithDequantizeInfo class
* Improve target detection and fix typos in code
* transform to submit
* Add support for weight_dtype transformation in MatmulWeightOnlyDequantizeConfig
* Update zeros_type to zeros_mode in code blocks
* update README
* update README
* Fix import errors and update paths in code
* Update variable names in test_bitblas_linear.py and __init__.py
* Update imports and add new function in quantization and cache modules
* Update README with support matrix table
* Update support matrix table and benchmark configurations
* Update support matrix table and benchmark configurations
* Update support matrix table and benchmark configurations
* Update support matrix table and benchmark configurations
* Update support matrix table and benchmark configurations
* Update import statements and add new functions in quantization and cache modules
* Fix default dynamic range for M in MatmulConfig
* Update support matrix table with new tested platforms and Out_dtype column
* Refactor code for mixed-precision matrix multiplication and update support matrix table
* Refactor code for mixed-precision matrix multiplication and update support matrix table
* Update MatmulConfig initialization in QuickStart.md
* Update support matrix table with new tested platforms and INT32/FP16/INT8 support
* Refactor code for mixed-precision matrix multiplication and update support matrix table
* Update link to code implementation in QuickStart.md
* Disable tuning for initial bitblas operator creation
* Update linear transformation description in PythonAPI.md
* Update MatmulConfig in PythonAPI.md
* convert af format to nf
* Enable hardware-aware tuning for bitblas operators
* Refactor code for mixed-precision matrix multiplication and update support matrix table
* Update support matrix table with new tested platforms and INT32/FP16/INT8 support
* Update OperatorConfig.md with matrix multiplication configuration details
* code refactor
* Fix capitalization in QuickStart.md
* update README
* Refactor setup.py to remove unnecessary code and improve readability
* refactor infeatures to infeatures
* update README.md
* Fix incorrect data type mapping in general_matmul.py
* update doc
* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py
* uncomment some cases
* Add BITBLAS_DATABASE_PATH constant to OperatorCache and update load_global_ops_cache function
* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py
* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py
* Update dependencies in requirements-dev.txt and requirements.txt
* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py
* Fix BITBLAS_DATABASE_PATH constant assignment in OperatorCache
* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py
* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py
* update install
* Refactor variable names in setup.py and build_tvm function
* append linear benchmark scripts
* simple bug fix
* Update BitBLAS installation instructions for Ubuntu 20.04
* Refactor variable names and add output print statements for debugging
* Refactor variable names and update dependencies
* Update BitBLAS installation instructions for Ubuntu 20.04 and add note about Linux support
* Refactor logging handler and set log level in BitBLAS module
* Bump version to 0.0.1
* Implement BitNET LOP3 Test
* Refactor variable names and update dependencies
* Refactor variable names and update dependencies in quantization module

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lingxiao Ma <[email protected]>
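Commits such as "implement operator cache", "implement disk cache", and the BITBLAS_DATABASE_PATH changes to OperatorCache all revolve around memoizing tuned operators by their configuration, so an expensive tune-and-compile step runs only on the first request. A minimal in-memory sketch of that idea (hypothetical names and key layout; per the commits above, the real OperatorCache additionally persists kernels under BITBLAS_DATABASE_PATH on disk):

```python
class OperatorCache:
    """Memoize built operators keyed by a hashable config tuple."""

    def __init__(self):
        self._ops = {}

    def get_or_build(self, config, builder):
        # Build (i.e. tune and compile) only on a cache miss.
        if config not in self._ops:
            self._ops[config] = builder(config)
        return self._ops[config]

cache = OperatorCache()
builds = []  # record how many times the builder actually runs

# Hypothetical config key: (op, M, N, K, activation dtype, weight dtype).
matmul_cfg = ("matmul", 1, 1024, 1024, "float16", "int4")

cache.get_or_build(matmul_cfg, lambda c: builds.append(c) or f"kernel{len(builds)}")
op = cache.get_or_build(matmul_cfg, lambda c: builds.append(c) or "rebuilt")
print(op, len(builds))  # kernel1 1  -- second lookup is a cache hit
```

Keying on the full config is what makes "Disable tuning for initial bitblas operator creation" cheap to combine with later hardware-aware tuning: the tuned kernel simply replaces the cache entry for the same key.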