
[DEV] Transform Codebase from Azure to GitHub #14

Merged · merged 316 commits from azure/dev into main on Apr 15, 2024

Conversation

@LeiWang1999 (Contributor) commented Apr 15, 2024

Refactor the code and implement BitBLAS.Linear/Matmul.

Provide some docs.
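As an illustration of what `BitBLAS.Linear`/`Matmul` compute, here is a minimal pure-Python sketch of a weight-only-quantized matmul (all names, shapes, and the zero-point here are hypothetical; BitBLAS itself generates fused, tuned GPU kernels rather than running a loop like this):

```python
# Sketch of the math behind a weight-only-quantized Matmul:
# W is stored as low-bit unsigned integers plus per-row scales; at
# compute time it is dequantized to floats and multiplied with the
# activation A. BitBLAS fuses dequantize + GEMM into one GPU kernel.

def dequantize(q_weight, scales, zero=8):
    # q_weight: rows of uint4 values (0..15), one float scale per row
    return [[(q - zero) * s for q in row] for row, s in zip(q_weight, scales)]

def matmul(a, w):
    # a: M x K activations, w: K x N dequantized weights
    return [[sum(a[i][k] * w[k][j] for k in range(len(w)))
             for j in range(len(w[0]))] for i in range(len(a))]

q_w = [[8, 10], [6, 12]]     # 2 x 2 uint4 weights
scales = [0.5, 0.25]         # per-row scales
w = dequantize(q_w, scales)  # [[0.0, 1.0], [-0.5, 1.0]]
out = matmul([[1.0, 2.0]], w)
```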

xysmlx and others added 27 commits April 10, 2024 06:14
@LeiWang1999 LeiWang1999 merged commit eed0ea2 into main Apr 15, 2024
3 checks passed
@LeiWang1999 LeiWang1999 deleted the azure/dev branch April 15, 2024 11:54
LeiWang1999 added a commit that referenced this pull request Apr 24, 2024
* update codeql

* fix uint32 zero issue

* initial transparency.

* enhance transparency.

* rename transparency

* dependabot fix

* update transparency.

* update plugin

* remove redundant transparency

* dsl benchmark scripts

* update submodule.

* remove redundant code.

* remove transparency

* fix propagate map issue

* implement in register dequantize config

* optimize target

* fix tag.

* fix some issues on ampere game device

* finetune with data distribution.

* fill matmul benchmarking scripts

* refactor use_async_copy to bool value

* support af format

* format fix

* support propagate input transform for dequantization.

* update requirements

* update requirements.txt

* update af4 related tests.

* clean test

* naive support for dynamic zeros

* move to bitdistiller

* implement lop3 with zeros cpp test

* implement fast decoding with zeros

* update zero generation support.

* Bump transformers from 4.29.2 to 4.36.0

Bumps [transformers](https://github.com/huggingface/transformers) from 4.29.2 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.29.2...v4.36.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bump pillow from 9.4.0 to 10.2.0

Bumps [pillow](https://github.com/python-pillow/Pillow) from 9.4.0 to 10.2.0.
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
- [Commits](python-pillow/Pillow@9.4.0...10.2.0)

---
updated-dependencies:
- dependency-name: pillow
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bump tornado from 6.2 to 6.3.3

Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.2 to 6.3.3.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](tornadoweb/tornado@v6.2.0...v6.3.3)

---
updated-dependencies:
- dependency-name: tornado
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bump scipy from 1.5.3 to 1.11.1

Bumps [scipy](https://github.com/scipy/scipy) from 1.5.3 to 1.11.1.
- [Release notes](https://github.com/scipy/scipy/releases)
- [Commits](scipy/scipy@v1.5.3...v1.11.1)

---
updated-dependencies:
- dependency-name: scipy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bump jinja2 from 3.1.2 to 3.1.3

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](pallets/jinja@3.1.2...3.1.3)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bump pygments from 2.2.0 to 2.15.0

Bumps [pygments](https://github.com/pygments/pygments) from 2.2.0 to 2.15.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](pygments/pygments@2.2.0...2.15.0)

---
updated-dependencies:
- dependency-name: pygments
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bump pygments from 2.13.0 to 2.15.0

Bumps [pygments](https://github.com/pygments/pygments) from 2.13.0 to 2.15.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](pygments/pygments@2.13.0...2.15.0)

---
updated-dependencies:
- dependency-name: pygments
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* update requirements and matmul.

* support fast decode for int8 related items

* improve pass context

* update benchmark related figures.

* update benchmark readme

* reorganize readme

* refactor readme

* update benchmark readme

* refactor quant linear for bisect

* update tvm submodule

* fix blockIdx related

* update bitdistiller related.

* update zero type related test

* implement zero types support

* implement zero types support

* fix lop3 permute issue.

* fix weight executor bug.

* improve typing

* resolve performance related items

* add implementation for dequantization with dynamic symbolic

* fix ladder transform related issues.

* improve ladder permutation for dequantization

* enhance dynamic symbolic for matmul_impl

* improve support for dynamic symbolic

* update tvm dependency

* implement operator cache.

* refactor print to logging

* append setup.py and remove tvm pythonpath dependency.

* update ignore

* improve installation scripts

* update scaling benchmark of 1bit

* int8xint1 lop3 support.

* replace with to_torch_func

* license related fix

* update contributing.md

* autogptq support.

* refactor docs

* refactor

* refactor docs

* typo fix

* implement disk cache

* refactor codegen to get_source

* support get weight shape.

* Update dependabot.yml

* Update dependabot.yml

* Update dependabot.yml

* Update dependabot.yml

* Update dependabot.yml

* Update requirements.txt

* Update requirements.txt

* Update requirements.txt

* refactor propagate into transform kind

* Update dependabot.yml

* implement scale and zero layout propagation

* typo fix

* refactor codes

* fix performance issue of dequantize propagate

* refactor print

* fix gemv scale bugs

* refactor ops configs

* improve tensor_adapter

* implement trick wrapper for integration

* code refactor

* SUPPORT.md commit

* spell check

* improve for linting

* overall lint improvements

* Add copyright and license information

* improve contributing

* Fix PYTHONPATH export in installation script and update BitBLAS package

* Update benchmark section in README.md

* Update performance benchmarks and integration details

* Fix typo in README.md

* Refactor index map logging in matmul_analysis.py

* Add .ruff_cache to .gitignore

* Add _tir_u32_to_f4_to_f16 function to quantization module

* Update performance benchmark images

* Update benchmark configurations

* Update benchmark information in README.md

* Refactor code for improved performance and readability

* convolution impl support

* Refactor convolution2d_impl.py and test_auto_normalized_tensorcore.py

* Fix code formatting and remove unnecessary code

* Update TensorCore GEMM Performance Comparison

* Update TensorCore GEMM performance comparison on A100 and RTX4090

* Refactor propagate_inputs method in TensorCorePolicy

* Fix BitBLAS import and remove debug print statements

* Add end-to-end integration with Quantize Inference Kernel for AutoGPTQ and vLLM

* Fix import order and handle exception in benchmark scripts

* Update TVM subproject commit

* Update TileDevice class names in bitblas package

* Update imports in roller module

* Update images

* Update images

* Update end2end_llama_13b_vllm.png

* Update trademark and acknowledgement section

* Update benchmark images for consistent GEMM operations

* Add test case for decoding UInt4 to Float16 with scaling and zeros quantized

* Remove benchmarking code for int4 on a specific target

* Update image files and add new functions for quantization and rasterization

* fix rescale and original lop3.

* Add integration example of FasterTransformers with BitBLAS

* Update integration example of FasterTransformer with BitBLAS

* Update requirements-dev.txt and requirements.txt

* Add LLVM download and extraction functionality

* Update FasterTransformer.gif

* Update BitBLAS version and requirements

* Update BitBLAS import paths and add support for installing and developing TVM

* Add GPU intrinsics module for BitBLAS

* Update requirements-dev.txt and requirements.txt

* Refactor import paths in BitBLAS GPU modules

* Update installation guide in Installation.md

* Refactor MatmulConfig class in matmul.py for improved readability and maintainability

* Refactor MatmulConfig class in matmul.py for improved readability and maintainability

* Refactor MatmulConfig class in matmul.py for improved readability and maintainability

* Update installation guide and QuickStart link in README.md

* Update installation guide and QuickStart link in README.md

* Append Default Schedule Fallback

* Refactor requirements-dev.txt and fix newline issue in arch_base.py

* Fix typo in check_mit_license.sh

* improve the target detection.

* Improve target detection and fix typos in code

* Fix auto-inline spacing issue in MatmulTensorizationMMAWithDequantizeInfo class

* Improve target detection and fix typos in code

* transform to submit

* Add support for weight_dtype transformation in MatmulWeightOnlyDequantizeConfig

* Update zeros_type to zeros_mode in code blocks

* update README

* update README

* Fix import errors and update paths in code

* Update variable names in test_bitblas_linear.py and __init__.py

* Update imports and add new function in quantization and cache modules

* Update README with support matrix table

* Update support matrix table and benchmark configurations

* Update support matrix table and benchmark configurations

* Update support matrix table and benchmark configurations

* Update support matrix table and benchmark configurations

* Update support matrix table and benchmark configurations

* Update import statements and add new functions in quantization and cache modules

* Fix default dynamic range for M in MatmulConfig

* Update support matrix table with new tested platforms and Out_dtype column

* Refactor code for mixed-precision matrix multiplication and update support matrix table

* Refactor code for mixed-precision matrix multiplication and update support matrix table

* Update MatmulConfig initialization in QuickStart.md

* Update support matrix table with new tested platforms and INT32/FP16/INT8 support

* Refactor code for mixed-precision matrix multiplication and update support matrix table

* Update link to code implementation in QuickStart.md

* Disable tuning for initial bitblas operator creation

* Update linear transformation description in PythonAPI.md

* Update MatmulConfig in PythonAPI.md

* convert af format to nf

* Enable hardware-aware tuning for bitblas operators

* Refactor code for mixed-precision matrix multiplication and update support matrix table

* Update support matrix table with new tested platforms and INT32/FP16/INT8 support

* Update OperatorConfig.md with matrix multiplication configuration details

* code refactor

* Fix capitalization in QuickStart.md

* update ReadME

* Refactor setup.py to remove unnecessary code and improve readability

* refactor infeatures to in_features

* update README.md

* Fix incorrect data type mapping in general_matmul.py

* update doc

* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py

* uncomments some case

* Add BITBLAS_DATABASE_PATH constant to OperatorCache and update load_global_ops_cache function

* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py

* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py

* Update dependencies in requirements-dev.txt and requirements.txt

* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py

* Fix BITBLAS_DATABASE_PATH constant assignment in OperatorCache

* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py

* Refactor variable names in bitblas_linear.py and bitblas_quant_linear.py

* update install

* Refactor variable names in setup.py and build_tvm function

* append linear benchmark scripts

* simple bug fix

* Update BitBLAS installation instructions for Ubuntu 20.04

* Refactor variable names and add output print statements for debugging

* Refactor variable names and update dependencies

* Update BitBLAS installation instructions for Ubuntu 20.04 and add note about Linux support

* Refactor logging handler and set log level in BitBLAS module

* Bump version to 0.0.1

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lingxiao Ma <[email protected]>
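Several commits above concern decoding packed UInt4 weights to Float16 with scales and zeros ("implement fast decoding with zeros", "decoding UInt4 to Float16 with scaling and zeros quantized"). A minimal pure-Python sketch of that decoding step (the nibble ordering, scale granularity, and function name are hypothetical; the real kernels use LOP3-based bit tricks on the GPU, not per-element Python):

```python
# Decode packed uint4 weights with a scale and zero-point:
# each byte holds two 4-bit values, low nibble first (assumed layout).

def decode_uint4(packed, scale, zero):
    out = []
    for byte in packed:
        for q in (byte & 0xF, byte >> 4):  # low nibble, then high nibble
            out.append((q - zero) * scale)
    return out

# Two bytes -> four dequantized weights.
vals = decode_uint4(bytes([0x98, 0x7A]), scale=0.5, zero=8)
```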