[Dev] Bring Block Reduction into our search space and policy #132

LeiWang1999 · 2024-08-05T13:47:08Z

This pull request includes several changes to the bitblas module, focusing on improving functionality, optimizing performance, and refactoring code. The most important changes include adding a deprecation decorator, modifying tensorcore policies, enhancing the general_matmul module, and updating the ladder_permutate and lop3_permutate modules.

Code Enhancements and New Features:

bitblas/__init__.py: Added a deprecation decorator to mark functions as deprecated and emit warnings when used.
bitblas/module/__init__.py: Introduced the unpack_qweight function for unpacking quantized weights and updated the load_and_transform_weight method to use this function.
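
For readers unfamiliar with the pattern, a deprecation decorator can be sketched as follows. This is a minimal illustration built on the standard `warnings` and `functools` modules, not the implementation added in `bitblas/__init__.py`; the names `deprecated`, `old_api`, and the `reason` argument are hypothetical.

```python
import functools
import warnings


def deprecated(reason):
    """Return a decorator that marks a function as deprecated.

    Hypothetical helper for illustration; the decorator in BitBLAS may
    use a different name and signature.
    """

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated: {reason}",
                category=DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated("use new_api instead")
def old_api(x):
    # The function still works; each call also emits a DeprecationWarning.
    return x + 1
```

Calling `old_api(1)` still returns its normal result. Note that Python's default warning filters hide `DeprecationWarning` outside of `__main__` and test runners, so users may need `-W default` or `warnings.simplefilter("default")` to see it.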

Tensorcore Policy Optimizations:

bitblas/base/roller/policy/tensorcore.py: Refactored _check_small_tile logic, removed commented-out block reduction constraints, and added _expand_with_tags to handle block reduction depth.

General Matmul Module Enhancements:

bitblas/ops/general_matmul/__init__.py: Added weight_compress and updated the _assign_weight_compress method to handle weight compression. Refactored transform_weight to improve weight processing.
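
As background on what compressing and unpacking low-bit weights involves, the sketch below packs unsigned 4-bit values two per byte and unpacks them again. It assumes that "weight compression" here means this kind of bit packing, and it assumes a low-nibble-first layout; the functions `pack_int4` and `unpack_int4` are hypothetical and do not reproduce `weight_compress` or `unpack_qweight` from this PR, whose actual layout may differ.

```python
def pack_int4(values):
    """Pack unsigned 4-bit integers (0..15) two per byte.

    Assumed layout for illustration: the even-indexed value goes in the
    low nibble, the odd-indexed value in the high nibble.
    """
    if len(values) % 2 != 0:
        raise ValueError("need an even number of 4-bit values")
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("values must fit in 4 bits")
    return bytes(
        values[i] | (values[i + 1] << 4) for i in range(0, len(values), 2)
    )


def unpack_int4(packed):
    """Invert pack_int4: split each byte back into two 4-bit values."""
    out = []
    for byte in packed:
        out.append(byte & 0x0F)  # low nibble holds the even-indexed value
        out.append(byte >> 4)    # high nibble holds the odd-indexed value
    return out
```

Packing halves the storage for 4-bit weights, and a round trip through `unpack_int4(pack_int4(w))` returns the original list `w`.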

Ladder Permutate and LOP3 Permutate Module Updates:

bitblas/ops/ladder_permutate/__init__.py: Added forward and retrieve_output_shape methods to improve functionality.
bitblas/ops/lop3_permutate/__init__.py: Updated forward method to handle input and output tensors more efficiently and added retrieve_2d_weight_shape method.

LeiWang1999 added 30 commits July 5, 2024 08:54

Refactor BatchMatMulEmitter and BatchMatMulSelector for improved readability and maintainability

d8884e6

Refactor import statements for improved readability and maintainability

fc84173

Refactor import statements for improved readability and maintainability

02f64de

disable failure email for ci

397eee6

remove email notifications.

20f6ad1

move relax pass from testing to mlc_llm

b93c394

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

ba6a6df

Refactor scripts with se check_eual_ref_scripts_with_emitter function

257693a

Lint Fix

9bb7f49

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

39e7614

Refactor scripts with se check_eual_ref_scripts_with_emitter function

93eb5a5

bug fix in test

aa66a90

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into dev

ae14a53

lint fix.

79b08e4

test cuda i4 kernel

86fd036

Refactor copyright notice in i4matmul.hpp

6b73a21

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into dev

0ba90c1

Refactor BitBLASLinear test module for improved readability and maintainability

086d208

refactor test as version below python 3.9 cannot handle int32 overflow.

47a3abd

format lint for test

024b247

Refactor test_int4b_fp16_convert.py for improved readability and maintainability

bfedeaa

remove unused design file

e672a23

move tile device from package to base

21e5430

dummy impl for codegen

fd11940

Refactor file structure for ladder_permutate module

9ccfa85

Refactor backend class and fix typos in comments

7c7d73e

Deep refactor Lib related code.

47d5fc5

remove ci pull.

53dd0dd

LintFix

d58ac43

refactor builder for whl build

37cb07c

LeiWang1999 added 29 commits August 2, 2024 09:15

fix codeql

e5a4485

chore: Update submodule reference to latest commit

a04282b

chore: Disable common subexpression elimination in TIR passes

314d3e9

Lint Fix

f7d33bb

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into dev

db633ed

4bit related lop3 updates.

201155a

lint fix

2b73662

gptq test fix

1a6a0fd

Fix for test

e84e3ef

lint fix

f0fbb55

lint fix

bf30688

typofix

9a360ba

QuantCompress Test

ee94536

chore: Refactor quant_compress_impl.py for readability and maintainability

930cd76

Enhance docs to update latest works.

8c24776

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into dev

c018e3c

Refactor weight executors in Matmul class for improved readability and maintainability

38f1713

Refactor weight executors in Matmul class for improved readability and maintainability

4a578ce

Refactor weight executors in Matmul class for improved readability and maintainability

4e7126b

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into update_transform

de9fd2e

removed legacy operator

e405aa2

Refactor weight executors in Matmul class for improved readability and maintainability

5709db1

LintFix

2d90e7b

Fix GPTQ Repack with the latest weight transform

c2d2cfa

lint fix

ed6a0a1

bug fix for rescale dequantize

d23ab47

test fix

af16059

typo fix

ac316fd

lint fix

71c1d6e

LeiWang1999 merged commit 2e60d2b into microsoft:main Aug 5, 2024
6 checks passed
