[Dev] Append Efficient CUDA test for low precision batch decoding #80

LeiWang1999 · 2024-07-07T16:50:52Z

This pull request adds a third-party notice for IST-DASLab/marlin to THIRDPARTYNOTICES.txt, refactors the MatMulNTDequantizeEmitter class in matmul_dequantize_impl.py, and adds a new test directory to the CMakeLists.txt file under testing/cpp.

Addition of third party notices:

THIRDPARTYNOTICES.txt: Added a new notice for IST-DASLab/marlin.

Refactoring of existing code:

bitblas/ops/impl/matmul_dequantize_impl.py: Refactored the MatMulNTDequantizeEmitter class by removing unnecessary methods and simplifying the code. Also modified decode_func in multiple places to simplify the addition of bias.

Addition of new test directories:

testing/cpp/CMakeLists.txt: Added a new subdirectory efficient_i4_cuda_impl for testing.
testing/cpp/efficient_i4_cuda_impl/CMakeLists.txt: Created a new CMakeLists.txt file under the efficient_i4_cuda_impl directory and defined functions to add CUDA and CPP test executables.
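
As a rough illustration of what an int4 decoding test checks, the following is a minimal Python sketch, not code from this pull request. The helper names (`pack_int4`, `unpack_int4`, `dequantize`), the layout of eight signed 4-bit values per 32-bit word with the lowest index in the lowest bits, and the single `scale` factor are all assumptions made for illustration; the kernels under efficient_i4_cuda_impl may use a different layout.

```python
# Illustrative reference only: the names and the packing layout are assumptions,
# not the layout used by the CUDA kernels in this pull request.

def pack_int4(values):
    """Pack 8 signed 4-bit integers (-8..7) into one unsigned 32-bit word."""
    if len(values) != 8:
        raise ValueError("expected exactly 8 values")
    word = 0
    for i, v in enumerate(values):
        if not -8 <= v <= 7:
            raise ValueError("value out of int4 range")
        # Keep the two's-complement nibble; index 0 goes into the lowest 4 bits.
        word |= (v & 0xF) << (4 * i)
    return word


def unpack_int4(word):
    """Recover the 8 signed 4-bit integers from a packed 32-bit word."""
    out = []
    for i in range(8):
        nibble = (word >> (4 * i)) & 0xF
        # Nibbles 8..15 represent the negative values -8..-1.
        out.append(nibble - 16 if nibble >= 8 else nibble)
    return out


def dequantize(word, scale):
    """Decode a packed word to floats using one (assumed) per-word scale."""
    return [scale * v for v in unpack_int4(word)]


values = [-8, -1, 0, 1, 2, 3, 7, -4]
assert unpack_int4(pack_int4(values)) == values
```

A kernel test would typically compare the device output against a reference of this kind for randomly generated packed inputs.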

LeiWang1999 added 18 commits July 5, 2024 08:54

Refactor BatchMatMulEmitter and BatchMatMulSelector for improved readability and maintainability

d8884e6

Refactor import statements for improved readability and maintainability

fc84173

Refactor import statements for improved readability and maintainability

02f64de

disable failure email for ci

397eee6

remove email notifications.

20f6ad1

move relax pass from testing to mlc_llm

b93c394

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

ba6a6df

Refactor scripts with se check_eual_ref_scripts_with_emitter function

257693a

Lint Fix

9bb7f49

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into main

39e7614

Refactor scripts with se check_eual_ref_scripts_with_emitter function

93eb5a5

bug fix in test

aa66a90

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into dev

ae14a53

lint fix.

79b08e4

test cuda i4 kernel

86fd036

Refactor copyright notice in i4matmul.hpp

6b73a21

Merge branch 'main' of https://github.com/Microsoft/BitBLAS into dev

0ba90c1

Refactor BitBLASLinear test module for improved readability and maintainability

086d208

LeiWang1999 changed the title ~~[Dev] Append Efficient CUDA implementation for low precision batch decoding~~ [Dev] Append Efficient CUDA test for low precision batch decoding Jul 7, 2024

LeiWang1999 mentioned this pull request Jul 7, 2024

[Discussion] Refactor the dispatcher design for incoming different backends #81

Open

LeiWang1999 added 3 commits July 8, 2024 04:47

refactor test as version below python 3.9 cannot handle int32 overflow.

47a3abd

format lint for test

024b247

Refactor test_int4b_fp16_convert.py for improved readability and maintainability

bfedeaa
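
The first commit in this group mentions an int32 overflow but does not say how it was handled. One portable way to keep a packed 32-bit word inside the signed int32 range, sketched here with a hypothetical helper `to_int32` that is not taken from the test file, is to wrap the value explicitly using plain Python integer arithmetic:

```python
# Hypothetical helper for illustration; not taken from test_int4b_fp16_convert.py.

def to_int32(value):
    """Reinterpret the low 32 bits of an integer as a signed int32 value."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    # Values with the top bit set represent negative numbers in two's complement.
    return value - (1 << 32) if value >= (1 << 31) else value


# A packed word whose top nibble is 8 or more exceeds the int32 maximum
# as an unsigned value, so it wraps to a negative number.
packed = 0xF0000001
wrapped = to_int32(packed)
assert wrapped == packed - (1 << 32)
```

Because this relies only on built-in integers, it behaves the same across Python versions and does not depend on how a particular array library converts out-of-range values.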

LeiWang1999 merged commit c6f3ca5 into microsoft:main Jul 8, 2024
6 checks passed
