[Dev] Append Efficient CUDA test for low precision batch decoding #80

Merged: 21 commits, Jul 8, 2024

LeiWang1999
Contributor

This pull request makes three main changes: it adds a notice for IST-DASLab/marlin to the THIRDPARTYNOTICES.txt file, refactors the MatMulNTDequantizeEmitter class in the matmul_dequantize_impl.py file, and adds new test directories to the CMakeLists.txt file under the testing/cpp directory.

Addition of third party notices:

Refactoring of existing code:

Addition of new test directories:

@LeiWang1999 LeiWang1999 changed the title [Dev] Append Efficient CUDA implementation for low precision batch decoding [Dev] Append Efficient CUDA test for low precision batch decoding Jul 7, 2024
@LeiWang1999 LeiWang1999 merged commit c6f3ca5 into microsoft:main Jul 8, 2024
6 checks passed