gemm-pybind-learning

This repo demonstrates how to use pybind11 to provide a Python interface for highly optimized CUDA code. It contains only basic functionality. The present work uses modern CMake/CUDA and follows Yujia Zhai's GEMV implementation approach. The CMakeLists.txt comes from pkestene.

Build and Install

This project requires CMake >= 3.18. It can be built with the commands below:

git clone --recurse-submodules https://github.com/dongdongban/gemm-pybind-learning
cd gemm-pybind-learning
cmake -S . -B build -DCMAKE_CUDA_ARCHITECTURES="75" && cd build
cmake --build .

The value "75" selects the device architecture sm_75. Replace it with the compute capability of your own GPU (for example, "86" for compute capability 8.6).
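If you are unsure which value to pass, the following is a minimal sketch for deriving it. It assumes a reasonably recent NVIDIA driver whose `nvidia-smi` supports the `compute_cap` query field. The helper names `arch_from_capability` and `detect_arch` are hypothetical and are not part of this repo.

```python
import subprocess


def arch_from_capability(cap: str) -> str:
    """Turn a compute capability such as '7.5' into the CMake value '75'."""
    major, minor = cap.strip().split(".")
    return f"{int(major)}{int(minor)}"


def detect_arch(default: str = "75") -> str:
    """Ask nvidia-smi for the first GPU's capability; fall back to `default`."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
        return arch_from_capability(out.splitlines()[0])
    except (OSError, subprocess.CalledProcessError, IndexError, ValueError):
        # nvidia-smi missing, too old, or returned something unexpected
        return default


if __name__ == "__main__":
    print(f'-DCMAKE_CUDA_ARCHITECTURES="{detect_arch()}"')
```

The printed flag can be pasted into the `cmake -S . -B build` command. If detection fails, the sketch silently falls back to "75", so check the result against your GPU's documented compute capability.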

Verification

If nothing went wrong, check your module with these commands:

cd Optimizing-SGEMV-on-NVIDIA-GPUs
python -c 'import mygemm; mygemm.host(4096, 4096, 1); mygemm.host(4096, 4096, 2)'
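The one-liner above stops with a bare ImportError if Python cannot find the compiled module. A small sketch of a friendlier check follows. The helper name `can_import` is hypothetical, the hint it prints is an assumption about the most likely cause, and only the module name `mygemm` comes from the command above.

```python
import importlib
import sys


def can_import(name: str) -> bool:
    """Return True if `name` imports; otherwise print a hint and return False."""
    try:
        importlib.import_module(name)
        return True
    except ImportError as exc:
        print(f"cannot import {name}: {exc}", file=sys.stderr)
        # Assumed cause: the built extension is not on Python's search path.
        print("run from the directory containing the built module, "
              "or add that directory to PYTHONPATH", file=sys.stderr)
        return False


if __name__ == "__main__":
    sys.exit(0 if can_import("mygemm") else 1)
```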
