gemm-pybind-learning

Simplified Chinese README

This repo demonstrates how to use pybind11 to provide a Python interface for highly optimized CUDA code. It contains only basic functionality. The present work uses modern CMake/CUDA and Yujia Zhai's GEMV implementation approach. The CMakeLists comes from pkestene.

Build and Install

This project requires CMake >= 3.18. It can be built with the commands below:

git clone --recurse-submodules https://github.com/dongdongban/gemm-pybind-learning
cd gemm-pybind-learning
cmake -S . -B build -DCMAKE_CUDA_ARCHITECTURES="75" && cd build
cmake --build .

The device architecture "75" (that is, sm_75) should be replaced by the compute capability of your own GPU.
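
If you are unsure which value to pass, the following is a minimal sketch of one way to derive it. It assumes a reasonably recent NVIDIA driver whose nvidia-smi supports the compute_cap query field; older drivers may not, in which case the helper returns None and you would look up the compute capability manually. The function names arch_from_compute_cap and detect_arch are hypothetical and not part of this repo.

```python
import shutil
import subprocess
from typing import Optional


def arch_from_compute_cap(compute_cap: str) -> str:
    """Convert a compute capability such as "7.5" into the "75" form
    expected by -DCMAKE_CUDA_ARCHITECTURES."""
    major, minor = compute_cap.strip().split(".")
    return f"{int(major)}{int(minor)}"


def detect_arch() -> Optional[str]:
    """Ask nvidia-smi for the first GPU's compute capability.

    Assumption: the installed driver supports the compute_cap query field.
    Returns None if nvidia-smi is missing or the query fails.
    """
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    tokens = result.stdout.split()
    if result.returncode != 0 or not tokens:
        return None
    try:
        return arch_from_compute_cap(tokens[0])
    except ValueError:
        # The output was not in "major.minor" form.
        return None


if __name__ == "__main__":
    arch = detect_arch()
    print(arch if arch else "could not detect the architecture automatically")
```

The printed value can then be substituted for "75" in the cmake command above.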

Verification

If nothing went wrong, check your module with these commands:

cd Optimizing-SGEMV-on-NVIDIA-GPUs
python -c 'import mygemm; mygemm.host(4096, 4096, 1); mygemm.host(4096, 4096, 2)'
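
The same check can be written as a small script that reports a failed import instead of raising a traceback. This is only a sketch: the helper name check_module is hypothetical, and the arguments to mygemm.host are copied verbatim from the command above because their meaning is not documented here.

```python
import importlib


def check_module(name: str = "mygemm") -> bool:
    """Import the built extension and run the two calls from the README.

    Returns False if the module cannot be imported, for example when the
    script is not run from the directory that contains the built module.
    """
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        print(f"could not import {name}: {exc}")
        return False
    # "mode" is a hypothetical name for the third argument; the values
    # 1 and 2 are the ones used in the README command.
    for mode in (1, 2):
        module.host(4096, 4096, mode)
    return True
```

Calling check_module() from the Optimizing-SGEMV-on-NVIDIA-GPUs directory inside build should behave like the one-line command, assuming the module was built there.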

