About

The code and data in this repository were used for the paper "Input-Dependent Power Usage in GPUs" published in SC'24 Sustainable Supercomputing workshop.

Setup

Install Miniconda and make a python environment (e.g., conda create -n gemm python).
Activate the conda environment (e.g., conda activate gemm) and install necessary packages (e.g., pip install cmake and pip install numpy).
Install Cuda Toolkit 12.2+ and set environment variable CUDACXX to point to the cuda-x.y/bin/nvcc directory (or update path).
Install Nvidia DCGM and enable the DCGM systemd service.
These experiments stem from the Cutlass examples. Clone the Cutlass repo.
Move or copy matrices.h to the cutlass/include/cutlass/gemm/ directory.
Move or copy basic_gemm_expt.cu and other_expt.cu to the cutlass/examples/00_basic_gemm/ directory.

Add the following to the cutlass/examples/00_basic_gemm/CMakeLists.txt file:

cutlass_example_add_executable(
    00_basic_gemm_expt
    basic_gemm_expt.cu
)

cutlass_example_add_executable(
    other_expt
    other_expt.cu
)

You might need to comment out the swap method in cutlass/include/cutlass/fast_math.h due to overrides.

Build

Add a build folder in the main cutlass/ directory.

mkdir build & cd build

Standard compile for A100. Change the arch code to compile for other platforms, or see Cutlass documentation for more options. The arch code for A100 is 80, H100 is 90a, and RTX 6000 is 75 (this is a useful website on NVIDIA arch codes).

cmake .. -DCUTLASS_NVCC_ARCHS=80

If you want to use tensor cores, compile with:

cmake .. -DCUTLASS_NVCC_ARCHS=80 -DCUTLASS_F16C_ENABLED=ON -DCUTLASS_LIBRARY_KERNELS=tensorop

Navigate to the build and make.

cd examples/00_basic_gemm && make

Run

To change the datatype used in the experiments, update the DTYPE field at the top of the matrices.h file. Note that all RVs are initially generated as floats and then converted to DTYPE with the Cutlass NumericConverter. The default float rounding style is round to nearest.

The experiment script is 00_basic_gemm_expt, located in cutlass/build/examples/00_basic_gemm. It has the following usage:

./00_basic_gemm_expt <M> <N> <K> <alpha> <beta> <iterations> <seed> <pattern> <output path> <pattern parameters>

Example parameters for each pattern can be seen in paper_expts.py. The paper experiments are also in paper_expts.py.

To run experiment 13 (no transpose on B), set the transpose parameter of the B matrix to be false in the TestCutlassGemm function of basic_gemm_expt.cu.

To run experiments with tensor cores enabled, uncomment the using MMAOp and using SmArch statements in the CutlassSgemmNN function of basic_gemm_expt.cu. Add the following parameters to the subsequent using CutlassGemm statement: DTYPE, MMAOp, SmArch.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
a100_expt_results		a100_expt_results
h100_expt_results		h100_expt_results
rtx6000_expt_results		rtx6000_expt_results
v100_expt_results		v100_expt_results
.gitignore		.gitignore
README.md		README.md
basic_gemm_expt.cu		basic_gemm_expt.cu
matrices.h		matrices.h
other_expt.cu		other_expt.cu
paper_expts.py		paper_expts.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Setup

Build

Run

About

Releases

Packages

Languages

theo-gregersen/input-dependent-power-sc24

Folders and files

Latest commit

History

Repository files navigation

About

Setup

Build

Run

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages