CXLBench is a comprehensive benchmarking suite designed to evaluate and analyze the performance of Compute Express Link (CXL) technology. This repository provides a collection of tools, benchmarks, and utilities to assess various aspects of CXL implementations, including latency, bandwidth, and overall system performance.
This repository contains several standard benchmarks that have been adapted to run on both DRAM and CXL memory. Each benchmark is standalone; see the README.md in its directory for instructions.
The primary goals of CXLBench are:
- To provide a standardized set of benchmarks for CXL technology
- To enable researchers, developers, and industry professionals to evaluate CXL performance across different hardware configurations
- To facilitate the comparison of various CXL implementations and their impact on system performance
- To support the ongoing development and optimization of CXL technology
cxlbench/
├── benchmarks/ // All benchmarks
├── lib/ // Helper libraries for the benchmark scripts
├── tools/ // Helper tools for the benchmark suite scripts
├── CONTRIBUTING.md // How to contribute to this project
├── LICENSE // License file
└── README.md // This file
This table shows the list of benchmarks included in this suite:
Benchmark | Description |
---|---|
cloudsuite3/graph-analytics | The Graph Analytics benchmark relies on the Spark framework to perform graph analytics on large-scale datasets |
cloudsuite3/inmem-analytics | This benchmark uses Apache Spark and runs a collaborative filtering algorithm (alternating least squares, ALS) provided by Spark MLlib in memory on a dataset of user-movie ratings. The metric of interest is the time in seconds for computing movie recommendations. |
GPU/NVidia/nvbandwidth | A tool for bandwidth measurements on NVIDIA GPUs |
GPU/NVidia/cuda_examples | Evaluates the data transfer rates for NVIDIA GPUs |
IntelMLC | Runs the Intel Memory Latency Checker (MLC) |
memcached | Memcached is a general-purpose distributed memory-caching system |
Qdrant-Synth | Creates synthetic vectors and benchmarks a Qdrant Vector Database running in a Docker Container |
redis | Redis is a source-available, in-memory data store used as a distributed key–value database, cache, and message broker |
stream | The STREAM benchmark is a simple synthetic benchmark program that measures sustainable memory bandwidth (in MB/s) and the corresponding computation rate for simple vector kernels. |
tpcc | TPC-C (Transaction Processing Performance Council Benchmark C) is a benchmark used to compare the performance of online transaction processing (OLTP) systems. |
To clone the CXLBench repository along with its submodules, use the following command:
git clone --recursive https://github.com/cxlbench/cxlbench.git
This will clone the main repository and initialize all submodules, including the CUDA examples and NVIDIA Bandwidth Test tool.
To update CXLBench, use the following command:
git pull
To update the submodules to their latest commits, run:
git submodule update --remote
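If the repository was cloned without `--recursive`, or a submodule directory looks empty after an update, it can help to check which submodules were never initialized. The sketch below relies on the fact that `git submodule status` prefixes uninitialized submodules with `-`. The function name `uninitialized_submodules` is hypothetical and not part of CXLBench, and the sketch assumes submodule paths contain no spaces.

```shell
#!/bin/sh
# uninitialized_submodules: hypothetical helper, not part of CXLBench.
# Reads `git submodule status` output on stdin and prints the path of
# every submodule whose status line starts with "-" (not initialized).
# Assumption: submodule paths contain no whitespace.
uninitialized_submodules() {
    awk 'substr($0, 1, 1) == "-" { print $2 }'
}

# Example usage from the repository root:
#   git submodule status --recursive | uninitialized_submodules
# To initialize anything that is reported:
#   git submodule update --init --recursive
```

An empty result means every submodule has been initialized.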
- Clone the repository as described above
- Choose a benchmark to run from the benchmarks/ directory
- Follow the specific instructions for each benchmark in its respective directory
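Because some benchmarks are nested (for example, under cloudsuite3/), one quick way to see what is available is to list every directory that ships its own instructions. This is a minimal sketch: the function name `list_benchmarks` is hypothetical and not part of CXLBench, and it assumes each benchmark's instructions live in a file named exactly README.md.

```shell
#!/bin/sh
# list_benchmarks: hypothetical helper, not part of CXLBench.
# Prints, in sorted order, every directory under the given root
# (default: benchmarks) that contains a file named README.md.
# Assumption: each benchmark documents itself in a README.md.
list_benchmarks() {
    root="${1:-benchmarks}"
    find "$root" -type f -name README.md | while IFS= read -r readme; do
        dirname "$readme"
    done | sort
}

# Example usage from the repository root:
#   list_benchmarks benchmarks
```

If the benchmarks/ directory itself contains a README.md, it appears in the output as well.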
We welcome contributions to CXLBench! Please read our CONTRIBUTING.md file for guidelines on how to submit issues, feature requests, and pull requests.
CXLBench is released under the GPL-3.0 License. This covers our contributions to the project, such as the scripts. Most benchmarks use a third-party utility, and each such utility is covered by its author's license.
We want to thank the contributors of the following projects, which are included as submodules in this repository:
- CUDA Examples by drkennetz
- NVIDIA Bandwidth Test by NVIDIA
For questions, suggestions, or support, please open an issue in this repository.