kube-perftest

kube-perftest is a framework for building and running performance benchmarks for Kubernetes clusters.
kube-perftest requires Volcano to be installed on the cluster, which can be done with:

kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/master/installer/volcano-development.yaml
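Once the manifest has been applied, it is worth checking that the Volcano components are running before installing the operator (assuming Volcano's default volcano-system namespace):

kubectl get pods -n volcano-system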
The kube-perftest operator can be installed using Helm:
helm repo add kube-perftest https://stackhpc.github.io/kube-perftest
# Use the most recent published chart for the main branch
helm upgrade \
  kube-perftest-operator \
  kube-perftest/kube-perftest-operator \
  -i \
  --version ">=0.1.0-dev.0.main.0,<0.1.0-dev.0.main.99999999999"
For most use cases, no customisations to the Helm values will be necessary.
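Should overrides be needed, they can be supplied in the usual Helm way, e.g. via a values file (values-overrides.yaml here is a hypothetical file containing your overrides):

helm upgrade \
  kube-perftest-operator \
  kube-perftest/kube-perftest-operator \
  -i \
  --values values-overrides.yaml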
All the benchmarks are capable of running using the Kubernetes pod network or the host network (using hostNetwork: true on the benchmark pods).
Benchmarks are also able to run on accelerated networks where available, using Multus to attach additional CNI-managed interfaces and device plugins to request the associated network resources.
This allows benchmarks to leverage technologies such as SR-IOV (via the SR-IOV network device plugin), macvlan (via the macvlan CNI plugin) and RDMA (e.g. via the RDMA shared device plugin).
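For illustration, a Multus network of the kind referenced below could be defined with a NetworkAttachmentDefinition. This is a minimal sketch, assuming a macvlan attachment on a host NIC eth1 with whereabouts IPAM; it is not a configuration shipped by kube-perftest:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: netname
  namespace: namespace
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth1",
      "mode": "bridge",
      "ipam": {
        "type": "whereabouts",
        "range": "10.0.0.0/24"
      }
    }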
The networking is configured using the following properties of the benchmark spec:
spec:
  # Indicates whether to use host networking or not
  # If true, networkName is not used
  hostNetwork: false
  # The name of a Multus network to use
  # Only used if hostNetwork is false
  # If not given, the Kubernetes pod network is used
  networkName: namespace/netname
  # The resources for benchmark pods
  resources:
    limits:
      # E.g. requesting a share of an RDMA device
      rdma/hca_shared_devices_a: 1
  # The MTU to set on the interface *inside* the container
  # If not given, the default MTU is used
  mtu: 9000
The kube-perftest operator provides a BenchmarkSet resource that can be used to run the same benchmark over a sweep of parameters:
apiVersion: perftest.stackhpc.com/v1alpha1
kind: BenchmarkSet
metadata:
  name: iperf
spec:
  # The template for the fixed parts of the benchmark
  template:
    apiVersion: perftest.stackhpc.com/v1alpha1
    kind: IPerf
    spec:
      duration: 30
  # The number of repetitions to run for each permutation
  # Defaults to 1 if not given
  repetitions: 1
  # Defines the permutations for the set
  # Each permutation is merged into the spec of the template
  # If not given, a single permutation consisting of the given template is used
  permutations:
    # Permutations are calculated as a cross-product of the specified names and values
    product:
      hostNetwork: [true, false]
      streams: [1, 2, 4, 8, 16, 32, 64]
    # A list of explicit permutations to include
    explicit:
      - hostNetwork: true
        streams: 128
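For the example above, the product block yields 2 x 7 = 14 permutations and the explicit block adds one more, so the set runs 15 IPerf benchmarks in total, each repeated the number of times given by repetitions.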
Currently, the following benchmarks are supported:
iperf

Runs the iperf network performance tool to measure bandwidth for a transfer between two pods.
apiVersion: perftest.stackhpc.com/v1alpha1
kind: IPerf
metadata:
  name: iperf
spec:
  # The number of parallel streams to use
  streams: 8
  # The duration of the test
  duration: 30
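Benchmarks are ordinary custom resources, so the usual kubectl workflow applies (iperf.yaml here is a hypothetical file holding the manifest above):

kubectl apply -f iperf.yaml
kubectl get -f iperf.yaml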
MPI PingPong

Runs the Intel MPI Benchmarks (IMB) MPI1 PingPong benchmark to measure the average round-trip time and bandwidth for MPI messages of different sizes between two pods.
Uses Open MPI initialised over SSH. The data plane can use TCP or, hardware and network permitting, RDMA via UCX.
apiVersion: perftest.stackhpc.com/v1alpha1
kind: MPIPingPong
metadata:
  name: mpi-pingpong
spec:
  # The MPI transport to use - one of TCP, RDMA
  transport: TCP
  # Controls the maximum message length
  # Selected lengths, in bytes, will be 0, 1, 2, 4, 8, 16, ..., 2^maxlog
  # Defaults to 22 if not given, meaning the maximum message size will be 4MB
  maxlog: 22
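Combining this with the networking properties described earlier, a PingPong over an RDMA-capable network might look like the following sketch (the network and resource names are the illustrative ones from above):

apiVersion: perftest.stackhpc.com/v1alpha1
kind: MPIPingPong
metadata:
  name: mpi-pingpong-rdma
spec:
  # Use RDMA for the data plane
  transport: RDMA
  # Attach the pods to an RDMA-capable Multus network
  networkName: namespace/netname
  resources:
    limits:
      rdma/hca_shared_devices_a: 1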
OpenFOAM

OpenFOAM is a toolbox for solving problems in computational fluid dynamics (CFD). It is included here as an example of a "real world" workload.
This benchmark runs the 3-D Lid Driven cavity flow benchmark from the OpenFOAM benchmark suite.
Uses Open MPI initialised over SSH. The data plane can use TCP or, hardware and network permitting, RDMA via UCX.
apiVersion: perftest.stackhpc.com/v1alpha1
kind: OpenFOAM
metadata:
  name: openfoam
spec:
  # The MPI transport to use - one of TCP, RDMA
  transport: TCP
  # The problem size to use - one of S, M, XL, XXL
  problemSize: S
  # The number of MPI processes to use
  numProcs: 16
  # The number of MPI pods to launch
  numNodes: 8
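With the values above, the 16 MPI processes are spread over the 8 launched pods - presumably two processes per pod, assuming an even distribution.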
RDMA Bandwidth

Runs the RDMA bandwidth benchmarks (i.e. ib_{read,write}_bw) from the perftest collection.
This benchmark requires an RDMA-capable network to be specified.
apiVersion: perftest.stackhpc.com/v1alpha1
kind: RDMABandwidth
metadata:
  name: rdma-bandwidth
spec:
  # The mode for the test - read or write
  mode: read
  # The number of iterations to do at each message size
  # Defaults to 1000 if not given
  iterations: 1000
  # The number of queue pairs to use
  # Defaults to 1 if not given
  # A higher number of queue pairs can help to spread traffic,
  # e.g. over NICs in a bond when using RoCE-LAG
  qps: 1
  # Extra arguments to be added to the command
  extraArgs:
    - --tclass=96
RDMA Latency

Runs the RDMA latency benchmarks (i.e. ib_{read,write}_lat) from the perftest collection.
This benchmark requires an RDMA-capable network to be specified.
apiVersion: perftest.stackhpc.com/v1alpha1
kind: RDMALatency
metadata:
  name: rdma-latency
spec:
  # The mode for the test - read or write
  mode: read
  # The number of iterations to do at each message size
  # Defaults to 1000 if not given
  iterations: 1000
  # Extra arguments to be added to the command
  extraArgs:
    - --tclass=96
fio

Runs filesystem performance benchmarking using fio to determine filesystem performance characteristics. All available spec options are given below. Fio configuration options broadly match those defined in the fio documentation.
Setting .spec.volumeClaimTemplate allows the provision of stable storage using PersistentVolumes provisioned by a PersistentVolume provisioner.
When spec.volumeClaimTemplate.accessModes contains ReadWriteMany, this benchmark will create a single PersistentVolume per BenchmarkSet iteration and attach all worker pods participating in the set (equal to spec.numWorkers) to the same volume. Otherwise, a PersistentVolume is created per pod and attached to each worker pod participating in the benchmark.
apiVersion: perftest.stackhpc.com/v1alpha1
kind: Fio
metadata:
  name: fio-filesystem
spec:
  # fio benchmark configuration options
  direct: 1
  iodepth: 8
  ioengine: libaio
  nrfiles: 1
  numJobs: 1
  bs: 1M
  rw: read
  percentageRandom: 100
  runtime: 10s
  rwmixread: 50
  size: 1G
  # kube-perftest benchmark configuration options
  numWorkers: 1
  thread: false
  hostNetwork: false
  # PersistentVolume configuration options
  volumeClaimTemplate:
    accessModes:
      - ReadWriteOnce
    storageClassName: csi-cinder
    resources:
      requests:
        storage: 5Gi
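To exercise the shared-volume behaviour described above, a spec can instead request a ReadWriteMany volume for several workers. A sketch, assuming the remaining fio options take their defaults and a hypothetical csi-cephfs storage class that supports ReadWriteMany:

apiVersion: perftest.stackhpc.com/v1alpha1
kind: Fio
metadata:
  name: fio-shared
spec:
  bs: 1M
  rw: write
  size: 1G
  # All four workers attach to the same volume
  numWorkers: 4
  volumeClaimTemplate:
    accessModes:
      - ReadWriteMany
    storageClassName: csi-cephfs
    resources:
      requests:
        storage: 5Gi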
PyTorch

Runs machine learning model training and inference micro-benchmarks from the official PyTorch benchmarks repo to compare the performance of CPU and GPU devices on synthetic input data. Running benchmarks on CUDA-capable devices requires the NVIDIA GPU Operator to be pre-installed on the target Kubernetes cluster.
The pre-built container image currently includes the alexnet, resnet50 and llama (inference only) models - additional models from the upstream repo list may be added as needed in the future. (Adding a new model simply requires adding it to the list in images/pytorch-benchmark/Dockerfile and updating the PyTorchModel enum in pytorch.py.)
apiVersion: perftest.stackhpc.com/v1alpha1
kind: PyTorch
metadata:
  name: pytorch-test-gpu
spec:
  # The device to run the benchmark on ('cpu' or 'cuda')
  device: cuda
  # Name of model to benchmark
  model: alexnet
  # Either 'train' or 'eval'
  # (not all models support both)
  benchmarkType: eval
  # Batch size for generated input data
  inputBatchSize: 32
  resources:
    limits:
      nvidia.com/gpu: 2
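For development, the operator can also be run locally against a cluster: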
# Install dependencies in a virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -U pip
pip install -r python/requirements.txt
pip install -e python
# Set the default image tag
export KUBE_PERFTEST__DEFAULT_IMAGE_TAG=<dev branch name>
# Set the default image pull policy
export KUBE_PERFTEST__DEFAULT_IMAGE_PULL_POLICY=Always
# Run the operator
kopf run -m perftest.operator -A
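Here, kopf run loads the operator's handlers from the perftest.operator module, and the -A flag tells kopf to watch for benchmark resources in all namespaces.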