diff --git a/README.md b/README.md
index e280e93..a740f2e 100644
--- a/README.md
+++ b/README.md
@@ -1,24 +1,34 @@
-# Kperf
+# kperf - a kube-apiserver benchmark tool
 
-Kperf is a benchmark tool for Kubernetes API server.
+kperf is a benchmarking tool for the Kubernetes API server that allows users to
+conduct high-load testing on simulated clusters. Its primary purpose is to emulate
+clusters larger than the actual environment, helping to uncover potential control
+plane issues based on the user's workload scale. This tool provides an efficient,
+cost-effective way for users to validate the performance and stability of their
+Kubernetes API server.
 
-It's like [wrk](https://github.com/wg/wrk), but it's designed to generate load and measure latency for Kubernetes API server.
+## Why kperf?
 
-## Quick Start
+kperf offers unique advantages over tools like kubemark by simulating a broader
+range of traffic patterns found in real Kubernetes workloads. While kubemark
+primarily emulates kubelet traffic, kperf can replicate complex interactions
+typically associated with controllers, operators, and DaemonSets. This includes
+scenarios like stale list requests from the API server cache, quorum-based list
+operations that directly impact etcd, and informer cache lists and watch behaviors.
+By covering these additional traffic types, kperf provides a more comprehensive
+view of control plane performance and stability, making it an essential tool for
+understanding how a cluster will handle high-load scenarios across diverse workload patterns.
 
-To quickly get started with Kperf, follow these steps:
+## Getting Started
 
-1. Run the command `make` to build the necessary dependencies.
+See the [Getting-Started](/docs/getting-started.md) documentation.
 
-2. Once the build is complete, execute the following command to start the benchmark:
+## Running in a Cluster
 
-```bash
-bin/kperf -v 3 runner run --config examples/node10_job1_pod100.yaml
-```
+The `kperf` commands offer low-level functions for measuring the target kube-apiserver.
+You may want an example of how to combine these functions into a complete benchmark test.
 
-3. The benchmark will generate load and measure the performance of the Kubernetes API server. You will see the results displayed in the terminal, including the total number of requests, duration, error statistics, received bytes, and percentile latencies.
-
-Feel free to adjust the configuration file (`examples/node10_job1_pod100.yaml`) according to your requirements.
+See the [runkperf](/docs/runkperf.md) documentation for more details.
 
 ## Contributing
diff --git a/cmd/kperf/commands/runner/README.md b/cmd/kperf/commands/runner/README.md
deleted file mode 100644
index 7709be7..0000000
--- a/cmd/kperf/commands/runner/README.md
+++ /dev/null
@@ -1,89 +0,0 @@
-## runner subcommand
-
-This subcommand can be used to run benchmark.
-
-Before we run benchmark, we need to define load profile.
-
-```YAML
-version: 1
-description: example profile
-spec:
-  # rate defines the maximum requests per second (zero is no limit).
-  rate: 100
-  # total defines the total number of requests.
-  total: 10
-  # conns defines total number of individual transports used for traffic.
-  conns: 100
-  # client defines total number of HTTP clients.
-  client: 1000
-  # contentType defines response's content type. (json or protobuf)
-  contentType: json
-  # disableHTTP2 means client will use HTTP/1.1 protocol if it's true.
-  disableHTTP2: false
-  # pick up requests randomly based on defined weight.
- requests: - # staleList means this list request with zero resource version. - - staleList: - version: v1 - resource: pods - limit: 1000 - shares: 1000 # Has 50% chance = 1000 / (1000 + 1000) - # quorumList means this list request without kube-apiserver cache. - - quorumList: - version: v1 - resource: pods - limit: 1000 - shares: 1000 # Has 50% chance = 1000 / (1000 + 1000) -``` - -Let's say the local file `/tmp/example-loadprofile.yaml`. - -We can run benchmark by the following command: - -```bash -$ # cd kperf repo -$ # please build binary by make build -$ -$ bin/kperf -v 3 runner run --config /tmp/example-loadprofile.yaml -I0131 09:50:45.471008 2312418 schedule.go:96] "Setting" clients=1000 connections=100 rate=100 total=10 http2=true content-type="json" -{ - "total": 10, - "duration": "1.021348144s", - "errorStats": { - "unknownErrors": [], - "responseCodes": {}, - "http2Errors": {} - }, - "totalReceivedBytes": 18802170, - "percentileLatencies": [ - [ - 0, - 0.82955958 - ], - [ - 0.5, - 0.846259049 - ], - [ - 0.9, - 1.000932855 - ], - [ - 0.95, - 1.006544717 - ], - [ - 0.99, - 1.006544717 - ], - [ - 1, - 1.006544717 - ] - ] -} -``` - -Please checkout `kperf runner run -h` to see more options. - -If you want to run benchmark in kubernetes cluster, please use `kperf rg`. diff --git a/cmd/kperf/commands/runnergroup/README.md b/cmd/kperf/commands/runnergroup/README.md deleted file mode 100644 index 7875c3c..0000000 --- a/cmd/kperf/commands/runnergroup/README.md +++ /dev/null @@ -1,160 +0,0 @@ -## runnergroup subcommand - -The subcommand is used to manage a set of runners in target kubernetes. -Before using command, please build container image for kperf first. - -```bash -$ # cd kperf repo -$ # change repo name to your -$ export IMAGE_REPO=example.azurecr.io/public -$ export IMAGE_TAG=v0.0.2 -$ make image-push -``` - -After that, build kperf binary. - -```bash -$ # cd kperf repo -$ make build -``` - -> NOTE: `make help` can show more recipes. - -### run - deploy a set of runners into kubernetes - -Before run `run` command, we should define runner group first. -Here is an example: there are 10 runners in one group. - -```YAML -# count defines how many runners in the group. -count: 10 # 10 runners in this group -# loadProfile defines what the load traffic looks like. -# All the runners in this group will use the same load profile. -loadProfile: - version: 1 - description: example profile - spec: - # rate defines the maximum requests per second (zero is no limit). - rate: 100 - # total defines the total number of requests. - total: 10 - # conns defines total number of individual transports used for traffic. - conns: 100 - # client defines total number of HTTP clients. - client: 1000 - # contentType defines response's content type. (json or protobuf) - contentType: json - # disableHTTP2 means client will use HTTP/1.1 protocol if it's true. - disableHTTP2: false - # pick up requests randomly based on defined weight. - requests: - # staleList means this list request with zero resource version. - - staleList: - version: v1 - resource: pods - limit: 1000 - shares: 1000 # Has 50% chance = 1000 / (1000 + 1000) - # quorumList means this list request without kube-apiserver cache. - - quorumList: - version: v1 - resource: pods - limit: 1000 - shares: 1000 # Has 50% chance = 1000 / (1000 + 1000) - -# nodeAffinity defines how to deploy runners into dedicated nodes which have specific labels. 
-nodeAffinity: - node.kubernetes.io/instance-type: - - Standard_D8s_v3 -``` - -Let's say the local file `/tmp/example-runnergroup-spec.yaml`. - -We can run this runner group by the following command: - -```bash -$ # cd kperf repo -$ # change repo name to your -$ export IMAGE_REPO=example.azurecr.io/public -$ export IMAGE_TAG=v0.0.2 -$ export IMAGE_NAME=$IMAGE_REPO/kperf:$IMAGE_TAG -$ -$ bin/kperf rg run \ - --runner-image=$IMAGE_NAME \ - --runnergroup="file:///tmp/example-runnergroup-spec.yaml" -``` - -We use URI scheme to load runner group's spec. -For example, `file://absolute-path`. We also support read spec from configmap, `configmap://name?namespace=ns&specName=dataNameInCM`. -Please checkout `kperf rg run -h` to see more options. - -> NOTE: Currently, we use helm release to deploy a long running sever as controller to -deploy runners. The namespace is `runnergroups-kperf-io` and we don't allow run -multiple long running servers right now. - -### status - check runnergroup's status - -After deploy runner groups successfully, we can use `status` to check. - -```bash -$ # cd kperf repo -$ -$ bin/kperf rg status -NAME COUNT SUCCEEDED FAILED STATE START -runnergroup-server-0 10 10 0 finished 2024-01-30T10:18:36Z -``` - -### result - wait for test report - -We use `result` to wait for report. - -```bash -$ # cd kperf repo -$ -$ bin/kperf rg result --wait --timeout=1h -{ - "total": 100, - "duration": "318.47949ms", - "errorStats": { - "unknownErrors": [], - "responseCodes": {}, - "http2Errors": {} - }, - "totalReceivedBytes": 89149672, - "percentileLatencies": [ - [ - 0, - 0.039138658 - ], - [ - 0.5, - 0.072110663 - ], - [ - 0.9, - 0.158119337 - ], - [ - 0.95, - 0.179047998 - ], - [ - 0.99, - 0.236420101 - ], - [ - 1, - 0.267788626 - ] - ] -} -``` - -`--wait` is used to block until all the runners finished. - -### delete - delete runner groups - -```bash -$ # cd kperf repo -$ -$ bin/kperf rg delete -``` diff --git a/cmd/kperf/commands/virtualcluster/README.md b/cmd/kperf/commands/virtualcluster/README.md deleted file mode 100644 index ad4c432..0000000 --- a/cmd/kperf/commands/virtualcluster/README.md +++ /dev/null @@ -1,93 +0,0 @@ -# virtualcluster subcommand - -## nodepool subcommand - -The `nodepool` subcmd is using [kwok](https://github.com/kubernetes-sigs/kwok) to -deploy virtual nodepool. The user can use few physical resources to simulate -more than 1,000 nodes scenario. - -The kperf uses `virtualnodes-kperf-io` namespace to host resources related to -nodepool. - -If the user wants to schedule pods to virtual nodes, the user needs to change -node affinity and tolerations for pods. - -```YAML -affinity: - nodeAffinity: - requiredDuringSchedulingIgnoredDuringExecution: - nodeSelectorTerms: - - matchExpressions: - - key: type - operator: In - values: - - kperf-virtualnodes - -tolerations: -- key: "kperf.io/nodepool" - operator: "Exists" - effect: "NoSchedule" -``` - -Be default, the pod created by job controller will be completed after 5 seconds. -Other pods will be long running until receiving delete event. - -### add - add a set of nodes with the same setting - -We can use the following command to add nodepool named by `example` with 100 nodes. - -```bash -$ # cd kperf repo -$ # please build binary by make build -$ -$ bin/kperf vc nodepool add example \ - --nodes=100 --cpu=32 --memory=96 --max-pods=50 \ - --affinity="node.kubernetes.io/instance-type=Standard_D16s_v3" -``` - -The `--affinity` is used to deploy node controller (kwok) to nodes with the -specific labels. 
- -The user can use `kubectl get nodes` to check. - -```bash -$ kubectl get nodes -o wide | grep example | head -n 10 -example-0 Ready agent 75s fake 10.244.11.150 kwok-v0.4.0 kwok -example-1 Ready agent 75s fake 10.244.9.71 kwok-v0.4.0 kwok -example-10 Ready agent 75s fake 10.244.10.178 kwok-v0.4.0 kwok -example-11 Ready agent 75s fake 10.244.9.74 kwok-v0.4.0 kwok -example-12 Ready agent 75s fake 10.244.9.75 kwok-v0.4.0 kwok -example-13 Ready agent 75s fake 10.244.11.143 kwok-v0.4.0 kwok -example-14 Ready agent 75s fake 10.244.11.153 kwok-v0.4.0 kwok -example-15 Ready agent 75s fake 10.244.10.180 kwok-v0.4.0 kwok -example-16 Ready agent 75s fake 10.244.9.81 kwok-v0.4.0 kwok -example-17 Ready agent 75s fake 10.244.11.147 kwok-v0.4.0 kwok -``` - -### list - list all the existing nodepools created by kperf - -```bash -$ # cd kperf repo -$ # please build binary by make build -$ -$ bin/kperf vc nodepool list -NAME NODES CPU MEMORY (GiB) MAX PODS STATUS -example ? / 100 32 96 50 deployed -example-v2 ? / 10 8 16 130 deployed -``` - -> NOTE: There is TODO item to show the number of ready nodes. Before that, we -use `?` as read nodes. - -### delete - delete the target nodepool - -```bash -$ # cd kperf repo -$ # please build binary by make build -$ -$ bin/kperf vc nodepool delete example -$ -$ bin/kperf vc nodepool list -NAME NODES CPU MEMORY (GiB) MAX PODS STATUS -example-v2 ? / 10 8 16 130 deployed -``` diff --git a/docs/getting-started.md b/docs/getting-started.md new file mode 100644 index 0000000..7e34e8d --- /dev/null +++ b/docs/getting-started.md @@ -0,0 +1,488 @@ +# Getting started with kperf + +## Installing kperf + +Currently, kperf hasn't released official binary yet. To install kperf, we need +to build kperf from source. + +### Build requirements + +The following build system dependencies are required: + +* Go 1.21.X or above +* Linux platform +* GNU Make +* Git + +> NOTE: The contrib/cmd/runkperf binary is using [mount_namespaces(7)](https://man7.org/linux/man-pages/man7/mount_namespaces.7.html) +to fetch metrics from each instance of kube-apiserver. It requires Linux platform. + +### Build kperf + +You need git to checkout the source code: + +```bash +git clone https://github.com/Azure/kperf.git +``` + +`kperf` uses `make` to create a repeatable build flow. +That means you can run: + +```bash +cd kperf +make +``` + +This is going to build binaries in the `./bin` directory. + +You can move them in your `PATH`. You can run: + +```bash +sudo make install +``` + +By default, the binaries will be in `/usr/local/bin`. The install prefix can be +changed by passing the `PREFIX` variable (default: `/usr/local`). + +## Using kperf + +### kperf-runner run + +The `kperf runner run` command generates requests from the endpoint where the command is executed. +This command provides flexiable way to configure how to generate requests to Kubernetes API server. +All the requests are generated based on load profile, for example. + +```yaml +version: 1 +description: example profile +spec: + # rate defines the maximum requests per second (zero is no limit). + rate: 100 + + # total defines the total number of requests. + total: 10 + + # conns defines total number of individual transports used for traffic. + conns: 100 + + # client defines total number of HTTP clients. These clients shares connection + # pool represented by `conn:` field. + client: 1000 + + # contentType defines response's content type. 
(json or protobuf) + contentType: json + + # disableHTTP2 means client will use HTTP/1.1 protocol if it's true. + disableHTTP2: false + + # pick up requests randomly based on defined weight. + requests: + # staleList means this list request with zero resource version. + - staleList: + version: v1 + resource: pods + shares: 1000 # Has 50% chance = 1000 / (1000 + 1000) + # quorumList means this list request without kube-apiserver cache. + - quorumList: + version: v1 + resource: pods + limit: 1000 + shares: 1000 # Has 50% chance = 1000 / (1000 + 1000) +``` + +Let's see what that profile means here. + +There are two kinds of requests and all the responses are in JSON format. + +* stale list: `/api/v1/pods` +* quorum list: `/api/v1/pods?limit=1000` + +That command will send out `10` requests with `100` QPS as maximum rate. +You can adjust the `total` and `rate` fields to control the test duration. + +Before generating requests, that comamnd will generate `100` individual connections and share them in `1000` clients. + +When the number of clients exceeds the available connections, each client selects a specific connection based on its index. +The goal is for each client to select a connection based on its index to ensure every client is assigned a connection in a round-robin fashion. + +```plain +Client 0 is assigned to Connection 0 +Client 1 is assigned to Connection 1 +Client 2 is assigned to Connection 2 +Client 3 is assigned to Connection 0 +Client 4 is assigned to Connection 1 +``` + +The above profile is located at `/tmp/example-loadprofile.yaml`. You can run + +```bash +$ kperf -v 3 runner run --config /tmp/example-loadprofile.yaml +I1028 23:08:18.948632 294624 schedule.go:96] "Setting" clients=1000 connections=100 rate=100 total=10 http2=true content-type="json" +{ + "total": 10, + "duration": "367.139837ms", + "errorStats": { + "unknownErrors": [], + "netErrors": {}, + "responseCodes": {}, + "http2Errors": {} + }, + "totalReceivedBytes": 2856450, + "percentileLatencies": [ + [ + 0, + 0.235770565 + ], + [ + 0.5, + 0.247910802 + ], + [ + 0.9, + 0.266660525 + ], + [ + 0.95, + 0.286721785 + ], + [ + 0.99, + 0.286721785 + ], + [ + 1, + 0.286721785 + ] + ], + "percentileLatenciesByURL": { + "https://xyz:443/api/v1/pods?limit=1000\u0026timeout=1m0s": [ + [ + 0, + 0.235770565 + ], + [ + 0.5, + 0.245662504 + ], + [ + 0.9, + 0.266660525 + ], + [ + 0.95, + 0.266660525 + ], + [ + 0.99, + 0.266660525 + ], + [ + 1, + 0.266660525 + ] + ], + "https://xyz:443/api/v1/pods?resourceVersion=0\u0026timeout=1m0s": [ + [ + 0, + 0.23650554 + ], + [ + 0.5, + 0.247910802 + ], + [ + 0.9, + 0.286721785 + ], + [ + 0.95, + 0.286721785 + ], + [ + 0.99, + 0.286721785 + ], + [ + 1, + 0.286721785 + ] + ] + } +} +``` + +The result shows the percentile latencies and also provides latency details based on each kind of request. + +> NOTE: Please checkout `kperf runner run -h` to see more options. + +If you want to run benchmark in Kubernetes cluster, please use `kperf runnergroup`. + +### kperf-runnergroup + +The `kperf runnergroup` command manages a group of runners within a target Kubernetes cluster. +A runner group consists of multiple runners, with each runner deployed as an individual Pod for the `kperf runner` process. +These runners not only generate requests within the cluster but can also issue requests from multiple endpoints, +mitigating limitations such as network bandwidth constraints. + +#### run - deploy a set of runners into kubernetes + +Each runner in a group shares the same load profile. 
For example, here is the definition of a runner group.
+There are 10 runners in this group, and they will be scheduled to nodes of the `Standard_DS2_v2` instance type.
+
+```yaml
+# count defines how many runners are in the group.
+count: 10
+
+# loadProfile defines what the load traffic looks like.
+# All the runners in this group will use the same load profile.
+loadProfile:
+  version: 1
+  description: example profile
+  spec:
+    # rate defines the maximum requests per second (zero is no limit).
+    rate: 100
+
+    # total defines the total number of requests.
+    total: 10
+
+    # conns defines total number of individual transports used for traffic.
+    conns: 100
+
+    # client defines total number of HTTP clients.
+    client: 1000
+
+    # contentType defines response's content type. (json or protobuf)
+    contentType: json
+
+    # disableHTTP2 means client will use HTTP/1.1 protocol if it's true.
+    disableHTTP2: false
+
+    # pick up requests randomly based on defined weight.
+    requests:
+      # staleList means this list request with zero resource version.
+      - staleList:
+          version: v1
+          resource: pods
+        shares: 1000 # Has 50% chance = 1000 / (1000 + 1000)
+      # quorumList means this list request without kube-apiserver cache.
+      - quorumList:
+          version: v1
+          resource: pods
+          limit: 1000
+        shares: 1000 # Has 50% chance = 1000 / (1000 + 1000)
+
+# nodeAffinity defines how to deploy runners into dedicated nodes which have specific labels.
+nodeAffinity:
+  node.kubernetes.io/instance-type:
+    - Standard_DS2_v2
+```
+
+Assuming this spec is saved as `/tmp/example-runnergroup-spec.yaml`, you can run:
+
+```bash
+$ kperf rg run \
+    --runner-image=telescope.azurecr.io/oss/kperf:v0.1.5 \
+    --runnergroup="file:///tmp/example-runnergroup-spec.yaml"
+```
+
+A URI scheme is used to load the runner group's spec, for example, `file://absolute-path`.
+Reading the spec from a ConfigMap is also supported: `configmap://name?namespace=ns&specName=dataNameInCM`.
+Please check out `kperf rg run -h` to see more options.
+
+> NOTE: Currently, we use a Helm release to deploy a long-running server as the controller that
+deploys the runners. The namespace is `runnergroups-kperf-io`, and running multiple long-running
+servers is not supported right now.
+
+#### status - check runner group's status
+
+After the runner group has been deployed successfully, you can use `status` to check on it.
+
+```bash
+$ kperf rg status
+NAME                   COUNT   SUCCEEDED   FAILED   STATE      START
+runnergroup-server-0   10      10          0        finished   2024-10-29T00:30:03Z
+```
+
+#### result - wait for test report
+
+We use `result` to wait for the report.
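+If you don't want to block indefinitely, you can bound the wait with the `--timeout`
+flag, for example:
+
+```bash
+$ kperf rg result --wait --timeout=1h
+```
+
+Without a timeout, the command blocks until all the runners have finished and then
+prints the aggregated report: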
+ +```bash +$ kperf rg result --wait +{ + "total": 100, + "duration": "283.369368ms", + "errorStats": { + "unknownErrors": [], + "netErrors": {}, + "responseCodes": {}, + "http2Errors": {} + }, + "totalReceivedBytes": 36087700, + "percentileLatencies": [ + [ + 0, + 0.031640566 + ], + [ + 0.5, + 0.084185705 + ], + [ + 0.9, + 0.152182422 + ], + [ + 0.95, + 0.172522186 + ], + [ + 0.99, + 0.186271132 + ], + [ + 1, + 0.205396874 + ] + ], + "percentileLatenciesByURL": { + "https://10.0.0.1:443/api/v1/pods?limit=1000\u0026timeout=1m0s": [ + [ + 0, + 0.044782901 + ], + [ + 0.5, + 0.093048564 + ], + [ + 0.9, + 0.152182422 + ], + [ + 0.95, + 0.174676524 + ], + [ + 0.99, + 0.205396874 + ], + [ + 1, + 0.205396874 + ] + ], + "https://10.0.0.1:443/api/v1/pods?resourceVersion=0\u0026timeout=1m0s": [ + [ + 0, + 0.031640566 + ], + [ + 0.5, + 0.076792273 + ], + [ + 0.9, + 0.158094428 + ], + [ + 0.95, + 0.172522186 + ], + [ + 0.99, + 0.176899664 + ], + [ + 1, + 0.176899664 + ] + ] + } +} +``` + +> NOTE: `--wait` is used to block until all the runners finished. + +#### delete - delete runner groups + +```bash +$ kperf rg delete +``` + +### kperf-virtualcluster nodepool + +The `nodepool` subcmd is using [kwok](https://github.com/kubernetes-sigs/kwok) to +deploy virtual nodepool. You can use few physical resources to simulate more than 1,000 nodes scenario. + +> NOTE: The `kperf` uses `virtualnodes-kperf-io` namespace to host resources related to nodepool. + +If the user wants to schedule pods to virtual nodes, the user needs to change node affinity and tolerations for pods. + +```YAML +affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: type + operator: In + values: + - kperf-virtualnodes + +tolerations: +- key: "kperf.io/nodepool" + operator: "Exists" + effect: "NoSchedule" +``` + +Be default, the pod created by job controller will be completed after 5 seconds. +Other pods will be long running until receiving delete event. + +#### add - add a set of nodes with the same setting + +You can use the following command to add nodepool named by `example` with 10 nodes. + +```bash +$ kperf vc nodepool add example \ + --nodes=10 --cpu=32 --memory=96 --max-pods=50 \ + --affinity="node.kubernetes.io/instance-type=Standard_DS2_v2" +``` + +> NOTE: The `--affinity` is used to deploy node controller (kwok) to nodes with the specific labels. + +You can use `kubectl get nodes` to check. + +```bash +$ kubectl get nodes -o wide | grep example +example-0 Ready agent 21s fake 10.244.15.21 kwok-v0.5.1 kwok +example-1 Ready agent 21s fake 10.244.13.18 kwok-v0.5.1 kwok +example-2 Ready agent 21s fake 10.244.14.18 kwok-v0.5.1 kwok +example-3 Ready agent 21s fake 10.244.15.22 kwok-v0.5.1 kwok +example-4 Ready agent 21s fake 10.244.13.19 kwok-v0.5.1 kwok +example-5 Ready agent 21s fake 10.244.14.21 kwok-v0.5.1 kwok +example-6 Ready agent 21s fake 10.244.14.20 kwok-v0.5.1 kwok +example-7 Ready agent 21s fake 10.244.14.19 kwok-v0.5.1 kwok +example-8 Ready agent 21s fake 10.244.13.20 kwok-v0.5.1 kwok +example-9 Ready agent 21s fake 10.244.15.23 kwok-v0.5.1 kwok +``` + +#### list - list all the existing nodepools + +```bash +$ kperf vc nodepool list +NAME NODES CPU MEMORY (GiB) MAX PODS STATUS +example ? / 10 32 96 50 deployed +``` + +> NOTE: There is TODO item to show the number of ready nodes. Before that, we +use `?` as read nodes. 
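+
+Until the ready count is shown there, a rough way to count ready virtual nodes yourself
+is with `kubectl`; this sketch assumes the virtual nodes carry the `type=kperf-virtualnodes`
+label used in the affinity example above:
+
+```bash
+# Count kwok-managed virtual nodes that currently report Ready.
+kubectl get nodes -l type=kperf-virtualnodes --no-headers | awk '$2 == "Ready"' | wc -l
+```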
+
+#### delete - delete the target nodepool
+
+```bash
+$ kperf vc nodepool delete example
+$
+$ kperf vc nodepool list
+NAME     NODES    CPU    MEMORY (GiB)    MAX PODS    STATUS
+```
diff --git a/docs/runkperf.md b/docs/runkperf.md
new file mode 100644
index 0000000..eaea485
--- /dev/null
+++ b/docs/runkperf.md
@@ -0,0 +1,265 @@
+# runkperf
+
+runkperf is a command-line tool that runs kperf within a Kubernetes cluster to
+simulate large workloads and measure the performance and stability of the target kube-apiserver.
+
+## Installing runkperf
+
+See documentation [Getting-Started#Installing-Kperf](/docs/getting-started.md#installing-kperf).
+
+## How to run a benchmark test?
+
+runkperf includes three benchmark scenarios, one of which focuses on measuring
+performance and stability with 3,000 short-lifecycle pods distributed across 100 nodes.
+
+```bash
+$ runkperf bench --runner-image telescope.azurecr.io/oss/kperf:v0.1.5 node100_job1_pod3k --help
+
+NAME:
+   runkperf bench node100_job1_pod3k -
+
+The test suite is to setup 100 virtual nodes and deploy one job with 3k pods on
+that nodes. It repeats to create and delete job. The load profile is fixed.
+
+
+USAGE:
+   runkperf bench node100_job1_pod3k [command options] [arguments...]
+
+OPTIONS:
+   --total value         Total requests per runner (There are 10 runners totally and runner's rate is 10) (default: 36000)
+   --cpu value           the allocatable cpu resource per node (default: 32)
+   --memory value        The allocatable Memory resource per node (GiB) (default: 96)
+   --max-pods value      The maximum Pods per node (default: 110)
+   --content-type value  Content type (json or protobuf) (default: "json")
+```
+
+This test eliminates the need to set up 100 physical nodes, as kperf leverages
+[kwok](https://github.com/kubernetes-sigs/kwok) to simulate both nodes and pod
+lifecycles. Only a few physical nodes are required to host **5** kperf runners
+and **100** kwok controllers.
+
+We **recommend** using two separate node pools in the target Kubernetes cluster
+to host the kperf runners and kwok controllers independently. By default, runkperf
+schedules:
+
+* Runners on nodes with instance type: **Standard_D16s_v3** on Azure or **m4.4xlarge** on AWS
+* kwok controllers on nodes with instance type: **Standard_D8s_v3** on Azure or **m4.2xlarge** on AWS
+
+You can modify the scheduling affinity for runners and controllers using the
+`--rg-affinity` and `--vc-affinity` options. Please check `runkperf bench --help` for more details.
+
+When the target cluster is ready, you can run:
+
+```bash
+$ sudo runkperf -v 3 bench \
+    --kubeconfig $HOME/.kube/config \
+    --runner-image telescope.azurecr.io/oss/kperf:v0.1.5 \
+    node100_job1_pod3k --total 1000
+```
+
+> NOTE: The `sudo` allows the command to create a new mount namespace (see [mount_namespaces(7)](https://man7.org/linux/man-pages/man7/mount_namespaces.7.html))
+to fetch kube-apiserver metrics, for example, `GOMAXPROCS`. However, it's not required.
+
+This command has four steps:
+
+* Set up 100 virtual nodes
+* Repeatedly create and delete one Job to simulate 3,000 short-lifecycle pods
+* Deploy the runner group and start the measurement
+* Retrieve the measurement report
+
+You will see a summary like the following when the runners finish:
+
+```bash
+{
+  "description": "\nEnvironment: 100 virtual nodes managed by kwok-controller,\nWorkload: Deploy 1 job with 3,000 pods repeatedly. The parallelism is 100.
The interval is 5s", + "loadSpec": { + "count": 10, + "loadProfile": { + "version": 1, + "description": "node100-job1-pod3k", + "spec": { + "rate": 10, + "total": 1000, + "conns": 10, + "client": 100, + "contentType": "json", + "disableHTTP2": false, + "maxRetries": 0, + "Requests": [ + { + "shares": 1000, + "staleList": { + "group": "", + "version": "v1", + "resource": "pods", + "namespace": "", + "limit": 0, + "seletor": "", + "fieldSelector": "" + } + }, + { + "shares": 100, + "quorumList": { + "group": "", + "version": "v1", + "resource": "pods", + "namespace": "", + "limit": 1000, + "seletor": "", + "fieldSelector": "" + } + }, + { + "shares": 100, + "quorumList": { + "group": "", + "version": "v1", + "resource": "events", + "namespace": "", + "limit": 1000, + "seletor": "", + "fieldSelector": "" + } + } + ] + } + }, + "nodeAffinity": { + "node.kubernetes.io/instance-type": [ + "Standard_D16s_v3", + "m4.4xlarge" + ] + } + }, + "result": { + "total": 10000, + "duration": "1m40.072897445s", + "errorStats": { + "unknownErrors": [], + "netErrors": {}, + "responseCodes": {}, + "http2Errors": {} + }, + "totalReceivedBytes": 38501695787, + "percentileLatencies": [ + [ + 0, + 0.024862332 + ], + [ + 0.5, + 0.076491594 + ], + [ + 0.9, + 0.135807192 + ], + [ + 0.95, + 0.157084984 + ], + [ + 0.99, + 0.200460794 + ], + [ + 1, + 0.323297381 + ] + ], + "percentileLatenciesByURL": { + "https://10.0.0.1:443/api/v1/events?limit=1000\u0026timeout=1m0s": [ + [ + 0, + 0.025955119 + ], + [ + 0.5, + 0.040329283 + ], + [ + 0.9, + 0.05549999 + ], + [ + 0.95, + 0.061468019 + ], + [ + 0.99, + 0.079093604 + ], + [ + 1, + 0.158946761 + ] + ], + "https://10.0.0.1:443/api/v1/pods?limit=1000\u0026timeout=1m0s": [ + [ + 0, + 0.041545073 + ], + [ + 0.5, + 0.12342483 + ], + [ + 0.9, + 0.186716374 + ], + [ + 0.95, + 0.208233619 + ], + [ + 0.99, + 0.253509952 + ], + [ + 1, + 0.323297381 + ] + ], + "https://10.0.0.1:443/api/v1/pods?resourceVersion=0\u0026timeout=1m0s": [ + [ + 0, + 0.024862332 + ], + [ + 0.5, + 0.077794907 + ], + [ + 0.9, + 0.131738916 + ], + [ + 0.95, + 0.146966904 + ], + [ + 0.99, + 0.189498717 + ], + [ + 1, + 0.302434749 + ] + ] + } + }, + "info": { + "apiserver": { + "cores": { + "after": { + "52.167.25.119": 10 + }, + "before": { + "52.167.25.119": 10 + } + } + } + } +} +``` diff --git a/examples/node10_job1_pod100.yaml b/examples/node10_job1_pod100.yaml deleted file mode 100644 index 4755091..0000000 --- a/examples/node10_job1_pod100.yaml +++ /dev/null @@ -1,25 +0,0 @@ - version: 1 - description: "node10-job1-pod100" - spec: - rate: 10 - total: 1000 - conns: 10 - client: 10 - contentType: json - disableHTTP2: false - maxRetries: 0 - requests: - - staleList: - version: v1 - resource: pods - shares: 1000 # chance 1000 / (1000 + 100 + 100) - - quorumList: - version: v1 - resource: pods - limit: 1000 - shares: 100 # chance 100 / (1000 + 100 + 100) - - quorumList: - version: v1 - resource: events - limit: 1000 - shares: 100 # chance 100 / (1000 + 100 + 100) \ No newline at end of file
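The runkperf report shown above is plain JSON, so it can be post-processed with standard
tools. As a minimal sketch, assuming the report was redirected to a local file named
`report.json` (the filename is arbitrary), the overall P99 latency in seconds can be
pulled out with `jq`:

```bash
# Print the overall P99 latency (seconds) from a saved runkperf report.
jq -r '.result.percentileLatencies[] | select(.[0] == 0.99) | .[1]' report.json
```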