RELEASE 2022 0130 Perf Tests Results
- Scale-up setup - build branch: scale-out-poc-2021-0430 - https://github.com/CentaurusInfra/arktos/tree/scale-out-poc-2021-0430
- 1 Apiserver
- 1 ETCD instance
- WCM disabled
- Perf-test tool: https://github.com/CentaurusInfra/arktos/tree/master/perf-tests/clusterloader2
- Leader-election disabled
- Insecure-port enabled
- Apiserver pprof debug enabled
- Prometheus debug enabled
- Env:
export MASTER_DISK_SIZE=200GB MASTER_ROOT_DISK_SIZE=200GB KUBE_GCE_ZONE=us-west2-b MASTER_SIZE=n1-standard-32 NODE_SIZE=n1-standard-16 NUM_NODES=6 NODE_DISK_SIZE=200GB GOPATH=$HOME/go KUBE_GCE_ENABLE_IP_ALIASES=true KUBE_GCE_PRIVATE_CLUSTER=true CREATE_CUSTOM_NETWORK=true ETCD_QUOTA_BACKEND_BYTES=8589934592 TEST_CLUSTER_LOG_LEVEL=--v=2 ENABLE_KCM_LEADER_ELECT=false ENABLE_SCHEDULER_LEADER_ELECT=false SHARE_PARTITIONSERVER=false LOGROTATE_FILES_MAX_COUNT=50 LOGROTATE_MAX_SIZE=200M KUBE_ENABLE_PROMETHEUS_DEBUG=true KUBE_ENABLE_PPROF_DEBUG=true KUBE_ENABLE_APISERVER_INSECURE_PORT=true KUBEMARK_NUM_NODES=500 KUBE_GCE_INSTANCE_PREFIX=release43021-500-scaleup KUBE_GCE_NETWORK=release43021-500-scaleup
- Cmd (the full bring-up and run flow is sketched after this command):
GOPATH=$HOME/go nohup ./perf-tests/clusterloader2/run-e2e.sh --nodes=500 --provider=kubemark --kubeconfig=/home/sonyali/go/src/k8s.io/arktos/test/kubemark/resources/kubeconfig.kubemark --report-dir=/home/sonyali/logs/perf-test/gce-500/arktos/release43021-500-scaleup --testconfig=testing/density/config.yaml --testconfig=testing/load/config.yaml --testoverrides=./testing/experiments/disable_pvs.yaml
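- For reference, the command above is the final step of the standard kubemark flow; a minimal sketch of the full sequence, assuming the upstream kube-up/kubemark scripts are used unchanged in this branch:
# Sketch only: script paths are assumed from upstream kubemark tooling and may differ here.
# Export the environment shown above, then run from the arktos repo root:
./cluster/kube-up.sh                # bring up the admin/master cluster (MASTER_SIZE, NUM_NODES, ...)
./test/kubemark/start-kubemark.sh   # create and register the KUBEMARK_NUM_NODES=500 hollow nodes
# Run the perf tests with the Cmd above, then tear everything down:
./test/kubemark/stop-kubemark.sh
./cluster/kube-down.sh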
- Scale-out setup - build branch: scale-out-poc-2021-0430 - https://github.com/CentaurusInfra/arktos/tree/scale-out-poc-2021-0430
- 1 proxy
- Multiple RP (resource partition) masters
- Multiple TP (tenant partition) masters
- WCM disabled
- Perf-test tool: https://github.com/CentaurusInfra/arktos/tree/master/perf-tests/clusterloader2
- Leader-election disabled
- Insecure-port enabled
- Apiserver pprof debug enabled
- Prometheus debug enabled
- Env example:
export KUBEMARK_NUM_NODES=15000 NUM_NODES=310 SCALEOUT_TP_COUNT=2 SCALEOUT_RP_COUNT=2 RUN_PREFIX=poc430-041621-2x2x15k
export MASTER_DISK_SIZE=1000GB MASTER_ROOT_DISK_SIZE=1000GB KUBE_GCE_ZONE=us-central1-b MASTER_SIZE=n1-highmem-96 NODE_SIZE=n1-highmem-16 NODE_DISK_SIZE=1000GB GOPATH=$HOME/go KUBE_GCE_ENABLE_IP_ALIASES=true KUBE_GCE_PRIVATE_CLUSTER=true CREATE_CUSTOM_NETWORK=true KUBE_GCE_INSTANCE_PREFIX=${RUN_PREFIX} KUBE_GCE_NETWORK=${RUN_PREFIX} ENABLE_KCM_LEADER_ELECT=false ENABLE_SCHEDULER_LEADER_ELECT=false ETCD_QUOTA_BACKEND_BYTES=8589934592 SHARE_PARTITIONSERVER=false LOGROTATE_FILES_MAX_COUNT=200 LOGROTATE_MAX_SIZE=200M KUBE_ENABLE_APISERVER_INSECURE_PORT=true KUBE_ENABLE_PROMETHEUS_DEBUG=true KUBE_ENABLE_PPROF_DEBUG=true TEST_CLUSTER_LOG_LEVEL=--v=2 HOLLOW_KUBELET_TEST_LOG_LEVEL=--v=2 SCALEOUT_CLUSTER=true
- Perf Cmd (one run per test tenant; an equivalent loop is sketched after the two commands):
SCALEOUT_TEST_TENANT=arktos RUN_PREFIX=poc430-041621-2x2x15k PERF_LOG_DIR=/home/sonyali/logs/perf-test/gce-15000/arktos/${RUN_PREFIX}/${SCALEOUT_TEST_TENANT} nohup perf-tests/clusterloader2/run-e2e.sh --nodes=15000 --provider=kubemark --kubeconfig=/home/sonyali/go/src/k8s.io/arktos/test/kubemark/resources/kubeconfig.kubemark-proxy --report-dir=${PERF_LOG_DIR} --testconfig=testing/density/config.yaml --testoverrides=./testing/experiments/disable_pvs.yaml > ${PERF_LOG_DIR}/perf-run.log 2>&1 &
SCALEOUT_TEST_TENANT=zeta RUN_PREFIX=poc430-041621-2x2x15k PERF_LOG_DIR=/home/sonyali/logs/perf-test/gce-15000/arktos/${RUN_PREFIX}/${SCALEOUT_TEST_TENANT} nohup perf-tests/clusterloader2/run-e2e.sh --nodes=15000 --provider=kubemark --kubeconfig=/home/sonyali/go/src/k8s.io/arktos/test/kubemark/resources/kubeconfig.kubemark-proxy --report-dir=${PERF_LOG_DIR} --testconfig=testing/density/config.yaml --testoverrides=./testing/experiments/disable_pvs.yaml > ${PERF_LOG_DIR}/perf-run.log 2>&1 &
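- The two commands above differ only in SCALEOUT_TEST_TENANT; a minimal equivalent loop, with tenant names, paths, and flags taken from the commands above:
RUN_PREFIX=poc430-041621-2x2x15k
for tenant in arktos zeta; do
  PERF_LOG_DIR=/home/sonyali/logs/perf-test/gce-15000/arktos/${RUN_PREFIX}/${tenant}
  mkdir -p "${PERF_LOG_DIR}"   # ensure the per-tenant report directory exists
  SCALEOUT_TEST_TENANT=${tenant} RUN_PREFIX=${RUN_PREFIX} PERF_LOG_DIR=${PERF_LOG_DIR} \
    nohup perf-tests/clusterloader2/run-e2e.sh --nodes=15000 --provider=kubemark \
    --kubeconfig=/home/sonyali/go/src/k8s.io/arktos/test/kubemark/resources/kubeconfig.kubemark-proxy \
    --report-dir=${PERF_LOG_DIR} --testconfig=testing/density/config.yaml \
    --testoverrides=./testing/experiments/disable_pvs.yaml \
    > ${PERF_LOG_DIR}/perf-run.log 2>&1 &
done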
- Build and commit information: https://github.com/CentaurusInfra/arktos
11756a4be7f (HEAD, upstream/master) Support multiple RPs in Mizar node controller (#1225)
ccd60276544 add csr to rp controller (#1228)
e9d658bd1c3 Daemonset controller supports multi resource partitions (#1224)
bb529334f2a kunsupported cgroup setup causes kubelet to emit a warning rather than exiting (#1220)
c4697f43324 Rename tech doc name - CI bot complains (#1218)
2f1e4277b38 Add a brief introduction to Google Anthos Overall Architecture
6336ea98e7f Move proxy setup logic from dev machines to proxy VM (#1212)
c74b94cc998 (master) fix flannel to v0.14.0 (#1214)
8f427844acf concurrency related code adjustment (#1209)
306c4472071 (tag: v0.9) Bump Arktos to v0.9.0 (#1204)
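- The listing above is standard decorated one-line git log output from the build checkout; it can be reproduced with something like:
git log --oneline --decorate -10   # run inside the arktos checkout used for this build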
- Additional env vars:
KUBE_CONTROLLER_EXTRA_ARGS="--kube-api-qps=100 --kube-api-burst=150"
KUBE_SCHEDULER_EXTRA_ARGS="--kube-api-qps=300 --kube-api-burst=450"
KUBE_FEATURE_GATES=ExperimentalCriticalPodAnnotation=true,QPSDoubleGCController=true
- Additional perf config: change the pod startup latency threshold to 6s; skip deleting saturation pods; skip deleting latency pods (a combined invocation is sketched after the override list)
--testoverrides=./testing/density/25k_nodes/override.yaml
--testoverrides=./testing/experiments/deleting_saturation_pods.yaml
--testoverrides=./testing/experiments/deleting_latency_pods.yaml
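- A minimal sketch of the combined invocation for one tenant, assuming the same flags as the 15k example above plus these overrides; the 25000 node count and the rel130-112921-3x3x25k prefix are inferred from the log path below:
export RUN_PREFIX=rel130-112921-3x3x25k SCALEOUT_TEST_TENANT=arktos
export PERF_LOG_DIR=/home/sonyali/logs/perf-test/gce-25k/arktos/${RUN_PREFIX}/${SCALEOUT_TEST_TENANT}
nohup perf-tests/clusterloader2/run-e2e.sh --nodes=25000 --provider=kubemark \
  --kubeconfig=/home/sonyali/go/src/k8s.io/arktos/test/kubemark/resources/kubeconfig.kubemark-proxy \
  --report-dir=${PERF_LOG_DIR} \
  --testconfig=testing/density/config.yaml \
  --testoverrides=./testing/experiments/disable_pvs.yaml \
  --testoverrides=./testing/density/25k_nodes/override.yaml \
  --testoverrides=./testing/experiments/deleting_saturation_pods.yaml \
  --testoverrides=./testing/experiments/deleting_latency_pods.yaml \
  > ${PERF_LOG_DIR}/perf-run.log 2>&1 &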
- Logs can be found under GCP project workload-controller-manager on sonyadev4: /home/sonyali/logs/perf-test/gce-25k/arktos/rel130-112921-3x3x25k
- [tenant: arktos] Test Result-density: Test finished with Status: Fail
E1129 21:29:22.815530 20985 clusterloader.go:219] Test Finished
E1129 21:29:22.815534 20985 clusterloader.go:220] Test: testing/density/config.yaml
E1129 21:29:22.815539 20985 clusterloader.go:221] Status: Fail
E1129 21:29:22.815543 20985 clusterloader.go:223] Errors: [measurement call PodStartupLatency - PodStartupLatency error: pod startup: too high latency 99th percentile: got 9.770964654s expected: 6s]
PodStartupLatency:
"data": {
"Perc50": 1846.034238,
"Perc90": 2868.577272,
"Perc99": 9770.964654
},
"unit": "ms",
SaturationPodStartupLatency:
"data": {
"Perc50": 8247.532652,
"Perc90": 23718.363873,
"Perc99": 32777.744936
},
"unit": "ms",
SchedulingThroughput:
{
"perc50": 114.4,
"perc90": 130.4,
"perc99": 220.6,
"max": 433.4
}
- [tenant: monkey] Test Result-density: Test finished with Status: Fail
E1129 21:28:38.918370 18897 clusterloader.go:220] Test: testing/density/config.yaml
E1129 21:28:38.918375 18897 clusterloader.go:221] Status: Fail
E1129 21:28:38.918379 18897 clusterloader.go:223] Errors: [namespace oyhkg4-testns object latency-deployment-58 creation error: the server is currently unable to handle the request
namespace oyhkg4-testns object latency-deployment-59 creation error: the server is currently unable to handle the request
namespace oyhkg4-testns object latency-deployment-60 creation error: the server is currently unable to handle the request
namespace oyhkg4-testns object latency-deployment-61 creation error: the server is currently unable to handle the request
PodStartupLatency:
"data": {
"Perc50": 1838.01862,
"Perc90": 2887.80774,
"Perc99": 11781.367567
},
"unit": "ms",
SaturationPodStartupLatency:
"data": {
"Perc50": 7441.018805,
"Perc90": 15135.40746,
"Perc99": 23663.440455
},
"unit": "ms",
SchedulingThroughput:
{
"perc50": 116.2,
"perc90": 128.8,
"perc99": 222.8,
"max": 389.4
}
- [tenant: zeta] Test Result-density: Test finished with Status: Fail
E1129 21:28:33.288976 29181 clusterloader.go:219] Test Finished
E1129 21:28:33.288981 29181 clusterloader.go:220] Test: testing/density/config.yaml
E1129 21:28:33.288985 29181 clusterloader.go:221] Status: Fail
E1129 21:28:33.288990 29181 clusterloader.go:223] Errors: [measurement call PodStartupLatency - PodStartupLatency error: pod startup: too high latency 99th percentile: got 9.695355474s expected: 6s]
PodStartupLatency:
"data": {
"Perc50": 1846.659701,
"Perc90": 2882.973899,
"Perc99": 9695.355474
},
"unit": "ms",
SaturationPodStartupLatency:
"data": {
"Perc50": 7939.144063,
"Perc90": 22108.767104,
"Perc99": 39638.286394
},
"unit": "ms",
SchedulingThroughput:
{
"perc50": 112,
"perc90": 125.4,
"perc99": 198.2,
"max": 499.4
}
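- The percentile values above come from the clusterloader2 measurement summaries written to each tenant's report directory; a hedged sketch for pulling them out, assuming the upstream perf-tests summary format (PodStartupLatency_*.json file names, dataItems/labels layout), which may differ in this branch:
for tenant in arktos monkey zeta; do
  dir=/home/sonyali/logs/perf-test/gce-25k/arktos/rel130-112921-3x3x25k/${tenant}
  for f in "${dir}"/PodStartupLatency_*.json; do
    echo "== ${tenant}: $(basename "${f}")"
    # print each metric with its 50th/90th/99th percentiles (values are in ms)
    jq -r '.dataItems[] | "\(.labels.Metric)  p50=\(.data.Perc50)  p90=\(.data.Perc90)  p99=\(.data.Perc99) \(.unit)"' "${f}"
  done
done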
- Build and commit information: https://github.com/futurewei-cloud/arktos-perftest/tree/poc20220130-perf-1202
f4f9af2a616 (HEAD, arktos-perf/poc20220130-perf-1202) add 20k_nodes override.yaml for density perf test
11756a4be7f (upstream/master) Support multiple RPs in Mizar node controller (#1225)
ccd60276544 add csr to rp controller (#1228)
e9d658bd1c3 Daemonset controller supports multi resource partitions (#1224)
bb529334f2a kunsupported cgroup setup causes kubelet to emit a warning rather than exiting (#1220)
c4697f43324 Rename tech doc name - CI bot complains (#1218)
2f1e4277b38 (master) Add a brief introduction to Google Anthos Overall Architecture
6336ea98e7f Move proxy setup logic from dev machines to proxy VM (#1212)
c74b94cc998 (sindica/master) fix flannel to v0.14.0 (#1214)
8f427844acf concurrency related code adjustment (#1209)
306c4472071 (tag: v0.9, arktos-perf/master) Bump Arktos to v0.9.0 (#1204)
- Additional env vars:
KUBE_CONTROLLER_EXTRA_ARGS="--kube-api-qps=80 --kube-api-burst=120"
KUBE_SCHEDULER_EXTRA_ARGS="--kube-api-qps=300 --kube-api-burst=450"
KUBE_FEATURE_GATES=ExperimentalCriticalPodAnnotation=true,QPSDoubleGCController=true
- Additional perf config: change pod latency threshold to 6s; skip deleting saturation pods; skip deleting latency pods
--testoverrides=./testing/density/20k_nodes/override.yaml
--testoverrides=./testing/experiments/deleting_saturation_pods.yaml
--testoverrides=./testing/experiments/deleting_latency_pods.yaml
- Logs can be found under GCP project workload-controller-manager on sonyadev4: /home/sonyali/logs/perf-test/gce-20k/arktos/rel130-120221-3x3x20k
- [tenant: arktos] Test Result-density: Test finished with Status: Fail
E1202 22:03:20.015506 2712 clusterloader.go:220] Test: testing/density/config.yaml
E1202 22:03:20.015510 2712 clusterloader.go:221] Status: Fail
E1202 22:03:20.015514 2712 clusterloader.go:223] Errors: [measurement call PodStartupLatency - PodStartupLatency error: pod startup: too high latency 99th percentile: got 6.75555489s expected: 6s]
PodStartupLatency:
"data": {
"Perc50": 1808.89176,
"Perc90": 2716.544209,
"Perc99": 6755.55489
},
SaturationPodStartupLatency:
"data": {
"Perc50": 2000.715777,
"Perc90": 5337.728881,
"Perc99": 8900.365149
},
SchedulingThroughput:
{
"perc50": 84.6,
"perc90": 119.4,
"perc99": 141.2,
"max": 169.6
- [tenant: monkey] Test Result-density: Test finished with Status: Fail
E1202 22:03:05.119264 24248 clusterloader.go:219] Test Finished
E1202 22:03:05.119268 24248 clusterloader.go:220] Test: testing/density/config.yaml
E1202 22:03:05.119273 24248 clusterloader.go:221] Status: Fail
E1202 22:03:05.119286 24248 clusterloader.go:223] Errors: [measurement call PodStartupLatency - PodStartupLatency error: pod startup: too high latency 99th percentile: got 6.367232281s expected: 6s]
PodStartupLatency:
"data": {
"Perc50": 1812.364377,
"Perc90": 2720.207293,
"Perc99": 6367.232281
},
SaturationPodStartupLatency:
"data": {
"Perc50": 1953.705991,
"Perc90": 4494.297597,
"Perc99": 7879.681238
},
SchedulingThroughput:
{
"perc50": 83.8,
"perc90": 107.2,
"perc99": 141.8,
"max": 175.6
}
- [tenant: zeta] Test Result-density: Test finished with Status: Fail
E1202 22:03:00.116689 26378 clusterloader.go:219] Test Finished
E1202 22:03:00.116694 26378 clusterloader.go:220] Test: testing/density/config.yaml
E1202 22:03:00.116698 26378 clusterloader.go:221] Status: Fail
E1202 22:03:00.116702 26378 clusterloader.go:223] Errors: [measurement call PodStartupLatency - PodStartupLatency error: pod startup: too high latency 99th percentile: got 6.644877432s expected: 6s]
PodStartupLatency:
"data": {
"Perc50": 1804.040981,
"Perc90": 2702.087421,
"Perc99": 6644.877432
},
SaturationPodStartupLatency:
"data": {
"Perc50": 1980.46348,
"Perc90": 4878.498122,
"Perc99": 8302.981652
},
SchedulingThroughput:
{
"perc50": 84,
"perc90": 113.8,
"perc99": 139,
"max": 153.4
}
- Build and commit information: https://github.com/CentaurusInfra/arktos
d7323cd9376 (HEAD, origin/master-scaleout-serviceiprange) Different service-cluster-ip-range for different TP
b97a1ff2b0e (origin/master-kubeupsupportvpcrange, master-kubeupsupportvpcrange) kube-up support vpc range (#1397)
1e34a15b2ee Distinct VPC range, passing VPC start/end from cmd arg for scale out (#1398)
c6b37c3a605 [Arktos] The scripts for scale-up + workers environment on AWS Ubuntu1804&Ubuntu2004 and scale-out 2x2 + workers environment on AWS Ubuntu 2004 (#1382)
b509faba333 static pods on different nodes are assigned unique uid (#1393)
95c0f4e9a8c Design doc for Mizar-Arktos Integration (#1347)
5d8567ddaa7 Kubeup scaleout mizar support (#1385)
8a545a48b57 scale-up mizar support (#1377)
c3e1ece1df9 Mizar VPC support for service, add TP master to mizar droplets (#1371)
29a4be4e249 update golang version in setup-dev-env.md (#1376)
...
- Logs can be found under GCP project workload-controller-manager on sonyadev4: /home/sonyali/logs/perf-test/gce-500/arktos/rel130-031122-2x2x500
- [tenant: arktos] Test Result-density: Test finished with Status: Success
PodStartupLatency:
"data": {
"Perc50": 1705.550843,
"Perc90": 2445.832458,
"Perc99": 2916.089723
},
"unit": "ms",
SaturationPodStartupLatency:
"data": {
"Perc50": 1768.615967,
"Perc90": 2466.832409,
"Perc99": 2923.741039
},
"unit": "ms",
SchedulingThroughput:
{
"perc50": 20,
"perc90": 20,
"perc99": 20.2,
"max": 20.2
}
- [tenant: zeta] Test Result-density: Test finished with Status: Success
PodStartupLatency:
"data": {
"Perc50": 1785.975631,
"Perc90": 2514.09808,
"Perc99": 2872.509759
},
"unit": "ms",
SaturationPodStartupLatency:
"data": {
"Perc50": 1814.88997,
"Perc90": 2515.723203,
"Perc99": 2985.930222
},
"unit": "ms",
SchedulingThroughput:
{
"perc50": 20,
"perc90": 20,
"perc99": 20.2,
"max": 20.2
}
- Build and commit information: https://github.com/CentaurusInfra/arktos
d7323cd9376 (HEAD, origin/master-scaleout-serviceiprange) Different service-cluster-ip-range for different TP
b97a1ff2b0e (origin/master-kubeupsupportvpcrange, master-kubeupsupportvpcrange) kube-up support vpc range (#1397)
1e34a15b2ee Distinct VPC range, passing VPC start/end from cmd arg for scale out (#1398)
c6b37c3a605 [Arktos] The scripts for scale-up + workers environment on AWS Ubuntu1804&Ubuntu2004 and scale-out 2x2 + workers environment on AWS Ubuntu 2004 (#1382)
b509faba333 static pods on different nodes are assigned unique uid (#1393)
95c0f4e9a8c Design doc for Mizar-Arktos Integration (#1347)
5d8567ddaa7 Kubeup scaleout mizar support (#1385)
8a545a48b57 scale-up mizar support (#1377)
c3e1ece1df9 Mizar VPC support for service, add TP master to mizar droplets (#1371)
29a4be4e249 update golang version in setup-dev-env.md (#1376)
...
- Logs can be found under GCP project workload-controller-manager on sonyadev4: /home/sonyali/logs/perf-test/gce-500/arktos/rel130-031121-up
- Test Result-density: Test finished with Status: Success
PodStartupLatency:
"data": {
"Perc50": 1746.612489,
"Perc90": 2453.854045,
"Perc99": 2887.546415
},
"unit": "ms",
SaturationPodStartupLatency:
"data": {
"Perc50": 1787.943107,
"Perc90": 2493.532866,
"Perc99": 2960.929454
},
"unit": "ms",
SchedulingThroughput:
{
"perc50": 20,
"perc90": 20,
"perc99": 20.2,
"max": 20.2
}