Update performance benchmarks #3818

Closed
wants to merge 8 commits into from

Review comment (Member):

Is there a link to the Actor performance test that we can include, so that people can try this themselves?

@@ -19,7 +19,7 @@ For applications using actors in Dapr there are two aspects to be considered. Fi
* Sidecar Injector (control plane)
* Sentry (optional, control plane)

-## Performance summary for Dapr v1.0
+## Performance summary for Dapr v1.12

The actors API in Dapr sidecar will identify which hosts are registered for a given actor type and route the request to the appropriate host for a given actor ID. The host runs an instance of the application and uses the Dapr SDK (.Net, Java, Python or PHP) to handle actors requests via HTTP.
Review comment (Member):

Suggested change
-The actors API in Dapr sidecar will identify which hosts are registered for a given actor type and route the request to the appropriate host for a given actor ID. The host runs an instance of the application and uses the Dapr SDK (.Net, Java, Python or PHP) to handle actors requests via HTTP.
+The actors API in Dapr sidecar identifies which hosts are registered for a given actor type and routes the request to the appropriate host for a given actor ID. The host runs an instance of the application and uses the Dapr SDK (.Net, Java, Python, Go) to handle actors requests via HTTP.
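
To make the routing described in this paragraph concrete, here is a minimal sketch of how a caller might address one actor instance through the sidecar's HTTP API. The sidecar address (`localhost:3500`, the default HTTP port) and the actor type, ID, and method names are illustrative assumptions, not values taken from the benchmark.

```python
from urllib.parse import quote

# Assumption: the sidecar listens on the default HTTP port 3500.
DAPR_HTTP_BASE = "http://localhost:3500"


def actor_method_url(actor_type: str, actor_id: str, method: str) -> str:
    """Build the sidecar URL for invoking a method on one actor instance."""
    return (
        f"{DAPR_HTTP_BASE}/v1.0/actors/"
        f"{quote(actor_type, safe='')}/{quote(actor_id, safe='')}"
        f"/method/{quote(method, safe='')}"
    )


# Hypothetical actor type, ID, and method, for illustration only.
print(actor_method_url("OrderActor", "order-42", "getStatus"))
```

The caller only names the actor type and ID; which host actually serves the request is resolved by the sidecar, as the paragraph describes.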


@@ -40,17 +40,14 @@ Test parameters:
* Sidecar limited to 0.5 vCPU
* mTLS enabled
* Sidecar telemetry enabled (tracing with a sampling rate of 0.1)
* Payload of an empty JSON object: `{}`

### Results

-* The actual throughput was ~500 qps.
-* The tp90 latency was ~3ms.
-* The tp99 latency was ~6.2ms.
-* Dapr app consumed ~523m CPU and ~304.7Mb of Memory
-* Dapr sidecar consumed 2m CPU and ~18.2Mb of Memory
+* The requested throughput was 500 qps.
+* The actual throughput was 500 qps.
+* The tp90 latency was ~3.2ms.
+* The tp99 latency was ~7ms.
+* Dapr app consumed ~339m CPU and ~336Mb of Memory
+* Dapr sidecar consumed 93m CPU and ~60Mb of Memory
* No app restarts
* No sidecar restarts
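
For readers unfamiliar with the tp90 and tp99 figures in these results, the sketch below shows one common way to derive them from raw latency samples, using the nearest-rank method. This is an assumption for illustration: the page does not say how the benchmark's load tester computes percentiles, and the sample values are made up.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up latencies in milliseconds, not benchmark data.
latencies_ms = [1.0, 1.2, 1.1, 0.9, 3.0, 1.3, 1.0, 7.0, 1.4, 1.1]
tp90 = percentile(latencies_ms, 90)
tp99 = percentile(latencies_ms, 99)
print(f"tp90={tp90}ms tp99={tp99}ms")
```

With only ten samples the tail percentiles are dominated by one or two slow requests, which is why a real run uses many thousands of requests.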

## Related links
* For more information see [overview of Dapr on Kubernetes]({{< ref kubernetes-overview.md >}})
@@ -0,0 +1,50 @@
---
type: docs
title: "Pub/sub performance"
linkTitle: "Pub/sub performance"
weight: 20000
Review comment (Member):

Suggested change
-weight: 20000
+weight: 30000

description: ""
---
This article provides pub/sub API performance benchmarks and resource utilization in Dapr on Kubernetes.

## System overview

Dapr consists of a data plane, the sidecar that runs next to your app, and a control plane that configures the sidecars and provides capabilities such as cert and identity management.

### Kubernetes components

* Sidecar (data plane)
* Placement (required for actors, control plane mapping actor types to hosts)
* Operator (control plane)
* Sidecar Injector (control plane)
* Sentry (optional, control plane)
* Kafka cluster with 3 replicas

## Performance summary for Dapr v1.12

The Pub/Sub API is used to publish messages to a message broker. Dapr accepts requests from the app via HTTP or gRPC, wraps them in a cloud event if needed, and sends the request to the message broker.
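
As a rough illustration of the flow in the paragraph above, the sketch below builds the sidecar's HTTP publish URL and shows what wrapping a payload in a CloudEvents 1.0 envelope looks like. It is not the Dapr runtime's implementation. The sidecar address (`localhost:3500`), the pub/sub component name, topic, source, and event type are all hypothetical.

```python
import uuid

# Assumption: the sidecar listens on the default HTTP port 3500.
DAPR_HTTP_BASE = "http://localhost:3500"


def publish_url(pubsub_name: str, topic: str) -> str:
    """URL an app would POST a message to in order to publish via the sidecar."""
    return f"{DAPR_HTTP_BASE}/v1.0/publish/{pubsub_name}/{topic}"


def wrap_in_cloud_event(payload: dict, source: str, event_type: str) -> dict:
    """Wrap a payload in a CloudEvents 1.0 envelope unless it already is one.

    Illustrative only: this "already a cloud event" check is deliberately
    naive and is not how the Dapr runtime decides.
    """
    if payload.get("specversion"):
        return payload
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        "datacontenttype": "application/json",
        "data": payload,
    }


# Hypothetical component name, topic, source, and event type.
event = wrap_in_cloud_event({"orderId": 1}, "checkout", "com.example.order.created")
print(publish_url("orderpubsub", "orders"))
print(sorted(event))
```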

Performance varies based on the underlying message broker. The Pub/Sub performance test measures the added latency when publishing a message with Dapr compared with the baseline latency when publishing directly to the message broker.
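
The "added latency" metric described above is a simple difference between two measurements at the same percentile. A minimal sketch, using made-up numbers rather than benchmark data:

```python
def added_latency_ms(with_dapr_ms: float, baseline_ms: float) -> float:
    """Latency attributable to Dapr: the measurement through the sidecar
    minus the measurement taken directly against the backing service."""
    return round(with_dapr_ms - baseline_ms, 2)


# Hypothetical 90th percentile latencies in milliseconds.
baseline_p90 = 5.0   # publishing directly to the broker
with_dapr_p90 = 5.8  # publishing through the sidecar
print(added_latency_ms(with_dapr_p90, baseline_p90))
```

Because the baseline is subtracted out, the reported figure reflects the sidecar's overhead rather than the speed of the underlying broker or store.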

### Kubernetes performance test setup

The test was conducted on a 3 node Kubernetes cluster, using commodity hardware running 4 cores and 8GB of RAM, without any network acceleration.
Review comment (Member):

Is there a link to the performance test that we can include, so that people can try this themselves?


Test parameters:

* 1000 requests per second
* 1 replica
* 1 minute duration
* Sidecar limited to 0.5 vCPU
* Sidecar telemetry enabled (tracing with a sampling rate of 0.1)
* Payload size of 1 KB

### Results

* The requested throughput was 1000 qps
* The actual throughput was 1000 qps
* Added latency for 90th percentile was 0.64ms for gRPC and 0.49ms for HTTP
* Added latency for 99th percentile was 1.91ms for gRPC and 1.21ms for HTTP
* Dapr app consumed ~0.2 vCPU and ~30Mb of Memory for both gRPC and HTTP
* No app restarts
* No sidecar restarts
@@ -29,7 +29,7 @@ For more information see [overview of Dapr in self-hosted mode]({{< ref self-hos

For more information see [overview of Dapr on Kubernetes]({{< ref kubernetes-overview.md >}}).

-## Performance summary for Dapr v1.0
+## Performance summary for Dapr v1.12

The service invocation API is a reverse proxy with built-in service discovery to connect to other services. This includes tracing, metrics, mTLS for in-transit encryption of traffic, together with resiliency in the form of retries for network partitions and connection errors.
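
To show what calling through this reverse proxy looks like from the app's side, here is a minimal sketch that builds the sidecar's HTTP invoke URL. The sidecar address (`localhost:3500`, the default HTTP port), the target app ID, and the method name are hypothetical.

```python
from urllib.parse import quote

# Assumption: the sidecar listens on the default HTTP port 3500.
DAPR_HTTP_BASE = "http://localhost:3500"


def invoke_url(app_id: str, method: str) -> str:
    """Build the local sidecar URL that forwards a call to another app.

    The caller addresses the target by app ID only; discovery of where
    that app runs is left to the sidecar.
    """
    return f"{DAPR_HTTP_BASE}/v1.0/invoke/{quote(app_id, safe='')}/method/{method.lstrip('/')}"


# Hypothetical app ID and method path.
print(invoke_url("order-service", "orders/42"))
```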

@@ -59,10 +59,10 @@ When running in a highly available production setup, the Dapr control plane cons

| Component | vCPU | Memory
| ------------- | ------------- | -------------
-| Operator | 0.001 | 12.5 Mb
-| Sentry | 0.005 | 13.6 Mb
-| Sidecar Injector | 0.002 | 14.6 Mb
-| Placement | 0.001 | 20.9 Mb
+| Operator | 0.003 | 18 Mb
Review comment (Member):

Is this the only place where we capture the control plane numbers, and are they the same for all the performance tests? If so, we should separate them out. If not, should each perf test have its own control plane numbers like this one?

+| Sentry | 0.01 | 33 Mb
+| Sidecar Injector | 0.008 | 17 Mb
+| Placement | 0.005 | 25 Mb
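
For a quick sense of the overall control plane footprint, the updated rows of this table (Operator 0.003 / 18 Mb, Sentry 0.01 / 33 Mb, Sidecar Injector 0.008 / 17 Mb, Placement 0.005 / 25 Mb) can be summed. This sketch assumes each row is the figure for the whole component; whether that holds with multiple replicas is not stated here.

```python
# Updated (v1.12) values from the table: component -> (vCPU, memory in Mb).
control_plane = {
    "Operator": (0.003, 18),
    "Sentry": (0.01, 33),
    "Sidecar Injector": (0.008, 17),
    "Placement": (0.005, 25),
}

# Round the vCPU sum to avoid floating-point noise in the last digits.
total_vcpu = round(sum(cpu for cpu, _ in control_plane.values()), 3)
total_memory_mb = sum(mem for _, mem in control_plane.values())

print(f"Control plane total: {total_vcpu} vCPU, {total_memory_mb} Mb")
```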

There are a number of variants that affect the CPU and memory consumption for each of the system components. These variants are shown in the table below.

@@ -75,18 +75,10 @@ There are a number of variants that affect the CPU and memory consumption for ea

### Data plane performance

-The Dapr sidecar uses 0.48 vCPU and 23Mb per 1000 requests per second.
-End-to-end, the Dapr sidecars (client and server) add ~1.40 ms to the 90th percentile latency, and ~2.10 ms to the 99th percentile latency. End-to-end here is a call from one app to another app receiving a response. This is shown by steps 1-7 in [this diagram]({{< ref service-invocation-overview.md >}}).

-This performance is on par or better than commonly used service meshes.

-### Latency

-In the test setup, requests went through the Dapr sidecar both on the client side (serving requests from the load tester tool) and the server side (the target app).
-mTLS and telemetry (tracing with a sampling rate of 0.1) and metrics were enabled on the Dapr test, and disabled for the baseline test.

-<img src="/images/perf_invocation_p90.png" alt="Latency for 90th percentile">

-<br>
+The Dapr sidecar uses 0.45 vCPU and 38Mb per 1000 requests per second.
Review comment (Member):

Suggested change
-The Dapr sidecar uses 0.45 vCPU and 38Mb per 1000 requests per second.
+The Dapr sidecar uses 0.45 vCPU and 38Mb per 1000 requests per second.

+End-to-end, the Dapr sidecars (client and server) add ~1.20 ms to the 90th percentile latency, and ~2.50 ms to the 99th percentile latency. End-to-end here is a call from one app to another app receiving a response. This is shown by steps 1-7 in [this diagram]({{< ref service-invocation-overview.md >}}).

-<img src="/images/perf_invocation_p99.png" alt="Latency for 99th percentile">
+This performance is on par or better than commonly used service meshes.
Review comment (Member):

Let's quantify this with a link to service mesh perf results; otherwise this is hearsay and we should not include it.

Review comment (Member):

With the graphs removed, there is no mTLS measurement comparison. What impact does mTLS have on the performance tests? It would be good to at least clarify this with a % increase. Most people will use mTLS, so this number is important.

@@ -0,0 +1,50 @@
---
type: docs
title: "State performance"
linkTitle: "State performance"
weight: 20000
Review comment (Member):

Suggested change
-weight: 20000
+weight: 40000

description: ""
---
This article provides state API performance benchmarks and resource utilization in Dapr on Kubernetes.

## System overview

Dapr consists of a data plane, the sidecar that runs next to your app, and a control plane that configures the sidecars and provides capabilities such as cert and identity management.

### Kubernetes components

* Sidecar (data plane)
* Placement (required for actors, control plane mapping actor types to hosts)
* Operator (control plane)
* Sidecar Injector (control plane)
* Sentry (optional, control plane)
* PostgreSQL database (single node)

## Performance summary for Dapr v1.12

The state API is used to persist state to a database, commonly called state store in Dapr.

Performance varies based on the underlying state store. The state API performance test measures the added latency when using Dapr to get state compared with the baseline latency when getting state directly from the state store.
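
For context, the "get state" operation being measured corresponds to a single read of one key through the sidecar. The sketch below builds the HTTP form of that request for illustration; the results on this page are reported for gRPC. The sidecar address (`localhost:3500`, the default HTTP port), the state store component name, and the key are hypothetical.

```python
from urllib.parse import quote

# Assumption: the sidecar listens on the default HTTP port 3500.
DAPR_HTTP_BASE = "http://localhost:3500"


def get_state_url(store_name: str, key: str) -> str:
    """URL an app would GET to read one key from a named state store."""
    return f"{DAPR_HTTP_BASE}/v1.0/state/{quote(store_name, safe='')}/{quote(key, safe='')}"


# Hypothetical state store component name and key.
print(get_state_url("statestore", "order-42"))
```

The baseline in the comparison is the same read issued directly against the database, without the sidecar in the path.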

### Kubernetes performance test setup

The test was conducted on a 3 node Kubernetes cluster, using commodity hardware running 4 cores and 8GB of RAM, without any network acceleration.

Test parameters:

* 1000 requests per second
* 1 replica
* 1 minute duration
* Sidecar limited to 0.5 vCPU
* Sidecar telemetry enabled (tracing with a sampling rate of 0.1)
* Payload size of 1 KB

### Results

* The requested throughput was 1000 qps
* The actual throughput was 1000 qps
* Added latency for 90th percentile was 0.75ms for gRPC
* Added latency for 99th percentile was 1.52ms for gRPC
* Dapr app consumed ~0.3 vCPU and ~48Mb of Memory for gRPC
* No app restarts
* No sidecar restarts