
Commit f88c083

docs: Update performance guide (#11969)
* first draft
* address initial feedback
* address more feedback
1 parent a15a69b commit f88c083

1 file changed (+64, -22 lines)

docs/processing-performance.asciidoc

@@ -5,40 +5,82 @@ APM Server performance depends on a number of factors: memory and CPU available,
 network latency, transaction sizes, workload patterns,
 agent and server settings, versions, and protocol.

-Let's look at a simple example that makes the following assumptions:
+We tested several scenarios to help you understand how to size the APM Server so that it can keep up with the load that your Elastic APM agents are sending:

-* The load is generated in the same region as where APM Server and {es} are deployed.
-* We're using the default settings in cloud.
-* A small number of agents are reporting.
-
-This leaves us with relevant variables like payload and instance sizes.
-See the table below for approximations.
-As a reminder, events are
+* Using the default hardware template on AWS, GCP and Azure on {ecloud}.
+* For each hardware template, testing with several sizes: 1 GB, 4 GB, 8 GB, and 32 GB.
+* For each size, using a fixed number of APM agents: 10 agents for 1 GB, 30 agents for 4 GB, 60 agents for 8 GB, and 240 agents for 32 GB.
+* In all scenarios, using medium sized events. Events include
 <<data-model-transactions,transactions>> and
 <<data-model-spans,spans>>.

+NOTE: You will also need to scale up {es} accordingly, potentially with an increased number of shards configured.
+For more details on scaling {es}, refer to the {ref}/scalability.html[{es} documentation].
+
+The results below include numbers for a synthetic workload. You can use the results of our tests to guide
+your sizing decisions, however, *performance will vary based on factors unique to your use case* like your
+specific setup, the size of APM event data, and the exact number of agents.
+
+:hardbreaks-option:
+
 [options="header"]
-|=======================================================================
-|Transaction/Instance |512 MB Instance |2 GB Instance |8 GB Instance
-|Small transactions
+|====
+| Profile / Cloud | AWS | Azure | GCP

-_5 spans with 5 stack frames each_ |600 events/second |1200 events/second |4800 events/second
-|Medium transactions
+| *1 GB*
+(10 agents)
+| 9,000
+events/second
+| 6,000
+events/second
+| 9,000
+events/second

-_15 spans with 15 stack frames each_ |300 events/second |600 events/second |2400 events/second
-|Large transactions
+| *4 GB*
+(30 agents)
+| 25,000
+events/second
+| 18,000
+events/second
+| 17,000
+events/second

-_30 spans with 30 stack frames each_ |150 events/second |300 events/second |1400 events/second
-|=======================================================================
+| *8 GB*
+(60 agents)
+| 40,000
+events/second
+| 26,000
+events/second
+| 25,000
+events/second

-In other words, a 512 MB instance can process \~3 MB per second,
-while an 8 GB instance can process ~20 MB per second.
+| *16 GB*
+(120 agents)
+| 72,000
+events/second
+| 51,000
+events/second
+| 45,000
+events/second

-APM Server is CPU bound, so it scales better from 2 GB to 8 GB than it does from 512 MB to 2 GB.
-This is because larger instance types in {ecloud} come with much more computing power.
+| *32 GB*
+(240 agents)
+| 135,000
+events/second
+| 95,000
+events/second
+| 95,000
+events/second
+
+|====
+
+:!hardbreaks-option:

 Don't forget that the APM Server is stateless.
 Several instances running do not need to know about each other.
 This means that with a properly sized {es} instance, APM Server scales out linearly.

-NOTE: RUM deserves special consideration. The RUM agent runs in browsers, and there can be many thousands reporting to an APM Server with very variable network latency.
+NOTE: RUM deserves special consideration. The RUM agent runs in browsers, and there can be many thousands reporting to an APM Server with very variable network latency.
+
+Alternatively or in addition to scaling the APM Server, consider
+decreasing the ingestion volume. Read more in <<reduce-apm-storage>>.
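
To make the NOTE about scaling {es} more concrete, here is a minimal sketch (ours, not part of the commit) of raising the shard count for newly created indices by installing an index template through the generic `_index_template` API. The endpoint, credentials, template name, index pattern, and shard count are all illustrative assumptions; for managed APM data streams, follow the {es} scalability documentation referenced in the updated docs.

[source,python]
----
# Hypothetical example: install an index template that raises the shard count
# for new indices matching an assumed pattern. Endpoint, credentials, template
# name, pattern, and shard count are placeholders, not values from the docs.
import requests

ES_URL = "https://localhost:9200"      # assumed Elasticsearch endpoint
AUTH = ("elastic", "changeme")         # assumed credentials

template = {
    "index_patterns": ["traces-apm-*"],                   # assumed APM index pattern
    "priority": 500,                                      # example priority
    "template": {
        "settings": {"index": {"number_of_shards": 4}},   # example shard count
    },
}

resp = requests.put(
    f"{ES_URL}/_index_template/custom-apm-traces",
    json=template,
    auth=AUTH,
)
resp.raise_for_status()
print(resp.json())   # {'acknowledged': True} on success
----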
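
The following sketch (again ours, not from the commit) shows one way to use the benchmark table above when sizing: given a cloud provider and a target event rate, pick the smallest tested profile whose measured throughput covers it with some headroom. The events/second figures are copied from the table; the helper function and the 20% headroom are illustrative assumptions, and real throughput will vary as the docs note.

[source,python]
----
# events/second measured per profile in the synthetic-workload tests above,
# keyed by cloud provider.
BENCHMARKS = {
    "aws":   [("1 GB", 9_000), ("4 GB", 25_000), ("8 GB", 40_000), ("16 GB", 72_000), ("32 GB", 135_000)],
    "azure": [("1 GB", 6_000), ("4 GB", 18_000), ("8 GB", 26_000), ("16 GB", 51_000), ("32 GB", 95_000)],
    "gcp":   [("1 GB", 9_000), ("4 GB", 17_000), ("8 GB", 25_000), ("16 GB", 45_000), ("32 GB", 95_000)],
}

def smallest_profile(provider: str, target_eps: int, headroom: float = 0.2) -> str:
    """Return the smallest tested size whose throughput exceeds target_eps plus headroom."""
    needed = target_eps * (1 + headroom)
    for size, eps in BENCHMARKS[provider.lower()]:
        if eps >= needed:
            return size
    # Beyond the largest tested size: APM Server is stateless, so scale out instead.
    return "scale out: multiple 32 GB instances behind a load balancer"

if __name__ == "__main__":
    print(smallest_profile("gcp", 20_000))   # prints "8 GB"
----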
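
As an illustration of the alternative mentioned in the last added paragraph (reducing ingestion volume rather than scaling up), this hedged snippet configures the Elastic APM Python agent with a lowered transaction sample rate so fewer events reach APM Server. The service name, server URL, and 10% rate are assumptions; <<reduce-apm-storage>> covers the full set of options.

[source,python]
----
# Hypothetical agent setup: keep roughly 10% of transactions by lowering the
# sample rate. Service name, server URL, and rate are placeholder values.
import elasticapm

client = elasticapm.Client(
    service_name="my-service",            # assumed service name
    server_url="http://localhost:8200",   # assumed APM Server endpoint
    transaction_sample_rate=0.1,          # sample ~10% of transactions
)
----

Most Elastic APM agents expose the same option through their environment-variable configuration (for the Python agent, `ELASTIC_APM_TRANSACTION_SAMPLE_RATE`), which avoids code changes.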
