Skip to content

Commit

Permalink
Merge pull request #12974 from RasaHQ/ATO-1628
Browse files Browse the repository at this point in the history
Additional load testing recommendations
  • Loading branch information
sanchariGr authored Jan 5, 2024
2 parents b1199ca + 64546fc commit fc72dd8
Show file tree
Hide file tree
Showing 2 changed files with 15 additions and 0 deletions.
1 change: 1 addition & 0 deletions .github/workflows/continous-integration.yml
Original file line number Diff line number Diff line change
Expand Up @@ -290,6 +290,7 @@ jobs:
- name: Prevent race condition in poetry build
# More context about race condition during poetry build can be found here:
# https://github.com/python-poetry/poetry/issues/7611#issuecomment-1747836233
if: needs.changes.outputs.backend == 'true'
run: |
poetry config installer.max-workers 1
Expand Down
14 changes: 14 additions & 0 deletions docs/docs/monitoring/load-testing-guidelines.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -12,12 +12,26 @@ In order to gather metrics on our system's ability to handle increased loads and
In each test case we spawned the following number of concurrent users at peak concurrency using a [spawn rate](https://docs.locust.io/en/1.5.0/configuration.html#all-available-configuration-options) of 1000 users per second.
In our tests we used the Rasa [HTTP-API](https://rasa.com/docs/rasa/pages/http-api) and the [Locust](https://locust.io/) open source load testing tool.


| Users | CPU | Memory |
|--------------------------|----------------------------------------------|---------------|
| Up to 50,000 | 6vCPU | 16 GB |
| Up to 80,000 | 6vCPU, with almost 90% CPU usage | 16 GB |


### Some recommendations to improve latency
- Sanic Workers must be mapped 1:1 to CPU for both Rasa Pro and Rasa Action Server
- Create `async` actions to avoid any blocking I/O
- `enable_selective_domain: true` : Domain is only sent for actions that needs it. This massively trims the payload between the two pods.
- Consider using compute efficient machines on cloud which are optimized for high performance computing such as the C5 instances on AWS.
However, as they are low on memory, models need to be trained lightweight.


| Machine | RasaPro | Rasa Action Server |
|--------------------------------|------------------------------------------------|--------------------------------------------------|
| AWS C5 or Azure F or Gcloud C2 | 3-7vCPU, 10-16Gb Memory, 3-7 Sanic Threads | 3-7vCPU, 2-12Gb Memory, 3-7 Sanic Threads |


### Debugging bot related issues while scaling up

To test the Rasa [HTTP-API](https://rasa.com/docs/rasa/pages/http-api) ability to handle a large number of concurrent user activity we used the Rasa Pro [tracing](./tracing.mdx) capability
Expand Down

0 comments on commit fc72dd8

Please sign in to comment.