Commit

Update rate-limits.mdx
Update embeddings related quotas. Add detail about Free Tier quotas limitations.
fpagny authored Jan 16, 2025
1 parent 940d3e3 commit 2b7886f
Showing 1 changed file with 9 additions and 5 deletions.
14 changes: 9 additions & 5 deletions ai-data/generative-apis/reference-content/rate-limits.mdx
@@ -17,9 +17,13 @@ Any model served through Scaleway Generative APIs gets limited by:
- Tokens per minute
- Queries per minute

+<Message type="tip">
+These limits only apply if you created a Scaleway Account and registered a valid payment method. Otherwise, stricter limits apply to ensure usage stays within Free Tier only.
+</Message>

### Chat models

-| Model string | Requests per minute | Tokens per minute |
+| Model string | Requests per minute | Total Tokens per minute |
|-----------------|-----------------|-----------------|
| `llama-3.1-8b-instruct` | 300 | 100K |
| `llama-3.1-70b-instruct` | 300 | 100K |
@@ -29,10 +33,10 @@ Any model served through Scaleway Generative APIs gets limited by:

### Embedding models

-| Model string | Requests per minute | Tokens per minute |
+| Model string | Requests per minute | Input Tokens per minute |
 |-----------------|-----------------|-----------------|
-| `sentence-t5-xxl` | 600 | 1M |
-| `bge-multilingual-gemma2` | 600 | 1M |
+| `sentence-t5-xxl` | 100 | 200K |
+| `bge-multilingual-gemma2` | 100 | 200K |

## Why do we set rate limits?

@@ -41,4 +45,4 @@ These limits safeguard against abuse or misuse of Scaleway Generative APIs, help
## How can I increase the rate limits?

We actively monitor usage and will improve rates based on feedback.
-If you need to increase your rate limits, contact us via the support team, providing details on the model used and specific use case.
+If you need to increase your rate limits, contact us via the support team, providing details on the model used and specific use case.
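
For readers wondering what the updated embedding quotas (100 requests and 200K input tokens per minute) mean on the client side, here is a minimal sketch of a sliding-window throttle. It is not part of this commit or of any Scaleway SDK: the `MinuteBudget` class and its methods are hypothetical names, the 60-second sliding window is an assumption about how the limits are enforced, and the caller is assumed to estimate token counts itself.

```python
from collections import deque
import time


class MinuteBudget:
    """Hypothetical client-side budget for requests and tokens per minute."""

    def __init__(self, max_requests, max_tokens, clock=time.monotonic):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.clock = clock      # injectable so the logic can be tested
        self.events = deque()   # (timestamp, tokens) for calls in the last 60 s

    def _prune(self, now):
        # Forget calls that have left the 60-second window.
        while self.events and now - self.events[0][0] >= 60.0:
            self.events.popleft()

    def wait_time(self, tokens):
        """Return the seconds to wait before a call using `tokens` fits."""
        if tokens > self.max_tokens:
            raise ValueError("a single call exceeds the per-minute token limit")
        now = self.clock()
        self._prune(now)
        requests = len(self.events)
        used = sum(t for _, t in self.events)
        wait = 0.0
        for ts, t in self.events:
            if requests < self.max_requests and used + tokens <= self.max_tokens:
                break
            # Not enough room yet: wait until this call leaves the window.
            requests -= 1
            used -= t
            wait = ts + 60.0 - now
        return wait

    def record(self, tokens):
        """Register a call that was just sent."""
        self.events.append((self.clock(), tokens))


# Values taken from the updated embedding table in this commit.
embedding_budget = MinuteBudget(max_requests=100, max_tokens=200_000)
```

A caller would sleep for `embedding_budget.wait_time(n)` seconds, send the request, then call `embedding_budget.record(n)`. The server remains the source of truth, so this only reduces the chance of hitting the limit rather than guaranteeing it.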
