Update fixing-common-issues.mdx
Add details on the 429: Too Many Requests error
fpagny authored Jan 16, 2025
1 parent f94215e commit 9aae0b9
Showing 1 changed file with 16 additions and 3 deletions.
19 changes: 16 additions & 3 deletions ai-data/generative-apis/troubleshooting/fixing-common-issues.mdx
@@ -13,11 +13,24 @@ dates:

Below are common issues that you may encounter when using Generative APIs, their causes, and recommended solutions.

## 504: Timeout
## 429: Too Many Requests - You exceeded your current quota of requests/tokens per minute

### Cause
- The query is too long.
- The model goes into an infinite loop while processing the input.
- You performed too many API requests over a given minute
- You consumed too many tokens (input and output) with your API requests over a given minute

### Solution
- [Ask our support](https://console.scaleway.com/support/tickets/create) to raise your quota
- Smooth out your API request rate by limiting the number of API requests you perform in parallel (see the sketch after this list)
- Reduce the number of input and output tokens processed by your API requests
- Use [Managed Inference](/ai-data/managed-inference/), where these quotas do not apply (your throughput is only limited by the number of Inference Deployments you provision)
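
As an illustration of smoothing out your request rate, here is a minimal Python sketch that caps the number of parallel calls with a small worker pool and retries with exponential backoff when a `429` is returned. The endpoint URL, model name, and `SCW_SECRET_KEY` environment variable are placeholder assumptions for the example, not values confirmed by this page.

```python
import os
import time
import requests
from concurrent.futures import ThreadPoolExecutor

# Placeholder endpoint, model, and credentials -- adjust to your own setup.
API_URL = "https://api.scaleway.ai/v1/chat/completions"
MODEL = "llama-3.1-8b-instruct"
HEADERS = {"Authorization": f"Bearer {os.environ['SCW_SECRET_KEY']}"}

def chat(prompt: str, max_retries: int = 5) -> dict:
    """Send one chat completion request, backing off when rate-limited."""
    payload = {"model": MODEL, "messages": [{"role": "user", "content": prompt}]}
    for attempt in range(max_retries):
        response = requests.post(API_URL, headers=HEADERS, json=payload, timeout=60)
        if response.status_code == 429:
            # Quota exceeded: wait with exponential backoff before retrying.
            time.sleep(2 ** attempt)
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError("Still rate-limited after retries")

# Limiting parallelism spreads requests over time instead of sending a burst.
prompts = [f"Summarize item {i}" for i in range(20)]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(chat, prompts))
```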


## 504: Gateway Timeout

### Cause
- The query is too long to process (even if the context length stays [within the supported context window and maximum tokens](https://www.scaleway.com/en/docs/ai-data/generative-apis/reference-content/supported-models/))
- The model goes into an infinite loop while processing the input (which is a known structural issue with several AI models)

### Solution
- Set a stricter **maximum token limit** to prevent overly long responses.
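
For illustration, the sketch below caps the response length by passing a `max_tokens` value in the request body. It reuses the same placeholder endpoint, model name, and credentials as the sketch above; adjust them to your own setup.

```python
import os
import requests

# Placeholder endpoint and credentials -- adjust to your own setup.
API_URL = "https://api.scaleway.ai/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['SCW_SECRET_KEY']}"}

payload = {
    "model": "llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "Explain rate limiting briefly."}],
    # A strict output cap keeps responses short and reduces the risk of a
    # gateway timeout caused by very long or runaway generations.
    "max_tokens": 512,
}

response = requests.post(API_URL, headers=HEADERS, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```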
