Simplify + update rate limit docs #175

Merged: 3 commits, Oct 16, 2024
Changes from 2 commits
54 changes: 15 additions & 39 deletions fern/pages/going-to-production/rate-limits.mdx
@@ -4,52 +4,28 @@ slug: "docs/rate-limits"

hidden: false

description: "This page describes the limitations around Cohere's API."
description: "This page describes Cohere API rate limits for production and evaluation keys."
image: "../../assets/images/f1cc130-cohere_meta_image.jpg"
keywords: "Cohere, large language model API"

createdAt: "Thu Feb 29 2024 18:20:04 GMT+0000 (Coordinated Universal Time)"
updatedAt: "Wed Jun 05 2024 20:23:51 GMT+0000 (Coordinated Universal Time)"
---
Cohere offers two kinds of API keys: trial keys (with a variety of attendant limitations), and production keys (which have no such limitations).
Cohere offers two kinds of API keys: evaluation keys (free but limited in usage), and production keys (paid and not limited in usage). You can create an evaluation or production key on [the API keys page](https://dashboard.cohere.com/api-keys). For more details on pricing please see our [pricing docs](https://docs.cohere.com/v2/docs/how-does-cohere-pricing-work).

In this document, we'll discuss some of the limitations associated with a trial key before turning to the parameters of a production key.
| Endpoint | Evaluation rate limit | Production rate limit |
| ------------------------------------------ | --------------------- | --------------------- |
| [Chat](/reference/chat) | 20/min | no limit |
| [Embed](/reference/embed) | 100,000/min | no limit |
| [EmbedJob](/reference/embed-jobs) | 5/min | no limit |
| [Rerank](/reference/rerank) | 10/min | no limit |
| [Generate (legacy)](/reference/generate) | 5/min | no limit |
| [Summarize (legacy)](/reference/summarize) | 5/min | no limit |

## Trial Key Limitations

Trial keys are rate-limited depending on the endpoint you want to use:
All endpoints are limited to 1,000 calls per month with an evaluation key.
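
To stay under the per-minute evaluation limits in the table above, a client can throttle itself before sending requests. The following is a minimal sketch of a sliding-window limiter, not part of any Cohere SDK: the class name `SlidingWindowLimiter`, the dictionary keys, and the `clock`/`sleep` parameters are hypothetical names chosen for illustration. It does not track the 1,000 calls per month cap.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: at most `max_calls` in any `window_s`-second window."""

    def __init__(self, max_calls, window_s=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window_s = window_s
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._stamps = deque()   # timestamps of calls still inside the window

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.window_s:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.window_s - (now - self._stamps[0]))


# Per-minute evaluation limits copied from the table above (key names are illustrative).
EVALUATION_LIMITS_PER_MIN = {
    "chat": 20,
    "embed": 100_000,
    "embed_jobs": 5,
    "rerank": 10,
    "generate": 5,
    "summarize": 5,
}

chat_limiter = SlidingWindowLimiter(EVALUATION_LIMITS_PER_MIN["chat"])
```

Calling `chat_limiter.acquire()` immediately before each Chat request keeps a single process within 20 calls per minute. Because evaluation keys in an organization share one limit, a limiter held in one process cannot account for calls made elsewhere.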

- [Chat](/reference/chat): 20/min
- Cluster: 5 calls/min
- [Embed](/reference/embed): 5 calls/min
- EmbedJob: 5 calls/min
- [Rerank](/reference/rerank-1): 10 calls/min
- Generate (legacy): 5 calls/min
- Summarize (legacy): 5 calls/min

[Chat](/reference/chat) and the [Coral user interface](https://coral.cohere.ai/) are limited to a total of 1,000 calls a month with a trial key. All remaining endpoints are limited to a total of 1,000 calls per month with a trial key.

If you’d like to use Cohere endpoints in a production application or require higher throughput from our endpoints for your usage, you can upgrade to a production key.

With a trial key:

- Organizations can still have unlimited trial keys in the free tier.
- There is a defined usage limit on all the development API keys per minute (all keys add up to that rate limit).
- When a developer/org reaches a rate limit, they will receive an error that they have exceeded the limit/minute.
- Playground usage counts toward your trial key rate limit.
- If calls exceed the throttling, we throw an error that says “Trial keys are throttled.” Please upgrade your API key or contact us directly on <a href="https://discord.com/invite/co-mmunity" target="_blank">Discord</a>.
- Trial keys are free to use even after you upgrade to a Production key.
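
Since a throttled call returns an error rather than queueing, callers typically retry after a pause. Below is a hedged sketch of retrying with exponential backoff and jitter. `RateLimitError` is a hypothetical stand-in for whatever exception your client raises when the limit is exceeded, and `call_with_backoff` and its parameters are illustrative names, not part of the Cohere API.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error raised when a key is throttled."""


def call_with_backoff(fn, max_attempts=5, base_delay_s=1.0, sleep=time.sleep):
    """Call `fn`, retrying on RateLimitError with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt; jitter adds up to 50% on top.
            delay = base_delay_s * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

Wrap the request in a zero-argument callable, for example `call_with_backoff(lambda: send_request(payload))`, where `send_request` is your own function that raises `RateLimitError` on a throttling response.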

## Production Key Specifications

Production keys for all endpoints are rate-limited at 1,000 calls per minute and are intended for serving Cohere in a public-facing application and testing purposes. Usage of production keys is metered at price points which can be found on our [pricing page](/docs/how-does-cohere-pricing-work).

To get a production key, start by navigating to the [API Keys](https://dashboard.cohere.com/api-keys) page in your Cohere dashboard. You'll either need to be the admin of your organization, or ask your organization Admin to complete these steps.

![](../../assets/images/1d24fd7-Screenshot_2024-07-01_at_10.33.04_AM.png)

From there, click on _Create Production key_ to finish the process.

![](../../assets/images/27062e8-Screenshot_2024-07-01_at_10.33.54_AM.png)

The whole process should complete in less than three minutes, and enables you to generate a production key that you can use to serve Cohere APIs in production. If you deploy without completing the go to production workflow, your API key may be temporarily or permanently revoked.
Organizational evaluation key limitations:
- Organizations can have unlimited evaluation keys.
- Organization evaluation keys share the rate limits enumerated above.
- Playground and Chat UI usage counts toward the evaluation key rate limit.