Added instructions for using throttled-py to smooth API calls #1886

Open: 1 commit to be merged into base branch main


ZhuoZhuoCrayon commented Jun 7, 2025

Summary

Added instructions for using throttled-py to smooth API calls.

Motivation

Simply backing off and retrying will waste part of the request budget on unnecessary retries.

With the GCRA rate-limiting strategy that throttled-py provides in Wait & Retry mode, API calls are paced on the client side so that they stay within the OpenAI API rate limits.

For example, per_sec(2, burst=2) allows 2 requests per second with a burst of 2 requests (🪣 the bucket's capacity). In other words, the first 2 requests consume the burst immediately, and every later request waits for a token to refill, one every 0.5s. With timeout>=0.5, the example below completes all 5 requests in about 1.5 seconds (the burst serves the first 2 requests at once, and the remaining 3 are served over the following 1.5s):

import time
from throttled import RateLimiterType, Throttled, rate_limiter

@Throttled(
    key="chat.completions",
    using=RateLimiterType.GCRA.value,
    quota=rate_limiter.per_sec(2, burst=2),
    timeout=0.5,  # ⏳ Set timeout=0.5 to enable wait-and-retry (max wait 0.5 seconds).
)
def call_chat_completions(**kwargs):
    # Placeholder for the real chat completions API call.
    pass

def test_throttled():
    # Make 5 sequential requests
    start_time = time.time()
    for i in range(5):
        call_chat_completions()
        print(f"Request {i+1} completed at {time.time() - start_time:.2f}s")

    total_time = time.time() - start_time
    print(f"\nTotal time for 5 requests at 2/sec: {total_time:.2f}s")

if __name__ == "__main__":
    test_throttled()

Testing Throttled rate limiter...
------------- Burst ---------------------------
Request 1 completed at 0.00s
Request 2 completed at 0.00s
-----------------------------------------------
------------ Refill: 1 token every 0.5s -------
Request 3 completed at 0.50s
Request 4 completed at 1.00s
Request 5 completed at 1.50s
-----------------------------------------------

Total time for 5 requests at 2/sec: 1.50s
Expected time: ~1.5s
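
The timings above can be reproduced without throttled-py or any real sleeping. The following is a minimal sketch of the GCRA bookkeeping on a simulated clock; it is not throttled-py's actual implementation, and the function name `simulate_gcra` and its parameters are hypothetical, chosen only for illustration. It assumes a burst of N means N requests may pass immediately, which matches the output shown above.

```python
def simulate_gcra(n_requests, rate_per_sec, burst, max_wait):
    """Return simulated completion times for back-to-back requests.

    Illustrative sketch only: uses a simulated clock instead of sleeping,
    and is not the throttled-py implementation.
    """
    interval = 1.0 / rate_per_sec        # time needed to refill one token
    tolerance = interval * (burst - 1)   # how far ahead of schedule a request may run
    now = 0.0                            # simulated clock
    tat = 0.0                            # theoretical arrival time of the next request
    completed = []
    for _ in range(n_requests):
        # Earliest moment at which this request conforms to the limit.
        allow_at = max(tat - tolerance, now)
        wait = allow_at - now
        if wait > max_wait:
            # Mirrors the role of `timeout`: give up instead of waiting too long.
            raise TimeoutError(f"would need to wait {wait:.2f}s")
        now = allow_at                   # "sleep" until the request is allowed
        tat = max(tat, now) + interval   # push the schedule forward by one interval
        completed.append(now)
    return completed


print(simulate_gcra(5, rate_per_sec=2, burst=2, max_wait=0.5))
```

With rate_per_sec=2 and burst=2, the first two requests pass at 0s and each following request has to wait exactly 0.5s, which is why a timeout below 0.5 would reject the third request instead of delaying it.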

For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

  • I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
  • I have conducted a self-review of my content based on the contribution guidelines:
    • Relevance: This content is related to building with OpenAI technologies and is useful to others.
    • Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
    • Spelling and Grammar: I have checked for spelling or grammatical mistakes.
    • Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
    • Correctness: The information I include is correct and all of my code executes successfully.
    • Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.

ZhuoZhuoCrayon force-pushed the dev branch 2 times, most recently from 7780f43 to eb79f25, on June 7, 2025 05:09
ZhuoZhuoCrayon commented Jun 7, 2025

@shyamal-anadkat @shikhar-cyber @josiah-openai Looking forward to your reviews.
