
What's the best way to impose global rate limit on all interactions with LLMs? #1030

Open
hslee16 opened this issue Dec 18, 2024 · 1 comment


hslee16 commented Dec 18, 2024

Is your feature request related to a problem? Please describe.
It seems that, much of the time, the agents time out or exit because they exhaust the rate limits imposed by the various LLM providers supported by GPT-R.

Describe the solution you'd like
Perhaps we could apply a global rate limiter, either by default or via configuration.
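A global limiter could be sketched with nothing but the standard library. This is a minimal sliding-window example, not GPT-R's actual code; the class name and parameters are my own invention:

```python
import asyncio
import time

class GlobalRateLimiter:
    """Allow at most `max_calls` LLM requests per `period` seconds, process-wide.

    A single shared instance would be awaited before every provider call.
    """

    def __init__(self, max_calls: int, period: float):
        self.max_calls = max_calls
        self.period = period
        self._timestamps: list[float] = []
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self._lock:
            now = time.monotonic()
            # Drop timestamps that have left the sliding window.
            self._timestamps = [t for t in self._timestamps if now - t < self.period]
            if len(self._timestamps) >= self.max_calls:
                # Sleep until the oldest call ages out of the window.
                await asyncio.sleep(self.period - (now - self._timestamps[0]))
            self._timestamps.append(time.monotonic())
```

Holding the lock while sleeping deliberately serializes waiting callers, which is acceptable (and arguably desirable) for a process-wide cap.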

Describe alternatives you've considered
Using timeouts, or wrapping the call into GPT-R in a try/except. This is very clunky because the entire research-and-report generation process has to be started over from scratch.
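For illustration, the try/except workaround might look like the sketch below (the retry helper and the `coro_factory` argument are hypothetical, not GPT-R API); it shows why this is clunky, since each retry re-runs the whole pipeline:

```python
import asyncio
import random

async def run_with_retries(coro_factory, max_attempts=3, base_delay=5.0):
    """Re-run the whole research pipeline when a rate-limit error surfaces.

    `coro_factory` is a zero-arg callable returning a fresh coroutine,
    since a coroutine object cannot be awaited twice.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await coro_factory()
        except Exception:  # ideally catch the provider's specific RateLimitError
            if attempt == max_attempts:
                raise
            # Exponential backoff with a little jitter before restarting.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
```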

Additional context
Browsing through the GitHub issues, it seems several other users are running into the rate limits imposed by LLM providers.

ElishaKay commented Dec 20, 2024

Sup @hslee16

For the regular reports, maybe another method on the GPTResearcher class?

For the global option, perhaps a timeout handler inside the create_chat_completion helper function? That seems to be the helper that's used whenever an LLM is called, and I assume it's awaited everywhere it's used.
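A rough sketch of what that wiring could look like. The helper name `create_chat_completion` comes from the thread, but the semaphore, the `llm_call`/`timeout` parameters, and the cap of 5 are all assumptions for illustration:

```python
import asyncio

# Hypothetical: a module-level semaphore capping concurrent in-flight LLM
# requests across the whole process, plus a per-call timeout.
_llm_semaphore = asyncio.Semaphore(5)  # assumed cap of 5 concurrent requests

async def create_chat_completion(messages, llm_call, timeout=60.0, **kwargs):
    """Sketch of a globally throttled helper.

    `llm_call` stands in for whatever provider coroutine the real helper
    dispatches to; `timeout` bounds each individual request.
    """
    async with _llm_semaphore:
        return await asyncio.wait_for(llm_call(messages, **kwargs), timeout=timeout)
```

Because every call site already awaits this helper, throttling here would apply globally without touching the agents themselves.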
