
What's the best way to impose global rate limit on all interactions with LLMs? #1030

Open
hslee16 opened this issue Dec 18, 2024 · 1 comment


hslee16 commented Dec 18, 2024

Is your feature request related to a problem? Please describe.
It seems that, much of the time, the agents time out or exit because they exhaust the rate limits imposed by the various LLM providers supported by GPT-R.

Describe the solution you'd like
Perhaps we could apply a global rate limiter, either by default or via configuration.
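A global limiter could be sketched with nothing but the standard library. This is a minimal sliding-window example, not GPT-R's actual code; the class name and parameters are my own invention:

```python
import asyncio
import time

class GlobalRateLimiter:
    """Allow at most `max_calls` LLM requests per `period` seconds, process-wide.

    A single shared instance would be awaited before every provider call.
    """

    def __init__(self, max_calls: int, period: float):
        self.max_calls = max_calls
        self.period = period
        self._timestamps: list[float] = []
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self._lock:
            now = time.monotonic()
            # Drop timestamps that have left the sliding window.
            self._timestamps = [t for t in self._timestamps if now - t < self.period]
            if len(self._timestamps) >= self.max_calls:
                # Sleep until the oldest call ages out of the window.
                await asyncio.sleep(self.period - (now - self._timestamps[0]))
            self._timestamps.append(time.monotonic())
```

Holding the lock while sleeping deliberately serializes waiting callers, which is acceptable (and arguably desirable) for a process-wide cap.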

Describe alternatives you've considered
Using timeouts, or wrapping the call into GPT-R in a try/except. This is very clunky because the entire research-and-report generation process has to be started over from scratch.
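For illustration, the try/except workaround might look like the sketch below (the retry helper and the `coro_factory` argument are hypothetical, not GPT-R API); it shows why this is clunky, since each retry re-runs the whole pipeline:

```python
import asyncio
import random

async def run_with_retries(coro_factory, max_attempts=3, base_delay=5.0):
    """Re-run the whole research pipeline when a rate-limit error surfaces.

    `coro_factory` is a zero-arg callable returning a fresh coroutine,
    since a coroutine object cannot be awaited twice.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await coro_factory()
        except Exception:  # ideally catch the provider's specific RateLimitError
            if attempt == max_attempts:
                raise
            # Exponential backoff with a little jitter before restarting.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
```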

Additional context
Browsing through the GitHub issues, it seems several other users are running into the rate limits imposed by LLM providers.

ElishaKay commented Dec 20, 2024

Sup @hslee16

For the regular reports, maybe another method on the GPTResearcher class?

For the global option, perhaps a timeout handler inside the create_chat_completion helper function? That seems to be the helper that's used whenever an LLM is called, and I assume it's awaited everywhere it's used.
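A rough sketch of what that wiring could look like. The helper name `create_chat_completion` comes from the thread, but the semaphore, the `llm_call`/`timeout` parameters, and the cap of 5 are all assumptions for illustration:

```python
import asyncio

# Hypothetical: a module-level semaphore capping concurrent in-flight LLM
# requests across the whole process, plus a per-call timeout.
_llm_semaphore = asyncio.Semaphore(5)  # assumed cap of 5 concurrent requests

async def create_chat_completion(messages, llm_call, timeout=60.0, **kwargs):
    """Sketch of a globally throttled helper.

    `llm_call` stands in for whatever provider coroutine the real helper
    dispatches to; `timeout` bounds each individual request.
    """
    async with _llm_semaphore:
        return await asyncio.wait_for(llm_call(messages, **kwargs), timeout=timeout)
```

Because every call site already awaits this helper, throttling here would apply globally without touching the agents themselves.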
