
Add ability to customise model request retry behaviour #677

Open
Finndersen opened this issue Jan 13, 2025 · 2 comments
Labels
Feature request New feature request

Comments

@Finndersen
Finndersen commented Jan 13, 2025

Currently there appears to be no easy way to customise model request retry behaviour in cases of a request error or an unexpected response.

In one of my projects I've had to put exception handling around agent.run() to catch request failures (e.g. "Model is overloaded" errors from Gemini), but this is inefficient because it retries the entire agent workflow (which may involve multiple model requests).

It would be good to be able to customise the retry logic for each individual model request (Model.request() call), for both:

  • Request errors (due to network issues, rate limiting, authorisation, etc.)
  • Invalid/unexpected model response content/format (to be able to prompt model to correct itself part way through a multi-step interaction)
@samuelcolvin
Member

Agreed 👍.

I will say, you could already do this by implementing your own model that wraps the Gemini model and handles retrying.
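A minimal sketch of that wrapper approach, assuming an async `request()` method on the model (the signature, the class name, and the broad `Exception` catch are placeholders — narrow the `except` clause to the transient errors your provider SDK actually raises, such as overload or rate-limit errors):

```python
import asyncio


class RetryingModel:
    """Hypothetical wrapper: delegates to another model and retries
    transient request() failures with exponential backoff."""

    def __init__(self, wrapped, max_retries: int = 3, base_delay: float = 1.0):
        self.wrapped = wrapped
        self.max_retries = max_retries
        self.base_delay = base_delay

    async def request(self, *args, **kwargs):
        for attempt in range(self.max_retries + 1):
            try:
                return await self.wrapped.request(*args, **kwargs)
            except Exception:  # narrow this to the SDK's transient errors
                if attempt == self.max_retries:
                    raise  # out of retries: surface the original error
                # back off exponentially: base_delay, then 2x, 4x, ...
                await asyncio.sleep(self.base_delay * 2 ** attempt)
```

Because the wrapper only retries the single `request()` call, a transient failure partway through a multi-step agent run doesn't restart the whole workflow.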

You might also be interested in #516.

@Finndersen
Author

I'm not too familiar with all the possible failure modes of the model provider SDKs or APIs that could potentially be resolved automatically by some kind of handler.

There are obviously network- or throttling-related issues that could be resolved by retrying the entire request.

Are there any other error scenarios due to invalid input provided to the model/API that a handler could potentially resolve by modifying the request data or something?

I was also thinking about some way to check the model response before returning to the user, and automatically re-prompt the LLM to do something differently if the response isn't as expected. But this could be done outside the agent.run() interaction.
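That outer check-and-re-prompt loop could be sketched roughly like this; the `run_agent`, `validate`, and `correction_prompt` callables are placeholders for illustration, not a real pydantic-ai API:

```python
def run_with_validation(run_agent, validate, correction_prompt, max_attempts=3):
    """Hypothetical sketch: run the agent, check the result, and re-prompt
    with a corrective message if the response isn't as expected."""
    prompt = None  # first attempt uses the original prompt inside run_agent
    for attempt in range(max_attempts):
        result = run_agent(prompt)
        if validate(result):
            return result
        # build a corrective follow-up prompt from the bad result
        prompt = correction_prompt(result)
    raise ValueError(f"no valid response after {max_attempts} attempts")
```

As noted, this can live entirely outside the agent.run() interaction, at the cost of re-running the whole agent rather than re-prompting mid-run.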
