
Summary / auxiliary LLM #5464

Open
enyst opened this issue Dec 8, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@enyst
Collaborator

enyst commented Dec 8, 2024

Summary

Add support for users to configure a secondary, lightweight LLM alongside their main LLM. This auxiliary LLM would handle quick, repetitive tasks where using their primary (potentially expensive/slower) LLM would be overkill.

If no auxiliary LLM is configured, we fall back to the existing hardcoded solutions or to the main LLM, as we already do. A rough sketch of that fallback is below.
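
To illustrate, here is a minimal sketch of the intended fallback behavior, assuming a simple dataclass-style config. The names `LLMConfig`, `AppConfig`, `auxiliary_llm`, and `get_task_llm_config` are hypothetical and only for illustration, not existing OpenHands code:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMConfig:
    model: str
    api_key: str = ""


@dataclass
class AppConfig:
    llm: LLMConfig                              # main, powerful model
    auxiliary_llm: Optional[LLMConfig] = None   # optional cheap/fast model


def get_task_llm_config(config: AppConfig) -> LLMConfig:
    """Config to use for quick, repetitive tasks (branch names, loop checks)."""
    # If the user did not configure an auxiliary LLM, fall back to the main one.
    return config.auxiliary_llm or config.llm
```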

Motivation

Users configure their own LLMs in our project, and we recommend powerful models. Using these for simple tasks like generating branch names or checking for agent loops is unnecessarily expensive and slow. We could:

  • give users the option to use a smaller, faster LLM for simple tasks (not agent tasks)
  • save cost and reduce latency / rate-limit pressure on the main LLM

Use Cases

The resolver currently has a couple of cases that are examples of what we could use such an LLM for (guessing success, review).

In addition:

Alternatives to Consider

  • multiple LLMs, hopefully optimized per task
  • use the main LLM with specialized prompts
@xingyaoww
Collaborator

I think this may be somewhat related to the Not Diamond issue #4184, but maybe slightly different, because you were thinking about using it for simple agent tasks.

I think this could be achievable by maintaining an LLM instance inside agent_controller? But allowing users to manually set TWO LLM configs could be too tricky - I guess we can just use 4o-mini for an openai/ LLM key, haiku for an anthropic/ key, and fall back to the existing implementation for other providers?
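
A minimal sketch of that provider-based default, assuming a single helper keyed on the main model's provider prefix; the function name and the specific model strings are illustrative assumptions, not existing OpenHands code:

```python
# Hypothetical sketch: derive a lightweight auxiliary model from the main
# model's provider prefix, falling back to the existing behavior otherwise.

def pick_auxiliary_model(main_model: str) -> str:
    """Return a cheap model for quick auxiliary tasks (illustrative helper)."""
    if main_model.startswith("openai/") or main_model.startswith("gpt-"):
        return "openai/gpt-4o-mini"
    if main_model.startswith("anthropic/") or main_model.startswith("claude-"):
        return "anthropic/claude-3-5-haiku-20241022"
    # Unknown provider: fall back to the existing implementation,
    # i.e. just reuse the main model.
    return main_model
```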

@mamoodi mamoodi added the enhancement New feature or request label Dec 9, 2024


@github-actions github-actions bot added the Stale Inactive for 30 days label Jan 9, 2025
@enyst enyst removed the Stale Inactive for 30 days label Jan 9, 2025
@BradKML

BradKML commented Feb 6, 2025

This is needed even in the era of 200K context, since terminal output and complex code-calling logic also work against the context length. Also, as a practical concern, tokens = cost. Anything that compresses context is good.
