
Summary / auxiliary LLM #5464

Open
enyst opened this issue Dec 8, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@enyst
Collaborator

enyst commented Dec 8, 2024

Summary

Add support for users to configure a secondary, lightweight LLM alongside their main LLM. This auxiliary LLM would handle quick, repetitive tasks where using their primary (potentially expensive/slower) LLM would be overkill.

If no auxiliary LLM is configured, we fall back to the existing hardcoded solutions or to the main LLM, as we already do. A rough sketch of that fallback is below.
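
To illustrate, here is a minimal sketch of the intended fallback behavior, assuming a simple dataclass-style config. The names `LLMConfig`, `AppConfig`, `auxiliary_llm`, and `get_task_llm_config` are hypothetical and only for illustration, not existing OpenHands code:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMConfig:
    model: str
    api_key: str = ""


@dataclass
class AppConfig:
    llm: LLMConfig                              # main, powerful model
    auxiliary_llm: Optional[LLMConfig] = None   # optional cheap/fast model


def get_task_llm_config(config: AppConfig) -> LLMConfig:
    """Config to use for quick, repetitive tasks (branch names, loop checks)."""
    # If the user did not configure an auxiliary LLM, fall back to the main one.
    return config.auxiliary_llm or config.llm
```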

Motivation

Users configure their own LLMs in our project, and we recommend powerful models. Using these for simple tasks like generating branch names or checking for agent loops is unnecessarily expensive and slow. We could:

  • give users the option to use a smaller, faster LLM for simple tasks (not agent tasks)
  • save cost and reduce latency / rate-limit pressure on the main LLM

Use Cases

The resolver currently has a couple of cases that are examples of what we could use such an LLM for (guessing success, review).

In addition:

Alternatives to Consider

  • multiple LLMs, hopefully optimized per task
  • use the main LLM with specialized prompts
@xingyaoww
Collaborator

I think this may be somewhat related to the Not Diamond issue #4184, but maybe slightly different, because you were thinking about using it for simple agent tasks.

I think this could be achievable by maintaining an LLM instance inside agent_controller? But allowing users to manually set TWO LLM configs could be too tricky - I guess we can just use 4o-mini for an openai/ LLM key, haiku for an anthropic/ key, and fall back to the existing implementation for other providers?
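
A minimal sketch of that provider-based default, assuming a single helper keyed on the main model's provider prefix; the function name and the specific model strings are illustrative assumptions, not existing OpenHands code:

```python
# Hypothetical sketch: derive a lightweight auxiliary model from the main
# model's provider prefix, falling back to the existing behavior otherwise.

def pick_auxiliary_model(main_model: str) -> str:
    """Return a cheap model for quick auxiliary tasks (illustrative helper)."""
    if main_model.startswith("openai/") or main_model.startswith("gpt-"):
        return "openai/gpt-4o-mini"
    if main_model.startswith("anthropic/") or main_model.startswith("claude-"):
        return "anthropic/claude-3-5-haiku-20241022"
    # Unknown provider: fall back to the existing implementation,
    # i.e. just reuse the main model.
    return main_model
```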

@mamoodi mamoodi added the enhancement New feature or request label Dec 9, 2024


@github-actions github-actions bot added the Stale Inactive for 30 days label Jan 9, 2025
@enyst enyst removed the Stale Inactive for 30 days label Jan 9, 2025
@BradKML

BradKML commented Feb 6, 2025

This is needed even in the era of 200K context, since terminal output and complex code-calling logic also work against the context length. Also, as a practical concern, tokens = cost. Anything that compresses context is good.
