Fix llm_config fallback #4415
Conversation
LGTM! Is it possible to add some test cases for this to make sure we don't accidentally ruin it again?
The LLM section itself is for a specific model. These fallback scenarios may save typing, but they end up potentially mixing values from different sections that, imo, shouldn't be mixed.
@tobitege You make good points, and I think there is a solution for the first: the case happens because you need … A model is optional in the … The issue with how the UI works with toml is tracked here.

Just to note here: custom LLM configs were developed for the CLI, not the UI. I'm pretty sure they did not interfere with the UI at all until, as you pointed out recently in Slack, the unexpected happened: setting one as the "agent" LLM made it actively used with the UI too. That was, as far as I know, an unintended use case, and frankly I didn't even think it could happen until you mentioned it. I haven't even gotten to test it yet. This issue is not about that. I think sorting it all out will not be trivial, but if I were to guess, the solutions for the interplay between the UI and toml will probably be based on:
IMHO, the way it works now is mostly workable, except for that case: the agent LLM. I'm not sure how we should address that without very significant changes.
On further thought, I see your point better. Maybe the solution here doesn't require a lot, just a way to define a "default llm". That means separating these two views of the `[llm]` section:
Actually, I don't fully see how you encounter the first case you mention. I mean, if you are running …

Can you please tell how exactly you were encountering this case, e.g. how you were running it?
@enyst is this something that will still be pursued?
I've come around to see the benefit of having the …
Will do, Tobi, it's on my todo list, I'll get to it tomorrow!
@openhands-agent Read this PR diff carefully and understand what it's doing. Read the *_config.py files to understand the config implementation we have. Read also config.template.toml, which is very important because it does already attempt to document how the configuration works. Then, check out docs/modules/usage for Configuration Options, to see what we have documented so far. Add another .md file to document the behavior of this particular feature: named (custom) LLM configurations. Make sure it includes an accurate description of the feature and how to use it. Note: you should include a link to this new page in the Configuration Options page under LLM Config, and specify that it is ONLY used with the development setup (development.md, you can find that too).
Overview of Changes:
Status: Complete - No remaining issues identified. The implementation addresses all requested documentation needs while maintaining proper structure and transparency about feature limitations.
Force-pushed from e4d6192 to 3661e78
- Adds proper fallback mechanism from generic LLM config to custom configs
- Adds special handling for draft_editor field:
  - Falls back to generic config value if not specified
  - Can be set to None using 'null' in TOML
  - Can be overridden with custom value
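For illustration, the draft_editor handling described in that commit could look roughly like this in a config.toml. This is a hedged sketch only: the section and model names are made up, and only the omission fallback and the 'null' convention come from the commit message above.

```toml
[llm.claude]
model = "claude-3-5-sonnet-20241022"
# draft_editor is not specified here: it falls back to the value
# set in the generic [llm] section, if any

[llm.no-draft]
model = "gpt-4o-mini"
draft_editor = "null"   # per the commit note, 'null' in TOML maps to None (no draft editor)
```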
Force-pushed from 3661e78 to bf8feff
This is ready for review again, on a similar principle as before: consistent fallbacks. FWIW, I'm changing my mind about a few of them, the auth-relevant ones. I doubt excluding some should be implemented on the current code design, though; it needs a refactoring first IMO.
@enyst is it worth it to ping people for review here now?
Just one question
Co-authored-by: openhands <[email protected]>
End-user friendly description of the problem this fixes or functionality that this introduces
[cli, headless] Restore fallback to defaults for custom LLM configurations.
Give a summary of what the PR does, explaining any non-trivial design decisions
This PR proposes to fix a regression: when we defined custom LLM configurations (e.g. sections with user-defined names like `[llm.claude]`, `[llm.o1-mini]`), we normally didn't have to specify over and over again the same attributes that were in the generic `[llm]` section: if an attribute is not defined in the custom section, it falls back to the `[llm]` section; and if it's not there either, it falls back to the default values defined in the dataclass (llm_config.py).

This fallback behavior got lost at some point, so now we have to copy/paste all the values we want applied into each custom section. IMHO, this is a bug, and it has a pretty bad usability cost. This PR proposes to restore the fallback behavior.
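As a concrete illustration of the fallback chain this PR restores (the model names and values below are made up, not taken from the repo's config.template.toml):

```toml
[llm]
# Generic section: shared settings for all LLM configs
api_key = "sk-..."
num_retries = 4
max_message_chars = 10000

[llm.claude]
# Custom section: only the model is specified. api_key, num_retries and
# max_message_chars fall back to the [llm] values above; anything not set
# in either place falls back to the dataclass defaults in llm_config.py.
model = "claude-3-5-sonnet-20241022"
```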
Example use cases where this is useful: well, almost every attribute, particularly embedding-model-related settings, max_message_chars, sometimes retry values (which we may or may not want to override per LLM), the draft LLM, saving completions in State for evals debugging, etc.
PR behavior: attributes not defined in a custom section fall back to the generic `[llm]` section, then to the dataclass defaults.

Current main: …

On this PR: …
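Based on the regression described above, the difference might look like this; a hedged sketch with made-up values, not the PR's own comparison:

```toml
[llm]
num_retries = 4           # illustrative shared setting

[llm.o1-mini]
model = "o1-mini"
# Current main: num_retries here ends up as the dataclass default from
# llm_config.py, because the [llm] value above is not inherited; it would
# have to be copy/pasted into this section.
# On this PR: num_retries falls back to 4 from [llm], and only attributes
# missing from both places take the dataclass defaults.
```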
Link of any specific issues this addresses
Addresses this: IMHO it should not be mandatory to define the draft_llm in every custom llm config in order to see it applied (or get a value at all!) when we use custom llm configs. I think @xingyaoww has addressed this by making it fall back to the regular llm, which is cool of course; I just think the original behavior on the draft PR was due to this bug between generic and custom configs.
To run this PR locally, use the following command: