Simplify draft llm #6281

Merged: 14 commits, Jan 17, 2025
Changes from 11 commits
29 changes: 29 additions & 0 deletions docs/modules/usage/llms/custom-llm-configs.md
@@ -101,6 +101,35 @@ In this example:
- Code generation uses GPT-4 with a higher token limit for generating larger code blocks
- The default configuration remains available for other tasks

# Custom Configurations with Reserved Names

OpenHands recognizes certain reserved names for custom LLM configurations. If you specify a model and other settings under one of these names, OpenHands will load and use that configuration for a specific purpose. As of now, one such configuration is implemented: the draft editor.
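For contrast, here is a sketch of how a reserved name differs from an arbitrary custom name (the `[llm.my-experiment]` section name and model values below are purely illustrative):

```toml
# An arbitrary custom configuration: OpenHands attaches no special meaning
# to this name; it is only used where you explicitly reference it.
[llm.my-experiment]
model = "gpt-4o"

# A reserved name: if present, OpenHands automatically uses this
# configuration for the draft editor feature described below.
[llm.draft_editor]
model = "gpt-4o"
```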

## Draft Editor Configuration

The `draft_editor` configuration is a group of settings specifying the model used for preliminary drafting of code edits, in any task that involves editing and refining code. Provide it under the `[llm.draft_editor]` section.

```toml
[llm.draft_editor]
model = "gpt-4"
temperature = 0.2
top_p = 0.95
presence_penalty = 0.0
frequency_penalty = 0.0
```

This configuration:
- Uses GPT-4 for high-quality edits and suggestions
- Sets a low temperature (0.2) to maintain consistency while allowing some flexibility
- Uses a high top_p value (0.95) to consider a wide range of token options
- Disables presence and frequency penalties to maintain focus on the specific edits needed

Use this configuration when you want to let an LLM draft edits before making them. In general, it may be useful to:
- Review and suggest code improvements
- Edit documentation or text content
- Refine existing content while maintaining its core meaning
- Make precise, focused changes to code or text

A review thread on this section settled on the "reserved names" terminology:

**Collaborator:** Is this "predefined" in the sense that the user need not specify anything to use this config? I read this as just an example draft editor configuration.

**enyst (author, Jan 17, 2025):** It's actually just a "predefined name". 🤔 That is, OpenHands knows what to do with a configuration with this name, if the user fills it in. Maybe it's not the best word.

**Collaborator:** I see. It might help to clarify that the given configuration is an example that would automatically be used as the draft editor LLM if provided.

**enyst:** Just like the user can define any other custom configuration, like `[llm.eval-gpt4o]` etc., and use them or not, in an agent or in general, there are some... maybe we should call them reserved names? `[llm.draft_editor]` will be used for the draft LLM editor feature.

**Collaborator:** "Reserved names" makes sense, I think that fits the concept. It implies there's some hard-coded logic around the name.

**enyst:** Done!

:::note
Custom LLM configurations are only available when using OpenHands in development mode, via `main.py` or `cli.py`. When running via `docker run`, please use the standard configuration options.
:::
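To illustrate how a reserved-name configuration surfaces once the TOML is loaded, here is a minimal sketch using the internal helpers exercised by this PR's tests (`AppConfig`, `load_from_toml`); import paths are assumed from the repository layout, and these are internal APIs rather than a stable public interface:

```python
from openhands.core.config import AppConfig
from openhands.core.config.utils import load_from_toml

config = AppConfig()
load_from_toml(config, 'config.toml')

# If [llm.draft_editor] was present in config.toml, it is now just
# another named LLM config, retrievable like any custom section.
draft_editor = config.get_llm_config('draft_editor')
print(draft_editor.model)
```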
15 changes: 1 addition & 14 deletions openhands/core/config/llm_config.py
@@ -1,6 +1,5 @@
 import os
 from dataclasses import dataclass, fields
-from typing import Optional

 from openhands.core.config.config_utils import get_field_info
 from openhands.core.logger import LOG_DIR
@@ -44,7 +43,6 @@ class LLMConfig:
        caching_prompt: Use the prompt caching feature if provided by the LLM and supported by the provider.
        log_completions: Whether to log LLM completions to the state.
        log_completions_folder: The folder to log LLM completions to. Required if log_completions is True.
-       draft_editor: A more efficient LLM to use for file editing. Introduced in [PR 3985](https://github.com/All-Hands-AI/OpenHands/pull/3985).
        custom_tokenizer: A custom tokenizer to use for token counting.
        native_tool_calling: Whether to use native tool calling if supported by the model. Can be True, False, or not set.
    """
@@ -84,7 +82,6 @@ class LLMConfig:
    caching_prompt: bool = True
    log_completions: bool = False
    log_completions_folder: str = os.path.join(LOG_DIR, 'completions')
-   draft_editor: Optional['LLMConfig'] = None
    custom_tokenizer: str | None = None
    native_tool_calling: bool | None = None
@@ -137,22 +134,12 @@ def to_safe_dict(self):
    def from_dict(cls, llm_config_dict: dict) -> 'LLMConfig':
        """Create an LLMConfig object from a dictionary.

-       This function is used to create an LLMConfig object from a dictionary,
-       with the exception of the 'draft_editor' key, which is a nested LLMConfig object.
+       This function is used to create an LLMConfig object from a dictionary.
        """
        # Keep None values to preserve defaults, filter out other dicts
        args = {
            k: v
            for k, v in llm_config_dict.items()
            if not isinstance(v, dict) or v is None
        }
-       if (
-           'draft_editor' in llm_config_dict
-           and llm_config_dict['draft_editor'] is not None
-       ):
-           if isinstance(llm_config_dict['draft_editor'], LLMConfig):
-               args['draft_editor'] = llm_config_dict['draft_editor']
-           else:
-               draft_editor_config = LLMConfig(**llm_config_dict['draft_editor'])
-               args['draft_editor'] = draft_editor_config
        return cls(**args)
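A quick usage sketch of the simplified `from_dict` (assuming `LLMConfig` as shown in this diff; the import path follows the file's location in the repository):

```python
from openhands.core.config.llm_config import LLMConfig

llm_dict = {
    'model': 'gpt-4',
    'num_retries': 5,
    # Nested dicts no longer get special draft_editor handling;
    # the comprehension above simply filters them out.
    'draft_editor': {'model': 'draft-model'},
}
config = LLMConfig.from_dict(llm_dict)
assert config.model == 'gpt-4'
assert config.num_retries == 5
# 'draft_editor' is no longer a field on LLMConfig at all.
```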
21 changes: 5 additions & 16 deletions openhands/core/config/utils.py
@@ -144,11 +144,11 @@ def load_from_toml(cfg: AppConfig, toml_file: str = 'config.toml'):
            logger.openhands_logger.debug(
                'Attempt to load default LLM config from config toml'
            )
-           # TODO clean up draft_editor
-           # Extract generic LLM fields, keeping draft_editor
+
+           # Extract generic LLM fields, which are not nested LLM configs
            generic_llm_fields = {}
            for k, v in value.items():
-               if not isinstance(v, dict) or k == 'draft_editor':
+               if not isinstance(v, dict):
                    generic_llm_fields[k] = v
            generic_llm_config = LLMConfig.from_dict(generic_llm_fields)
            cfg.set_llm_config(generic_llm_config, 'llm')
@@ -168,22 +168,11 @@ def load_from_toml(cfg: AppConfig, toml_file: str = 'config.toml'):
                    # results in num_retries APPLIED to claude-3-5-sonnet
                    custom_fields = {}
                    for k, v in nested_value.items():
-                       if not isinstance(v, dict) or k == 'draft_editor':
+                       if not isinstance(v, dict):
                            custom_fields[k] = v
                    merged_llm_dict = generic_llm_config.__dict__.copy()
                    merged_llm_dict.update(custom_fields)
-                   # TODO clean up draft_editor
-                   # Handle draft_editor with fallback values:
-                   # - If draft_editor is "null", use None
-                   # - If draft_editor is in custom fields, use that value
-                   # - If draft_editor is not specified, fall back to generic config value
-                   if 'draft_editor' in custom_fields:
-                       if custom_fields['draft_editor'] == 'null':
-                           merged_llm_dict['draft_editor'] = None
-                       else:
-                           merged_llm_dict['draft_editor'] = (
-                               generic_llm_config.draft_editor
-                           )

                    custom_llm_config = LLMConfig.from_dict(merged_llm_dict)
                    cfg.set_llm_config(custom_llm_config, nested_key)
            elif key is not None and key.lower() == 'security':
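The merge above is a plain dict overlay: generic `[llm]` values act as defaults and the custom section's scalar fields override them. A standalone sketch of the same idea (field values illustrative):

```python
# Defaults from the generic [llm] section.
generic = {'model': 'base-model', 'num_retries': 7, 'api_key': 'base-key'}
# Scalar fields from a custom section such as [llm.custom1].
custom_fields = {'model': 'claude-3-5-sonnet'}

merged = generic.copy()
merged.update(custom_fields)

# num_retries from [llm] is applied to claude-3-5-sonnet:
assert merged == {'model': 'claude-3-5-sonnet', 'num_retries': 7, 'api_key': 'base-key'}
```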
4 changes: 3 additions & 1 deletion openhands/runtime/base.py
@@ -121,7 +121,9 @@ def __init__(
        )

        # Load mixins
-       FileEditRuntimeMixin.__init__(self)
+       FileEditRuntimeMixin.__init__(
+           self, enable_llm_editor=config.get_agent_config().codeact_enable_llm_editor
+       )

    def setup_initial_env(self) -> None:
        if self.attach_to_existing:
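The change threads the feature flag into the mixin's initializer explicitly rather than having the mixin read global config itself; a toy sketch of the pattern (class names illustrative):

```python
class FileEditMixin:
    def __init__(self, enable_llm_editor: bool):
        self.enable_llm_editor = enable_llm_editor
        if not enable_llm_editor:
            return  # skip LLM setup entirely when the editor is disabled
        # ... expensive draft-editor LLM setup would happen here ...


class Runtime(FileEditMixin):
    def __init__(self, enable_llm_editor: bool = False):
        # The mixin is initialized explicitly with the flag it needs.
        FileEditMixin.__init__(self, enable_llm_editor=enable_llm_editor)


runtime = Runtime(enable_llm_editor=False)
assert runtime.enable_llm_editor is False
```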
20 changes: 9 additions & 11 deletions openhands/runtime/utils/edit.py
@@ -1,4 +1,3 @@
-import copy
 import os
 import re
 import tempfile
@@ -104,26 +103,25 @@ class FileEditRuntimeMixin(FileEditRuntimeInterface):
    # This restricts the number of lines we can edit to avoid exceeding the token limit.
    MAX_LINES_TO_EDIT = 300

-   def __init__(self, *args, **kwargs):
+   def __init__(self, enable_llm_editor: bool, *args, **kwargs):
        super().__init__(*args, **kwargs)
+       self.enable_llm_editor = enable_llm_editor

-       llm_config = self.config.get_llm_config()
+       if not self.enable_llm_editor:
+           return

-       if llm_config.draft_editor is None:
-           llm_config.draft_editor = copy.deepcopy(llm_config)
+       draft_editor_config = self.config.get_llm_config('draft_editor')

        # manually set the model name for the draft editor LLM to distinguish token costs
-       llm_metrics = Metrics(
-           model_name='draft_editor:' + llm_config.draft_editor.model
-       )
-       if llm_config.draft_editor.caching_prompt:
+       llm_metrics = Metrics(model_name='draft_editor:' + draft_editor_config.model)
+       if draft_editor_config.caching_prompt:
            logger.debug(
                'It is not recommended to cache draft editor LLM prompts as it may incur high costs for the same prompt. '
                'Automatically setting caching_prompt=false.'
            )
-           llm_config.draft_editor.caching_prompt = False
+           draft_editor_config.caching_prompt = False

-       self.draft_editor_llm = LLM(llm_config.draft_editor, metrics=llm_metrics)
+       self.draft_editor_llm = LLM(draft_editor_config, metrics=llm_metrics)
        logger.debug(
            f'[Draft edit functionality] enabled with LLM: {self.draft_editor_llm}'
        )
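Taken together with the `base.py` change above, a user would enable this path with something like the following TOML. The `[agent]` key name is inferred from `config.get_agent_config().codeact_enable_llm_editor` in this diff; the exact section layout may vary by version:

```toml
[agent]
codeact_enable_llm_editor = true   # feature flag read in base.py above

[llm.draft_editor]
model = "gpt-4"   # reserved name picked up by FileEditRuntimeMixin
```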
6 changes: 1 addition & 5 deletions openhands/server/session/agent_session.py
@@ -263,11 +263,7 @@ def _create_controller(
            f'LLM: {agent.llm.config.model}\n'
            f'Base URL: {agent.llm.config.base_url}\n'
        )
-       if agent.llm.config.draft_editor:
-           msg += (
-               f'Draft editor LLM (for file editing): {agent.llm.config.draft_editor.model}\n'
-               f'Draft editor LLM (for file editing) Base URL: {agent.llm.config.draft_editor.base_url}\n'
-           )

        msg += (
            f'Agent: {agent.name}\n'
            f'Runtime: {self.runtime.__class__.__name__}\n'
1 change: 0 additions & 1 deletion tests/unit/test_agent_session.py
@@ -26,7 +26,6 @@ def mock_agent():
    # Configure the LLM config
    llm_config.model = 'test-model'
    llm_config.base_url = 'http://test'
-   llm_config.draft_editor = None
    llm_config.max_message_chars = 1000

    # Set up the chain of mocks
127 changes: 71 additions & 56 deletions tests/unit/test_llm_draft_config.py
@@ -7,86 +7,101 @@


 @pytest.fixture
-def draft_llm_toml(tmp_path: pathlib.Path) -> str:
+def config_toml_without_draft_editor(tmp_path: pathlib.Path) -> str:
+    """
+    This fixture provides a TOML config that DOES NOT contain [llm.draft_editor].
+    We'll use it to verify that the draft_editor LLM is not present in the config.
+    """
     toml_content = """
 [core]
 workspace_base = "./workspace"

 [llm]
 model = "base-model"
 api_key = "base-api-key"
-draft_editor = { model = "draft-model", api_key = "draft-api-key" }

-[llm.custom1]
-model = "custom-model-1"
-api_key = "custom-api-key-1"
-# Should use draft_editor from [llm] as fallback
+"""
+    toml_file = tmp_path / 'no_draft_editor.toml'
+    toml_file.write_text(toml_content)
+    return str(toml_file)
+
+
+@pytest.fixture
+def config_toml_with_draft_editor(tmp_path: pathlib.Path) -> str:
+    """
+    This fixture provides a TOML config that DOES contain [llm.draft_editor].
+    We'll use it to verify that the draft_editor LLM is loaded as any other custom LLM.
+    """
+    toml_content = """
+[core]
+workspace_base = "./workspace"
+
+[llm]
+model = "base-model"
+api_key = "base-api-key"
+num_retries = 7
+
+[llm.draft_editor]
+model = "draft-model"
+api_key = "draft-api-key"

-[llm.custom2]
-model = "custom-model-2"
-api_key = "custom-api-key-2"
-draft_editor = { model = "custom-draft", api_key = "custom-draft-key" }
-
-[llm.custom3]
-model = "custom-model-3"
-api_key = "custom-api-key-3"
-draft_editor = "null" # Explicitly set to null in TOML
 """
-    toml_file = tmp_path / 'llm_config.toml'
+    toml_file = tmp_path / 'yes_draft_editor.toml'
     toml_file.write_text(toml_content)
     return str(toml_file)


-def test_draft_editor_fallback(draft_llm_toml):
-    """Test that draft_editor is correctly handled in different scenarios:
-    - Falls back to generic [llm] section value
-    - Uses custom value when specified
-    - Can be explicitly set to null
+def test_no_draft_editor_in_config(config_toml_without_draft_editor):
     """
+    Test that draft_editor is simply not present if not declared in the TOML.
+    Previously, we tested fallback behavior. Now, it's simplified to not exist at all.
+    This docstring remains to illustrate that the old fallback logic is removed.
+    """
     config = AppConfig()

-    # Verify default draft_editor is None
-    default_llm = config.get_llm_config('llm')
-    assert default_llm.draft_editor is None
-
-    # Load config from TOML
-    load_from_toml(config, draft_llm_toml)
-
-    # Verify generic LLM draft_editor
-    generic_llm = config.get_llm_config('llm')
-    assert generic_llm.draft_editor is not None
-    assert generic_llm.draft_editor.model == 'draft-model'
-    assert generic_llm.draft_editor.api_key == 'draft-api-key'
-
-    # Verify custom1 uses draft_editor from generic as fallback
-    custom1 = config.get_llm_config('custom1')
-    assert custom1.model == 'custom-model-1'
-    assert custom1.draft_editor is not None
-    assert custom1.draft_editor.model == 'draft-model'
-    assert custom1.draft_editor.api_key == 'draft-api-key'
-
-    # Verify custom2 overrides draft_editor
-    custom2 = config.get_llm_config('custom2')
-    assert custom2.model == 'custom-model-2'
-    assert custom2.draft_editor is not None
-    assert custom2.draft_editor.model == 'custom-draft'
-    assert custom2.draft_editor.api_key == 'custom-draft-key'
-
-    # Verify custom3 has draft_editor explicitly set to None
-    custom3 = config.get_llm_config('custom3')
-    assert custom3.model == 'custom-model-3'
-    assert custom3.draft_editor is None
-
-
-def test_draft_editor_defaults(draft_llm_toml):
-    """Test that draft_editor uses default values from LLMConfig when not specified"""
+    load_from_toml(config, config_toml_without_draft_editor)
+
+    # draft_editor should not appear in config.llms
+    assert 'draft_editor' not in config.llms
+
+
+def test_draft_editor_as_named_llm(config_toml_with_draft_editor):
+    """
+    Test that draft_editor is loaded if declared in the TOML under [llm.draft_editor].
+    This docstring references the simpler approach: if it exists, it's just another named LLM.
+    """
     config = AppConfig()
-    load_from_toml(config, draft_llm_toml)
+    load_from_toml(config, config_toml_with_draft_editor)

+    # draft_editor should appear as a normal named LLM
+    assert 'draft_editor' in config.llms
+
+    draft_llm = config.get_llm_config('draft_editor')
+    assert draft_llm is not None
+    assert draft_llm.model == 'draft-model'
+    assert draft_llm.api_key == 'draft-api-key'

-    generic_llm = config.get_llm_config('llm')
-    assert generic_llm.draft_editor.num_retries == 8  # Default from LLMConfig
-    assert generic_llm.draft_editor.embedding_model == 'local'  # Default from LLMConfig
-
-    custom2 = config.get_llm_config('custom2')
-    assert custom2.draft_editor.num_retries == 8  # Default from LLMConfig
-    assert custom2.draft_editor.embedding_model == 'local'  # Default from LLMConfig
+
+def test_draft_editor_fallback(config_toml_with_draft_editor):
+    """
+    Test that the draft_editor config does pick up fallbacks
+    normally set in LLMConfig class and from generic LLM.
+
+    We expect the 'draft_editor' LLM to behave just like any custom LLM would.
+    """
+    config = AppConfig()
+    load_from_toml(config, config_toml_with_draft_editor)
+
+    # Check that the normal default fields come from LLMConfig where not overridden
+    draft_editor_config = config.get_llm_config('draft_editor')
+    # num_retries is an example default from llm section
+    assert draft_editor_config.num_retries == 7
+    # embedding_model is defaulted in the LLMConfig class
+    assert draft_editor_config.embedding_model == 'local'