feature: Condenser Interface and Defaults #5306
Merged: xingyaoww merged 109 commits into All-Hands-AI:main from csmith49:feature/condenser-refactor on Jan 7, 2025.
Commits (109 total; changes shown from 62 commits)
d35a06e
refactor: Implement condenser configuration system
openhands-agent c6bc242
Fix condenser config and event creation issues
openhands-agent d0bc171
refactor: Move condenser config to core/config and improve env var ha…
openhands-agent 8b9b52d
fixing up unit tests and type hints
10d5fc6
restoring utils file
6489787
splitting failing config unit tests
3f4af37
removing TOML-setting of condensers
39e6c3d
refactor: Implement condenser configuration system
openhands-agent 1088537
Fix condenser config and event creation issues
openhands-agent 4e63c7f
refactor: Move condenser config to core/config and improve env var ha…
openhands-agent b53f4b6
fixing up unit tests and type hints
190d6ae
restoring utils file
a07a615
splitting failing config unit tests
a550833
removing TOML-setting of condensers
14c8f1d
condenser config basemodel -> dataclass
5e05564
extending condenser output with support for metadata
1a3482c
condensation metadata utility functions
b8544ef
llm condenser metrics
3dc9245
merging
2a059c8
configuration and metrics for all condensers
fe6ed05
adding keep_first flag to reduce condenser
4343b8d
recent events condenser can keep arbitrary initial values
c0b1230
adding some evaluation scripts
4ce8594
uploading some exp data
1d97831
Add cactus plots for token usage and history length analysis
openhands-agent 52193c4
feat(evaluation): Add graph generation script
openhands-agent 2bff3bc
refactor(evaluation): Improve plot_graphs.py
openhands-agent 83070df
feat(evaluation): Add browser renderer option
openhands-agent 3b779a2
Merge branch 'main' into feature/condenser-refactor
402906f
Refactor Condenser to work directly with State objects
openhands-agent 9ecf5aa
Merge branch 'feature/condenser-refactor' of github.com:csmith49/Open…
e49e84d
Add AmortizedForgettingCondenser framework
openhands-agent b6dff1f
Implement AmortizedForgettingCondenser
openhands-agent 523dee2
Merge branch 'feature/condenser-refactor' of github.com:csmith49/Open…
8c87d70
Add keep_first parameter to AmortizedForgettingCondenser
openhands-agent 4381307
Add event tracking to AmortizedForgettingCondenser
openhands-agent f2b9d07
feat: Add LLMAttentionCondenser with NotImplementedError placeholder
openhands-agent d64edf8
Implement LLMAttentionCondenser with state maintenance and LLM config…
openhands-agent e169d8a
Merge branch 'main' into feature/condenser-refactor
14873ad
fixing forgetting condensers and tests
18b8b17
fixing condenser tests
85885b9
minor
d74b5f6
cleaning up accidental experimental files
36dff26
Merge branch 'main' into feature/condenser-refactor
csmith49 d189f73
fixes to llm attention condenser
0dee6eb
Merge branch 'feature/condenser-refactor' of github.com:csmith49/Open…
b28d319
failing unit tests added
b6f6190
unit tests passing with event id added
2558fad
unit tests passing with event id added
bfa80cb
minor condenser entrypoint refactor
940d85e
cleaning up condenser metadata management
756bedb
doc string pass
6e263b8
observation-masking condenser
1daedff
fixing llm attention
d6f9906
fixing observation masking tests
909e847
fixing eval scripts
716f665
Merge branch 'main' into feature/condenser-refactor
5e29ebc
fixing lock
c20e811
fixing prompt caching tests
a96551c
Merge branch 'main' into feature/condenser-refactor
csmith49 68d1bff
Merge branch 'main' into feature/condenser-refactor
csmith49 219994d
safe dumping of llm-based condenser config
b69cd15
asdf
5c0704f
failing test for mismatched tool calls
006a5fd
mismatched tool call tests passing
32eccc9
refactoring test
8825d03
Merge branch 'main' into feature/condenser-refactor
csmith49 8dc834f
llm attention validation and test
5ec51e1
Merge branch 'feature/condenser-refactor' of github.com:csmith49/Open…
7aa7a24
Merge branch 'main' into feature/condenser-refactor
csmith49 e3bfd63
moving condenser observations
b587499
Merge branch 'feature/condenser-refactor' of github.com:csmith49/Open…
453ce82
condensation observation -> agent condensation observation
35bb405
Merge branch 'main' into feature/condenser-refactor
csmith49 de10aeb
fix broken imports
d3c9794
Merge branch 'feature/condenser-refactor' of github.com:csmith49/Open…
1e98db6
Merge branch 'main' into feature/condenser-refactor
csmith49 b4d4680
fixing response format detection
66ae8a3
Merge branch 'feature/condenser-refactor' of github.com:csmith49/Open…
0e6cea7
Resolve merge conflicts with main branch
openhands-agent c1c7881
Fix pyproject.toml formatting
openhands-agent 06ef4cc
Update poetry.lock
openhands-agent 1bd5310
Merge branch 'main' into feature/condenser-refactor
csmith49 c26f9dc
updating docs and field constraints
185421c
fixing bad llm config in condenser tests
124c327
Merge branch 'main' into feature/condenser-refactor
csmith49 8a65f96
get metrics functionality in eval utils
13feeb1
Merge branch 'feature/condenser-refactor' of github.com:csmith49/Open…
4565b54
Merge branch 'main' into feature/condenser-refactor
csmith49 ef03f98
Merge branch 'main' into feature/condenser-refactor
csmith49 c21f254
Merge branch 'main' into feature/condenser-refactor
csmith49 0a7646d
Merge branch 'main' into feature/condenser-refactor
csmith49 2a0f843
add condensation obs to be deserialized when restoring session
enyst f2bdd07
Merge branch 'main' of github.com:All-Hands-AI/OpenHands into feature…
enyst ce5dfc1
Merge branch 'main' into feature/condenser-refactor
enyst d6acd03
poetry lock
enyst c73e422
Merge branch 'main' into feature/condenser-refactor
enyst 7171a31
Merge branch 'main' into feature/condenser-refactor
enyst f9ff522
Merge branch 'main' into feature/condenser-refactor
csmith49 cddd0e7
Merge branch 'main' into feature/condenser-refactor
csmith49 d80a68c
Merge branch 'main' into feature/condenser-refactor
csmith49 5beba03
Merge branch 'main' into feature/condenser-refactor
csmith49 c6acf01
Merge branch 'main' into feature/condenser-refactor
csmith49 1f88c06
Merge branch 'main' into feature/condenser-refactor
csmith49 cdb2c4b
Merge branch 'main' into feature/condenser-refactor
csmith49 956251f
Merge branch 'main' into feature/condenser-refactor
csmith49 47fb5e7
Merge branch 'main' into feature/condenser-refactor
csmith49 3261db3
updating lock file
01caf68
Merge branch 'main' into feature/condenser-refactor
csmith49
@@ -17,6 +17,10 @@
from openhands.controller.state.state import State
from openhands.core.config import LLMConfig
from openhands.core.config.condenser_config import (
    CondenserConfig,
    NoOpCondenserConfig,
)
from openhands.core.exceptions import (
    AgentRuntimeBuildError,
    AgentRuntimeDisconnectedError,
@@ -45,18 +49,29 @@ class EvalMetadata(BaseModel):
    dataset: str | None = None
    data_split: str | None = None
    details: dict[str, Any] | None = None
    condenser_config: CondenserConfig | None = None

    def model_dump(self, *args, **kwargs):
        dumped_dict = super().model_dump(*args, **kwargs)
        # avoid leaking sensitive information
        dumped_dict['llm_config'] = self.llm_config.to_safe_dict()
        if hasattr(self.condenser_config, 'llm_config'):
            dumped_dict['condenser_config']['llm_config'] = (
                self.condenser_config.llm_config.to_safe_dict()
            )

        return dumped_dict

    def model_dump_json(self, *args, **kwargs):
        dumped = super().model_dump_json(*args, **kwargs)
        dumped_dict = json.loads(dumped)
        # avoid leaking sensitive information
        dumped_dict['llm_config'] = self.llm_config.to_safe_dict()
        if hasattr(self.condenser_config, 'llm_config'):
            dumped_dict['condenser_config']['llm_config'] = (
                self.condenser_config.llm_config.to_safe_dict()
            )

        logger.debug(f'Dumped metadata: {dumped_dict}')
        return json.dumps(dumped_dict)
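For intuition about the hasattr guard above, a minimal sketch, assuming only the config classes visible in this PR (the API key value is a dummy for illustration):

from openhands.core.config import LLMConfig
from openhands.core.config.condenser_config import (
    LLMSummarizingCondenserConfig,
    NoOpCondenserConfig,
)

llm = LLMConfig(model='gpt-4o', api_key='not-a-real-key')  # dummy key, illustration only

# Only the LLM-backed condenser configs carry an llm_config attribute, so the
# hasattr() check above scrubs nested secrets for them and is a no-op otherwise.
for cfg in (NoOpCondenserConfig(), LLMSummarizingCondenserConfig(llm_config=llm)):
    print(type(cfg).__name__, hasattr(cfg, 'llm_config'))
# NoOpCondenserConfig False
# LLMSummarizingCondenserConfig True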
@@ -192,6 +207,7 @@ def make_metadata(
    eval_output_dir: str,
    data_split: str | None = None,
    details: dict[str, Any] | None = None,
    condenser_config: CondenserConfig | None = None,
) -> EvalMetadata:
    model_name = llm_config.model.split('/')[-1]
    model_path = model_name.replace(':', '_').replace('@', '-')
@@ -222,6 +238,9 @@
        dataset=dataset_name,
        data_split=data_split,
        details=details,
        condenser_config=condenser_config
        if condenser_config
        else NoOpCondenserConfig(),
    )
    metadata_json = metadata.model_dump_json()
    logger.info(f'Metadata: {metadata_json}')

Review comment on lines +242 to +244: I think that this probably won't pass linting, so you should make sure that you have pre-commit config installed to auto-lint before committing.
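For context, a hedged sketch of how an evaluation script might thread a condenser into its metadata. Only condenser_config and the fallback to NoOpCondenserConfig() come from this diff; the remaining keyword arguments to make_metadata and its import path are assumptions for illustration:

from openhands.core.config import LLMConfig
from openhands.core.config.condenser_config import RecentEventsCondenserConfig
from evaluation.utils.shared import make_metadata  # assumed location of the helper above

metadata = make_metadata(
    llm_config=LLMConfig(model='gpt-4o'),
    dataset_name='example-benchmark',  # hypothetical values, illustration only
    agent_class='CodeActAgent',
    max_iterations=30,
    eval_note=None,
    eval_output_dir='evaluation/outputs',
    condenser_config=RecentEventsCondenserConfig(keep_first=1, max_events=20),
)
# Omitting condenser_config falls back to NoOpCondenserConfig(), per the hunk above.
print(metadata.model_dump_json())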
@@ -0,0 +1,88 @@
from typing import Literal

from pydantic import BaseModel, Field

from openhands.core.config.llm_config import LLMConfig


class NoOpCondenserConfig(BaseModel):
    """Configuration for NoOpCondenser."""

    type: Literal['noop'] = Field('noop')


class ObservationMaskingCondenserConfig(BaseModel):
    """Configuration for ObservationMaskingCondenser."""

    type: Literal['observation_masking'] = Field('observation_masking')
    attention_window: int = Field(
        default=10,
        description='The number of most-recent events where observations will not be masked.',
    )


class RecentEventsCondenserConfig(BaseModel):
    """Configuration for RecentEventsCondenser."""

    type: Literal['recent'] = Field('recent')
    keep_first: int = Field(
        default=0,
        description='The number of initial events to condense.',
    )
    max_events: int = Field(
        default=10, description='Maximum number of events to keep.', ge=1
    )


class LLMSummarizingCondenserConfig(BaseModel):
    """Configuration for LLMCondenser."""

    type: Literal['llm'] = Field('llm')
    llm_config: LLMConfig = Field(
        ..., description='Configuration for the LLM to use for condensing.'
    )


class AmortizedForgettingCondenserConfig(BaseModel):
    """Configuration for AmortizedForgettingCondenser."""

    type: Literal['amortized'] = Field('amortized')
    max_size: int = Field(
        default=100,
        description='Maximum size of the condensed history before triggering forgetting.',
        ge=2,
    )
    keep_first: int = Field(
        default=0,
        description='Number of initial events to always keep in history.',
        ge=0,
    )


class LLMAttentionCondenserConfig(BaseModel):
    """Configuration for LLMAttentionCondenser."""

    type: Literal['llm_attention'] = Field('llm_attention')
    llm_config: LLMConfig = Field(
        ..., description='Configuration for the LLM to use for attention.'
    )
    max_size: int = Field(
        default=100,
        description='Maximum size of the condensed history before triggering forgetting.',
        ge=2,
    )
    keep_first: int = Field(
        default=0,
        description='Number of initial events to always keep in history.',
        ge=0,
    )


CondenserConfig = (
    NoOpCondenserConfig
    | ObservationMaskingCondenserConfig
    | RecentEventsCondenserConfig
    | LLMSummarizingCondenserConfig
    | AmortizedForgettingCondenserConfig
    | LLMAttentionCondenserConfig
)
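Since each variant pins a distinct type literal, the union can be validated from plain data (for example, a condenser section loaded from TOML or JSON). A minimal sketch, assuming pydantic v2's TypeAdapter rather than any helper from this PR:

from pydantic import TypeAdapter

from openhands.core.config.condenser_config import CondenserConfig

raw = {'type': 'amortized', 'max_size': 40, 'keep_first': 2}
# The 'type' literal selects the matching variant of the union.
config = TypeAdapter(CondenserConfig).validate_python(raw)
print(type(config).__name__)  # AmortizedForgettingCondenserConfig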
@@ -1,4 +1,4 @@
-from openhands.memory.condenser import MemoryCondenser
+from openhands.memory.condenser import Condenser
 from openhands.memory.memory import LongTermMemory

-__all__ = ['LongTermMemory', 'MemoryCondenser']
+__all__ = ['LongTermMemory', 'Condenser']
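Downstream code now imports the renamed interface; a small sketch, assuming the Condenser.from_config factory added elsewhere in this PR (not shown in this hunk):

from openhands.core.config.condenser_config import ObservationMaskingCondenserConfig
from openhands.memory import Condenser  # MemoryCondenser is no longer exported

# Assumed factory: maps each config variant to its condenser implementation.
condenser = Condenser.from_config(ObservationMaskingCondenserConfig(attention_window=5))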
Review comment: Could this line be moved inside state.metrics.get()? Then we wouldn't have to copy-paste it into every evaluation benchmark separately.

Reply: Not quite so directly: the condenser metrics are stored on the state, but the .get() here is on the metrics object held by the state. We can clean it up with a state.get_metrics() function though. EDIT: Never mind, that introduces a circular import between the condensers and the state that isn't easy to break. Maybe the get_metrics() should just be a function in evaluation/utils/shared.py.
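For what the thread converges on, a hedged sketch of such a helper in evaluation/utils/shared.py. The state.metrics.get() call is taken from the discussion above, and get_condensation_metadata refers to the "condensation metadata utility functions" commit in this PR; neither signature is shown in this excerpt:

from typing import Any

from openhands.controller.state.state import State
from openhands.memory.condenser import get_condensation_metadata  # assumed utility from this PR


def get_metrics(state: State) -> dict[str, Any]:
    """Combine the agent's LLM metrics with condensation metadata stored on the state."""
    metrics = state.metrics.get() if state.metrics else {}
    metrics['condenser'] = get_condensation_metadata(state)
    return metrics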