fix: allow to continue when the agent is stuck in interactive mode #5597

Merged 16 commits on Dec 14, 2024

@enyst (Collaborator) commented Dec 14, 2024

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below
    Let the user continue when the agent gets stuck in a loop (UI-only).

Changes:

  1. Clean up:

    • Remove unused almost_stuck field and related code
    • Simplify stuck detection logic
  2. Improve stuck detection:

    • Use headless_mode to determine behavior
    • In interactive mode: only consider history after last user message
    • In headless mode: keep existing behavior (full history)
  3. Optimizations:

    • Use reversed() to find last user message
    • Elegant filtering that works in both modes:
      • In headless: actively filters user messages
      • In interactive: no-op (already sliced after last user message)
  4. Tests:

    • Add tests for both modes
    • Verify behavior before/after user messages
    • Maintain backward compatibility

Fix: #5480
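
The history selection described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Event` class, its `source` field, and the function name `history_for_stuck_detection` are hypothetical stand-ins, and only `headless_mode` and the use of `reversed()` come from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """Hypothetical stand-in for a history entry; not the project's real class."""
    source: str   # "user" or "agent"
    content: str


def history_for_stuck_detection(history: List[Event], headless_mode: bool) -> List[Event]:
    """Return the events a loop detector would inspect (illustrative only)."""
    if not headless_mode:
        # Interactive mode: walk backwards and stop at the most recent user message.
        last_user_idx = -1
        for offset, event in enumerate(reversed(history)):
            if event.source == "user":
                last_user_idx = len(history) - 1 - offset
                break
        # Keep only what came after that message (the whole history if none was found).
        history = history[last_user_idx + 1:]

    # Same filter in both modes: it drops user messages in headless mode and is
    # a no-op in interactive mode, because the slice contains no user messages.
    return [e for e in history if e.source != "user"]


history = [
    Event("agent", "ls"),
    Event("agent", "ls"),
    Event("user", "please continue"),
    Event("agent", "ls"),
]
print(len(history_for_stuck_detection(history, headless_mode=True)))   # 3
print(len(history_for_stuck_detection(history, headless_mode=False)))  # 1
```

In this sketch, a user message in interactive mode resets what the detector sees, so repeated actions from before the message no longer count toward a loop.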


To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:63cb544-nikolaik   --name openhands-app-63cb544   docker.all-hands.dev/all-hands-ai/openhands:63cb544

- Remove almost_stuck field from State class
- Remove almost_stuck counter from StuckDetector
- Simplify stuck detection logic to focus on actual loop detection
- Update tests to remove almost_stuck assertions
- Add UI mode awareness to stuck detection
- Only consider history after last user message in UI mode
- Keep existing behavior in headless mode
- Add comprehensive tests for both modes

Fix: #5480
- Use headless_mode flag to determine stuck detection behavior
- In interactive mode (not headless), only consider history after last user message
- Keep existing behavior in headless mode
- Add comprehensive tests for both modes

Fix: #5480
- Use not_headless parameter to match AgentController's headless_mode
- Remove unnecessary interactive_mode concept
- Update tests to use consistent terminology
- Keep behavior the same, just clearer naming
@enyst force-pushed the fix-stuck-loop-recovery-simple branch from 56167cc to 8589b0a on December 14, 2024 17:11
enyst and others added 2 commits December 14, 2024 18:15
- Use headless_mode parameter to match AgentController
- Remove confusing double negative (not_headless)
- Keep behavior the same, just clearer naming
enyst and others added 3 commits December 14, 2024 18:35
- Use reversed() to find last user message
- Stop searching once found (break)
- Same behavior, just more efficient
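
To illustrate why a backward scan with an early `break` is cheaper when the last user message is near the end of the history, here is a small sketch. The helper `last_index_where` and the sample data are hypothetical and exist only for this illustration; the helper also returns how many items it examined.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def last_index_where(items: Sequence[T], pred: Callable[[T], bool]) -> Tuple[int, int]:
    """Return (index of the last match or -1, number of items examined)."""
    examined = 0
    for offset, item in enumerate(reversed(items)):
        examined += 1
        if pred(item):
            # offset counts from the end, so convert it back to a forward index.
            return len(items) - 1 - offset, examined
    return -1, examined


# One early user message, a long run of agent events, then a recent user message.
sources = ["user"] + ["agent"] * 50 + ["user", "agent", "agent"]
index, examined = last_index_where(sources, lambda s: s == "user")
print(index, examined)  # 51 3
```

A forward scan would have to visit all 54 entries to be sure which match is the last one, while the backward scan stops after three.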
The same filter works perfectly in both modes:
- In headless: actively filters user messages
- In non-headless: no-op (already sliced after last user message)
@enyst added the lint-fix label (Attempts to fix lint issues on the PR) Dec 14, 2024
@enyst changed the title from "fix: improve stuck detection in interactive mode" to "fix: allow to continue when the agent is stuck in interactive mode" Dec 14, 2024
@enyst requested review from xingyaoww and rbren December 14, 2024 18:03
@enyst requested a review from neubig December 14, 2024 18:03
@xingyaoww (Collaborator) left a comment

LGTM! Thanks for the fix!

@enyst added the integration-test label (Runs integration tests on the PR) Dec 14, 2024
Contributor

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

Contributor

Triggered by: Pull Request (integration-test label on PR #5597)
Commit: 1e3eaa8
Integration Tests Report (Haiku)
Haiku LLM Test Results:
Success rate: 100.00% (6/6)

Total cost: USD 0.00

| instance_id | success | reason | cost |
|---|---|---|---|
| t04_git_staging | True | | 0 |
| t05_simple_browsing | True | | 0 |
| t01_fix_simple_typo | True | | 0 |
| t03_jupyter_write_file | True | | 0 |
| t06_github_pr_browsing | True | | 0 |
| t02_add_bash_hello | True | | 0 |

Integration Tests Report (DeepSeek)
DeepSeek LLM Test Results:
Success rate: 83.33% (5/6)

Total cost: USD 0.00

| instance_id | success | reason | cost |
|---|---|---|---|
| t04_git_staging | True | | 0 |
| t05_simple_browsing | False | The answer is not found in any message. Total messages: 2. | 0 |
| t02_add_bash_hello | True | | 0 |
| t01_fix_simple_typo | True | | 0 |
| t03_jupyter_write_file | True | | 0 |
| t06_github_pr_browsing | True | | 0 |

Download testing outputs (includes both Haiku and DeepSeek results): Download

@enyst merged commit f0257c7 into main Dec 14, 2024
19 checks passed
@enyst deleted the fix-stuck-loop-recovery-simple branch December 14, 2024 19:49
Labels
integration-test: Runs integration tests on the PR
lint-fix: Attempts to fix lint issues on the PR
Development

Successfully merging this pull request may close these issues.

[Bug]: Cannot recover from "Agent stuck in loop"
3 participants