Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(eval): misc SWE-Bench improvement - use different resources for different instances #6313

Merged
merged 4 commits into from
Jan 16, 2025

Conversation

xingyaoww
Copy link
Collaborator

@xingyaoww xingyaoww commented Jan 16, 2025

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Give a summary of what the PR does, explaining any non-trivial design decisions

  • Improve update output script so it can work on larger output file
  • Change timeout for scripts to under 600 secs to fits in recent runtime API constraints
  • move eval_infer create runtime into try-except-finally so we cleanup runtime resource properly when failed
  • add a resource mapping for different evaluation instances that requires more resource to run (to save some OOM restart)

Link of any specific issues this addresses


To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:9c912b6-nikolaik   --name openhands-app-9c912b6   docker.all-hands.dev/all-hands-ai/openhands:9c912b6

@xingyaoww xingyaoww marked this pull request as ready for review January 16, 2025 17:45
@xingyaoww
Copy link
Collaborator Author

Thanks Calvin!

OpenHands can you help me address the comments?

@xingyaoww xingyaoww added the fix-me Attempt to fix this issue with OpenHands label Jan 16, 2025
@openhands-agent
Copy link
Contributor

OpenHands started fixing the pr! You can monitor the progress here.

@openhands-agent
Copy link
Contributor

Here's a concise overview of the changes and their status:

✅ Successfully Resolved:

  1. File handling improvement:

    • Implemented proper context manager (with statement) for file operations
    • Fixed potential file handle leak issues
    • Ensures proper resource cleanup
  2. Type hint modernization:

    • Removed deprecated typing.Dict
    • Replaced with built-in dict type hints
    • Updated all relevant type annotations

🔍 Current Status:

  • All identified issues appear to have been successfully addressed
  • Changes maintain existing functionality while improving code quality
  • No remaining issues are evident from the provided information

The changes effectively modernized the code and improved resource management without altering the core functionality.

Copy link
Collaborator

@csmith49 csmith49 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, looking forward to not having so many restarts!

@xingyaoww xingyaoww merged commit 72af7bb into main Jan 16, 2025
24 checks passed
@xingyaoww xingyaoww deleted the xw/resource-fix branch January 16, 2025 18:48
csmith49 pushed a commit to csmith49/OpenHands that referenced this pull request Jan 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix-me Attempt to fix this issue with OpenHands
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants