Feat socket io #5056

Merged
tofarr merged 127 commits into main from feat-socket-io
Nov 26, 2024

tofarr (Collaborator) commented Nov 15, 2024

Using SocketIO as the Underlying Communication Protocol

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Migration / refactor to use SocketIO as the communication protocol instead of raw WebSockets:

  • Better Error Tolerance
  • Better Clusterability (Using Redis)

Testing Scenarios:

Starting a new session should open a single WebSocket:
image
image

Refreshing the page should allow the session to continue uninterrupted (I refreshed the page, then asked a follow-up question):
image

Interrupting the socket without refreshing the page should allow the session to continue uninterrupted (in the screenshot, I used the developer console to close the socket, and a new one instantly opened to replace it; I then asked a follow-up question with no issue):
image
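
The reconnect scenarios above rely on the session outliving any one socket. A minimal sketch of that idea, assuming a registry keyed by session ID rather than by connection ID; `Session`, `SessionRegistry`, and the ID strings are hypothetical names for illustration, not the PR's actual classes:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Session:
    """Hypothetical session state that must survive reconnects."""
    session_id: str
    events: List[str] = field(default_factory=list)
    connection_id: Optional[str] = None  # None while no socket is attached


class SessionRegistry:
    """Tracks sessions by session ID so sockets can come and go."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}
        self._by_connection: Dict[str, str] = {}

    def connect(self, connection_id: str, session_id: str) -> Session:
        # Reuse the existing session if there is one; otherwise create it.
        session = self._sessions.get(session_id)
        if session is None:
            session = Session(session_id)
            self._sessions[session_id] = session
        # Forget the previous connection, if any, before attaching the new one.
        if session.connection_id is not None:
            self._by_connection.pop(session.connection_id, None)
        session.connection_id = connection_id
        self._by_connection[connection_id] = session_id
        return session

    def disconnect(self, connection_id: str) -> None:
        # Detach the socket but keep the session and its history.
        session_id = self._by_connection.pop(connection_id, None)
        if session_id is None:
            return  # unknown or already-replaced connection
        session = self._sessions[session_id]
        if session.connection_id == connection_id:
            session.connection_id = None
```

With this shape, a page refresh or a dropped socket is just a `disconnect` followed by a `connect` with the same session ID, and the event history is still there.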

Many of the use cases for this PR (particularly those related to Redis) apply mostly to the SaaS environment, where pods may cease to exist mid-session and we want to minimize user disruption. The logic for handling multiple pods is more complex than for a single pod.
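
To illustrate why a shared message bus helps with multiple pods, here is a rough sketch in which an in-memory broker stands in for Redis pub/sub. Everything in it (`InMemoryBroker`, `Pod`, the `session_events` channel name) is an assumption made for illustration and is not taken from the PR:

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class InMemoryBroker:
    """Stand-in for a pub/sub channel shared by all pods (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[dict], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: dict) -> None:
        # Every subscriber on the channel sees every message.
        for callback in list(self._subscribers[channel]):
            callback(message)


class Pod:
    """One server process; it only delivers to clients attached to itself."""

    CHANNEL = "session_events"  # hypothetical channel name

    def __init__(self, name: str, broker: InMemoryBroker) -> None:
        self.name = name
        self.broker = broker
        self.inboxes: Dict[str, List[dict]] = {}  # session_id -> delivered events
        broker.subscribe(self.CHANNEL, self._on_message)

    def attach_client(self, session_id: str) -> None:
        self.inboxes.setdefault(session_id, [])

    def emit(self, session_id: str, event: dict) -> None:
        # Publish instead of delivering directly: the client may be on another pod.
        self.broker.publish(self.CHANNEL, {"session_id": session_id, "event": event})

    def _on_message(self, message: dict) -> None:
        inbox = self.inboxes.get(message["session_id"])
        if inbox is not None:  # ignore sessions attached elsewhere
            inbox.append(message["event"])
```

An event emitted on one pod reaches the client wherever it is currently attached, which is the property that matters when a pod disappears and the client reconnects to a different one.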


Fixes #5151


To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:56ca2ed-nikolaik   --name openhands-app-56ca2ed   docker.all-hands.dev/all-hands-ai/openhands:56ca2ed

diwu-sf (Contributor) commented Nov 15, 2024

Will this make #5019 easier to implement by decoupling the Session from the websocket?

Comment on lines +66 to +73
except:
    try:
        asyncio.get_running_loop()
        logger.warning(
            'error_reading_from_redis', exc_info=True, stack_info=True
        )
    except RuntimeError:
        return  # Loop has been shut down
Collaborator

Should we os.Exit here or anything?

Collaborator Author

No. Exiting could interfere with any other graceful shutdown tasks; ending the loop should be sufficient.
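
The pattern discussed in this thread can be sketched as a standalone polling loop: log and keep going while the event loop is alive, and return quietly once it is gone. This is a simplified illustration, not the PR's code. `read_message`, `handle`, and `listen` are hypothetical names, and the sketch catches `Exception` instead of using a bare `except` so that task cancellation still propagates:

```python
import asyncio
import logging

logger = logging.getLogger("redis_listener_sketch")


def loop_is_running() -> bool:
    """Return True when called from code running inside an event loop."""
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False


async def listen(read_message, handle) -> int:
    """Poll read_message() until it returns None; return the number of logged errors."""
    errors = 0
    while True:
        try:
            message = await read_message()
            if message is None:
                return errors
            handle(message)
        except asyncio.CancelledError:
            raise  # let cancellation propagate so shutdown stays graceful
        except Exception:
            if not loop_is_running():
                return errors  # loop has been shut down; nothing left to do
            errors += 1
            logger.warning("error_reading_from_redis", exc_info=True)
```

Returning from the coroutine leaves any other shutdown work to the rest of the application, which matches the reasoning given above for not exiting the process.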

rbren (Collaborator) left a comment

The only bit that'd be nice to tackle is that, between SIGTERM and SIGKILL, the agent appears as STOPPED even though it's still working during the grace period. We can tackle that in a follow-up, though; this is working well as-is.

tofarr enabled auto-merge (squash) November 26, 2024 00:12
tofarr merged commit c7d8971 into main Nov 26, 2024
13 checks passed
tofarr deleted the feat-socket-io branch November 26, 2024 00:12
Development

Successfully merging this pull request may close these issues.

[Bug]: WebSocket Crashes with ValueError on latest_event_id
5 participants