✨Collaboration long polling fallback #517

Open
AntoLC wants to merge 7 commits into main from feature/collab-long-polling

AntoLC
Collaborator

@AntoLC AntoLC commented Dec 18, 2024

Purpose

Some users have their WebSockets blocked, so they cannot collaborate.
If they edit a document at the same time as other collaborators, it creates constant conflicts in the document.

Proposal

We managed to provide an experience almost as good as with WebSockets.

  • We use an HTTP fallback when the WebSocket cannot connect.
  • We still use the Hocus Pocus mechanism, so pushes and pulls are triggered by the Hocus Pocus provider and server.
  • Because we keep the Hocus Pocus mechanism, we still use y-protocols/sync, which keeps our requests very light (a few bytes).
  • We use SSE (https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events)
    to pull data, which minimizes the number of requests and keeps our documents in sync with each other as much as possible.
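
As a rough illustration of the pull side: SSE is a text protocol, so binary y-protocols messages need a text encoding before they can travel in a `data:` field. The sketch below assumes base64 for that step. The commits in this PR mention a `toBase64` helper, but its actual implementation and its use for SSE framing are assumptions here, and every function name is a stand-in rather than the PR's code.

```typescript
// Hypothetical helpers: names and frame layout are assumptions, not the PR's code.

/** Encode binary sync/awareness bytes as base64 so they fit in a text SSE frame. */
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

/** Decode a base64 string back into the original bytes. */
function fromBase64(encoded: string): Uint8Array {
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

/** Server side: one SSE message is a "data:" line followed by a blank line. */
function formatSseFrame(payload: Uint8Array): string {
  return `data: ${toBase64(payload)}\n\n`;
}

/** Client side: recover the bytes from a raw frame (EventSource does this split for you). */
function parseSseFrame(frame: string): Uint8Array {
  const line = frame.split("\n").find((l) => l.startsWith("data: "));
  if (line === undefined) {
    throw new Error("SSE frame has no data field");
  }
  return fromBase64(line.slice("data: ".length));
}
```

In a browser, `EventSource` parses the frames itself, so only `fromBase64(event.data)` would be needed in the `onmessage` handler.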

Cases we solved:

  • connect users together even without WebSockets
  • keep access rights (can edit / can view) by using the same mechanism as with WebSockets
  • keep awareness (cursors), sync and document updates
  • keep our requests light
  • add an nginx auth cache: the backend is queried once every 30 seconds
  • test what I could
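
The auth cache itself lives in the nginx configuration, which is not shown in this conversation. To make the intended behaviour concrete, here is a minimal TypeScript sketch of the same idea: remember a successful (200) auth answer for 30 seconds and ask the backend again once it expires. The class name, the cache key and the `askBackend` callback are all hypothetical.

```typescript
// Hypothetical model of the caching rule; the real cache is implemented in nginx.

class AuthGrantCache {
  // key -> timestamp (ms) at which the cached 200 stops being valid
  private readonly expiries = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 30000) {
    this.ttlMs = ttlMs;
  }

  /** True when a 200 answer for this key is cached and not yet expired. */
  hasFreshGrant(key: string, nowMs: number): boolean {
    const expiresAt = this.expiries.get(key);
    if (expiresAt === undefined) return false;
    if (expiresAt <= nowMs) {
      this.expiries.delete(key);
      return false;
    }
    return true;
  }

  /** Record the backend's answer; only 200 responses are cached. */
  record(key: string, status: number, nowMs: number): void {
    if (status === 200) {
      this.expiries.set(key, nowMs + this.ttlMs);
    } else {
      this.expiries.delete(key);
    }
  }
}

/** Ask the backend only when no fresh grant is cached for this key. */
async function isAuthorized(
  cache: AuthGrantCache,
  key: string,
  askBackend: (key: string) => Promise<number>,
): Promise<boolean> {
  const now = Date.now();
  if (cache.hasFreshGrant(key, now)) return true;
  const status = await askBackend(key);
  cache.record(key, status, now);
  return status === 200;
}
```

The trade-off of any such cache is that a revoked right can stay effective until the cached entry expires, here for up to 30 seconds.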

Architecture

flowchart TD
    title1[WebSocket Success]-->Client1(Client)<--->|WebSocket Success|WS1(Websocket) --> Nginx1(Nginx) <--> Auth1("Auth Sub Request (Django)") --->|With the correct rights|YServer1("Hocus Pocus Server")
    YServer1 --> WS1
    YServer1 <--> clients(Dispatch to clients)
    title2[WebSocket Fails - Push data]-->Client2(Client)---|WebSocket fails|HTTP2(HTTP) --> Nginx2(Nginx) <--> Auth2("Auth Sub Request (Django)")--->|With the correct rights|Express2(Express) --> YServer2("Hocus Pocus Server") --> clients(Dispatch to clients)
    title3[WebSocket Fails - Pull data]-->Client3(Client)<--->|WebSocket fails|SSE(SSE) --> Nginx3(Nginx) <--> Auth3("Auth Sub Request (Django)") --->|With the correct rights|Express3(Express) --> YServer3("Listen Hocus Pocus Server")
    YServer3("Listen Hocus Pocus Server") --> SSE
    YServer3("Listen Hocus Pocus Server") <--> clients(Data from clients)

@AntoLC AntoLC self-assigned this Dec 18, 2024
@AntoLC AntoLC changed the title ✨Collab long polling ✨Collaboration long polling fallback Dec 18, 2024
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 3 times, most recently from 1360973 to 55238a7 Compare December 23, 2024 16:18
@AntoLC AntoLC changed the base branch from main to refacto/collaboration-process December 23, 2024 16:19
@AntoLC AntoLC mentioned this pull request Dec 23, 2024
4 tasks
Base automatically changed from refacto/collaboration-process to main December 24, 2024 11:29
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 4 times, most recently from 137c5b1 to ff343ca Compare December 24, 2024 15:21
@AntoLC AntoLC marked this pull request as ready for review December 24, 2024 15:21
@AntoLC AntoLC requested a review from YousefED December 24, 2024 15:25
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch from ff343ca to d95e892 Compare December 24, 2024 15:32
@virgile-dev
Collaborator

Great job @AntoLC !

Collaborator

@YousefED YousefED left a comment

@AntoLC Nice you got this working. If I see it correctly, you created a new endpoint over which you're always syncing the entire Y.Doc to and from the server.

If I'm not mistaken, normally the Y.js sync protocol is more efficient than this and syncs the exact updates required. What's the reason you went for this approach (new endpoint, syncing entire doc) instead of the proxy approach? I think the proxy approach has some potential advantages:

  • We can keep the same sync protocol, but just switch to a different transport (more efficient and awareness would still work)
  • The HocusPocus side can stay the same, and our "fix" would be isolated to a separate layer (less code complexity and a smaller chance of bugs or security issues)

I might be missing some advantages of your current approach, but my concern is mainly that it adds more "custom code", which is another surface we need to test, maintain and secure. The proxy approach would isolate / limit this more, I think.

@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 4 times, most recently from b24b01c to 3eb9f69 Compare January 21, 2025 14:37
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 4 times, most recently from b8ff4ad to c64f1f2 Compare February 14, 2025 15:58
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch from c64f1f2 to d26da26 Compare February 14, 2025 16:08
@AntoLC
Collaborator Author

AntoLC commented Feb 14, 2025

You can test this PR before it is merged on https://docs-ia.beta.numerique.gouv.fr/.
To deactivate the WebSocket, add the query param withoutWS=true.

Example public doc: https://docs-ia.beta.numerique.gouv.fr/docs/481a9933-3514-4aeb-9877-c21be1388877/?withoutWS=true

@AntoLC AntoLC force-pushed the feature/collab-long-polling branch from d26da26 to 6355348 Compare February 14, 2025 18:51
The environment was missing in the Sentry
configuration.
This commit adds the environment to the
Sentry configuration.
We can now interact with the collaboration server
using HTTP requests.
It will be used as a fallback when the websocket
is not working.
Two kinds of requests:
 - to send messages to the server we use POST requests
 - to get messages from the server we use a GET
 request using SSE (Server-Sent Events)
We will need toBase64 in different features,
so it is better to move it to "doc-management".
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch 2 times, most recently from 6f8744e to c096f35 Compare February 14, 2025 19:35
Create the CollaborationProvider class.
This class inherits from the HocuspocusProvider class.
It integrates a fallback mechanism to handle the
case where the user cannot connect with websockets.
It uses POST requests to send data to the
collaboration server.
It uses an EventSource to receive data from the
collaboration server.
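
The routing decision described in this commit could be sketched roughly as below. This is not the `CollaborationProvider` code, which is not shown in this conversation: the class, method and callback names are hypothetical, and the two send functions are injected so the example stays independent of any browser or Hocuspocus API.

```typescript
// Hypothetical sketch: names are illustrative, not the PR's API.
type SendFn = (message: Uint8Array) => void;
type TransportName = "websocket" | "http";

class FallbackRouter {
  private websocketOpen = false;
  private readonly sendViaWebSocket: SendFn;
  private readonly sendViaHttpPost: SendFn;

  constructor(sendViaWebSocket: SendFn, sendViaHttpPost: SendFn) {
    this.sendViaWebSocket = sendViaWebSocket;
    this.sendViaHttpPost = sendViaHttpPost;
  }

  /** Call from the WebSocket "open" handler. */
  handleWebSocketOpen(): void {
    this.websocketOpen = true;
  }

  /** Call from the WebSocket "close" or "error" handler. */
  handleWebSocketDown(): void {
    this.websocketOpen = false;
  }

  /** HTTP is the default until the WebSocket has actually opened. */
  currentTransport(): TransportName {
    return this.websocketOpen ? "websocket" : "http";
  }

  /** Route an outgoing message and report which transport carried it. */
  send(message: Uint8Array): TransportName {
    const transport = this.currentTransport();
    if (transport === "websocket") {
      this.sendViaWebSocket(message);
    } else {
      this.sendViaHttpPost(message);
    }
    return transport;
  }
}
```

In a browser, `sendViaHttpPost` would typically wrap a `fetch` call with `method: "POST"`, while incoming messages would arrive through an `EventSource` `onmessage` handler.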
We adapt the nginx configuration to work
with HTTP requests on the collaboration routes.
Requests are light but quite network intensive,
so we add a cache system above "collaboration-auth".
It means the backend will be called only once
every 30 seconds after a 200 response.
Firefox with websocket
Other without
@AntoLC AntoLC force-pushed the feature/collab-long-polling branch from c096f35 to 56f9a00 Compare February 14, 2025 19:45
@AntoLC AntoLC requested review from lunika and YousefED February 14, 2025 19:46
3 participants