-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better reconnection for nonvnc on challenge restart #478
Comments
This should have already fixed DOS: dojo/workspace/services/desktop.nix Lines 7 to 18 in 89dbada
I think what you are proposing does not make sense as a "noVNC patch". Instead what we should do is put javascript on the top level page that decides where the iframe will point at. Somehow, we need the ability for that javascript to be informed that it should change the iframe location. For example:
We should do the same thing for all top-level /workspace/* routes. The tricky thing is receiving that event. We could for instance poll every 30s to see if I have a new workspace (e.g. container id changed and is in an initialized state). But that's gonna be a decent amount of increases requests and also be slow to do what we want: we want the new desktop to switch over as fast as possible. What we probably need to figure out is websockets or SSE. The critical thing is making sure this plays nice by not holding an available flask worker hostage and not holding db thread connections hostage. I believe CTFd actually has some support for this already that we've disabled in the docker-compose: |
I would lean towards SSE as it has way less overhead: it's a normal HTTP request that stays open and streams a text-based protocol. Websocket requires an upgrade handshake between the client and server and every message has to be XORed. I took a look at the CTFd notification logic. A couple important files are the route the clients connect to and the events_manager classes. Here's my understanding of how it works. Redis has a built in pub/sub mechanism. The event manager's job is to spawn a gevent task (basically a thread) to subscribe to redis. When an HTTP request comes through to the SSE endpoint, it returns a generator which is how flask streams stuff. The generator hooks into the event manager, encodes the events in the SSE format, and forwards them to the client. It seems like this would lock up a flask worker, but I'd have to test to be sure. Maybe the only solution is to add more flask workers. |
I think I also lean towards SSE, especially since we don't have a need for bidirectional communication, and probably even more importantly, CTFd already has logic for this! I see a My surface level understanding is that CTFd is using gevent by default, and we continue to use this in the DOJO, a "lightweight thread", as you mention. We specify 8 workers, but each worker actually also spins up real (linux) threads:
I think that 9th process is a a supervisor process that manages the 8 workers, maybe? |
The question is where flask iterates the generator from. |
I think I see what you're saying. I think it should be iterating here in the Which pulls from: I think gevent should be taking care of making sure the worker process is context switching when it gets blocked on that I can see that gevent is willing to monkey patch SimpleQueues (https://www.gevent.org/api/gevent.monkey.html#gevent.monkey.patch_queue), but this is a regular Queue. Not sure how it allows that to function, but I can't imagine something so core isn't supported, it must be context switching somehow (maybe simply with the assistance of regular threads). |
Just throwing it out there, but I think this idea will also have nice implications for #426, we can get some sweet interfacing if we can cleanly stream solves/awards/etc. |
we could also use jquery: we could check every 2 seconds on the client if the status is "noVNC_status_error noVNC_open" and if so trigger a reload of the page or get the new iframe src from a specific endpoint. |
We generate a new VNC password when we restart a challenge, and as a result, novnc can't automatically reconnect across challenge starts. This annoyance will get more noticeable as we streamline the UX (e.g., #473). We should probably patch novnc to refresh the page when a reconnection fails (which will also fix the DOS issue).
The text was updated successfully, but these errors were encountered: