Replies: 2 comments 5 replies
-
Hey there, code 1006 is reserved for unexpected or abrupt closures - it can mean many things: the client's browser connection abruptly failed, a gateway terminated the connection, the server had issues with the TCP connection, the internet went down, and so on. Was it happening with |
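To make the meaning of 1006 concrete, here is a minimal sketch of a helper that labels a close code for logging. The names `describeClose` and `CloseInfo` are hypothetical and not part of graphql-ws or ws; the code meanings follow RFC 6455, which reserves 1005, 1006 and 1015 as codes that are never sent in a close frame and are only reported locally by the endpoint that saw the connection end.

```typescript
// Hypothetical logging helper; not part of graphql-ws or ws.
interface CloseInfo {
  code: number;
  // false when the code is reserved and can only be reported locally,
  // i.e. the peer never transmitted it in a close frame.
  sentOnWire: boolean;
  description: string;
}

// A few close codes defined by RFC 6455 (not an exhaustive list).
const KNOWN_CODES: Record<number, string> = {
  1000: "normal closure",
  1001: "endpoint going away",
  1005: "close frame received without a status code",
  1006: "abnormal closure, connection ended without a close frame",
  1011: "server encountered an unexpected condition",
  1015: "TLS handshake failure",
};

function describeClose(code: number): CloseInfo {
  const sentOnWire = code !== 1005 && code !== 1006 && code !== 1015;
  const description = KNOWN_CODES[code] ?? "unrecognized or application-specific code";
  return { code, sentOnWire, description };
}

console.log(describeClose(1006));
```

Because 1006 is generated by whichever side notices the connection is gone, seeing it in server logs does not by itself say whether the client, the server, or something in between dropped the connection.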
-
I'm unsure how to classify this issue, as I don't really know whether it's a client or a server issue. Anyhow, first I'd like to thank the maintainer @enisdenjo for the great support I've seen here!
Our issue has to do with a growing number of users connecting to our websocket servers (running in several Docker containers on AWS ECS), which causes stability issues. We decided to try to mitigate them by switching the sub protocol from Apollo's graphql-ws to the new one, graphql-transport-ws.
Our initial concerns were memory leaks and the lack of maintenance of the older protocol, though we were unable to confirm that our problems were related to websockets, as the GraphQL service we have been running also serves regular queries. Currently, we are in a migration period, separating the servers and adjusting clients to connect to a new endpoint that serves only websocket traffic and only the new sub protocol.
Since the new server has very few connected clients, I wanted to make sure we understand everything that is happening under the hood. One of the things we thought might be adding to CPU usage was the constant reconnects we see in the logs (though we can't confirm whether that really is an issue). We thought the problem would mostly affect unmaintained clients connecting with old libraries, old platforms, or bad networks. However, after selecting some newer clients for testing with the new server, we noticed the same frequent reconnect behaviour.
I'll share a log to shed light on it. Typically, the server sends datagrams every 2-10 seconds, depending on the hardware behind it.
Here you can see disconnects with code 1006 occur quite randomly, without any reason given that would indicate why they happened. Was it the client that chose to close the connection?
The code to set up the server follows the "use server" recipe at graphql-ws/lib/use/ws.
The client sets up the connection like this (https://github.com/tibber/com.tibber.athom/blob/master/lib/tibber.ts#L360):
Would you happen to have any ideas how to troubleshoot the constant reconnects? Any help much appreciated!
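One possible way to narrow this down is to look at how long each connection lived before it closed with 1006. The sketch below is only an illustration: `ConnectionRecord` and `summarizeLifetimes` are hypothetical names, and it assumes open and close timestamps (in milliseconds) have already been extracted from the server logs. Lifetimes that cluster tightly around one value could hint at a timeout somewhere between client and server, while a wide spread would point more towards network drops.

```typescript
// Hypothetical record shape, assumed to be parsed from server logs.
interface ConnectionRecord {
  openedAt: number; // epoch milliseconds
  closedAt: number; // epoch milliseconds
  code: number; // WebSocket close code reported by the server
}

interface LifetimeSummary {
  count: number;
  minSeconds: number;
  medianSeconds: number; // middle element (upper median for even counts)
  maxSeconds: number;
}

// Summarize connection lifetimes for one close code (1006 by default).
function summarizeLifetimes(
  records: ConnectionRecord[],
  code: number = 1006,
): LifetimeSummary | null {
  const seconds = records
    .filter((r) => r.code === code)
    .map((r) => (r.closedAt - r.openedAt) / 1000)
    .sort((a, b) => a - b);
  if (seconds.length === 0) return null;
  return {
    count: seconds.length,
    minSeconds: seconds[0],
    medianSeconds: seconds[Math.floor(seconds.length / 2)],
    maxSeconds: seconds[seconds.length - 1],
  };
}

// Made-up sample data, for illustration only.
const sample: ConnectionRecord[] = [
  { openedAt: 0, closedAt: 60_000, code: 1006 },
  { openedAt: 5_000, closedAt: 66_000, code: 1006 },
  { openedAt: 10_000, closedAt: 69_000, code: 1006 },
  { openedAt: 20_000, closedAt: 25_000, code: 1000 },
];
console.log(summarizeLifetimes(sample));
```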