Average time taken to start: ~8 seconds
Average time taken to stop the server when there is at least 1 dispatch done: ~30 seconds
Most of the start time is spent verifying that the server is ready to accept dispatches, so it is acceptable. The stop time, however, is too long and we should try to reduce it. The majority of the time when stopping the server is taken up by the _terminate_child_processes function (can be found here).
Currently we send the SIGINT signal to the leader process and then shut down its children, and we know that this works, albeit slowly. But when I tried other ways of terminating the processes, such as the terminate and kill methods made available by psutil, none of them worked and the command hung, waiting forever.
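One way to avoid waiting forever is to bound every wait with a timeout and escalate only when the timeout expires. The sketch below is illustrative and is not the project's current code: it uses the standard library's subprocess module rather than psutil, and the names stop_with_escalation and grace_seconds are hypothetical. It does not explain why the earlier attempts hung; it only shows the bounded-wait pattern.

```python
import subprocess
import sys


def stop_with_escalation(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask a process to exit, then force-kill it if it does not comply in time.

    Hypothetical helper: returns "already_exited", "terminated", or "killed".
    """
    if proc.poll() is not None:
        # Nothing to do, the process is gone already.
        return "already_exited"

    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        # Bounded wait, so a stuck process cannot block shutdown indefinitely.
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # forceful stop (SIGKILL on POSIX)
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    # Stand-in for a worker process: a child that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_with_escalation(child, grace_seconds=5.0))
```

The same shape should carry over to psutil, whose Process objects also provide terminate, kill, and a wait method that accepts a timeout.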
We need to look further into fixing this, possibly by using the Dask APIs to stop the cluster of workers instead of shutting down their processes directly.
It would also be better to be more verbose and tell the user exactly which stage is in progress when starting/stopping the server.
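A rough sketch of what stage-level reporting could look like, assuming a small context manager wrapped around each startup or shutdown step. The helper name stage, its report parameter, and the stage labels are all hypothetical and not taken from the codebase.

```python
import time
from contextlib import contextmanager


@contextmanager
def stage(name, report=print):
    """Announce a stage, run the wrapped block, then report how long it took."""
    report(f"{name}...")
    started = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - started
        report(f"{name} finished in {elapsed:.1f}s")


if __name__ == "__main__":
    # Placeholder stages; the sleeps stand in for the real shutdown work.
    with stage("Stopping worker cluster"):
        time.sleep(0.2)
    with stage("Terminating child processes"):
        time.sleep(0.1)
```

Reporting the elapsed time per stage would also make it easier to see which step accounts for most of the ~30 seconds when stopping.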