Because there is no communication between nodes, the stats reported at the end of each pipeline step only aggregate results from the current node. What gets written to disk is the status of the last worker to finish, rather than the global stats.
Hi, this is resolved for the slurm executor by running a stats merger after all subtasks are finished. I don't think there is a way to get the same behavior with the local executor: in multi-node mode the global orchestration is not done by datatrove, so the responsibility of launching the merge script can't be handled by datatrove.
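For local multi-node runs, a manual merge along these lines could be run once every node has finished. This is only a minimal sketch and not datatrove's own merger: it assumes each node wrote its stats as a JSON file of nested numeric counters, and the function names (`merge_counters`, `merge_stats_files`) and the file layout are hypothetical.

```python
import json


def merge_counters(total: dict, part: dict) -> dict:
    """Recursively add the numeric leaves of `part` into `total` (in place)."""
    for key, value in part.items():
        if isinstance(value, dict):
            # Descend into nested sections, creating them on first sight.
            merge_counters(total.setdefault(key, {}), value)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            # Additive counters are summed across nodes.
            total[key] = total.get(key, 0) + value
        else:
            # Non-numeric leaves (e.g. a step name) are kept from the first file seen.
            total.setdefault(key, value)
    return total


def merge_stats_files(paths, output_path):
    """Merge per-node stats files (hypothetical layout) into one global file."""
    merged = {}
    for path in sorted(paths):
        with open(path) as f:
            merge_counters(merged, json.load(f))
    with open(output_path, "w") as f:
        json.dump(merged, f, indent=2)
    return merged
```

Whatever launches the nodes would call `merge_stats_files` after the last node exits. Plain summation is only correct for additive counters; values such as means, minima, or maxima would need their own merge rule.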