Commit

some refinement on code comment (#1324)
Co-authored-by: 宋家亮_Jialiang <[email protected]>
jlsong01 and 宋家亮_Jialiang authored Nov 7, 2024
1 parent de6c5b1 commit d529925
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion dlrover/python/elastic_agent/torch/training.py
@@ -290,7 +290,7 @@ def _join_rendezvous(self):
def next_rendezvous(self):
"""The handler will periodically query the world from the master until
the world is not empty. The world is a dictionary like
- like {0: 8, 1: 8, 2: 8} where the key is the node ID and the value is
+ {0: 8, 1: 8, 2: 8} where the key is the node ID and the value is
the local world size. The handler can get its rank by the position
of it node ID in the world.
"""
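To illustrate the docstring above, here is a minimal sketch of how a node could derive its rank from the world dictionary. The function name `ranks_from_world` is hypothetical, and the use of dictionary order as "position" is an assumption; the actual handler in `training.py` may differ.

```python
from typing import Dict, Tuple


def ranks_from_world(world: Dict[int, int], node_id: int) -> Tuple[int, int, int]:
    """Return (node_rank, first_global_rank, world_size) for node_id.

    `world` maps node ID -> local world size, e.g. {0: 8, 1: 8, 2: 8}.
    """
    # Assumption: the node's position in the dictionary defines its rank.
    node_ids = list(world)
    node_rank = node_ids.index(node_id)
    # Global rank of this node's first worker: the sum of the local world
    # sizes of all nodes positioned before it.
    first_global_rank = sum(world[n] for n in node_ids[:node_rank])
    # Total number of workers across all nodes.
    world_size = sum(world.values())
    return node_rank, first_global_rank, world_size


world = {0: 8, 1: 8, 2: 8}
print(ranks_from_world(world, 1))  # (1, 8, 24)
```

With three nodes of eight workers each, node 1 sits at position 1, its workers would start at global rank 8, and the total world size is 24.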
2 changes: 1 addition & 1 deletion dlrover/trainer/torch/elastic_run.py
@@ -36,7 +36,7 @@
auto-config will set the nnodes as the number of nodes in a job,
nproc_per_node as the number of available GPUs. If the number of
nodes >= 4, it will set the network-check as True. If network-check is True,
- dlrover-run will launch simple tasks on each node to check wether
+ dlrover-run will launch simple tasks on each node to check whether
the node is slow or fault.
Single-node multi-worker
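The auto-config rule described in the docstring above can be sketched as follows. This is only an illustration of the stated rule: the function `auto_config` and its parameter names are hypothetical and are not taken from `elastic_run.py`.

```python
def auto_config(node_count: int, gpu_count: int) -> dict:
    """Sketch of the documented auto-config rule (hypothetical helper)."""
    return {
        # nnodes is set to the number of nodes in the job.
        "nnodes": node_count,
        # nproc_per_node is set to the number of available GPUs.
        "nproc_per_node": gpu_count,
        # network-check is enabled when the job has 4 or more nodes.
        "network_check": node_count >= 4,
    }


print(auto_config(node_count=4, gpu_count=8))
```

Under this sketch, a 4-node job with 8 GPUs per node would enable the network check, while a 3-node job would not.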
