PyTorch plugin for UCC

This repo implements the PyTorch Process Group API for UCC as a third-party plugin.

Requirements

License

This repo is released under the MIT license. Please see the LICENSE file for more information.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.

Build

UCX_HOME=<PATH_TO_UCX> UCC_HOME=<PATH_TO_UCC> WITH_CUDA=<PATH_TO_CUDA> python setup.py install

UCX_HOME (required): path to the UCX installation directory.

UCC_HOME (required): path to the UCC installation directory.

WITH_CUDA (optional): path to the CUDA installation. If WITH_CUDA=no is set, only CPU tensors are supported.
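
Before running the build it can help to confirm that these variables point at real directories. The helper below is a minimal sketch and not part of this repo: the name `check_build_env` is hypothetical, and it only checks the rules stated above (the two required paths, and `WITH_CUDA` being either `no` or a directory). It does not reproduce whatever checks `setup.py` itself performs.

```python
import os


def check_build_env(env=None):
    """Return a list of problems found in the build-related variables.

    Hypothetical helper: it mirrors only the rules described in this README.
    """
    env = os.environ if env is None else env
    problems = []

    # UCX_HOME and UCC_HOME are both required and should be directories.
    for name in ("UCX_HOME", "UCC_HOME"):
        path = env.get(name)
        if not path:
            problems.append(f"{name} is required but not set")
        elif not os.path.isdir(path):
            problems.append(f"{name}={path} is not a directory")

    # WITH_CUDA is optional; "no" requests a CPU-only build.
    with_cuda = env.get("WITH_CUDA")
    if with_cuda and with_cuda != "no" and not os.path.isdir(with_cuda):
        problems.append(f"WITH_CUDA={with_cuda} is neither 'no' nor a directory")

    return problems


if __name__ == "__main__":
    for problem in check_build_env():
        print(problem)
```

Running the script in the shell that will invoke `python setup.py install` prints one line per problem and nothing when the variables look consistent.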

Run

Configuration variables

| Name | Values | Description |
| --- | --- | --- |
| TORCH_UCC_ALLGATHER_BLOCKING_WAIT | 0 or 1 | Sets behavior of wait function for CUDA Allgather. Async collective in PyTorch |
| TORCH_UCC_ALLREDUCE_BLOCKING_WAIT | 0 or 1 | Sets behavior of wait function for CUDA Allreduce. |
| TORCH_UCC_ALLTOALL_BLOCKING_WAIT | 0 or 1 | Sets behavior of wait function for CUDA Alltoall. |
| TORCH_UCC_BCAST_BLOCKING_WAIT | 0 or 1 | Sets behavior of wait function for CUDA Bcast. |

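
These variables can be exported in the shell or set from Python before the process group is created. The sketch below builds the settings for a chosen set of collectives. The helper name `blocking_wait_env` is hypothetical, and two things are assumptions not confirmed by this README: that a value of 1 selects the blocking wait, and that the plugin reads the variables no earlier than process group initialization.

```python
import os

# Collective names taken from the configuration variable names listed above.
_COLLECTIVES = ("ALLGATHER", "ALLREDUCE", "ALLTOALL", "BCAST")


def blocking_wait_env(blocking):
    """Map every TORCH_UCC_*_BLOCKING_WAIT variable to "1" or "0".

    `blocking` is the set of collectives that should use a blocking wait
    (assumption: 1 means blocking, 0 means non-blocking).
    """
    unknown = set(blocking) - set(_COLLECTIVES)
    if unknown:
        raise ValueError(f"unknown collectives: {sorted(unknown)}")
    return {
        f"TORCH_UCC_{name}_BLOCKING_WAIT": "1" if name in blocking else "0"
        for name in _COLLECTIVES
    }


# Example: request a blocking wait for Allreduce only, set before
# dist.init_process_group is called.
os.environ.update(blocking_wait_env({"ALLREDUCE"}))
```

Setting the variables in one place keeps the per-collective choices visible next to the code that launches training.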
export LD_LIBRARY_PATH=<PATH_TO_UCX>/lib:<PATH_TO_UCC>/lib:$LD_LIBRARY_PATH
python example.py

import torch
import torch.distributed as dist
import torch_ucc  # plugin providing the 'ucc' backend

# ...
dist.init_process_group('ucc', rank=comm_rank, world_size=comm_size)
# ...
# all_to_all_single takes the output tensor first, then the input tensor
dist.all_to_all_single(recv_tensor, send_tensor)
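
The example leaves open how `comm_rank` and `comm_size` are obtained. One common approach, sketched here as an assumption and not as this repo's method, is to read them from the environment variables set by the launcher: `RANK` and `WORLD_SIZE` for `torchrun`, or `OMPI_COMM_WORLD_RANK` and `OMPI_COMM_WORLD_SIZE` for Open MPI's `mpirun`. The function name `discover_rank_and_size` is hypothetical.

```python
import os


def discover_rank_and_size(env=None):
    """Return (rank, world_size) from launcher-provided environment variables.

    Falls back to a single-process setup (0, 1) when no launcher variables
    are present.
    """
    env = os.environ if env is None else env
    candidates = (
        ("RANK", "WORLD_SIZE"),                            # torchrun
        ("OMPI_COMM_WORLD_RANK", "OMPI_COMM_WORLD_SIZE"),  # Open MPI mpirun
    )
    for rank_key, size_key in candidates:
        if rank_key in env and size_key in env:
            return int(env[rank_key]), int(env[size_key])
    return 0, 1


comm_rank, comm_size = discover_rank_and_size()
```

The resulting values can be passed as `rank=comm_rank, world_size=comm_size` in the call above. When no `init_method` or store is given, `init_process_group` uses the `env://` rendezvous, which also expects `MASTER_ADDR` and `MASTER_PORT` to be set.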
