GitHub - petuum/autodist: Simple Distributed Deep Learning on TensorFlow

Documentation | Examples

AutoDist is a distributed deep learning training engine for TensorFlow. AutoDist provides a user-friendly interface for distributing the training of a wide variety of deep learning models across many GPUs, scalably and with minimal code changes.

Introduction

Unlike specialized distributed ML systems, AutoDist is designed to speed up a broad range of DL models with strong all-round performance. AutoDist achieves this goal through:

Compilation: AutoDist expresses the parallelization of DL models as a standardized compilation process that optimizes multiple dimensions of ML parallelization, including synchronization, partitioning, and placement.
Composable architecture: AutoDist contains a flexible backend that can express a variety of ML parallelization techniques and allows composing distribution strategies that blend different distributed ML system architectures.
Model and resource awareness: Based on the compilation process, AutoDist analyzes the model and generates distribution strategies that adapt to both the model's properties and the cluster specification.
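
To make "model and resource awareness" concrete, here is a toy sketch of a per-variable distribution strategy. It is not AutoDist's API or its actual strategy builder: `Variable`, `choose_strategy`, the `partition_threshold` value, and the rule itself (parameter servers for sparsely updated variables, all-reduce for dense ones) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Variable:
    """Hypothetical description of one trainable variable in a model."""
    name: str
    num_elements: int
    sparse_updates: bool  # True if only a few rows are touched per step


def choose_strategy(variables: List[Variable], num_workers: int,
                    partition_threshold: int = 1_000_000) -> Dict[str, dict]:
    """Pick a synchronizer and shard count per variable (toy heuristic)."""
    plan = {}
    for var in variables:
        if var.sparse_updates:
            # Sparse gradients: route through parameter servers, and shard
            # large tables across workers (assumed rule, for illustration).
            shards = num_workers if var.num_elements >= partition_threshold else 1
            plan[var.name] = {"synchronizer": "parameter_server", "shards": shards}
        else:
            # Dense gradients: aggregate with all-reduce, unpartitioned.
            plan[var.name] = {"synchronizer": "all_reduce", "shards": 1}
    return plan


# A made-up model: one large embedding table plus a small dense layer.
model = [
    Variable("embedding/table", 50_000_000, sparse_updates=True),
    Variable("dense/kernel", 262_144, sparse_updates=False),
    Variable("dense/bias", 512, sparse_updates=False),
]
plan = choose_strategy(model, num_workers=4)
for name, decision in plan.items():
    print(name, decision)
```

The point of the sketch is only that a strategy can be a plain data structure derived from both the model (variable sizes, sparsity) and the cluster (worker count), so different synchronization schemes can be mixed within one model.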

Beyond these features, AutoDist is designed to hide the complexity of distributed systems from ML prototyping. It exposes a simple API that lets users of all levels adopt and switch between different distributed ML techniques.

For a closer look at the performance, please refer to our documentation.

Using AutoDist

Installation:

pip install autodist

Modifying existing TensorFlow code to use AutoDist is easy:

import tensorflow as tf
from autodist import AutoDist

ad = AutoDist(resource_spec_file="resource_spec.yml")

with tf.Graph().as_default(), ad.scope():
    ########################################################
    # Build your (single-device) model here,
    #   and train it in a distributed fashion.
    ########################################################
    sess = ad.create_distributed_session()
    sess.run(...)
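
The snippet above expects a `resource_spec.yml` file that describes the cluster. The helper below is a minimal sketch of how such a file might be generated. The field names (`nodes`, `address`, `gpus`, `chief`) are assumptions about the schema and should be checked against the AutoDist documentation, and `render_resource_spec` is a hypothetical helper, not part of AutoDist.

```python
from typing import Dict, List


def render_resource_spec(nodes: Dict[str, List[int]], chief: str) -> str:
    """Render a cluster description as YAML text.

    The keys used here (nodes, address, gpus, chief) are assumed,
    not confirmed by this README; verify them before use.
    """
    lines = ["nodes:"]
    for address, gpus in nodes.items():
        lines.append(f"  - address: {address}")
        lines.append(f"    gpus: [{', '.join(str(g) for g in gpus)}]")
        if address == chief:
            # Mark the node that coordinates the training job.
            lines.append("    chief: true")
    return "\n".join(lines) + "\n"


# A single machine with two GPUs, acting as its own chief.
spec_text = render_resource_spec({"localhost": [0, 1]}, chief="localhost")
print(spec_text)
```

Saving `spec_text` as `resource_spec.yml` next to the training script would match the `resource_spec_file` argument used above.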

Ready to try? Please refer to the examples in our Getting Started page.

References & Acknowledgements

We learned and borrowed insights from several open-source projects, including Horovod, Parallax, and tf.distribute.
