make your deep learning life easier
Rosetta Stone is a lightweight framework that aims to make your deep learning life easier. It enables users to perform end-to-end experiments quickly and efficiently. Compared with other open-source libraries, Rosetta is a low-code alternative that lets you carry out deep learning tasks with only a few lines of code. It is easy to use and lets you focus on designing your models!
🦆 Version 1.1.15 out now!
*Note: the master branch is the development branch.*
- YAML-styled model configuration for elegantly describing complex applications
- Best practices
- Unified design for various applications
- Pre-trained models
- State-of-the-art performance
- Python >= 3.6
- PyTorch >= 1.4.0
Install the latest version from source
# clone the project repository, and install via pip
$ git clone https://git.huya.com/wangfeng2/rosetta_stone.git \
&& cd rosetta_stone \
&& pip install -e .
Or install the released stable version via pip:
$ pip install --upgrade rosetta-stone
For ease of use, you can also run rosetta with Docker:
# build docker image
$ docker build --tag huya_ai:rosetta .
# run the docker container
$ docker run --rm -it -v $(PWD):/rosetta --name rosetta huya_ai:rosetta bash
In rosetta, you don't need to specify a training loop; just define the dataloaders and the models. Take ResNet as an example:
- Step 1: Create YAML Configuration
Create a yaml file (usually named `app.yaml`) within your repo, as in the example below.
```yaml
resnet56: &resnet56
model_module: examples.vision.resnet_model:ResNet
dataio_module: examples.vision.cifar10:CIFAR10
batch_size: 256
num_classes: 10
n_size: 9
```
- Step 2: Define Dataloader
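The yaml entry `dataio_module: examples.vision.cifar10:CIFAR10` points rosetta at a dataio class. Below is a minimal sketch of what such a class could look like; the constructor arguments and the method name `create_data_loader` are assumptions made for illustration, so check the examples in the repository for the interface rosetta actually expects.

```python
# examples/vision/cifar10.py -- hypothetical dataio module (names are assumptions)
import torch
from torchvision import datasets, transforms


class CIFAR10:
    """Wraps torchvision's CIFAR-10 dataset and builds train/eval dataloaders."""

    def __init__(self, batch_size: int = 256, data_dir: str = "./data", **kwargs):
        self.batch_size = batch_size
        self.data_dir = data_dir
        # Standard CIFAR-10 normalization statistics
        self.transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.4914, 0.4822, 0.4465),
                                 (0.2470, 0.2435, 0.2616)),
        ])

    def create_data_loader(self, mode: str = "train") -> torch.utils.data.DataLoader:
        # Download CIFAR-10 on first use and build a loader for the requested split
        dataset = datasets.CIFAR10(
            root=self.data_dir,
            train=(mode == "train"),
            download=True,
            transform=self.transform,
        )
        return torch.utils.data.DataLoader(
            dataset,
            batch_size=self.batch_size,
            shuffle=(mode == "train"),
            num_workers=4,
        )
```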
- Step 3: Define Model
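Similarly, `model_module: examples.vision.resnet_model:ResNet` should resolve to a model class. The sketch below is a standard CIFAR-style ResNet whose constructor accepts the `num_classes` and `n_size` keys from the yaml file (with `n_size: 9` giving the 6 * 9 + 2 = 56-layer variant); the constructor signature and keyword handling are assumptions, and the real example in the repo may differ.

```python
# examples/vision/resnet_model.py -- hypothetical model module (signature is an assumption)
import torch.nn as nn
import torch.nn.functional as F


class BasicBlock(nn.Module):
    """Standard 3x3 residual block used in CIFAR-style ResNets."""

    def __init__(self, in_planes, planes, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        # Projection shortcut when spatial size or channel count changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, planes, 1, stride=stride, bias=False),
                nn.BatchNorm2d(planes),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + self.shortcut(x))


class ResNet(nn.Module):
    """CIFAR ResNet with 6 * n_size + 2 layers (n_size=9 -> ResNet-56)."""

    def __init__(self, num_classes: int = 10, n_size: int = 9, **kwargs):
        super().__init__()
        self.in_planes = 16
        self.conv1 = nn.Conv2d(3, 16, 3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(16)
        self.layer1 = self._make_layer(16, n_size, stride=1)
        self.layer2 = self._make_layer(32, n_size, stride=2)
        self.layer3 = self._make_layer(64, n_size, stride=2)
        self.fc = nn.Linear(64, num_classes)

    def _make_layer(self, planes, blocks, stride):
        layers = []
        for i in range(blocks):
            layers.append(BasicBlock(self.in_planes, planes, stride if i == 0 else 1))
            self.in_planes = planes
        return nn.Sequential(*layers)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = F.adaptive_avg_pool2d(out, 1).flatten(1)
        return self.fc(out)
```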
- Step 4: Start to train
- training from scratch
  $ rosetta train resnet56 --yaml-path app.yaml
- override parameters defined in the yaml file
  # the cli parameter `--yaml-path` has a default value of `app.yaml`
  $ rosetta train resnet56 --batch_size=125
- training using automatic mixed precision (amp)
  $ rosetta train resnet56 --yaml-path app.yaml --use-amp
- distributed training using torch.distributed.launch (recommended)
  $ python -m torch.distributed.launch --module --nproc_per_node={GPU_NUM} rosetta.main train resnet56
- distributed training using horovod (not recommended)
  $ rosetta train resnet56 --use-horovod
You can contribute to this project by sending a merge request. After approval, the merge request will be merged by the reviewer.
Before making a contribution, please confirm that:
- Code quality stays consistent across the script, module or package.
- Code is covered by unit tests.
- API is maintainable.
- flambe: An ML framework to accelerate research and its path to production.
- Jacinle: It contains a range of utility functions for python development, including project configuration, file IO, image processing, inter-process communication, etc.
- homura: PyTorch utilities including trainer, reporter, etc.
- FARM: Fast & easy transfer learning for NLP. Harvesting language models for the industry.
- kotonoha: NLP utilities for research
- padertorch: A collection of common functionality to simplify the design, training and evaluation of machine learning models based on pytorch with an emphasis on speech processing.
- Tips, tricks and gotchas in PyTorch
- PyTorch Parallel Training: a guide to single-machine multi-GPU parallelism, mixed precision, and synchronized BatchNorm training
- Give your training a boost: speeding up data loading in PyTorch
- How is high-performance PyTorch made?
- service-streamer: Boosting your Web Services of Deep Learning Applications.
- Masked batchnorm in PyTorch