docs fix and v0.2.5 (#156)
* pre

* update docs

* update docs

* $ in bash

* size -> hidden_layer_size

* doctest

* doctest again

* filter a warning

* fix bug

* fix examples

* test fail

* test succ
Trinkle23897 authored Jul 22, 2020
1 parent 089b85b commit bd9c3c7
Showing 21 changed files with 139 additions and 122 deletions.
9 changes: 2 additions & 7 deletions .github/ISSUE_TEMPLATE.md
@@ -3,15 +3,10 @@
+ [ ] RL algorithm bug
+ [ ] documentation request (i.e. "X is missing from the documentation.")
+ [ ] new feature request
- [ ] I have visited the [source website], and in particular read the [known issues]
- [ ] I have searched through the [issue tracker] and [issue categories] for duplicates
- [ ] I have visited the [source website](https://github.com/thu-ml/tianshou/)
- [ ] I have searched through the [issue tracker](https://github.com/thu-ml/tianshou/issues) for duplicates
- [ ] I have mentioned version numbers, operating system and environment, where applicable:
```python
import tianshou, torch, sys
print(tianshou.__version__, torch.__version__, sys.version, sys.platform)
```

[source website]: https://github.com/thu-ml/tianshou/
[known issues]: https://github.com/thu-ml/tianshou/#faq-and-known-issues
[issue categories]: https://github.com/thu-ml/tianshou/projects/2
[issue tracker]: https://github.com/thu-ml/tianshou/issues?q=
9 changes: 2 additions & 7 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -7,15 +7,10 @@

Less important but also useful:

- [ ] I have visited the [source website], and in particular read the [known issues]
- [ ] I have searched through the [issue tracker] and [issue categories] for duplicates
- [ ] I have visited the [source website](https://github.com/thu-ml/tianshou)
- [ ] I have searched through the [issue tracker](https://github.com/thu-ml/tianshou/issues) for duplicates
- [ ] I have mentioned version numbers, operating system and environment, where applicable:
```python
import tianshou, torch, sys
print(tianshou.__version__, torch.__version__, sys.version, sys.platform)
```

[source website]: https://github.com/thu-ml/tianshou
[known issues]: https://github.com/thu-ml/tianshou/#faq-and-known-issues
[issue categories]: https://github.com/thu-ml/tianshou/projects/2
[issue tracker]: https://github.com/thu-ml/tianshou/issues?q=
@@ -1,4 +1,4 @@
name: PEP8 Check
name: PEP8 and Docs Check

on: [push, pull_request]

@@ -11,9 +11,20 @@ jobs:
      uses: actions/setup-python@v2
      with:
        python-version: 3.8
    - name: Upgrade pip
      run: |
        python -m pip install --upgrade pip setuptools wheel
    - name: Install dependencies
      run: |
        python -m pip install flake8
    - name: Lint with flake8
      run: |
        flake8 . --count --show-source --statistics
    - name: Install dependencies
      run: |
        pip install ".[dev]" --upgrade
    - name: Documentation test
      run: |
        cd docs
        make html SPHINXOPTS="-W"
        cd ..
60 changes: 29 additions & 31 deletions README.md
@@ -38,7 +38,7 @@ Here is Tianshou's other features:
- Support any type of environment state (e.g. a dict, a self-defined class, ...) [Usage](https://tianshou.readthedocs.io/en/latest/tutorials/cheatsheet.html#user-defined-environment-and-different-state-representation)
- Support customized training process [Usage](https://tianshou.readthedocs.io/en/latest/tutorials/cheatsheet.html#customize-training-process)
- Support n-step return estimation for all Q-learning based algorithms (see the formula below this list)
- Support multi-agent RL easily [Usage](https://tianshou.readthedocs.io/en/latest/tutorials/cheatsheet.html##multi-agent-reinforcement-learning)
- Support multi-agent RL [Usage](https://tianshou.readthedocs.io/en/latest/tutorials/cheatsheet.html##multi-agent-reinforcement-learning)
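For reference, the n-step return mentioned above is the standard bootstrapped estimate (written out here for clarity; this formula is supplied by the editor, not copied from the repo):

```latex
% n-step return: n discounted rewards plus a bootstrapped Q-value tail
G_t^{(n)} = \sum_{i=0}^{n-1} \gamma^i r_{t+i} + \gamma^n \max_{a} Q\left(s_{t+n}, a\right)
```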

In Chinese, Tianshou means "divinely ordained", referring to a gift one is born with. Tianshou is a reinforcement learning platform: its algorithms do not learn from humans. Taking the name "Tianshou" means that there is no teacher to study with; instead, the agent learns by itself through constant interaction with the environment.

@@ -49,24 +49,27 @@ In Chinese, Tianshou means divinely ordained and is derived to the gift of being
Tianshou is currently hosted on [PyPI](https://pypi.org/project/tianshou/). It requires Python >= 3.6. You can simply install Tianshou with the following command:

```bash
pip3 install tianshou
$ pip install tianshou
```

You can also install the newest version from GitHub:

```bash
pip3 install git+https://github.com/thu-ml/tianshou.git@master
# latest release
$ pip install git+https://github.com/thu-ml/tianshou.git@master
# develop version
$ pip install git+https://github.com/thu-ml/tianshou.git@dev
```

If you use Anaconda or Miniconda, you can install Tianshou through the following command lines:

```bash
# create a new virtualenv and install pip, change the env name if you like
conda create -n myenv pip
$ conda create -n myenv pip
# activate the environment
conda activate myenv
$ conda activate myenv
# install tianshou
pip install tianshou
$ pip install tianshou
```

After installation, open your Python console and type
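a quick check, mirroring the version-print snippet in the issue template above:

```python
import tianshou
print(tianshou.__version__)  # a successful import and version print means the install works
```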
If no error occurs, you have successfully installed Tianshou.

@@ -82,9 +85,9 @@

The tutorials and API documentation are hosted on [tianshou.readthedocs.io](https://tianshou.readthedocs.io/).

The example scripts are under [test/](https://github.com/thu-ml/tianshou/blob/master/test) folder and [examples/](https://github.com/thu-ml/tianshou/blob/master/examples) folder. It may fail to run with PyPI installation, so please re-install the github version through `pip3 install git+https://github.com/thu-ml/tianshou.git@master`.
The example scripts are in the [test/](https://github.com/thu-ml/tianshou/blob/master/test) and [examples/](https://github.com/thu-ml/tianshou/blob/master/examples) folders.

The Chinese documentation is available at [https://tianshou.readthedocs.io/zh/latest/](https://tianshou.readthedocs.io/zh/latest/).

<!-- A short Chinese introduction to the Tianshou platform: https://www.zhihu.com/question/377263715 -->

@@ -95,7 +98,7 @@ The example scripts are under [test/](https://github.com/thu-ml/tianshou/blob/ma
Tianshou is a lightweight but high-speed reinforcement learning platform. For example, here is a test on a laptop (i7-8750H + GTX1060): training a vanilla policy gradient agent on the CartPole-v0 task takes only about 3 seconds (the seed may differ across platforms and devices).

```bash
python3 test/discrete/test_pg.py --seed 0 --render 0.03
$ python3 test/discrete/test_pg.py --seed 0 --render 0.03
```

<div align="center">
@@ -108,10 +111,10 @@ We select some of famous reinforcement learning platforms: 2 GitHub repos with m

| RL Platform | Tianshou | Baselines | Stable-Baselines | Ray/RLlib | PyTorch-DRL | rlpyt |
| --------------- | -------- | --------- | ---------------- | --------- | ----------- | ----- |
| GitHub Stars | [![GitHub stars](https://img.shields.io/github/stars/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/stargazers) | [![GitHub stars](https://img.shields.io/github/stars/openai/baselines)](https://github.com/openai/baselines/stargazers) | [![GitHub stars](https://img.shields.io/github/stars/hill-a/stable-baselines)](https://github.com/hill-a/stable-baselines/stargazers) | [![GitHub stars](https://img.shields.io/github/stars/ray-project/ray)](https://github.com/ray-project/ray/stargazers) | [![GitHub stars](https://img.shields.io/github/stars/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch)](https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch/stargazers) | [![GitHub stars](https://img.shields.io/github/stars/astooke/rlpyt)](https://github.com/astooke/rlpyt/stargazers) |
| Algo - Task | PyTorch | TensorFlow | TensorFlow | TF/PyTorch | PyTorch | PyTorch |
| PG - CartPole | 6.09±4.60s | None | None | 19.26±2.29s | None | ? |
| DQN - CartPole | 6.09±0.87s | 1046.34±291.27s | 93.47±58.05s | 28.56±4.60s | 31.58±11.30s \*\* | ? |
| A2C - CartPole | 10.59±2.04s | \*(~1612s) | 57.56±12.87s | 57.92±9.94s | \*(Not converged) | ? |
| PPO - CartPole | 31.82±7.76s | \*(~1179s) | 34.79±17.02s | 44.60±17.04s | 23.99±9.26s \*\* | ? |
| PG - CartPole | 9.02±6.79s | None | None | 19.26±2.29s | None | ? |
| DQN - CartPole | 6.72±1.28s | 1046.34±291.27s | 93.47±58.05s | 28.56±4.60s | 31.58±11.30s \*\* | ? |
| A2C - CartPole | 15.33±4.48s | \*(~1612s) | 57.56±12.87s | 57.92±9.94s | \*(Not converged) | ? |
| PPO - CartPole | 6.01±1.14s | \*(~1179s) | 34.79±17.02s | 44.60±17.04s | 23.99±9.26s \*\* | ? |
| PPO - Pendulum | 16.18±2.49s | 745.43±160.82s | 259.73±27.37s | 123.62±44.23s | Runtime Error | ? |
| DDPG - Pendulum | 37.26±9.55s | \*(>1h) | 277.52±92.67s | 314.70±7.92s | 59.05±10.03s \*\* | 172.18±62.48s |
| TD3 - Pendulum | 44.04±6.37s | None | 99.75±21.63s | 149.90±7.54s | 57.52±17.71s \*\* | 210.31±76.30s |
@@ -142,7 +145,7 @@ We decouple all of the algorithms into 4 parts:
- `process_fn`: to preprocess data from replay buffer (since we have reformulated all algorithms to replay-buffer based algorithms);
- `learn`: to learn from a given batch data.

Within these API, we can interact with different policies conveniently.
Within this API, we can interact with different policies conveniently.
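For instance, a minimal custom policy only needs to fill in these parts. This is a hedged sketch: `RandomPolicy` is illustrative, not part of Tianshou, and the exact signatures may differ slightly across versions.

```python
import numpy as np
from tianshou.data import Batch
from tianshou.policy import BasePolicy

class RandomPolicy(BasePolicy):
    """A toy policy: uniform random actions, no learning."""

    def __init__(self, action_num, **kwargs):
        super().__init__(**kwargs)
        self.action_num = action_num

    def forward(self, batch, state=None, **kwargs):
        # compute actions from the observations in `batch`
        act = np.random.randint(0, self.action_num, len(batch.obs))
        return Batch(act=act)

    def process_fn(self, batch, buffer, indice):
        # preprocess data sampled from the replay buffer (e.g. compute returns);
        # here a no-op
        return batch

    def learn(self, batch, **kwargs):
        # update the policy from a batch and return a dict of training statistics
        return {}
```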

### Elegant and Flexible

@@ -182,17 +185,12 @@ Define some hyper-parameters:

```python
task = 'CartPole-v0'
lr = 1e-3
gamma = 0.9
n_step = 3
eps_train, eps_test = 0.1, 0.05
epoch = 10
step_per_epoch = 1000
collect_per_step = 10
target_freq = 320
batch_size = 64
lr, epoch, batch_size = 1e-3, 10, 64
train_num, test_num = 8, 100
gamma, n_step, target_freq = 0.9, 3, 320
buffer_size = 20000
eps_train, eps_test = 0.1, 0.05
step_per_epoch, collect_per_step = 1000, 10
writer = SummaryWriter('log/dqn') # tensorboard is also supported!
```
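The collapsed lines here create the vectorized training and test environments used later. A minimal sketch, assuming the v0.2-era `ts.env.VectorEnv` wrapper and the hyper-parameters defined above:

```python
# hypothetical reconstruction of the elided setup: parallel train/test envs
train_envs = ts.env.VectorEnv([lambda: gym.make(task) for _ in range(train_num)])
test_envs = ts.env.VectorEnv([lambda: gym.make(task) for _ in range(test_num)])
```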

@@ -208,7 +206,8 @@ Define the network:

```python
from tianshou.utils.net.common import Net

# you can define other net by following the API:
# https://tianshou.readthedocs.io/en/latest/tutorials/dqn.html#build-the-network
env = gym.make(task)
state_shape = env.observation_space.shape or env.observation_space.n
action_shape = env.action_space.shape or env.action_space.n
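# the collapsed lines build the network; a sketch assuming v0.2.5's
# Net(layer_num, state_shape, action_shape, ..., hidden_layer_size=128) signature
net = Net(layer_num=2, state_shape=state_shape, action_shape=action_shape)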
```

@@ -219,8 +218,7 @@ optim = torch.optim.Adam(net.parameters(), lr=lr)
Setup policy and collectors:

```python
policy = ts.policy.DQNPolicy(net, optim, gamma, n_step, target_update_freq=target_freq)
train_collector = ts.data.Collector(policy, train_envs, ts.data.ReplayBuffer(buffer_size))
test_collector = ts.data.Collector(policy, test_envs)
```
@@ -236,7 +234,7 @@

```python
result = ts.trainer.offpolicy_trainer(
    ...)  # argument list elided in this hunk; see the sketch after this block
print(f'Finished training! Use {result["duration"]}')
```
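For reference, a sketch of the collapsed argument list, assuming the v0.2 `offpolicy_trainer` signature; the `train_fn`/`test_fn`/`stop_fn` lambdas are illustrative:

```python
result = ts.trainer.offpolicy_trainer(
    policy, train_collector, test_collector, epoch, step_per_epoch,
    collect_per_step, test_num, batch_size,
    train_fn=lambda e: policy.set_eps(eps_train),  # epsilon-greedy exploration in training
    test_fn=lambda e: policy.set_eps(eps_test),    # smaller epsilon for evaluation
    stop_fn=lambda r: r >= env.spec.reward_threshold,
    writer=writer)
```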

Save / load the trained policy (it's exactly the same as saving / loading a PyTorch `nn.Module`):

```python
torch.save(policy.state_dict(), 'dqn.pth')
policy.load_state_dict(torch.load('dqn.pth'))
```

@@ -254,26 +252,26 @@ collector.close()
Look at the results saved in TensorBoard (with a bash command in your terminal):

```bash
tensorboard --logdir log/dqn
$ tensorboard --logdir log/dqn
```

You can check out the [documentation](https://tianshou.readthedocs.io) for advanced usage.

## Contributing

Tianshou is still under development. More algorithms and features are going to be added and we always welcome contributions to help make Tianshou better. If you would like to contribute, please check out [docs/contributing.rst](https://github.com/thu-ml/tianshou/blob/master/docs/contributing.rst).
Tianshou is still under development. More algorithms and features are going to be added and we always welcome contributions to help make Tianshou better. If you would like to contribute, please check out [this link](https://tianshou.readthedocs.io/en/latest/contributing.html).

## TODO

Check out the [Issue/PR Categories](https://github.com/thu-ml/tianshou/projects/2) and [Support Status](https://github.com/thu-ml/tianshou/projects/3) page for more detail.
Check out the [Project](https://github.com/thu-ml/tianshou/projects) page for more detail.

## Citing Tianshou

If you find Tianshou useful, please cite it in your publications.

```latex
@misc{tianshou,
author = {Jiayi Weng, Minghao Zhang, Dong Yan, Hang Su, Jun Zhu},
author = {Jiayi Weng, Minghao Zhang, Alexis Duburcq, Kaichao You, Dong Yan, Hang Su, Jun Zhu},
title = {Tianshou},
year = {2020},
publisher = {GitHub},
6 changes: 4 additions & 2 deletions docs/conf.py
@@ -41,7 +41,7 @@
    'sphinx.ext.doctest',
    'sphinx.ext.intersphinx',
    'sphinx.ext.coverage',
    'sphinx.ext.imgmath',
    # 'sphinx.ext.imgmath',
    'sphinx.ext.mathjax',
    'sphinx.ext.ifconfig',
    'sphinx.ext.viewcode',
@@ -58,7 +58,9 @@
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
autodoc_default_options = {'special-members': '__call__, __getitem__, __len__'}
autodoc_default_options = {'special-members': ', '.join([
    '__len__', '__call__', '__getitem__', '__setitem__',
    '__getattr__', '__setattr__'])}

# -- Options for HTML output -------------------------------------------------

19 changes: 11 additions & 8 deletions docs/contributing.rst
@@ -8,13 +8,14 @@ To install Tianshou in an "editable" mode, run

.. code-block:: bash

    pip3 install -e ".[dev]"
    $ git checkout dev
    $ pip install -e ".[dev]"
in the main directory. This installation is removable by

.. code-block:: bash

    python3 setup.py develop --uninstall
    $ python setup.py develop --uninstall
PEP8 Code Style Check
---------------------
@@ -23,7 +24,7 @@ We follow PEP8 python code style. To check, in the main directory, run:

.. code-block:: bash

    flake8 . --count --show-source --statistics
    $ flake8 . --count --show-source --statistics
Test Locally
------------
Expand All @@ -32,7 +33,7 @@ This command will run automatic tests in the main directory

.. code-block:: bash

    pytest test --cov tianshou -s --durations 0 -v
    $ pytest test --cov tianshou -s --durations 0 -v
Test by GitHub Actions
----------------------
@@ -65,11 +66,13 @@ To compile documentation into webpages, run

.. code-block:: bash

    make html
    $ make html
under the ``docs/`` directory. The generated webpages are in ``docs/_build`` and can be viewed with browsers.

Chinese Documentation
---------------------
The Chinese documentation is at https://tianshou.readthedocs.io/zh/latest/, and the development version of the documentation is at https://tianshou.readthedocs.io/en/dev/.

Pull Request
------------

Chinese documentation is in https://tianshou.readthedocs.io/zh/latest/
All commits should be merged into the ``dev`` branch through pull requests. A pull request must have 2 approvals before merging.
2 changes: 1 addition & 1 deletion docs/contributor.rst
@@ -6,4 +6,4 @@ We always welcome contributions to help make Tianshou better. Below are an incom
* Jiayi Weng (`Trinkle23897 <https://github.com/Trinkle23897>`_)
* Minghao Zhang (`Mehooz <https://github.com/Mehooz>`_)
* Alexis Duburcq (`duburcqa <https://github.com/duburcqa>`_)
* Kaichao You (`youkaichao <https://github.com/youkaichao>`_)