This repository implements different Deep Reinforcement Learning Agents for the pysc2 learning environment as described in the DeepMind StarCraft II paper.
We provide implementations for:
- Advantage Actor Critic (A2C) based on A3C https://arxiv.org/abs/1602.01783
- Fully Connected Policy
- Convolutional LSTM Policy https://arxiv.org/abs/1506.04214
- Proximal Policy Optimization (PPO) https://arxiv.org/abs/1707.06347
- FeUdal Networks (FuN) https://arxiv.org/abs/1703.01161
This repository is part of a student research project which was conducted at the Autonomous Systems Labs, TU Darmstadt by Daniel Palenicek, Marcel Hussing, and Simon Meister.
The repository was originally located at simonmeister/pysc2-rl-agents but has moved to this new location.
The following gives a brief explanation of what we have implemented in this repository. For more detailed information, check out the reports.
We have adapted and implemented the FeUdal Networks algorithm for hierarchical reinforcement learning on StarCraft II. To be compatible with StarCraft II, we account for the spatial state and action space, as opposed to the original publication, which targets Atari.
We implemented these baseline agents to learn the PySC2 mini games. While PPO can currently only train a FullyConvolutional policy, A2C can additionally train a ConvolutionalLSTM policy.
We document our implementation and results in more depth in the following reports:
- Daniel Palenicek, Marcel Hussing, Simon Meister (Apr. 2018): Deep Reinforcement Learning for StarCraft II
- Daniel Palenicek, Marcel Hussing (Sep. 2018): Adapting Feudal Networks for StarCraft II
- Python 3
- pysc2 (tested with v1.2)
- TensorFlow (tested with 1.4.0)
- StarCraft II and mini games (see below or pysc2)
pip install numpy tensorflow-gpu pysc2==1.2
- Install StarCraft II. On Linux, use 3.16.1 and unzip the package into your home directory.
- Download the mini games and extract them to your `~/StarCraftII/Maps/` directory.
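For example, on Linux the two archives can be extracted as shown below. This is only a sketch: the archive file names are placeholders for whatever you downloaded, and the Linux packages are protected by the password published alongside the Blizzard EULA.

```
# Sketch only: archive names are placeholders, adjust to your downloads.
unzip -P iagreetotheeula SC2.3.16.1.zip -d ~/
unzip mini_games.zip -d ~/StarCraftII/Maps/
```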
Quickstart: `python run.py <experiment-id>` will run training with default settings for Fully Connected A2C. To evaluate after training, run `python run.py <experiment-id> --eval`.

The implementation enables highly configurable experiments via command line args. To see the full documentation, run `python run.py --help`.
The most important flags to add to the `python run.py <experiment-id>` command include:

- `--agent`: choose between A2C, PPO and FeUdal
- `--policy`: choose the topology of the policy network (not all agents are compatible with every network)
- `--map`: choose the mini game to train on
- `--vis`: visualize the agent
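For example, a training run combining several of these flags could look like the line below. The flag values are illustrative assumptions, not an exhaustive reference; `python run.py --help` lists the accepted options.

```
# Illustrative invocation; see `python run.py --help` for the accepted flag values.
python run.py my_beacon_experiment --agent a2c --policy fully_conv --map MoveToBeacon --vis
```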
Summaries are written to `out/summary/<experiment_name>` and model checkpoints are written to `out/models/<experiment_name>`.
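Training progress can be monitored with TensorBoard by pointing it at the summary directory (assuming the default `out/` output location):

```
tensorboard --logdir out/summary
```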
For fast training, a GPU is recommended. We ran our experiments on Titan X Pascal and GTX 1080 Ti GPUs.
On the mini games, we report the following results as the best mean score:
| Map | FC | ConvLSTM | PPO | FuN | DeepMind |
| --- | --- | --- | --- | --- | --- |
| MoveToBeacon | 26 | 26 | 26 | 26 | 26 |
| CollectMineralShards | 97 | 93 | - | - | 103 |
| FindAndDefeatZerglings | 45 | - | - | - | 45 |
| DefeatRoaches | - | - | - | - | 100 |
| DefeatZerglingsAndBanelings | 68 | - | - | - | 62 |
| CollectMineralsAndGas | - | - | - | - | 3978 |
| BuildMarines | - | - | - | - | 3 |
In the following we show plots for the score over episodes.
Note that the DeepMind mean scores are their best individual scores after 100 runs for each game, where the initial learning rate was randomly sampled for each run. We use a constant initial learning rate for a much smaller number of runs due to limited hardware.
This project is licensed under the MIT License (refer to the LICENSE file for details).
The code in `rl/environment.py` is based on OpenAI baselines, with adaptations from sc2aibot. Some of the code in `rl/agents/a2c/runner.py` is loosely based on sc2aibot. The Convolutional LSTM Cell implementation is taken from carlthome. The FeUdal Networks implementation is inspired by dmakian.