
Mixed-Policy Asynchronous Deep Q-Learning

A mixed-policy version of the Asynchronous 1-step Q-learning algorithm, based on WoLF-PHC, GIGA-WoLF, WPL, EMA-QL, and PGA-APP, evaluated on several game-theoretic test scenarios.

Multi-agent Double DQN

The Multi-agent Double DQN algorithm is in the asyncdqn folder. You will need Python 3.3+, matplotlib, python-tk, and TensorFlow 0.13+. To run several threads locally, adjust the configuration in asyncdqn/DQN-LocalThreads.py and run:

export PYTHONPATH=$(pwd)
python3 asyncdqn/DQN-LocalThreads.py

To run 15 distributed processes, adjust the configuration in asyncdqn/DQN-Distributed.py and run:

./start-dqn-mixed.sh 15

To test the table-based mixed-policy algorithms, adjust the configuration in mixedQ/run_mixed_algorithms.py and run:

export PYTHONPATH=$(pwd)
python3 mixedQ/run_mixed_algorithms.py

For specific tests of WPL in a multi-state environment, adjust the configuration in mixedQ/wpl_nrps.py and run:

export PYTHONPATH=$(pwd)
python3 mixedQ/wpl_nrps.py
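
For reference, here is a minimal single-state sketch of the WPL (Weighted Policy Learner) policy-update rule; the step size and the projection back onto the probability simplex are illustrative, not the repository's exact implementation.

# A single-state sketch of the WPL update; step size and projection are illustrative.
import numpy as np

def wpl_update(pi, q, eta=0.01):
    # Per-action advantage over the current expected payoff.
    grad = q - np.dot(pi, q)
    # WPL weights the gradient by pi(a) when decreasing an action's probability
    # and by (1 - pi(a)) when increasing it, slowing updates near the simplex
    # boundary and helping convergence.
    weights = np.where(grad < 0, pi, 1.0 - pi)
    pi = pi + eta * weights * grad
    # Project back onto the probability simplex (simple clip-and-renormalise).
    pi = np.clip(pi, 1e-6, None)
    return pi / pi.sum()

# Example: with two actions and action 0 looking better, its probability rises.
print(wpl_update(np.array([0.5, 0.5]), np.array([1.0, 0.0])))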

Results

The algorithm works out of the box in all scenarios. Below is a pseudo-code description.

[Figure: pseudo-code of the mixed-policy asynchronous 1-step Q-learning algorithm]
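
As a rough stand-in for that pseudo-code, the sketch below runs asynchronous 1-step Q-learning in self-play, with several worker threads updating shared Q-values and shared mixed policies. The Matching Pennies payoffs, the exploration scheme, and the placeholder policy-improvement step are illustrative only; the mixed-policy algorithms (WoLF-PHC, GIGA-WoLF, WPL, EMA-QL, PGA-APP) each define their own policy update.

# A rough, tabular stand-in for the pseudo-code above: several worker threads
# run asynchronous 1-step Q-learning in self-play on shared Q-values and
# shared mixed policies. Everything below is illustrative.
import threading

import numpy as np

NUM_ACTIONS = 2
NUM_WORKERS = 4
STEPS_PER_WORKER = 10_000
ALPHA, EPSILON, POLICY_STEP = 0.1, 0.1, 0.001

# Matching Pennies payoff for the row player; the column player receives -payoff.
PAYOFF = np.array([[+1.0, -1.0],
                   [-1.0, +1.0]])

# Globally shared Q-values and mixed policies for both players.
Q = [np.zeros(NUM_ACTIONS) for _ in range(2)]
PI = [np.full(NUM_ACTIONS, 1.0 / NUM_ACTIONS) for _ in range(2)]
lock = threading.Lock()

def sample(policy):
    # Sample an action from a mixed policy, with epsilon exploration.
    if np.random.rand() < EPSILON:
        return np.random.randint(NUM_ACTIONS)
    return np.random.choice(NUM_ACTIONS, p=policy)

def worker():
    for _ in range(STEPS_PER_WORKER):
        with lock:
            a0, a1 = sample(PI[0]), sample(PI[1])
            r0 = PAYOFF[a0, a1]
            for player, action, reward in ((0, a0, r0), (1, a1, -r0)):
                # Asynchronous 1-step Q-learning update on the shared table
                # (a one-shot game has no successor state to bootstrap from).
                Q[player][action] += ALPHA * (reward - Q[player][action])
                # Placeholder policy improvement: nudge the mixed policy
                # towards the greedy action; the actual algorithms replace
                # this step with their own rule.
                greedy = np.eye(NUM_ACTIONS)[np.argmax(Q[player])]
                PI[player] += POLICY_STEP * (greedy - PI[player])
                PI[player] /= PI[player].sum()

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("final mixed policies:", PI)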

We test on 5 famous Game Theory challenges.

[Figure: the five game-theoretic test scenarios]

We used a neural network with two hidden layers of 150 nodes each and ELU activations. We share network weights to speed up learning, as shown below.

[Figure: network architecture with shared weights]
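
A minimal sketch of such a network, assuming the weight sharing refers to a shared hidden trunk feeding separate Q-value and policy heads; the tf.keras API, layer names, and head layout are illustrative, since the repository's original code targets an older TensorFlow API.

# A tf.keras sketch of the architecture described above: two hidden layers of
# 150 ELU units shared between a Q-value head and a mixed-policy head. The
# head layout is an assumption based on the description.
import tensorflow as tf

def build_network(state_dim, num_actions):
    state = tf.keras.Input(shape=(state_dim,), name="state")
    # Shared trunk: two hidden layers of 150 nodes with ELU activation.
    hidden = tf.keras.layers.Dense(150, activation="elu")(state)
    hidden = tf.keras.layers.Dense(150, activation="elu")(hidden)
    # Both heads reuse the trunk weights, which speeds up learning.
    q_values = tf.keras.layers.Dense(num_actions, name="q_values")(hidden)
    policy = tf.keras.layers.Dense(num_actions, activation="softmax", name="policy")(hidden)
    return tf.keras.Model(inputs=state, outputs=[q_values, policy])

model = build_network(state_dim=4, num_actions=2)  # illustrative sizes
model.summary()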

Below we can see the evolution of the policies of two agents in self-play using the table-based WoLF-PHC, GIGA-WoLF, WPL, and EMA-QL algorithms, over 1000 epochs of 10000 trials each. The games shown are the Tricky Game (solid) and the Biased Game (dotted), both shown in Figure 2. Each plot shows the probability with which each player plays its first action.

[Figure: policy evolution of the table-based algorithms in self-play]
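
For reference, here is a minimal single-state sketch of the WoLF-PHC update behind experiments like these; the hyper-parameters and the one-shot setting are illustrative, and the other algorithms replace the policy-improvement step with their own rules.

# A single-state sketch of WoLF-PHC; hyper-parameters are illustrative.
import numpy as np

class WoLFPHC:
    def __init__(self, n_actions, alpha=0.1, delta_win=0.01, delta_lose=0.04):
        self.alpha = alpha
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.q = np.zeros(n_actions)                    # action values
        self.pi = np.full(n_actions, 1.0 / n_actions)   # current mixed policy
        self.pi_avg = self.pi.copy()                    # average policy
        self.count = 0

    def act(self):
        return np.random.choice(len(self.pi), p=self.pi)

    def update(self, action, reward):
        # 1-step Q-learning update (one-shot game, so no bootstrapped target).
        self.q[action] += self.alpha * (reward - self.q[action])
        # Track the average policy played so far.
        self.count += 1
        self.pi_avg += (self.pi - self.pi_avg) / self.count
        # "Win or Learn Fast": small step while the current policy outperforms
        # the average policy against the learned Q-values, large step otherwise.
        winning = np.dot(self.pi, self.q) > np.dot(self.pi_avg, self.q)
        delta = self.delta_win if winning else self.delta_lose
        # Hill-climb: shift at most delta probability mass to the greedy action.
        best = int(np.argmax(self.q))
        for a in range(len(self.pi)):
            if a != best:
                step = min(self.pi[a], delta / (len(self.pi) - 1))
                self.pi[a] -= step
                self.pi[best] += step

In a multi-state setting the same update is applied per state, with the Q-learning step bootstrapping from the successor state's value.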

Below we can see the evolution of the policies of two agents in self-play using the deep learning implementations of the WoLF-PHC, GIGA-WoLF, WPL, and EMA-QL algorithms, over 400 epochs of 250 iterations each. The games shown are the Tricky Game (solid) and the Biased Game (dotted), both shown in Figure 2. Each plot shows the probability with which each player plays its first action.

[Figure: policy evolution of the deep learning implementations in self-play]
