This repository contains the TensorFlow 2 code for training an agent to play Sonic The Hedgehog 2 from human expert data.
Please visit the Medium post for detailed instructions about this project.
Originally, you could set up the Sonic 2 environment by buying the game on Steam and following the installation tutorial of Chang-Chia-Chi.
However, Steam has stopped selling the Sonic 2 game on its own. Instead, it is now sold as part of the Sega Mega Drive & Genesis Classics bundle on Steam. I assume the Sonic 2 ROM file is also included in that bundle, but I am not sure.
Either way, once you have the ROM file, import it with the Python command below.
python3 -m retro.import /path/to/your/ROMs/directory/
Sorry for not uploading the ROM file to this repo.
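After importing, you can quickly check that the environment loads. This is a minimal sketch assuming the ROM was imported under the standard gym-retro name:

```python
import retro

# Quick sanity check: create the Sonic 2 environment and step it once.
env = retro.make(game='SonicTheHedgehog2-Genesis', state='EmeraldHillZone.Act1')
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())
print(obs.shape, reward, done)
env.close()
```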
- gym 0.14.0
- tensorflow-gpu 2.4.1
- tensorflow-probability 0.11.0
- pygame 1.9.6
- gym-retro 0.8.0
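For example, these versions can be installed in one step (assuming pip and a CUDA setup compatible with tensorflow-gpu 2.4.1):

$ pip install gym==0.14.0 tensorflow-gpu==2.4.1 tensorflow-probability==0.11.0 pygame==1.9.6 gym-retro==0.8.0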
- Gym Retro: https://github.com/openai/retro
- Retro-movies: https://github.com/openai/retro-movies
- Sonic-the-Hedgehog-A3C-LSTM-tensorflow2: https://github.com/Chang-Chia-Chi/Sonic-the-Hedgehog-A3C-LSTM-tensorflow2
You can download the dataset from my Google Drive. It consists of 1,800 replays in total, 100 for each Act.
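Each replay can be played back with gym-retro's Movie API. The sketch below follows the standard gym-retro playback recipe; the .bk2 file name is only an example:

```python
import retro

movie = retro.Movie('SonicTheHedgehog2-Genesis-EmeraldHillZone.Act2-0000.bk2')
movie.step()  # advance past the recording's initial frame

env = retro.make(game=movie.get_game(), state=None,
                 # .bk2 files can contain any button presses, so allow everything
                 use_restricted_actions=retro.Actions.ALL,
                 players=movie.players)
env.initial_state = movie.get_state()
env.reset()

while movie.step():
    keys = [movie.get_key(i, p)
            for p in range(movie.players)
            for i in range(env.num_buttons)]
    obs, reward, done, info = env.step(keys)
```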
Please train the agent with Supervised Learning first and evaluate the model's performance. The agent should perform the spin dash action properly. If it looks fine, train the model further with Reinforcement Learning.
You can use the command below to train your agent with Supervised Learning. It saves the model weights to the model folder under the workspace path.
$ python run_supervised_learning.py --workspace_path [folder path] --gpu_use [True, False] --replay_path [folder path] --level_name [level name] --use_action_history [True, False]
$ python3.7 run_supervised_learning.py --workspace_path /home/kimbring2/Sonic-the-Hedgehog-A3C-LSTM-tensorflow2 --replay_path /media/kimbring2/be356a87-def6-4be8-bad2-077951f0f3da/retro-movies/human/SonicTheHedgehog2-Genesis/contest --level_name SonicTheHedgehog2-Genesis-EmeraldHillZone.Act2 --use_action_history True
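Under the hood, this Supervised Learning is behavior cloning: the network is fit to the recorded human actions with a cross-entropy loss. A minimal sketch of one training step (illustrative names, not the repo's actual API) looks like this:

```python
import tensorflow as tf

# One behavior cloning step: fit the policy to the recorded human actions.
# `model`, `observations`, and `expert_actions` are illustrative placeholders.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

@tf.function
def supervised_step(model, observations, expert_actions):
    with tf.GradientTape() as tape:
        logits = model(observations, training=True)  # [batch, num_actions]
        loss = loss_fn(expert_actions, logits)       # imitate the human actions
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```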
You can check the training progress by watching the TensorBoard log in the tensorboard folder under the workspace path.
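For example, assuming the default workspace layout:

$ tensorboard --logdir [workspace path]/tensorboard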
After finishing the Supervised Learning, test the performance of the trained model.
$ python run_evaluation.py --workspace_path [folder path] --use_action_history [True, False] --model_name [file name] --gpu_use [True, False] --level_name [level name]
$ python3.7 run_evaluation.py --workspace_path /home/kimbring2/Sonic-the-Hedgehog-A3C-LSTM-tensorflow2 --use_action_history True --model_name supervised_model_1900 --gpu_use True --level_name SonicTheHedgehog2-Genesis-EmeraldHillZone.Act2
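Conceptually, evaluation just runs the trained policy greedily in the environment. The sketch below shows the idea (names, paths, and the action encoding are illustrative; run_evaluation.py handles the repo's actual model format):

```python
import numpy as np
import retro
import tensorflow as tf

# DISCRETE actions let the policy emit a single integer per step.
env = retro.make(game='SonicTheHedgehog2-Genesis', state='EmeraldHillZone.Act2',
                 use_restricted_actions=retro.Actions.DISCRETE)
model = tf.keras.models.load_model('model/supervised_model_1900')  # hypothetical path

obs, done = env.reset(), False
while not done:
    logits = model(obs[None].astype(np.float32), training=False)
    action = int(tf.argmax(logits, axis=-1)[0])  # greedy action
    obs, reward, done, info = env.step(action)
env.close()
```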
Because an episode of this game is long, the standard A2C method cannot be used, since it needs a whole episode at once. Therefore, an off-policy A2C variant such as IMPALA is needed. Like DQN, it can restore trajectory data from a buffer for training.
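For reference, the core of IMPALA is the V-trace off-policy correction (Espeholt et al., 2018), which re-weights stale trajectories from the buffer toward the current policy. The following is a minimal single-trajectory sketch, not the repo's actual implementation:

```python
import tensorflow as tf

# Minimal single-trajectory V-trace sketch; all inputs are [T]-shaped tensors
# except bootstrap_value (scalar). Names are illustrative.
def vtrace_targets(behaviour_log_probs, target_log_probs, values,
                   bootstrap_value, rewards, discounts,
                   clip_rho=1.0, clip_c=1.0):
    rhos = tf.exp(target_log_probs - behaviour_log_probs)  # importance weights
    clipped_rhos = tf.minimum(clip_rho, rhos)
    clipped_cs = tf.minimum(clip_c, rhos)

    next_values = tf.concat([values[1:], bootstrap_value[None]], axis=0)
    deltas = clipped_rhos * (rewards + discounts * next_values - values)

    # Backward recursion: vs_t - V(x_t) = delta_t + gamma_t * c_t * (vs_{t+1} - V(x_{t+1}))
    acc = tf.zeros_like(bootstrap_value)
    diffs = []
    for t in reversed(range(int(values.shape[0]))):
        acc = deltas[t] + discounts[t] * clipped_cs[t] * acc
        diffs.append(acc)
    return tf.stack(diffs[::-1]) + values  # vs targets for the value loss
```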
You can run IMPALA initialized from the Supervised Learning model on the Sonic environment with the command below.
$ ./run_reinforcement_learning.sh [number of envs] [gpu use] [pretrained model]
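For example, to train with 4 environments on GPU, starting from the Supervised Learning checkpoint (argument values are illustrative):

$ ./run_reinforcement_learning.sh 4 True supervised_model_1900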
You can ignore the error below from the learner.py part. It does not affect the training process.
```
Traceback (most recent call last):
  File "C:/minerl/learner.py", line 392, in <module>
    coord.join(thread_data)
  File "C:\Users\sund0\anaconda3\envs\minerl_env\lib\site-packages\tensorflow\python\training\coordinator.py", line 357, in join
    threads = self._registered_threads.union(set(threads))
```
where lines 391 and 392 are:

```python
for thread_data in thread_data_list:
    coord.join(thread_data)
```
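If you prefer to silence it, a likely fix (my assumption from the traceback, untested) is to join the whole list once, because tf.train.Coordinator.join expects an iterable of threads rather than a single thread:

```python
# Join all worker threads at once; passing a single Thread object makes
# set(threads) fail inside Coordinator.join().
coord.join(thread_data_list)
```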