This repository contains the TensorFlow 2 code for training an agent to play Sonic The Hedgehog 2 from human expert data.
Please visit the Medium post for detailed instructions about this project.
Originally, you could set up the Sonic 2 environment by buying the game on Steam and following the installation tutorial of Chang-Chia-Chi.
However, Steam has stopped selling the Sonic 2 game on its own. Instead, it is now sold as part of the Sega Mega Drive & Genesis Classics bundle on Steam. I assume the Sonic 2 ROM file is also included in that bundle, but I am not sure.
Either way, once you have the ROM file, import it with the Python command below.
python3 -m retro.import /path/to/your/ROMs/directory/
Sorry for not uploading the ROM file to this repo.
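After importing, you can quickly check that the environment loads. This is a minimal sketch assuming the ROM was imported under the standard gym-retro name:

```python
import retro

# Quick sanity check: create the Sonic 2 environment and step it once.
env = retro.make(game='SonicTheHedgehog2-Genesis', state='EmeraldHillZone.Act1')
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())
print(obs.shape, reward, done)
env.close()
```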
- gym 0.14.0
- tensorflow-gpu 2.4.1
- tensorflow-probability 0.11.0
- pygame 1.9.6
- gym-retro 0.8.0
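For example, these versions can be installed in one step (assuming pip and a CUDA setup compatible with tensorflow-gpu 2.4.1):

$ pip install gym==0.14.0 tensorflow-gpu==2.4.1 tensorflow-probability==0.11.0 pygame==1.9.6 gym-retro==0.8.0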
- Gym Retro: https://github.com/openai/retro
- Retro-movies: https://github.com/openai/retro-movies
- Sonic-the-Hedgehog-A3C-LSTM-tensorflow2: https://github.com/Chang-Chia-Chi/Sonic-the-Hedgehog-A3C-LSTM-tensorflow2
You can download the dataset from my Google Drive. It consists of 1,800 replays in total, 100 for each Act.
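Each replay can be played back with gym-retro's Movie API. The sketch below follows the standard gym-retro playback recipe; the .bk2 file name is only an example:

```python
import retro

movie = retro.Movie('SonicTheHedgehog2-Genesis-EmeraldHillZone.Act2-0000.bk2')
movie.step()  # advance past the recording's initial frame

env = retro.make(game=movie.get_game(), state=None,
                 # .bk2 files can contain any button presses, so allow everything
                 use_restricted_actions=retro.Actions.ALL,
                 players=movie.players)
env.initial_state = movie.get_state()
env.reset()

while movie.step():
    keys = [movie.get_key(i, p)
            for p in range(movie.players)
            for i in range(env.num_buttons)]
    obs, reward, done, info = env.step(keys)
```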
Please train the agent with Supervised Learning first and evaluate the model's performance. The agent should perform the spin dash action properly. If it looks fine, train the model further with Reinforcement Learning.
You can use the command below to train your agent with Supervised Learning. It saves the model weights to the model folder under the workspace path.
$ python run_supervised_learning.py --workspace_path [folder path] --gpu_use [True, False] --replay_path [folder path] --level_name [level name] --use_action_history [True, False]
$ python3.7 run_supervised_learning.py --workspace_path /home/kimbring2/Sonic-the-Hedgehog-A3C-LSTM-tensorflow2 --replay_path /media/kimbring2/be356a87-def6-4be8-bad2-077951f0f3da/retro-movies/human/SonicTheHedgehog2-Genesis/contest --level_name SonicTheHedgehog2-Genesis-EmeraldHillZone.Act2 --use_action_history True
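Under the hood, this Supervised Learning is behavior cloning: the network is fit to the recorded human actions with a cross-entropy loss. A minimal sketch of one training step (illustrative names, not the repo's actual API) looks like this:

```python
import tensorflow as tf

# One behavior cloning step: fit the policy to the recorded human actions.
# `model`, `observations`, and `expert_actions` are illustrative placeholders.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

@tf.function
def supervised_step(model, observations, expert_actions):
    with tf.GradientTape() as tape:
        logits = model(observations, training=True)  # [batch, num_actions]
        loss = loss_fn(expert_actions, logits)       # imitate the human actions
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```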
You can check the training progress by watching the TensorBoard log in the tensorboard folder under the workspace path.
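For example, assuming the default workspace layout:

$ tensorboard --logdir [workspace path]/tensorboard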
After finishing the Supervised Learning, test the performance of the trained model.
$ python run_evaluation.py --workspace_path [folder path] --use_action_history [True, False] --model_name [file name] --gpu_use [True, False] --level_name [level name]
$ python3.7 run_evaluation.py --workspace_path /home/kimbring2/Sonic-the-Hedgehog-A3C-LSTM-tensorflow2 --use_action_history True --model_name supervised_model_1900 --gpu_use True --level_name SonicTheHedgehog2-Genesis-EmeraldHillZone.Act2
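Conceptually, evaluation just runs the trained policy greedily in the environment. The sketch below shows the idea (names, paths, and the action encoding are illustrative; run_evaluation.py handles the repo's actual model format):

```python
import numpy as np
import retro
import tensorflow as tf

# DISCRETE actions let the policy emit a single integer per step.
env = retro.make(game='SonicTheHedgehog2-Genesis', state='EmeraldHillZone.Act2',
                 use_restricted_actions=retro.Actions.DISCRETE)
model = tf.keras.models.load_model('model/supervised_model_1900')  # hypothetical path

obs, done = env.reset(), False
while not done:
    logits = model(obs[None].astype(np.float32), training=False)
    action = int(tf.argmax(logits, axis=-1)[0])  # greedy action
    obs, reward, done, info = env.step(action)
env.close()
```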
Because an episode of this game is long, the standard A2C method cannot be used, since it needs a whole episode at once. Therefore, an off-policy A2C variant such as IMPALA is needed. Like DQN, it can restore trajectory data from a buffer for training.
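For reference, the core of IMPALA is the V-trace off-policy correction (Espeholt et al., 2018), which re-weights stale trajectories from the buffer toward the current policy. The following is a minimal single-trajectory sketch, not the repo's actual implementation:

```python
import tensorflow as tf

# Minimal single-trajectory V-trace sketch; all inputs are [T]-shaped tensors
# except bootstrap_value (scalar). Names are illustrative.
def vtrace_targets(behaviour_log_probs, target_log_probs, values,
                   bootstrap_value, rewards, discounts,
                   clip_rho=1.0, clip_c=1.0):
    rhos = tf.exp(target_log_probs - behaviour_log_probs)  # importance weights
    clipped_rhos = tf.minimum(clip_rho, rhos)
    clipped_cs = tf.minimum(clip_c, rhos)

    next_values = tf.concat([values[1:], bootstrap_value[None]], axis=0)
    deltas = clipped_rhos * (rewards + discounts * next_values - values)

    # Backward recursion: vs_t - V(x_t) = delta_t + gamma_t * c_t * (vs_{t+1} - V(x_{t+1}))
    acc = tf.zeros_like(bootstrap_value)
    diffs = []
    for t in reversed(range(int(values.shape[0]))):
        acc = deltas[t] + discounts[t] * clipped_cs[t] * acc
        diffs.append(acc)
    return tf.stack(diffs[::-1]) + values  # vs targets for the value loss
```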
You can run IMPALA initialized from the Supervised Learning model on the Sonic environment with the command below.
$ ./run_reinforcement_learning.sh [number of envs] [gpu use] [pretrained model]
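For example, to train with 4 environments on GPU, starting from the Supervised Learning checkpoint (argument values are illustrative):

$ ./run_reinforcement_learning.sh 4 True supervised_model_1900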
You can ignore the error below from the learner.py part. It does not affect the training process.
```
Traceback (most recent call last):
  File "C:/minerl/learner.py", line 392, in <module>
    coord.join(thread_data)
  File "C:\Users\sund0\anaconda3\envs\minerl_env\lib\site-packages\tensorflow\python\training\coordinator.py", line 357, in join
    threads = self._registered_threads.union(set(threads))
```
where lines 391 and 392 are:

```python
for thread_data in thread_data_list:
    coord.join(thread_data)
```
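If you prefer to silence it, a likely fix (my assumption from the traceback, untested) is to join the whole list once, because tf.train.Coordinator.join expects an iterable of threads rather than a single thread:

```python
# Join all worker threads at once; passing a single Thread object makes
# set(threads) fail inside Coordinator.join().
coord.join(thread_data_list)
```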