Optimize performance by changing to tilemaps #1

Baekalfen · 2023-10-11T05:59:30Z

No description provided.

Baekalfen · 2023-10-11T06:00:48Z

~~These optimizations depend on this PR: Baekalfen/PyBoy#250~~

Edit:
This PR requires PyBoy 1.6.0

PWhiddy · 2023-10-11T06:49:31Z

Awesome!
This breaks the current pretrained model though right? I think this should be behind a flag or maybe as a second environment file, so the functionality demoed in the video stays intact.

Baekalfen · 2023-10-11T06:59:15Z

Sorry, yes. It does break the current model. Also I have only tested that it is programmatically sound, not that the RL is still working. It might require some additional tweaking.

I'd probably put this as work-in-progress until you're ready (if you want to) to retrain with this method. It's probably best to wait with the retraining until we've found out whether there are other improvements to make.

PWhiddy · 2023-10-12T03:31:31Z

Cool. If you want to put this change into a copy of the red_gym_env file (maybe something like red_gym_tilemap_env), I could merge this so it lives in the repo. Otherwise I'll just leave it open for now.

Baekalfen · 2023-10-13T08:32:35Z

Let's leave it open for now. I might want to replace the knn system with a simpler coordinate check. I think that could provide an additional performance increase.

EDIT:

But yes, we can rename it to keep both versions if you'd like.
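For illustration only, a coordinate check of the kind mentioned above could look roughly like the sketch below. The `CoordinateTracker` class and its `(map_id, x, y)` key are hypothetical names chosen for this example; they are not taken from the PR or from PyBoy.

```python
# Hypothetical sketch: remember visited positions in a set instead of
# querying a KNN index over frames. All names here are assumptions.
class CoordinateTracker:
    def __init__(self):
        self.seen = set()

    def visit(self, map_id, x, y):
        """Return True if this (map, x, y) position has not been seen before."""
        key = (map_id, x, y)
        if key in self.seen:
            return False
        self.seen.add(key)
        return True


tracker = CoordinateTracker()
first = tracker.visit(0, 5, 7)      # new position
repeat = tracker.visit(0, 5, 7)     # same position again
other_map = tracker.visit(1, 5, 7)  # same x, y on a different map
```

A set lookup is constant time on average, which is where the expected speedup over a nearest-neighbour search would come from.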

AlexGreason · 2023-10-18T03:20:19Z

This will definitely need changes to the feature encoder. Currently, it's putting a value in the range of 0-383 into a uint8, so it's overflowing and some tile indices overlap. Probably best to use a custom CNN that first passes the input through an embedding to map from the integer indices to arbitrary learnable vectors.
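A small NumPy sketch of the overflow described above, plus a plain lookup table standing in for the embedding. The embedding width of 8 and the random table are assumptions for illustration; in a real model the table would be a learnable layer (for example `torch.nn.Embedding`), which this sketch does not implement.

```python
import numpy as np

tile_ids = np.arange(384)               # tile indices 0..383
as_uint8 = tile_ids.astype(np.uint8)    # wraps modulo 256
as_uint16 = tile_ids.astype(np.uint16)  # wide enough for every index

# Number of tile indices that collide with another index after the cast.
collisions = len(tile_ids) - len(np.unique(as_uint8))

# Stand-in for an embedding: one vector per tile index (width 8 is assumed).
rng = np.random.default_rng(0)
table = rng.normal(size=(384, 8))
tilemap = rng.integers(0, 384, size=(18, 20), dtype=np.uint16)
embedded = table[tilemap]               # shape (18, 20, 8)
```

With `uint8`, index 256 becomes 0 and index 383 becomes 127, so 128 indices share a value with another tile.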

Iron-Bound · 2023-10-19T16:32:10Z

Let me just say thanks, this is great work!

In training (Intel 13th/7900/32gb) can report:
System memory usage dropped 20-30%,
GPU is up from 2% to 7% utilization
and finally I'm seeing stat results in the first iteration.

Baekalfen · 2023-10-20T07:39:35Z

Just adding that the PokemonGen1 game wrapper has been deployed in PyBoy 1.6.0.

jsuarez5341 · 2023-10-21T04:32:37Z

Oh cool, I'm also cleaning up the env file and making improvements in parallel https://github.com/PufferAI/PufferLib/blob/0.5-cleanup/pufferlib/environments/pokemon_red.py

Going to port it to CleanRL, but I may be able to help rerun some experiments for free in the meanwhile

veken0m · 2023-11-03T13:48:49Z

@Baekalfen, @AlexGreason I've been experimenting with this tile implementation to train a different game and getting much better results using 'MlpPolicy' over 'CnnPolicy'. My understanding is that 'CnnPolicy' is better for raw image processing (pixel level) but 'MlpPolicy' is better when using tiles. This may also allow reducing the output_shape below 36x36 which seems to be a limitation of CnnPolicy but I haven't tested this yet.

Baekalfen · 2023-11-06T08:24:21Z

> @Baekalfen, @AlexGreason I've been experimenting with this tile implementation to train a different game and getting much better results using 'MlpPolicy' over 'CnnPolicy'. My understanding is that 'CnnPolicy' is better for raw image processing (pixel level) but 'MlpPolicy' is better when using tiles. This may also allow reducing the output_shape below 36x36 which seems to be a limitation of CnnPolicy but I haven't tested this yet.

Very interesting. How's performance? I'd assume reducing the input size to 18, 20 (the native tilemap size) would help too.

veken0m · 2023-11-07T04:46:16Z

> > @Baekalfen, @AlexGreason I've been experimenting with this tile implementation to train a different game and getting much better results using 'MlpPolicy' over 'CnnPolicy'. My understanding is that 'CnnPolicy' is better for raw image processing (pixel level) but 'MlpPolicy' is better when using tiles. This may also allow reducing the output_shape below 36x36 which seems to be a limitation of CnnPolicy but I haven't tested this yet.
>
> Very interesting. How's performance? I'd assume reducing the input size to 18, 20 (the native tilemap size) would help too.

Very good! Got roughly 20% FPS increase during training with MlpPolicy, output_shape = (18, 20, 3) and vec_dim = 1080 (to fix update_frame_knn_index runtime error). I'm running on 12-core 3900X with 32GB RAM.
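The `vec_dim` value above is the product of the `output_shape` dimensions. A quick check, assuming (this is not stated in the thread) that the KNN index stores each observation as a flattened vector:

```python
import numpy as np

output_shape = (18, 20, 3)                    # values quoted above
frame = np.zeros(output_shape, dtype=np.uint16)

vec_dim = int(np.prod(output_shape))          # 18 * 20 * 3
flat = frame.reshape(-1)                      # flattened observation
```

If the flattened length and `vec_dim` disagree, a fixed-dimension index would be expected to reject the vector, which may be the runtime error referred to.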

Baekalfen added 3 commits October 9, 2023 21:42

Avoid zero-division in read_hp_fraction

ab18841

Added missing fast_video

b673a48

Change from screen-dumps to tilemaps to improve performance

b4b8379

PWhiddy added the feature label Oct 19, 2023

techmore mentioned this pull request Oct 20, 2023

check total cpu cores for the cpu_num var run_baseline_parallel.py #52

Closed
