Reinforcement Learning Grid Walker

A toy reinforcement learning model of a walker on a two-dimensional grid. The walker's task is to reach the treats (positive rewards, which are also fixed points) while avoiding the mines (negative rewards, i.e. penalties).

The walker learns with the canonical Q-learning algorithm.
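For readers unfamiliar with Q-learning, the tabular update rule can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `QTable` and `qLearningUpdate`, the flat state indexing, and the `terminal` flag are all assumptions made for the example.

```typescript
// Hypothetical sketch of one tabular Q-learning step (names are not from the project).
// qTable[state][action] holds the current estimate of the action value.
type QTable = number[][];

function qLearningUpdate(
  qTable: QTable,
  state: number,
  action: number,
  reward: number,
  nextState: number,
  alpha: number, // learning rate
  gamma: number, // discount factor
  terminal: boolean = false // assumed flag: no future value after a terminal state
): void {
  // Value of the best action available in the next state.
  const bestNext = terminal ? 0 : Math.max(...qTable[nextState]);
  // Temporal-difference target: immediate reward plus discounted future value.
  const target = reward + gamma * bestNext;
  // Move the current estimate a fraction alpha towards the target.
  qTable[state][action] += alpha * (target - qTable[state][action]);
}
```

Called once per step of the walker, this gradually propagates the values of treats and mines back to the cells that lead to them.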

The user can tune hyperparameters such as the learning rate, the discount factor applied to forecast rewards, and the exploration-exploitation trade-off, on grids of varying size and with different reward and penalty densities.
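One common way to control the exploration-exploitation trade-off is an epsilon-greedy rule. Whether this project uses exactly that rule is an assumption. The sketch below uses hypothetical names (`Hyperparameters`, `chooseAction`) and takes the random source as a parameter so that it can be tested deterministically.

```typescript
// Hypothetical container for the tunable settings (field names are assumptions).
interface Hyperparameters {
  learningRate: number; // alpha: how strongly new information overrides old estimates
  discount: number;     // gamma: weight given to forecast (future) rewards
  epsilon: number;      // probability of taking a random, exploratory action
}

// Epsilon-greedy selection over one row of a Q-table.
// `random` defaults to Math.random and must return a number in [0, 1).
function chooseAction(
  qRow: number[],
  epsilon: number,
  random: () => number = Math.random
): number {
  if (random() < epsilon) {
    // Explore: pick any action uniformly at random.
    return Math.floor(random() * qRow.length);
  }
  // Exploit: pick the action with the highest value (first one on ties).
  let best = 0;
  for (let a = 1; a < qRow.length; a++) {
    if (qRow[a] > qRow[best]) best = a;
  }
  return best;
}

// Illustrative values only, not the project's defaults.
const exampleSettings: Hyperparameters = { learningRate: 0.1, discount: 0.9, epsilon: 0.2 };
```

With `epsilon` at 0 the walker always follows its current best estimate; with `epsilon` at 1 it moves entirely at random.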

The panel of settings:

Settings panel

The grid and the learned policy can be visualized.

Settings panel
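A learned policy can be read off a Q-table by taking the highest-valued action in every cell. The sketch below shows one hypothetical way to turn that into a text grid of arrows. The action order (up, right, down, left), the row-major cell indexing, and the name `greedyPolicy` are assumptions for illustration and say nothing about how the project draws its grid with roughjs.

```typescript
// Assumed action order: 0 = up, 1 = right, 2 = down, 3 = left.
const ARROWS: string[] = ["↑", "→", "↓", "←"];

// Returns one string per grid row, with one arrow per cell showing the greedy action.
// `qTable` is assumed to be indexed by cell in row-major order.
function greedyPolicy(qTable: number[][], width: number): string[] {
  const rows: string[] = [];
  for (let start = 0; start < qTable.length; start += width) {
    const line = qTable
      .slice(start, start + width)
      // indexOf(max) picks the first best action when several are tied.
      .map((qRow) => ARROWS[qRow.indexOf(Math.max(...qRow))])
      .join(" ");
    rows.push(line);
  }
  return rows;
}
```

Printing the returned rows one per line gives a quick textual view of where the walker would go from each cell.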


This project was generated with Angular 7; the visualization uses roughjs.


Play with me here, or locally:

Install the dependencies with:

npm install

then start the development server with:

ng serve