Code written in Python 2 with NumPy
Policy evaluation and policy iteration in a grid world with rewards
Monte Carlo algorithm implementation in a grid world with rewards
Temporal difference algorithms
On-policy algorithm: Sarsa
Off-policy algorithm: Q-learning
Genetic algorithm
Naive Bayes algorithm
Travelling Salesman Problem in a reinforcement learning setting, with tour generation and ant colony optimization
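
As a rough illustration of how the two temporal difference entries above differ, here is a minimal sketch. It is not the code from this repository. The 4x4 grid, the reward values (-0.04 per step, +1 at the goal), the hyperparameters, and every name in it (`step`, `td_target`, `epsilon_greedy`, `train`) are assumptions made for this example. It uses only the standard library and should run under both Python 2.7 and Python 3. Sarsa bootstraps from the action the epsilon-greedy policy actually picks next, while Q-learning bootstraps from the greedy action in the next state.

```python
import random

# Hypothetical grid world: moves are up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def step(state, action, size=4):
    """Move inside a size x size grid; the bottom-right cell is the goal."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    next_state = (min(max(row + d_row, 0), size - 1),
                  min(max(col + d_col, 0), size - 1))
    done = next_state == (size - 1, size - 1)
    reward = 1.0 if done else -0.04  # assumed reward scheme
    return next_state, reward, done


def epsilon_greedy(Q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = Q[state]
    return values.index(max(values))


def td_target(method, Q, reward, next_state, next_action, gamma, done):
    """Return the bootstrapped target for a single transition."""
    if done:
        return reward
    if method == "sarsa":
        # On-policy: use the action that will actually be taken next.
        return reward + gamma * Q[next_state][next_action]
    # Off-policy (Q-learning): use the best action in the next state.
    return reward + gamma * max(Q[next_state])


def train(method, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1,
          size=4, seed=0):
    """Run tabular Sarsa or Q-learning and return the action-value table."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(size) for c in range(size)}
    for _ in range(episodes):
        state = (0, 0)
        action = epsilon_greedy(Q, state, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action, size)
            # Simplification: the behaviour action is drawn before the
            # update for both methods, so one loop serves both.
            next_action = epsilon_greedy(Q, next_state, epsilon, rng)
            target = td_target(method, Q, reward, next_state,
                               next_action, gamma, done)
            Q[state][action] += alpha * (target - Q[state][action])
            state, action = next_state, next_action
    return Q
```

Calling `train("sarsa")` or `train("q_learning")` returns a dictionary that maps each grid cell to a list of four action values. The only line that differs between the two methods is the bootstrapped term inside `td_target`.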