thelittlelamb/Intro2RL

Introduction to Reinforcement Learning

1 Multi-Armed Bandit
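
As a rough illustration of the bandit setting, here is a minimal sketch of an epsilon-greedy agent on Bernoulli arms. The repository does not say which exploration strategy it covers, so epsilon-greedy is an assumption, and the names `epsilon_greedy_bandit`, `arm_probs`, `epsilon`, and `seed` are hypothetical.

```python
import random

def epsilon_greedy_bandit(arm_probs, steps=1000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy strategy (illustrative only).

    arm_probs[i] is the (hidden) probability that arm i pays a reward of 1.
    Returns the estimated value of each arm and how often each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the sample mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print(estimates, counts)
```

The incremental mean avoids storing every past reward; only a count and a running estimate per arm are kept.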

2 Markov Decision Process

Markov process

Bellman equations: Bellman expectation equation & Bellman optimality equation
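
For reference, the two equations can be written for the state-value function as follows. The notation is an assumption, since the repository page does not fix one: $\pi(a \mid s)$ is the policy, $r(s,a)$ the expected immediate reward, $P(s' \mid s,a)$ the transition probability, and $\gamma \in [0,1)$ the discount factor.

```latex
% Bellman expectation equation (value of a fixed policy \pi)
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \Big[ r(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V^{\pi}(s') \Big]

% Bellman optimality equation (value of the optimal policy)
V^{*}(s) = \max_{a} \Big[ r(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V^{*}(s') \Big]
```

The only difference is that the expectation over actions under $\pi$ is replaced by a maximum over actions.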

3 Dynamic Programming

Goal: find the optimal policy of a Markov decision process

Policy Iteration & Value Iteration
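
A minimal sketch of value iteration on a tabular MDP, under assumed data structures: `P[s][a]` is a list of `(probability, next_state)` pairs and `R[s][a]` is the immediate reward. These names, the tolerance `tol`, and the toy two-state MDP are hypothetical and not taken from the repository.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Repeatedly apply the Bellman optimality backup until values stop changing."""
    n_states = len(P)
    V = [0.0] * n_states

    def q_value(s, a):
        # Expected return of taking action a in state s, then following V.
        return R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])

    while True:
        delta = 0.0
        for s in range(n_states):
            best = max(q_value(s, a) for a in range(len(P[s])))
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update
        if delta < tol:
            break

    # Extract a greedy policy with respect to the converged values.
    policy = [max(range(len(P[s])), key=lambda a: q_value(s, a))
              for s in range(n_states)]
    return V, policy

# Toy MDP: in each state, action 0 stays and action 1 moves to the other state.
P = [[[(1.0, 0)], [(1.0, 1)]],
     [[(1.0, 1)], [(1.0, 0)]]]
R = [[0.0, 1.0],
     [2.0, 0.0]]
V, policy = value_iteration(P, R, gamma=0.5)
print(V, policy)
```

Policy iteration differs in that it alternates a full policy evaluation step with a greedy improvement step, whereas value iteration folds both into the single max-backup above.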

4 Temporal Difference Model

Sarsa Algorithm & Q-Learning
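
The two algorithms differ only in their temporal-difference target, which the sketch below tries to make explicit. It assumes a tabular `Q` stored as a list of lists indexed by `Q[state][action]`; the function names and the parameters `alpha` (step size) and `gamma` (discount factor) are illustrative choices, not the repository's code.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    td_target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (td_target - Q[s][a])

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the greedy action in the next state."""
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])

# Same transition, different targets.
Q_sarsa = [[0.0, 0.0], [1.0, 2.0]]
Q_qlearn = [[0.0, 0.0], [1.0, 2.0]]
sarsa_update(Q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0, alpha=0.5, gamma=0.5)
q_learning_update(Q_qlearn, s=0, a=0, r=1.0, s_next=1, alpha=0.5, gamma=0.5)
print(Q_sarsa[0][0], Q_qlearn[0][0])
```

In a full training loop, both updates would typically be combined with an exploratory behaviour policy such as epsilon-greedy; that loop is omitted here.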
