| Field | Value |
|---|---|
| title | On the Linear Convergence of Policy Gradient Methods for Finite MDPs |
| abstract | We revisit the finite-time analysis of policy gradient methods in one of the simplest settings: finite state and action MDPs with a policy class consisting of all stochastic policies and with exact gradient evaluations. Some recent work has viewed this setting as an instance of smooth non-linear optimization and shown sub-linear convergence rates with small step-sizes. Here, we take a completely different perspective based on illuminating connections with policy iteration, and show how many variants of policy gradient algorithms succeed with large step-sizes and attain a linear rate of convergence. |
| layout | inproceedings |
| series | Proceedings of Machine Learning Research |
| publisher | PMLR |
| issn | 2640-3498 |
| id | bhandari21a |
| month | 0 |
| tex_title | On the Linear Convergence of Policy Gradient Methods for Finite MDPs |
| firstpage | 2386 |
| lastpage | 2394 |
| page | 2386-2394 |
| order | 2386 |
| cycles | false |
| bibtex_author | Bhandari, Jalaj and Russo, Daniel |
| author | |
| date | 2021-03-18 |
| address | |
| container-title | Proceedings of The 24th International Conference on Artificial Intelligence and Statistics |
| volume | 130 |
| genre | inproceedings |
| issued | |
| extras | |
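
To make the abstract's claim concrete, here is a minimal illustrative sketch, not taken from the paper: it runs a per-state projected ascent step on exact Q-values for a made-up two-state, two-action MDP. With a large step-size the projection onto the simplex lands on the greedy action, so the update coincides with a policy iteration step; the full policy gradient for the direct parameterization additionally weights each state by its discounted visitation frequency, which this simplified sketch omits. All names and numbers below are hypothetical.

```python
import numpy as np

def policy_value(P, r, pi, gamma):
    """Exact V^pi for a finite MDP via the Bellman linear system."""
    S = P.shape[0]
    P_pi = np.einsum("sa,sat->st", pi, P)   # state-to-state kernel under pi
    r_pi = np.einsum("sa,sa->s", pi, r)     # expected one-step reward under pi
    return np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)

def q_values(P, r, V, gamma):
    """Exact Q^pi obtained from V^pi."""
    return r + gamma * np.einsum("sat,t->sa", P, V)

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

# Toy 2-state, 2-action MDP (numbers are purely illustrative).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.05, 0.95]]])   # P[s, a, s']
r = np.array([[1.0, 0.0],
              [0.5, 2.0]])                   # r[s, a]
gamma, eta = 0.9, 50.0                        # large step-size

pi = np.full((2, 2), 0.5)                     # start from the uniform policy
for _ in range(20):
    V = policy_value(P, r, pi, gamma)
    Q = q_values(P, r, V, gamma)
    # Per-state projected ascent on Q^pi: with large eta each row of pi
    # jumps to the greedy action, mimicking a policy iteration update.
    pi = np.array([project_simplex(pi[s] + eta * Q[s]) for s in range(2)])

print("greedy actions:", pi.argmax(axis=1),
      "V:", np.round(policy_value(P, r, pi, gamma), 3))
```

Running the sketch, the policy becomes deterministic after the first few updates and the printed values stop changing, which is the rapid, policy-iteration-like behavior the abstract attributes to large step-size policy gradient variants.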