This repository contains a reimplementation of the REINFORCE algorithm (vanilla policy gradient).
git clone https://github.com/FaisalAhmed0/REINFORCE
cd REINFORCE
conda create -n vpg_env python
conda activate vpg_env
pip3 install -r requirements.txt
python train.py --env "CartPole-v0"
tensorboard --logdir ./runs
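
For readers new to the method, the core of REINFORCE can be sketched in a few lines of plain Python. This is a minimal illustration only, not the code in train.py: the function names, the gamma default, and the use of plain lists instead of tensors are all assumptions made for the example.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = sum_k gamma^k * r_{t+k} for each step t of one episode.

    `gamma` is an assumed default, not a value taken from this repository.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """Surrogate loss -sum_t log pi(a_t | s_t) * G_t.

    Minimizing this with an autodiff framework follows the REINFORCE
    gradient estimate; here plain floats are used for illustration.
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]          # hypothetical 3-step episode
    log_probs = [-1.0, -2.0, -0.5]     # hypothetical log pi(a_t | s_t)
    returns = discounted_returns(rewards, gamma=0.5)
    print(returns)
    print(reinforce_loss(log_probs, returns))
```

In a real training loop the log-probabilities would come from the policy network and the loss would be backpropagated once per episode or batch of episodes.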