Copyright © 2022 Intelligent Driving Laboratory (iDLab). All rights reserved.
Optimal control is an important theoretical framework for sequential decision-making and control of industrial objects, especially for complex, high-dimensional problems with strong nonlinearity, high randomness, and multiple constraints. Computing the optimal control input is the key to applying this framework to practical industrial problems. Model Predictive Control, for example, computes its control input by receding-horizon optimization, and the online computation time this requires greatly restricts the application and adoption of the method. To address this problem, iDLab has developed a series of full-state-space optimal policy solution algorithms and an accompanying application toolchain for industrial control, based on Reinforcement Learning and Approximate Dynamic Programming theory. The basic principle is to use an approximation function (such as a neural network) as the policy carrier, and to improve the online real-time performance of optimal control by solving for the policy offline and applying it online. The GOPS toolchain will cover the main stages of the industrial control workflow, including control problem modeling, policy network training, offline simulation verification, and controller code deployment.
GOPS currently supports the following algorithms:
- Deep Q Network (DQN)
- Deep Deterministic Policy Gradient (DDPG)
- Twin Delayed DDPG (TD3)
- Asynchronous Advantage Actor-Critic (A3C)
- Soft Actor-Critic (SAC)
- Distributional Soft Actor-Critic (DSAC)
- Trust Region Policy Optimization (TRPO)
- Proximal Policy Optimization (PPO)
- Infinite-Horizon Approximate Dynamic Programming (INFADP)
- Finite-Horizon Approximate Dynamic Programming (FHADP)
- Mixed Actor-Critic (MAC)
- Mixed Policy Gradient (MPG)
- Separated Proportional-Integral Lagrangian (SPIL)
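The offline-solve, online-apply principle described in the overview can be illustrated on a toy problem. The sketch below is not GOPS code and uses none of its APIs. It assumes a hypothetical scalar linear plant, a quadratic cost, and a one-parameter linear policy standing in for the neural network, and it uses plain NumPy with a finite-difference gradient in place of any of the algorithms listed above. All names and constants are illustrative.

```python
import numpy as np

# Hypothetical scalar plant x_{t+1} = A*x_t + B*u_t with stage cost Q*x^2 + R*u^2.
A, B, Q, R = 0.9, 0.5, 1.0, 0.1
HORIZON = 20
INITIAL_STATES = np.linspace(-1.0, 1.0, 11)  # training states covering the region of interest


def policy(gain, x):
    """Parameterized policy; here the simplest approximator, u = -gain * x."""
    return -gain * x


def rollout_cost(gain, x0=INITIAL_STATES):
    """Mean finite-horizon cost of the policy, simulated from every initial state."""
    x = np.asarray(x0, dtype=float).copy()
    total = np.zeros_like(x)
    for _ in range(HORIZON):
        u = policy(gain, x)
        total += Q * x**2 + R * u**2
        x = A * x + B * u
    return float(total.mean())


def train_offline(steps=500, lr=0.1, eps=1e-5):
    """Offline phase: gradient descent on the rollout cost (central-difference gradient)."""
    gain = 0.0
    for _ in range(steps):
        grad = (rollout_cost(gain + eps) - rollout_cost(gain - eps)) / (2 * eps)
        gain -= lr * grad
    return gain


gain = train_offline()

# Online phase: each control step is a single function evaluation, with no optimization.
x = 0.8
u = policy(gain, x)
print(f"learned gain = {gain:.3f}, control at x = {x}: {u:.3f}")
```

All the iterative work happens in `train_offline`. At run time the controller only evaluates `policy`, which is what removes the per-step optimization that receding-horizon methods need.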
GOPS requires:
- Windows 7 or greater or Linux.
- Python 3.6 or greater (GOPS V1.0 precompiled Simulink models use Python 3.8). We recommend using Python 3.8.
- (Optional) Matlab/Simulink 2018a or greater.
- The installation path must contain only English characters.
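A quick way to check the requirements listed above before installing is a small script such as the following sketch. It is not part of GOPS. The function name is hypothetical, the path rule is interpreted as "ASCII characters only" (an assumption), and the script checks neither the Windows version nor the optional Matlab/Simulink installation.

```python
import os
import platform
import sys

MIN_PYTHON = (3, 6)                       # minimum version from the requirements list
SUPPORTED_SYSTEMS = ("Windows", "Linux")  # platform.system() values for the listed OSes


def check_requirements(install_path):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python 3.6 or greater is required")
    if platform.system() not in SUPPORTED_SYSTEMS:
        problems.append("operating system is not Windows or Linux")
    # Assumption: the path requirement is read as "ASCII characters only".
    if any(ord(ch) > 127 for ch in install_path):
        problems.append("installation path contains non-ASCII characters")
    return problems


# Check the directory the script is run from, e.g. the intended install location.
for problem in check_requirements(os.getcwd()):
    print("WARNING:", problem)
```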
You can install GOPS through the following steps:
# clone GOPS repository
git clone https://github.com/Intelligent-Driving-Laboratory/GOPS.git
cd GOPS
# create conda environment
conda env create -f gops_environment.yml
conda activate gops
# install GOPS
pip install -e .
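After these steps, one way to confirm that the editable install is visible to the active environment is to ask Python whether it can locate the package. The sketch below assumes the install exposes a top-level package named `gops`, which the steps above do not state explicitly. The helper name is hypothetical.

```python
import importlib.util


def is_importable(package_name):
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


# Assumption: the editable install exposes a top-level package named "gops".
if is_importable("gops"):
    print("gops is importable in this environment")
else:
    print("gops not found; check that the conda environment is activated")
```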
The tutorials and API documentation are hosted on gops.readthedocs.io.
This is an example of running Finite-Horizon Approximate Dynamic Programming (FHADP) on the inverted double pendulum environment. Train the policy by running:
python example_train/fhadp/fhadp_mlp_idpendulum_serial.py
After training, test the policy by running:
python example_run/run_idp_fhadp.py
You can record a video by setting save_render=True in the test file. Here is a video of running a trained policy on the task: