Multi-task reinforcement learning agent for simulated and real-world quadrotor control using Isaac Gym and ArduPilot.
## License

Unless otherwise stated in local licenses or file headers, all code in this repository is:

Copyright 2024 Max Planck Institute for Intelligent Systems

Licensed under the terms of the GNU General Public License v3.0 or later.
📜 https://www.gnu.org/licenses/gpl-3.0.en.html
## Installation

- Create a workspace:

  ```bash
  mkdir MultitaskRL && cd MultitaskRL
  ```

- Download Isaac Gym Preview 4 from:
  👉 https://developer.nvidia.com/isaac-gym

- Extract it into your workspace:

  ```bash
  tar -xvf IsaacGym_Preview_4_Package.tar.gz
  ```

- Run the setup script to install the Isaac Gym conda environment:

  ```bash
  bash IsaacGym_Preview_4_Package/isaacgym/create_conda_env_rlgpu.sh
  ```

- Activate the environment:

  ```bash
  conda activate rlgpu
  ```

  💡 Tip: You may need to edit the script to change the Python version to `3.8` for compatibility with PyTorch 2.0.
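As a quick sanity check that the activated environment runs the interpreter the setup expects, a minimal sketch (the 3.8 target simply mirrors the tip above):

```python
import sys

def rlgpu_python_ok(version_info=sys.version_info, required=(3, 8)):
    """Return True if the interpreter's major.minor version matches the
    version expected by the rlgpu environment (3.8 per the tip above)."""
    return (version_info[0], version_info[1]) == required
```

Running `rlgpu_python_ok()` inside `rlgpu` should return `True` once the environment was created with the right Python.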
- Clone this repository:

  ```bash
  git clone https://github.com/robot-perception-group/GraphMTSAC_UAV.git
  ```

- Update the Isaac Gym environment with this project's dependencies:

  ```bash
  conda env update --file GraphMTSAC_UAV/environment.yml
  ```

- If needed, install any remaining pip dependencies manually:

  ```bash
  pip install -e GraphMTSAC_UAV
  pip install -r GraphMTSAC_UAV/requirements.txt
  ```

  ✅ This installs all required Python packages, including `wandb`, `gym`, and any others listed in `requirements.txt`.
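To verify the install picked up everything, a small stdlib-only sketch that reports which packages are still missing (the default list names only the packages mentioned above; extend it with entries from `requirements.txt` as needed):

```python
import importlib.util

def missing_packages(names=("wandb", "gym")):
    """Return the packages from `names` that cannot be imported in the
    current environment; an empty list means the install looks complete."""
    return [n for n in names if importlib.util.find_spec(n) is None]
```

For example, `missing_packages()` returning `[]` inside `rlgpu` indicates both packages resolved.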
## Training

- Enter the project directory:

  ```bash
  cd GraphMTSAC_UAV/
  ```

- Start training the agent (available agents: `SAC`, `MTSAC`, `RMAMTSAC`):

  ```bash
  python run.py agent=MTSAC wandb_log=False env=Quadcopter env.num_envs=25 env.sim.headless=False agent.save_model=False
  ```
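When launching many runs programmatically, the override-style command above can be assembled as an argument list (e.g. for `subprocess.run`). This sketch mirrors only the keys shown in the example command; `run.py`'s full option set may differ:

```python
def build_train_cmd(agent="MTSAC", env="Quadcopter", num_envs=25,
                    headless=False, wandb_log=False, save_model=False):
    """Assemble the run.py invocation from the example above as a list of
    key=value overrides, ready to pass to subprocess.run()."""
    overrides = {
        "agent": agent,
        "wandb_log": wandb_log,
        "env": env,
        "env.num_envs": num_envs,
        "env.sim.headless": headless,
        "agent.save_model": save_model,
    }
    return ["python", "run.py"] + [f"{k}={v}" for k, v in overrides.items()]
```

Swapping `agent="RMAMTSAC"` or `num_envs=1024` then only changes the corresponding override string.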
- Sweep example using Weights & Biases:

  ```bash
  wandb sweep sweep/mtsac_hyper.yml
  ```

  Experiment results and logs are saved under the `sweep/` directory and visualized via wandb.
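A W&B sweep file like `mtsac_hyper.yml` follows the standard sweep schema (`program`, `method`, `metric`, `parameters`). The sketch below builds an equivalent config as a dict; the parameter names and values are illustrative guesses, not the repository's actual sweep settings:

```python
def make_sweep_config(program="run.py"):
    """Build a minimal Weights & Biases sweep configuration using the
    standard sweep schema. Parameter names (agent.lr, agent.gamma) and
    the metric name are assumptions for illustration only."""
    return {
        "program": program,
        "method": "random",
        "metric": {"name": "episode_return", "goal": "maximize"},
        "parameters": {
            "agent.lr": {"values": [3e-4, 1e-3]},
            "agent.gamma": {"values": [0.98, 0.99]},
        },
    }
```

Such a dict can also be passed directly to `wandb.sweep()` instead of a YAML file.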
## Real-World Deployment

- Install the ArduPilot firmware (see https://ardupilot.org/dev/docs/building-setup-linux.html).

- Configure the autopilot to use your custom controller:
  👉 https://ardupilot.org/dev/docs/copter-adding-custom-controller.html

- Generate the model parameter header from your trained neural network:

  ```bash
  python3 script/cpp_generator_rmagraphnet.py
  ```

  This generates the file `NN_Parameters.h`.

- Move the header into the custom ArduPilot folder:

  ```bash
  mv NN_Parameters.h AC_CustomControl/NN_Parameters.h
  ```

- Replace ArduPilot's original `AC_CustomControl` directory with this modified one.

- Compile ArduPilot with the custom controller:

  ```bash
  ./waf configure --board Pixhawk6X copter --enable-custom-controller
  ```
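The header-generation step above boils down to serializing trained weights into C source the firmware can compile. As a rough sketch of that idea, this helper turns a 2-D weight matrix into a C array declaration; the actual layout emitted by `script/cpp_generator_rmagraphnet.py` into `NN_Parameters.h` may differ:

```python
def weights_to_c_header(name, matrix):
    """Serialize a 2-D weight matrix into a C float-array declaration,
    sketching the kind of generated code a header like NN_Parameters.h
    could contain (illustrative only, not the project's real format)."""
    rows = ",\n    ".join(
        "{" + ", ".join(f"{v:.6f}f" for v in row) + "}" for row in matrix
    )
    return (f"static const float {name}[{len(matrix)}][{len(matrix[0])}]"
            f" = {{\n    {rows}\n}};\n")
```

Generating parameters as a compiled-in header avoids any filesystem or parsing code on the flight controller.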
## Project Structure

```
GraphMTSAC_UAV/
├── agents/            # SAC, MTSAC, RMAMTSAC agents
├── assets/            # Quadcopter URDF definition
├── cfg/               # Training configuration
├── env/               # Quadcopter simulator settings
├── common/            # Shared utilities, wrappers, layers
├── sweep/             # W&B hyperparameter sweep configs
├── script/            # Tools for real-world deployment
├── AC_CustomControl/  # Custom firmware integration
├── run.py             # Main training entry point
├── play.py            # Main testing entry point
├── requirements.txt
└── environment.yml
```
## Citation

If you use this work, please cite:

```bibtex
@article{liu2025multitask,
  title={Multitask Reinforcement Learning for Quadcopter Attitude Stabilization and Tracking using Graph Policy},
  author={Liu, Yu Tang and Vale, Afonso and Ahmad, Aamir and Ventura, Rodrigo and Basiri, Meysam},
  journal={arXiv preprint arXiv:2503.08259},
  year={2025}
}
```