DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles

This repository contains the implementation of our paper "DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles" (arXiv), published in IEEE Transactions on Robotics (T-RO), 2023. It provides the code for training and testing the DRL-VO control policy in its 3D human-robot interaction Gazebo simulator. Note that this open-source version reads accurate pedestrian information directly from the Gazebo simulator instead of using the YOLO & MHT pipeline described in the paper, because of licensing restrictions on a commercial library used by the MHT tracker. Video demos can be found at multimedia demonstrations. The two GIFs below show the DRL-VO control policy navigating in simulation and in the real world.

  • Simulation: simulation_demo
  • Real world: hardware_demo


Our DRL-VO control policy is a novel learning-based control policy with strong generalizability to new environments. It enables a mobile robot to navigate autonomously through spaces filled with both static obstacles and dense crowds of pedestrians. The policy takes a unique combination of inputs (a short history of lidar data, kinematic data about nearby pedestrians, and a sub-goal point) and outputs the desired steering angle and forward velocity. It is trained in a reinforcement learning setting with a reward function that contains a novel term based on velocity obstacles, which guides the robot to actively avoid pedestrians while moving towards the goal. The policy was tested in a series of 3D simulated experiments with up to 55 pedestrians and in an extensive series of hardware experiments using a Turtlebot2 robot with a 2D Hokuyo lidar and a ZED stereo camera. In addition, it ranked 1st in the simulated competition and 3rd in the final physical competition of the ICRA 2022 BARN Challenge, which tests navigation in highly constrained static environments using a Jackal robot. The deployment code for the ICRA 2022 BARN Challenge can be found in "nav-competition-icra2022-drl-vo".

DRL-VO Architecture
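To make the velocity-obstacle idea concrete, here is a minimal geometric sketch. It is not the reward term from the paper. It only tests whether a candidate robot velocity falls inside the classic (infinite-horizon) collision cone of a single pedestrian. The function name, the disc-shaped agents, and the combined `radius` parameter are illustrative assumptions.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def in_velocity_obstacle(p_rel: Vec2, v_robot: Vec2, v_ped: Vec2, radius: float) -> bool:
    """Return True if v_robot lies inside the velocity obstacle of one pedestrian.

    p_rel:  pedestrian position relative to the robot (m)
    radius: combined robot + pedestrian radius (m), an assumed parameter
    """
    dist = math.hypot(p_rel[0], p_rel[1])
    if dist <= radius:
        return True  # the two discs already overlap
    # Velocity of the robot relative to the pedestrian.
    v_rel = (v_robot[0] - v_ped[0], v_robot[1] - v_ped[1])
    speed = math.hypot(v_rel[0], v_rel[1])
    if speed == 0.0:
        return False  # no relative motion, so no future collision
    # Half-angle of the collision cone spanned by the inflated pedestrian disc.
    half_angle = math.asin(radius / dist)
    # Angle between the relative velocity and the line of sight to the pedestrian.
    cos_theta = (v_rel[0] * p_rel[0] + v_rel[1] * p_rel[1]) / (speed * dist)
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return theta < half_angle
```

A reward term built on this idea could, for example, penalize commanded velocities for which the check returns True. How the paper weights and shapes that term is not reproduced here.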


  • Ubuntu 20.04
  • ROS-Noetic
  • protobuf 3.20.0
  • Python 3.8.5
  • PyTorch 1.7.1
  • Tensorboard 2.4.1
  • Gym 0.18.0
  • Stable-Baselines3 1.1.0
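Because the versions above are pinned, a quick check of the active Python environment can save debugging time. The following is a small sketch that uses only the standard library. The helper names are hypothetical, and the keys are assumed to be the PyPI distribution names of the listed packages.

```python
from importlib import metadata
from typing import Callable, Dict

# Pinned versions copied from the list above; keys are PyPI distribution names.
PINNED = {
    "protobuf": "3.20.0",
    "torch": "1.7.1",
    "tensorboard": "2.4.1",
    "gym": "0.18.0",
    "stable-baselines3": "1.1.0",
}


def installed_version(name: str) -> str:
    """Return the installed version of a distribution, or '' if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return ""


def find_mismatches(pinned: Dict[str, str],
                    lookup: Callable[[str], str] = installed_version) -> Dict[str, str]:
    """Map each mismatching package to the version actually found ('' if absent)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Drop local build tags such as '+cu110' before comparing.
        if found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {found or 'nothing'}")
```

An empty result from `find_mismatches(PINNED)` means every listed Python package matches its pinned version. Ubuntu, ROS, and the Python interpreter itself are not covered by this check.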


This package requires these packages:

Optional packages:

We provide two ways to install our DRL-VO navigation packages on Ubuntu 20.04:

  1. install them standalone on your PC;
  2. use a pre-created Singularity container directly (no need to configure the environment).

1) Standalone installation on PC:

  1. install ROS Noetic by following the ROS installation document.
  2. install the required learning-based packages:
sudo apt-get install python-is-python3
pip install protobuf==3.20.0
pip install torch==1.7.1+cu110 -f
pip install gym==0.18.0 pandas==1.2.1
pip install stable-baselines3==1.1.0
pip install tensorboard==2.4.1 psutil cloudpickle
  3. install Turtlebot2 ROS packages:
sudo apt-get install ros-noetic-move-base*
sudo apt-get install ros-noetic-map-server*
sudo apt-get install ros-noetic-amcl*
sudo apt-get install ros-noetic-navigation*
sudo apt-get install ros-noetic-ecl-threads
mkdir -p ~/catkin_ws/src
cd ~/catkin_ws/src
chmod +x 
sudo sh 
  4. install DRL-VO ROS navigation packages:
cd ~/catkin_ws/src
git clone
git clone
git clone
cd ..
catkin_make
source ~/catkin_ws/devel/

2) Using a Singularity container (all required packages are pre-installed):

  1. install singularity software:
cd ~
sudo apt install ./singularity-ce_3.9.7-bionic_amd64.deb
  2. download the pre-created "drl_vo_container.sif" to the home directory.

  3. install DRL-VO ROS navigation packages:

pip install protobuf==3.20.0 
cd ~
singularity shell --nv drl_vo_container.sif
source /etc/.bashrc
mkdir -p ~/catkin_ws/src
cd ~/catkin_ws/src
git clone
git clone
git clone
cd ..
catkin_make
source ~/catkin_ws/devel/
  4. press Ctrl + D to exit the Singularity container.


Running on PC:

  • train on desktop (with a GUI): the trained models and log files will be stored in "~/drl_vo_runs"
roscd drl_vo_nav
cd ..
sh ~/drl_vo_runs
  • train on server (without a GUI): the trained models and log files will be stored in "~/drl_vo_runs"
roscd drl_vo_nav
cd ..
sh ~/drl_vo_runs
  • inference on desktop (navigation):
roscd drl_vo_nav
cd ..

You can then use the "2D Nav Goal" button in RViz to set a goal of your choice for the robot, as shown below: sending_goal_demo

Running on a singularity container:

  • train on desktop (with a GUI): the trained models and log files will be stored in "~/drl_vo_runs"
cd ~
singularity shell --nv drl_vo_container.sif
source /etc/.bashrc
source ~/catkin_ws/devel/
roscd drl_vo_nav
cd ..
sh ~/drl_vo_runs
  • train on server (without a GUI): the trained models and log files will be stored in "~/drl_vo_runs"
cd ~
singularity shell --nv drl_vo_container.sif
source /etc/.bashrc
source ~/catkin_ws/devel/
roscd drl_vo_nav
cd ..
sh ~/drl_vo_runs
  • inference on desktop (navigation):
cd ~
singularity shell --nv drl_vo_container.sif
source /etc/.bashrc
source ~/catkin_ws/devel/
roscd drl_vo_nav
cd ..

Deploy on a hardware robot or other simulators for application or evaluation:

  • As an example, take a Jackal robot equipped with a ZED2 camera and a Hokuyo UTM-30LX lidar, where the ZED2 camera directly provides pedestrian tracking information. You can deploy our DRL-VO control policy using either a standalone installation or a Singularity container:
roscd drl_vo_nav
cd ..
git checkout deploy
cd ../..
source ~/catkin_ws/devel/

Please modify the following arguments in drl_vo_nav.launch to match your robot and environment:

    <!-- rviz -->
    <arg name="rviz"                        default="false"/>
    <!-- Map -->
    <arg name="map_file"                    default="$(find drl_vo_nav)/maps/coe_full_lobby/coe_full_lobby2.yaml"/>
    <!-- Subscriber topics -->
    <arg name="scan_topic"                  default="scan"/>  <!-- sensor_msgs::LaserScan -->
    <arg name="ped_topic"                   default="zed_node/obj_det/objects"/>  <!-- zed_interfaces::object_stamped -->
    <arg name="vel_topic"                   default="jackal_velocity_controller/cmd_vel"/> <!-- geometry_msgs::Twist -->
    <arg name="odom_topic"                  default="odometry/filtered" />  <!-- nav_msgs::Odometry  -->
    <!-- Publisher topics -->
    <arg name="smooth_cmd_vel_topic"        default="cmd_vel"/>  <!-- robot control command: geometry_msgs::Twist -->
    <!-- AMCL initial pose -->
    <arg name="initial_pose_x"              default="0.0"/>
    <arg name="initial_pose_y"              default="0.0"/>
    <arg name="initial_pose_a"              default="0.0"/>
    <!-- TF frames -->
    <arg name="base_frame_id"               default="base_link"/>
    <arg name="global_frame_id"             default="map"/>
    <arg name="odom_frame_id"               default="odom"/>
    <!-- Pure Pursuit parameters -->
    <arg name="lookahead"                   default="2.0"/>
    <arg name="rate"                        default="20.0"/>
    <!-- DRL-VO parameters -->
    <arg name="model_file"                  default="$(find drl_vo_nav)/src/model/"/>
    <arg name="vx_limit"                    default="0.5"/>
    <arg name="wz_limit"                    default="0.7"/>
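The `lookahead` argument above belongs to the pure pursuit stage, which presumably selects the sub-goal point that is fed to the policy. The sketch below shows one common way to pick such a point from a planned path. It is a generic illustration under that assumption, not the code shipped in drl_vo_nav, and the function name and path representation are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def pick_sub_goal(path: List[Point], robot: Point, lookahead: float = 2.0) -> Point:
    """Pick the first path point at least `lookahead` metres from the robot,
    searching forward from the path point closest to the robot."""
    if not path:
        raise ValueError("path must contain at least one point")
    # Index of the path point nearest to the robot.
    nearest = min(range(len(path)), key=lambda i: math.dist(path[i], robot))
    for point in path[nearest:]:
        if math.dist(point, robot) >= lookahead:
            return point
    # The remaining path is shorter than the lookahead: aim at the final goal.
    return path[-1]
```

In this sketch, a larger `lookahead` places the sub-goal further along the path, while a smaller value keeps it closer to the robot's current position.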

Then modify the navigation parameter files in the param folder, especially the 'obstacles_layer' section of costmap_common_params.yaml:

  observation_sources: scan
  scan: {sensor_frame: hokuyo_link, data_type: LaserScan, topic: scan, marking: true, clearing: true, min_obstacle_height: -2.0, max_obstacle_height: 2.0, obstacle_range: 8, raytrace_range: 8.5}

Finally, start navigation with roslaunch:

roslaunch drl_vo_nav drl_vo_nav.launch
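Instead of editing the launch file, the arguments listed earlier can also be overridden on the command line with the standard roslaunch `name:=value` syntax. The helper below is a small sketch that assembles such a command. The function name is hypothetical, and the example values (`front/scan`, `0.3`) are placeholders rather than recommended settings.

```python
import shlex
from typing import Dict, List


def build_roslaunch_cmd(overrides: Dict[str, str]) -> List[str]:
    """Build a roslaunch command for drl_vo_nav.launch with argument overrides."""
    cmd = ["roslaunch", "drl_vo_nav", "drl_vo_nav.launch"]
    for name, value in overrides.items():
        # roslaunch accepts launch arguments as name:=value pairs.
        cmd.append(f"{name}:={value}")
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace them with your robot's topic and speed limit.
    command = build_roslaunch_cmd({"scan_topic": "front/scan", "vx_limit": "0.3"})
    print(shlex.join(command))
```

The returned list can be passed to `subprocess.run` as-is, or printed with `shlex.join` and pasted into a terminal where the catkin workspace has been sourced.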


  • Xie, Zhanteng and Dames, Philip. "DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles." IEEE Transactions on Robotics, 2023.
  • Xie, Zhanteng, Xin, Pujie, and Dames, Philip. "Towards safe navigation through crowded dynamic environments." 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).