
This project aims to solve the route optimisation problem of an individual vehicle by reinforcement learning.
Given a start terminal and an end terminal, a vehicle will follow the route computed by RL.
For the case of multiple vehicles and demands, gmmrr/fleet-route-optim is another version that aims to deal with vehicles in a fleet.
This repo is part of Guanming's capstone project.
Dijkstra's algorithm is executed as a reference:
Processing Time: 0.09724 seconds
Travelled Time: 6.15 mins


Q Learning:
Last Episode: 120
Processing Time: 42.869093 seconds
Travelled Time: 6.67 mins
SARSA is similar to Q Learning, but exploration is triggered at a given exploration_rate.
As shown in the graph, the bumps in the curve are caused by these explorations. A sketch contrasting the two update rules follows the results below.
Last Episode: 465
Processing Time: 161.478861 seconds
Travelled Time: 6.65 mins
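For reference, the two agents differ only in how the value of the next state is bootstrapped: Q Learning uses the greedy (max) action, while SARSA uses the action actually selected by the epsilon-greedy policy. The sketch below illustrates this contrast; it is not taken from agent.py, and the Q-table, helper names, and arguments are assumptions made for clarity.

```python
# Illustrative contrast between the Q Learning and SARSA update rules.
# The q_table, helper names, and arguments are assumptions, not the
# repository's agent.py.
import random
from collections import defaultdict

q_table = defaultdict(float)   # Q[(state, action)] -> estimated value
learning_rate = 0.9            # alpha, as configured in agent.py
discount_factor = 0.1          # gamma, as configured in agent.py
exploration_rate = 0.1         # epsilon, the exploration ratio used by SARSA

def epsilon_greedy(state, actions, epsilon=exploration_rate):
    # Explore with probability epsilon, otherwise pick the best known action
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q_table[(state, a)])

def q_learning_update(s, a, reward, s_next, next_actions):
    # Off-policy: bootstrap from the greedy action in the next state
    best_next = max(q_table[(s_next, a_next)] for a_next in next_actions)
    q_table[(s, a)] += learning_rate * (
        reward + discount_factor * best_next - q_table[(s, a)]
    )

def sarsa_update(s, a, reward, s_next, a_next):
    # On-policy: bootstrap from the action the policy actually chose,
    # so exploratory moves feed back into the value estimates
    q_table[(s, a)] += learning_rate * (
        reward + discount_factor * q_table[(s_next, a_next)] - q_table[(s, a)]
    )
```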
Obviously, the outcomes of both Q Learning and SARSA are slightly worse than the classic Dijkstra algorithm. If the map is well informed, Dijkstra is one of the best choices thanks to its highly effective performance. Although the map in this project is not infinitely extended, because OSM limits the number of nodes, the results are close enough. The result of SARSA is also slightly better than its Q Learning counterpart, which indicates the benefit of the on-policy update that SARSA takes.
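As a side note, a minimal Dijkstra sketch over a plain adjacency list is given below. It is a generic illustration of the reference algorithm, not the code used to produce the result above, and the toy graph format is an assumption.

```python
# Minimal Dijkstra sketch on a dict-based graph of travel times (seconds).
# Illustrative only; the project runs its reference route on the SUMO network.
import heapq

def dijkstra(graph, start, end):
    # graph: {node: [(neighbour, travel_time), ...]}; assumes end is reachable
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == end:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                prev[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))
    # Walk back from the end node to reconstruct the route
    route, node = [end], end
    while node != start:
        node = prev[node]
        route.append(node)
    return list(reversed(route)), dist[end]
```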
Several simplifications are made in this project:

- Congestion is randomly generated: similar to the situation mentioned above, congested edges are randomly chosen from the edge space, and their level (low, medium, or high) can be defined in fleet_environment.py (a sketch follows this list).
- Speed is a constant: the network downloaded from the OSM website classifies each edge type, such as primary, secondary, or residential highway, and each type has a defined speed. Acceleration is not taken into consideration in this project, so this is some distance from the practical case.
- Traffic lights are set on a 90-second interval: even though this is close to the practical case, it is still not real; in reality they are controlled by a program rather than a constant pattern.
- The terminal condition of RL: convergence is declared when the travelled time (rounded to the second decimal place) is consistent over 5 consecutive episodes (see the sketch after this list).
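As a rough illustration of the first and last points, the sketch below shows one way to assign random congestion levels to edges and to test the five-episode convergence rule. The function and variable names are hypothetical and not taken from fleet_environment.py.

```python
# Hypothetical sketch of two assumptions listed above: random congestion
# levels per edge, and convergence once five consecutive episodes report the
# same travelled time after rounding to two decimal places.
import random

CONGESTION_LEVELS = ["low", "medium", "high"]

def sample_congestion(edge_ids, n_congested):
    # Pick some edges at random and assign each a random congestion level
    congested = random.sample(edge_ids, n_congested)
    return {edge: random.choice(CONGESTION_LEVELS) for edge in congested}

def has_converged(travelled_times, window=5):
    # The last `window` episodes must agree once rounded to 2 decimal places
    if len(travelled_times) < window:
        return False
    recent = {round(t, 2) for t in travelled_times[-window:]}
    return len(recent) == 1
```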
- Download SUMO (https://sumo.dlr.de/docs/Downloads.php)
- Clone this repository to your local machine
- Install the necessary packages by running:
  ```
  $ pip3 install -r requirements.txt
  ```
- Update main.py with your SUMO directory to set the environment variable:
  ```python
  def sumo_config():
      os.environ["SUMO_HOME"] = '$SUMO_HOME'  # -- change to your path to $SUMO_HOME
      ...
  ```
- Upload your netedit file and update the network_file variable:
  ```python
  network_file = './network_files/ncku_network.net.xml'
  ```
  More on the OSM website: https://www.openstreetmap.org/
  The config command is saved in ./network_files/config.txt
- Upload your traffic_light file:
  ```python
  tls = tls_from_tllxml('./network_files/ncku_network.tll.xml')
  ```
  This file can be converted by Netedit; more on https://sumo.dlr.de/docs/Netedit/index.html
- Edit start_node and end_node in main.py:
  ```python
  # 02 Configure network variables
  start_node = "864831599"  # can be changed; the scope is the nodes in the network
  end_node = "5739293224"
  ```
- Run the code:
  ```
  $ python3 main.py
  ```
In agent.py, we can set:
```python
# Hyperparameters for Q_Learning
learning_rate = 0.9      # alpha
discount_factor = 0.1    # gamma

# Hyperparameters for SARSA
learning_rate = 0.9      # alpha
discount_factor = 0.1    # gamma
exploration_rate = 0.1   # ratio of exploration and exploitation
```
and we have:
```python
reward_lst = [-50, -50, -30, 100, 50, -1]
```
They are defined as [invalid_action_reward, dead_end_reward, loop_reward, completion_reward, bonus_reward, continue_reward], respectively.
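As a hedged illustration only, the snippet below shows one way such a list can be indexed by outcome name; the outcome labels and the get_reward helper are hypothetical and may not match agent.py.

```python
# Hypothetical mapping from transition outcomes to entries of reward_lst;
# agent.py may index the list differently.
reward_lst = [-50, -50, -30, 100, 50, -1]

REWARD_INDEX = {
    "invalid_action": 0,
    "dead_end": 1,
    "loop": 2,
    "completion": 3,
    "bonus": 4,
    "continue": 5,
}

def get_reward(outcome):
    # Look up the reward associated with a named transition outcome
    return reward_lst[REWARD_INDEX[outcome]]

# Example: an ordinary step that just continues along the route costs -1
assert get_reward("continue") == -1
```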