Update README.md
williamshen-nz authored Sep 6, 2019
1 parent 84a20de · commit 978ac92
Showing 1 changed file with 2 additions and 2 deletions.
README.md: 4 changes (2 additions, 2 deletions)
@@ -2,7 +2,7 @@

Research project for COMP3770 at the Australian National University.

-Published at the Symposium on Combinatorial Search 2019 as "[Guiding Search with Generalized Policies for Probabilistic Planning
+Published at the Symposium on Combinatorial Search (SoCS) 2019 as "[Guiding Search with Generalized Policies for Probabilistic Planning
](https://aaai.org/ocs/index.php/SOCS/SOCS19/paper/view/18334)"

## Abstract
@@ -11,7 +11,7 @@ Planning is the essential ability of an intelligent agent to solve the problem o

Monte-Carlo Tree Search (MCTS) is a state-space search algorithm for optimal decision making that relies on performing Monte-Carlo simulations to incrementally build a search tree, and estimate the values of each state. MCTS can often achieve state-of-the-art performance when combined with domain-specific knowledge. However, without this knowledge, MCTS requires a large number of simulations in order to obtain reliable estimates in the search tree.

-The Action Schema Network (ASNets) [Toyer et al., 2018](https://github.com/qxcv/asnets) is a very recent contribution in planning that uses deep learning and neural networks to learn generalized policies for planning problems. ASNets are well suited to problems where the ``local knowledge of the environment can help to avoid certain traps''. However, like most machine learning algorithms, an ASNet may fail to generalize to problems that it was not trained on. For example, this could be due to a poor choice of hyperparameters that lead to an undertrained or overtrained network.
+The Action Schema Network (ASNets) \[[Toyer et al., 2018](https://github.com/qxcv/asnets)\] is a very recent contribution in planning that uses deep learning and neural networks to learn generalized policies for planning problems. ASNets are well suited to problems where the "local knowledge of the environment can help to avoid certain traps". However, like most machine learning algorithms, an ASNet may fail to generalize to problems that it was not trained on. For example, this could be due to a poor choice of hyperparameters that lead to an undertrained or overtrained network.

This research project is concerned with investigating how we can improve upon the policy learned by an ASNet by combining it with MCTS. Our project has three key contributions. The first contribution is an ingredient-based framework for MCTS that allows us to specify different flavors of MCTS -- including those which use the policy learned by an ASNet. Our second contribution is two new methods which allow us to use ASNets to perform simulations in MCTS, and hence directly affect the estimated values of states in the search tree. Our third and final contribution is two new methods for using ASNets in the selection phase of MCTS. This allows us to bias the navigation of the search space towards what an ASNet believes is promising.
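The abstract above summarizes MCTS as incrementally building a search tree from Monte-Carlo simulations. A minimal sketch of that four-phase loop (selection, expansion, simulation, backpropagation) with a UCB1 selection rule follows; the toy number-line domain and all names (`Node`, `ucb1`, `mcts`) are hypothetical illustrations for this page, not code from the repository.

```python
import math
import random

# Hypothetical toy domain: walk on a number line, reward for reaching +5.
# Purely illustrative; not one of the project's planning domains.
ACTIONS = [-1, +1]

def step(state, action):
    return state + action

def is_terminal(state, depth):
    return abs(state) >= 5 or depth >= 20

def reward(state):
    return 1.0 if state >= 5 else 0.0

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}           # action -> Node
        self.visits, self.value = 0, 0.0

def ucb1(parent, child, c=1.4):
    # Classic UCB1: exploit the mean return, explore rarely-visited children.
    return (child.value / child.visits
            + c * math.sqrt(math.log(parent.visits) / child.visits))

def mcts(root_state, n_simulations=1000):
    root = Node(root_state)
    for _ in range(n_simulations):
        node, depth = root, 0
        # 1. Selection: descend while the node is fully expanded.
        while len(node.children) == len(ACTIONS) and not is_terminal(node.state, depth):
            node = max(node.children.values(), key=lambda ch: ucb1(node, ch))
            depth += 1
        # 2. Expansion: add one untried action.
        if not is_terminal(node.state, depth):
            action = random.choice([a for a in ACTIONS if a not in node.children])
            node.children[action] = Node(step(node.state, action), parent=node)
            node = node.children[action]
            depth += 1
        # 3. Simulation: random rollout from the new node to a terminal state.
        state = node.state
        while not is_terminal(state, depth):
            state = step(state, random.choice(ACTIONS))
            depth += 1
        ret = reward(state)
        # 4. Backpropagation: update value estimates along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += ret
            node = node.parent
    # Act with the most-visited root child, a common robust choice.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

print(mcts(0))  # estimated best first action from the initial state
```

The random rollout in step 3 is exactly where the abstract notes MCTS needs many simulations without domain knowledge, and where an ASNet policy could be substituted.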

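The third contribution, using ASNets in the selection phase, amounts to biasing selection toward actions the learned policy prefers. A PUCT-style rule in the AlphaZero family is one standard way to express such a bias; the sketch below assumes that framing rather than reproducing the paper's actual method, and `policy_prob` is a stand-in for the probability an ASNet would assign to an action.

```python
import math

def puct_score(parent_visits, child_visits, child_value, policy_prob, c_puct=1.0):
    """PUCT-style score: exploit the value estimate, but bias exploration
    toward actions a learned policy (e.g. an ASNet) considers promising.
    Generic AlphaZero-style rule, used here only to illustrate
    policy-biased selection; not claimed to be the SoCS 2019 formula."""
    exploit = child_value / child_visits if child_visits else 0.0
    explore = c_puct * policy_prob * math.sqrt(parent_visits) / (1 + child_visits)
    return exploit + explore

# An unvisited action with a high policy prior is tried early...
print(puct_score(parent_visits=10, child_visits=0, child_value=0.0, policy_prob=0.8))
# ...while a visited action with a low prior leans on its value estimate.
print(puct_score(parent_visits=10, child_visits=5, child_value=3.0, policy_prob=0.1))
```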
