This repository contains the official implementation of PDVN, the algorithm introduced in our paper Retrosynthetic Planning with Dual Value Networks (ICML 2023). It is based on the open-source codebase of Retro*.
In this work, we aim to use reinforcement learning (RL) to fine-tune single-step retrosynthesis prediction models through simulated synthetic routes. The RL training is expected to improve both (1) the synthesizability and (2) the route quality of the output routes while retaining the single-step accuracy of the SL pretrained model. We implement the PDVN algorithm described in our paper, along with a parallel training framework that scales training to a large number of molecules. We also provide all the datasets used in the paper, including ChEMBL-1000 and GDB17-1000.
conda env create --file environment.yml
conda activate pdvn
To reproduce the results in the paper, we also need the additional files containing the training dataset, evaluation datasets (USPTO, ChEMBL-1000, GDB17-1000), SL-pretrained model, starting molecules, and the template rules.
All the necessary files can be downloaded from the link and should be copied into the retro_star folder.
The PDVN training algorithm is implemented in retro_mcts.py, which alternates between the sampling phase and the training phase. Checkpoints are saved in the folder ${RESULT_FOLDER} each time another save_interval molecules have been trained on.
To improve sampling efficiency, n_processes specifies the number of processes used for sampling synthesis routes across n_gpus GPUs. For example, --n_processes 15 --n_gpus 3 means that 15 processes sample synthesis routes over 3 GPUs, with each GPU handling 5 sampling processes. The gpu option selects the GPU used for training the model.
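The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: it assumes sampling processes are spread over the GPUs in round-robin order, which may not match what retro_mcts.py actually does, and assign_processes_to_gpus is a hypothetical helper that is not part of this codebase.

```python
from collections import Counter
from typing import List


def assign_processes_to_gpus(n_processes: int, n_gpus: int) -> List[int]:
    """Return a GPU index for each sampling process.

    Assumption: processes are assigned round-robin by rank. The real
    assignment used by the training script may differ.
    """
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")
    return [rank % n_gpus for rank in range(n_processes)]


# Mirrors the example flags: --n_processes 15 --n_gpus 3
assignment = assign_processes_to_gpus(n_processes=15, n_gpus=3)
per_gpu = Counter(assignment)
print(dict(sorted(per_gpu.items())))  # {0: 5, 1: 5, 2: 5}
```

When n_processes is not a multiple of n_gpus, this scheme gives the lower-indexed GPUs one extra process each.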
# scripts/pdvn_training.sh
EXP_NAME=pdvn
RESULT_FOLDER=./results/pdvn/run1/
cd retro_star
python retro_mcts.py \
--name ${EXP_NAME} \
--starting_molecules dataset/origin_dict_canonical.csv \
--lr 1e-3 --minibatch_size 128 --batch_size 1024 \
--realistic_filter \
--n_epochs 3 --iteration 500 --num_simulations 100 --depth 15 --PUCT_coef 1.0 \
--runner parallel --gpu 3 --n_gpus 3 --n_processes 15 \
--save_model --save_interval 25000 --result_folder ${RESULT_FOLDER} \
--train_routes dataset/routes_train.pkl \
--test_interval 5000 --test_batch_size 128
After training, we evaluate the model on the USPTO test dataset. This serves two purposes: (1) to assess the performance of the model, and (2) to choose the best checkpoint for evaluation on other datasets, e.g., ChEMBL-1000 and GDB17-1000.
The evaluation algorithm requires result_folder and test_routes to be specified. The result_folder is the folder where the checkpoints are saved, and test_routes is the dataset used for evaluation.
# scripts/evaluation.sh
RESULT_FOLDER=./results/pdvn/run1/
TEST_ROUTES=dataset/routes_possible_test_hard.pkl
cd retro_star
python test_plan.py \
--method retro \
--starting_molecules dataset/origin_dict_canonical.csv \
--realistic_filter \
--iteration 500 --gpu 0 \
--runner parallel --n_gpus 4 --n_processes 20 \
--result_folder ${RESULT_FOLDER} \
--test_routes ${TEST_ROUTES}
We also provide the script to visualize the results across all the checkpoints on the USPTO test dataset. This dataset also serves as the evaluation dataset for selecting the best checkpoint (highest success rate) to further assess the performance on other datasets, such as ChEMBL-1000 and GDB17-1000.
# scripts/plot.sh
RESULT_FOLDER=./results/pdvn/run1/
PLAN_INFO=retro_routes_possible_test_hard
cd retro_star
python plan_results_analyzer.py \
--result_folder ${RESULT_FOLDER} \
--plan_info ${PLAN_INFO}
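Selecting the checkpoint with the highest success rate can be sketched as below. This is a minimal illustration, not the logic of plan_results_analyzer.py: the checkpoint names and success rates are made-up placeholders rather than results from the paper, and select_best_checkpoint is a hypothetical helper that assumes the per-checkpoint success rates have already been collected into a dictionary.

```python
from typing import Dict


def select_best_checkpoint(success_rates: Dict[str, float]) -> str:
    """Return the checkpoint name with the highest success rate.

    Ties are resolved in favor of the checkpoint that appears first.
    """
    if not success_rates:
        raise ValueError("no checkpoints to choose from")
    return max(success_rates, key=success_rates.get)


# Placeholder values for illustration only (not real results).
example_rates = {
    "checkpoint_a": 0.80,
    "checkpoint_b": 0.85,
    "checkpoint_c": 0.83,
}
best = select_best_checkpoint(example_rates)
print(best)  # checkpoint_b
```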
After choosing the best checkpoint according to the USPTO test dataset, we combine the model with the retro*-0 planner (which can be replaced by other planners) and evaluate it on the ChEMBL-1000 and GDB17-1000 datasets.
# scripts/retro_star_search.sh
RESULT_FOLDER=./results/retro_star/chembl/
TEST_ROUTES=dataset/chembl_1000.pkl
BACKWARD_MODEL=<REPLACE_THIS_WITH_BEST_MODEL>
cd retro_star
python retro_plan.py \
--test_routes ${TEST_ROUTES} \
--mlp_model_dump ${BACKWARD_MODEL} \
--result_folder ${RESULT_FOLDER} \
--iteration 500
To evaluate on the GDB17-1000 dataset, replace the value of TEST_ROUTES with dataset/gdb17_1000.pkl.
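To avoid editing the script by hand for each dataset, the two command lines can be generated programmatically. The sketch below only builds and prints the commands, using the flags shown in the script above; it does not run them. The names build_search_command and BEST_MODEL are hypothetical, the model path is a placeholder to fill in, and the result folder for GDB17-1000 is an assumed name that simply mirrors the ChEMBL one.

```python
import shlex
from typing import Dict, List


def build_search_command(test_routes: str, backward_model: str,
                         result_folder: str, iteration: int = 500) -> List[str]:
    """Build the retro_plan.py argument list for one evaluation dataset."""
    return [
        "python", "retro_plan.py",
        "--test_routes", test_routes,
        "--mlp_model_dump", backward_model,
        "--result_folder", result_folder,
        "--iteration", str(iteration),
    ]


# Placeholder: replace with the path to the best checkpoint.
BEST_MODEL = "<BEST_MODEL_CHECKPOINT>"

# Dataset paths come from the text above; the gdb17 result folder is assumed.
datasets: Dict[str, str] = {
    "chembl": "dataset/chembl_1000.pkl",
    "gdb17": "dataset/gdb17_1000.pkl",
}

commands = {
    name: build_search_command(path, BEST_MODEL, f"./results/retro_star/{name}/")
    for name, path in datasets.items()
}

# Print shell-quoted commands; run them from inside the retro_star folder.
for name, cmd in commands.items():
    print(shlex.join(cmd))
```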
If you use PDVN in your research, please consider citing our work in the following BibTeX format:
@InProceedings{pmlr-pdvn,
title = {Retrosynthetic Planning with Dual Value Networks},
author = {Liu, Guoqing and Xue, Di and Xie, Shufang and Xia, Yingce and Tripp, Austin and Maziarz, Krzysztof and Segler, Marwin and Qin, Tao and Zhang, Zongzhang and Liu, Tie-Yan},
booktitle = {Proceedings of the 40th International Conference on Machine Learning},
pages = {22266--22276},
year = {2023},
volume = {202},
}