ReBADD-SE: Multi-objective Molecular Optimisation using SELFIES Fragment and Off-Policy Self-critical Sequence Training

This is the repository for ReBADD-SE, a multi-objective molecular optimization model that designs molecular structures in SELFIES format. For more details, please refer to our paper.

Latest update: 26 Jan 2024

Install

conda env create -f environment.yml

Task Descriptions

TASK1: ReBADD-SE for GSK3b, JNK3, QED, and SA (frag-level)
TASK3: ReBADD-SE for BCL2, BCLXL, and BCLW (frag-level)
TASK4: ReBADD-SE for BCL2, BCLXL, and BCLW (char-level)
TASK7: SELFIES Collapse Analysis between ReBADD-SE (frag-level) and GA+D

Notebook Descriptions

0_preprocess_data.ipynb

(Important!) Before starting any TASK, please first run the scripts in the directory 'data/chembl' or 'data/zinc15'
Read the training data
Preprocess the data for model training

1_pretraining.ipynb

Read the training data
The generator learns the grammar rules of SELFIES

2_optimize+{objectives}.ipynb

(Important!) Please first check 'ReBADD_config.py', in which the reward function must be defined appropriately
Load the pretrained generator
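
As a rough illustration of what a multi-objective reward function can look like, the sketch below combines per-objective scores into a single scalar using a geometric mean. This is not the reward defined in 'ReBADD_config.py'. The function name combined_reward, the dictionary keys, and the assumption that every score is already normalised to [0, 1] with higher meaning better are all hypothetical.

```python
from math import prod
from typing import Mapping, Optional


def combined_reward(scores: Optional[Mapping[str, float]]) -> float:
    """Hypothetical reward: geometric mean of per-objective scores.

    Assumes each score is already normalised to [0, 1], higher is better.
    Returns 0.0 when no scores are available (e.g. an invalid molecule).
    """
    if not scores:
        return 0.0
    # Clip defensively so out-of-range predictions cannot dominate the reward.
    clipped = [min(max(s, 0.0), 1.0) for s in scores.values()]
    return prod(clipped) ** (1.0 / len(clipped))


# Illustrative values only; the keys are placeholder objective names.
example = {"gsk3b": 0.25, "jnk3": 1.0}
print(combined_reward(example))
```

A geometric mean is one possible design choice: a molecule that scores zero on any single objective gets zero reward, so the generator is not rewarded for trading one objective away entirely.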

3_checkpoints+{objectives}.ipynb

Load the checkpoints stored during optimization
Sample molecules for each checkpoint

4_calculate_properties.ipynb

For each checkpoint, load the sampled molecules
Evaluate their property scores

5_evaluate_checkpoints.ipynb

Calculate metrics (e.g. success rate)
Find the best checkpoint
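
The two steps above can be sketched as follows, assuming that a molecule counts as a success when every property meets a chosen threshold and that the best checkpoint is the one with the highest success rate. The function names, checkpoint labels, property keys, and threshold values are hypothetical and are not taken from the notebook.

```python
from typing import List, Mapping


def success_rate(molecules: List[Mapping[str, float]],
                 thresholds: Mapping[str, float]) -> float:
    """Fraction of molecules whose every listed property meets its threshold."""
    if not molecules:
        return 0.0
    hits = sum(
        all(mol[name] >= cutoff for name, cutoff in thresholds.items())
        for mol in molecules
    )
    return hits / len(molecules)


def best_checkpoint(results: Mapping[str, List[Mapping[str, float]]],
                    thresholds: Mapping[str, float]) -> str:
    """Return the checkpoint label with the highest success rate."""
    return max(results, key=lambda ckpt: success_rate(results[ckpt], thresholds))


# Illustrative data only: property scores of sampled molecules per checkpoint.
thresholds = {"gsk3b": 0.5, "jnk3": 0.5}
results = {
    "ckpt_10": [{"gsk3b": 0.6, "jnk3": 0.4}, {"gsk3b": 0.7, "jnk3": 0.8}],
    "ckpt_20": [{"gsk3b": 0.9, "jnk3": 0.6}, {"gsk3b": 0.8, "jnk3": 0.7}],
}
for name, mols in results.items():
    print(name, success_rate(mols, thresholds))
print("best:", best_checkpoint(results, thresholds))
```

If a property is better when lower, it would need to be inverted or given its own comparison before using a sketch like this.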

Note

If you have any further questions, please do not hesitate to let me know.

[email protected]

Citation

@article{CHOI2023106721,
	title = {ReBADD-SE: Multi-objective molecular optimisation using SELFIES fragment and off-policy self-critical sequence training},
	journal = {Computers in Biology and Medicine},
	volume = {157},
	pages = {106721},
	year = {2023},
	issn = {0010-4825},
	doi = {https://doi.org/10.1016/j.compbiomed.2023.106721},
	url = {https://www.sciencedirect.com/science/article/pii/S0010482523001865},
	author = {Jonghwan Choi and Sangmin Seo and Seungyeon Choi and Shengmin Piao and Chihyun Park and Sung Jin Ryu and Byung Ju Kim and Sanghyun Park},
	keywords = {Drug discovery, De novo drug design, Multi-objective optimisation, SELFIES, Reinforcement learning}
}
