Merge pull request #65 from volkamerlab/dev
[v1.2.0] KinFragLib code and dataset update
PaulaKramer authored Apr 9, 2024
2 parents 33ac787 + ba07c3a commit 866ede4
Showing 72 changed files with 518,663 additions and 312,181 deletions.
27 changes: 13 additions & 14 deletions .github/workflows/ci.yml
@@ -21,31 +21,30 @@ jobs:
fail-fast: false
matrix:
cfg:
- os: ubuntu-latest
python-version: "3.7"
- os: ubuntu-latest
python-version: "3.8"
#- os: ubuntu-latest
# python-version: "3.9"
- os: ubuntu-latest
python-version: "3.9"
#- os: macos-latest
# python-version: "3.6"
# python-version: "3.9"
#- os: windows-latest
# python-version: "3.6"
# python-version: "3.9"

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4

# More info on options: https://github.com/conda-incubator/setup-miniconda
- uses: conda-incubator/setup-miniconda@v2
- uses: conda-incubator/setup-miniconda@v3
with:
python-version: ${{ matrix.cfg.python-version }}
mamba-version: "*"
miniforge-variant: Mambaforge
miniforge-version: latest
environment-file: environment.yml
channels: conda-forge,defaults
activate-environment: kinfraglib
auto-update-conda: false
auto-activate-base: false
show-channel-urls: true
#auto-update-conda: false
#auto-activate-base: false
#show-channel-urls: true

- name: Additional info about the build
shell: bash
@@ -64,7 +63,7 @@ jobs:
shell: bash -l {0}
run: |
echo "Download combinatorial library from zenodo..."
wget -q -O data/combinatorial_library/combinatorial_library.tar.bz2 https://zenodo.org/record/3956580/files/combinatorial_library.tar.bz2?download=1
wget -q -O data/combinatorial_library/combinatorial_library.tar.bz2 https://zenodo.org/record/10843763/files/combinatorial_library.tar.bz2?download=1
ls -l data/combinatorial_library/
echo "Decompress selected files..."
tar -xvf data/combinatorial_library/combinatorial_library.tar.bz2 combinatorial_library_deduplicated.json chembl_standardized_inchi.csv
@@ -75,6 +74,6 @@ jobs:
- name: Run tests
shell: bash -l {0}
run: |
PYTEST_ARGS="--nbval-lax --current-env --nbval-cell-timeout=900"
PYTEST_ARGS="--nbval-lax --nbval-current-env --nbval-cell-timeout=1800"
pytest $PYTEST_ARGS
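
For readers who want to reproduce the workflow's download step outside CI, here is a minimal Python sketch of the `wget` and `tar` commands shown in that step. The URL and the two member names are copied from the workflow; the function names and the `out_dir` parameter are assumptions made for illustration and are not part of KinFragLib.

```python
import tarfile
import urllib.request
from pathlib import Path
from typing import List

# URL and member names are copied from the workflow step in this commit.
ARCHIVE_URL = (
    "https://zenodo.org/record/10843763/files/"
    "combinatorial_library.tar.bz2?download=1"
)
SELECTED_MEMBERS = [
    "combinatorial_library_deduplicated.json",
    "chembl_standardized_inchi.csv",
]


def download_archive(url: str, target: Path) -> Path:
    """Download `url` to `target`, similar to `wget -q -O target url`."""
    target.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, target)
    return target


def extract_selected(archive: Path, members: List[str], out_dir: Path) -> List[Path]:
    """Extract only the named members, similar to `tar -xvf archive m1 m2`.

    `out_dir` is a hypothetical parameter; the workflow extracts into the
    current working directory.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, mode="r:bz2") as tar:
        for name in members:
            # getmember raises KeyError if the archive lacks the file.
            tar.extract(tar.getmember(name), path=out_dir)
    return [out_dir / name for name in members]
```

A possible call sequence would be `download_archive(ARCHIVE_URL, Path("data/combinatorial_library/combinatorial_library.tar.bz2"))` followed by `extract_selected(...)` on the returned path. Extracting only the two files the tests need avoids unpacking the full combinatorial library.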
12 changes: 12 additions & 0 deletions CITATION.bib
@@ -0,0 +1,12 @@
@article{doi:10.1021/acs.jcim.0c00839,
author = {Sydow, Dominique and Schmiel, Paula and Mortier, Jérémie and Volkamer, Andrea},
title = {KinFragLib: Exploring the Kinase Inhibitor Space Using Subpocket-Focused Fragmentation and Recombination},
journal = {Journal of Chemical Information and Modeling},
volume = {60},
number = {12},
pages = {6081-6094},
year = {2020},
doi = {10.1021/acs.jcim.0c00839},
note ={PMID: 33155465},
URL = {https://doi.org/10.1021/acs.jcim.0c00839}
}
45 changes: 41 additions & 4 deletions README.md
@@ -4,7 +4,9 @@

![KinFragLib workflow](./docs/img/toc_github_kinfraglib.png)

Please note that this repository is constantly updated. You can retrieve the repository state for the published KinFragLib paper in release [v1.0.0](https://github.com/volkamerlab/KinFragLib/releases/tag/v1.0.0).
**Note**: This repository is constantly updated, hence the statistics and numbers may deviate from those reported in the paper.
The current fragmentation library is based on the [KLIFS](https://klifs.net/) database downloaded on 06.12.2023.
You can retrieve the repository state for the published KinFragLib paper in release [v1.0.0](https://github.com/volkamerlab/KinFragLib/releases/tag/v1.0.0).

## Table of contents

@@ -61,8 +63,13 @@ fragments in order to generate novel potential inhibitors.
# Change to KinFragLib directory
cd /path/to/KinFragLib
# Create and activate environment
# Create environment
# Hint: if conda is too slow, consider mamba instead
conda env create -f environment.yml
# When using a MacBook with an M1 chip you may need instead:
CONDA_SUBDIR=osx-64 conda env create -f environment.yml
# Activate environment
conda activate kinfraglib
# Install the kinfraglib pip package
@@ -85,7 +92,7 @@ fragments in order to generate novel potential inhibitors.
Please contact us if you have questions or suggestions.

* Open an issue on our GitHub repository: https://github.com/volkamerlab/KinFragLib/issues
* Or send us an email: andrea.volkamer@charite.de
* Or send us an email: volkamer@cs.uni-saarland.de

We are looking forward to hearing from you!

@@ -95,7 +102,7 @@ This resource is licensed under the [MIT](https://opensource.org/licenses/MIT) l

## Citation

Sydow, D., Schmiel, P., Mortier, J., and Volkamer, A. KinFragLib: Exploring the Kinase Inhibitor Space Using Subpocket-Focused Fragmentation and Recombination. J. Chem. Inf. Model. 2020. https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c00839
[Sydow, D., Schmiel, P., Mortier, J., and Volkamer, A. KinFragLib: Exploring the Kinase Inhibitor Space Using Subpocket-Focused Fragmentation and Recombination. J. Chem. Inf. Model. 2020. https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c00839](CITATION.bib)

```bib
@article{doi:10.1021/acs.jcim.0c00839,
@@ -111,3 +118,33 @@ note ={PMID: 33155465},
URL = {https://doi.org/10.1021/acs.jcim.0c00839}
}
```
## List of publications
- **Kinase Inhibitor Scaffold Hopping with Deep Learning Approaches**
Lizhao Hu, Yuyao Yang, Shuangjia Zheng, Jun Xu, Ting Ran, and Hongming Chen
*Journal of Chemical Information and Modeling* **2021**
[10.1021/acs.jcim.1c00608](https://pubs.acs.org/doi/full/10.1021/acs.jcim.1c00608)
- **TWN-FS method: A novel fragment screening method for drug discovery**
Yoon, Hye Ree and Park, Gyoung Jin and Balupuri, Anand and Kang, Nam Sook
*Computational and Structural Biotechnology Journal* **2023**
[10.1016/j.csbj.2023.09.037](https://doi.org/10.1016/j.csbj.2023.09.037)
- **Efficient Hit-to-Lead Searching of Kinase Inhibitor Chemical Space via Computational Fragment Merging**
Grigorii V. Andrianov, Wern Juin Gabriel Ong, Ilya Serebriiskii, and John Karanicolas
*Journal of Chemical Information and Modeling* **2021**
[10.1021/acs.jcim.1c00630](https://doi.org/10.1021/acs.jcim.1c00630)
- **KiSSim: Predicting Off-Targets from Structural Similarities in the Kinome**
Dominique Sydow, Eva Aßmann, Albert J. Kooistra, Friedrich Rippmann, and Andrea Volkamer
*Journal of Chemical Information and Modeling* **2022**
[10.1021/acs.jcim.2c00050](https://doi.org/10.1021/acs.jcim.2c00050)
- **Target-Focused Library Design by Pocket-Applied Computer Vision and Fragment Deep Generative Linking**
Merveille Eguida, Christel Schmitt-Valencia, Marcel Hibert, Pascal Villa, and Didier Rognan
*Journal of Medicinal Chemistry* **2022**
[10.1021/acs.jmedchem.2c00931](https://pubs.acs.org/doi/10.1021/acs.jmedchem.2c00931)
- **Guided docking as a data generation approach facilitates structure-based machine learning on kinases**
Backenköhler M, Groß J, Wolf V, Volkamer A.
*ChemRxiv* **2023**
[10.26434/chemrxiv-2023-prk53](https://chemrxiv.org/engage/chemrxiv/article-details/658441f7e9ebbb4db96d98e8) *This content is a preprint and has not been peer-reviewed.*
- **Constructing Innovative Covalent and Noncovalent Compound Libraries: Insights from 3D Protein–Ligand Interactions**
Xiaohe Xu, Weijie Han, Xiangzhen Ning, Chengdong Zang, Chengcheng Xu, Chen Zeng, Chengtao Pu, Yanmin Zhang, Yadong Chen, and Haichun Liu
*Journal of Chemical Information and Modeling* **2024**
[10.1021/acs.jcim.3c01689](https://pubs.acs.org/doi/10.1021/acs.jcim.3c01689)




6 changes: 4 additions & 2 deletions data/README.md
@@ -1,8 +1,10 @@
# Data

Overview on data content.
**Note**: Our fragmentation library is currently based on KLIFS downloaded on *06.12.2023*.

- `fragment_library/`: Full fragment library resulting from the KinFragLib fragmentation procedure comprises of about 3000 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases.
Overview of data content:

- `fragment_library/`: Full fragment library resulting from the KinFragLib fragmentation procedure comprises about 3000 fragments, which are the basis for exploring the subpocket-based chemical space of ligands co-crystallized with kinases.
- `fragment_library_filtered/`: Filtered fragment library: Select fragments tailored for the recombination (remove pool X, deduplicate per subpocket, remove unfragmented ligands, remove all fragments that connect only to pool X, keep only fragment-like fragments, and filter for hinge-like AP fragments).
- `fragment_library_reduced/`: Reduced fragment library: Select a diverse set of fragments (per subpocket) for recombination starting from the filtered fragment library.
- `combinatorial_library/`: Combinatorial library based on the reduced fragment library.
6 changes: 3 additions & 3 deletions data/combinatorial_library/README.md
@@ -1,17 +1,17 @@
# KinFragLib: Combinatorial library

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3956580.svg)](https://doi.org/10.5281/zenodo.3956580)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10843763.svg)](https://doi.org/10.5281/zenodo.10843763)

This folder is meant for the metadata and properties of the KinFragLib combinatorial library, which is based on the KinFragLib fragment library at https://github.com/volkamerlab/KinFragLib. This dataset is used for the analysis of the combinatorial library.

**Note**: Since this dataset contains large files, we provide it outside this repository at https://zenodo.org/record/3956580 (DOI: 10.5281/zenodo.3956580, v1.0.1).
**Note**: Since this dataset contains large files, we provide it outside this repository at https://zenodo.org/record/10843763 (DOI: 10.5281/zenodo.10843763, v2.0.0).
In order to run the analysis notebooks, please download this dataset to this folder.
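
The Zenodo links in this commit all follow from the record ID, so a version bump only needs that one number. As a rough sketch, the helper below derives the DOI, record, badge, and file URLs from the ID, mirroring the URL patterns that appear in this commit. The function name `zenodo_links` is hypothetical, and the patterns are assumed from the links shown here rather than taken from Zenodo's documentation.

```python
def zenodo_links(record_id: int, filename: str = "combinatorial_library.tar.bz2") -> dict:
    """Build the Zenodo links used in this repository from a record ID.

    The URL patterns are copied from the links in this commit; they are
    an assumption about Zenodo's scheme, not a documented guarantee.
    """
    doi = f"10.5281/zenodo.{record_id}"
    record_url = f"https://zenodo.org/record/{record_id}"
    return {
        "doi": doi,
        "doi_url": f"https://doi.org/{doi}",
        "record_url": record_url,
        "badge_url": f"https://zenodo.org/badge/DOI/{doi}.svg",
        "file_url": f"{record_url}/files/{filename}?download=1",
    }


if __name__ == "__main__":
    # Record ID of the v2.0.0 dataset referenced in this commit.
    for key, value in zenodo_links(10843763).items():
        print(f"{key}: {value}")
```

Keeping the record ID in one place would make it harder for the README badge, the dataset note, and the CI download URL to drift apart on the next dataset release.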

## Raw data

- `combinatorial_library.json`: Full combinatorial library, please refer to `notebooks/4_1_combinatorial_library_data_preparation.ipynb` at https://github.com/volkamerlab/KinFragLib for detailed information about this data format
- `combinatorial_library_deduplicated.json`: Deduplicated combinatorial library (based on InChIs)
- `chembl_standardized_inchi.csv`: Standardized ChEMBL 25 molecules in the form of InChI strings.
- `chembl_standardized_inchi.csv`: Standardized ChEMBL 33 molecules in the form of InChI strings.

## Processed data
