CPT

Cross-protein transfer learning for variant effect prediction

This repository contains the code and data for reproducing the main results of the manuscript "Cross-protein transfer learning substantially improves zero-shot prediction of disease variant effects".

analysis.ipynb: Jupyter notebook for the main analyses.

CPT/: Python files for models and utility functions.

data/: Data necessary to train and evaluate the models.

We also provide pre-computed CPT-1 scores for 18,602 human proteins at:

  1. Zenodo
  2. Huggingface (an interactive app to visualize and download individual proteins)
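
Once a per-protein score file has been downloaded, it can be read into a simple lookup table. The snippet below is a minimal sketch only: it assumes a CSV layout with one row per variant, and the column names `mutant` and `CPT1_score`, the helper `load_scores`, and the toy values are all hypothetical. Check the header of the actual file and adjust the column names accordingly.

```python
import csv
import io


def load_scores(handle, variant_col="mutant", score_col="CPT1_score"):
    """Read a per-protein score table into a {variant: score} dict.

    The column names are assumptions; pass the real ones from the file header.
    """
    reader = csv.DictReader(handle)
    return {row[variant_col]: float(row[score_col]) for row in reader}


# Toy stand-in for a downloaded file. The values are placeholders,
# not real CPT-1 scores.
toy_file = io.StringIO(
    "mutant,CPT1_score\n"
    "M1A,0.10\n"
    "M1C,0.25\n"
    "K2R,0.05\n"
)

scores = load_scores(toy_file)
print(scores["M1C"])
```

For a real file, replace `toy_file` with `open(path, newline="")`, where `path` points to the downloaded file.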

To generate whole-proteome predictions yourself with the trained model, download the feature matrices: EVE set, no-EVE set.

Citation

Jagota, M.*, Ye, C.*, Albors, C., Rastogi, R., Koehl, A., Ioannidis, N., and Song, Y.S.†
"Cross-protein transfer learning substantially improves disease variant prediction", Genome Biology, 24, Article Number: 182 (2023).

*These authors contributed equally to this work.
†To whom correspondence should be addressed: [email protected]

DOI: https://doi.org/10.1186/s13059-023-03024-6