
OpenFold Local Jupyter Notebook 📔 | Metrics, Plots, Concurrent Inference #484

Open · wants to merge 15 commits into `main`
@juliocesar-io commented Aug 28, 2024

Overview

This PR introduces a fully featured local notebook for running inference, computing metrics, ranking the best model, and generating plots in a structured, reproducible way, particularly for experimentation with large datasets.

The metrics are similar to those in the Colab notebook but optimized for a local installation with Docker. It also introduces parallel execution to leverage multiple GPUs.

The notebook operates by executing Docker commands using the Docker client and accessing OpenFold functions within a standalone environment. This approach ensures that the OpenFold codebase remains unaffected, serving as a client to help reproduce metrics and results from the Colab notebook locally.
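The container-dispatch pattern described above can be sketched with a small helper that assembles the options a Docker client would pass to `containers.run()`. The image tag, mount targets, and command below are illustrative assumptions, not the notebook's actual layout:

```python
# Sketch of how a Docker-based client can assemble container options.
# The image tag and bind paths are assumptions for illustration.

def build_run_kwargs(databases_dir: str, output_dir: str, command: list) -> dict:
    """Assemble keyword arguments for docker-py's containers.run().

    Mounts the databases read-only and the output directory read-write,
    and detaches so the caller can poll the container later.
    """
    return {
        "image": "openfold:latest",  # assumed local image tag
        "command": command,
        "volumes": {
            databases_dir: {"bind": "/databases", "mode": "ro"},
            output_dir: {"bind": "/output", "mode": "rw"},
        },
        "detach": True,
    }

kwargs = build_run_kwargs(
    "/data/dbs", "/data/out", ["python", "run_pretrained_openfold.py"]
)
print(kwargs["volumes"]["/data/dbs"]["bind"])  # prints "/databases"
```

With a real client these options would be passed as `docker.from_env().containers.run(**kwargs)`, keeping all OpenFold code inside the container and the notebook as a thin client.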

Usage

Refer to the instructions in `notebooks/OpenFoldLocal.ipynb`.

Set up the notebook

First, build OpenFold using Docker. Follow this guide.

Then, go to the notebook folder

```bash
cd notebooks
```

Create an environment to run Jupyter with the requirements

```bash
mamba create -n openfold_notebook python==3.10
```

Activate the environment

```bash
mamba activate openfold_notebook
```

Install the requirements

```bash
pip install -r src/requirements.txt
```

Start your Jupyter server in the current folder

```bash
jupyter lab . --ip="0.0.0.0"
```

Access the notebook URL or connect remotely using VSCode.

Inference example

Initializing the client:

```python
import docker
from src.inference import InferenceClientOpenFold

# You can also use a remote Docker server
docker_client = docker.from_env()

# Initialize the OpenFold Docker client, setting the databases path
databases_dir = "/path/to/databases"

openfold_client = InferenceClientOpenFold(databases_dir, docker_client)
```

Running Inference:

```python
# For multiple sequences, separate them with a colon `:`
input_string = "DAGAQGAAIGSPGVLSGNVVQVPVHVPVNVCGNTVSVIGLLNPAFGNTCVNA:AGETGRTGVLVTSSATNDGDSGWGRFAG"

model_name = "multimer"  # or "monomer"
weight_set = "AlphaFold"  # or "OpenFold"

# Run inference
run_id = openfold_client.run_inference(weight_set, model_name, inference_input=input_string)
```
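A colon-separated multimer string like the one above can be built from a multi-record FASTA file with the standard library alone; the helper name is illustrative, not part of the notebook's API:

```python
import os
import tempfile

def fasta_to_multimer_input(path: str) -> str:
    """Join all sequences in a FASTA file with ':' for multimer input."""
    sequences, current = [], []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current:
                    sequences.append("".join(current))
                current = []
            elif line:
                current.append(line)
    if current:
        sequences.append("".join(current))
    return ":".join(sequences)

# Demo with a throwaway two-record FASTA (placeholder sequences)
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as f:
    f.write(">chain_A\nDAGA\n>chain_B\nAGET\n")

print(fasta_to_multimer_input(f.name))  # prints "DAGA:AGET"
os.remove(f.name)
```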

Using a file:

```python
input_file = "/path/to/test.fasta"

run_id = openfold_client.run_inference(weight_set, model_name, inference_input=input_file)
```
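The parallel execution mentioned above can be sketched with `concurrent.futures`; a stub stands in for `openfold_client.run_inference` (which in the real notebook dispatches a Docker container per call), and the per-input GPU pinning is an assumption, not the notebook's actual scheduling:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub standing in for openfold_client.run_inference; the real client
# launches a Docker container and returns a run_id.
def run_inference_stub(weight_set, model_name, inference_input, gpu_id):
    return f"run-{gpu_id}"

inputs = ["SEQA", "SEQB"]  # placeholder sequences

# One worker per GPU; each input is submitted with its own GPU index.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [
        pool.submit(run_inference_stub, "AlphaFold", "monomer", seq, gpu)
        for gpu, seq in enumerate(inputs)
    ]
    run_ids = [fut.result() for fut in futures]

print(run_ids)  # prints ['run-0', 'run-1']
```

Threads (rather than processes) suffice here because each call mostly waits on a container, so the work is I/O-bound from the notebook's point of view.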

Screenshots

*(Two screenshots of the notebook, captured 2024-08-27.)*
