Update README.md #149

Status: Open. Wants to merge 3 commits into base: main.
3 changes: 2 additions & 1 deletion .gitignore
@@ -120,4 +120,5 @@ test/
*.tfevents.*
*.sto
*.a3m
nogit/
/nogit/
/out/
15 changes: 7 additions & 8 deletions README.md
@@ -145,28 +145,26 @@ The output directory of running Uni-Fold contain the predicted structures in `*.

## Training Uni-Fold

Training Uni-Fold relies on pre-calculated features of proteins. We provide a demo dataset in the [example data](example_data) folder. A larger dataset will be released soon.

### Demo case

To start with, we provide a demo script to train the monomer/multimer system of Uni-Fold:

```bash
bash train_monomer_demo.sh .
bash train_monomer_demo.sh ./out
```

and

```bash
bash train_multimer_demo.sh .
bash train_multimer_demo.sh ./out
```

This command starts a training process on the [demo data](example_data) included in this repository. Note that this demo script only tests the correctness of package installation and does not reflect any true performances.
These two commands start training processes on the [demo data](example_data) included in this repository. Note that this demo script only tests the correctness of package installation and does not reflect any true performances.


### Full training dataset download

We thank [ModelScope](https://modelscope.cn/home) and [Volcengine](https://www.volcengine.com) for providing free hosts of the full training dataset. The downloaded dataset could be directly used to train Uni-Fold from-scratch.
Training Uni-Fold relies on pre-calculated features of proteins. We thank [ModelScope](https://modelscope.cn/home) and [Volcengine](https://www.volcengine.com) for providing free hosts of the full training dataset. The downloaded dataset could be directly used to train Uni-Fold from-scratch. Notably, the full dataset is approximately 2TB in size, so make sure the local storage is capable (>3TB recommended).
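
The storage note above (about 2TB of data, >3TB recommended) could be checked before starting the download. The following is a minimal sketch, not part of the repository: `has_enough_space`, `data_path`, and the decimal definition of a terabyte are all assumptions made for illustration.

```python
import shutil

# Assumption: 1 TB = 10**12 bytes; the README does not say whether TB or TiB is meant.
TB = 10**12


def has_enough_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    data_path = "."  # hypothetical download location; replace with your own data path
    needed = 3 * TB  # the ">3TB recommended" figure from the README text
    if has_enough_space(data_path, needed):
        print("Enough free space for the full dataset.")
    else:
        free_tb = shutil.disk_usage(data_path).free / TB
        print(f"Only {free_tb:.2f} TB free at {data_path}; at least 3 TB is recommended.")
```

Running the check against the intended download directory before invoking either download method may avoid a partially completed transfer.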

#### Download from ModelScope

@@ -175,7 +173,7 @@ Download the dataset from [modelscope](https://modelscope.cn/datasets/DPTech/Uni
1. install modelscope

```bash
pip3 install https://databot-algo.oss-cn-zhangjiakou.aliyuncs.com/maas/modelscope-1.0.0-py3-none-any.whl
pip3 install modelscope
```

2. download the dataset with python
@@ -192,7 +190,8 @@ ds = MsDataset.load(dataset_name='Uni-Fold-Data', namespace='DPTech', split='tra
# The data will be located at ${your_data_path}/modelscope/hub/datasets/downloads/DPTech/Uni-Fold-Data/master/*
```

#### Download from Volcengine

#### Download from Volcengine (recommended)

1. install rclone
