README does not mention that rsync is needed for DB download #1032

Open
MamfTheKramf opened this issue Oct 18, 2024 · 0 comments

Currently, the section about downloading the genetic databases does not say that rsync must be installed to download them. It only mentions aria2c:

alphafold/README.md, lines 62 to 82 in f251de6:

1. Download genetic databases and model parameters:
* Install `aria2c`. On most Linux distributions it is available via the
package manager as the `aria2` package (on Debian-based distributions this
can be installed by running `sudo apt install aria2`).
* Please use the script `scripts/download_all_data.sh` to download
and set up full databases. This may take substantial time (download size is
556 GB), so we recommend running this script in the background:
```bash
scripts/download_all_data.sh <DOWNLOAD_DIR> > download.log 2> download_all.log &
```
* **Note: The download directory `<DOWNLOAD_DIR>` should *not* be a
subdirectory in the AlphaFold repository directory.** If it is, the Docker
build will be slow as the large databases will be copied into the docker
build context.
* It is possible to run AlphaFold with reduced databases; please refer to
the [complete documentation](#genetic-databases).

However, the `download_pdb_mmcif.sh` script does need rsync.

As far as I can see, rsync is the only other dependency needed apart from aria2. But because it is neither mentioned in this section nor checked at the very beginning of the `download_all_data.sh` script, the failure can be quite annoying: you set up the download overnight and it crashes at this step with ~180 GB still to download.
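
For illustration, an upfront check could look roughly like the sketch below. The function name `check_dependencies` is hypothetical and not part of the existing scripts. It relies only on the standard `command -v` shell builtin to test whether each named command is on the `PATH`.

```bash
# Hypothetical helper, not taken from the AlphaFold repository.
# Returns 0 if every command given as an argument is found on PATH.
# Otherwise prints the missing commands to stderr and returns 1.
check_dependencies() {
  local missing=""
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" > /dev/null 2>&1; then
      missing="${missing} ${cmd}"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Error: missing required commands:${missing}" >&2
    return 1
  fi
  return 0
}
```

Calling `check_dependencies aria2c rsync || exit 1` before the first download starts would make the script stop immediately with a clear message, instead of failing hours into the run.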

When I find time today, I can also submit a PR, as these are only very small changes.
