Docker

To install VariantValidator via Docker, first ensure you have both docker and docker-compose installed. See their documentation for information.

Create a directory to collate your cloned repositories. Move into the directory, then clone the repository.

$ git clone https://github.com/openvar/variantValidator

Once the repository has been cloned, cd into the variantValidator directory that the clone creates.

$ cd variantValidator/

If you have cloned the repository previously, update it

$ git pull

Configure

Edit the file configuration/docker.ini as required

From version 2.0.0, adding an Entrez API key is optional, as is adding an email address.

Note: configuration can be updated (see below for details)
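
Before editing, it can help to list which sections and options the file currently defines. The following is a minimal sketch using Python's standard configparser. It assumes docker.ini is a regular INI file; the section and option names in the sample text are hypothetical placeholders, not the real contents of configuration/docker.ini.

```python
import configparser


def summarise_ini(text: str) -> dict:
    """Map each section name to the list of option names it defines."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: list(parser[section]) for section in parser.sections()}


# Hypothetical sample text. To inspect the real file, pass in
# open("configuration/docker.ini").read() instead.
sample = """
[Entrez]
email = you@example.org
api_key = YOUR_KEY
"""

print(summarise_ini(sample))
```

Note that configparser lower-cases option names by default, so the listing may differ in case from the file itself.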

Build the container

Note: some of these steps can take much longer than an hour to complete, depending on the speed of your internet connection, particularly compiling SeqRepo.

Note: Depending on your system setup, you may need to use sudo or be root to run docker. If you are using sudo, either prefix the docker-compose commands below with sudo --preserve-env=HOME or, if using plain sudo, edit the docker-compose.yml file to replace ${HOME} with a location of your choice, making sure to create the variantvalidator_data and share folders as needed.

  • Pull images
$ docker-compose pull
  • Create a directory for sharing resources between your computer and the container
$ mkdir ~/variantvalidator_data
$ mkdir ~/variantvalidator_data/share

i.e. a directory called share inside variantvalidator_data in your home directory

  • Edit the vdb_docker.df file

You need to select your chipset (Arm or Intel): remove the hash (#) from the FROM line for your chip and make sure the other FROM line is commented out. The default is Intel.

# For Arm chips e.g. Apple M1
# FROM biarms/mysql:5.7

# For Intel chips
FROM mysql:5.7
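
If you are unsure which chipset you have, Python can report the machine architecture. This is a rough sketch: suggest_base_image is a hypothetical helper written for illustration, the two image names are the ones listed in vdb_docker.df above, and it assumes that Arm machines report either arm64 or aarch64.

```python
import platform


def suggest_base_image(machine: str) -> str:
    """Pick the FROM image from vdb_docker.df for a CPU architecture string."""
    arm_names = {"arm64", "aarch64"}  # assumed spellings for Arm chips
    return "biarms/mysql:5.7" if machine.lower() in arm_names else "mysql:5.7"


machine = platform.machine()
print(machine, "->", suggest_base_image(machine))
```

Treat the result as a hint only; for example, a Python interpreter running under emulation may report the emulated architecture rather than the physical one.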
  • Build
$ docker-compose build --no-cache
  • Complete build
    • The first time you do this, it will complete the build process, for example, populating the required databases
    • When this is completed you will need to shut down the services and re-start (see below)
    • The build takes a while because the vv databases are large. However, this is a significant improvement on previous versions. Install time is approximately 30 minutes (depending on the speed of your computer and internet connection)
    • The build has completed when you see the message "Successfully built "
    • example: "Successfully built fc9b83c8d21fa8bdebd52e0e87b9fde967933a043dace1a31916f8106110c8d8 "
    • Then complete the following steps
# Create the containers (This only takes a couple of minutes)
$ docker-compose up

# When you see the following message the containers have been created.
# "vvta_1     | 2021-07-23 16:29:17.590 UTC [1] LOG:  database system is ready to accept connections"
# Initial shut down prior to re-launch and working with VariantValidator in Docker
ctrl + c
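
If you save the console output to a file, the two messages quoted above can be checked for automatically. The sketch below assumes the log text is available as a string; built_image_ids and database_ready are hypothetical helper names, and the sample lines are copied from the messages shown in this section.

```python
import re

# Patterns taken from the messages quoted in this section
BUILT = re.compile(r"Successfully built ([0-9a-f]+)")
READY = "database system is ready to accept connections"


def built_image_ids(log_text: str) -> list:
    """Return the image IDs reported by 'Successfully built' lines."""
    return BUILT.findall(log_text)


def database_ready(log_text: str) -> bool:
    """True once any line reports that the database accepts connections."""
    return any(READY in line for line in log_text.splitlines())


log = (
    "Successfully built fc9b83c8d21fa8bdebd52e0e87b9fde967933a043dace1a31916f8106110c8d8\n"
    "vvta_1     | 2021-07-23 16:29:17.590 UTC [1] LOG:  database system is ready to accept connections\n"
)
print(built_image_ids(log), database_ready(log))
```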

Build errors you may encounter

If you have MySQL and/or PostgreSQL databases already running, you may encounter an error

"ERROR: for vdb Cannot start service vdb: Ports are not available: listen tcp 0.0.0.0:3306: bind: address already in use"

If you encounter these issues, stop the build by pressing ctrl+c

  • Reconfigure the ports used in the docker-compose.yml file as shown here
services:
  vdb:
    build:
      context: .
      dockerfile: vdb_docker.df
    ports:
      - "33060:3306"
      # - "3306:3306"
    expose:
      - "33060"
      # - "3306"
  uta:
    build:
      context: .
      dockerfile: uta_docker.df
    ports:
      - "54320:5432"
    expose:
      - "54320"
  • Comment out the conflicting port with a hash (#) and add the new ports as shown above
  • Force-recreate the containers
$ docker-compose down
$ docker-compose up --force-recreate
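
To find out in advance whether a host port is already taken, you can try to bind to it. This is a minimal sketch using Python's standard socket module; port_is_free is a hypothetical helper, and the ports checked at the end are the ones mentioned in the error messages in this section.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"
            return False
    return True


for port in (3306, 5432, 8000):
    print(port, "free" if port_is_free(port) else "in use")
```

A port reported as in use here is a candidate for remapping in docker-compose.yml before you run docker-compose up.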

You may encounter a build error relating to other unavailable ports

"Cannot start service restvv: Ports are not available: listen tcp 0.0.0.0:8000: bind: address already in use"

If you encounter these issues, stop the build by pressing ctrl+c

  • Reconfigure the ports used in the docker-compose.yml file as shown here
  restvv:
    build: .
    depends_on:
      - vdb
      - uta
    volumes:
      - seqdata:/usr/local/share/seqrepo
    ports:
      - "5000:5000"
      # - "8000:8000"
      - "8080:8080"
    expose:
      - "5000"
      # - "8000"
      - "8080"
  • Comment out the conflicting port with a hash (#) and add the new ports as shown above
  • Change the command in Dockerfile to reflect the changes e.g. CMD gunicorn -b 0.0.0.0:8080 app --threads=5 --chdir ./rest_VariantValidator/
  • Force-recreate the containers
$ docker-compose down
$ docker-compose up --force-recreate

Checking the installation

Go into the container via bash

$ docker-compose run vv bash

Use Pytest to check the integrity of the installation (Recommended)

  • First check that the SeqRepo has installed correctly
$ ls /usr/local/share/seqrepo/
# returns
VV_SR_2021_2
$ cd /app
$ pytest
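
The SeqRepo check can also be done from Python inside the container. The sketch below simply lists the sub-directories of the SeqRepo location used above; seqrepo_snapshots is a hypothetical helper, and it assumes each snapshot (such as VV_SR_2021_2) is a directory directly under that path.

```python
from pathlib import Path


def seqrepo_snapshots(root: str = "/usr/local/share/seqrepo") -> list:
    """Return sorted sub-directory names under the SeqRepo root, or [] if it is missing."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p.name for p in base.iterdir() if p.is_dir())


print(seqrepo_snapshots())
```

An empty list suggests that SeqRepo has not been installed at that location, in which case the pytest run is likely to fail.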

Launch

You can then launch the docker containers and run them using

$ docker-compose up

Once installed and running, it is possible to run just the container containing VariantValidator, either to run the validator script

$ docker-compose run vv variant_validator.py

Example

# Note: The variant description must be contained in '' or "". See MANUAL.md for more examples
$ docker-compose run vv variant_validator.py -v 'NC_000017.11:g.50198002C>A' -g GRCh38 -t mane -s individual -f json -m -o stdout

Example 2 - use Python to collect output

import subprocess

# With shell=True the command is passed as a single string rather than a list
cmd = "docker-compose run vv variant_validator.py -v 'NC_000017.11:g.50198002C>A' -g GRCh38 -t mane -s individual -f json -m -o stdout"
validation = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE, shell=True)
print(validation.stdout.decode("utf-8"))
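
When the variant description comes from a variable, quoting it by hand is error-prone because characters such as > are interpreted by the shell. The following sketch builds the same command as an argument list and lets the standard shlex module add the quotes. build_command is a hypothetical helper; the options are copied from the example above.

```python
import shlex


def build_command(variant: str, genome_build: str = "GRCh38") -> list:
    """Assemble the docker-compose call from the example as an argument list."""
    return [
        "docker-compose", "run", "vv", "variant_validator.py",
        "-v", variant, "-g", genome_build,
        "-t", "mane", "-s", "individual", "-f", "json", "-m", "-o", "stdout",
    ]


cmd = build_command("NC_000017.11:g.50198002C>A")
# shlex.join quotes any argument containing shell-special characters such as '>'
print(shlex.join(cmd))
```

The joined string can be passed to subprocess.run with shell=True; shlex.join requires Python 3.8 or later.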

to run Python

$ docker-compose run vv python

or go into the container via bash

$ docker-compose run vv bash

Note that each time one of these commands is run, a new container is created. For more information on how to use docker-compose, see their documentation.

Accessing the VariantValidator databases externally

It is possible to access both the UTA and Validator databases from outside Docker, as the containers expose the default PostgreSQL and MySQL ports (5432 and 3306 respectively). It is also possible to access the SeqRepo database from outside Docker by editing your config file to point at the shared directory.

~/variantvalidator_data/share/seqrepo
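
A quick way to confirm that the database ports are reachable from your computer is to open a TCP connection to each. This is a minimal sketch with the standard socket module; can_connect is a hypothetical helper, and it assumes the containers are running on localhost with the default ports named above (adjust the numbers if you remapped them).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Default ports named above: PostgreSQL (UTA) and MySQL (Validator database)
for name, port in (("PostgreSQL", 5432), ("MySQL", 3306)):
    state = "reachable" if can_connect("localhost", port) else "not reachable"
    print(name, port, state)
```

This only shows that something is listening on the port; it does not check database credentials.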

Accessing VariantValidator directly through bash and reconfiguring a container post build

The container hosts a full install of VariantValidator.

To start this version, use the command

$ docker-compose run vv bash

When you are finished, exit the container

$ exit

What you can do in bash mode

  1. VariantValidator can be run on the command line from within the container

    • Instructions can be found in the VariantValidator manual under sections Database updates and Operation
  2. Start the REST services in development mode, bound to port 5000

    • For example, this is useful if you want to develop new methods and test them
    • Note: Under the terms and conditions of our license, changes to the code and improvements must be made available to the community so that we can integrate them for the good of all our users
    • See instructions on VariantValidator development in Docker

Developing VariantValidator in Docker

The container has been configured with git installed. This means that you can clone repos directly into the container.

To develop VariantValidator in the container

Start the container

$ docker-compose run vv bash

ON YOUR COMPUTER change into the shared directory (available inside the container as /usr/local/share)

$ cd ~/variantvalidator_data/share

Then create a directory for development

$ mkdir repos
$ cd repos

Clone the VariantValidator Repo

$ git clone https://github.com/openvar/variantValidator.git
$ cd variantValidator

Checkout the develop branch

$ git checkout develop
$ git pull

Create a new branch for your developments

$ git branch name_of_branch
$ git checkout name_of_branch

IN THE CONTAINER, pip install the code so it can be run by the container

$ cd /usr/local/share/repos/variantValidator
$ pip install -e .

You can then use the container's Python interpreter to run queries, e.g.

import json
import VariantValidator
vval = VariantValidator.Validator()
variant = 'NM_000088.3:c.589G>T'
genome_build = 'GRCh38'
select_transcripts = 'all'
validate = vval.validate(variant, genome_build, select_transcripts)
validation = validate.format_as_dict(with_meta=True)
print(json.dumps(validation, sort_keys=True, indent=4, separators=(',', ': ')))

Removing and re-building

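# Stop and remove the VariantValidator containers, then prune all unused Docker
# images, containers, networks and volumes on this machine (not only VariantValidator's)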
$ docker-compose down
$ docker system prune -a --volumes

Once you have deleted the containers, return to the Build the container section and repeat the steps

Alternatively, you may wish to try and force the containers to re-build without deleting

# Force re-build
$ docker-compose down
$ docker-compose up --force-recreate

If you choose this option, make sure you see the container restvv being re-created and all Python packages being reinstalled in the printed logs; otherwise the container may not actually be rebuilt and the modules it contains may not update.