Suggestion to improve simplicity of the docker installation #575
Hi, as the readme describes, the steps have to be run in a very specific order. The config importer has to run first. The web app cannot be running while the indexer is running, because they share the same embedded database, and the indexer has to run before the web app. AFAIK this cannot be accomplished with Docker Compose. If you have any suggestions to improve this, please open a PR.
This should be possible using the docker-compose `depends_on` field. For example, in one of our projects a similar situation is handled by generating the data first and only starting a Virtuoso SPARQL endpoint (with an import script) once the data-generating container has finished. The data is also converted into SQL statements, and only when that is finished is the database container started, so that it can import them. A simplified excerpt from a similar setup:
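The excerpt itself did not survive here, but the pattern described might look roughly like this (a hedged sketch; the service and image names are assumptions, not the project's actual setup):

```yaml
# Hypothetical sketch of "run a one-shot generator, then start the endpoint".
services:
  generate-data:
    image: data-generator:latest        # hypothetical one-shot container
  virtuoso:
    image: openlink/virtuoso-opensource-7
    depends_on:
      generate-data:
        # Only start once the generator container has exited with code 0.
        # Requires a Compose version that supports this condition.
        condition: service_completed_successfully
```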
The full docker-compose.yml can be seen at https://github.com/hitontology/docker/blob/master/docker-compose.yml. If I figure it out for OLS, I will create a pull request. P.S.: Docker Compose also supports health checks, but that didn't work out for us; unfortunately I don't remember why.
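For reference, the health-check alternative mentioned above usually looks something like the following (a hedged sketch with assumed service and image names, not from the linked setup):

```yaml
# Hypothetical sketch: gate a service on another service's healthcheck.
services:
  db:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10
  app:
    image: myapp:latest                 # hypothetical application image
    depends_on:
      db:
        condition: service_healthy      # wait until the healthcheck passes
```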
Interesting, thanks. Will the data generation run every time with `depends_on`? The indexer is very slow; at EBI it regularly takes ~48 hours.
Also, for the same reason, in larger setups (including ours) it is necessary to run the indexer elsewhere and copy the data in. This is another reason we keep it as a separate, distinct step.
We perform the data generation in the build step, so the data is baked into the resulting image and there is no custom entrypoint at all; it only runs once, at build time. That could probably be improved further with a two-stage build whose second stage is `FROM scratch`, to save an additional 5 MB or so.
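The referenced Dockerfile was not included here, but such a build-time generation step with a two-stage image might look roughly like this (a hypothetical sketch; the generator script and paths are assumptions):

```dockerfile
# Hypothetical: run the data generation once, at image build time.
FROM python:3-slim AS generator
COPY generate.py /app/generate.py
# generate.py is a placeholder for whatever produces the data.
RUN python /app/generate.py > /app/data.sql

# Second stage keeps only the generated data, discarding the toolchain.
FROM scratch
COPY --from=generator /app/data.sql /data/data.sql
```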
After reading the OLS readme, I think that example is not applicable here, because the source data for the indexer is not known at build time.
The SQL files are stored in the "SQL" volume and are later picked up by the database container. P.S.: I'm not a Docker expert by any means, so I can't guarantee that this is the best way to implement that kind of dependency, but in my experience it makes development, testing, and deployment much easier when the setup is built so that everything can be thrown away and rebuilt at any time.
Thanks Konrad,
Kind of both! Solr is loaded into a live instance, but Neo4j is embedded (a bit like SQLite) by both the indexer and ols-web, so the indexer has to be shut down before ols-web can operate on the same files. I do think there will be a way to simplify all of this, but we're working with code designed for very large amounts of data at EBI, and the requirements are not always the same for smaller-scale deployments. (Until recently we had very little support for deployments of OLS outside of EBI.) We are planning a new version of OLS in which we aim to reduce much of this complexity. I will tag this issue with the OLS 4 milestone and we can have a rethink then.
I got it to work locally! The following shows the page on http://localhost:8080, where you can browse and search the ontologies. For example, entering "diabetes" shows milk thistle supplement as the first hit.

```yaml
version: '2'
services:
  solr:
    image: ebispot/ols-solr:latest
    environment:
      - SOLR_HOME=/mnt/solr-config
    ports:
      - 8983:8983
    volumes:
      - ols-solr-data:/var/solr
      - ./ols-solr/src/main/solr-5-config:/mnt/solr-config
    network_mode: "host"
    command: ["-Dsolr.solr.home=/mnt/solr-config", "-Dsolr.data.dir=/var/solr", "-f"]
  mongo:
    image: mongo:3.2.9
    ports:
      - 27017:27017
    volumes:
      - ols-mongo-data:/data/db
    network_mode: "host"
    command:
      - mongod
  ols-config-importer:
    #image: ebispot/ols-config-importer:stable
    build:
      context: .
      dockerfile: ./ols-apps/ols-config-importer/Dockerfile
    volumes:
      - ./config:/config
    network_mode: "host"
    depends_on: ["mongo"]
    restart: on-failure:2
  ols-indexer:
    build:
      context: .
      dockerfile: ./ols-apps/ols-indexer/Dockerfile
    volumes:
      - ols-neo4j-data:/mnt/neo4j
      - ols-downloads:/mnt/downloads
    network_mode: "host"
    depends_on:
      ols-config-importer:
        condition: service_completed_successfully
  ols-web:
    build:
      context: .
      dockerfile: ols-web/Dockerfile
    network_mode: "host"
    depends_on:
      ols-indexer:
        condition: service_completed_successfully
    links:
      - solr
      - mongo
    environment:
      # - spring.data.solr.host=http://solr:8983/solr
      - spring.data.solr.host=http://localhost:8983/solr
      - spring.data.mongodb.host=localhost
      - ols.customisation.logo=${LOGO}
      - ols.customisation.title=${TITLE}
      - ols.customisation.short-title=${SHORT_TITLE}
      - ols.customisation.web=${WEB}
      - ols.customisation.twitter=${TWITTER}
      - ols.customisation.org=${ORG}
      - ols.customisation.backgroundImage=${BACKGROUND_IMAGE}
      - ols.customisation.backgroundColor=${BACKGROUND_COLOR}
      - ols.customisation.issuesPage=${ISSUES_PAGE}
      - ols.customisation.supportMail=${SUPPORT_MAIL}
      - OLS_HOME=/mnt/
    volumes:
      - ols-neo4j-data:/mnt/neo4j
      - ols-downloads:/mnt/downloads
    ports:
      - 8080:8080
volumes:
  ols-solr-data:
  ols-mongo-data:
  ols-neo4j-data:
  ols-downloads:
```
The next step would be to get it to work without host mode, probably by changing the source code of the ols-config-importer. |
That's exciting! AFAIK ols-config-importer is not hardcoded to localhost; you should be able to set `spring.data.mongodb.host`.
That worked! Without host mode:

```yaml
version: '2'
services:
  solr:
    image: ebispot/ols-solr:latest
    environment:
      - SOLR_HOME=/mnt/solr-config
    ports:
      - 8983:8983
    volumes:
      - ols-solr-data:/var/solr
      - ./ols-solr/src/main/solr-5-config:/mnt/solr-config
    command: ["-Dsolr.solr.home=/mnt/solr-config", "-Dsolr.data.dir=/var/solr", "-f"]
  mongo:
    image: mongo:3.2.9
    ports:
      - 27017:27017
    volumes:
      - ols-mongo-data:/data/db
    command:
      - mongod
  ols-config-importer:
    #image: ebispot/ols-config-importer:stable
    build:
      context: .
      dockerfile: ./ols-apps/ols-config-importer/Dockerfile
    environment:
      - spring.data.mongodb.host=mongo
    volumes:
      - ./config:/config
    depends_on: ["mongo"]
    restart: on-failure:2
  ols-indexer:
    build:
      context: .
      dockerfile: ./ols-apps/ols-indexer/Dockerfile
    environment:
      - spring.data.solr.host=http://solr:8983/solr
      - spring.data.mongodb.host=mongo
    volumes:
      - ols-neo4j-data:/mnt/neo4j
      - ols-downloads:/mnt/downloads
    depends_on:
      ols-config-importer:
        condition: service_completed_successfully
  ols-web:
    build:
      context: .
      dockerfile: ols-web/Dockerfile
    depends_on:
      ols-indexer:
        condition: service_completed_successfully
    links:
      - solr
      - mongo
    environment:
      - spring.data.solr.host=http://solr:8983/solr
      - spring.data.mongodb.host=mongo
      - ols.customisation.logo=${LOGO}
      - ols.customisation.title=${TITLE}
      - ols.customisation.short-title=${SHORT_TITLE}
      - ols.customisation.web=${WEB}
      - ols.customisation.twitter=${TWITTER}
      - ols.customisation.org=${ORG}
      - ols.customisation.backgroundImage=${BACKGROUND_IMAGE}
      - ols.customisation.backgroundColor=${BACKGROUND_COLOR}
      - ols.customisation.issuesPage=${ISSUES_PAGE}
      - ols.customisation.supportMail=${SUPPORT_MAIL}
      - OLS_HOME=/mnt/
    volumes:
      - ols-neo4j-data:/mnt/neo4j
      - ols-downloads:/mnt/downloads
    ports:
      - 8080:8080
volumes:
  ols-solr-data:
  ols-mongo-data:
  ols-neo4j-data:
  ols-downloads:
```
Usually when I check out a Docker Compose setup, `docker-compose up --build` integrates all the necessary steps, except perhaps setting some documented environment variables. However, in this repository there seems to be a process involving multiple steps with manual intervention.
While the steps aren't that complicated, and could probably be automated by wrapping them in another Dockerfile or writing another docker-compose file that executes them, I don't understand why that is necessary in the first place.
Isn't the entire purpose of Docker Compose to achieve a reliable, reproducible setup without manual intervention, except for configuration through environment variables or maybe a configuration file?
For example:
- Aren't the volumes created automatically? Sorry if I have some wrong assumptions here; I have not been using Docker for that long, but in the cases I have seen so far there was never a need to run `docker volume create`, because Docker automatically creates all volumes listed in docker-compose.yml.
- Why is it necessary to start the services before changing the configuration file? Can't this be done beforehand?
- Can't this be mounted inside docker-compose.yml?
- Why not put that in docker-compose instead?