Skip to content

ess-dive/docker-metacat

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

92 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Metacat

metacat

The official metacat docker image, made in a collaboration between NCEAS and LBNL

What is metacat?

from www.dataone.org/software-tools/metacat

Metacat is a flexible, open source metadata catalog and data repository that targets scientific data, particularly from ecology and environmental science. Metacat accepts XML as a common syntax for representing the large number of metadata content standards that are relevant to ecology and other sciences. Thus, Metacat is a generic XML database that allows storage, query, and retrieval of arbitrary XML documents without prior knowledge of the XML schema.

Metacat is designed and implemented as a Java servlet application that utilizes a relational database management system to store XML and associated meta-level information. Installation of Metacat recommends the use of Apache Tomcat for servlet management and PostgreSQL as the underlying RDBMS, although other configurations are possible. Metacat provides a rich client Application Programming Interface (API) and supports a variety of languages, including Java, Python, and Perl.

Metacat is being used extensively throughout the world to manage environmental data. It is a key infrastructure component for the NCEAS data catalog, the Knowledge Network for Biocomplexity (KNB) data catalog, and for the DataONE system, among others.

Usage

The Metacat docker image requires an existing database. This example will start and link a postgres database and Metacat member node. It is a very minimal example. Please refer to the Metacat administration documentation for more information on how to configure a Metacat member node.

docker network create metacat-network

DB_USER=metacat
DB_PASSWORD=metacat
docker run  \
   -p 5432:5432 \
   -e POSTGRES_PASSWORD=$DB_PASSWORD \
   -e POSTGRES_USER=$DB_USER \
   --network metacat-network -d \
   --name db  -it postgres:alpine 

How to use this image

Image Build Arguments

The following are the optional build arguments for this docker image.

METADAT_GID: (default: 121 - debian tomcat GID) the UID to run the Tomcat server as.

METACAT_UID: (default: 108 - debian tomcat UID) the UID to run the Tomcat server as.

docker build --build-arg METACAT_UID=<uid> 
    --build-arg METADAT_GID=<gid>

Image Environment Variables

The following environment variables are optional:

ADMIN: the administrator username. Please note that this must match the auth.administrator value in metacat.properties.

ADMINPASS: The password for the admin user

ADMINPASS_FILE: A file that contains the admin user password. This file must be mountied inside the container.

APP_PROPERTIES_FILE: (default /config/app.properties) The file path where the local application property files are mounted in the container.

METACAT_APP_CONTEXT: (default: metacat) The metacat application context that is presented in the url path. The url path is http://domain/METACAT_APP_CONTEXT.

METACAT_VERSION: (default:2.8.7) The version of metacat to download and install in the image

Build the docker metacat image:

VERSION=2.8.7
./build.sh $VERSION

Create a file named app.properties with the content below:

######## Configuration utility section  ################

configutil.propertiesConfigured=true
configutil.authConfigured=true
configutil.skinsConfigured=true
configutil.databaseConfigured=true
configutil.geoserverConfigured=bypassed
configutil.dataoneConfigured=bypassed
configutil.ezidConfigured=bypassed

############### Server Values #################

server.name=localhost
server.httpPort=8080
server.httpSSLPort=8443

############### Application Values ############

## one of the few places where we use ANT tokens
application.deployDir=/usr/local/tomcat/webapps
## This is autodiscovered and populated by the config utility
application.context=metacat

############### Database Values ###############

database.connectionURI=jdbc:postgresql://db/metacat
database.user=metacat
database.password=metacat
database.type=postgres
database.driver=org.postgresql.Driver
database.adapter=edu.ucsb.nceas.dbadapter.PostgresqlAdapter

######## Authentication  ##############################################
auth.administrators=metacat-admin@localhost


######## Replication properties  #########################################
replication.logdir=/var/metacat/logs

Run the docker container

docker run  \
       -v ${PWD}/app.properties:/config/app.properties   \
       -p 8080:8080     \
       -e ADMIN=metacat-admin@localhost   \
       -e ADMINPASS=metacat-admin    \
       --name mn  -d \
       --network=metacat-network  \
       -it metacat:$VERSION 

The metacat REST API should be able to be accessed at http://localhost:8080/metacat/d1/mn

Patch server.xml

Get the conf files from the latest image

docker pull tomcat:9.0-jdk8

If the image updated, then determine if the conf files changed

docker run --entrypoint "/bin/cat" tomcat:9.0-jdk8  conf/server.xml > image-server.xml
git diff image-server.xml

If modifying the conf file, update server.xml Then create the patch file.

diff  -u image-server.xml server.xml > server.xml.patch 

Check the patch to see if the changes make sense

git diff server.xml.patch

Patch Metacat eml-dataset.xsl

  • Get the eml-dataset.xsl from the latest metacat tar ball

diff -u style/skins/metacatui/eml-2/eml-dataset.xsl
metacat/skins/metacatui/eml-2/eml-dataset.xsl > metacat/eml-dataset.xsl.patch

Patch Metacat Solr schema.xsl and solrconfig.xml

  • Edit the schema.xml and solrconfig.xml from the latest metacat tar ball

diff -u WEB-INF/classes/solr-home/conf/schema.xml
solr/WEB-INF/classes/solr-home/conf/schema.xml > solr/schema.xml.patch diff -u WEB-INF/classes/solr-home/conf/solrconfig.xml
solr/WEB-INF/classes/solr-home/conf/solrconfig.xml > solr/schema.xml.patch

License

TBD

Supported Docker versions

This image is officially supported on Docker version 17.09.0.

People

Current Project Team Members: