- [Versions](#versions)
- [Scripts](#scripts)
- [Image Creation](#image-creation)
- [Container Creation](#container-creation)
- [Executing Arbitrary Commands](#executing-arbitrary-commands)
- [Data Persistence](#data-persistence)
- [Passwords](#passwords)
- [Executing psql Scripts on Start Up](#executing-psql-scripts-on-start-up)
- [User Mapping](#user-mapping)
- [Backing Up Databases](#backing-up-databases)
- [Restoring a Database Dump](#restoring-a-database-dump)
- [Configuring the Data Store](#configuring-the-data-store)
- [Killing the Container](#killing-the-container)
This Dockerfile compiles the following software:
- PostgreSQL 9.6.0;
- GEOS 3.5.0;
- Proj 4.9.3: patched with the Spanish national grid for conversion between ED50 and ETRS89;
- GDAL 2.1.1: also patched;
- PostGIS 2.3.0: patched as well.
There is a script in this repo to help work with this image: psql-docker opens a psql console in a standalone container to connect to other databases. To check how it works:
psql-docker -h
Build the image directly from Git (this can take a long time):
./build.sh
or pull it from Docker Hub:
docker pull geographica/postgis:breezy_badger
The image exposes port 5432, a volume designated by the environment variable POSTGRES_DATA_FOLDER for the data folder, and another one, designated by POSTGRES_OUTPUT_FOLDER, for database output (like backups).
There are several options available to create containers. Check Usage_Cases for testing. The simplest one:
docker run -d -P --name pgcontainer \
geographica/postgis:breezy_badger
This will create a container with two default volumes, /data and /output, holding the data store and the database output, respectively. The default encoding will be UTF-8, and the locale en_US. No additional modification or action is taken.
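The two exposed volumes can also be hard-mounted on host folders, for example (a sketch; the host paths are just placeholders):
docker run -d -P --name pgcontainer \
-v /host/pg_data/:/data/ \
-v /host/pg_output/:/output/ \
geographica/postgis:breezy_badger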
Containers can be configured by setting environment variables:
- POSTGRES_PASSWD: sets the password for user postgres. See [Passwords](#passwords) for more details. Defaults to postgres;
- ENCODING: encoding used to create the data store and the default database, if applicable. Defaults to UTF-8;
- LOCALE: locale for the data store and the default database, if any. Defaults to en_US;
- PSQL_SCRIPTS: semicolon-separated psql scripts to be executed on the data store once created, given as absolute paths. Defaults to null, meaning no action is taken. See [Executing psql Scripts on Start Up](#executing-psql-scripts-on-start-up) for more details;
- CREATE_USER: creates a user and a default database owned by it at startup. The structure of this parameter is USERNAME;PASSWORD, and it defaults to null;null, in which case no user and database will be created (very bad luck if you want your user and database to be called 'null' :| ). This user and database are created before any psql script is executed or any backup is restored;
- BACKUP_DB: semicolon-separated names of databases to back up by default. Defaults to null, which means no database will be backed up by default, or to CREATE_USER if one is given, so the default database will be backed up automatically. See [Backing Up Databases](#backing-up-databases) for details;
- PG_RESTORE: semicolon-separated names of database dumps to be restored. See [Restoring a Database Dump](#restoring-a-database-dump) for details. Defaults to null, meaning that no action is taken. Restores are done after all psql scripts are executed;
- UID_FOLDER: the folder in the container whose user and group ID must be matched by the postgres user. Defaults to null, meaning that the system will try to set the ID. Check [User Mapping](#user-mapping) for details. Please note that Docker for Mac on macOS Sierra handles mapping to the user running Docker nicely, so this is not necessary there;
- CUID / CGID: custom UID and GID for the postgres user. Check [User Mapping](#user-mapping) for details;
- PG_HBA: configuration of the pg_hba.conf access file. See [Configuring the Data Store](#configuring-the-data-store) for details;
- PG_CONF: configuration of postgresql.conf. See [Configuring the Data Store](#configuring-the-data-store) for details.
Some examples of container initializations:
export PGPASSWD="md5"$(printf '%s' "new_password_here" "postgres" | md5sum | cut -d ' ' -f 1) && \
docker run -d -P --name ageworkshoptestpg -e "POSTGRES_PASSWD=${PGPASSWD}" \
geographica/postgis:breezy_badger
This run command will create a container with default options, but changes the postgres password to new_password_here, sending it already encrypted to the container. Check [Passwords](#passwords) for details. Another example:
docker run -d -P --name ageworkshoptestpg -e "LOCALE=es_ES" -e "CREATE_USER=project" \
-e "CREATE_USER_PASSWD=project_pass" \
geographica/postgis:breezy_badger
This will create the container with a Spanish locale, and will create at startup a user and a database called project, with project_pass as the password for the project user. Additionally, since CREATE_USER is set, the project database will be automatically backed up.
docker run -d -P --name ageworkshoptestpg -v /home/demo_scripts/:/init_scripts/ \
-e "LOCALE=es_ES" -e "CREATE_USER=project" \
-e "CREATE_USER_PASSWD=project_pass" -e "BACKUP_DB=project" \
-e "PSQL_SCRIPTS=/init_scripts/Schema00_DDL.sql;/init_scripts/Schema01_DDL.sql" \
geographica/postgis:breezy_badger
This one creates a container with a volume hard-mounted from the local demo_scripts folder to the container's /init_scripts, where a couple of psql scripts are stored. It creates a user and a database called project and executes the two mentioned scripts on it.
Please check folder Usage_Cases for a set of usage cases bundled as a test suite of sorts.
The image can run arbitrary commands. This is useful, for example, for creating a temporary container just to dump a database, running a psql session with the client bundled in this image, or executing scripts against another container.
Some examples:
# Interactive pg_dump, will ask for password
docker run --rm -ti -v /whatever/:/d --link the_container_running_the_database:pg \
geographica/postgis:breezy_badger \
pg_dump -b -E UTF8 -f /d/dump -F c -v -Z 9 -h pg -p 5432 -U postgres project
# Full automatic pg_dump, with password as ENV variable
docker run --rm -v /home/malkab/Desktop/:/d --link test_07:pg \
geographica/postgis:breezy_badger \
PGPASSWORD="new_password_here" pg_dump -b -E UTF8 -f /d/dump33 -F c \
-v -Z 9 -h pg -p 5432 -U postgres postgres
# Interactive psql
docker run --rm -ti -v /home/malkab/Desktop/:/d --link test_07:pg \
geographica/postgis:breezy_badger \
PGPASSWORD="new_password_here" psql -h pg -p 5432 -U postgres postgres
Data store contents can be persisted in a data volume or a host-mounted folder and be reused later by another container. The container checks whether POSTGRES_DATA_FOLDER contains a postgresql.conf file. If not, it considers the data store not yet created and initializes an empty one.
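For example, a sketch of this pattern with a host-mounted data folder (the host path is a placeholder), where a second container reuses the data store created by the first:
docker run -d -P --name firstpg -v /host/pg_data/:/data/ geographica/postgis:breezy_badger
# ... work with the database, then shut it down tidily ...
docker stop firstpg && docker rm firstpg
# a new container detects the existing postgresql.conf and reuses the data store
docker run -d -P --name secondpg -v /host/pg_data/:/data/ geographica/postgis:breezy_badger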
Passwords sent to the container with the environment variables POSTGRES_PASSWD and CREATE_USER_PASSWD can be passed either in plain text or already encrypted à la PostgreSQL. Passing them in plain text means that anybody with access to the docker inspect command on the server will be able to read them. Encrypting them beforehand means that docker inspect will only show the encrypted password, adding an additional layer of secrecy.
PostgreSQL passwords are encrypted using the MD5 checksum algorithm on the following literal:
md5 + md5hash(real password + username)
For example, in the case of user myself and password secret, the encrypted password will be the MD5 sum of secretmyself prefixed with md5, in this case, md5a296d28d6121e7307ac8e72635ae206b.
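This can be checked quickly on the command line with the same tools used below (using the myself/secret example; the output should be the hash quoted above, without the md5 prefix):
printf '%s' "secret" "myself" | md5sum | cut -d ' ' -f 1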
To provide encrypted password to containers, use the following command:
export USER="projectuser" && \
export USERPASSWD="md5"$(printf '%s' "userpass" ${USER} | md5sum | cut -d ' ' -f 1) && \
export PGPASSWD="md5"$(printf '%s' "password_here" "postgres" | md5sum | cut -d ' ' -f 1) && \
docker run -d -P --name ageworkshoptestpg -e "POSTGRES_PASSWD=${PGPASSWD}" \
-e "CREATE_USER=${USER}" -e "CREATE_USER_PASSWD=${USERPASSWD}" \
geographica/postgis:breezy_badger
Ugly, but effective. Keep in mind, however, that if you use provisioning methods like bash scripts or Docker Compose, others will still be able to read passwords from these sources, so keep them safe.
The image can run psql scripts on container start-up. To do so, put the scripts inside the container (via a child image that ADDs them in its Dockerfile, or by mounting a volume) and configure the PSQL_SCRIPTS environment variable. This variable must contain full in-container paths to psql scripts, separated by semicolons (;), which will be executed in order on container start-up. For example:
export PGPASSWD="md5"$(printf '%s' "password_here" "postgres" | md5sum | cut -d ' ' -f 1) && \
docker run -d -P --name ageworkshoptestpg -e "POSTGRES_PASSWD=${PGPASSWD}" \
-v /localscripts/:/psql_scripts/ \
-e "PSQL_SCRIPTS=/psql_scripts/script1.sql;/psql_scripts/script2.sql" \
geographica/postgis:breezy_badger
script1.sql and script2.sql will be executed on container start-up. Scripts are executed as the postgres user.
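Further scripts can also be run later against a live container with the psql bundled in the image; a sketch, assuming the default pg_hba.conf trust rules for local connections and a hypothetical /psql_scripts/script3.sql placed in the mounted volume:
docker exec -ti ageworkshoptestpg psql -h localhost -p 5432 -U postgres -f /psql_scripts/script3.sql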
Please note that Docker for Mac on macOS Sierra handles mapping to the user running Docker nicely, so none of this is necessary there.
The container will create an internal postgres user and group for running the service. Their UID and GID can be adjusted to match ones on the host, so files in mounted volumes will be owned by the matched host user. The logic behind user mapping is as follows (see the example after this list):
- if the env variables CUID and CGID are set, the postgres user will be created with these IDs;
- if the env variable UID_FOLDER is set, the IDs will be taken from the given folder;
- if the output exposed volume is mounted on a host folder (as when using the -v option), the UID and GID of the owner of the host folder will be read and the container's postgres user and group will match them;
- the same applies if the data volume is mounted on a host folder;
- if none of the above happens, the user will be created with an ID assigned by the system.
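For example, to pin the container's postgres user to the IDs of the host user launching the container (a sketch; the mounted host path is a placeholder):
docker run -d -P --name pgmapped \
-e "CUID=$(id -u)" -e "CGID=$(id -g)" \
-v /host/pg_data/:/data/ \
geographica/postgis:breezy_badger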
This image provides a simple method to back up databases with pg_dump. The databases to be backed up are controlled by the BACKUP_DB environment variable, with database names separated by semicolons.
To back up databases, a docker exec is needed:
docker exec -ti containername make_backups
This command accepts database names as arguments, which override any BACKUP_DB value:
docker exec -ti containername make_backups database_a database_b
Backups are stored at POSTGRES_OUTPUT_FOLDER, which is an exposed volume. Usage patterns include hard-mounting the volume on the host (somewhat dirty) or, better, linking it to an SFTP or data container for remote retrieval. Backups are time-stamped and the backup file name has the following format:
[container hash]-[ISO time stamp]-[database name].backup
The command used for backups is:
pg_dump -b -C -E [encoding] -f [backup file name] -F c -v -Z 9 -h localhost -p 5432 -U postgres [database name]
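If the output volume is not host-mounted, backups can also be pulled from a running container with docker cp (a sketch, assuming the default /output location for POSTGRES_OUTPUT_FOLDER):
docker cp containername:/output/ ./backups/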
The image allows for restoration of database dumps created by pg_dump. The PG_RESTORE environment variable is used for this. It is a semicolon-separated list of parameter sets for pg_restore:
-e "PG_RESTORE=-C -F c -v -d postgres -U postgres /path/post1.backup;-d databasename -F c -v -U postgres /path/post2.backup;-C -F c -O -v -d postgres -U postgres post3.backup"
As a base, pg_restore is launched with the prefix:
pg_restore -h localhost -p 5432
Please refer to the official pg_restore and pg_dump documentation for more details. The host is always localhost and the port is always 5432, so there is no need to declare them.
Restores are performed after executing any script passed to the container with the PSQL_SCRIPTS variable. If any role must be present at restoration time, create it with a psql script beforehand.
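A sketch of a container that restores a single dump mounted from the host (paths and names are placeholders):
docker run -d -P --name restoredpg \
-v /host/dumps/:/dumps/ \
-e "PG_RESTORE=-C -F c -v -d postgres -U postgres /dumps/project.backup" \
geographica/postgis:breezy_badger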
The image allows for configuration of the pg_hba.conf and postgresql.conf data store files, both at creation time and later. This is advanced stuff; refer to the PostgreSQL documentation for details.
pg_hba.conf configuration is handled by a script called pg_hba_conf. pg_hba_conf has three modes of operation:
[1] pg_hba_conf l
[2] pg_hba_conf a "line 1#line 2#...#line n"
[3] pg_hba_conf d "line 1#line 2#...#line n"
which means:
- [1] prints the current contents of pg_hba.conf;
- [2] adds lines to pg_hba.conf;
- [3] deletes lines from pg_hba.conf.
These commands can be issued via a standard docker exec:
docker exec -ti whatevercontainer pg_hba_conf a \
"host all all 23.123.22.1/32 trust#host all all 93.32.12.3/32 md5"
but at startup it is controlled by an environment variable, PG_HBA, which defaults to:
ENV PG_HBA "local all all trust#host all all 127.0.0.1/32 trust#host all all 0.0.0.0/0 md5#host all all ::1/128 trust"
These defaults should be suitable for basic operation. For universal access, for example for testing, use:
local all all trust#host all all 0.0.0.0/0 trust#host all all 127.0.0.1/32 trust#host all all ::1/128 trust
Modify this variable to configure access at creation time. Keep in mind, however, that any value provided to this variable supersedes the default, so don't forget to include basic access permissions when modifying it, or the server may end up barely reachable. For testing purposes, direct commands can be issued via exec. Check Usage_Cases for examples.
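For example, a container created with a custom PG_HBA that keeps the basic local rules and adds md5 access for a single network (a sketch; the network is just illustrative):
docker run -d -P --name customhba \
-e "PG_HBA=local all all trust#host all all 127.0.0.1/32 trust#host all all ::1/128 trust#host all all 10.0.0.0/8 md5" \
geographica/postgis:breezy_badger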
Configuration of postgresql.conf follows an identical procedure. The command is postgresql_conf and has the same syntax as pg_hba_conf. The environment variable is PG_CONF, which defaults to the following configuration:
max_connections=100#listen_addresses='*'#shared_buffers=128MB#dynamic_shared_memory_type=posix#log_timezone='UTC'#datestyle='iso, mdy'#timezone='UTC'#lc_messages='en_US.UTF-8'#lc_monetary='en_US.UTF-8'#lc_numeric='en_US.UTF-8'#lc_time='en_US.UTF-8'#log_statement='all'#log_directory='pg_log'#log_filename='postgresql-%Y-%m-%d_%H%M%S.log'#logging_collector=on#client_min_messages=notice#log_min_messages=notice#log_line_prefix='%a %u %d %r %h %m %i %e'#log_destination='stderr,csvlog'#log_rotation_size=500MB
At creation time, language, encoding, and locale info is added based on env variables LOCALE and ENCODING.
Logs are stored at $POSTGRES_DATA_FOLDER/pg_log.
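As with pg_hba_conf, the current postgresql.conf can be inspected or tweaked on a running container via exec; a sketch (the max_connections change is just illustrative, and assumes the deleted line matches the current one exactly):
docker exec -ti whatevercontainer postgresql_conf l
docker exec -ti whatevercontainer postgresql_conf d "max_connections=100"
docker exec -ti whatevercontainer postgresql_conf a "max_connections=200"
# a server restart is needed for max_connections to take effect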
This container handles signals sent to it with docker kill properly, so the database is shut down tidily. Thus:
- SIGTERM signals a smart shutdown, waiting for all connections and transactions to finish. The server won't allow new connections, though:
pg_ctl -D . stop -m smart
docker kill -s SIGTERM containername
- SIGINT signals a fast shutdown. The server will abort current transactions and disconnect clients, but will exit cleanly otherwise:
pg_ctl -D . stop -m fast
docker kill -s SIGINT containername
- SIGQUIT signals an immediate shutdown. This will leave the database in an improper state and lead to recovery on the next start-up:
pg_ctl -D . stop -m immediate
docker kill -s SIGQUIT containername