Virtual Machine Provisioning

John Paden edited this page Jun 2, 2016 · 11 revisions

Updated 05/27/2014 for conf/provisions.sh VERSION

Static Input Parameters (Users can modify with caution)

preProv installs basic Linux libraries that are not included in the CReSIS CentOS template

newDb installs and configures a new database system

serverName used to set the server name and other properties throughout the system

serverAdmin default is "root" but this could be set to an email

appName controls the name of the application throughout (should probably stay ops)

dbName sets the name of the database

installPgData if set, the bulk loader (which requires data packs) is used. See more about this process here

snfsBasePath the base path to the system on snfs1 (e.g. "/cresis/snfs1/web/ops2/")

webDataDir controls the path for output data from Django (CSV,KML,etc ...)

Getting User Input

read -s -p "Database User (default=admin): " dbUser && printf "\n"
read -s -p "Database Password (default=pubAdmin): " dbPswd && printf "\n"
echo -e $dbPswd > /etc/db_pswd.txt

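This section prompts the user for the database username and password, which should be changed from the defaults when provisioning a production system. The defaults are shown in the prompts but are not applied automatically, so they must still be typed in. The password is also written to /etc/db_pswd.txt.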
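Because the defaults must be typed in by hand, one possible refinement is to fall back to the default when the reply is empty. The following is a minimal sketch, not part of provisions.sh: the helper name prompt_default is hypothetical, and the replies are piped in so the example runs without a terminal.

```shell
# Hypothetical helper: print a prompt on stderr, read one line from stdin,
# and print the default (second argument) when the reply is empty.
prompt_default() {
    printf '%s' "$1" >&2
    IFS= read -r reply || reply=""
    printf '%s\n' "${reply:-$2}"
}

# Non-interactive demonstration; in a real run the reply comes from the terminal.
dbUser=$(printf '\n' | prompt_default "Database User (default=admin): " "admin")
dbPswd=$(printf 'example\n' | prompt_default "Database Password (default=pubAdmin): " "pubAdmin")
echo "user=$dbUser"
```

The `${reply:-$2}` expansion substitutes the default only when the reply is unset or empty, so any typed value still takes precedence.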

Pre-Provisioning

If preProv is set to 1 this section will be run.

if [ $preProv -eq 1 ]; then

	cd ~ && cp /vagrant/conf/software/epel-release-6-8.noarch.rpm ./
	rpm -Uvh epel-release-6*.rpm
	rm -f epel-release-6-8.noarch.rpm
	yum update -y
	yum groupinstall -y "Development Tools"
	yum install -y gzip gcc unzip rsync wget git
	iptables -F
	iptables -A INPUT -p tcp --dport 22 -j ACCEPT #SSH ON TCP 22
	iptables -A INPUT -p tcp --dport 80 -j ACCEPT #HTTP ON TCP 80
	iptables -A INPUT -p tcp --dport 443 -j ACCEPT #HTTPS ON TCP 443
	iptables -P INPUT DROP
	iptables -P FORWARD DROP
	iptables -P OUTPUT ACCEPT
	iptables -A INPUT -i lo -j ACCEPT
	iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
	/sbin/service iptables save
	/sbin/service iptables restart

fi

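The first three lines install the EPEL (Extra Packages for Enterprise Linux) repository. The system is then updated with yum update -y and basic build and system tools are installed. Finally, iptables is configured to accept incoming connections on ports 22 (SSH), 80 (HTTP), and 443 (HTTPS), along with loopback and established traffic, and to drop all other incoming traffic.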
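To review the firewall rules before applying them, the same rule set can be generated for an arbitrary list of ports. This is a dry-run sketch under that assumption: the function name firewall_rules is hypothetical, and it only prints the commands rather than running iptables.

```shell
# Dry-run sketch: print (do not execute) the iptables commands for a list of TCP ports.
firewall_rules() {
    echo "iptables -F"
    for port in "$@"; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
    echo "iptables -A INPUT -i lo -j ACCEPT"
    echo "iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT"
    echo "iptables -P INPUT DROP"
}

rules=$(firewall_rules 22 80 443)
printf '%s\n' "$rules"
```

Printing the ACCEPT rules ahead of the DROP policy mirrors the order used in the script, where the SSH rule is added before the default policy is tightened.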

Geoserver Data Directory

geoServerStr="GEOSERVER_DATA_DIR="$snfsBasePath"geoserver"
echo $geoServerStr >> ~/.bashrc
. ~/.bashrc

This sets the GEOSERVER_DATA_DIR variable, which the Apache Tomcat application server and the GeoServer web application use to locate the GeoServer data directory.
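Appending to ~/.bashrc on every run adds a duplicate line each time the script is re-run. A minimal sketch of an idempotent variant follows. It assumes the example snfsBasePath value from the parameter list, uses a temporary file as a stand-in for ~/.bashrc, and adds export so that child processes see the variable (the original line does not export it).

```shell
snfsBasePath="/cresis/snfs1/web/ops2/"   # example value from the parameter list
rcFile=$(mktemp)                         # stand-in for ~/.bashrc
line="export GEOSERVER_DATA_DIR=${snfsBasePath}geoserver"

# Append the line only when it is not already present (exact, whole-line match).
grep -qxF "$line" "$rcFile" || echo "$line" >> "$rcFile"
grep -qxF "$line" "$rcFile" || echo "$line" >> "$rcFile"   # second run is a no-op

. "$rcFile"
echo "$GEOSERVER_DATA_DIR"
```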

System Update and PGDG Repo

# UPDATE SYSTEM
yum update -y

# INSTALL THE PGDG REPO
cd ~ && cp -f /vagrant/conf/software/pgdg-centos93-9.3-1.noarch.rpm ./
rpm -Uvh pgdg-centos93-9.3-1.noarch.rpm
rm -f pgdg-centos93-9.3-1.noarch.rpm

This updates the system and installs the PGDG repository, which is used to install the most recent PostgreSQL/PostGIS packages and their dependencies using yum.

Python 2.7 and VirtualEnv

# INSTALL DEPENDENCIES
yum install -y python-pip zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel
python-pip install --upgrade nose

# INSTALL PYTHON 2.7.6
cd ~ && cp -f /vagrant/conf/software/Python-2.7.6.tar.xz ./
tar xf Python-2.7.6.tar.xz
cd Python-2.7.6
./configure --prefix=/usr --enable-shared LDFLAGS="-Wl,-rpath /usr/lib"
make && make altinstall
cd ~ && rm -rf Python-2.7.6 && rm -f Python-2.7.6.tar.xz

# INSTALL AND ACTIVATE VIRTUALENV
pip install virtualenv
virtualenv -p /usr/bin/python2.7 /usr/bin/venv
source /usr/bin/venv/bin/activate

This section installs Python 2.7.6 from source, creates a virtual environment (a self-contained Python binary and libraries, separate from the system-level Python), and then activates the virtualenv. All subsequent Python use and package installs within this script take place inside the virtual environment.

Apache Web Server

# INSTALL APACHE HTTPD
yum install -y httpd httpd-devel

# INSTALL MOD_WSGI (COMPILE WITH Python27)
cd ~ && cp -f /vagrant/conf/software/mod_wsgi-3.4.tar.gz ./
tar xvfz mod_wsgi-3.4.tar.gz
cd mod_wsgi-3.4/
./configure --with-python=/usr/bin/python2.7
LD_RUN_PATH=/usr/lib make && make install
cd ~ && rm -f mod_wsgi-3.4.tar.gz && rm -rf mod_wsgi-3.4

This section installs the Apache HTTP web server (httpd) and mod_wsgi, compiled against Python 2.7. Django relies on mod_wsgi: Apache hands each request to Django through the WSGI interface, and Django processes it.

Configuration Files

A series of text configuration files are written to the server. A summary is given here.

/etc/httpd/conf.d/djangoWsgi.conf allows Apache to communicate with a Django application.

/etc/httpd/conf.d/geoserverProxy.conf allows the URL /geoserver to point to the GeoServer web application hosted by Apache Tomcat (localhost:8080/geoserver)

/var/www/sites/$serverName/conf/$appName.conf is the primary Apache configuration file, setting up path aliases and the basic server details

/var/www/sites/$serverName/cgi-bin/proxy.cgi is a stock proxy script provided by OpenLayers for cross-domain requests. It is not currently used, but it future-proofs the OpenLayers implementation. The server name is written into the allowed hosts in this file.

/etc/crontab is the configuration for cron (timed tasks) that take place on the server.
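The contents of these files are not reproduced on this page. As an illustration only, the sketch below writes a proxy configuration of the kind described for geoserverProxy.conf using a here-document. The directives are standard Apache mod_proxy directives, but the exact content written by provisions.sh is an assumption, and a temporary directory stands in for /etc/httpd/conf.d.

```shell
confDir=$(mktemp -d)    # stand-in for /etc/httpd/conf.d

# Quoting 'EOF' prevents the shell from expanding anything inside the file body.
cat > "$confDir/geoserverProxy.conf" <<'EOF'
# Assumed content; the file written by provisions.sh may differ.
ProxyRequests Off
ProxyPreserveHost On
ProxyPass /geoserver http://localhost:8080/geoserver
ProxyPassReverse /geoserver http://localhost:8080/geoserver
EOF

cat "$confDir/geoserverProxy.conf"
```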

Java Installation

# COPY INSTALLATION FILES
cd ~
cp /vagrant/conf/software/jre-8-linux-x64.rpm ./
cp /vagrant/conf/software/jai-1_1_3-lib-linux-amd64-jre.bin ./
cp /vagrant/conf/software/jai_imageio-1_1-lib-linux-amd64-jre.bin ./

# INSTALL JAVA JRE
rpm -Uvh jre*
alternatives --install /usr/bin/java java /usr/java/latest/bin/java 200000
rm -f jre-8-linux-x64.rpm

# INSTALL JAI
cd /usr/java/jre1.8.0/
chmod u+x ~/jai-1_1_3-lib-linux-amd64-jre.bin
~/jai-1_1_3-lib-linux-amd64-jre.bin
rm -f ~/jai-1_1_3-lib-linux-amd64-jre.bin

# INSTALL JAI-IO
export _POSIX2_VERSION=199209 
chmod u+x ~/jai_imageio-1_1-lib-linux-amd64-jre.bin 
~/jai_imageio-1_1-lib-linux-amd64-jre.bin 
rm -f ~/jai_imageio-1_1-lib-linux-amd64-jre.bin && cd ~

This section installs the Java JRE, Java Advanced Imaging (JAI), and JAI Image I/O. User input is required while this section runs: two license agreements must be accepted.

PostgreSQL + PostGIS configuration

pgDir=$snfsBasePath'pgsql/9.3/'
pgPth=$snfsBasePath'pgsql/'

# EXCLUDE POSTGRESQL FROM THE BASE CentOS RPM
sed -i -e '/^\[base\]$/a\exclude=postgresql*' /etc/yum.repos.d/CentOS-Base.repo 
sed -i -e '/^\[updates\]$/a\exclude=postgresql*' /etc/yum.repos.d/CentOS-Base.repo 

# INSTALL POSTGRESQL
yum install -y postgresql93* postgis2_93* 

# INSTALL PYTHON PSYCOPG2 MODULE FOR POSTGRES
export PATH=/usr/pgsql-9.3/bin:"$PATH"
pip install psycopg2

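This section excludes the postgresql packages from the base CentOS repositories, installs PostgreSQL and PostGIS (server, libraries, and related packages) from the PGDG repository, and installs the psycopg2 Python module.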

if [ $newDb -eq 1 ]; then
	
	# MAKE THE SNFS1 MOCK DIRECTORY IF IT DOESNT EXIST
	if [ ! -d $pgPth ]
		then
			mkdir -p $pgPth
			chown postgres:postgres $pgPth
			chmod 700 $pgPth
	fi
	
	# INITIALIZE THE DATABASE CLUSTER
	cmdStr='/usr/pgsql-9.3/bin/initdb -D '$pgDir
	su - postgres -c "$cmdStr"
	
	# WRITE PGDATA and PGLOG TO SERVICE CONFIG FILE 
	sed -i "s,PGDATA=/var/lib/pgsql/9.3/data,PGDATA=$pgDir,g" /etc/rc.d/init.d/postgresql-9.3
	sed -i "s,PGLOG=/var/lib/pgsql/9.3/pgstartup.log,PGLOG=$pgDir/pgstartup.log,g" /etc/rc.d/init.d/postgresql-9.3
	
	# CREATE STARTUP LOG
	touch $pgDir"pgstartup.log"
	chown postgres:postgres $pgDir"pgstartup.log"
	chmod 700 $pgDir"pgstartup.log"

	# SET UP THE POSTGRESQL CONFIG FILES
	pgConfDir=$pgDir"postgresql.conf"
	sed -i "s,#port = 5432,port = 5432,g" $pgConfDir
	sed -i "s,#track_counts = on,track_counts = on,g" $pgConfDir
	sed -i "s,#autovacuum = on,autovacuum = on,g" $pgConfDir
	sed -i "s,local   all             all                                     peer,local   all             all                                     trust,g" $pgConfDir

	# START UP THE POSTGRESQL SERVER
	service postgresql-9.3 start

	# CREATE THE ADMIN ROLE
	cmdstring="CREATE ROLE "$dbUser" WITH SUPERUSER LOGIN PASSWORD '"$dbPswd"';"
	psql -U postgres -d postgres -c "$cmdstring"

	# CREATE THE POSTGIS TEMPLATE
	cmdstring="createdb postgis_template -O "$dbUser 
	su - postgres -c "$cmdstring"
	psql -U postgres -d postgis_template -c "CREATE EXTENSION postgis; CREATE EXTENSION postgis_topology;"

fi

If a new database is being installed, this code creates the data directory and sets its permissions, initializes the database cluster, modifies the PostgreSQL configuration, and creates the admin user and the postgis_template database. Finally, the actual ops (dbName) database is created.
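The configuration edits above uncomment default settings with sed. The sketch below shows the same idea on a throwaway file, assuming the three settings appear commented out exactly as in the script. It writes to a temporary copy instead of using sed -i so that it does not depend on GNU sed.

```shell
conf=$(mktemp)   # stand-in for ${pgDir}postgresql.conf
printf '%s\n' '#port = 5432' '#track_counts = on' '#autovacuum = on' > "$conf"

# Strip the leading '#' from each setting, keeping the rest of the line unchanged.
for key in port track_counts autovacuum; do
    sed "s,^#\($key = \),\1," "$conf" > "$conf.new" && mv "$conf.new" "$conf"
done

cat "$conf"
```

Anchoring the pattern with `^#` limits the change to lines that start with the commented setting, which avoids touching the same text if it appears inside another comment.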

Python Package Installation

# INSTALL PACKAGES WITH PIP
pip install Cython 
pip install geojson ujson django-extensions simplekml pylint
pip install --pre line_profiler

# INSTALL NUMPY/SCIPY 
yum -y install atlas-devel blas-devel
pip install numpy
pip install scipy

# INSTALL GEOS
yum -y install geos-devel

The commands pip and yum are used to install all of the needed python packages.

Install and Configure Django

# INSTALL DJANGO
pip install Django==1.6.4

# CREATE DIRECTORY AND COPY PROJECT
mkdir -p /var/django/
cp -rf /vagrant/conf/django/* /var/django/

# GENERATE A NEW SECRET_KEY
NEW_SECRET_KEY=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9*^+()@' | fold -w 40 | head -n 1);
echo $NEW_SECRET_KEY >> /etc/secret_key.txt

# MODIFY THE DATABASE NAME
sed -i "s,		'NAME': 'ops',		'NAME': '$dbName',g" /var/django/ops/ops/settings.py
sed -i "s,		'USER': 'admin',		'USER': '$dbUser',g" /var/django/ops/ops/settings.py

This section uses pip to install a fixed version of Django, copies the CReSIS Django project into place, generates a new secret key and writes it to /etc/secret_key.txt, and modifies the database name and user in the Django settings.py file.
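The secret key step can be tried in isolation. This sketch uses the same character set as the script but reads /dev/urandom directly and takes the first 40 characters with head -c. A temporary file stands in for /etc/secret_key.txt, and the chmod 600 is an added assumption rather than something the script does.

```shell
keyFile=$(mktemp)   # stand-in for /etc/secret_key.txt

# LC_ALL=C makes tr treat the random input as raw bytes.
NEW_SECRET_KEY=$(LC_ALL=C tr -dc 'a-zA-Z0-9*^+()@' < /dev/urandom | head -c 40)

printf '%s\n' "$NEW_SECRET_KEY" > "$keyFile"
chmod 600 "$keyFile"    # assumption: restrict the key file to its owner
echo "key length: ${#NEW_SECRET_KEY}"
```

Note that the script appends to the key file with >>, so re-running it leaves more than one key in the file. Writing with > as shown here keeps only the latest key.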

if [ $newDb -eq 1 ]; then

	# SYNC THE DJANGO DEFINED DATABASE
	python /var/django/$appName/manage.py syncdb --noinput 

	# CREATE DATABASE VIEWS FOR CROSSOVER ERRORS
	viewstr='psql -U postgres -d '$dbName' -c "CREATE VIEW app_crossover_errors AS WITH cx1 AS (SELECT cx.id, cx.angle, cx.geom,cx.point_path_1_id,cx.point_path_2_id, lp.layer_id, lp.twtt FROM app_crossovers cx LEFT JOIN app_layer_points lp ON cx.point_path_1_id=lp.point_path_id), cx2 AS (SELECT cx.id, cx.angle, cx.geom,cx.point_path_1_id,cx.point_path_2_id, lp.layer_id, lp.twtt FROM app_crossovers cx LEFT JOIN app_layer_points lp ON cx.point_path_2_id=lp.point_path_id) SELECT COALESCE(cx1.id,cx2.id) cross_id, COALESCE(cx1.angle,cx2.angle) angle, COALESCE(cx1.geom,cx2.geom) geom, COALESCE(cx1.layer_id,cx2.layer_id) layer_id, cx1.twtt twtt_1, cx2.twtt twtt_2,pp1.id point_path_1_id, pp2.id point_path_2_id, pp1.location_id, pp1.gps_time gps_time_1, pp2.gps_time gps_time_2, pp1.heading heading_1, pp2.heading heading_2, pp1.roll roll_1, pp2.roll roll_2, pp1.pitch pitch_1, pp2.pitch pitch_2, pp1.geom point_path_1_geom, pp2.geom point_path_2_geom, (SELECT name FROM app_frames WHERE id=pp1.frame_id) frame_1_name, (SELECT name FROM app_frames WHERE id=pp2.frame_id) frame_2_name, pp1.segment_id segment_1_id, pp2.segment_id segment_2_id, (SELECT name FROM app_seasons WHERE id=pp1.season_id) season_1_name, (SELECT name FROM app_seasons WHERE id=pp2.season_id) season_2_name, CASE WHEN COALESCE(cx1.layer_id,cx2.layer_id) IS NULL THEN NULL WHEN COALESCE(cx1.layer_id,cx2.layer_id) = 1 THEN ABS((ST_Z(pp1.geom) - cx1.twtt*299792458.0003452/2)) ELSE ABS((ST_Z(pp1.geom) - (SELECT twtt FROM app_layer_points WHERE layer_id=1 AND point_path_id = pp1.id)*299792458.0003452/2 - (cx1.twtt - (SELECT twtt FROM app_layer_points WHERE layer_id = 1 AND point_path_id = pp1.id))*299792458.0003452/2/sqrt(3.15))) END AS layer_elev_1, CASE WHEN COALESCE(cx1.layer_id,cx2.layer_id) IS NULL THEN NULL WHEN COALESCE(cx1.layer_id,cx2.layer_id) = 1 THEN ABS((ST_Z(pp2.geom) - cx2.twtt*299792458.0003452/2)) ELSE ABS((ST_Z(pp2.geom) - (SELECT twtt FROM app_layer_points WHERE layer_id=1 AND point_path_id = 
pp2.id)*299792458.0003452/2 - (cx2.twtt - (SELECT twtt FROM app_layer_points WHERE layer_id = 1 AND point_path_id = pp2.id))*299792458.0003452/2/sqrt(3.15))) END AS layer_elev_2 FROM cx1 FULL OUTER JOIN cx2 ON (cx1.id=cx2.id AND cx1.layer_id=cx2.layer_id) JOIN app_point_paths pp1 ON pp1.id=COALESCE(cx1.point_path_1_id,cx2.point_path_1_id) JOIN app_point_paths pp2 ON pp2.id=COALESCE(cx1.point_path_2_id,cx2.point_path_2_id) WHERE (cx1.layer_id IS NOT NULL OR cx2.layer_id IS NOT NULL);"'
	eval ${viewstr//app/rds}
	eval ${viewstr//app/snow}
	eval ${viewstr//app/accum}
	eval ${viewstr//app/kuband}

fi

If a new database is being installed, the syncdb command is called, creating the schema defined by Django in the database and loading all initial data (fixtures). A crossover errors view is then created for each application (rds, snow, accum, kuband) by substituting the application name for "app" in the view definition.
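The per-application substitution can be illustrated with a much shorter template. This sketch swaps in sed for the bash `${viewstr//app/rds}` expansion used in the script and replaces only the `app_` prefix. The template is a placeholder, not the real view definition.

```shell
# Placeholder template; the real view definition is far longer.
template='CREATE VIEW app_crossover_errors AS SELECT id FROM app_crossovers;'

# Produce one statement per application by rewriting the app_ prefix.
views=$(for app in rds snow accum kuband; do
    printf '%s\n' "$template" | sed "s/app_/${app}_/g"
done)

printf '%s\n' "$views"
```

Matching `app_` rather than bare `app` is a design choice for the sketch: it avoids rewriting unrelated identifiers that happen to contain the letters "app".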

Bulk Data Loading

if [ $installPgData -eq 1 ]; then
	fCount=$(ls -A /vagrant/data/postgresql/ | wc -l);
	if [ $fCount -gt 1 ]; then
		
		# INSTALL pg_bulkload AND DEPENDENCIES
		cd ~ && cp -f /vagrant/conf/software/pg_bulkload-3.1.5-1.pg93.rhel6.x86_64.rpm ./
		cd ~ && cp -f /vagrant/conf/software/compat-libtermcap-2.0.8-49.el6.x86_64.rpm ./
		yum install -y openssl098e;
		rpm -Uvh ./compat-libtermcap-2.0.8-49.el6.x86_64.rpm;
		rpm -ivh ./pg_bulkload-3.1.5-1.pg93.rhel6.x86_64.rpm;
		rm -f compat-libtermcap-2.0.8-49.el6.x86_64.rpm && rm -f pg_bulkload-3.1.5-1.pg93.rhel6.x86_64.rpm
		
		# ADD pg_bulkload FUNCTION TO THE DATABASE
		su postgres -c "psql -f /usr/pgsql-9.3/share/contrib/pg_bulkload.sql "$appName"";
		
		# LOAD INITIAL DATA INTO THE DATABASE
		sh /vagrant/conf/bulkload/initdataload.sh
	fi
fi

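If installPgData is set and more than one entry is present in /vagrant/data/postgresql/, this section installs the pg_bulkload tool and its dependencies, adds the pg_bulkload function to the database, and bulk loads the initial data. See more about this process here.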

Constraints

psql -U postgres -d $dbName -c "ALTER TABLE rds_layer_points ADD UNIQUE (layer_id, point_path_id);ALTER TABLE accum_layer_points ADD UNIQUE (layer_id, point_path_id);ALTER TABLE snow_layer_points ADD UNIQUE (layer_id, point_path_id);ALTER TABLE kuband_layer_points ADD UNIQUE (layer_id, point_path_id);"

This code adds a unique constraint on (layer_id, point_path_id) to each application's layer points table in an effort to prevent duplicates.
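Since the four statements differ only in the application name, they can be built in a loop. The sketch below only assembles and prints the SQL string. Passing it to psql as in the command above is left out so the example runs without a database.

```shell
sql=""
for app in rds accum snow kuband; do
    sql="${sql}ALTER TABLE ${app}_layer_points ADD UNIQUE (layer_id, point_path_id);"
done

echo "$sql"
```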

Apache Tomcat & GeoServer WAR


# INSTALL APACHE TOMCAT
yum install -y tomcat6

# CONFIGURE TOMCAT6
echo 'JAVA_HOME="/usr/java/jre1.8.0/"' >> /etc/tomcat6/tomcat6.conf
echo 'JAVA_OPTS="-server -Xms512m -Xmx512m -XX:+UseParallelGC -XX:+UseParallelOldGC"' >> /etc/tomcat6/tomcat6.conf
echo 'CATALINA_OPTS="-DGEOSERVER_DATA_DIR='$snfsBasePath'geoserver"' >> /etc/tomcat6/tomcat6.conf

# MAKE THE EXTERNAL GEOSERVER DATA DIRECTORY (IF IT DOESNT EXIST)
geoServerDataPath=$snfsBasePath"geoserver/"
if [ ! -d $geoServerDataPath ]; then
	mkdir -p $geoServerDataPath
fi

# EXTRACT THE OPS GEOSERVER DATA DIR TO THE DIRECTORY
cp -rf /vagrant/conf/geoserver/geoserver/* $geoServerDataPath

# GET THE GEOSERVER REFERENCE DATA
if [ -f /vagrant/data/geoserver/geoserver.zip ]; then

	unzip /vagrant/data/geoserver/geoserver.zip -d $geoServerDataPath"data/"

else

	# DOWNLOAD THE DATA PACK FROM CReSIS (MINIMAL LAYERS)
	cd /vagrant/data/geoserver/ && wget https://data.cresis.ku.edu/data/ops/geoserver.zip
	
	# UNZIP THE DOWNLOADED DATA PACK
	unzip /vagrant/data/geoserver/geoserver.zip -d $geoServerDataPath"data/"

fi

# TEMPORARY HACK UNTIL THE GEOSERVER.ZIP STRUCTURE CHANGES
mv $geoServerDataPath"data/geoserver/data/arctic" $geoServerDataPath"data/"
mv $geoServerDataPath"data/geoserver/data/antarctic" $geoServerDataPath"data/"
rm -rf $geoServerDataPath"data/geoserver/"

# COPY THE GEOSERVER WAR TO TOMCAT
cp /vagrant/conf/geoserver/geoserver.war /var/lib/tomcat6/webapps

# SET OWNERSHIP/PERMISSIONS OF GEOSERVER DATA DIRECTORY
chmod -R u=rwX,g=rwX,o=rX $geoServerDataPath
chown -R tomcat:tomcat $geoServerDataPath

# START APACHE TOMCAT
service tomcat6 start

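In this section Apache Tomcat 6 (tomcat6) is installed and configured to use the custom Java installation. The custom GeoServer data directory is copied into place and its ownership and permissions are set. The GeoServer reference data is unzipped from /vagrant/data/geoserver/geoserver.zip, which is downloaded first if it is not already present. The GeoServer WAR is then copied into Tomcat and Tomcat is started.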
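The "temporary hack" step in the code above flattens the directory layout produced by unzipping geoserver.zip. The following sketch reproduces that move in a temporary directory so the before and after layout can be inspected. The empty arctic and antarctic directories are stand-ins for the unzipped data.

```shell
geoServerDataPath="$(mktemp -d)/"    # stand-in for ${snfsBasePath}geoserver/

# Recreate the nested layout that the zip file is described as producing.
mkdir -p "${geoServerDataPath}data/geoserver/data/arctic" \
         "${geoServerDataPath}data/geoserver/data/antarctic"

# Move each region up to data/ and remove the now-empty nesting.
for region in arctic antarctic; do
    mv "${geoServerDataPath}data/geoserver/data/$region" "${geoServerDataPath}data/"
done
rm -rf "${geoServerDataPath}data/geoserver/"

ls "${geoServerDataPath}data"
```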

GeoPortal

cp -rf /vagrant/conf/geoportal/* /var/www/html/ # COPY THE APPLICATION

# WRITE THE BASE URL TO app.js
# sed -i "s,	 baseUrl: ""http://192.168.111.222"",	 baseUrl: ""$serverName"",g" /var/www/html/app.js

# CREATE AND CONFIGURE ALL THE OUTPUT DIRECTORIES
mkdir -m 777 -p $snfsBasePath"data/csv/"
mkdir -m 777 -p $snfsBasePath"data/kml/"
mkdir -m 777 -p $snfsBasePath"data/mat/"
mkdir -m 777 -p $snfsBasePath"datapacktmp/"
mkdir -m 777 -p  $snfsBasePath"data/datapacks/"
mkdir -m 777 -p $snfsBasePath"data/reports/"
mkdir -m 777 -p /var/profile_logs/txt/

This section copies the GeoPortal application into /var/www/html/ and creates the world-writable output directories. The step that writes the server URL into app.js for the GeoPortal is currently commented out.
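The output directories under snfsBasePath can also be created in a loop. This is a minimal sketch that uses a temporary directory as a stand-in for snfsBasePath and omits /var/profile_logs/txt/, which lives outside that base path.

```shell
snfsBasePath="$(mktemp -d)/"    # stand-in for the real snfsBasePath

for sub in data/csv data/kml data/mat datapacktmp data/datapacks data/reports; do
    mkdir -m 777 -p "${snfsBasePath}${sub}/"
done

ls "${snfsBasePath}data"
```

With mkdir -p, the -m mode applies only to the final directory in each path. Intermediate directories such as data/ are created with the default mode derived from the umask.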

Final Setups

# APACHE HTTPD
service httpd start
chkconfig httpd on

# POSTGRESQL
service postgresql-9.3 start
chkconfig postgresql-9.3 on

# APACHE TOMCAT
service tomcat6 start
chkconfig tomcat6 on

This section starts the web server, the web application server, and the database server, and enables each of them to start automatically on reboot. Finally the system is updated and you're good to go.