Automated Testing and Deployment of a Python Micro-service to AWS, using Docker, Boto3 and Travis-CI
The purpose of this project is to demonstrate how to automate the testing and deployment of a simple Flask-based (RESTful) micro-service to a production-like environment on AWS. The deployment pipeline is handled by Travis-CI, which has been granted access to this GitHub repository and configured to run upon a pull request or a merge to the master branch. The pipeline is defined in the `.travis.yml` file and consists of the following steps:
- define which Python version to use;
- install the `Pipenv` package using `pip`;
- use `Pipenv` to install the project dependencies defined in `Pipfile.lock`;
- run unit tests by executing `pipenv run python -m unittest tests/*.py`; and,
- if on the `master` branch - e.g. if a pull request has been merged - then start Docker and run the `deploy_to_aws.py` script.
The `deploy_to_aws.py` script defines the deployment process, which performs the following steps without any manual intervention (a sketch of this flow is given below):
- build the required Docker image;
- push the image to AWS's Elastic Container Registry (ECR); and,
- trigger a rolling redeployment of the service across an Elastic Container Service (ECS) cluster.
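For orientation, a minimal sketch of what such a build-push-redeploy flow can look like with the docker SDK and boto3 is shown below - the repository, cluster and service names are illustrative placeholders, and the actual `deploy_to_aws.py` may differ,

```python
# Sketch of an automated build-push-redeploy flow using the docker SDK and boto3.
# The repository, cluster and service names below are illustrative placeholders.
import base64
import os

import boto3
import docker

AWS_REGION = os.environ["AWS_REGION"]
ECR_REPO = "microservice"      # hypothetical ECR repository name
ECS_CLUSTER = "microservice"   # hypothetical ECS cluster name
ECS_SERVICE = "microservice"   # hypothetical ECS service name

# 1. Build the Docker image from the Dockerfile in the project root.
docker_client = docker.from_env()
image, _ = docker_client.images.build(path=".", tag=ECR_REPO)

# 2. Authenticate with ECR and push the image.
ecr = boto3.client("ecr", region_name=AWS_REGION)
auth = ecr.get_authorization_token()["authorizationData"][0]
username, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")
registry = auth["proxyEndpoint"].replace("https://", "")
docker_client.login(username=username, password=password, registry=registry)
image.tag(f"{registry}/{ECR_REPO}", tag="latest")
docker_client.images.push(f"{registry}/{ECR_REPO}", tag="latest")

# 3. Trigger a rolling redeployment of the ECS service using the new image.
ecs = boto3.client("ecs", region_name=AWS_REGION)
ecs.update_service(cluster=ECS_CLUSTER, service=ECS_SERVICE, forceNewDeployment=True)
```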
It is reliant on the definition of three environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_REGION`. For security reasons, these are kept out of the `.travis.yml` file and are instead defined using the Travis-CI UI.
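boto3 resolves the access key and secret automatically from these environment variables; the short sketch below shows how the region might then be supplied when creating a client (an assumption about how the script consumes `AWS_REGION`),

```python
import os

import boto3

# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are picked up automatically by
# boto3 from the environment; here the region is passed explicitly, assuming
# the script reads AWS_REGION itself (boto3 would otherwise look for
# AWS_DEFAULT_REGION or a shared config file).
ecs_client = boto3.client("ecs", region_name=os.environ["AWS_REGION"])
```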
Although the micro-service used in this example - as defined in the `microservice/api.py` module - only returns a simple message in response to a `GET` request, it could just as easily be a Machine Learning (ML) model-scoring service that receives the values of feature variables and returns a prediction - the overall pattern is the same.
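For illustration only, a minimal handler along these lines might look as follows - the `/microservice/score` route and its payload are hypothetical additions to show the model-scoring variant, and the exact contents of `microservice/api.py` may differ,

```python
# A minimal sketch of a Flask micro-service with a single GET endpoint.
# Swapping the handler for one that loads an ML model and returns predictions
# would not change the surrounding build-and-deploy pattern.
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/microservice", methods=["GET"])
def hello():
    """Return a simple message to confirm the service is alive."""
    return jsonify({"message": "hello from the microservice"})


@app.route("/microservice/score", methods=["POST"])
def score():
    """Hypothetical model-scoring endpoint: echo back the feature payload."""
    features = request.get_json()
    # a real service would call model.predict(features) here
    return jsonify({"prediction": None, "features": features})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```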
Currently, the initial setup of the required AWS infrastructure is entirely manual (although this could also be scripted in the future - see the sketch after the list below). What is required is an ECS cluster capable of hosting multiple groups of Docker containers (or 'tasks' - i.e. web applications, or in our case just a single micro-service), that sit behind a load balancer which accepts incoming traffic and routes it to different containers in the cluster. Collectively, this constitutes a 'service' that is highly available. At a high level, the steps required to set up this infrastructure using the AWS management console are as follows (assuming the existence of a repository in ECR, containing our Docker image):
- create a new ECS cluster, in a new VPC, using instances that are ~`t2.medium`;
  - when configuring the security group (firewall) for the cluster, consider allowing a rule for a single IP to assist debugging (e.g. `YOUR_LOCAL_IP_ADDRESS/32`);
- create a new application load balancer for the new VPC;
  - then create a custom security group for the load balancer (from the EC2 console), that allows anything from the outside world to pass;
  - modify the ECS cluster's security group to allow the load balancer access, by explicitly referencing the security group for the load balancer that we have just created;
- create a new target group for the new VPC (from within the EC2 console under the 'Load Balancers' section), which we will eventually point the load balancer to;
  - there is no need to add the instances from the ECS cluster in this step, as this will be handled automatically when creating the service;
  - modify the health check path to `/microservice`, otherwise it won't get 200s and will try to re-register hosts;
- create a new task in ECS;
  - for the sake of simplicity, choose `daemon` mode - i.e. assume there is only one container per task;
  - when adding the container for the task, be sure to reference the Docker image uploaded to ECR;
- create a new service for our ECS cluster, referencing the task, load balancer and target group that we have created in the steps above.
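As noted above, parts of this setup could eventually be scripted. A hedged sketch of how a couple of these steps might be automated with boto3 is shown below - the cluster name, target group name, VPC ID and region are placeholders, not values taken from this project,

```python
# Illustrative sketch of scripting part of the AWS setup with boto3.
# All names, the VPC ID, region and port are placeholders.
import boto3

REGION = "eu-west-2"  # placeholder region

# Create the ECS cluster that will host the micro-service tasks.
ecs = boto3.client("ecs", region_name=REGION)
ecs.create_cluster(clusterName="microservice-cluster")

# Create a target group whose health check hits the /microservice route,
# so that healthy containers return 200s and are not de-registered.
elbv2 = boto3.client("elbv2", region_name=REGION)
elbv2.create_target_group(
    Name="microservice-targets",
    Protocol="HTTP",
    Port=5000,
    VpcId="vpc-0123456789abcdef0",  # placeholder VPC ID
    HealthCheckPath="/microservice",
)
```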
We use pipenv for managing project dependencies and Python environments (i.e. virtual environments). All of the direct package dependencies required to run the code (e.g. docker and boto3), as well as all the packages used during development (e.g. IPython for interactive console sessions), are described in the `Pipfile`. Their precise downstream dependencies are described in `Pipfile.lock`.
To get started with Pipenv, first of all install it - assuming that there is a global version of Python available on your system and on the PATH, this can be achieved by running the following command,
pip3 install pipenv
Pipenv is also available to install from many non-Python package managers. For example, on OS X it can be installed using the Homebrew package manager, with the following terminal command,
brew install pipenv
For more information, including advanced configuration options, see the official pipenv documentation.
Make sure that you're in the project's root directory (the same one in which `Pipfile` resides), and then run,
pipenv install --dev
This will install all of the direct project dependencies as well as the development dependencies (the latter a consequence of the `--dev` flag).
In order to continue development in a Python environment that precisely mimics the one the project was initially developed with, use Pipenv from the command line as follows,
pipenv run python3
The `python3` command could just as well be `ipython3` or the Jupyter notebook server, for example,
pipenv run jupyter notebook
This will fire up a Jupyter notebook server where the default Python 3 kernel includes all of the direct and development project dependencies. This is how we advise that the notebooks within this project are used.
All tests have been written using the unittest package from the Python standard library. Tests are kept in the `tests` folder and can be run from the command line - e.g. by invoking,
pipenv run python -m unittest tests/test_*.py
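As an illustration, a minimal test of the kind that could live in the `tests` folder is sketched below - it assumes the Flask application object in `microservice/api.py` is named `app`, which is an assumption rather than a documented fact,

```python
# tests/test_api.py - illustrative sketch; the real tests may differ.
import unittest

from microservice.api import app  # assumes the Flask app object is named `app`


class TestMicroserviceAPI(unittest.TestCase):
    def setUp(self):
        # Flask's built-in test client avoids having to run a real server.
        self.client = app.test_client()

    def test_get_returns_200(self):
        response = self.client.get("/microservice")
        self.assertEqual(response.status_code, 200)


if __name__ == "__main__":
    unittest.main()
```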
The micro-service can be started via the command line, from the root directory, using,
pipenv run python -m microservice.api
This will start the server at `http://localhost:5000/microservice`.
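Once the server is running, the endpoint can be exercised with any HTTP client - for example, using the requests package (the exact response body depends on the service),

```python
# Quick smoke test against the locally running micro-service.
import requests

response = requests.get("http://localhost:5000/microservice")
print(response.status_code)  # expect 200
print(response.text)         # the simple message returned by the service
```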