We will work on migrating our operational database (PostgreSQL), used in the Cloud Intro module, to Cloud SQL.
Pub/Sub is a messaging service that allows you to send and receive messages between independent applications. It is similar to Kafka, but it is a managed service.
To create the Pub/Sub topics in the UI, follow these steps:
- Go to Pub/Sub in the console.
- Click on `Create Topic`.
- Name the topic `order-events`.
- Add another topic called `delivery-events`.
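If you prefer to create the topics from code instead of the UI, a minimal sketch with the `google-cloud-pubsub` Python client could look like this (the project ID is a placeholder you must replace; the topic names match the ones above):

```python
from google.cloud import pubsub_v1  # pip install google-cloud-pubsub

PROJECT_ID = "<your-project-id>"  # placeholder, replace with your own project

publisher = pubsub_v1.PublisherClient()

# Create the two topics used in this exercise.
for topic_name in ("order-events", "delivery-events"):
    topic_path = publisher.topic_path(PROJECT_ID, topic_name)
    topic = publisher.create_topic(request={"name": topic_path})
    print(f"Created topic: {topic.name}")
```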
Next, create the two Compute Engine instances from your machine image:
gcloud compute instances create orders-app \
--zone=europe-west1-b \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--subnet=projects/<your-project-id>/regions/europe-west1/subnetworks/default \
--machine-type=e2-micro \
--source-machine-image=projects/<your-project-id>/global/machineImages/<MACHINE_IMAGE_NAME> \
--boot-disk-size=10GB
gcloud compute instances create delivery-app \
--zone=europe-west1-b \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--subnet=projects/<your-project-id>/regions/europe-west1/subnetworks/default \
--machine-type=e2-micro \
--source-machine-image=projects/<your-project-id>/global/machineImages/<MACHINE_IMAGE_NAME> \
--boot-disk-size=10GB
Create a GCS bucket to store the Terraform state:
gcloud storage buckets create gs://edem-terraform-state \
--location=europe-west1 \
--uniform-bucket-level-access
Move to the Terraform directory:
cd EDEM2425/gcp_sql/excercise_e2e/terraform
Modify the `service_account_email` variable in the `variables.tf` file with the email of your project's service account, then run:
terraform init
terraform plan
terraform apply
Once we have finished the exercise, we can destroy the infrastructure by running:
terraform destroy
First, create a new database in the Cloud SQL UI. Go to `Databases`, click on `Create Database`, and name it `ecommerce`.
To create the tables inside the PostgreSQL instance, connect to it using Cloud SQL Studio and run the following SQL commands:
CREATE TABLE IF NOT EXISTS customers (
id SERIAL PRIMARY KEY,
customer_name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS products (
id SERIAL PRIMARY KEY,
product_name VARCHAR(255) NOT NULL,
price DECIMAL(10, 2) NOT NULL
);
CREATE TABLE IF NOT EXISTS orders (
id SERIAL PRIMARY KEY,
customer_id INT NOT NULL,
created_at TIMESTAMP NOT NULL,
total_price DECIMAL(10, 2) NOT NULL,
FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS order_products (
order_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT NOT NULL,
price DECIMAL(10, 2) NOT NULL,
PRIMARY KEY (order_id, product_id),
FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE,
FOREIGN KEY (product_id) REFERENCES products(id) ON DELETE CASCADE
);
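To sanity-check the schema, a minimal sketch like the one below inserts and reads back a row with `psycopg2`, assuming your Cloud SQL instance accepts connections from your machine; the host and password are placeholders:

```python
import psycopg2  # pip install psycopg2-binary

# Placeholders: replace with your Cloud SQL public IP and credentials.
conn = psycopg2.connect(
    host="<your-cloud-sql-ip>",
    dbname="ecommerce",
    user="postgres",
    password="<your-password>",
)

with conn, conn.cursor() as cur:
    # Insert a test customer and read it back.
    cur.execute(
        "INSERT INTO customers (customer_name, email) VALUES (%s, %s) RETURNING id",
        ("Test Customer", "test@example.com"),
    )
    customer_id = cur.fetchone()[0]
    cur.execute("SELECT id, customer_name, email FROM customers WHERE id = %s", (customer_id,))
    print(cur.fetchone())

conn.close()
```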
### In both instances
- Log in to the instance:
  gcloud compute ssh <instance-name> --zone=europe-west1-b
- Run a git pull inside the repository:
  cd EDEM2425
  git pull
- Move to the correct directory:
  cd EDEM2425/gcp_sql/exercise_e2e
- Create a virtual environment:
  python3 -m venv .venv
- Activate the venv:
  source .venv/bin/activate
- Install the requirements:
  pip install -r requirements.txt
- Run the `orders-app` using nohup so that it runs in the background (if you close the terminal, the app will still be running):
  nohup bash -c 'HOST_IP=<your-cloud-sql-ip> PROJECT_ID=<your-project-id> python -m orders_app.orders_to_db.main' > output.log 2>&1 &
  This will start creating orders, storing them in the database, and publishing confirmation events to the `order-events` topic.
  If you want to see the logs, run the following command:
  tail -f output.log
- Run the following command to start the `delivery-app`:
  nohup bash -c 'PROJECT_ID=<your-project-id> python -m delivery_app.main' > output.log 2>&1 &
  If you want to see the logs, run the following command:
  tail -f output.log
  This will start consuming the events from the `order-events` topic and publishing delivery events to the `delivery-events` topic.
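For context, the consume-and-republish pattern the `delivery-app` follows looks roughly like the sketch below; the subscription name `order-events-sub` and the message fields are illustrative assumptions, not the exercise's actual code:

```python
import json
import os

from google.cloud import pubsub_v1  # pip install google-cloud-pubsub

PROJECT_ID = os.environ["PROJECT_ID"]
SUBSCRIPTION = "order-events-sub"    # assumed subscription name on the order-events topic
DELIVERY_TOPIC = "delivery-events"

publisher = pubsub_v1.PublisherClient()
subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION)
topic_path = publisher.topic_path(PROJECT_ID, DELIVERY_TOPIC)


def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    # Read the order event, mark it as delivered, and republish it.
    order = json.loads(message.data.decode("utf-8"))
    delivery_event = {"order_id": order.get("order_id"), "status": "delivered"}  # assumed fields
    publisher.publish(topic_path, json.dumps(delivery_event).encode("utf-8"))
    message.ack()


# Block the main thread while messages are pulled in the background.
future = subscriber.subscribe(sub_path, callback=callback)
with subscriber:
    future.result()
```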
Once you have the `orders-app` and the `delivery-app` running, you can also deploy the `analytical-layer` to process the delivery events and store them in ClickHouse on your local machine, as we did in the Cloud Intro end-to-end exercise.
To do this, we need a few steps:
- Go to the exercise directory on your local machine:
  cd EDEM2425/gcp_setup/excercise_1
- Create the venv, activate it and install the requirements:
  python -m venv .venv
  source .venv/bin/activate
  pip install -r requirements.txt
- Deploy the docker-compose of the analytical-layer:
  cd analytical-layer
  docker-compose up -d
Now that we have both ClickHouse and Metabase running, we can deploy the EL pipeline to synchronize the PostgreSQL database and the events from Pub/Sub, each with its own script.
To do so, we need to run the following commands:
- Create your application-default credentials. This will allow us to authenticate to the GCP services with our user account:
  gcloud auth application-default login
- Run the script to synchronize the PostgreSQL database from inside the ./excercise_1 directory:
  HOST_IP=localhost POSTGRES_IP=<postgres-ip> python -m analytical_layer.el_orders.main
  If you are using Windows CMD, you can run:
  set HOST_IP=localhost
  set POSTGRES_IP=<postgres-ip>
  python -m analytical_layer.el_orders.main
  If you are using Windows PowerShell, you can run:
  $env:HOST_IP = "localhost"; $env:POSTGRES_IP = "<postgres-ip>"; python -m analytical_layer.el_orders.main
- To synchronize the events of the delivery app, you can run the following command:
  HOST_IP=localhost PROJECT_ID=<your-gcp-project-id> python -m analytical_layer.el_delivery.main
  If you are using Windows CMD, you can run:
  set HOST_IP=localhost
  set PROJECT_ID=<your-gcp-project-id>
  python -m analytical_layer.el_delivery.main
  If you are using Windows PowerShell, you can run:
  $env:HOST_IP = "localhost"; $env:PROJECT_ID = "<your-gcp-project-id>"; python -m analytical_layer.el_delivery.main
- To authenticate inside the Docker container, we need to create a Service Account in GCP that allows the `analytical-layer` to consume the `delivery-events` topic. To do this, follow these steps:
  - Go to the `IAM & Admin` section in the GCP console.
  - Click on `Service Accounts`.
  - Click on `Create Service Account`.
  - Name the service account `pub-sub-get-subscription`.
  - Add the role `Pub/Sub Subscriber`.
  - Click on `Create`.
  - Click on the service account you just created.
  - Click on `Add Key`.
  - Select `JSON` and click on `Create`.
  - Copy the path of the JSON file you just downloaded.
- Go to the
- Build the Docker image:
  docker build -t analytical-layer-cron -f analytical_layer/docker/DockerFile .
- Run the following command (remember to replace the variables within <> with your own values):
  docker run --network analytical_layer_default \
    -e PROJECT_ID=<your-gcp-project-id> \
    -e POSTGRES_IP=<postgres-ip> \
    -e HOST_IP=<clickhouse-docker-container-ip> \
    -v <path-to-your-pub-sub-sa>:/app/pub-sub-credentials.json \
    -e GOOGLE_APPLICATION_CREDENTIALS=/app/pub-sub-credentials.json \
    analytical-layer-cron
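For reference, what runs inside that container can be sketched as pulling delivery events with the service-account key and loading them into ClickHouse. The subscription name, target table, and event fields below are illustrative assumptions rather than the exercise's actual code, and the sketch uses the `clickhouse-connect` client:

```python
import json
import os

import clickhouse_connect                 # pip install clickhouse-connect
from google.cloud import pubsub_v1        # picks up GOOGLE_APPLICATION_CREDENTIALS automatically

PROJECT_ID = os.environ["PROJECT_ID"]
CLICKHOUSE_HOST = os.environ["HOST_IP"]
SUBSCRIPTION = "delivery-events-sub"       # assumed subscription on the delivery-events topic

subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION)
ch = clickhouse_connect.get_client(host=CLICKHOUSE_HOST, port=8123)

# Pull a small batch of delivery events and load them into ClickHouse.
response = subscriber.pull(request={"subscription": sub_path, "max_messages": 50})
rows, ack_ids = [], []
for received in response.received_messages:
    event = json.loads(received.message.data.decode("utf-8"))
    rows.append([event.get("order_id"), event.get("status")])  # assumed event fields
    ack_ids.append(received.ack_id)

if rows:
    # Assumed target table: delivery_events(order_id, status)
    ch.insert("delivery_events", rows, column_names=["order_id", "status"])
    subscriber.acknowledge(request={"subscription": sub_path, "ack_ids": ack_ids})
```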