In this sample, you'll build a Cloud Run service that receives notifications of completed BigQuery jobs using Eventarc, determines if the BigQuery job was expensive to run and if so, sends a notification email using SendGrid.
You can check AuditLogs of BigQuery for all the information. Note that AuditLogs are enabled by default in BigQuery.
To see completed BigQuery jobs, you can see AuditLogs logged under
serviceName
of bigquery.googleapis.com
and methodName
of jobservice.jobcompleted
:
To get more job details, you can check under jobsCompletedEvent
where you have
access to what query was executed and more importantly how much it cost under
totalBilledBytes
:
totalBilledBytes
gives you an idea how expensive the query was. For example,
if the query was cached, this will be zero and cost nothing but for expensive
queries, it will be in GBs or more. In this sample, we'll assume queries 1GB or
more are expensive.
Before deploying services and triggers, go through some setup steps.
Make sure that the project id is setup:
gcloud config set project [YOUR-PROJECT-ID]
PROJECT_ID=$(gcloud config get-value project)
Enable all necessary services:
gcloud services enable run.googleapis.com
gcloud services enable eventarc.googleapis.com
gcloud services enable cloudbuild.googleapis.com
You will use Audit Logs trigger for BigQuery which are enabled by default, nothing to do here.
Set region, location and platform for Cloud Run and Eventarc:
REGION=us-central1
gcloud config set run/platform managed
gcloud config set run/region $REGION
gcloud config set eventarc/location $REGION
Default compute service account will be used in the Audit Log triggers of Eventarc. Grant the
eventarc.eventReceiver
role to the default compute service account:
PROJECT_NUMBER="$(gcloud projects describe $(gcloud config get-value project) --format='value(projectNumber)')"
gcloud projects add-iam-policy-binding $(gcloud config get-value project) \
--member=serviceAccount:$PROJECT_NUMBER[email protected] \
--role='roles/eventarc.eventReceiver'
This service will receive the BigQuery jobcompleted
events, log the received
event and if totalBilledBytes
is more than 1GB, it will use SendGrid to send an
email about the expensive query.
You need to setup a SendGrid account and create an API key. You can follow SendGrid's API Keys doc for more details on how to setup SendGrid.
The code of the service is in app.py. You can take a look how to parse the received CloudEvent, how to parse the AuditLog with the relevant info and finally how to send an email for expensive queries.
Build and push the container image:
SERVICE_NAME=bigquery-usage-notifier
gcloud builds submit --tag gcr.io/$PROJECT_ID/$SERVICE_NAME
Deploy the service while passing in TO_EMAILS
to email address where you want
to send the notification and SENDGRID_API_KEY
with your send SendGrid API Key.
[email protected]
SENDGRID_API_KEY=yoursendgridapikey
gcloud run deploy $SERVICE_NAME \
--image gcr.io/$PROJECT_ID/$SERVICE_NAME \
--allow-unauthenticated \
--update-env-vars TO_EMAILS=$TO_EMAILS,SENDGRID_API_KEY=$SENDGRID_API_KEY
Once the service is deployed, create a trigger to filter for the right BigQuery events:
gcloud eventarc triggers create $SERVICE_NAME-trigger \
--destination-run-service=$SERVICE_NAME \
--destination-run-region=$REGION \
--event-filters="type=google.cloud.audit.log.v1.written" \
--event-filters="serviceName=bigquery.googleapis.com" \
--event-filters="methodName=jobservice.jobcompleted" \
--service-account=$PROJECT_NUMBER[email protected]
Before testing, make sure the trigger is ready by checking ACTIVE
flag:
gcloud eventarc triggers list
NAME ACTIVE
bigquery-usage-notifier-trigger Yes
To test, you need to run a BigQuery job that results in 1GB or more billed
bytes. Here's a sample query to run using bq
CLI:
bq query \
--nouse_legacy_sql \
--nouse_cache \
'SELECT * FROM `bigquery-samples`.reddit.full'
Once the query completes, you should see the Cloud Run service log the query and you should also receive an email like this one:
The following BigQuery job completed
principalEmail: [email protected]
jobId: bqjob_r3293aa18ce3b8bed_00000179e689b8b9_1
createTime: 2021-06-07T12:54:16.783Z
query: SELECT * FROM `bigquery-samples`.reddit.full
totalBilledBytes: 1450180608.0, above 1GB? True