Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate Off Circle CI / To Github Actions + dagger.io #923

Merged
merged 91 commits into from
Jan 10, 2024
Merged
Show file tree
Hide file tree
Changes from 83 commits
Commits
Show all changes
91 commits
Select commit Hold shift + click to select a range
fd6f6f0
Add Github action for integration test
JCZuurmond Sep 29, 2023
795e40a
Update tox
JCZuurmond Sep 29, 2023
ff39c5d
Fetch spark from https link
JCZuurmond Sep 29, 2023
1505fc6
Use Spark version 3.1.2
JCZuurmond Sep 29, 2023
44fe33f
Seperate running Spark session and thrift
JCZuurmond Sep 29, 2023
2655631
Use Spark 3.1.2 and Hadoop 3.2
JCZuurmond Sep 29, 2023
915f67e
Reset tox.ini
JCZuurmond Sep 29, 2023
f0ef215
Remove base pythons in tox.ini
JCZuurmond Sep 29, 2023
e8457df
Fix reference to Docker compose file
JCZuurmond Sep 29, 2023
842466a
Remove timeout
JCZuurmond Sep 29, 2023
0738f2d
Remove artifact steps
JCZuurmond Sep 29, 2023
277bef1
Bump Spark and Hadoop versions
JCZuurmond Sep 29, 2023
8d5853d
Reset Spark and Hadoop version
JCZuurmond Sep 29, 2023
919528a
Update comment
JCZuurmond Sep 29, 2023
15e48fd
Add changie
JCZuurmond Sep 29, 2023
ab90c4c
Merge branch 'main' into add-github-workflow-for-integration-tests
Fleid Oct 12, 2023
31cb05e
add databricks and PR execution protections
colin-rogers-dbt Oct 18, 2023
31eceb5
Merge branch 'main' into migrateOffCircleCI
colin-rogers-dbt Oct 18, 2023
fd54d7f
use single quotes
colin-rogers-dbt Oct 23, 2023
8de8339
remove `_target` suffix
colin-rogers-dbt Oct 23, 2023
e85232f
add comment to test
colin-rogers-dbt Oct 23, 2023
fe3300e
specify container user as root
colin-rogers-dbt Oct 23, 2023
b37e14b
formatting
colin-rogers-dbt Oct 23, 2023
51511ec
remove python setup for pre-existing container
colin-rogers-dbt Oct 23, 2023
98607b6
download simba
colin-rogers-dbt Oct 23, 2023
e6ec414
fix curl call
colin-rogers-dbt Oct 23, 2023
05a2c08
fix curl call
colin-rogers-dbt Oct 23, 2023
a89ec58
fix curl call
colin-rogers-dbt Oct 23, 2023
2a18fad
fix curl call
colin-rogers-dbt Oct 23, 2023
1481396
fix curl call
colin-rogers-dbt Oct 23, 2023
31b427c
fix curl call
colin-rogers-dbt Oct 23, 2023
15ba1da
fix db test naming
colin-rogers-dbt Oct 23, 2023
ca33a23
confirm ODBC driver installed
colin-rogers-dbt Oct 23, 2023
6274d77
add odbc driver env var
colin-rogers-dbt Oct 23, 2023
0ba91a2
add odbc driver env var
colin-rogers-dbt Oct 23, 2023
f092026
specify platform
colin-rogers-dbt Oct 23, 2023
b968985
check odbc driver integrity
colin-rogers-dbt Oct 23, 2023
8a49567
add dbt user env var
colin-rogers-dbt Oct 23, 2023
7723e8d
add dbt user env var
colin-rogers-dbt Oct 23, 2023
ea5ebfa
fix host_name env var
colin-rogers-dbt Oct 23, 2023
610e5e9
try removing architecture arg
colin-rogers-dbt Oct 24, 2023
b4411ab
swap back to pull_request_target
colin-rogers-dbt Oct 24, 2023
cae6c8a
try running on host instead of container
colin-rogers-dbt Oct 24, 2023
0c68972
Update .github/workflows/integration.yml
colin-rogers-dbt Oct 24, 2023
b2f63bd
try running odbcinst -j
colin-rogers-dbt Oct 24, 2023
80eb7e4
remove bash
colin-rogers-dbt Oct 24, 2023
4bbfa71
add sudo
colin-rogers-dbt Oct 24, 2023
b1d2020
add sudo
colin-rogers-dbt Oct 24, 2023
38fda3d
update odbc.ini
colin-rogers-dbt Oct 24, 2023
6b599a1
install libsasl2-modules-gssapi-mit
colin-rogers-dbt Oct 24, 2023
0976c4f
install libsasl2-modules-gssapi-mit
colin-rogers-dbt Oct 24, 2023
42f2784
set -e on odbc install
colin-rogers-dbt Oct 24, 2023
4f11291
set -e on odbc install
colin-rogers-dbt Oct 24, 2023
1384084
set -e on odbc install
colin-rogers-dbt Oct 24, 2023
543e321
sudo echo odbc.inst
colin-rogers-dbt Oct 24, 2023
307a9af
Merge branch 'main' into migrateOffCircleCI
mikealfare Oct 27, 2023
f380d46
remove postgres components
mikealfare Nov 2, 2023
c334f32
remove release related items
mikealfare Nov 2, 2023
19dcff3
remove irrelevant output
mikealfare Nov 2, 2023
01b0c0c
move long bash script into its own file
mikealfare Nov 2, 2023
d3d2844
update integration.yml to align with other adapters
mikealfare Nov 2, 2023
94af018
Merge branch 'main' into migrateOffCircleCI
mikealfare Nov 2, 2023
72daf90
revert name change
mikealfare Nov 2, 2023
4f63a3c
Merge remote-tracking branch 'origin/migrateOffCircleCI' into migrate…
mikealfare Nov 2, 2023
b43c9d1
revert name change
mikealfare Nov 2, 2023
91715d2
combine databricks and spark tests
mikealfare Nov 2, 2023
943a8dc
combine databricks and spark tests
mikealfare Nov 2, 2023
3d0dece
Add dagger
colin-rogers-dbt Nov 30, 2023
080b816
remove platform
colin-rogers-dbt Nov 30, 2023
c8477ce
add dagger setup
colin-rogers-dbt Jan 8, 2024
c0a37ae
add dagger setup
colin-rogers-dbt Jan 8, 2024
9b9dc79
Merge branch 'main' into migrateOffCircleCI
colin-rogers-dbt Jan 8, 2024
8c6a745
set env vars
colin-rogers-dbt Jan 8, 2024
6a6b4ce
Merge remote-tracking branch 'origin/migrateOffCircleCI' into migrate…
colin-rogers-dbt Jan 8, 2024
1ae321a
install requirements
colin-rogers-dbt Jan 8, 2024
6361429
install requirements
colin-rogers-dbt Jan 8, 2024
6bca5dc
add DEFAULT_ENV_VARS and test_path arg
colin-rogers-dbt Jan 8, 2024
f4293e0
remove circle ci
colin-rogers-dbt Jan 8, 2024
d398065
formatting
colin-rogers-dbt Jan 9, 2024
6108d44
update changie
colin-rogers-dbt Jan 9, 2024
d472f3b
Update .changes/unreleased/Under the Hood-20230929-161218.yaml
colin-rogers-dbt Jan 9, 2024
ce92bcf
formatting fixes and simplify env_var handling
colin-rogers-dbt Jan 9, 2024
0c4ed9e
Merge remote-tracking branch 'origin/migrateOffCircleCI' into migrate…
colin-rogers-dbt Jan 9, 2024
56b14bc
remove tox, update CONTRIBUTING.md and cleanup GHA workflows
colin-rogers-dbt Jan 9, 2024
9849c1c
remove tox, update CONTRIBUTING.md and cleanup GHA workflows
colin-rogers-dbt Jan 9, 2024
f9a4c58
install test reqs in main.yml
colin-rogers-dbt Jan 9, 2024
bbe17a8
install test reqs in main.yml
colin-rogers-dbt Jan 9, 2024
3f44e96
formatting
colin-rogers-dbt Jan 9, 2024
afd3866
remove tox from dev-requirements.txt and Makefile
colin-rogers-dbt Jan 10, 2024
259ebc7
clarify spark crt instantiation
colin-rogers-dbt Jan 10, 2024
a8a7010
add comments on python-version
colin-rogers-dbt Jan 10, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions .changes/unreleased/Under the Hood-20230929-161218.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
kind: Under the Hood
body: Add GitHub action for integration testing and use dagger-io to run tests. Remove CircleCI workflow.
time: 2023-09-29T16:12:18.968755+02:00
custom:
Author: JCZuurmond, colin-rogers-dbt
Issue: "719"
136 changes: 0 additions & 136 deletions .circleci/config.yml

This file was deleted.

17 changes: 17 additions & 0 deletions .github/scripts/update_dbt_core_branch.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
#!/bin/bash -e
set -e

git_branch=$1
target_req_file="dev-requirements.txt"
core_req_sed_pattern="s|dbt-core.git.*#egg=dbt-core|dbt-core.git@${git_branch}#egg=dbt-core|g"
tests_req_sed_pattern="s|dbt-core.git.*#egg=dbt-tests|dbt-core.git@${git_branch}#egg=dbt-tests|g"
if [[ "$OSTYPE" == darwin* ]]; then
# mac ships with a different version of sed that requires a delimiter arg
sed -i "" "$core_req_sed_pattern" $target_req_file
sed -i "" "$tests_req_sed_pattern" $target_req_file
else
sed -i "$core_req_sed_pattern" $target_req_file
sed -i "$tests_req_sed_pattern" $target_req_file
fi
core_version=$(curl "https://raw.githubusercontent.com/dbt-labs/dbt-core/${git_branch}/core/dbt/version.py" | grep "__version__ = *"|cut -d'=' -f2)
bumpversion --allow-dirty --new-version "$core_version" major
160 changes: 160 additions & 0 deletions .github/workflows/integration.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,160 @@
# **what?**
# Runs integration tests.

# **why?**
# Ensure code runs as expected.

# **when?**
# This will run for all PRs, when code is pushed to a release
# branch, and when manually triggered.

name: Adapter Integration Tests

on:
push:
branches:
- "main"
- "*.latest"
- "releases/*"
emmyoop marked this conversation as resolved.
Show resolved Hide resolved
pull_request_target:
workflow_dispatch:
inputs:
dbt-core-branch:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ ❤️ ❤️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we get there are we going to also add inputs for dbt-adapter and dbt-common?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call out when we update all the adapters to use those packages we should update these to include them

description: "branch of dbt-core to use in dev-requirements.txt"
required: false
type: string

# explicitly turn off permissions for `GITHUB_TOKEN`
permissions: read-all

# will cancel previous workflows triggered by the same event and for the same ref for PRs or same SHA otherwise
concurrency:
group: ${{ github.workflow }}-${{ github.event_name }}-${{ contains(github.event_name, 'pull_request_target') && github.event.pull_request.head.ref || github.sha }}
cancel-in-progress: true

defaults:
run:
shell: bash

jobs:
# generate test metadata about what files changed and the testing matrix to use
test-metadata:
# run if not a PR from a forked repository or has a label to mark as safe to test
if: >-
github.event_name != 'pull_request_target' ||
github.event.pull_request.head.repo.full_name == github.repository ||
contains(github.event.pull_request.labels.*.name, 'ok to test')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this label still utilized?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed it is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is the label used? GitHub automatically doesn't run PRs from forks now. Is this leftover from CircleCI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is leftover from before github had a mechanism to prevent running tests from forks maybe? We have this in a bunch of repos, sounds like it can be removed which would be awesome

runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.generate-matrix.outputs.result }}

steps:
- name: Check out the repository (non-PR)
if: github.event_name != 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false

- name: Check out the repository (PR)
if: github.event_name == 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false
ref: ${{ github.event.pull_request.head.sha }}

- name: Check if relevant files changed
if: github.event_name == 'pull_request_target'
# https://github.com/marketplace/actions/paths-changes-filter
# For each filter, it sets output variable named by the filter to the text:
# 'true' - if any of changed files matches any of filter rules
# 'false' - if none of changed files matches any of filter rules
# also, returns:
# `changes` - JSON array with names of all filters matching any of the changed files
uses: dorny/paths-filter@v2
id: get-changes
with:
token: ${{ secrets.GITHUB_TOKEN }}
filters: |
spark:
- 'dbt/**'
- 'tests/**'
- 'dev-requirements.txt'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't seem to use the output from this job anywhere? We should also be able to accomplish the same thing via filters.

on:
  pull_request_target:
    types:
      - opened
    branches:
      - 'releases/**'
    paths:
      - '**.js'
    paths-ignore:
      - '.changes/*'
`


test:
name: ${{ matrix.test }} / python ${{ matrix.python-version }} / ubuntu-latest

# run if not a PR from a forked repository or has a label to mark as safe to test
# also checks that the matrix generated is not empty
if: >-
(
github.event_name != 'pull_request_target' ||
github.event.pull_request.head.repo.full_name == github.repository ||
contains(github.event.pull_request.labels.*.name, 'ok to test')
)
runs-on: ubuntu-latest

strategy:
fail-fast: false
matrix:
test:
- "apache_spark"
- "spark_session"
- "databricks_sql_endpoint"
- "databricks_cluster"
- "databricks_http_cluster"

env:
TOXENV: integration-${{ matrix.test }}
PYTEST_ADDOPTS: "-v --color=yes --csv integration_results.csv"
DBT_INVOCATION_ENV: github-actions
DD_CIVISIBILITY_AGENTLESS_ENABLED: true
DD_API_KEY: ${{ secrets.DATADOG_API_KEY }}
DD_SITE: datadoghq.com
DD_ENV: ci
DD_SERVICE: ${{ github.event.repository.name }}
DBT_DATABRICKS_CLUSTER_NAME: ${{ secrets.DBT_DATABRICKS_CLUSTER_NAME }}
DBT_DATABRICKS_HOST_NAME: ${{ secrets.DBT_DATABRICKS_HOST_NAME }}
DBT_DATABRICKS_ENDPOINT: ${{ secrets.DBT_DATABRICKS_ENDPOINT }}
DBT_DATABRICKS_TOKEN: ${{ secrets.DBT_DATABRICKS_TOKEN }}
DBT_DATABRICKS_USER: ${{ secrets.DBT_DATABRICKS_USERNAME }}
DBT_TEST_USER_1: "[email protected]"
DBT_TEST_USER_2: "[email protected]"
DBT_TEST_USER_3: "[email protected]"
ODBC_DRIVER: "Simba"

steps:
- name: Check out the repository
if: github.event_name != 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false

# explicitly checkout the branch for the PR,
# this is necessary for the `pull_request` event
colin-rogers-dbt marked this conversation as resolved.
Show resolved Hide resolved
- name: Check out the repository (PR)
if: github.event_name == 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false
ref: ${{ github.event.pull_request.head.sha }}

Comment on lines +86 to +92
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This defeats the purpose of using the pull_request_target trigger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you expand on this? Looks like this is what all our other integration test workflows do

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read up a bit more adn this does not defeat the purpose of pull_request_target` because we're already running the workflow from the head which is actually the point.

I understand this is how it's done in the other adapter repos but I'm not sure it's necessary. From the actions/checkout docs

    repository: ''

    # The branch, tag or SHA to checkout. When checking out the repository that
    # triggered a workflow, this defaults to the reference or SHA for that event.
    # Otherwise, uses the default branch.

So this feels like a over-complication of checkout.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline and we will create a ticket to explain the logic here but leave it as is to follow patterns in adapter repos.

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: "3.11"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not using a python matrix here. matrix.python-version will be empty and you're just setting it to always use 3.11 here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be tested against all supported python versions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the python version being used is actually configured based off the base image dagger uses for the test container as specified here: https://github.com/dbt-labs/dbt-spark/pull/923/files#diff-ca1b9f8fafcab3a8c6df7e520e725035642f58e54df0d88402a7a61fac5578b6R95

Since I don't think we're doing that in CircleCI today I don't think that's a blocker but I agree we should add that
in the near future. Ticket to track: #970


- name: Install python dependencies
run: |
python -m pip install --user --upgrade pip
python -m pip --version
python -m pip install -r dagger/requirements.txt

- name: Update dev_requirements.txt
if: inputs.dbt-core-branch != ''
run: |
pip install bumpversion
./.github/scripts/update_dbt_core_branch.sh ${{ inputs.dbt-core-branch }}


- name: Run tests for ${{ matrix.test }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
./.github/scripts/update_dbt_core_branch.sh ${{ inputs.dbt-core-branch }}
- name: Run tests for ${{ matrix.test }}
./.github/scripts/update_dbt_core_branch.sh ${{ inputs.dbt-core-branch }}
- name: Run tests for ${{ matrix.test }}

run: python dagger/run_dbt_spark_tests.py --profile ${{ matrix.test }}
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -44,3 +44,5 @@ test.env
.hive-metastore/
.spark-warehouse/
dbt-integration-tests
/.tool-versions
/.hypothesis/*
3 changes: 0 additions & 3 deletions README.md
emmyoop marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,6 @@
<a href="https://github.com/dbt-labs/dbt-spark/actions/workflows/main.yml">
<img src="https://github.com/dbt-labs/dbt-spark/actions/workflows/main.yml/badge.svg?event=push" alt="Unit Tests Badge"/>
</a>
<a href="https://circleci.com/gh/dbt-labs/dbt-spark/?branch=main">
<img src="https://circleci.com/gh/dbt-labs/dbt-spark/tree/main.svg?style=shield" alt="Integration Tests Badge"/>
</a>
Comment on lines -8 to -10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

</p>

**[dbt](https://www.getdbt.com/)** enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Expand Down
2 changes: 2 additions & 0 deletions dagger/requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
dagger-io~=0.8.0
python-dotenv
Loading
Loading