Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate Off Circle CI / To Github Actions + dagger.io #923

Merged
merged 91 commits into from
Jan 10, 2024

Conversation

colin-rogers-dbt
Copy link
Contributor

@colin-rogers-dbt colin-rogers-dbt commented Oct 18, 2023

resolves #719
docs dbt-labs/docs.getdbt.com/#

Problem

Solution

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@cla-bot cla-bot bot added the cla:yes label Oct 18, 2023
@colin-rogers-dbt
Copy link
Contributor Author

Successful run of integration tests: https://github.com/dbt-labs/dbt-spark/actions/runs/7455834000/job/20285527332

@@ -1,7 +1,7 @@
# install latest changes in dbt-core
# TODO: how to automate switching from develop to version branches?
git+https://github.com/dbt-labs/dbt-core.git#egg=dbt-core&subdirectory=core
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pinning as main is broken until we can migrate

Copy link
Member

@emmyoop emmyoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks awesome and is so exciting.

.github/workflows/integration.yml Outdated Show resolved Hide resolved
pull_request_target:
workflow_dispatch:
inputs:
dbt-core-branch:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ ❤️ ❤️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we get there are we going to also add inputs for dbt-adapter and dbt-common?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call out when we update all the adapters to use those packages we should update these to include them

if: >-
github.event_name != 'pull_request' ||
github.event.pull_request.head.repo.full_name == github.repository ||
contains(github.event.pull_request.labels.*.name, 'ok to test')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is the label used? GitHub automatically doesn't run PRs from forks now. Is this leftover from CircleCI?

Comment on lines 51 to 81
steps:
- name: Check out the repository (non-PR)
if: github.event_name != 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false

- name: Check out the repository (PR)
if: github.event_name == 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false
ref: ${{ github.event.pull_request.head.sha }}

- name: Check if relevant files changed
if: github.event_name == 'pull_request_target'
# https://github.com/marketplace/actions/paths-changes-filter
# For each filter, it sets output variable named by the filter to the text:
# 'true' - if any of changed files matches any of filter rules
# 'false' - if none of changed files matches any of filter rules
# also, returns:
# `changes` - JSON array with names of all filters matching any of the changed files
uses: dorny/paths-filter@v2
id: get-changes
with:
token: ${{ secrets.GITHUB_TOKEN }}
filters: |
spark:
- 'dbt/**'
- 'tests/**'
- 'dev-requirements.txt'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't seem to use the output from this job anywhere? We should also be able to accomplish the same thing via filters.

on:
  pull_request_target:
    types:
      - opened
    branches:
      - 'releases/**'
    paths:
      - '**.js'
    paths-ignore:
      - '.changes/*'
`

Comment on lines +134 to +140
- name: Check out the repository (PR)
if: github.event_name == 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false
ref: ${{ github.event.pull_request.head.sha }}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This defeats the purpose of using the pull_request_target trigger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you expand on this? Looks like this is what all our other integration test workflows do

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read up a bit more adn this does not defeat the purpose of pull_request_target` because we're already running the workflow from the head which is actually the point.

I understand this is how it's done in the other adapter repos but I'm not sure it's necessary. From the actions/checkout docs

    repository: ''

    # The branch, tag or SHA to checkout. When checking out the repository that
    # triggered a workflow, this defaults to the reference or SHA for that event.
    # Otherwise, uses the default branch.

So this feels like a over-complication of checkout.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline and we will create a ticket to explain the logic here but leave it as is to follow patterns in adapter repos.


elif test_profile == "spark_session":
tst_container = tst_container.with_exec(["pip", "install", "pyspark"])
tst_container = tst_container.with_exec(["apt-get", "install", "openjdk-17-jre", "-y"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So easy!

Comment on lines 9 to 11
echo "[Simba]\nDriver = /opt/simba/spark/lib/64/libsparkodbc_sb64.so" >> /etc/odbcinst.ini
dpkg -l | grep Simba # confirm that the driver is installed
export ODBC_DRIVER="/opt/simba/spark/lib/64/libsparkodbc_sb64.so"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment for why this can't be an env var sent in the workflow.

@@ -2,8 +2,8 @@ ARG OPENJDK_VERSION=8
FROM eclipse-temurin:${OPENJDK_VERSION}-jre

ARG BUILD_DATE
ARG SPARK_VERSION=3.3.2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this go back 2 minor versions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember... in either case no need to touch this now since we're not using it in CI

tox.ini Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the datadog env vars need to be defined in here under passenv (example)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call out, since we're not using tox we don't need passenv but we do need to pass the env vars to the container

.changes/unreleased/Under the Hood-20230929-161218.yaml Outdated Show resolved Hide resolved
if env_name.startswith(("DD_", "DBT_"))}

TESTING_ENV_VARS.update({
"ODBC_DRIVER": "/opt/simba/spark/lib/64/libsparkodbc_sb64.so",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is set here and not in the workflow since it's defined by the configure_odbc.sh script

@colin-rogers-dbt colin-rogers-dbt changed the title Migrate Off Circle CI Migrate Off Circle CI / To Github Actions + dagger.io Jan 9, 2024
Copy link
Member

@emmyoop emmyoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit more needed to get rid of tox. And I lean towards needing some kind of README in the dagger dir since this is a new tool for us.

Comment on lines +134 to +140
- name: Check out the repository (PR)
if: github.event_name == 'pull_request_target'
uses: actions/checkout@v3
with:
persist-credentials: false
ref: ${{ github.event.pull_request.head.sha }}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read up a bit more adn this does not defeat the purpose of pull_request_target` because we're already running the workflow from the head which is actually the point.

I understand this is how it's done in the other adapter repos but I'm not sure it's necessary. From the actions/checkout docs

    repository: ''

    # The branch, tag or SHA to checkout. When checking out the repository that
    # triggered a workflow, this defaults to the reference or SHA for that event.
    # Otherwise, uses the default branch.

So this feels like a over-complication of checkout.

Comment on lines +20 to +24
paths-ignore:
- ".changes/**"
- ".flake8"
- ".gitignore"
- "**.md"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ you've inspired me dbt-labs/dbt-core#9355

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: "3.11"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be tested against all supported python versions?

tox.ini Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Makefile still references tox.

Copy link
Member

@emmyoop emmyoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woohoo!

@colin-rogers-dbt colin-rogers-dbt merged commit f9f75e9 into main Jan 10, 2024
11 checks passed
@colin-rogers-dbt colin-rogers-dbt deleted the migrateOffCircleCI branch January 10, 2024 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Initial Conversion to GitHub Actions
6 participants