dbt plugin for Palm CLI
This plugin adds dbt-specific commands for use with Palm CLI
Install this plugin along with palm
pip install palm-dbt
Or from source
python3 -m pip install .
To configure your project to use the palm-dbt plugin, you will need a .palm/config.yaml file. You can create one by running palm init. Once you have your config file, add the palm-dbt plugin with the following configuration:
plugins:
- dbt
Check the version of palm-dbt inside a project in which you have configured palm with the dbt plugin:
palm plugin versions
palm-dbt ships with a command to containerize and convert your existing dbt project.
For example, if you wanted to containerize your existing dbt project running on 0.21.0, you would run:
palm containerize --version 0.21.0
palm-dbt uses the git branch name to set the schema for all commands via env vars. This allows palm to clean up test data after each run, ensuring that your data warehouse stays clean and free of development/test data.
To enable this functionality, palm-dbt ships with 2 macros that handle schema naming and cleanup:
- generate_schema_name - This macro overrides the dbt-core macro to auto-generate a schema name based on your current git branch and PALM_DBT_ENV.
- drop_branch_schemas - This macro uses the branch-named schema and the TEST database to clean up any models generated by running dbt in development or test environments. Calls to this macro are baked in to many of the palm dbt commands.
See the section about the palm dbt naming macros below for more information.
To install these macros, run palm install from within a project that is configured to use the palm-dbt plugin.
To keep your runs idempotent, we recommend that you do not run palm-dbt commands against main, master, or any other production-like branches you may be using.
To prevent palm from running against specific branches, add the following config to your project's .palm/config.yaml:
protected_branches:
- main
- master
# Any other branches you want to protect
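The branch guard described above can be pictured with a minimal Python sketch. This is an illustration only, not palm's actual implementation: the function name is_protected and the already-parsed config dictionary are hypothetical, and exact-match comparison of branch names is an assumption.

```python
# Hypothetical sketch of a protected-branch check; palm's real logic may differ.

def is_protected(branch, protected_branches):
    """Return True if the current branch is listed as protected.

    Assumes exact, case-sensitive matching against the configured names.
    """
    return branch.strip() in set(protected_branches)


# Stand-in for the parsed contents of .palm/config.yaml (assumed shape).
config = {"protected_branches": ["main", "master"]}

blocked = is_protected("main", config["protected_branches"])
allowed = not is_protected("feature/new-model", config["protected_branches"])
```

A command wrapper could call a check like this before touching the warehouse and exit early when it returns True.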
One of the most painful parts of data testing is shared mutable state. palm-dbt eliminates this problem by namespacing each run of dbt. For git branches other than main or master, palm prefixes the calculated schema name with a formatted version of your branch name. In CI, this is additionally prefixed with "CI". For example:
- you open a branch FEATURE/DATA-100/update-widget
- when you palm run in your local env, the schema public will be built as feature_data_100_update_widget_public. The schema sales will be built as feature_data_100_update_widget_sales.
- in CI the schemas will be ci_feature_data_100_update_widget_public, ci_feature_data_100_update_widget_sales (respectively).
- in prod the schemas will be 'public' and 'sales' (respectively).
Refs will automatically update as well. This way, you can use a single test database and not worry about conflicts between developers, or between branches for the same developer (like during hotfixes).
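The naming rule in the example above can be sketched in Python. The real logic lives in the generate_schema_name dbt macro, so treat this as an approximation: the function names, the env values ("development", "ci", "prod"), and the exact normalization (lowercase, runs of non-alphanumeric characters collapsed to one underscore) are assumptions chosen to reproduce the schema names listed above.

```python
import re

# Hypothetical sketch of branch-based schema naming; not the actual macro.

def format_branch(branch):
    """Lowercase the branch and collapse non-alphanumeric runs into '_'."""
    return re.sub(r"[^a-z0-9]+", "_", branch.lower()).strip("_")


def namespaced_schema(schema, branch, env="development",
                      protected=("main", "master")):
    """Prefix the schema with the formatted branch name, plus 'ci_' in CI.

    The env values used here are assumed for illustration.
    """
    if env == "prod" or branch in protected:
        return schema  # production schemas keep their plain names
    prefix = format_branch(branch)
    if env == "ci":
        prefix = "ci_" + prefix
    return f"{prefix}_{schema}"


branch = "FEATURE/DATA-100/update-widget"
local_public = namespaced_schema("public", branch)
ci_sales = namespaced_schema("sales", branch, env="ci")
prod_public = namespaced_schema("public", "main", env="prod")
```

With the branch from the example, local_public is feature_data_100_update_widget_public, ci_sales is ci_feature_data_100_update_widget_sales, and prod_public is public.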
In palm-dbt we have determined that running dbt deps before every command is problematic for a few reasons:
- It takes time, slowing down development, CI, and every production run.
- If dbt hub or GitHub has an outage, our dbt commands fail and remain broken until the upstream error is resolved.
- If you forget to run dbt deps, the resulting error messages can be quite confusing.
To solve these problems, we have decided that running dbt clean && dbt deps should happen in the Dockerfile, when the image is being built.
To support this decision your project must do the following:
- Include RUN dbt clean && dbt deps in the Dockerfile.
- Include a volume entry in the docker-compose.yaml for the dbt_modules directory, like this: - /app/{{packages_dir}}. This prevents the .:/app volume from blowing away the deps generated when the image was built.
If you use palm containerize, this will be done for you!
Additionally, if you need to make changes to your deps, you should use palm build to rebuild the image, which will update your deps!
From a non-protected branch, running palm run will:
- drop (if it exists) the namespaced schema in development
- create the namespaced schema in development
- seed and run
- drop the namespaced schema in development
Why drop it? So your testing is atomic.
Want to persist it? Use the flag --persist.
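The run lifecycle listed above can be summarized as a small Python sketch. It only builds a list of human-readable step descriptions; it does not call dbt or a warehouse. The function name plan_run and its persist parameter are hypothetical stand-ins for the behavior of palm run and its --persist flag.

```python
# Hypothetical sketch of the palm run lifecycle; step labels are descriptive only.

def plan_run(schema, persist=False):
    """Return the ordered steps for one run against a namespaced schema."""
    steps = [
        f"drop schema {schema} if it exists",
        f"create schema {schema}",
        "seed",
        "run",
    ]
    if not persist:
        # The final drop is skipped when the schema should be kept.
        steps.append(f"drop schema {schema}")
    return steps


default_plan = plan_run("feature_data_100_update_widget_public")
persisted_plan = plan_run("feature_data_100_update_widget_public", persist=True)
```

In this sketch the default plan ends by dropping the schema, while the persisted plan stops after the run step so the models remain available for inspection.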