-
Notifications
You must be signed in to change notification settings - Fork 1.6k
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The future of vars #4938
Comments
Vibes-wise, I'm most attached to the problems solved by Vibe 2, but I disagree with the approach to get there:
What about a second type of macro? We could take inspiration from Redshift UDFs (and likely other things) which can attach hints (VOLATILE | STABLE | IMMUTABLE) to indicate whether the planner can recycle the same output for a given input. It would be cool to be able to do something like this:
and only allow this subset of macros in yaml files, where they would have access to Functionally, this would be identical to the variables envisaged above. I think it would aid in a cleaner mental model though - conceptually I think of vars as constant, exported values, not as snippets of code in their own right. I could swing either way on whether we allow things like |
Regarding Vibe 2 I suggest the following extension on vars: Folder-level variables Imagine that we could define variables in the Example: vars:
# global vars:
test_lake_load_ts_predicate: "(_lake_load_ts > (current_timestamp - interval '3 days'))"
test_mod_ts_predicate: "(_mod_ts > (current_timestamp - interval '3 days'))"
# package-specific vars:
###### ↓ current implementation ↓ ######
dbt_artifacts:
dbt_artifacts_database: "lp_lake"
dbt_artifacts_schema: "raw_dbt_artifacts"
dbt_artifacts_table: "lp_prodb"
###### ↑ current implementation ↑ ######
models:
# general model settings:
+persist_docs:
relation: true
columns: true
+post-hook:
- "{{ dbt_snow_mask.apply_masking_policy() }}"
# packages:
dbt_artifacts:
# NOTE: To me it feels like a good idea to put package-specific variables
# rather here than in the global vars block above!
###### ↓ proposed change ↓ ######
vars:
dbt_artifacts_database: "lp_lake"
dbt_artifacts_schema: "raw_dbt_artifacts"
dbt_artifacts_table: "lp_prodb"
###### ↑ proposed change ↑ ######
+schema: core_dbt_artifacts
staging:
+schema: orig_dbt_artifacts
# this project:
this_dbt_project:
mart_whatever:
+schema: mart_whatever
description: A generic data mart
###### ↓ proposed change ↓ ######
vars:
source_prefix: "whatever"
database_name: "whatever_2"
###### ↑ proposed change ↑ ######
Side note: As you can see, I also propose to put package-specific vars rather (or alternatively/additionally) into the models section of the specific package. What do you all think about this? |
Would be nice if one could use secret env vars also in models (i.e., outside of profiles.yml or packages.yml). My use case is creating a temp view over a JDBC connection in a pre-hook to access some data and I would need a way to securely pass the credentials. |
I'm going to turn this into a GitHub discussion, since we're using those more frequently — for Bigger Improvements, in addition to Very Big Ideas. We don't have immediate next steps, or concrete plans of when to schedule this work, but it's feedback (and confusion) that we get pretty consistently. By the time of dbt v2, I'd like to see us have better, more consistent, more intuitive answers for conditional configuration. |
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Thanks to @gshank @joellabes @ChenyuLInx for a great discussion yesterday about all things
vars
! I'll do my best to summarize below.Prompted by requests like:
vars
should be definable outsidedbt_project.yml
(Defining vars, folder-level configs outside dbt_project.yml #2955)vars
should be available for resource configs defined indbt_project.yml
([CT-369] [Feature] Use variables declared in dbt_project.yml file in the file #4873)vars
(defined indbt_project.yml
) can be set to Jinja-rendered expressions (target-based Jinja rendering of vars in dbt_project.yml #3105, Variables inproject.yml
that use jinja templating are not properly rendering when referenced in packages #3658)generate_database_name
does not apply to sources #4753)When to recommend...?
target
-based logic.sql
files, but not in.yml
files)generate_x_name
macros (database/schema/alias of "executable" nodes, but not for sources)Do we have any strong recommendations around when to use vars, and when to use env vars? I don't think so, at least not well-documented. Here are my own strong-ish opinions:
Env vars
profiles.yml
into version control), or git-related info inpackages.yml
Vars
--vars
)One small note: dbt is smarter about partial parsing today with env vars (an actual partial re-parse) than with vars (will trigger a full re-parse, whether the change is in
--vars
or indbt_project.yml
). This is a temporary technical limitation only, and we plan to reach eventual parity.Other types of config
target
-based Jinja conditional expressions are nice when you want to surface the logic very close to where it's being used:{{ 3 if target.name == 'dev' else 100 }}
days of data for a backfill, for instance.generate_x_name
macros have a leg up on the others, in that they have access to everything: all vars, env vars,target
info, and the fullnode
specification of the model/seed/snapshot/test. So if you have highly specific rules about what goes where (if X config, Y tag, Z materialization type, ...), the macro is the right place to encode that "rule." This logic just can't be (re)used anywhere elseForward-looking vibes
While some of these changes are not mutually exclusive, I sensed two
Vibe 1: Vars are exactly like env vars.
You can (already) define both on the CLI:
They should available in all the same places and rendering contexts. That includes
profiles.yml
+packages.yml
(places where only--vars
are available today).Vars can be defined in a
.yml
or.env
file, in addition todbt_project.yml
. They're parsed first, regardless of where they're defined—even before profile information.Vibe 2: Vars are a way to DRY up your project
Vars save and render static Jinja expressions, which can take env vars +
target
values as their inputs.Following the example in #4753, imagine you've got:
We've proposed solving that with better source configurations (#3662), crucially the ability to configure multiple sources at once. But look again: Those configs are just slightly different.
The immediate instinct is to replace this with a macro, until you realize that macros aren't supported in
.yml
files today—at this point, primarily for reasons of rendering speed / parsing performance. And we can't tell a "pure" macro (static inputs + outputs) from a "dirty" one.Instead, what if you could "save" a reusable Jinja expression / static snippet, in the form of a rendered var?
This feels much more intuitive for vars that are set in
dbt_project.yml
, versus the ones passed in for--vars
. In fact, it feels like a bit of a disjuncture between them. Do we need a third kind of var...?The text was updated successfully, but these errors were encountered: