Rephrase the two criteria for resolving `ref` calls with the `--defer` flag
#6332
Labels: content (improvements or additions to content), improvement (use this when an area of the docs needs improvement as it's currently unclear)
Contributions
Link to the page on docs.getdbt.com requiring updates
https://docs.getdbt.com/reference/node-selection/defer#usage
What part(s) of the page would you like to see updated?
Our current documentation reads like this:
The logic isn't easy to follow, especially when `--favor-state` is in the mix. So I propose using one of the following variants instead (which all say the same thing in slightly different ways):
Option 1
With `--defer`, dbt decides between the `target` namespace or the state manifest’s namespace. By default, it uses the `target` namespace. But if the node doesn’t exist in the database or `--favor-state` is set, and the node is not included in the selected nodes, it uses the state manifest.
Option 2
When `--defer` is used, dbt resolves `ref` calls like this:
- […] `--favor-state` is set), the state manifest is used.
- […] `target` namespace.
Option 3
With `--defer`, dbt chooses the state manifest if:
- […] `--favor-state` is used).
Otherwise, it defaults to the `target` namespace.
Option 4
With `--defer`, if the node isn’t included and doesn’t exist in the database (or `--favor-state` is set), dbt uses the state manifest. Otherwise, it defaults to `target`.
Option 5
By default, dbt uses the `target` namespace to resolve `ref` calls.
But with `--defer`, dbt will use the state manifest instead if:
- […] `--favor-state` is used).
Option 6
With `--defer`, dbt will use the state manifest to resolve `ref` calls if:
- […] `--favor-state` is used).
Otherwise it uses the default `target` namespace as normal.
Option 7
For the selected nodes, dbt always uses the default `target` namespace to resolve `ref` calls, no matter what.
But for any nodes not included in the selection, dbt will use the state manifest instead if:
- `--defer` is used and
- `--favor-state` is used (or the node doesn’t exist in the database)
Additional information
Key intuition
The value prop of deferral is that it lets you safely run a subset of the DAG without the time and cost of first building all the upstream parents.
When running a subset of your DAG (like CI), there might be references to database objects that don't exist and won't be built by the command. To avoid database errors, `--defer` is handy for creating a fall-back for these `ref`s -- you don't need to bother creating a missing database object if you have state that you can defer to. (But if you always want to use the state manifest rather than checking if the database object exists or not, then add `--favor-state`.)
Why check for existence in the database?
If the object already exists in the database, then there is no need to defer to a separate state manifest -- just use the database object that already exists.
(For use cases where the state manifest is preferred even when the database object exists, use `--favor-state`.)
Why check if the node is included in the selection or not?
If the node is included in the selection, then it will be built and able to be referenced safely because it will exist in the database.
It's only the nodes not included in the selection that need to be worried about and handled.
References that might not exist in the database because they aren't in the selection
There are two different options for choosing how to handle these references:
- […]
- […] (`--favor-state`)
This most closely aligns with the phrasing in Option 7.
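The decision described in this issue can also be sketched as a small function. This is only an illustration in Python of the logic as phrased in the options above, not dbt's actual implementation; the function name `resolve_namespace` and its parameters are hypothetical, and the sketch assumes the referenced node is present in the state manifest.

```python
from itertools import product


def resolve_namespace(
    selected: bool,
    exists_in_target: bool,
    defer: bool = False,
    favor_state: bool = False,
) -> str:
    """Return "target" or "state" for a single ref() call.

    Hypothetical helper: it mirrors the wording proposed in this issue,
    not dbt's internal code.
    """
    # Without --defer, refs always resolve in the target namespace.
    if not defer:
        return "target"
    # Selected nodes will be built by this command, so they are safe
    # to reference in the target namespace.
    if selected:
        return "target"
    # Unselected nodes: defer if the object is missing from the database,
    # or unconditionally when --favor-state is set.
    if favor_state or not exists_in_target:
        return "state"
    return "target"


if __name__ == "__main__":
    # Print every combination with --defer enabled.
    for selected, exists, favor in product([True, False], repeat=3):
        result = resolve_namespace(selected, exists, defer=True, favor_state=favor)
        print(f"selected={selected} exists={exists} favor_state={favor} -> {result}")
```

Written this way, the selection check comes first, which matches the framing in Option 7: only nodes outside the selection are ever candidates for deferral.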
Questions
There are two questions a user might ask:
Ideally, our documentation would allow someone to answer either of those questions. We might need to phrase the key ideas in two different ways to accomplish this. But we might be able to answer both with a single brief explanation as well. 🤷