Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace environment variable with a project flag to gate microbatch functionality #10799

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

MichelleArk
Copy link
Contributor

@MichelleArk MichelleArk commented Sep 30, 2024

Resolves #10798

Problem

We had gated the new microbatch feature behind an environment variable in the initial implementation of microbatch (#10594). However, for a better experience, we want people to be setting a project flag (AKA behavior flag).

Solution

Gate microbatch functionality behind a project flag (AKA behavior flag) instead of an environment variable.

Checklist

  • I have read the contributing guide and understand what's expected of me.
  • I have run this code in development, and it appears to resolve the stated issue.
  • This PR includes tests, or tests are not required or relevant for this PR.
  • This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
  • This PR includes type annotations for new and modified functions.

@cla-bot cla-bot bot added the cla:yes label Sep 30, 2024
Copy link
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

Copy link

codecov bot commented Sep 30, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 1 line in your changes missing coverage. Please review.

Project coverage is 84.46%. Comparing base (0c51985) to head (db3fba8).

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10799      +/-   ##
==========================================
- Coverage   89.08%   84.46%   -4.63%     
==========================================
  Files         183      183              
  Lines       23580    23591      +11     
==========================================
- Hits        21006    19925    -1081     
- Misses       2574     3666    +1092     
Flag Coverage Δ
integration 84.46% <92.85%> (-1.92%) ⬇️
unit ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
Unit Tests 84.46% <92.85%> (-4.63%) ⬇️
Integration Tests 84.46% <92.85%> (-1.92%) ⬇️

…with behavior flag changes

This is needed to get the tests to pass. This is only necessary until the changes in
dbt-adapters are merged. Said another way: we need to merge the changes to dbt-adapters
first, and then revert these dependency changes before merging these changes.
@QMalcolm QMalcolm marked this pull request as ready for review October 1, 2024 21:25
@QMalcolm QMalcolm requested a review from a team as a code owner October 1, 2024 21:25
@QMalcolm QMalcolm changed the title first pass: replace os env with project flag Replace environment variable with a project flag to gate micorbatch functionality Oct 1, 2024
Comment on lines -1 to +2
git+https://github.com/dbt-labs/dbt-adapters.git@main
git+https://github.com/dbt-labs/dbt-adapters.git@main#subdirectory=dbt-tests-adapter
git+https://github.com/dbt-labs/dbt-adapters.git@microbatch-behavior-flag
git+https://github.com/dbt-labs/dbt-adapters.git@microbatch-behavior-flag#subdirectory=dbt-tests-adapter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to get the tests to pass. This is only necessary until the changes in dbt-adapters are merged. Said another way: we need to merge the changes to dbt-adapters first, and then revert these dependency changes before merging this PR's changes.

@QMalcolm QMalcolm requested a review from a team October 1, 2024 22:00
@MichelleArk MichelleArk changed the title Replace environment variable with a project flag to gate micorbatch functionality Replace environment variable with a project flag to gate microbatch functionality Nov 4, 2024
…o `find_materialization_macro_candidate_by_name`

This is necessary because in the following commit we're going to add a function to get
whether or not a manifest should be run using the new microbatch functionality. Getting
the MacroCandidate is necessary because we need to know the locality of the macro.
…ity should be used

The new microbatch functionality is, unfortunately, potentially dangerous. That is
it adds a new materalization strategy `microbatch` which an end user could have
defined as a custom strategy previously. Additionally we added config keys to nodes,
and as `config` is just a Dict[str, Any], it could contain anything, thus meaning
people could already be using the configs we're adding for different purposes. Thus
we need some intellegent gating. Specifically something that adheres to the following:

cms = Custom Microbatch Strategy
abms = Adapter Builtin Microbatch Strategy
bf = Behavior flag
umb = Use Microbatch Batching
t/f/e = True/False/Error

| cms | abms | bf | umb |
| t   | t    | t  | t   |
| f   | t    | t  | t   |
| t   | f    | t  | t   |
| f   | f    | t  | e   |
| t   | t    | f  | f   |
| f   | t    | f  | t   |
| t   | f    | f  | f   |
| f   | f    | f  | e   |

(The above table assumes that there is a microbatch model present in the project)

In order to achieve this we need to check that either the microbatch behavior
flag is set to true OR microbatch materializaion being used is the _root_ microbatch
materialization (i.e. not custom). The function we added in this commit,
`use_microbatch_batches`, does just that.
In 0349968 I had done this for the function
`find_materialization_macro_by_name`, but that wasn't the right function to
do it to, and will be reverted shortly. `find_materialization_macro_by_name`
is used for finding the general materialization macro, whereas `find_macro_by_name`
is more general. For the work we're doing, we need to find the microbatch
macro, which is not a materialization macro.
…_name` to `find_materialization_macro_candidate_by_name`"

This reverts commit 0349968.
…tead of `root`

Previously were were checking for a locality of `root`. However, a locality
of `root` means it was provided by a `package`. We wnt to check for locality
of `core` which basically means `builtin via dbt-core/adapters`. There is
another locality `imported` which I beleive means it comes from another
package.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace DBT_EXPERIMENTAL_MICROBATCH env var with behaviour flag
2 participants