Updated upgrading to v1.9 guide to include parallel batch execution and added links to incremental microbatch page (#6608)

## What are you changing in this pull request and why?

I've created this PR to update the upgrading to v1.9 guide to include parallel batch execution and to add links to the incremental microbatch page.

## Checklist
- [ ] I have reviewed the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
so my content adheres to these guidelines.
- [ ] The topic I'm writing about is for specific dbt version(s) and I
have versioned it according to the [version a whole
page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
and/or [version a block of
content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content)
guidelines.
- [ ] I have added checklist item(s) to this list for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch."
- [ ] The content in this PR requires a dbt release note, so I added one
to the [release notes
page](https://docs.getdbt.com/docs/dbt-versions/dbt-cloud-release-notes).
<!--
PRE-RELEASE VERSION OF dbt (if so, uncomment):
- [ ] Add a note to the prerelease version [Migration
Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/docs/dbt-versions/core-upgrade)
-->
<!-- 
ADDING OR REMOVING PAGES (if so, uncomment):
- [ ] Add/remove page in `website/sidebars.js`
- [ ] Provide a unique filename for new pages
- [ ] Add an entry for deleted pages in `website/vercel.json`
- [ ] Run link testing locally with `npm run build` to update the links
that point to deleted pages
-->

<!-- vercel-deployment-preview -->
---
🚀 Deployment available! Here are the direct links to the updated files:


- https://docs-getdbt-com-git-new-branch-name-1-dbt-labs.vercel.app/docs/build/incremental-microbatch
- https://docs-getdbt-com-git-new-branch-name-1-dbt-labs.vercel.app/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9

<!-- end-vercel-deployment-preview -->

---------

Co-authored-by: Mirna Wong <[email protected]>
Co-authored-by: Leona B. Campbell <[email protected]>
3 people authored Dec 6, 2024
1 parent d873999 commit 7f824a4
Showing 2 changed files with 4 additions and 1 deletion.
3 changes: 2 additions & 1 deletion website/docs/docs/build/incremental-microbatch.md
@@ -25,7 +25,8 @@ Incremental models in dbt are a [materialization](/docs/build/materializations)
Microbatch is an incremental strategy designed for large time-series datasets:
- It relies solely on a time column ([`event_time`](/reference/resource-configs/event-time)) to define time-based ranges for filtering. Set the `event_time` column for your microbatch model and its direct parents (upstream models); a sample configuration follows this list. Note that this is different from `partition_by`, which groups rows into partitions.
- It complements, rather than replaces, existing incremental strategies by focusing on efficiency and simplicity in batch processing.
- Unlike traditional incremental strategies, microbatch doesn't require implementing complex conditional logic for [backfilling](#backfills).
- Unlike traditional incremental strategies, microbatch enables you to [reprocess failed batches](/docs/build/incremental-microbatch#retry), supports auto-detected [parallel batch execution](#parallel-batch-execution), and eliminates the need to implement complex conditional logic for [backfilling](#backfills).

- Note that microbatch might not be the best strategy for all use cases. Consider other strategies if you don't have a reliable `event_time` column or if you want more control over the incremental logic. Read more in [How `microbatch` compares to other incremental strategies](#how-microbatch-compares-to-other-incremental-strategies).
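For context, here's a minimal sketch of what a microbatch model can look like; the model and column names (`sessions`, `session_start`, `stg_sessions`) are hypothetical:

```sql
-- models/sessions.sql (hypothetical model and column names)
-- event_time: the column that assigns each row to a batch
-- begin: the earliest date to process on an initial or full build
-- batch_size: the granularity of each batch (here, one batch per day)
-- lookback: how many recent batches to reprocess on each run
{{
    config(
        materialized='incremental',
        incremental_strategy='microbatch',
        event_time='session_start',
        begin='2024-01-01',
        batch_size='day',
        lookback=3
    )
}}

select * from {{ ref('stg_sessions') }}
```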

### How microbatch works
2 changes: 2 additions & 0 deletions website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md
@@ -49,6 +49,8 @@ Starting in Core 1.9, you can use the new [microbatch strategy](/docs/build/incremental-microbatch)
- Simplified query design: Write your model query for a single batch of data. dbt will use your `event_time`, `lookback`, and `batch_size` configurations to automatically generate the necessary filters for you, making the process more streamlined and reducing the need for you to manage these details.
- Independent batch processing: dbt automatically breaks down the data to load into smaller batches based on the specified `batch_size` and processes each batch independently, improving efficiency and reducing the risk of query timeouts. If some of your batches fail, you can use `dbt retry` to load only the failed batches.
- Targeted reprocessing: To load a *specific* batch or batches, you can use the CLI arguments `--event-time-start` and `--event-time-end`.
- [Automatic parallel batch execution](/docs/build/incremental-microbatch#parallel-batch-execution): Process multiple batches at the same time, instead of one after the other (sequentially), for faster processing of your microbatch models. dbt intelligently auto-detects whether your batches can run in parallel, while also allowing you to manually override parallel execution with the `concurrent_batches` config. See the example commands after this list.
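As a rough sketch of how the targeted reprocessing and retry options fit together on the command line (the model name `sessions` is hypothetical):

```shell
# Backfill a specific range of batches for one model
dbt run --select sessions --event-time-start "2024-09-01" --event-time-end "2024-09-04"

# If some batches fail during a run, reprocess only the failed batches
dbt retry
```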


Currently, microbatch is supported on these adapters, with more to come:
* postgres
