add event_time page #6383

Merged
merged 71 commits into from
Nov 19, 2024
Changes from 48 commits
46231b0
add event_time page
mirnawong1 Oct 30, 2024
33a66a8
Merge branch 'current' into add-event-time
mirnawong1 Oct 30, 2024
6ebf5eb
update source/snapshots
mirnawong1 Oct 30, 2024
501d948
add to model
mirnawong1 Oct 30, 2024
4f2c6dc
add img and rn
mirnawong1 Oct 30, 2024
57ee608
fix link
mirnawong1 Oct 30, 2024
451fc46
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Oct 30, 2024
3354c9d
Update website/docs/reference/resource-configs/event-time.md
mirnawong1 Oct 30, 2024
603c21c
fix link again
mirnawong1 Oct 30, 2024
0c68f62
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Oct 30, 2024
488460c
Update event-time.md
mirnawong1 Oct 30, 2024
2fb62c5
Update release-notes.md
mirnawong1 Oct 30, 2024
69ba339
Update event-time.md
mirnawong1 Oct 30, 2024
1ebbbdb
Update advanced-ci.md
mirnawong1 Oct 30, 2024
2b713ee
Update advanced-ci.md
mirnawong1 Oct 30, 2024
c789601
Update advanced-ci.md
mirnawong1 Oct 30, 2024
5708119
Update website/docs/docs/deploy/advanced-ci.md
mirnawong1 Nov 4, 2024
903c5d1
Merge branch 'current' into add-event-time
mirnawong1 Nov 4, 2024
2dd873a
Update website/docs/docs/deploy/advanced-ci.md
mirnawong1 Nov 4, 2024
b7a07be
Update website/docs/reference/resource-configs/event-time.md
mirnawong1 Nov 4, 2024
12cdffa
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 4, 2024
016c555
Update event-time.md
mirnawong1 Nov 4, 2024
9c49664
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 4, 2024
2910914
Merge branch 'current' into add-event-time
mirnawong1 Nov 4, 2024
5ba059e
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 4, 2024
79128fe
Merge branch 'current' into add-event-time
mirnawong1 Nov 4, 2024
cc34575
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 5, 2024
551821d
update img
mirnawong1 Nov 6, 2024
735ae38
fix img size
mirnawong1 Nov 6, 2024
ac7616b
Merge branch 'current' into add-event-time
mirnawong1 Nov 6, 2024
bdc037e
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 6, 2024
0363051
Update website/docs/reference/resource-configs/event-time.md
mirnawong1 Nov 6, 2024
d693c9b
Update website/docs/reference/resource-configs/event-time.md
mirnawong1 Nov 6, 2024
809f2a7
Update website/docs/reference/resource-configs/event-time.md
mirnawong1 Nov 6, 2024
81e2318
Update website/docs/reference/resource-configs/event-time.md
mirnawong1 Nov 6, 2024
14632b3
Merge branch 'current' into add-event-time
mirnawong1 Nov 6, 2024
aad3987
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 6, 2024
e92c9db
Update website/docs/reference/source-configs.md
mirnawong1 Nov 6, 2024
2b98454
Merge branch 'current' into add-event-time
mirnawong1 Nov 11, 2024
3ad1bb6
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 11, 2024
3da521f
Update website/docs/docs/deploy/advanced-ci.md
mirnawong1 Nov 11, 2024
edd1123
add scenarios
mirnawong1 Nov 11, 2024
52c0db9
add scenarios
mirnawong1 Nov 11, 2024
f461ffa
fold in grace's feedback
mirnawong1 Nov 11, 2024
a4f3b23
Merge branch 'current' into add-event-time
mirnawong1 Nov 11, 2024
a1c8166
Merge branch 'add-event-time' of github.com:dbt-labs/docs.getdbt.com …
mirnawong1 Nov 11, 2024
f1969f4
remove redundant
mirnawong1 Nov 11, 2024
c170a3b
Merge branch 'current' into add-event-time
mirnawong1 Nov 14, 2024
d6a309b
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 14, 2024
3a8dee5
Update website/docs/reference/resource-configs/event-time.md
mirnawong1 Nov 14, 2024
5851c2b
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 14, 2024
0656327
Update release-notes.md
mirnawong1 Nov 14, 2024
57679b2
Merge branch 'current' into add-event-time
mirnawong1 Nov 14, 2024
556249a
Update website/docs/docs/dbt-versions/release-notes.md
mirnawong1 Nov 15, 2024
0bd8584
Update website/docs/docs/deploy/advanced-ci.md
mirnawong1 Nov 15, 2024
bd233ad
Update website/docs/docs/deploy/advanced-ci.md
mirnawong1 Nov 15, 2024
b9e4be0
Update website/docs/reference/resource-configs/event-time.md
mirnawong1 Nov 15, 2024
ff3416a
Update website/docs/docs/deploy/advanced-ci.md
mirnawong1 Nov 15, 2024
613f1ef
Update website/docs/docs/deploy/advanced-ci.md
mirnawong1 Nov 15, 2024
85f181d
Update website/docs/reference/resource-configs/event-time.md
mirnawong1 Nov 15, 2024
4644684
Merge branch 'current' into add-event-time
mirnawong1 Nov 18, 2024
8cf073b
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 18, 2024
46763d8
Merge branch 'current' into add-event-time
mirnawong1 Nov 18, 2024
d2bf5af
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 18, 2024
6015dee
Update website/docs/docs/deploy/advanced-ci.md
mirnawong1 Nov 18, 2024
4250c9d
Update website/docs/docs/deploy/advanced-ci.md
mirnawong1 Nov 18, 2024
76b12e9
update header adn link
mirnawong1 Nov 19, 2024
de8f752
Merge branch 'current' into add-event-time
mirnawong1 Nov 19, 2024
4b28bbc
Merge branch 'add-event-time' into update-sources-snapshots
mirnawong1 Nov 19, 2024
337248b
add event _time to sources/snapshots/models/seeds (#6384)
mirnawong1 Nov 19, 2024
0e16ca6
Update incremental-microbatch.md
mirnawong1 Nov 19, 2024
4 changes: 2 additions & 2 deletions website/docs/docs/build/incremental-microbatch.md
@@ -20,7 +20,7 @@ Refer to [Supported incremental strategies by adapter](/docs/build/incremental-s

Incremental models in dbt are a [materialization](/docs/build/materializations) designed to efficiently update your data warehouse tables by only transforming and loading _new or changed data_ since the last run. Instead of reprocessing an entire dataset every time, incremental models process a smaller number of rows, and then append, update, or replace those rows in the existing table. This can significantly reduce the time and resources required for your data transformations.

Microbatch incremental models make it possible to process transformations on very large time-series datasets with efficiency and resiliency. When dbt runs a microbatch model — whether for the first time, during incremental runs, or in specified backfills — it will split the processing into multiple queries (or "batches"), based on the `event_time` and `batch_size` you configure.
Microbatch incremental models make it possible to process transformations on very large time-series datasets with efficiency and resiliency. When dbt runs a microbatch model — whether for the first time, during incremental runs, or in specified backfills — it will split the processing into multiple queries (or "batches"), based on the [`event_time`](/reference/resource-configs/event-time) and `batch_size` you configure.

Each "batch" corresponds to a single bounded time period (by default, a single day of data). Where other incremental strategies operate only on "old" and "new" data, microbatch models treat every batch as an atomic unit that can be built or replaced on its own. Each batch is independent and <Term id="idempotent" />. This is a powerful abstraction that makes it possible for dbt to run batches separately — in the future, concurrently — and to retry them independently.
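To make the batching idea concrete, here is a minimal Python sketch of how a time range could be split into day-sized windows and how rows could be assigned to them by their `event_time` value. It is an illustration only, not dbt's implementation: the helper names `daily_batches` and `rows_in_batch`, the sample rows, and the half-open `[start, end)` window are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def daily_batches(start, end):
    """Split [start, end) into consecutive one-day windows (hypothetical helper)."""
    batches = []
    cursor = start
    while cursor < end:
        batches.append((cursor, cursor + timedelta(days=1)))
        cursor += timedelta(days=1)
    return batches


def rows_in_batch(rows, batch, event_time):
    """Keep only the rows whose event_time falls inside the batch window."""
    lower, upper = batch
    return [row for row in rows if lower <= row[event_time] < upper]


# Hypothetical rows; session_start plays the role of the event_time column.
rows = [
    {"id": 1, "session_start": datetime(2024, 10, 1, 9, 30)},
    {"id": 2, "session_start": datetime(2024, 10, 2, 23, 59)},
    {"id": 3, "session_start": datetime(2024, 10, 3, 0, 0)},
]

batches = daily_batches(datetime(2024, 10, 1), datetime(2024, 10, 4))
per_batch = [
    [row["id"] for row in rows_in_batch(rows, batch, "session_start")]
    for batch in batches
]
```

Because every window is filtered on its own, any single batch can be rebuilt or retried without touching the others, which is the property the paragraph above describes.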

@@ -162,7 +162,7 @@ Several configurations are relevant to microbatch models, and some are required:

| Config | Type | Description | Default |
|----------|------|---------------|---------|
| `event_time` | Column (required) | The column indicating "at what time did the row occur." Required for your microbatch model and any direct parents that should be filtered. | N/A |
| [`event_time`](/reference/resource-configs/event-time) | Column (required) | The column indicating "at what time did the row occur." Required for your microbatch model and any direct parents that should be filtered. | N/A |
| `begin` | Date (required) | The "beginning of time" for the microbatch model. This is the starting point for any initial or full-refresh builds. For example, a daily-grain microbatch model run on `2024-10-01` with `begin = '2023-10-01'` will process 366 batches (it's a leap year!) plus the batch for "today." | N/A |
| `batch_size` | String (required) | The granularity of your batches. Supported values are `hour`, `day`, `month`, and `year`. | N/A |
| `lookback` | Integer (optional) | Process X batches prior to the latest bookmark to capture late-arriving records. | `1` |
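
The batch count in the `begin` row can be checked with a few lines of Python. The sketch below also shows one plausible reading of `lookback` for daily batches. The helper `batches_to_process` is hypothetical, and it assumes the latest bookmark is the current day's batch; dbt's own bookkeeping may differ.

```python
from datetime import date, timedelta

begin = date(2023, 10, 1)
run_date = date(2024, 10, 1)

# Full daily batches between `begin` and the run date (2024 includes Feb 29).
past_batches = (run_date - begin).days
# Add the batch for "today".
total_batches = past_batches + 1


def batches_to_process(latest_batch, lookback=1):
    """Assumed behavior: the latest batch plus `lookback` batches before it."""
    return [latest_batch - timedelta(days=n) for n in range(lookback, -1, -1)]


incremental = batches_to_process(run_date)
```

Under these assumptions, a full build covers `total_batches` daily batches, while an incremental run with the default `lookback` of `1` reprocesses only the previous day and the current day.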
2 changes: 2 additions & 0 deletions website/docs/docs/dbt-versions/release-notes.md
@@ -19,6 +19,7 @@
\* The official release date for this new format of release notes is May 15th, 2024. Historical release notes for prior dates may not reflect all available features released earlier this year or their tenancy availability.

## November 2024
- **New**: Use the `event_time` configuration to specify "at what time did the row occur." This configuration is required for [Incremental microbatch](/docs/build/incremental-microbatch) and can be added to ensure you're comparing overlapping times in [Advanced CI's compare changes](/docs/deploy/advanced-ci). Available in dbt Cloud Versionless and dbt Core v1.9 and higher. Refer to [event_time](/reference/resource-configs/event-time) for more information.

- **Fix**: This update improves [dbt Semantic Layer Tableau integration](/docs/cloud-integrations/semantic-layer/tableau) making query parsing more reliable. Some key fixes include:
  - Error messages for unsupported joins between saved queries and ALL tables.
  - Improved handling of queries when multiple tables are selected in a data source.
@@ -27,6 +28,7 @@
- **Enhancement**: The dbt Semantic Layer supports creating new credentials for users who don't have permissions to create service tokens. In the **Credentials & service tokens** side panel, the **+Add Service Token** option is unavailable for those users who don't have permission. Instead, the side panel displays a message indicating that the user doesn't have permission to create a service token and should contact their administration. Refer to [Set up dbt Semantic Layer](/docs/use-dbt-semantic-layer/setup-sl) for more details.

## October 2024

<Expandable alt_header="Coalesce 2024 announcements">

Documentation for new features and functionality announced at Coalesce 2024:
11 changes: 11 additions & 0 deletions website/docs/docs/deploy/advanced-ci.md
@@ -36,6 +36,17 @@ dbt reports the comparison differences in:

<Lightbox src="/img/docs/dbt-cloud/example-ci-compare-changes-tab.png" width="85%" title="Example of the Compare tab" />

### Speeding up comparisons


When an [`event_time`](/reference/resource-configs/event-time) column is specified on your model, compare changes can optimize comparisons by using only the overlapping timeframe (meaning the timeframe that exists in both the CI and production environments), helping you avoid incorrect row-count changes and return results faster.

This is useful in scenarios like:
- **Subset of data in CI** &mdash; When CI builds only a [subset of data](/best-practices/best-practice-workflows#limit-the-data-processed-when-in-development) (like the most recent 7 days), compare changes might interpret the excluded data as "deleted rows." Configuring `event_time` allows you to avoid this issue by limiting comparisons to the overlapping timeframe, preventing false alerts about data deletions that are just filtered out in CI.
- **Fresher data in CI than in production** &mdash; When your CI job includes fresher data than production (because it has run more recently), compare changes might flag the additional rows as "new" data, even though they’re just fresher data in CI. With `event_time` configured, the comparison only includes the shared timeframe and correctly reflects actual changes in the data.

<Lightbox src="/img/docs/deploy/apples_to_apples.png" width="90%" title="event_time ensures the same time-slice of data is accurately compared between your CI and production environments." />
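
The "overlapping timeframe" in the two scenarios above amounts to an interval intersection. The following Python sketch shows that idea under stated assumptions: the function `overlapping_window` and the example date ranges are hypothetical, and this is not how dbt Cloud actually implements compare changes.

```python
from datetime import datetime


def overlapping_window(ci_range, prod_range):
    """Return the (start, end) shared by both ranges, or None if they don't overlap."""
    start = max(ci_range[0], prod_range[0])
    end = min(ci_range[1], prod_range[1])
    return (start, end) if start < end else None


# Hypothetical ranges of event_time values present in each environment:
# CI built only the most recent 7 days, but ran a day later than production.
ci = (datetime(2024, 11, 8), datetime(2024, 11, 15))
prod = (datetime(2024, 1, 1), datetime(2024, 11, 14))

window = overlapping_window(ci, prod)
```

Rows before November 8 (missing from CI) and rows from November 14 onward (missing from production) fall outside `window`, so they would not be reported as deleted or new.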

## About the cached data

After [comparing changes](#compare-changes), dbt Cloud stores a cache of no more than 100 records for each modified model for preview purposes. By caching this data, you can view the examples of changed data without rerunning the comparison against the data warehouse every time (optimizing for lower compute costs). To display the changes, dbt Cloud uses a cached version of a sample of the data records. These data records are queried from the database using the connection configuration (such as user, role, service account, and so on) that's set in the CI job's environment.
284 changes: 284 additions & 0 deletions website/docs/reference/resource-configs/event-time.md
@@ -0,0 +1,284 @@
---
title: "event_time"
id: "event-time"
sidebar_label: "event_time"
resource_types: [models, seeds, snapshots, sources]
description: "dbt uses event_time to understand when an event occurred. When defined, event_time enables microbatch incremental models and more refined comparison of datasets during Advanced CI."
datatype: string
---

Available in dbt Cloud Versionless and dbt Core v1.9 and higher.


<Tabs>
<TabItem value="model" label="Models">

<File name='dbt_project.yml'>

```yml
models:
  [resource-path:](/reference/resource-configs/resource-path)
    +event_time: my_time_field
```
</File>


<File name='models/properties.yml'>

```yml
models:
  - name: model_name
    [config](/reference/resource-properties/config):
      event_time: my_time_field
```
</File>

<File name="models/modelname.sql">

```sql
{{ config(
    event_time='my_time_field'
) }}
```

</File>

</TabItem>

<TabItem value="seeds" label="Seeds">

<File name='dbt_project.yml'>

```yml
seeds:
  [resource-path:](/reference/resource-configs/resource-path)
    +event_time: my_time_field
```
</File>

<File name='seeds/properties.yml'>

```yml
seeds:
  - name: seed_name
    [config](/reference/resource-properties/config):
      event_time: my_time_field
```

</File>
</TabItem>

<TabItem value="snapshot" label="Snapshots">

<File name='dbt_project.yml'>

```yml
snapshots:
  [resource-path:](/reference/resource-configs/resource-path)
    +event_time: my_time_field
```
</File>

<VersionBlock firstVersion="1.9">
<File name='snapshots/properties.yml'>

```yml
snapshots:
  - name: snapshot_name
    [config](/reference/resource-properties/config):
      event_time: my_time_field
```
</File>
</VersionBlock>

<VersionBlock lastVersion="1.8">

<File name="models/modelname.sql">

```sql

{{ config(
    event_time='my_time_field'
) }}
```

</File>


import SnapshotYaml from '/snippets/_snapshot-yaml-spec.md';

<SnapshotYaml/>
</VersionBlock>



</TabItem>

<TabItem value="sources" label="Sources">

<File name='dbt_project.yml'>

```yml
sources:
  [resource-path:](/reference/resource-configs/resource-path)
    +event_time: my_time_field
```
</File>

<File name='models/properties.yml'>

```yml
sources:
  - name: source_name
    [config](/reference/resource-properties/config):
      event_time: my_time_field
```

</File>
</TabItem>
</Tabs>

## Definition

Set `event_time` to the name of the field that represents the timestamp of the event -- "at what time did the row occur" -- as opposed to an event ingestion date. You can configure `event_time` for a [model](/docs/build/models), [seed](/docs/build/seeds), [snapshot](/docs/build/snapshots), or [source](/docs/build/sources) in your `dbt_project.yml` file, property YAML file, or config block.

Here are some examples of good and bad `event_time` columns:

- ✅ Good:
  - `account_created_at` &mdash; This represents the specific time when an account was created, making it a fixed event in time.
  - `session_began_at` &mdash; This captures the exact timestamp when a user session started, which won’t change and directly ties to the event.

- ❌ Bad:
  - `_fivetran_synced` &mdash; This isn't the time that the event happened; it's the time that the event was ingested.
  - `last_updated_at` &mdash; This isn't a good choice because its value keeps changing over time.
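
To see why an ingestion timestamp makes a poor `event_time`, consider this small Python sketch. It groups the same rows by day twice, once by the time the event happened and once by the time it was synced. The column names come from the list above; the sample rows and the helper `rows_per_day` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows: the first event happened on Oct 1 but was synced two days later.
rows = [
    {
        "session_began_at": datetime(2024, 10, 1, 22, 15),
        "_fivetran_synced": datetime(2024, 10, 3, 4, 0),
    },
    {
        "session_began_at": datetime(2024, 10, 3, 8, 0),
        "_fivetran_synced": datetime(2024, 10, 3, 8, 5),
    },
]


def rows_per_day(rows, column):
    """Count rows per calendar day of the given timestamp column."""
    return Counter(row[column].date().isoformat() for row in rows)


by_event = rows_per_day(rows, "session_began_at")
by_sync = rows_per_day(rows, "_fivetran_synced")
```

Grouping by `session_began_at` places each row on the day it occurred, while grouping by `_fivetran_synced` piles both rows onto October 3, so the October 1 time slice would look empty.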

`event_time` is required for [Incremental microbatch](/docs/build/incremental-microbatch) and highly recommended for [Advanced CI's compare changes](/docs/deploy/advanced-ci#speeding-up-comparisons) in CI/CD workflows, where it ensures the same time-slice of data is correctly compared between your CI and production environments.

## Examples

<Tabs>

<TabItem value="model" label="Models">

Here's an example in the `dbt_project.yml` file:

<File name='dbt_project.yml'>

```yml
models:
  my_project:
    user_sessions:
      +event_time: session_start_time
```
</File>

Example in a properties YAML file:

<File name='models/properties.yml'>

```yml
models:
  - name: user_sessions
    config:
      event_time: session_start_time
```

</File>

Example in a SQL model config block:

<File name="models/user_sessions.sql">

```sql
{{ config(
    event_time='session_start_time'
) }}
```

</File>

This setup sets `session_start_time` as the `event_time` for the `user_sessions` model.
</TabItem>

<TabItem value="seeds" label="Seeds">

Here's an example in the `dbt_project.yml` file:

<File name='dbt_project.yml'>

```yml
seeds:
  my_project:
    my_seed:
      +event_time: record_timestamp
```

</File>

Example in a seed properties YAML:

<File name='seeds/properties.yml'>

```yml
seeds:
  - name: my_seed
    config:
      event_time: record_timestamp
```
</File>

This setup sets `record_timestamp` as the `event_time` for `my_seed`.

</TabItem>

<TabItem value="snapshot" label="Snapshots">

Here's an example in the `dbt_project.yml` file:

<File name='dbt_project.yml'>

```yml
snapshots:
  my_project:
    my_snapshot:
      +event_time: record_timestamp
```

</File>

Example in a snapshot properties YAML:

<File name='my_project/properties.yml'>

```yml
snapshots:
  - name: my_snapshot
    config:
      event_time: record_timestamp
```
</File>

This setup sets `record_timestamp` as the `event_time` for `my_snapshot`.

</TabItem>

<TabItem value="sources" label="Sources">

Here's an example in a source properties YAML file:

<File name='models/properties.yml'>

```yml
sources:
  - name: source_name
    tables:
      - name: table_name
        config:
          event_time: event_timestamp
```
</File>

This setup sets `event_timestamp` as the `event_time` for the specified source table.

</TabItem>
</Tabs>
1 change: 1 addition & 0 deletions website/sidebars.js
@@ -926,6 +926,7 @@ const sidebarSettings = {
"reference/resource-configs/alias",
"reference/resource-configs/database",
"reference/resource-configs/enabled",
"reference/resource-configs/event-time",
"reference/resource-configs/full_refresh",
"reference/resource-configs/contract",
"reference/resource-configs/grants",