Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: reorder Benefits mart fields #3518

Merged
merged 2 commits into from
Nov 4, 2024
Merged

Conversation

thekaveman
Copy link
Member

@thekaveman thekaveman commented Oct 29, 2024

This PR is currently on top of #3515, so I could test this locally more easily. Once that PR is merged, this can be rebased on main.

Description

Follow up to #3468.

The UNION DISTINCT was producing strange results, e.g. combining results from different columns into the same output column.

This was due to one or both of:

  1. The order of the columns not being the same between the baseline and transform CTE
  2. Some columns missing from the transform CTE

This PR alphabetizes the fields in both SELECT statements, and ensures that the transform SELECTS all fields from the baseline CTE.

A minor side effect of this cleanup was to remove the transform CTE in favor of a direct SELECT statement. This simplifies the back-and-forth to keep the field order straight.

This is a full list of the fields in alphabetical order (from BigQuery)

amplitude_id
app
city
client_event_time
client_upload_time
country
device_family
device_id
device_type
event_id
event_properties_card_tokenize_func
event_properties_card_tokenize_url
event_properties_claims_provider
event_properties_eligibility_verifier
event_properties_enrollment_flows
event_properties_enrollment_group
event_properties_enrollment_method
event_properties_error_name
event_properties_error_status
event_properties_error_sub
event_properties_href
event_properties_language
event_properties_origin
event_properties_path
event_properties_status
event_properties_transit_agency
event_time
event_type
language
library
os_name
os_version
processed_time
region
server_received_time
server_upload_time
session_id
start_version
user_id
user_properties_eligibility_verifier
user_properties_enrollment_flows
user_properties_enrollment_method
user_properties_initial_referrer
user_properties_initial_referring_domain
user_properties_referrer
user_properties_referring_domain
user_properties_transit_agency
user_properties_user_agent
uuid
version_name

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

How has this been tested?

$ poetry run dbt run -s +fct_benefits_events
03:26:13  Running with dbt=1.5.1
03:26:15  [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.calitp_warehouse.mart.ad_hoc
03:26:15  Found 420 models, 950 tests, 0 snapshots, 0 analyses, 852 macros, 0 operations, 12 seed files, 175 sources, 4 exposures, 0 metrics, 0 groups
03:26:15  
03:26:18  Concurrency: 8 threads (target='dev')
03:26:18  
03:26:18  1 of 2 START sql view model kegan_staging.stg_amplitude__benefits_events ....... [RUN]
03:26:19  1 of 2 OK created sql view model kegan_staging.stg_amplitude__benefits_events .. [CREATE VIEW (0 processed) in 1.06s]
03:26:19  2 of 2 START sql table model kegan_mart_benefits.fct_benefits_events ........... [RUN]
03:26:34  2 of 2 OK created sql table model kegan_mart_benefits.fct_benefits_events ...... [CREATE TABLE (26.9m rows, 73.1 GiB processed) in 14.77s]
03:26:34  
03:26:34  Finished running 1 view model, 1 table model in 0 hours 0 minutes and 19.02 seconds (19.02s).
03:26:34  
03:26:34  Completed successfully
03:26:34  
03:26:34  Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2

Post-merge follow-ups

  • No action required
  • Actions required (specified below)

Copy link

github-actions bot commented Oct 29, 2024

Warehouse report 📦

DAG

Legend (in order of precedence)

Resource type Indicator Resolution
Large table-materialized model Orange Make the model incremental
Large model without partitioning or clustering Orange Add partitioning and/or clustering
View with more than one child Yellow Materialize as a table or incremental
Incremental Light green
Table Green
View White

@thekaveman thekaveman marked this pull request as ready for review October 31, 2024 18:01
@thekaveman thekaveman changed the base branch from refactor/devcontainer to main November 4, 2024 18:41
@thekaveman thekaveman merged commit 9c67d74 into main Nov 4, 2024
4 checks passed
@thekaveman thekaveman deleted the fix/benefits-reorder-fields branch November 4, 2024 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants