Release/snowplow unified/0.2.0 #24

Merged
merged 9 commits into from
Jan 30, 2024
2 changes: 1 addition & 1 deletion .github/CODEOWNERS
@@ -1 +1 @@
* @agnessnowplow
* @snowplow/com-snowplowanalytics-engineering-datavalue-integrations
22 changes: 22 additions & 0 deletions CHANGELOG
@@ -1,3 +1,25 @@
snowplow-unified 0.2.0 (2024-01-30)
---------------------------------------
## Summary
This release adds the ability to calculate mobile screen engagement using the screen summary context, along with a new optional module for a conversions table. In addition, the users table can now be stitched during session stitching, and "headset" is now a recognized platform.

## 🚨 Breaking Changes 🚨
Existing users on Snowflake / Databricks / Redshift will need to make changes to some of their derived tables. For a full SQL script on how to achieve this, check out the relevant [migration guide](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/migration-guides/unified/). Alternatively, do a [complete refresh](https://docs.snowplow.io/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/full-or-partial-refreshes/#complete-refresh-of-snowplow-package) of the package.

## Features
- Add mobile screen engagement calculation using the screen summary context (#16)
- Add user stitching to the users table (enabled with `snowplow__session_stitching`)
- Add "headset" to the list of recognized platforms
- Add optional conversions module
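
As a rough sketch of how these features might be switched on, the fragment below sets the relevant flags in a consuming project's `dbt_project.yml`. The variable names come from this release's own `dbt_project.yml` defaults; scoping them under a `snowplow_unified:` key is an assumption based on the usual dbt convention for package variables, and any conversion definitions the module needs are not shown here.

```yaml
# dbt_project.yml of the consuming project (sketch, not a complete config)
vars:
  snowplow_unified:                                   # assumed package scope
    snowplow__enable_screen_summary_context: true     # mobile screen engagement (default: false)
    snowplow__enable_conversions: true                # optional conversions module (default: false)
    snowplow__session_stitching: true                 # also stitches the users table (default: true)
```

Both new flags default to `false`, so nothing changes for existing projects until they are set explicitly.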

## Fixes
- Consider screen view ID from the screen view context (#14)
- Fix link to incorrect FAQ in README
- Remove test for not null screen ID and name in app errors table

## Upgrading
Bump the snowplow-unified version in your `packages.yml` file.
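
A minimal sketch of the bump, assuming the package is installed from the dbt package hub under the name `snowplow/snowplow_unified` (adjust the entry if you install it from git or pin a version range instead):

```yaml
# packages.yml (sketch; the hub package name is an assumption)
packages:
  - package: snowplow/snowplow_unified
    version: 0.2.0
```

After editing the file, running `dbt deps` fetches the new version.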

snowplow-unified 0.1.2 (2023-11-23)
---------------------------------------
## Summary
3 changes: 1 addition & 2 deletions README.md
@@ -62,7 +62,7 @@ If you find a bug, please report an issue on GitHub.

The snowplow-unified package is Copyright 2023-present Snowplow Analytics Ltd.

This distribution is all licensed under the [Snowplow Personal and Academic License][license] . (If you are uncertain how it applies to your use case, check our answers to [frequently asked questions](https://docs.snowplow.io/docs/contributing/community-license-faq/).)
This distribution is all licensed under the [Snowplow Personal and Academic License][license] . (If you are uncertain how it applies to your use case, check our answers to [frequently asked questions](https://docs.snowplow.io/docs/contributing/personal-and-academic-license-faq/).)

[license]: https://docs.snowplow.io/personal-and-academic-license-1.0/
[license-image]: http://img.shields.io/badge/license-Snowplow--Personal--and--Academic--1-blue.svg?style=flat
@@ -73,4 +73,3 @@ This distribution is all licensed under the [Snowplow Personal and Academic Lice
[dbt-package-docs]: https://docs.getdbt.com/docs/building-a-dbt-project/package-management
[discourse-image]: https://img.shields.io/discourse/posts?server=https%3A%2F%2Fdiscourse.snowplow.io%2F
[discourse]: http://discourse.snowplow.io/

11 changes: 11 additions & 0 deletions dbt_project.yml
@@ -58,6 +58,7 @@ vars:
snowplow__session_lookback_days: 730
snowplow__session_stitching: true
snowplow__view_stitching: false
snowplow__conversion_stitching: true
snowplow__session_timestamp: collector_tstamp
snowplow__start_date: '2020-01-01'
# snowplow__total_all_conversions: false
@@ -85,15 +86,18 @@ vars:
snowplow__enable_application_context: false
snowplow__enable_screen_context: false
snowplow__enable_deep_link_context: false
snowplow__enable_screen_summary_context: false
# add extra custom fields:
snowplow__page_view_passthroughs: []
snowplow__session_passthroughs: []
snowplow__user_first_passthroughs: []
snowplow__user_last_passthroughs: []
snowplow__conversion_passthroughs: []
# enable custom modules:
snowplow__enable_consent: false
snowplow__enable_cwv: false
snowplow__enable_app_errors: false
snowplow__enable_conversions: false

# WAREHOUSE SPECIFIC

@@ -125,6 +129,7 @@ vars:
snowplow__application_error_events: com_snowplowanalytics_snowplow_application_error_1
snowplow__screen_view_events: com_snowplowanalytics_mobile_screen_view_1
snowplow__deep_link_context: com_snowplowanalytics_mobile_deep_link_1
snowplow__screen_summary_context: com_snowplowanalytics_mobile_screen_summary_1

# Completely or partially remove models from the manifest during run start.
on-run-start:
@@ -167,6 +172,12 @@ models:
scratch:
+schema: "scratch"
+tags: "scratch"
conversions:
+schema: "derived"
+tags: ["snowplow_unified_incremental", "derived", "conversions"]
scratch:
+schema: "scratch"
+tags: "scratch"
core_web_vitals:
+schema: "derived"
+tags: ["snowplow_unified_incremental", "derived", "core_web_vitals"]
134 changes: 126 additions & 8 deletions docs/markdown/snowplow_unified_common_cols.md
@@ -416,10 +416,14 @@ The page’s character encoding e.g. , ‘UTF-8’

{% docs col_doc_width %}
The page’s width in pixels e.g. 1024

On mobile, it is the content width reported in the `screen_summary` context.
{% enddocs %}

{% docs col_doc_height %}
The page’s height in pixels e.g. 3000

On mobile, it is the content height reported in the `screen_summary` context.
{% enddocs %}

{% docs col_tr_currency %}
@@ -788,10 +792,6 @@ First application version.
Last application version.
{% enddocs %}

{% docs col_session_duration_s %}
Total duration of a session in seconds.
{% enddocs %}

{% docs col_device_user_id %}
Unique device user id.
{% enddocs %}
@@ -816,10 +816,6 @@ Earliest timestamp for the user's activity, based on `derived_tstamp`.
Latest timestamp for the user's activity, based on `derived_tstamp`.
{% enddocs %}

{% docs col_sessions_duration_s %}
Total session duration for the specific user.
{% enddocs %}

{% docs col_active_days %}
Total number of active days for the user.
{% enddocs %}
@@ -1217,3 +1213,125 @@ Referrer URL, source of this deep-link.
{% docs col_event_index_in_session %}
A session index of the event.
{% enddocs %}

{% docs col_foreground_sec %}
Time in seconds spent on the current screen while the app was in foreground.
{% enddocs %}

{% docs col_background_sec %}
Time in seconds spent on the current screen while the app was in background.
{% enddocs %}

{% docs col_last_item_index %}
Index of the last viewed item in the list on the screen.
{% enddocs %}

{% docs col_items_count %}
Total number of items in the list on the screen.
{% enddocs %}

{% docs col_min_x_offset %}
Minimum horizontal scroll offset on the scroll view in pixels.
{% enddocs %}

{% docs col_max_x_offset %}
Maximum horizontal scroll offset on the scroll view in pixels.
{% enddocs %}

{% docs col_min_y_offset %}
Minimum vertical scroll offset on the scroll view in pixels.
{% enddocs %}

{% docs col_max_y_offset %}
Maximum vertical scroll offset on the scroll view in pixels.
{% enddocs %}

{% docs col_content_width %}
Width of the scroll view in pixels.
{% enddocs %}

{% docs col_content_height %}
Height of the scroll view in pixels.
{% enddocs %}

{% docs col_last_list_item_index %}
Index of the last viewed item in the list on the screen.

This is calculated only for mobile apps based on the `screen_summary` context.
{% enddocs %}

{% docs col_list_items_count %}
Total number of items in the list on the screen.

This is calculated only for mobile apps based on the `screen_summary` context.
{% enddocs %}

{% docs col_list_items_percentage_scrolled %}
Percentage of the list on the screen that the user scrolled to.

This is calculated only for mobile apps based on the `screen_summary` context.
{% enddocs %}

{% docs col_engaged_time_in_s %}
Time spent by the user on the page or screen.

On Web, it is calculated using page pings.
On mobile, it is calculated using information in the `screen_summary` context.
{% enddocs %}

{% docs col_session_engaged_time_in_s %}
The total time engaged by a user within a session.

On Web, it is calculated using page pings.
On mobile, it is calculated using information in the `screen_summary` context.
{% enddocs %}

{% docs col_user_engaged_time_in_s %}
The total engaged time in seconds by the user.

On Web, it is calculated using page pings.
On mobile, it is calculated using information in the `screen_summary` context.
{% enddocs %}

{% docs col_absolute_time_in_s %}
Total time in seconds of the page or screen view (including inactivity).

On Web, it is the time between the `start_tstamp` and `end_tstamp` of the page view and the last page ping.
On mobile, it is the time that the app was in foreground + background during the screen view (taken from the `screen_summary` context).
{% enddocs %}

{% docs col_session_absolute_time_in_s %}
The time in seconds between the `start_tstamp` and `end_tstamp` of the first and last event in the session.
{% enddocs %}

{% docs col_user_absolute_time_in_s %}
The time in seconds between the `start_tstamp` and `end_tstamp` of the first and last event of sessions of the user.
{% enddocs %}

{% docs col_horizontal_pixels_scrolled %}
Distance the user scrolled horizontally in pixels.

On Web, it is calculated based on the page ping events.
On mobile, it is calculated using the `screen_summary` context.
{% enddocs %}

{% docs col_vertical_pixels_scrolled %}
Distance the user scrolled vertically in pixels.

On Web, it is calculated based on the page ping events.
On mobile, it is calculated using the `screen_summary` context.
{% enddocs %}

{% docs col_horizontal_percentage_scrolled %}
Percentage of page scrolled horizontally.

On Web, it is calculated based on the page ping events.
On mobile, it is calculated using the `screen_summary` context.
{% enddocs %}

{% docs col_vertical_percentage_scrolled %}
Percentage of page scrolled vertically.

On Web, it is calculated based on the page ping events.
On mobile, it is calculated using the `screen_summary` context.
{% enddocs %}
2 changes: 1 addition & 1 deletion docs/markdown/snowplow_unified_macros_docs.md
@@ -231,7 +231,7 @@ The specific sql to be used for the relevant warehouse to calculate the count of
{% endraw %}
{% enddocs %}

{% docs macro_get_conversion_columns %}
{% docs macro_conversion_query %}
{% raw %}

A macro to keep the different ways of calculating conversion fields per warehouse abstracted away for the sessions table.
13 changes: 13 additions & 0 deletions docs/markdown/snowplow_unified_views_docs.md
@@ -23,3 +23,16 @@ This model calculates the time a visitor spent engaged on a given page view. Thi
This model calculates the horizontal and vertical scroll depth of the visitor on a given page view. Such metrics are useful when assessing engagement on a page view.

{% enddocs %}

{% docs table_screen_summary_metrics %}

This model calculates screen engagement statistics based on the screen summary context entity tracked on mobile apps.
It contains metrics related to the screen time and scroll depth.

{% enddocs %}

{% docs table_session_screen_summary_metrics %}

This model calculates screen time metrics per session based on the screen summary context entity tracked on mobile apps.

{% enddocs %}
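
For context, docs blocks like these are typically consumed from a model properties file through dbt's `doc()` function. The fragment below is only a sketch: the model name and the column names are assumptions inferred from the docs block names in this pull request, not taken from the package's actual properties files.

```yaml
# Hypothetical properties file fragment; model and column names are assumed.
version: 2

models:
  - name: snowplow_unified_screen_summary_metrics   # assumed model name
    description: '{{ doc("table_screen_summary_metrics") }}'
    columns:
      - name: foreground_sec                         # assumed column name
        description: '{{ doc("col_foreground_sec") }}'
      - name: background_sec                         # assumed column name
        description: '{{ doc("col_background_sec") }}'
```

Keeping the text in docs blocks means the same description can be reused wherever a column appears in more than one model.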
25 changes: 13 additions & 12 deletions integration_tests/.scripts/integration_test.sh
@@ -24,52 +24,53 @@ fi
for db in ${DATABASES[@]}; do

echo "Snowplow unified integration tests: Seeding data"

eval "dbt seed --full-refresh --target $db" || exit 1;

echo "Snowplow unified integration tests: App errors module"
echo "Snowplow unified integration tests: Conversions"
eval "dbt run --full-refresh --select +snowplow_unified_conversions snowplow_unified_integration_tests.source --vars '{snowplow__allow_refresh: true, snowplow__backfill_limit_days: 220, snowplow__enable_cwv: false, snowplow__enable_conversions: true}' --target $db" || exit 1;

echo "Snowplow unified integration tests: App errors module"
eval "dbt run --full-refresh --select +snowplow_unified_app_errors snowplow_unified_integration_tests.source --vars '{snowplow__allow_refresh: true, snowplow__backfill_limit_days: 220, snowplow__enable_cwv: false, snowplow__enable_app_errors: true}' --target $db" || exit 1;

echo "Snowplow unified integration tests: Late enabled contexts"

eval "dbt run --full-refresh --select +test_late_enabled_contexts snowplow_unified_integration_tests.source --vars '{snowplow__allow_refresh: true, snowplow__backfill_limit_days: 220, snowplow__enable_cwv: false, snowplow__enable_mobile_context: false, snowplow__enable_geolocation_context: false, snowplow__enable_application_context: false, snowplow__enable_screen_context: false, snowplow__enable_app_errors: false, snowplow__enable_deep_link_context: false, snowplow__enable_cwv: false, snowplow__enable_iab: false, snowplow__enable_ua: false, snowplow__enable_browser_context: false, snowplow__enable_consent: false}' --target $db" || exit 1;

eval "dbt run --select +test_late_enabled_contexts --vars '{snowplow__allow_refresh: true, snowplow__backfill_limit_days: 250, snowplow__enable_cwv: false}' --target $db" || exit 1;

echo "Snowplow unified integration tests: Late enabled contexts test passed"

echo "Snowplow unified integration tests: Execute models (all contexts except for cwv) - run 1/4"

eval "dbt run --full-refresh --vars '{snowplow__allow_refresh: true, snowplow__backfill_limit_days: 243, snowplow__enable_cwv: false}' --target $db" || exit 1;

for i in {2..4}
do
echo "Snowplow unified integration tests: Execute models (all contexts except for cwv) - run $i/4"

eval "dbt run --vars '{snowplow__enable_cwv: false}' --target $db" || exit 1;
done

echo "Snowplow unified integration tests: Test models"

eval "dbt test --exclude snowplow_unified_web_vital_measurements snowplow_unified_web_vital_measurements_actual snowplow_unified_web_vital_events_this_run test_name:not_null --store-failures --target $db" || exit 1;
eval "dbt test --exclude snowplow_unified_web_vital_measurements snowplow_unified_web_vital_measurements_actual snowplow_unified_web_vital_events_this_run snowplow_unified_views_mobile_screen_engagement_actual test_name:not_null --store-failures --target $db" || exit 1;

echo "Snowplow unified integration tests: All non-CWV tests passed"

echo "Snowplow unified integration tests - Core Web Vitals: Execute models"

eval "dbt run --select +snowplow_unified_web_vital_measurements_actual snowplow_unified_web_vital_measurements_expected_stg source --full-refresh --vars '{snowplow__allow_refresh: true, snowplow__start_date: '2023-03-01', snowplow__backfill_limit_days: 50, snowplow__cwv_days_to_measure: 999, snowplow__enable_mobile: false, snowplow__enable_mobile_context: false, snowplow__enable_geolocation_context: false, snowplow__enable_application_context: false, snowplow__enable_screen_context: false, snowplow__enable_app_errors: false, snowplow__enable_deep_link_context: false, snowplow__enable_ua: false, snowplow__enable_browser_context: false, snowplow__enable_consent: false}' --target $db" || exit 1;

eval "dbt test --select snowplow_unified_web_vital_measurements_actual --store-failures --target $db" || exit 1;

echo "Snowplow unified integration tests: Execute web (all web contexts except for cwv)"

eval "dbt run --full-refresh --vars '{snowplow__allow_refresh: true, snowplow__backfill_limit_days: 9999, snowplow__enable_mobile: false, snowplow__enable_mobile_context: false, snowplow__enable_geolocation_context: false, snowplow__enable_application_context: false, snowplow__enable_screen_context: false, snowplow__enable_app_errors: false, snowplow__enable_deep_link_context: false, snowplow__enable_cwv: false}' --select +snowplow_unified_users snowplow_unified_events_stg --target $db" || exit 1;

echo "Snowplow unified integration tests: Execute mobile (all mobile contexts)"

eval "dbt run --full-refresh --vars '{snowplow__allow_refresh: true, snowplow__backfill_limit_days: 9999, snowplow__enable_web: false, snowplow__enable_iab: false, snowplow__enable_ua: false, snowplow__enable_browser_context: false, snowplow__enable_consent: false, snowplow__enable_cwv: false}' --select +snowplow_unified_users snowplow_unified_events_stg --target $db" || exit 1;

echo "Snowplow unified integration tests: All CWV tests passed"

echo "Snowplow unified integration tests: Test mobile screen engagement"

eval "dbt run --select +snowplow_unified_views_mobile_screen_engagement_actual snowplow_unified_views_mobile_screen_engagement_expected_stg source --full-refresh --vars '{snowplow__allow_refresh: true, snowplow__start_date: '2023-12-19', snowplow__backfill_limit_days: 50, snowplow__enable_cwv: false, snowplow__enable_screen_summary_context: true, snowplow__enable_ua: false, snowplow__enable_iab: false, snowplow__enable_web: false, snowplow__enable_browser_context: false, snowplow__enable_consent: false, snowplow__enable_yauaa: false, snowplow__enable_geolocation_context: false, snowplow__enable_deep_link_context: false, snowplow__enable_app_errors: false}' --target $db" || exit 1;

eval "dbt test --select snowplow_unified_views_mobile_screen_engagement_actual --vars '{snowplow__enable_screen_summary_context: true, snowplow__enable_web: false, snowplow__enable_cwv: false, snowplow__enable_ua: false, snowplow__enable_iab: false, snowplow__enable_browser_context: false, snowplow__enable_consent: false, snowplow__enable_yauaa: false, snowplow__enable_geolocation_context: false, snowplow__enable_deep_link_context: false, snowplow__enable_app_errors: false}' --store-failures --target $db" || exit 1;

echo "Snowplow unified integration tests: Mobile screen engagement tests passed"

done
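
The inline `--vars` dictionary passed to the mobile screen engagement run is hard to review on one line. Laid out as block YAML, the same keys and values from that command read as follows. This is a transcription for readability only, not a file that exists in the repository. In the script, the inner single quotes around the start date are consumed by the shell, so dbt receives the date unquoted; the block form keeps the quotes.

```yaml
# Variables passed to the mobile screen engagement run, laid out as block YAML.
snowplow__allow_refresh: true
snowplow__start_date: '2023-12-19'
snowplow__backfill_limit_days: 50
snowplow__enable_cwv: false
snowplow__enable_screen_summary_context: true
snowplow__enable_ua: false
snowplow__enable_iab: false
snowplow__enable_web: false
snowplow__enable_browser_context: false
snowplow__enable_consent: false
snowplow__enable_yauaa: false
snowplow__enable_geolocation_context: false
snowplow__enable_deep_link_context: false
snowplow__enable_app_errors: false
```

Only the screen summary context is switched on here; the web, enrichment and other optional contexts are disabled so the run exercises the mobile engagement models in isolation.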
2 changes: 1 addition & 1 deletion integration_tests/README.md
@@ -39,7 +39,7 @@ There are certain exceptions to how different warehouses process data and in pla

- the non-deterministic nature of row_number() function for Redshift/Postgres/Databricks means that we had to hard-code actuals and expected models for cases where we are testing duplicate rows with exact same results / window
- postgres / redshift needing the array format of : (within sessions_expected)
- bigquery handling of snowplow_utils.timestamp_diff() - absolute_time_in_s changes as well as sessions_duration_s
- bigquery handling of snowplow_utils.timestamp_diff() - absolute_time_in_s changes
- rotating domain_userid per session is hard-coded in the integration test expectations, when run in one batch the user_identifier differs: 2e340eb6e94820ea8369c0174c612260d1cfe9d41f0fe46268994e28d9c0bbf17
0e9ab97b5d9d9a174112df13fe9c44788af3ac9088a8b41e0998d92a8b4b5a4fc
- same with the number of quarantined sessions