Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/click uri deprecation #69

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

fivetran-joemarkiewicz
Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz commented Sep 5, 2024

PR Overview

This PR will address the following Issue/Feature: LinkedIn Ads Issue #35

This PR will result in the following new package version: v0.10.0

This will be a breaking change since we are adding a new field. Additionally, there are major changes to the click_uri field to support the new LinkedIn Ads API and connector schema updates. Therefore, it would be most appropriate to list this as a breaking change so users are made fully aware.

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

Breaking Changes

  • The click_uri_type field has been added to the stg_linkedin_ads__creative_history model. This field allows users to differentiate which click uri type (text_ad or spotlight) is being used to populate the results of the click_uri field.
    • Please be aware this field only supports text_ad or spotlight click uri types. If you are interested in this package supporting more click uri ad types, please let us know in this Feature Request.

Bug Fixes

  • The click_uri field has been adjusted to populate the results following a coalesce on the text_ad_landing_page, spotlight_landing_page, or click_uri fields.
    • This change is in response to a LinkedIn Ads API and Fivetran LinkedIn Ads connector update which moved click_uri data to either the text_ad_landing_page or spotlight_landing_page fields depending on the creative type.
    • Please be aware this field only supports text_ad or spotlight click uri types. If you are interested in this package supporting more click uri ad types, please let us know in this Feature Request.
  • Updated the is_latest_version window function in the following models to exclude the source_relation field from the partition statement when linkedin_ads_union_schemas or linkedin_ads_union_databases variables are empty. Also, modified it to skip the window function if the upstream table is empty, using the new window_function_if_table_exists() and is_table_empty() macros. This change addresses Redshift's issue with partitioning by constant expressions.
    • stg_linkedin_ads__account_history
    • stg_linkedin_ads__campaign_group_history
    • stg_linkedin_ads__campaign_history
    • stg_linkedin_ads__creative_history

Under the Hood

  • Updates to the linkedin_creative_history_data seed file to include the following new fields to ensure accurate data validation tests:
    • text_ad_landing_page
    • spotlight_landing_page

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run –full-refresh && dbt test
  • [n/a] dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked, tagged, and properly assigned.
  • All necessary documentation and version upgrades have been applied.
  • docs were regenerated (unless this PR does not include any code or yml updates).
  • BuildKite integration tests are passing.
  • Detailed validation steps have been provided below.

Detailed Validation

Please share any and all of your validation steps:

Please see the downstream dbt_linkedin PR #38 for further validation details.

Additionally, I was able to confirm the click_uri and click_uri_type fields were populating as expected in the stg_linkedin_ads__creative_history model. You can see the text_ad and spotlight types populating as expected. However, the null records for http://www.github.com are expected since they are populated by the deprecated click_uri (per the seed data) field and should be displayed as null.

  • image

If you had to summarize this PR in an emoji, which would it be?

🔗

@fivetran-joemarkiewicz fivetran-joemarkiewicz self-assigned this Sep 5, 2024
Copy link
Contributor

@fivetran-reneeli fivetran-reneeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed! Another question I had was I wanted to confirm text_ad_landing_page and spotlight_landing_page are exclusive? So there's no case where customer can have both.

never mind-- I see the internal thread that this has been discussed already!

CHANGELOG.md Show resolved Hide resolved
models/stg_linkedin.yml Show resolved Hide resolved
@fivetran-joemarkiewicz
Copy link
Contributor Author

fivetran-joemarkiewicz commented Sep 18, 2024

question I had was I wanted to confirm text_ad_landing_page and spotlight_landing_page are exclusive?

@fivetran-reneeli, confirmed with product and engineering that it is exclusive. You will not have both. It will always either be one or the other.

See my other responses in the comments above. Let me know if you have other questions.

@fivetran-reneeli
Copy link
Contributor

Thanks @fivetran-joemarkiewicz just followed up! And regarding the ad type exclusivity, I saw the internal thread that this has been discussed already!

@fivetran-joemarkiewicz
Copy link
Contributor Author

Thanks @fivetran-reneeli! I just made the change in the CHANGELOG to reflect the proper macro name. This should be good for re-review.

Copy link
Contributor

@fivetran-reneeli fivetran-reneeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

CHANGELOG.md Outdated Show resolved Hide resolved
{%- macro default__is_table_empty(table_name) -%}
{%- if execute and flags.WHICH in ('run', 'build') %}
{% set row_count_query %}
select count(*) as row_count from {{ table_name }}
Copy link
Contributor

@fivetran-avinash fivetran-avinash Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering, do we need to count the entire table if we're just checking if the table is empty? Not that there are many cases where there are a billion records in Linkedin that could impact query execution, but wondering if just checking a subset would still get us the intended result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting point 🤔. What would you recommend for adjusting this to count the entire table?

Copy link
Contributor

@fivetran-avinash fivetran-avinash Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz Would something like


{%- macro default__is_table_empty(table_name) -%}
    {%- if execute and flags.WHICH in ('run', 'build') %}
        {% set row_check_query %}
            SELECT 1
            FROM {{ table_name }}
            LIMIT 1
        {% endset %}
        {% set results = run_query(row_check_query) %}
        {% if results and results.rows | length == 0 %}
            {{ return("empty") }} 
        {% endif %}
    {% endif -%}
{%- endmacro -%}


work? The length would be 1 if there's a row (and thus not empty), and 0 if not.

(or something along those lines).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea, but if we were to implement it then I'd want to validate this a bit more since it is more than just a one line code change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sensible! Just a suggestion anyway, this PR is approved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-avinash I thought more about this and wanted to make try some tests to see if we could get this more performant option working before release. I was able to get this working in the latest commit. See the validations below. Let me know your thoughts!

(Edit): There is a comment in most of these screenshots that says the below schema is populated. Please note this is only relevant for the first screenshot it appears. For all the others it was simply not removed by accident. You can use the names of the schemas as the guide.

  • ✅ When running on an empty schema, the macro works as expected.
    image

  • ✅ When running on a populated schema, the macro works as expected.
    image

  • ✅ When running on two populated union schemas, the macro works as expected.
    image

  • ✅ When running on two union schemas which are empty, the macro works as expected.
    image

Would you be able to take a second look and let me know if you have any notes before this moves forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also reran the downstream integration tests and saw success. So it looks like we are in the clear that this doesn't cause downstream issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was able to replicate several of these setups, so I'd say it's safe to move forward.

Copy link
Contributor

@fivetran-avinash fivetran-avinash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz A few comments before release, but nothing blocking. Approved!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants