Fix deduplication in Snowflake query string
Signed-off-by: hkuepers <[email protected]>
hkuepers committed Dec 19, 2024
1 parent f37b52a commit 710d1de
Showing 1 changed file with 16 additions and 10 deletions.
26 changes: 16 additions & 10 deletions sdk/python/feast/infra/offline_stores/snowflake.py
@@ -771,20 +771,26 @@ def _get_entity_df_event_timestamp_range(
 ),
 /*
-3. If the `created_timestamp_column` has been set, we need to
-deduplicate the data first. This is done by calculating the
-`MAX(created_at_timestamp)` for each event_timestamp.
-Otherwise, the ASOF JOIN can have unstable side effects
-https://docs.snowflake.com/en/sql-reference/constructs/asof-join#expected-behavior-when-ties-exist-in-the-right-table
+3. If the `created_timestamp_column` has been set, we need to
+deduplicate the data first. This is done by calculating the
+`MAX(created_at_timestamp)` for each event_timestamp and joining back on the subquery.
+Otherwise, the ASOF JOIN can have unstable side effects
+https://docs.snowflake.com/en/sql-reference/constructs/asof-join#expected-behavior-when-ties-exist-in-the-right-table
 */
 {% if featureview.created_timestamp_column %}
 "{{ featureview.name }}__dedup" AS (
-SELECT
-*,
-MAX("created_timestamp") AS "created_timestamp"
-FROM "{{ featureview.name }}__subquery"
-GROUP BY {{ featureview.entities | map('tojson') | join(', ')}}{% if featureview.entities %},{% else %}{% endif %} "event_timestamp"
+SELECT *
+FROM "{{ featureview.name }}__subquery"
+INNER JOIN (
+SELECT
+{{ featureview.entities | map('tojson') | join(', ')}}{% if featureview.entities %},{% else %}{% endif %}
+"event_timestamp",
+MAX("created_timestamp") AS "created_timestamp"
+FROM "{{ featureview.name }}__subquery"
+GROUP BY {{ featureview.entities | map('tojson') | join(', ')}}{% if featureview.entities %},{% else %}{% endif %} "event_timestamp"
+)
+USING({{ featureview.entities | map('tojson') | join(', ')}}{% if featureview.entities %},{% else %}{% endif %} "event_timestamp", "created_timestamp")
 ),
 {% endif %}
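
The "group, take the maximum, then join back" pattern used in the new `__dedup` CTE can be tried out in isolation. The following is a minimal sketch, not the Feast template itself: it runs on SQLite through Python's standard `sqlite3` module rather than on Snowflake, and the table `subquery`, the entity column `driver_id`, and the feature column `conv_rate` are hypothetical stand-ins for the rendered Jinja expressions. The previous form (`SELECT *, MAX(...)` with a `GROUP BY` on only some columns) selects non-aggregated columns outside the `GROUP BY`, which Snowflake rejects, so the aggregate is computed in an inner query and joined back to recover the full rows.

```python
import sqlite3

# In-memory database with a hypothetical stand-in for "<featureview>__subquery".
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subquery (
        driver_id INTEGER,
        event_timestamp TEXT,
        created_timestamp TEXT,
        conv_rate REAL
    );
    -- Two rows share (driver_id, event_timestamp); only the later-created one should survive.
    INSERT INTO subquery VALUES
        (1, '2024-12-01 10:00', '2024-12-01 10:05', 0.10),
        (1, '2024-12-01 10:00', '2024-12-01 11:00', 0.20),
        (1, '2024-12-02 10:00', '2024-12-02 10:05', 0.30),
        (2, '2024-12-01 10:00', '2024-12-01 10:05', 0.40);
    """
)

# Inner query: latest created_timestamp per (entity, event_timestamp).
# Outer join on all three columns keeps exactly the matching full rows.
dedup_sql = """
SELECT driver_id, event_timestamp, created_timestamp, conv_rate
FROM subquery
INNER JOIN (
    SELECT driver_id, event_timestamp, MAX(created_timestamp) AS created_timestamp
    FROM subquery
    GROUP BY driver_id, event_timestamp
)
USING (driver_id, event_timestamp, created_timestamp)
ORDER BY driver_id, event_timestamp
"""

rows = conn.execute(dedup_sql).fetchall()
for row in rows:
    print(row)
```

Each `(driver_id, event_timestamp)` pair is left with a single row, the one carrying the greatest `created_timestamp`, so a subsequent as-of join no longer sees ties on the right-hand side. If two rows also tie on `created_timestamp`, this pattern keeps both.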