Code for Chris Hastie's session for Snowflake BUILD 2024 on how to leverage schema evolution for automated metadata-driven ingestion.
To watch the 30-minute recording of this session, follow the link below and create a free account for Snowflake BUILD 2024. This gives you access to the recording of this session, along with recordings of all other sessions that took place during the event.
BUILD 2024: How to leverage schema evolution for automated metadata-driven ingestion
To use this repository, import any of the following notebook files into Snowflake. Each notebook can be imported separately:
- 0_intro
- 1_file_formats
- 2_inferring_schemas_directly
- 3_table_templates
- 4_metadata_driven_ingestion
- 5_schema_evolution
- 9_outro
To import any of these notebooks into Snowflake, you must first do the following:
- Create a database and schema to contain the notebook

      create database if not exists "MY_DATABASE";
      create schema if not exists "MY_DATABASE"."MY_SCHEMA";

- Create a warehouse for the notebook to use for queries

      create warehouse if not exists "MY_WAREHOUSE"
      with
        warehouse_size = XSMALL -- Smallest size of warehouse for light workloads
        auto_suspend = 120 -- 2-minute delay before auto-suspend
        initially_suspended = TRUE -- Will not start the warehouse until something attempts to use it
      ;

- Create an API integration that points to this repository and grant it to the appropriate role

      create api integration if not exists "API__GIT__INTERWORKS_GITHUB"
        api_provider = GIT_HTTPS_API
        api_allowed_prefixes = ('https://github.com/InterWorks/')
        comment = 'API integration for InterWorks repositories stored in GitHub'
        enabled = TRUE
      ;

      -- RBAC
      grant usage on integration "API__GIT__INTERWORKS_GITHUB" to role "MY_ROLE";

- Create the git repository within the database and schema

      create or replace git repository "GIT_REPO__BUILD_2024"
        origin = 'https://github.com/InterWorks/Snowflake-Build-2024---Schema-Evolution.git'
        api_integration = "API__GIT__INTERWORKS_GITHUB"
      ;

- Create the notebooks by selecting "Create from repository" in the dropdown for creating new notebooks, then locating the desired notebook in the repository.
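
If you prefer to stay in SQL rather than use the dropdown, the last two steps can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a tested part of this repository: it assumes the default branch is named `main`, that the notebook file is called `0_intro.ipynb` and sits at the root of the repository, and it uses a hypothetical notebook name `NOTEBOOK__0_INTRO`. Run the `ls` command first and adjust the path and file name to match what it returns.

```sql
-- Work in the schema that contains the git repository object
use schema "MY_DATABASE"."MY_SCHEMA";

-- Refresh Snowflake's copy of the repository so the latest commits are visible
alter git repository "GIT_REPO__BUILD_2024" fetch;

-- Confirm which branches exist (the branch name "main" below is an assumption)
show git branches in git repository "GIT_REPO__BUILD_2024";

-- List the files on the assumed branch to find the actual notebook paths
ls @GIT_REPO__BUILD_2024/branches/main;

-- Create a notebook from a file in the repository
-- Assumptions: branch "main", file "0_intro.ipynb" at the repository root,
-- and the hypothetical notebook name "NOTEBOOK__0_INTRO"
create notebook if not exists "MY_DATABASE"."MY_SCHEMA"."NOTEBOOK__0_INTRO"
  from '@MY_DATABASE.MY_SCHEMA.GIT_REPO__BUILD_2024/branches/main/'
  main_file = '0_intro.ipynb'
  query_warehouse = "MY_WAREHOUSE"
;

-- A notebook created through SQL may need a live version before it can be opened or run
alter notebook "MY_DATABASE"."MY_SCHEMA"."NOTEBOOK__0_INTRO" add live version from last;
```

Repeating the `create notebook` statement with a different `main_file` and notebook name would import the remaining notebooks one at a time.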
To quickly share this repository, use this link:
https://github.com/InterWorks/Snowflake-Build-2024---Schema-Evolution
For more information on the author, Chris Hastie, visit his profile at InterWorks and/or LinkedIn.
My passion is not only to personally provide elegant and simple solutions to complex problems but also to provide others with the knowledge and tools to do this themselves.
I have been engineering data since 2014 and using Snowflake since 2018 to solve customer challenges and drive value, becoming a Data Superhero in 2019 following a series of blog posts and other community engagement.
I am also a SnowPro Subject Matter Expert and have been involved in most current SnowPro exams.
https://interworks.com/people/chris-hastie