
Static fork to maintain references after repo migration. Code for Chris Hastie's session for Snowflake BUILD 2024 on schema evolution for automated metadata-driven ingestion


Schema Evolution for Automated Metadata-Driven Ingestion

Code for Chris Hastie's session for Snowflake BUILD 2024 on how to leverage schema evolution for automated metadata-driven ingestion.

Presentation recording

To watch the 30-minute recording of this session, follow the link below and create a free account for Snowflake BUILD 2024. This grants access to the recording of my session, along with the recordings of all other sessions from the event.

BUILD 2024: How to leverage schema evolution for automated metadata-driven ingestion

QR for BUILD 2024 recording

Leveraging this repository in Snowflake

To leverage this repository, each of the following notebook files can be imported separately into Snowflake:

  • 0_intro
  • 1_file_formats
  • 2_inferring_schemas_directly
  • 3_table_templates
  • 4_metadata_driven_ingestion
  • 5_schema_evolution
  • 9_outro
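
The notebooks above build toward a single pattern: infer a file's schema, create the table from a template, and let `COPY INTO` evolve the table as new columns arrive. A minimal sketch of that pattern is below; the stage `@STG__RAW`, the file format `FF__CSV` and the table `ORDERS` are illustrative placeholders, not names from the notebooks:

```sql
-- Inspect the schema Snowflake infers from the staged files
select *
from table(
  infer_schema(
    location => '@STG__RAW/orders/'
  , file_format => 'FF__CSV'
  )
);

-- Create the table directly from that inferred template
create table if not exists "ORDERS"
  using template (
    select array_agg(object_construct(*))
    from table(
      infer_schema(
        location => '@STG__RAW/orders/'
      , file_format => 'FF__CSV'
      )
    )
  );

-- Allow COPY INTO to add new columns as the source files change
alter table "ORDERS" set enable_schema_evolution = TRUE;

-- Load by column name so evolved columns line up automatically
copy into "ORDERS"
from '@STG__RAW/orders/'
file_format = (format_name = 'FF__CSV')
match_by_column_name = CASE_INSENSITIVE;
```

The notebooks themselves cover each of these pieces in far more detail, including how to drive them from metadata.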

To import any of these notebooks into Snowflake, you must first do the following:

  1. Create a database and schema to contain the notebook

    create database if not exists "MY_DATABASE";
    create schema if not exists "MY_DATABASE"."MY_SCHEMA";
  2. Create a warehouse for the notebook to use for queries

    create warehouse if not exists "MY_WAREHOUSE"
      with
        warehouse_size = XSMALL -- Smallest size of warehouse for light workloads
        auto_suspend = 120 -- 2-minute delay before auto-suspend
        initially_suspended = TRUE -- Will not start the warehouse until something attempts to use it
    ;
  3. Create an API integration that points to this repository, then grant usage on it to the appropriate role

    create api integration if not exists "API__GIT__INTERWORKS_GITHUB"
      api_provider = GIT_HTTPS_API
      api_allowed_prefixes = ('https://github.com/InterWorks/')
      comment = 'API integration for InterWorks repositories stored in GitHub'
      enabled = TRUE
    ;
    -- RBAC
    grant usage on integration "API__GIT__INTERWORKS_GITHUB" to role "MY_ROLE";
  4. Create the git repository within the database and schema

    create or replace git repository "GIT_REPO__BUILD_2024"
      origin = 'https://github.com/InterWorks/Snowflake-Build-2024---Schema-Evolution.git'
      api_integration = "API__GIT__INTERWORKS_GITHUB"
    ;
  5. Create the notebooks by selecting "Create from repository" in the dropdown for creating new notebooks, then locate the desired notebook from the repository.

    Create notebook from repository

    Import notebook from repository
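
As an alternative to the dropdown in the UI, notebooks can also be created from the git repository with SQL. A sketch, assuming the objects created in the steps above and assuming the notebook files carry the `.ipynb` extension (adjust `main_file` if the repository uses a different one):

```sql
-- Optionally pull the latest commits into the repository object first
alter git repository "GIT_REPO__BUILD_2024" fetch;

-- Create a notebook from a file in the repository's main branch
create or replace notebook "NB__0_INTRO"
  from '@"MY_DATABASE"."MY_SCHEMA"."GIT_REPO__BUILD_2024"/branches/main/'
  main_file = '0_intro.ipynb'
  query_warehouse = "MY_WAREHOUSE"
;
```

Repeat the `create notebook` statement for each of the notebook files listed above.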

Sharing this repository

To quickly share this repository, use this QR code:

QR for repo

Alternatively, share this link:

https://github.com/InterWorks/Snowflake-Build-2024---Schema-Evolution

More information on the author

For more information on the author, Chris Hastie, visit his profile at InterWorks and/or LinkedIn.

My passion is not only to personally provide elegant and simple solutions to complex problems but also to provide others with the knowledge and tools to do this themselves.

I have been engineering data since 2014 and using Snowflake since 2018 to solve customer challenges and drive value, becoming a Data Superhero in 2019 following a series of blog posts and other community engagement.

I am also a SnowPro Subject Matter Expert and have been involved in most current SnowPro exams.

InterWorks Profile

https://interworks.com/people/chris-hastie

QR for InterWorks Profile

LinkedIn Profile

https://www.linkedin.com/in/chris-hastie/

QR for LinkedIn
