From 0beba4de8bfee871a10533eb09820bd62be6179c Mon Sep 17 00:00:00 2001
From: mirnawong1
Date: Tue, 15 Oct 2024 10:32:04 +0100
Subject: [PATCH] add truncate

---
 website/blog/2024-10-04-iceberg-is-an-implementation-detail.md | 2 ++
 website/blog/2024-10-05-snowflake-feature-store.md             | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/website/blog/2024-10-04-iceberg-is-an-implementation-detail.md b/website/blog/2024-10-04-iceberg-is-an-implementation-detail.md
index eca0a411dad..dc9b78bba8d 100644
--- a/website/blog/2024-10-04-iceberg-is-an-implementation-detail.md
+++ b/website/blog/2024-10-04-iceberg-is-an-implementation-detail.md
@@ -16,6 +16,8 @@ If you haven’t paid attention to the data industry news cycle, you might have
 
 But I have to be honest: **I don’t care**. But not for the reasons you think.
 
+
+
 ## What is Iceberg?
 
 To have this conversation, we need to start with the same foundational understanding of Iceberg. Apache Iceberg is a high-performance open table format developed for modern data lakes. It was designed for large-scale datasets, and within the project, there are many ways to interact with it. When people talk about Iceberg, it often means multiple components including but not limited to:
diff --git a/website/blog/2024-10-05-snowflake-feature-store.md b/website/blog/2024-10-05-snowflake-feature-store.md
index fb62955d4a4..cf5c55be1b5 100644
--- a/website/blog/2024-10-05-snowflake-feature-store.md
+++ b/website/blog/2024-10-05-snowflake-feature-store.md
@@ -13,6 +13,8 @@ Flying home into Detroit this past week working on this blog post on a plane and
 Think of the manufacturing materials needed as our data and the building of the bridge as the building of our ML models. There are thousands of engineers and construction workers taking materials from all over the world, pulling only the specific pieces needed for each part of the project. 
 However, to make this project truly work at this scale, we need the warehousing and logistics to ensure that each load of concrete rebar and steel meets the standards for quality and safety needed and is available to the right people at the right time — as even a single fault can have catastrophic consequences or cause serious delays in project success. This warehouse and the associated logistics play the role of the feature store, ensuring that data is delivered consistently where and when it is needed to train and run ML models.
 
+
+
 ## What is a feature?
 
 A feature is a transformed or enriched data that serves as an input into a machine learning model to make predictions. In machine learning, a data scientist derives features from various data sources to build a model that makes predictions based on historical data. To capture the value from this model, the enterprise must operationalize the data pipeline, ensuring that the features being used in production at inference time match those being used in training and development.