From f52b1497511d65a159a3a116880e69e3a3f39d6e Mon Sep 17 00:00:00 2001
From: Mark Harrison
Date: Thu, 26 Sep 2024 09:21:02 -0700
Subject: [PATCH] Update hive_partitioning.md

Noting how to include/exclude artificial hive partitioning columns.
---
 docs/data/partitioning/hive_partitioning.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/data/partitioning/hive_partitioning.md b/docs/data/partitioning/hive_partitioning.md
index 08996f4437a..f86cc83c618 100644
--- a/docs/data/partitioning/hive_partitioning.md
+++ b/docs/data/partitioning/hive_partitioning.md
@@ -26,6 +26,14 @@ COPY (SELECT *, year(timestamp) AS year, month(timestamp) AS month FROM services
 TO 'test' (PARTITION_BY (year, month));
 ```
 
+When reading, the partition columns are read from the directory structure and
+can be included or excluded depending on the `hive_partitioning` parameter.
+
+```sql
+FROM read_parquet('test/*/*/*.parquet', hive_partitioning = true); -- will include year, month partition columns
+FROM read_parquet('test/*/*/*.parquet', hive_partitioning = false); -- will not include year, month columns
+```
+
 ## Hive Partitioning
 
 Hive partitioning is a [partitioning strategy](https://en.wikipedia.org/wiki/Partition_(database)) that is used to split a table into multiple files based on **partition keys**. The files are organized into folders. Within each folder, the **partition key** has a value that is determined by the name of the folder.