Note: This guide assumes a basic understanding of Hadoop, the Hortonworks Sandbox, Apache Hive, and navigating HDFS through 'File View'.

Hive Dynamic Partitioning

Data derived from public U.S. patent repositories.

Description:

This repository demonstrates the syntax for creating internal and external tables with dynamic partitioning in Apache Hive.
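
As a rough illustration of the pattern, the sketch below loads a partitioned internal (managed) table and a partitioned external table from a staging table using dynamic partitioning. It is not taken from script.sql: the table names, columns, delimiter, and HDFS paths are all assumptions made for the example.

```sql
-- Hypothetical schema and paths, for illustration only (not from script.sql).

-- Allow Hive to derive partition values from the query results.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Staging table over the raw files in HDFS (assumed comma-delimited text).
CREATE EXTERNAL TABLE IF NOT EXISTS patent_staging (
  patent_id  STRING,
  title      STRING,
  grant_year INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/tmp/patent';

-- Internal (managed) table: Hive owns the data and deletes it on DROP TABLE.
CREATE TABLE IF NOT EXISTS patent_by_year (
  patent_id STRING,
  title     STRING
)
PARTITIONED BY (grant_year INT);

-- External table: DROP TABLE removes only the metadata, the files remain.
CREATE EXTERNAL TABLE IF NOT EXISTS patent_by_year_ext (
  patent_id STRING,
  title     STRING
)
PARTITIONED BY (grant_year INT)
LOCATION '/tmp/patent_by_year_ext';

-- Dynamic partition inserts: the partition column comes last in the SELECT,
-- and Hive creates one partition per distinct grant_year value.
INSERT OVERWRITE TABLE patent_by_year PARTITION (grant_year)
SELECT patent_id, title, grant_year
FROM patent_staging;

INSERT OVERWRITE TABLE patent_by_year_ext PARTITION (grant_year)
SELECT patent_id, title, grant_year
FROM patent_staging;
```

The partition column is declared only in the PARTITIONED BY clause, not in the regular column list, because Hive stores its value in the partition directory name rather than in the data files.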

The Files:

patent.db.zip - Contains the raw data.

questions.txt - The use cases the scripts were written to answer.

script.sql - The scripts used to create the tables, their partitioning, and example queries.

Prerequisites:

1. Hortonworks Sandbox (Hadoop framework) installed and running with Apache Ambari.

(https://hortonworks.com/tutorial/learning-the-ropes-of-the-hortonworks-sandbox/)

2. Apache Hive installed on the sandbox, with its data stored in HDFS.

Execution:

1. Download (clone) this repository to your local machine.

2. Unzip patents.db.zip.

3. Create a new folder called 'patent' in HDFS using 'File View'.

(https://hortonworks.com/tutorial/loading-and-querying-data-with-hadoop/)

4. Place the unzipped data into HDFS using 'File View'.

5. Open Hive, paste in the entire script, and execute it.
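
Once the script has run, one way to check that the partitions were created is to list them and run a query that filters on the partition column. The table name `patent_by_year` and the column `grant_year` below are hypothetical placeholders; substitute the names that script.sql actually defines.

```sql
-- Hypothetical table and column names; replace with those created by script.sql.

-- List the partitions Hive created during the dynamic partition insert.
SHOW PARTITIONS patent_by_year;

-- Filtering on the partition column lets Hive read only the matching partition.
SELECT COUNT(*)
FROM patent_by_year
WHERE grant_year = 2000;
```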