Note: Requires a basic understanding of Hadoop, the Hortonworks Sandbox, Apache Hive, and how to navigate HDFS through 'File View'.

Hive Dynamic Partitioning

Data derived from public U.S. patent repositories.

Description:

This repository demonstrates the syntax for creating internal and external tables with dynamic partitioning in Apache Hive.
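
As a rough illustration of what dynamic partitioning looks like in HiveQL, here is a minimal sketch. It is not the content of script.sql: the table names (patent_staging, patent_by_year, patent_by_year_ext), the columns, the comma delimiter, and the HDFS paths are all hypothetical placeholders chosen for the example.

```sql
-- Hypothetical names throughout: tables, columns, delimiter, and HDFS paths
-- are illustrative assumptions, not taken from script.sql.

-- Let Hive derive partition values from the data instead of listing them by hand.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Staging table over the raw files (assumed comma-delimited text).
CREATE TABLE IF NOT EXISTS patent_staging (
  patent_id  STRING,
  title      STRING,
  grant_year INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Moves the files from the (assumed) upload folder into the staging table.
LOAD DATA INPATH '/tmp/patent' INTO TABLE patent_staging;

-- Internal (managed) table: Hive owns the data; DROP TABLE deletes it.
CREATE TABLE IF NOT EXISTS patent_by_year (
  patent_id STRING,
  title     STRING
)
PARTITIONED BY (grant_year INT)
STORED AS ORC;

-- External table: Hive tracks only metadata; DROP TABLE leaves the files in place.
CREATE EXTERNAL TABLE IF NOT EXISTS patent_by_year_ext (
  patent_id STRING,
  title     STRING
)
PARTITIONED BY (grant_year INT)
STORED AS ORC
LOCATION '/tmp/patent_by_year_ext';

-- Dynamic partition insert: the partition column must come last in the SELECT.
INSERT OVERWRITE TABLE patent_by_year PARTITION (grant_year)
SELECT patent_id, title, grant_year
FROM patent_staging;

INSERT OVERWRITE TABLE patent_by_year_ext PARTITION (grant_year)
SELECT patent_id, title, grant_year
FROM patent_staging;
```

With nonstrict mode, Hive creates one partition directory per distinct grant_year value found in the staging data, so no partition has to be declared in advance.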

The Files:

patent.db.zip - Contains the raw data.
questions.txt - Use cases used to create the scripts.
script.sql - The scripts used to create the tables, their partitioning, and example queries.

Prerequisites:

1. Hortonworks Sandbox (Hadoop framework) installed and running with Apache Ambari.
2. Apache Hive installed and available in the sandbox.

Execution:

1. Download (clone) this repository to your local machine.
2. Unzip patent.db.zip.
3. Create a new folder called 'patent' in HDFS using 'File View'.
4. Place the unzipped data into HDFS using 'File View'.
5. Open Hive, paste in the entire contents of script.sql, and execute it.
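
Once the script has run, a quick check along these lines can confirm that the partitions were created. This is only a sketch: patent_by_year and grant_year are hypothetical placeholders, so substitute a table and partition column that script.sql actually creates.

```sql
-- patent_by_year and grant_year are hypothetical; use names from script.sql.

-- List the partitions Hive created for the table.
SHOW PARTITIONS patent_by_year;

-- Count rows per partition to confirm the data landed where expected.
SELECT grant_year, COUNT(*) AS patent_count
FROM patent_by_year
GROUP BY grant_year
ORDER BY grant_year;
```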