Create Kafka Producer, Kafka Consumer, and insert streaming data to HBase using Java and Spark.

malikfm/SparkStreamingHBase

SparkStreamingHBase

This project shows how to create a Kafka producer and a Kafka consumer, and how to insert streaming data into HBase using Java and Spark.

The sample data comes from humidity.csv in the dataset at https://www.kaggle.com/selfishgene/historical-hourly-weather-data and was converted to JSON format.
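The CSV-to-JSON conversion is not part of the commands documented here, so the following is only a minimal sketch of how one row could be converted. It assumes a wide layout (a datetime column followed by one column per city) and a flat JSON object per row. The class name CsvRowToJsonSketch, the method names, and the sample values are hypothetical, and the JSON shape actually used in humidity.json may differ.

```java
import java.util.StringJoiner;

public class CsvRowToJsonSketch {

    // Escapes backslashes and double quotes so the text is safe inside a JSON string.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds one flat JSON object from a header line and a data line.
    // Empty cells become JSON null; every other value is kept as a string.
    // Assumption: fields contain no embedded commas, so a plain split is enough.
    static String toJson(String headerLine, String dataLine) {
        String[] keys = headerLine.split(",", -1);
        String[] values = dataLine.split(",", -1);
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (int i = 0; i < keys.length && i < values.length; i++) {
            String value = values[i].isEmpty() ? "null" : quote(values[i]);
            json.add(quote(keys[i]) + ":" + value);
        }
        return json.toString();
    }

    public static void main(String[] args) {
        // Illustrative row only, not taken from the real file.
        String header = "datetime,Vancouver,Portland";
        String row = "2012-10-01 13:00:00,76.0,";
        System.out.println(toJson(header, row));
    }
}
```

A real conversion would more likely use a CSV parser and a JSON library; the hand-rolled version above only keeps the sketch free of dependencies.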

Prerequisites

Please ensure that you have met the following requirements:

  • Java 8
  • Maven
  • Apache Spark 2.x
  • Apache Kafka 0.10.x
  • Apache HBase 1.x

Using this project

This project consists of two main classes:

  • ProducerMain: reads an hourly time series of humidity data (humidity.json) and sends it to Kafka.
  • ConsumerMain: consumes the data from Kafka, transforms it, and saves it to HBase.
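The transform step in ConsumerMain is not described in detail here. As a rough illustration of what such a step might do, the sketch below maps one humidity reading per city to a row key, a column qualifier, and a value, which is the shape of data an HBase write needs. The row key format (city#datetime), the qualifier name "humidity", and all class and method names are assumptions made for this example, not the project's actual schema. The sketch uses only the Java standard library and does not call the HBase client.

```java
import java.util.ArrayList;
import java.util.List;

public class HumidityTransformSketch {

    // One value destined for HBase: row key, column qualifier, and value.
    static final class Cell {
        final String rowKey;
        final String qualifier;
        final String value;

        Cell(String rowKey, String qualifier, String value) {
            this.rowKey = rowKey;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // Turns one wide record (datetime plus one reading per city) into cells.
    // header[0] is assumed to be the datetime column; missing readings are skipped.
    static List<Cell> toCells(String[] header, String[] row) {
        List<Cell> cells = new ArrayList<>();
        String datetime = row[0];
        for (int i = 1; i < header.length && i < row.length; i++) {
            if (row[i].isEmpty()) {
                continue; // no reading for this city at this hour
            }
            // Hypothetical row key: city first, so one city's readings sort together.
            cells.add(new Cell(header[i] + "#" + datetime, "humidity", row[i]));
        }
        return cells;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        String[] header = {"datetime", "Vancouver", "Portland"};
        String[] row = {"2012-10-01 13:00:00", "76.0", ""};
        for (Cell c : toCells(header, row)) {
            System.out.println(c.rowKey + " " + c.qualifier + "=" + c.value);
        }
    }
}
```

Putting the city before the timestamp in the row key is one common choice for time series in HBase, because it keeps each city's readings contiguous; whether it suits this project depends on how the table is queried.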

Build

mvn install

Run

Producer

spark-submit --class com.malik.main.ProducerMain --master local[2] malik/engine/SparkStreamingHBase-1.0-SNAPSHOT-jar-with-dependencies.jar

Consumer

spark-submit --class com.malik.main.ConsumerMain --master local[2] malik/engine/SparkStreamingHBase-1.0-SNAPSHOT-jar-with-dependencies.jar

To monitor your Spark job, open localhost:port in your browser. The default port is 4040.
