This project illustrates how to create a Kafka producer and a Kafka consumer, and how to insert streaming data into HBase, using Java and Spark.
The dummy data were taken from humidity.csv and then converted to JSON format.
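The CSV-to-JSON step could look roughly like the sketch below. It is not the project's actual converter: the class name `CsvToJsonSketch`, the column names (`datetime`, `city_a`, `city_b`) and the sample values are hypothetical, and it assumes a header row and cells that contain no quoted commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the project: turns CSV lines into JSON lines.
class CsvToJsonSketch {

    // Wraps a value in double quotes, escaping backslashes and quotes.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Empty cell -> null, plain decimal -> JSON number, anything else -> JSON string.
    static String jsonValue(String cell) {
        String v = cell.trim();
        if (v.isEmpty()) {
            return "null";
        }
        if (v.matches("-?(0|[1-9]\\d*)(\\.\\d+)?")) {
            return v;
        }
        return quote(v);
    }

    // Builds one JSON object, using the header names as keys.
    static String rowToJson(String[] header, String line) {
        String[] cells = line.split(",", -1); // -1 keeps trailing empty cells
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < header.length; i++) {
            if (i > 0) {
                sb.append(",");
            }
            String cell = i < cells.length ? cells[i] : "";
            sb.append(quote(header[i].trim())).append(":").append(jsonValue(cell));
        }
        return sb.append("}").toString();
    }

    // Converts a whole file (header + rows) into one JSON object per row; blank lines are skipped.
    static List<String> toJsonLines(List<String> csvLines) {
        List<String> out = new ArrayList<>();
        if (csvLines.isEmpty()) {
            return out;
        }
        String[] header = csvLines.get(0).split(",", -1);
        for (String line : csvLines.subList(1, csvLines.size())) {
            if (!line.trim().isEmpty()) {
                out.add(rowToJson(header, line));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample rows; the real humidity.csv layout may differ.
        List<String> csv = Arrays.asList(
            "datetime,city_a,city_b",
            "2020-01-01 00:00:00,76.0,81.0",
            "2020-01-01 01:00:00,,80.0");
        toJsonLines(csv).forEach(System.out::println);
    }
}
```

Emitting one JSON object per line keeps each record independent, which makes it convenient to send each line as a single Kafka message.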
Please ensure that you have met the following requirements:
- Java 8
- Maven
- Apache Spark 2.x
- Apache Kafka 0.10.x
- Apache HBase 1.x
This project consists of two main classes:
- ProducerMain: reads the hourly humidity time series (humidity.json) and sends it to Kafka.
- ConsumerMain: consumes the data from Kafka, transforms it, and saves it to HBase.
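Both classes need Kafka client settings. The sketch below shows the kind of configuration they might use, assuming the consumer API shipped with Kafka 0.10.x clients and string keys and values. The class `KafkaConfigSketch`, its method names, the broker address and the group id are hypothetical; the project's real settings may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of the project: builds Kafka client settings.
class KafkaConfigSketch {

    // Settings that could be passed to new KafkaProducer<String, String>(props).
    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    // Settings in the Map<String, Object> form that the Spark Streaming
    // Kafka 0-10 integration takes as kafkaParams.
    static Map<String, Object> consumerParams(String bootstrapServers, String groupId) {
        Map<String, Object> params = new HashMap<>();
        params.put("bootstrap.servers", bootstrapServers);
        params.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        params.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        params.put("group.id", groupId);
        // Start from the oldest available offset when the group has no committed offset.
        params.put("auto.offset.reset", "earliest");
        return params;
    }

    public static void main(String[] args) {
        // localhost:9092 is an assumed broker address; adjust it to your setup.
        System.out.println(producerProps("localhost:9092"));
        System.out.println(consumerParams("localhost:9092", "humidity-consumer"));
    }
}
```

Keeping the settings in plain `Properties` and `Map` objects means this sketch compiles without the Kafka dependency; in the real classes they would be handed to the Kafka and Spark APIs.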
Build the project with Maven:
mvn install
Run the producer:
spark-submit --class com.malik.main.ProducerMain --master local[2] malik/engine/SparkStreamingHBase-1.0-SNAPSHOT-jar-with-dependencies.jar
Run the consumer:
spark-submit --class com.malik.main.ConsumerMain --master local[2] malik/engine/SparkStreamingHBase-1.0-SNAPSHOT-jar-with-dependencies.jar
Visit localhost:port in your browser to monitor your Spark job. The default port is 4040.