Simple Kafka Twitter Streaming Which Streams Tweets in from Apache Kafka and Stores the Data into HBase on the HDFS File System
Requires the following stack installed:

- HADOOP & HDFS (NameNode @ localhost:9000)
- HBASE
- YARN
- ZOOKEEPER
- KAFKA

With these services up:

- HBase server (storing into HDFS)
- Zookeeper (localhost:2181)
- HBase Thrift
- Kafka server @ localhost:9092 for the producer streaming to topic ['myWorld']
- Kafka producer console @ localhost:9092 streaming to topic ['myWorld']
- Kafka consumer console listening to Zookeeper @ localhost:2181 for topic ['myWorld']
Create an api-keys.py file in the project directory (you can rename 'api-keys.py.example'):
KEY = 'wjRs...............fKef'
SECRET = '3xB5n................................GWEV6HMDbPhth'
TOKEN = '14959................................7QwUIBKyRZB2QN'
TOKEN_SECRET = 'ysLFG2v..............................CA8p1GXo'
Get your own tokens by creating a Twitter app at [apps.twitter.com](https://apps.twitter.com/).
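Note that a hyphenated filename like api-keys.py cannot be pulled in with a plain Python import statement, since 'api-keys' is not a valid module name. One way the scripts could load it (a minimal sketch; this loading style is an assumption, not mandated by the repo) is by path:

```python
import importlib.util

# Load 'api-keys.py' by file path, since a hyphenated name can't be imported directly.
spec = importlib.util.spec_from_file_location("api_keys", "api-keys.py")
api_keys = importlib.util.module_from_spec(spec)
spec.loader.exec_module(api_keys)

# KEY, SECRET, TOKEN, TOKEN_SECRET are now available as module attributes.
print(api_keys.KEY)
```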
Run (each in its own terminal):

- Zookeeper: [ $ZOOKEEPER_HOME/bin/zkServer.sh start ]
- Kafka server: [ $KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties ]
- Create the Kafka topic: [ $KAFKA_HOME/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic myWorld ]
- Kafka producer console: [ $KAFKA_HOME/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic myWorld ]
- Kafka consumer console: [ $KAFKA_HOME/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic myWorld ]
- HBase Thrift: [ $HBASE_HOME/bin/hbase thrift start ]
- Producer (see the sketch after this list): [ python producer.py ]
- Consumer (see the sketch at the end of this README): [ python consumer.py ]
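producer.py itself is not reproduced here; a minimal sketch of what it presumably does (assuming tweepy 3.x for the Twitter stream and kafka-python for the broker; the track keyword and placeholder credentials are illustrative only):

```python
import tweepy
from kafka import KafkaProducer

# Placeholders: fill in from api-keys.py.
KEY, SECRET = "...", "..."
TOKEN, TOKEN_SECRET = "...", "..."

producer = KafkaProducer(bootstrap_servers="localhost:9092")

class KafkaForwarder(tweepy.StreamListener):
    """Forwards each raw tweet JSON payload to the 'myWorld' topic."""

    def on_data(self, data):
        producer.send("myWorld", data.encode("utf-8"))
        return True   # keep the stream alive

    def on_error(self, status_code):
        return False  # disconnect on stream errors (e.g. 420 rate limiting)

auth = tweepy.OAuthHandler(KEY, SECRET)
auth.set_access_token(TOKEN, TOKEN_SECRET)
stream = tweepy.Stream(auth, KafkaForwarder())
stream.filter(track=["kafka"])  # illustrative keyword filter
```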
| TOPIC         | 'myWorld'     |
| ------------- | ------------- |
| TABLE NAME    | 'tweet-table' |
| COLUMN FAMILY | 'json'        |
| COLUMN NAME   | 'data'        |
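Creating that table through the Thrift server started above might look like the following (a sketch assuming the happybase client, which talks to HBase Thrift on its default port 9090):

```python
import happybase

# Connect to the HBase Thrift server (default port 9090).
connection = happybase.Connection("localhost")

# One column family 'json'; the qualified column will be 'json:data'.
if b"tweet-table" not in connection.tables():
    connection.create_table("tweet-table", {"json": dict()})
```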
HBASE STORAGE STRUCTURE

| KEY   | COLUMN FAMILY 'json' : COLUMN 'data' |
| ----- | ------------------------------------ |
| row 1 | Data 1                               |
| row 2 | Data 2                               |
| row 3 | Data 3                               |
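And a minimal sketch of what consumer.py presumably does to produce that layout (assuming kafka-python and happybase; note that kafka-python connects to the broker at localhost:9092 rather than to Zookeeper like the console consumer above):

```python
import happybase
from kafka import KafkaConsumer

consumer = KafkaConsumer("myWorld", bootstrap_servers="localhost:9092")
connection = happybase.Connection("localhost")  # HBase Thrift, default port 9090
table = connection.table("tweet-table")

# Each Kafka message becomes one HBase row: key 'row N', column 'json:data'.
for i, message in enumerate(consumer, start=1):
    table.put(f"row {i}".encode(), {b"json:data": message.value})
```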