Streaming Analytics for Apache Kafka and Apache Flink

Streaming Analytics Project for the Course "Advanced Analytics and Machine Learning"

  • Summer Term 2021 | Ludwig-Maximilians-Universität München

Collaborators:

  • Giacomo May
  • Manuel Neumayer

Dataset:

  • ~14,500 tweets from the Twitter Streaming API

LRZ Cloud:

Apache Kafka and Apache Flink are evaluated on VMs in the LRZ Cloud.


The purpose of this project is to analyze and compare two popular streaming platforms, Apache Kafka and Apache Flink, with respect to metrics such as throughput, latency, processing speed, and scalability, in both non-parallel and parallel streaming scenarios.

The results are documented in a conference paper.

Terminology

  • Throughput: Amount of data (in MB) sent per unit of time (e.g. per second)
  • Latency: Time elapsed between sending a stream object and receiving it
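
The two definitions above can be illustrated with a minimal Python sketch. It is not the measurement code used in this project: the `MessageRecord` type, the function names, the choice of 1 MB = 1,000,000 bytes, and the measurement window (first send to last receive) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class MessageRecord:
    """One stream object with its timestamps (hypothetical record type)."""
    sent_at: float      # send time in seconds
    received_at: float  # receive time in seconds
    size_bytes: int     # payload size in bytes


def throughput_mb_per_s(records: Sequence[MessageRecord]) -> float:
    """Total data volume in MB divided by the elapsed time in seconds.

    Assumes 1 MB = 1,000,000 bytes and a window that runs from the
    first send to the last receive.
    """
    if not records:
        return 0.0
    total_mb = sum(r.size_bytes for r in records) / 1_000_000
    window = max(r.received_at for r in records) - min(r.sent_at for r in records)
    if window <= 0:
        raise ValueError("measurement window must be positive")
    return total_mb / window


def mean_latency_s(records: Sequence[MessageRecord]) -> float:
    """Average time between sending and receiving a stream object."""
    if not records:
        raise ValueError("no records to average")
    return sum(r.received_at - r.sent_at for r in records) / len(records)


if __name__ == "__main__":
    sample = [
        MessageRecord(sent_at=0.0, received_at=0.5, size_bytes=1_000_000),
        MessageRecord(sent_at=1.0, received_at=2.0, size_bytes=3_000_000),
    ]
    print(f"throughput: {throughput_mb_per_s(sample)} MB/s")
    print(f"mean latency: {mean_latency_s(sample)} s")
```

Latency measured this way is only meaningful if sender and receiver timestamps come from the same clock or from synchronized clocks; otherwise clock skew is counted as latency.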

Useful reads