Skip to content

polyzos/pulsar-flink-stateful-streams

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

WIP:

  • Optimize to backpressure, buffers, checkpoint intervals and wm intervals for larger state
  • User RocksDB API to demonstrate what gets written and how
  • Use time based joins for session windows and add time constraints

Use Case 1

Data Enrichment with Topic Lookups

Use Case 2

Data Aggregation with Time Constraints on Time Windows

Setup a Pulsar Cluster

docker run -rm -it --name pulsar \
-p 6650:6650  -p 8080:8080 \
--mount source=pulsardata,target=/pulsar/data \
--mount source=pulsarconf,target=/pulsar/conf \
apachepulsar/pulsar:2.9.1 \
bin/pulsar standalone

Setup Pulsar Logical Components

Go into your container

docker exec -it pulsar bash

and run the following commands

  1. Create topics
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/orders
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/users
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/items

bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/view_events
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/purchase_events
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/cart_events

bin/pulsar-admin topics list public/default
  1. Set infinite Retention
bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/users
bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/items

bin/pulsar-admin topics get-retention persistent://public/default/users
bin/pulsar-admin topics get-retention persistent://public/default/items

bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/view_events
bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/purchase_events
bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/cart_events

Start a Flink Cluster

start-cluster

Deploy the Flink Job

./deploy.sh

Monitor Flink logs

Tail the logs

tail -f log/flink-*-taskexecutor-*

The original Datasets can be found on the following links:

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published