WIP:
- Optimize to backpressure, buffers, checkpoint intervals and wm intervals for larger state
- User RocksDB API to demonstrate what gets written and how
- Use time based joins for session windows and add time constraints
docker run -rm -it --name pulsar \
-p 6650:6650 -p 8080:8080 \
--mount source=pulsardata,target=/pulsar/data \
--mount source=pulsarconf,target=/pulsar/conf \
apachepulsar/pulsar:2.9.1 \
bin/pulsar standalone
Go into your container
docker exec -it pulsar bash
and run the following commands
- Create topics
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/orders
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/users
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/items
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/view_events
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/purchase_events
bin/pulsar-admin topics create-partitioned-topic -p 1 persistent://public/default/cart_events
bin/pulsar-admin topics list public/default
- Set infinite Retention
bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/users
bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/items
bin/pulsar-admin topics get-retention persistent://public/default/users
bin/pulsar-admin topics get-retention persistent://public/default/items
bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/view_events
bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/purchase_events
bin/pulsar-admin topics set-retention -s -1 -t -1 persistent://public/default/cart_events
start-cluster
Deploy the Flink Job
./deploy.sh
Tail the logs
tail -f log/flink-*-taskexecutor-*
The original Datasets can be found on the following links: