Follow the Wiki to Set Up the Docker-based Environment
Building an End-to-End Streaming Analytics and Recommendations Pipeline with Spark, Kafka, and TensorFlow
Part 1 (Analytics and Visualizations)
- Analytics and Visualizations Overview (Live Demo!)
- Verify Environment Setup (Docker, Cloud Instance)
- Notebooks (Zeppelin, Jupyter/IPython)
- Interactive Data Analytics (Spark SQL, Hive, Presto)
- Graph Analytics (Spark, Elastic, NetworkX, TitanDB)
- Time-series Analytics (Spark, Cassandra)
- Visualizations (Kibana, Matplotlib, D3)
- Approximate Queries (Spark SQL, Redis, Algebird; see the sketch after this list)
- Workflow Management (Airflow)
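To make the Approximate Queries topic concrete, here is a minimal PySpark sketch (Spark 2.x API) of a HyperLogLog-backed approximate distinct count and an approximate quantile. The `data/ratings.csv` path and the column names are hypothetical placeholders, not the workshop dataset.

```python
# Minimal sketch: approximate queries with Spark SQL (Spark 2.x API).
# The input path and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import approx_count_distinct, col

spark = SparkSession.builder.appName("approx-queries-demo").getOrCreate()

ratings = spark.read.csv("data/ratings.csv", header=True, inferSchema=True)

# An exact distinct count shuffles every value; the HyperLogLog-based
# approximation trades a small, bounded relative error (rsd) for far
# less memory and time.
ratings.agg(
    approx_count_distinct(col("user_id"), rsd=0.01).alias("approx_users")
).show()

# Approximate quantiles avoid a full sort of the column.
print(ratings.approxQuantile("rating", [0.5, 0.95], 0.01))
```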
Part 2 (Streaming and Recommendations)
- Streaming and Recommendations (Live Demo!)
- Streaming (NiFi, Kafka, Spark Streaming, Flink)
- Clustering-based Recommendation (Spark ML, scikit-learn)
- Graph-based Recommendation (Spark ML, Spark GraphX)
- Collaborative Filtering-based Recommendation (Spark ML; see the sketch after this list)
- NLP-based Recommendation (CoreNLP, NLTK)
- Geo-based Recommendation (Elasticsearch)
- Hybrid On-Premise+Cloud Auto-scale Deploy (Docker)
- Save Workshop Environment for Your Use Cases
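As a taste of the collaborative filtering session, here is a minimal sketch using Spark ML's ALS estimator (`coldStartStrategy` and `recommendForAllUsers` require Spark 2.2+). The input path and column names are hypothetical.

```python
# Minimal sketch: collaborative filtering with Spark ML's ALS.
# The input path and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS

spark = SparkSession.builder.appName("als-recs-demo").getOrCreate()

ratings = spark.read.csv("data/ratings.csv", header=True, inferSchema=True)
train, test = ratings.randomSplit([0.8, 0.2], seed=42)

als = ALS(
    userCol="user_id",
    itemCol="item_id",
    ratingCol="rating",
    rank=10,
    maxIter=10,
    regParam=0.1,
    coldStartStrategy="drop",  # drop NaN predictions for unseen users/items
)
model = als.fit(train)

# Top-5 personalized recommendations for every user.
model.recommendForAllUsers(5).show(truncate=False)
```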
Upcoming Workshops
- Washington DC: Saturday, June 18th
- Seattle: Saturday, July 30th
- Santa Clara: Saturday, August 6th
- Chicago: Saturday, September 10th
- Toronto: Saturday, September 17th
- New York: Saturday, September 24th
- Barcelona: Saturday, October 1st
- Munich: Saturday, October 15th
- London: Saturday, October 22nd
- Brussels: Saturday, October 29th
- Oslo: Monday, October 31st
- Tokyo: Saturday, December 3rd
- Shanghai: Saturday, December 10th
- Beijing: Saturday, December 17th
- Hyderabad: Saturday, December 24th
- Bangalore: Saturday, December 31st
- Sydney: Saturday, January 7th, 2017
- Melbourne: Saturday, January 14th, 2017
- Sao Paulo: Saturday, February 11th, 2017
- Rio de Janeiro: Saturday, February 18th, 2017
Workshop Overview
The goal of this workshop is to build an end-to-end, streaming data analytics and recommendations pipeline on your local machine using Docker and the latest streaming analytics tools:
- First, we create a data pipeline to interactively analyze, approximate, and visualize streaming data using modern tools such as Apache Spark, Kafka, Zeppelin, IPython, and Elasticsearch (a minimal streaming sketch follows this list).
- Next, we extend the pipeline to generate personalized recommendation models from the streaming data, applying popular machine learning, graph, and natural language processing techniques such as collaborative filtering, clustering, and topic modeling.
- Last, we productionize our pipeline and serve live recommendations to our users!
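As a rough illustration of the first stage, here is a minimal sketch that consumes a Kafka topic with Spark Structured Streaming (the successor to the DStream-based Spark Streaming API named in the agenda) and maintains a per-key windowed count. The broker address and the "ratings" topic are hypothetical, and the spark-sql-kafka connector package is assumed to be on the classpath.

```python
# Minimal sketch: consuming a Kafka topic with Spark Structured
# Streaming and running a continuous windowed aggregation.
# Broker address and topic name are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window

spark = SparkSession.builder.appName("streaming-analytics-demo").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "ratings")
    .load()
)

# Kafka delivers raw bytes; cast the message key to a string and count
# events per 1-minute window keyed on it (e.g., an item id).
counts = (
    events.select(col("key").cast("string"), col("timestamp"))
    .groupBy(window(col("timestamp"), "1 minute"), col("key"))
    .count()
)

query = counts.writeStream.outputMode("complete").format("console").start()
query.awaitTermination()
```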