This repository contains the source code for the slides and the project, covering the following topics:
- Overview, MapReduce, Hadoop.
- Spark basics and RDD.
- Cloud computing and Microsoft Azure.
- SQL and SparkSQL.
- Spark internals.
- Algorithm design for big data systems.
- GraphX/GraphFrames.
- Spark Streaming, Flink.
- Machine learning and MLlib.
- HBase, MongoDB.