This is the central repository for all materials related to Spark: The Definitive Guide by Bill Chambers and Matei Zaharia.
This repository is currently a work in progress and new material will be added over time.
To run the examples on your local machine, either copy everything in the "data" folder to /data
on your computer or, when reading in a dataset from the book, specify the path to that dataset on your local machine.
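
As a rough illustration of the second option, the sketch below builds the full path to one dataset from a configurable data folder and passes it to Spark's CSV reader. The helper names `dataset_path` and `read_flight_data`, the example folder `/home/me/spark-guide/data`, and the choice of the `flight-data/csv/2015-summary.csv` file are assumptions made for this example, not instructions from the repository; substitute whichever dataset and location you actually use.

```python
# Hypothetical helpers for illustration; they are not part of the book's code.

def dataset_path(relative_path, data_root="/data"):
    """Join a dataset's relative path onto the folder that holds the book's data."""
    # Plain string handling with forward slashes, which Spark accepts in paths.
    return data_root.rstrip("/") + "/" + relative_path.lstrip("/")


def read_flight_data(spark, data_root="/data"):
    """Read one CSV dataset using an existing SparkSession (requires pyspark)."""
    # The dataset chosen here is only an example.
    path = dataset_path("flight-data/csv/2015-summary.csv", data_root)
    return (spark.read
            .option("header", "true")       # first line holds column names
            .option("inferSchema", "true")  # let Spark guess column types
            .csv(path))


# Option 1: the data was copied to /data.
print(dataset_path("flight-data/csv/2015-summary.csv"))
# Option 2: the data lives elsewhere (assumed location), so pass that folder.
print(dataset_path("flight-data/csv/2015-summary.csv", "/home/me/spark-guide/data"))
```

With a running session, `read_flight_data(spark, "/home/me/spark-guide/data")` would load the file from the custom location, while `read_flight_data(spark)` falls back to /data.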
TODO: Detailed setup instructions will be uploaded soon.