This repository hosts the code for both assessed exercises of the Big Data course at the University of Glasgow.
(You can watch a video I made about the course here: https://youtu.be/bQ-2mZoWLGE)
The task for the first exercise was to implement a simplified version of the PageRank algorithm over a parsed version of the complete Wikipedia edit history as of January 2008.
To accomplish this, we had to design and implement algorithms for parsing, filtering, projecting, and transforming data.
The solution and its explanation can be found in the folder wiki.
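
For readers unfamiliar with the algorithm, here is a minimal sketch of what a simplified PageRank can look like. It is not the code in the wiki folder. It assumes the common iterative update PR(p) = (1 - d) + d * sum(PR(q) / outdegree(q)) with damping d = 0.85 and a fixed number of iterations, and it assumes the link graph has already been parsed into a dictionary. The function name `pagerank` and its parameters are made up for this illustration.

```python
from collections import defaultdict


def pagerank(links, iterations=10, damping=0.85):
    """Compute simplified PageRank scores.

    links: dict mapping an article title to the set of titles it links to
    (assumed input shape, not the format used in this repository).
    """
    # Every article that appears as a source or a target gets a score.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Start every page with the same score.
    scores = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # Each page splits its current score evenly among its outgoing links.
        contributions = defaultdict(float)
        for source, targets in links.items():
            if not targets:
                continue
            share = scores[source] / len(targets)
            for target in targets:
                contributions[target] += share

        # Apply the damped update to every page.
        scores = {
            page: (1.0 - damping) + damping * contributions[page]
            for page in pages
        }
    return scores
```

For example, `pagerank({"A": {"B"}, "B": {"A"}})` leaves both pages at a score of 1.0, because each page passes its whole score to the other on every iteration.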
The task for the second exercise was the same as for the first, but this time the program had to offer the option of computing PageRank scores at a user-defined point in time.
The solution and its explanation can be found in the folder wiki-spark/scala.
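
One way to think about the point-in-time option is as a filtering step: for each article, keep only the most recent revision made at or before the chosen time, and build the link graph from those revisions. The sketch below illustrates that idea in plain Python. It is not the Spark code in the repository, and the record layout (title, timestamp, outgoing links) and the function name `latest_revisions_before` are assumptions made for this example.

```python
from datetime import datetime, timezone


def latest_revisions_before(revisions, cutoff):
    """Build a link graph from the newest revision of each article
    that is not later than the cutoff.

    revisions: iterable of (title, timestamp, outlinks) tuples
    (hypothetical record layout).
    cutoff: datetime for the user-defined point in time.
    """
    latest = {}
    for title, timestamp, outlinks in revisions:
        # Ignore edits made after the requested point in time.
        if timestamp > cutoff:
            continue
        current = latest.get(title)
        # Keep this revision if it is the newest one seen so far.
        if current is None or timestamp > current[0]:
            latest[title] = (timestamp, set(outlinks))
    # Drop the timestamps and return title -> set of outgoing links.
    return {title: outlinks for title, (_, outlinks) in latest.items()}


def utc(year, month, day):
    """Small helper for building UTC timestamps in the example data."""
    return datetime(year, month, day, tzinfo=timezone.utc)


example_revisions = [
    ("A", utc(2005, 1, 1), ["B"]),
    ("A", utc(2007, 1, 1), ["C"]),
    ("B", utc(2006, 1, 1), ["A"]),
]
graph_2006 = latest_revisions_before(example_revisions, utc(2006, 6, 1))
```

Here `graph_2006` contains the 2005 revision of A, since the 2007 edit is later than the cutoff. The resulting dictionary is the kind of link graph a PageRank routine would take as input.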