This repository contains all the assignments that I completed during the Data Intensive Computing course in Spring 2019.
LAB - 1 - Exploratory Data Analytics in R - The goal was to extract, process, and visualize data in R using the Twitter API.
LAB - 2 - Large-Scale Data Processing with Hadoop MapReduce (Python) - Word Count, Word Co-Occurrence
• Performed word count and word co-occurrence on data extracted from Twitter, the NYTimes API, and Common Crawl.
• Used MapReduce for this computation and Tableau to visualize the results.
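The two computations above can be sketched in plain Python as a map step, a sort (shuffle) step, and a reduce step. This is only an illustrative sketch, not the code from the lab: the function names (`tokenize`, `map_word_count`, `map_cooccurrence`, `reduce_sum`, `run_job`) are hypothetical, the job runs in memory instead of on Hadoop, and co-occurrence is assumed to mean "two different words appearing in the same line".

```python
import re
from itertools import groupby

# Assumption: a token is a run of lowercase letters or apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(line):
    """Lowercase a line and split it into word tokens."""
    return TOKEN_RE.findall(line.lower())


def map_word_count(line):
    """Map step for word count: emit (word, 1) for every token."""
    for word in tokenize(line):
        yield word, 1


def map_cooccurrence(line):
    """Map step for co-occurrence: emit ((w1, w2), 1) for every pair of
    positions in the line that hold two different words. The pair is
    sorted so (a, b) and (b, a) share one key."""
    words = tokenize(line)
    for i, first in enumerate(words):
        for second in words[i + 1:]:
            if first != second:
                yield tuple(sorted((first, second))), 1


def reduce_sum(key, values):
    """Reduce step: add up all the counts emitted for one key."""
    return key, sum(values)


def run_job(lines, mapper, reducer):
    """Run map, sort by key (standing in for the shuffle), then reduce."""
    pairs = sorted(kv for line in lines for kv in mapper(line))
    return dict(
        reducer(key, [value for _, value in group])
        for key, group in groupby(pairs, key=lambda kv: kv[0])
    )


if __name__ == "__main__":
    sample = ["big data big", "data tools"]
    print(run_job(sample, map_word_count, reduce_sum))
    print(run_job(sample, map_cooccurrence, reduce_sum))
```

With Hadoop Streaming, the map and reduce functions would typically live in separate scripts that read lines from standard input and write tab-separated key/value pairs to standard output; the in-memory `run_job` here only stands in for that pipeline.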
LAB - 3 - Big Data Analytics using Apache Spark (Python)