CSE587-Data-Intensive-Computing

This repository contains all the assignments I completed for the Data Intensive Computing course (CSE 587) in Spring 2019.

LAB - 1 - Exploratory Data Analytics in R - The goal was to extract, process, and visualize data in R using the Twitter API.

LAB - 2 - Large-Scale Data Processing with Hadoop MapReduce (Python) - Word Count, Word Co-occurrence

• Performed word count and word co-occurrence analysis on data extracted from the Twitter API, the NYTimes API, and Common Crawl.

• Used MapReduce for this computation and Tableau to visualize the results.
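
As a rough illustration of the two computations named above, here is a minimal local sketch of the map and reduce steps in Python. It is not the code from the Lab 2 folder: the function names (`map_word_count`, `map_cooccurrence`, `reduce_sum`, `run_job`), the tokenizer, and the choice to treat each input line as the co-occurrence window are all assumptions made for the example, and the shuffle phase that Hadoop performs is simulated here with an in-memory sort.

```python
import re
from itertools import groupby

# Hypothetical tokenizer: lowercase alphabetic tokens only.
TOKEN = re.compile(r"[a-z']+")


def tokenize(line):
    return TOKEN.findall(line.lower())


def map_word_count(line):
    """Mapper: emit (word, 1) for every token in the line."""
    for word in tokenize(line):
        yield word, 1


def map_cooccurrence(line):
    """Mapper: emit ((left, right), 1) for every ordered pair of
    tokens at different positions in the same line (assumed window)."""
    words = tokenize(line)
    for i, left in enumerate(words):
        for j, right in enumerate(words):
            if i != j:
                yield (left, right), 1


def reduce_sum(pairs):
    """Simulated shuffle and reduce: sort by key, then sum counts per key."""
    def by_key(kv):
        return kv[0]

    for key, group in groupby(sorted(pairs, key=by_key), key=by_key):
        yield key, sum(count for _, count in group)


def run_job(mapper, lines):
    """Run one mapper over all lines, then reduce the emitted pairs."""
    emitted = [kv for line in lines for kv in mapper(line)]
    return dict(reduce_sum(emitted))


if __name__ == "__main__":
    sample = ["big data", "data lake"]
    print(run_job(map_word_count, sample))
    print(run_job(map_cooccurrence, sample))
```

On a real cluster the mapper and reducer would typically be separate scripts that read from standard input and write tab-separated key/value lines, with Hadoop handling the sort between them; the sketch keeps everything in one process so the logic can be checked quickly.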

LAB - 3 - Big Data Analytics using Apache Spark (Python)
