Twitter Dataset Analysis and Modeling
The objective of this project is to analyze and compare access performance on a large Twitter dataset. The dataset consists of three collections:
- Profile
- Networks
- Tweets
The goal is to define at least two different MongoDB data models into which the Twitter data will be ingested. Both models will then be evaluated for performance on analysis, update, and query operations.
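As an illustration of what two candidate data models might look like, the sketch below shapes the same input into a referenced layout (three separate collections linked by user id) and an embedded layout (one document per user holding that user's followees and tweets). It is only a sketch under stated assumptions: the field names (`user_id`, `name`, `tweet_id`, `text`, `follower_id`, `followee_id`), the sample records, and the helper names are hypothetical and are not taken from the dataset or from the project's actual models. The code builds plain Python dictionaries and does not connect to MongoDB.

```python
import time
from collections import defaultdict


def build_referenced_model(profiles, edges, tweets):
    """Model A (assumed): three collections linked by user id."""
    return {
        "profiles": [{"_id": p["user_id"], "name": p["name"]} for p in profiles],
        "networks": [{"follower_id": f, "followee_id": t} for f, t in edges],
        "tweets": [
            {"_id": t["tweet_id"], "user_id": t["user_id"], "text": t["text"]}
            for t in tweets
        ],
    }


def build_embedded_model(profiles, edges, tweets):
    """Model B (assumed): one document per user with embedded data."""
    following = defaultdict(list)
    for follower, followee in edges:
        following[follower].append(followee)

    tweets_by_user = defaultdict(list)
    for t in tweets:
        tweets_by_user[t["user_id"]].append(
            {"tweet_id": t["tweet_id"], "text": t["text"]}
        )

    return [
        {
            "_id": p["user_id"],
            "name": p["name"],
            "following": following.get(p["user_id"], []),
            "tweets": tweets_by_user.get(p["user_id"], []),
        }
        for p in profiles
    ]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical sample records, for illustration only.
profiles = [{"user_id": 1, "name": "alice"}, {"user_id": 2, "name": "bob"}]
edges = [(1, 2)]  # user 1 follows user 2
tweets = [
    {"tweet_id": 10, "user_id": 1, "text": "hello"},
    {"tweet_id": 11, "user_id": 1, "text": "again"},
]

referenced, t_ref = timed(build_referenced_model, profiles, edges, tweets)
embedded, t_emb = timed(build_embedded_model, profiles, edges, tweets)
```

The resulting documents could then be loaded into MongoDB and the same analysis, update, and query workloads timed against each layout. One trade-off worth measuring: the embedded layout can answer "all tweets by a user" from a single document, but very active users could push a document toward MongoDB's 16 MB document size limit, which the referenced layout avoids.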
References:
[1] https://wiki.cites.illinois.edu/wiki/display/forward/Dataset-UDI-TwitterCrawl-Aug2012
[2] MongoDB Reference. http://docs.MongoDB.org/manual/reference
[3] Rui Li, Shengjie Wang, Hongbo Deng, Rui Wang, Kevin Chen-Chuan Chang: Towards social user profiling: unified and discriminative influence model for inferring home locations. KDD 2012: 1023-1031