pavan046/benchmark-events-tweets-dataset

Manually Assessed Twitter Dataset for Events

Events

  • United States Presidential Elections 2012
  • Hurricane Sandy

Files

The files in this repository are:

  • uselections.csv -- United States Presidential Elections 2012 -- 10084 Tweets
  • hurricansandy.csv -- Hurricane Sandy -- 4085 Tweets

The files contain ~12000 unique tweets with relevance labels, or ~15000 tweets when duplicates are counted. The tweets cover 50 hashtags (25 per event).

Obtaining the tweets

The tweets can be obtained from their tweet IDs using the Twitter Search API, or by modifying the twitter-corpus-tools (https://github.com/lintool/twitter-corpus-tools).

Format of each file

Each file is in CSV format, with one tweet per line:

<tweetid>,<relevance>

where <relevance> is either 'y' (relevant) or 'n' (irrelevant).
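
The format above can be read with Python's standard csv module. The following is a minimal sketch, not part of the original repository: the function names load_labels and count_labels are hypothetical, the sample rows are made up for illustration, and it assumes the files have no header row (any row whose second field is not 'y' or 'n' is skipped). Tweet IDs are kept as strings so that large IDs are never altered.

```python
import csv
import io
from collections import Counter


def load_labels(lines):
    """Parse <tweetid>,<relevance> rows into a dict of tweet ID -> bool.

    `lines` is any iterable of text lines, e.g. an open file object.
    True means relevant ('y'), False means irrelevant ('n').
    """
    labels = {}
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        tweet_id = row[0].strip()
        relevance = row[1].strip().lower()
        if relevance not in ("y", "n"):
            continue  # assumption: anything else is a header or noise
        # A repeated tweet ID overwrites the earlier entry, so the
        # resulting dict holds each tweet only once.
        labels[tweet_id] = relevance == "y"
    return labels


def count_labels(labels):
    """Count relevant and irrelevant tweets in a label dict."""
    return Counter("relevant" if v else "irrelevant" for v in labels.values())


# Made-up sample rows (not taken from the dataset), including one duplicate.
sample = io.StringIO("1001,y\n1002,n\n1001,y\n")
labels = load_labels(sample)
counts = count_labels(labels)
```

To load one of the real files, pass an open file instead of the sample, for example `with open("uselections.csv", newline="") as f: labels = load_labels(f)`.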

Hashtags for both events

US Elections

#benghazi
#bethe5percent
#ctvottatnoon
#defeatobama
#earlyvoting
#education
#election
#gallup
#gop
#gop2012
#harvard
#johnson2012
#makeyourvotecount
#obama
#obama2012
#ohio
#p2
#romney
#romneyryan2012
#sandy
#tcot
#teamsearle
#teaparty
#tlot
#vote

Hurricane Sandy

#atheist
#blackout
#cnn
#eastcoast
#fdny
#frankenstorm
#hurricane
#hurricanesandy
#latenight
#manhattan
#msm
#newyork
#noaudience
#ny
#nyc
#ohmygod
#romneystormtips
#singlegirlproblems
#staysafe
#storm
#superstorm
#toronto
#usa
#wvu
