Skip to content

Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison.

Notifications You must be signed in to change notification settings

cbuntain/collabortweet

Repository files navigation

CollaborTweet

This code is a framework for assessing text documents using a collaborative, online framework. It's written in Node.js and makes heavy use of third-party packages, like Express, Sqlite, etc.

This framework was initially developed to impose an ordering over a textual data set using pairwise comparisons (not unlike SentimentIt). The original task was to evaluate emotional intensity of Twitter data by providing users with a pair of tweets and asking them to assess which one exhibits a more negative feeling. This framework can support any such task using textual data and is not restricted to Twitter data, though I suggest short-form text rather than long documents.

In addition, the system has been updated to support labeling text as well. If you want to do a simple relevance classification task on text, this platform will work for you too.

The framework is also designed to support multiple users and multiple tasks, but these settings are only vaguely implemented. The system currently supports a single task, and users are specified by a file (users.js) in the users directory. One or both of these features should be implemented soon.

Setup

You should be able to use npm install to set up the server and download any dependencies. Then, npm start will run the server, which you can access (currently) at (http://localhost:3000).

Before starting the server, you need to set up the database. This can be done using the createSchema.sql file (e.g., sqlite3 database.sqlite3 < createSchema.sql). Then, use the tweets2sql.py python script to populate the database.

Creating Tasks

The tweets2sql.py script expects three arguments: a JSON file describing the task, the path to the SQLite file you created before, and the path the JSON file containing tweets (either in Twitter's format or Gnip's activity format).

The JSON task description file contains the name of the task, the question you want to ask the user, and the type of task (1 for pairwise comparison, 2 for labeling tasks, 3 for range-based tasks). For pairwise comparisons, you don't need more information. For the labeling task, you also need to include a list of labels.

Examples are below:

Comparison Task Description JSON

{
	"name": "Election Content - Negativity Comparisons",
	"question": "Which of these tweets seems more emotionally negative?",
	"type": 1
}

Labeling Task Description JSON

{
	"name": "Election Content - Relevance Labels",
	"question": "Is this tweet relevant to the current election?",
	"type": 2,
    "labels": [
        {"Relevant": [
            "Pro-Government",
            {"Pro-Opposition": [
                "Pro Opp Candidate 1",
                "Pro Opp Candidate 2",
            ]}
        ]},
        "Not Relevant",
        "Not English",
        "Can't Decide"
    ]
}

Range-based question tasks

The platform also supports labeling of range-based questions, sets of questions with associated scale values as the possible label options The task configuration file looks like this for range-based questions:

{
	"name": "Nigeria 2014 - Emotional Ranges",
	"question": "Answer several questions about the emotional range of a social media message.",
	"questions": [
		{
			"question": "Rate the emotional negativity of this tweet",
			"scale": [
				"1 - Not At All",
				"2 - Slightly",
				"3 - Moderate",
				"4 - Very",
				"5 - Incredibly"
			]
		},
		{
			"question": "Rate the emotional positivity of this tweet",
			"scale": [
				"1 - Not At All",
				"2 - Slightly",
				"3 - Moderate",
				"4 - Very",
				"5 - Incredibly"
			]
		}
	],
	"type": 3
}

NOTE: The order of the values in each scale is preserved in the database.

Creating Users

Once your database is populated, you need to add users to the system. Users aren't for authentication so much as ensuring we don't show the same pair to the same user multiple times. Currently, the system uses the users table in the sqlite file, so add users there. Using sqlite, you can do it easily:

sqlite3 database.sqlite3 'INSERT INTO users (userId, screenname, password, fname, lname) VALUES (1, "cbuntain", "cb123", "Cody", "Buntain")'

Assigning User Tasks

Assign tasks to individual users, and restrict their ability to see certain tasks.

Run the bin/assignTask.py file with the following arguments:

--database ../DATABASE/PATH

--user SCREENNAME

--task_id TASKIDNUMBER
 
Example: python bin/assignTask.py --database database.sqlite3 --user screenname --task_id 1

About

Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison.

Resources

Stars

Watchers

Forks

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •