Twabler

Bob Lee edited this page Apr 25, 2019 · 2 revisions

This project is part of the Data Science Working Group (DSWG) at Code for San Francisco. It was started to give all Code for America brigades a tool for labeling Twitter data for NLP, with a longer-term goal of generalizing to any data type. Other DSWG projects can be found at the main GitHub repo.

Overview

Usable Training Data Labeling Application for Everyone

For machine learning (ML) to be effective, models need accurately labeled examples against which their predictions can be compared. Producing labeled data is time-consuming and typically takes several people many hours. Twabler aims to give projects an application that simplifies uploading datasets, configuring the labeling job, onboarding labelers, and tracking progress. It also aims to give labelers an easy-to-use mobile client so they can label as many examples as possible whenever they have their smartphone and some free time.

Wait, Isn't There an App for That?

Yes, there are commercial applications that provide these resources, as well as some pretty good hacks built on productivity apps such as Google Sheets or Airtable. We want to provide a more usable open-source solution for organizations with no more resources than a computer and volunteers with cell phones.

Architecture

Currently, we are working on both iOS and Android client apps and a web app for the admin interface. We are still in the process of scoping features and designing the architecture.

Contact

  • If you haven't joined the SF Brigade Slack, you can do that here.
  • Our Slack channel is #twabler.
  • Feel free to contact @Josh Freivogel, @Bob (Bob Lee), or @Ariel Takvam on Slack with any questions or if you are interested in contributing!