Building an end-to-end NLP pipeline for small teams to do user research with Twitter data
See the Wiki! This project is a part of the Data Science Working Group at Code for San Francisco. Other DSWG projects can be found at the main GitHub repo.
Please refer to this article for how these folders should work together.
The "/main" folder holds production code and has four subfolders:
- /data
- /code
- /pipeline
- /output
Use the "/sandbox" folder for experiments and exploratory work. The "/outreach" folder holds materials for presentations.
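As a rough illustration of the layout described above, the following sketch creates the folders with Python's standard library. The `scaffold` helper and its constants are hypothetical names used only for this example, and it assumes that "/sandbox" and "/outreach" sit at the repository root alongside "/main".

```python
from pathlib import Path

# Subfolders of /main, as listed in this README.
MAIN_SUBFOLDERS = ("data", "code", "pipeline", "output")
# Assumption: these folders live at the repository root, next to /main.
TOP_LEVEL_EXTRAS = ("sandbox", "outreach")


def scaffold(root):
    """Create the folder layout under `root` and return the list of paths."""
    root = Path(root)
    targets = [root / "main" / name for name in MAIN_SUBFOLDERS]
    targets += [root / name for name in TOP_LEVEL_EXTRAS]
    for path in targets:
        # exist_ok=True makes the helper safe to run on an existing checkout.
        path.mkdir(parents=True, exist_ok=True)
    return targets
```

Calling `scaffold(".")` from the repository root would create any missing folders and leave existing ones untouched.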
- Python
- spaCy
- scikit-learn
- gensim
Name | Slack Handle |
---|---|
Daniel Zou | @daniel.zou |
Josh Freivogel | @Josh Freivogel |
Nathan Chau | @Nathan Chau |
- If you haven't joined the SF Brigade Slack, you can do that here.
- Our Slack channel is #nltweets.
- Feel free to contact team leads with any questions or if you are interested in contributing!