- Hello World: a first notebook to check if everything is working.
- Python basics: the core of the language, with ample reference documentation.
- Input/Output: reading and writing files from and to disk.
- (optional) Pandas basics: Pandas is the go-to library for data analysis.
- (optional) Data wrangling: loading, cleaning and transforming data for analysis.
- Working with Tweets: working with data from Twitter.
- (optional) Web Scraping and APIs: scraping the Web and querying APIs for data.
- (optional) Working with Texts: the Natural Language Processing pipeline to work with texts.
We will use some datasets from this course.
You will need an editor to write code and edit files. If you don't have a preferred one yet, try Sublime Text or Atom.
- Option 0 (fallback): Use Binder (link above). Keep in mind that you need to download your notebooks to keep a local copy.
- Option 1 (recommended): Download the repository contents and use Spyder or the Jupyter Notebook. See the guide to set up your environment for more info.
- Option 2 (advanced): Work directly with git and conda. See the guide to set up your environment for more info.
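
Whichever option you pick, a quick check along the following lines can confirm that your environment is usable before you open the first notebook. This is only a minimal sketch: the `PACKAGES` list and the `check_environment` helper are hypothetical, guessed from the notebook topics above, and are not the course's actual requirements. Adjust the names to match the setup guide.

```python
import importlib.util
import sys

# Hypothetical package list, guessed from the notebook topics (not an official requirements list).
PACKAGES = ["pandas", "notebook"]


def check_environment(packages):
    """Return a dict mapping each top-level package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # Show the interpreter version, then report each package as found or missing.
    print("Python", sys.version.split()[0])
    for name, found in check_environment(PACKAGES).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If a package is reported as missing, installing it in the active conda environment and re-running the check is usually enough.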
- More on conda environments
- Conda cheatsheet
- Getting started with Jupyter notebooks
- On using git and GitHub for version control
A more detailed guide to set up your environment, with multiple options.
Some materials are re-purposed from:
- Applied Data Analysis, Oxford Digital Humanities Summer School.
- Text Mining, Amsterdam University College.
- Python Programming for the Humanities.