Students:
- refresh their memories on working with Jupyter and pandas
- get practice with pagination and concatenation of datasets
We'll be doing this lab as pair programming, with the TA floating around to help.
- Go over pair programming.
- Talk through the steps below.
- Set up groups that differ from the project teams.
Students can look back at the Computing in Context slides if needed.
- One person in each group: create a new notebook in Google Colab.
- Add everyone's name.
- Share it with your teammate(s).
- Download the NYPD Hate Crime data as a CSV.
- Upload the file to Colab.
- Load the data with pandas.
- Confirm how many records have been loaded.
- Compute an aggregate statistic (mean, median, sum, whatever).
- Create a visualization.
- Keep it simple.
- Switch to getting the data from the API.
- Check how many records the API is returning.
- Get the full dataset using pagination.
- Useful resources:
- Restriction for this lab: Page size shouldn't be set greater than 1000.
- Check how many results you get in total, and confirm that the count matches what's in the data portal.
- Do the aggregate statistic and visualization using the expanded dataset and note how they've changed.
- Submit via CourseWorks.
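
For the CSV steps above (load with pandas, confirm the record count, compute an aggregate, make a simple visualization), here is one possible sketch. It uses a tiny in-memory CSV so it runs anywhere. The column names `record_id`, `year`, and `category` and the helper `plot_counts` are made up for illustration, so swap in the real file and whichever columns you choose.

```python
import io

import pandas as pd

# Stand-in for the uploaded file so the sketch runs anywhere. In Colab,
# pass the uploaded CSV's filename to pd.read_csv instead.
# The columns (record_id, year, category) are made up for illustration.
sample_csv = io.StringIO(
    "record_id,year,category\n"
    "1,2021,A\n"
    "2,2021,B\n"
    "3,2022,A\n"
    "4,2022,A\n"
    "5,2022,B\n"
    "6,2023,A\n"
)

df = pd.read_csv(sample_csv)

# Confirm how many records were loaded.
record_count = len(df)

# Aggregate statistic: records per year, then the mean of those yearly counts.
per_year = df["year"].value_counts().sort_index()
mean_per_year = per_year.mean()

print(f"{record_count} records, {mean_per_year:.1f} per year on average")


def plot_counts(counts):
    """Draw a simple bar chart of a counts Series (needs matplotlib)."""
    return counts.plot(kind="bar", title="Records per year")
```

In a notebook cell, calling `plot_counts(per_year)` should render the chart inline. Colab comes with matplotlib preinstalled.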
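
For the pagination step, the loop below is a minimal sketch and not the only way to do it. It assumes the API is a Socrata-style JSON endpoint that accepts `$limit` and `$offset` query parameters, so check the API documentation to confirm. The function names `fetch_page` and `fetch_all` are hypothetical, and the endpoint URL is left for you to supply.

```python
import json
import urllib.parse
import urllib.request

PAGE_SIZE = 1000  # lab restriction: page size no greater than 1000


def fetch_page(base_url, limit, offset):
    """Fetch one page of records as a list of dicts.

    Assumes a JSON endpoint that understands $limit and $offset.
    """
    query = urllib.parse.urlencode({"$limit": limit, "$offset": offset}, safe="$")
    with urllib.request.urlopen(f"{base_url}?{query}") as response:
        return json.load(response)


def fetch_all(get_page, page_size=PAGE_SIZE):
    """Call get_page(limit, offset) until a page comes back short.

    A page with fewer than page_size records (including an empty one)
    means there is nothing left to request.
    """
    records = []
    offset = 0
    while True:
        page = get_page(page_size, offset)
        records.extend(page)
        if len(page) < page_size:
            break
        offset += page_size
    return records
```

With the endpoint URL stored in a variable such as `api_url`, `rows = fetch_all(lambda limit, offset: fetch_page(api_url, limit, offset))` collects every page, and `pd.DataFrame(rows)` turns the result into a DataFrame. An equivalent approach is to build one DataFrame per page and combine them with `pd.concat(frames, ignore_index=True)`. Either way, `len(rows)` is the number to compare against the data portal.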