Evanston_Viz_2019

A project to analyze, visualize, and communicate City of Evanston Open Data (in response to a call for submissions to the Love Data Week 2019 contest). Ideally, the visuals created here will tell a coherent story to be printed for the competition. Any potential prize money is to be used for food for NU data science community events.

Story direction (subject to revision)

Evanston passed its Climate Action and Resiliency Plan (CARP) last fall. There are many components to it, but the major initiatives are summarized in the graphic below.

[CARP initiatives summary graphic]

The working 'story' would be to create visualizations related to CARP, raising awareness and providing data-driven insight into its initiatives.

Getting Started

All skill levels and creative input are welcome! For example, if you have no coding experience but would like to work up something in Adobe Illustrator, that would be really helpful for finalizing visualizations. In the same vein, if you don't want to set up a Python environment, you might look at Charticulator, Data Illustrator, or another web app that lets you explore and build data visualizations from a browser. The Tableau Desktop app is also highly recommended: it has a freemium public version and a free, fully licensed academic version.

There are many potentially gratuitous aspects to the repo structure (generated from the cookiecutter data science template), but they've been left here for reference on how one might approach a longer-term, larger-team project (Sphinx docs, tox testing, etc.).

The goal is to interact with this repo through git, but if you have a contribution and are unsure how to commit it (or would like to be added as a collaborator on GitHub), just contact @monadnoc.
To give a concrete workflow example, if you like working with Jupyter notebooks and the Python pandas library:

  1. Fork this repo to your GitHub account.
  2. `git clone` the forked repo to your desktop (or download the .zip); optional: `git remote add upstream ...` to keep your fork in sync with the master branch (see this and this).
  3. Place relevant Evanston open data (see below) in `data/raw/`.
  4. Open your notebook in the `notebooks/` folder.
  5. Read the data into your notebook with `pandas.read_csv('../data/raw/example.csv')`, where `example.csv` is a placeholder name.
  6. As you process data as a DataFrame, you can save intermediate results with `DataFrame.to_csv('../data/interim/blah.csv')`, where `blah.csv` is a placeholder.
  7. When you have a Jupyter notebook you'd like to save, push it to your forked repo with `git add *.ipynb`, `git commit -m '[some message]'`, and `git push origin [current branch]`, then go to GitHub and create a Pull Request from that commit. Otherwise, you can just click and drag the notebook into the `notebooks/` folder in a browser.
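Steps 5 and 6 above can be sketched as a small helper. This is only a sketch: the file name and the cleaning operations are placeholders, not tied to any actual Evanston dataset.

```python
from pathlib import Path

import pandas as pd


def clean_raw_csv(name: str, data_dir: str = "data") -> Path:
    """Read data/raw/<name>, tidy it, and write the result to data/interim/<name>."""
    raw_path = Path(data_dir) / "raw" / name
    interim_dir = Path(data_dir) / "interim"
    interim_dir.mkdir(parents=True, exist_ok=True)

    df = pd.read_csv(raw_path)
    df = df.dropna(how="all")  # drop fully empty rows
    # normalize headers like "Project Name" -> "project_name"
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]

    out_path = interim_dir / name
    df.to_csv(out_path, index=False)
    return out_path
```

Calling `clean_raw_csv('example.csv')` from the repo root would then leave a tidied copy in `data/interim/` for the next notebook to pick up.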

How to handle data automatically is still up for debate, as most command-line solutions rely on AWS S3 buckets for dumping data into a repo after it's cloned. In general, it's considered bad form to host your actual data on GitHub (e.g., when it's above 50 MB). So the way to go at the moment is to add processing and visualization scripts to the repo (as in the example workflow above) and manually download data from the Evanston open data portal for local development.
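If you do want to script the manual download step, one minimal sketch is to fetch a file into `data/raw/` only when it isn't already there, which also keeps the raw dump immutable. The URL below is a placeholder; substitute the real CSV export link from the open data portal.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch_dataset(url: str, filename: str, raw_dir: str = "data/raw") -> Path:
    """Download a CSV export into data/raw/ unless it is already cached there."""
    dest_dir = Path(raw_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / filename
    if not dest.exists():  # never overwrite the raw, immutable dump
        urlretrieve(url, dest)
    return dest


# Example with a placeholder URL:
# fetch_dataset("https://example.org/some-export.csv", "solar_projects.csv")
```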

Example analyses toward visualizations

Switch to 100% renewable electricity: the solar projects by month dataset could be mapped to identify hotspots for solar projects or correlations between such projects and wealth, race, etc. (with census data).

Reduce VMT 35%: the city-owned electric vehicle charging station data could provide some time-series insight into projections for meeting that goal (is electric vehicle charging increasing? are there certain locations or times when it stagnates?)... or the Divvy usage statistics could.

Plant 1000 new trees by 2035: the tree map could help identify ideal sites or species for new trees.
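As a sketch of the time-series angle above: with hypothetical charging-session data, monthly session counts fall out of a short pandas aggregation. The `start_time` column name here is invented for illustration, not the dataset's actual schema.

```python
import pandas as pd


def monthly_session_counts(df: pd.DataFrame, time_col: str = "start_time") -> pd.Series:
    """Count rows (e.g. charging sessions) per calendar month."""
    timestamps = pd.to_datetime(df[time_col])
    # Bucket each timestamp into its month, then count per bucket.
    return timestamps.dt.to_period("M").value_counts().sort_index()
```

Plotting the resulting series (or fitting a trend line to it) would show whether charging activity is growing toward the CARP goal.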

Additional relevant datasets

Evanston energy-benchmarked buildings
Evanston water bodies map
Dempster beach weather
Evanston greenhouse gas emissions (2005-2015)
Evanston electrical energy supply
City of Evanston total solid waste
Evanston public works special pickups (large waste)

Project Organization

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.testrun.org

Project based on the cookiecutter data science project template. #cookiecutterdatascience
