A project to analyze, visualize, and communicate City of Evanston Open Data (in response to call for submissions to the Love Data Week 2019 contest). Ideally, the visuals created here will tell a coherent story to be printed for the competition. Any potential prize money is to be used for food for NU data science community events.
Evanston passed its Climate Action and Resiliency Plan (CARP) last fall. There are many components to it, but the major initiatives are summarized below.
The working 'story' would be to create visualizations related to CARP, raising awareness and providing data-driven insight into its initiatives.
All skill levels and creative input are welcome! For example, if you have no coding experience but would like to work up something with Adobe Illustrator, that would be really helpful in finalizing visualizations. In the same vein, if you don't want to set up a Python environment, you may want to look at Charticulator, Data Illustrator, or another web app that lets you explore and build data visualizations from a browser. The Tableau Desktop app is also highly recommended, as it has a freemium public version and a free, fully licensed academic version.
There are many potentially gratuitous aspects to the repo structure (generated from the [cookiecutter data science template](https://drivendata.github.io/cookiecutter-data-science/)), but they've been left here for reference on how one might approach this for a longer-term, larger-team project (Sphinx docs, tox testing, etc.).
The goal would be to interact with this repo through git, but if you have a contribution and are unsure of how to commit it to the repo (or if you would like to be added as a collaborator on GitHub), just contact @monadnoc.
To give a clear workflow example, if you like working with Jupyter notebooks and the Python pandas library:
- Fork this repo to your GitHub account
- `git clone` the forked repo to your desktop (or download the .zip); optionally, `git remote add upstream ...` to keep your fork in sync with the master (see GitHub's documentation on syncing a fork)
- Place relevant Evanston open data (see below) in `data/raw/`
- Open your notebook in the `notebooks` folder
- Read that data into your notebook with `pandas.read_csv('../data/raw/example.csv')`, where `example.csv` is a placeholder name
- As you process data as a `DataFrame`, you could save intermediate results with its `.to_csv()` method, e.g. `df.to_csv('../data/interim/blah.csv')`, where `blah.csv` is a placeholder
- When you have a Jupyter notebook you'd like to save, push it to your forked repo with `git add *.ipynb`, `git commit -m '[some message]'`, and `git push origin [current branch]`, then go to GitHub and you should be able to create a Pull Request from that commit; otherwise, you could just click and drag the notebook into the `notebooks/` repo folder in a browser.
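A minimal sketch of that notebook workflow in one place, assuming a hypothetical `example.csv` has already been downloaded into `data/raw/`:

```python
import pandas as pd

# Read a raw open-data export (the file name here is a placeholder).
df = pd.read_csv('../data/raw/example.csv')

# ... clean, filter, or aggregate the DataFrame here ...

# Save the intermediate result so later notebooks can pick it up.
df.to_csv('../data/interim/example_cleaned.csv', index=False)
```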
How to handle data automatically is still up for debate, as most command-line solutions rely on AWS S3 buckets for dumping data into a repo after it's cloned. In general, it's considered bad form to host your actual data on GitHub (e.g., when it's above 50 MB). So the way to go at the moment is just adding processing and visualization scripts to the repo (as in the example workflow above) and manually downloading data from the Evanston open data portal for local development.
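If you do want to script that manual download step, here is a minimal sketch; the dataset URL below is a placeholder for whatever CSV export link the portal gives you for a particular dataset:

```python
import urllib.request
from pathlib import Path

# Placeholder URL: substitute the CSV export link for the dataset you want.
DATASET_URL = 'https://data.cityofevanston.org/path/to/export.csv'
OUT_PATH = Path('data/raw/example.csv')

# Make sure data/raw/ exists, then fetch the file into it.
OUT_PATH.parent.mkdir(parents=True, exist_ok=True)
urllib.request.urlretrieve(DATASET_URL, OUT_PATH)
print(f'Saved {OUT_PATH}')
```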
- Switch to 100% renewable electricity: the solar projects by month dataset could be mapped to identify hotspots for solar projects or correlations between such projects and wealth, race, etc. (with census data).
- Reduce VMT 35%: the city-owned electric vehicle charging station data could provide some time-series insight into projections for meeting that goal (is electric vehicle charging increasing? are there certain locations or times when it stagnates?); a rough sketch of that kind of check appears after this list. Or consider the Divvy usage statistics.
- Plant 1000 new trees by 2035: the tree map could help identify ideal sites or species for new trees.
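As one concrete sketch of the time-series idea above, assuming a hypothetical `charging_sessions.csv` export with a timestamp column named `start_time` (the actual column names in the city's dataset may differ):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical export of city-owned EV charging sessions; column names are assumptions.
sessions = pd.read_csv('../data/raw/charging_sessions.csv', parse_dates=['start_time'])

# Count sessions per month to see whether charging is trending upward.
monthly = sessions.set_index('start_time').resample('M').size()

monthly.plot(title='EV charging sessions per month')
plt.ylabel('Sessions')
plt.tight_layout()
plt.savefig('../reports/figures/ev_charging_monthly.png')
```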
- Evanston energy-benchmarked buildings
- Evanston water bodies map
- Dempster beach weather
- Evanston greenhouse gas emissions (2005-2015)
- Evanston electrical energy supply
- City of Evanston total solid waste
- Evanston public works special pickups (large waste)
├── LICENSE
├── Makefile <- Makefile with commands like `make data` or `make train`
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
│
├── docs <- A default Sphinx project; see sphinx-doc.org for details
│
├── models <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
│ the creator's initials, and a short `-` delimited description, e.g.
│ `1.0-jqp-initial-data-exploration`.
│
├── references <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
│
├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
│ generated with `pip freeze > requirements.txt`
│
├── setup.py <- makes project pip installable (pip install -e .) so src can be imported
├── src <- Source code for use in this project.
│ ├── __init__.py <- Makes src a Python module
│ │
│ ├── data <- Scripts to download or generate data
│ │ └── make_dataset.py
│ │
│ ├── features <- Scripts to turn raw data into features for modeling
│ │ └── build_features.py
│ │
│ ├── models <- Scripts to train models and then use trained models to make
│ │ │ predictions
│ │ ├── predict_model.py
│ │ └── train_model.py
│ │
│ └── visualization <- Scripts to create exploratory and results oriented visualizations
│ └── visualize.py
│
└── tox.ini <- tox file with settings for running tox; see tox.testrun.org
Project based on the cookiecutter data science project template. #cookiecutterdatascience