Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #144 from BasisResearch/db-parcels
Convert cities dataset into Postgres database
Showing 88 changed files with 2,860 additions and 59 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
GOOGLE_CLOUD_PROJECT=cities-429602 | ||
GOOGLE_CLOUD_BUCKET=minneapolis-basis | ||
SCHEMA=minneapolis | ||
HOST=34.123.100.76 | ||
DATABASE=cities | ||
USERNAME=postgres |
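Reviewer sketch: one way these settings could be combined into a Postgres connection URI. The helper name `build_dsn` is hypothetical and not part of this commit; the sketch assumes the password is supplied separately (for example through `PGPASSWORD` or a `.pgpass` file) and that `SCHEMA` is meant to be used as the `search_path`.

```python
from urllib.parse import quote


def build_dsn(env):
    """Build a libpq-style connection URI from the env-file settings.

    `env` is any mapping with HOST, DATABASE, USERNAME and SCHEMA keys
    (hypothetical helper; the repository may wire these up differently).
    """
    user = quote(env["USERNAME"], safe="")
    database = quote(env["DATABASE"], safe="")
    # Setting search_path lets unqualified table names resolve to SCHEMA first.
    options = quote(f"-csearch_path={env['SCHEMA']}", safe="")
    return f"postgresql://{user}@{env['HOST']}/{database}?options={options}"


settings = {
    "HOST": "34.123.100.76",
    "DATABASE": "cities",
    "USERNAME": "postgres",
    "SCHEMA": "minneapolis",
}
print(build_dsn(settings))
```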
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,19 +3,52 @@ | |
</p> | ||
|
||
|
||
## Evaluating Policy Transfer via Similarity Analysis and Causal Inference | ||
# Evaluating Policy Transfer via Similarity Analysis and Causal Inference | ||
|
||
|
||
## Getting started | ||
|
||
|
||
Welcome to the repository for [polis](http://polis.basis.ai/), developed by [Basis Research Institute](https://www.basis.ai/) for [The Opportunity Project (TOP)](https://opportunity.census.gov/) 2023 in collaboration with the U.S. Department of Commerce. The primary goal of this project is to enhance access to data for local policymakers, facilitating more informed decision-making. | ||
|
||
This is the backend repository for more advanced users. For a more pleasant frontend experience and more information, please use the [app](http://polis.basis.ai/). | ||
|
||
|
||
Installation | ||
------------ | ||
|
||
**Basic Setup:** | ||
|
||
```sh | ||
|
||
git clone git@github.com:BasisResearch/cities.git | ||
cd cities | ||
git checkout main | ||
pip install . | ||
``` | ||
python -m venv venv | ||
source venv/bin/activate | ||
pip install -r requirements.txt | ||
pip install -e . | ||
cd tests && python -m pytest | ||
|
||
The above will install the minimal version that's ported to [polis.basis.ai](http://polis.basis.ai) | ||
|
||
**Dev Setup:** | ||
|
||
To install dev dependencies, needed to run models, train models and run all the tests, run the following command: | ||
|
||
```sh | ||
pip install -e .[dev] | ||
``` | ||
|
||
See `setup.py` for details on which packages each installation option includes. | ||
|
||
Welcome to the repository for [polis](http://polis.basis.ai/), developed by the [Basis Research Institute](https://www.basis.ai/) for [The Opportunity Project (TOP)](https://opportunity.census.gov/) 2023 in collaboration with the U.S. Department of Commerce. The primary goal of this project is to enhance access to data for local policymakers, facilitating more informed decision-making. | ||
|
||
This is the backend repository for more advanced users. For a more pleasant frontend experience and more information, please use the [app](http://polis.basis.ai/). | ||
**Contributing:** | ||
|
||
Before submitting a pull request, please autoformat your code and make sure the unit tests pass locally. | ||
|
||
```sh | ||
make lint # linting | ||
make format # runs black and isort, including on notebooks in the docs/ folder | ||
make tests # linting, unit and notebook tests | ||
``` | ||
|
||
|
||
### The repository is structured as follows: | ||
|
@@ -36,11 +69,24 @@ This is the backend repository for more advanced users. For a more pleasant fron | |
└── tests | ||
``` | ||
|
||
**WARNING:** During beta testing, the most recent version lives on the `staging-county-data` git branch, and so do the most recent versions of the notebooks. Please switch to this branch before inspecting the notebooks. | ||
|
||
If you're interested in downloading the data or exploring advanced features beyond the frontend, check out the `guides` folder in the `docs` directory. There, you'll find: | ||
- `data_sources.ipynb` for information on data sources, | ||
- `similarity-conceptual.ipynb` for a conceptual account of how similarity comparison works, | ||
- `counterfactual-explained.ipynb` for a rough explanation of how our causal model works, | ||
- `similarity_demo.ipynb` for a demonstration of the `DataGrabber` class for easy data access and of our `FipsQuery` class, the key tool in the similarity-focused part of the project, | ||
- `causal_insights_demo.ipynb` for an overview of how the `CausalInsight` class can be used to explore the influence of a range of intervention variables using causal inference tools. [WIP] | ||
|
||
Feel free to dive into these resources to gain deeper insights into the capabilities of the Polis project, or to reach out if you have any comments or suggestions. | ||
## Interested? We'd love to hear from you. | ||
|
||
[polis](http://polis.basis.ai/) is a research tool under very active development, and we are eager to hear feedback from users in the policymaking and public administration spaces to accelerate its benefit. | ||
|
||
If you have feature requests, recommendations for new data sources, tips for how to resolve missing data issues, find bugs in the tool (they certainly exist!), or anything else, please do not hesitate to contact us at [email protected]. | ||
|
||
To stay up to date on our latest features, you can subscribe to our [mailing list](https://dashboard.mailerlite.com/forms/102625/110535550672308121/share). In the near term, we will send out a notice about our upcoming batch of improvements (including performance speedups, support for mobile, and more comprehensive tutorials), as well as an interest form for users who would like to work closely with us on case studies to make the tool most useful in their work. | ||
|
||
Lastly, we emphasize that this website is still in beta testing, and hence all predictions should be taken with a grain of salt. | ||
|
||
Acknowledgments: polis was built by Basis, a non-profit AI research organization dedicated to creating automated reasoning technology that helps solve society's most intractable problems. To learn more about us, visit https://basis.ai. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
target/ | ||
dbt_packages/ | ||
logs/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
Welcome to your new dbt project! | ||
|
||
### Using the starter project | ||
|
||
Try running the following commands: | ||
- dbt run | ||
- dbt test | ||
|
||
|
||
### Resources: | ||
- Learn more about dbt [in the docs](https://docs.getdbt.com/docs/introduction) | ||
- Check out [Discourse](https://discourse.getdbt.com/) for commonly asked questions and answers | ||
- Join the [chat](https://community.getdbt.com/) on Slack for live discussions and support | ||
- Find [dbt events](https://events.getdbt.com) near you | ||
- Check out [the blog](https://blog.getdbt.com/) for the latest news on dbt's development and best practices |
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
|
||
# Name your project! Project names should contain only lowercase characters | ||
# and underscores. A good package name should reflect your organization's | ||
# name or the intended use of these models | ||
name: 'cities' | ||
version: '1.0.0' | ||
|
||
# This setting configures which "profile" dbt uses for this project. | ||
profile: 'cities' | ||
|
||
# These configurations specify where dbt should look for different types of files. | ||
# The `model-paths` config, for example, states that models in this project can be | ||
# found in the "models/" directory. You probably won't need to change these! | ||
model-paths: ["models"] | ||
analysis-paths: ["analyses"] | ||
test-paths: ["tests"] | ||
seed-paths: ["seeds"] | ||
macro-paths: ["macros"] | ||
snapshot-paths: ["snapshots"] | ||
|
||
clean-targets: # directories to be removed by `dbt clean` | ||
- "target" | ||
- "dbt_packages" | ||
|
||
|
||
vars: | ||
srid: 26915 # use UTM zone 15N for all geometric data. note, this must have meters as the unit of measure | ||
# years for which we have census tract/block group data | ||
census_years: [2010, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023] |
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{% macro median(attr) %} | ||
(percentile_cont(0.5) within group (order by {{ attr }})) | ||
{% endmacro %} |
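Reviewer sketch: the `median` macro relies on Postgres's `percentile_cont(0.5)`, which interpolates linearly between the two middle rows when the row count is even. The Python below is only an illustration of that behaviour under the assumption that NULLs are skipped, as they are for Postgres aggregates; the function name is chosen to mirror the SQL and is not code from this commit.

```python
import math


def percentile_cont(values, fraction):
    """Linear-interpolation percentile, mirroring Postgres percentile_cont."""
    ordered = sorted(v for v in values if v is not None)  # aggregates skip NULLs
    if not ordered:
        return None
    position = fraction * (len(ordered) - 1)  # zero-based fractional row index
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    # Interpolate between the two neighbouring rows.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


print(percentile_cont([1, 2, 3, 4], 0.5))  # midway between 2 and 3
print(percentile_cont([3, 1, 2], 0.5))     # the middle row
```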
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{% macro safe_divide(num, dem) %} | ||
(case when {{ dem }} = 0 then 0 else {{ num }} / {{ dem }} end) | ||
{% endmacro %} |
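Reviewer sketch: the expression generated by `safe_divide` can be exercised outside dbt. The snippet below substitutes column names into the same `case` expression with `str.format` (a stand-in for Jinja rendering) and runs it in an in-memory SQLite database from Python's standard library. SQLite is only an assumption for illustration; the project targets Postgres, where an unguarded division by zero raises an error instead of returning NULL.

```python
import sqlite3

# Same shape as the macro body, with str.format placeholders instead of Jinja.
SAFE_DIVIDE = "(case when {dem} = 0 then 0 else {num} / {dem} end)"


def safe_divide_sql(num, dem):
    """Return the SQL text the macro would produce for the given column names."""
    return SAFE_DIVIDE.format(num=num, dem=dem)


def evaluate(num, dem):
    """Evaluate the guarded division for a single row of data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (num real, dem real)")
        conn.execute("insert into t values (?, ?)", (num, dem))
        query = f"select {safe_divide_sql('num', 'dem')} from t"
        return conn.execute(query).fetchone()[0]
    finally:
        conn.close()


print(evaluate(6, 3))     # ordinary division
print(evaluate(5, 0))     # zero denominator falls back to 0
print(evaluate(1, None))  # a NULL denominator is not caught by the guard
```

Note that the guard only covers a denominator of exactly zero; a NULL denominator still propagates NULL.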
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
{% macro standardize_cont(columns) %} | ||
{% for c in columns %} | ||
{{ c }} as {{ c }}_original, (({{ c }} - (avg({{ c }}) over ())) / (stddev_samp({{ c }}) over ()))::double precision as {{ c }} | ||
{% if not loop.last %},{% endif %} | ||
{% endfor %} | ||
{% endmacro %} | ||
|
||
{% macro standardize_cat(columns) %} | ||
{% for c in columns %} | ||
{{ c }} as {{ c }}_original, (dense_rank() over (order by {{ c }})) - 1 as {{ c }} | ||
{% if not loop.last %},{% endif %} | ||
{% endfor %} | ||
{% endmacro %} |
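Reviewer sketch: what the two standardization macros compute, written out in plain Python. `standardize_cont` is a z-score using the sample standard deviation (as `stddev_samp` does), and `standardize_cat` maps each distinct value to a zero-based code in sort order (as `dense_rank() - 1` does). This is an illustration only; it ignores NULL handling and the `_original` columns the macros also emit.

```python
import statistics


def standardize_cont(values):
    """Z-score each value using the sample standard deviation."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)  # n - 1 denominator, like stddev_samp
    return [(v - mean) / spread for v in values]


def standardize_cat(values):
    """Replace each category with its zero-based dense rank in sort order."""
    codes = {value: code for code, value in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


print(standardize_cont([1.0, 2.0, 3.0]))
print(standardize_cat(["b", "a", "b", "c"]))
```

A column with a single repeated value has zero spread, so the division would fail here; it may be worth checking how the SQL version should behave in that case.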
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
-- Tag regions with their containing/most intersecting/closest parent regions. | ||
-- child_table: table with the child regions | ||
-- parent_table: table with the parent regions | ||
-- max_distance: maximum distance to consider a region as a parent (meters) | ||
{% macro tag_regions(child_table, parent_table, max_distance=100) %} | ||
( | ||
-- the not materialized keyword allows us to use indexes on the child and parent | ||
-- tables | ||
with child as not materialized ( | ||
select * from {{child_table}} | ||
) | ||
, parent as not materialized ( | ||
select * from {{parent_table}} | ||
) | ||
, within as ( | ||
select child.id as child_id | ||
, parent.id as parent_id | ||
, child.valid * parent.valid as valid | ||
from | ||
child | ||
inner join parent | ||
on ST_Within (child.geom, parent.geom) | ||
and child.valid && parent.valid | ||
) | ||
, not_within as ( | ||
select * from child | ||
where not exists (select child_id from within where child_id = id) | ||
) | ||
, largest_overlap as ( | ||
select distinct on (child.id) | ||
child.id as child_id | ||
, parent.id as parent_id | ||
, child.valid * parent.valid as valid | ||
from | ||
not_within as child | ||
inner join parent | ||
on ST_Intersects (child.geom, parent.geom) | ||
and child.valid && parent.valid | ||
order by | ||
child_id, | ||
ST_Area (ST_Intersection (child.geom, parent.geom)) desc | ||
) | ||
, no_overlap as ( | ||
select * from not_within | ||
where not exists ( | ||
select child_id from largest_overlap where child_id = id | ||
) | ||
) | ||
, closest as ( | ||
select distinct on (child.id) | ||
child.id as child_id | ||
, parent.id as parent_id | ||
, child.valid * parent.valid as valid | ||
from | ||
no_overlap as child | ||
inner join parent | ||
on child.valid && parent.valid | ||
and ST_DWithin (child.geom, parent.geom, {{max_distance}}) | ||
order by | ||
child_id, | ||
ST_Distance (child.geom, parent.geom) | ||
) | ||
select *, 'within' as type_ from within | ||
union all | ||
select *, 'most_overlap' as type_ from largest_overlap | ||
union all | ||
select *, 'closest' as type_ from closest | ||
) | ||
{% endmacro %} |
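Reviewer sketch: the fallback order in `tag_regions` (contained, then largest overlap, then closest within `max_distance`) can be illustrated without PostGIS. The Python below makes two simplifying assumptions that are not in the macro: regions are axis-aligned boxes `(xmin, ymin, xmax, ymax)`, and the `valid` date ranges are ignored. The helper names are hypothetical.

```python
import math


def contains(parent, child):
    """True when the child box lies entirely inside the parent box."""
    return (parent[0] <= child[0] and parent[1] <= child[1]
            and child[2] <= parent[2] and child[3] <= parent[3])


def overlap_area(a, b):
    """Return the intersection area, or None when the boxes do not touch."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    if width < 0 or height < 0:
        return None
    return width * height


def distance(a, b):
    """Shortest distance between two boxes (0 when they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return math.hypot(dx, dy)


def tag_regions(children, parents, max_distance=100):
    """Return (child_id, parent_id, type_) rows, trying each rule in order."""
    rows = []
    for child_id, child in children.items():
        inside = [pid for pid, p in parents.items() if contains(p, child)]
        if inside:
            # Like the SQL, a child inside several parents yields several rows.
            rows.extend((child_id, pid, "within") for pid in inside)
            continue
        overlaps = [(overlap_area(child, p), pid) for pid, p in parents.items()]
        overlaps = [(area, pid) for area, pid in overlaps if area is not None]
        if overlaps:
            _, pid = max(overlaps, key=lambda pair: pair[0])
            rows.append((child_id, pid, "most_overlap"))
            continue
        near = [(distance(child, p), pid) for pid, p in parents.items()]
        near = [(d, pid) for d, pid in near if d <= max_distance]
        if near:
            _, pid = min(near, key=lambda pair: pair[0])
            rows.append((child_id, pid, "closest"))
    return rows


parents = {"P1": (0, 0, 10, 10), "P2": (10, 0, 20, 10)}
children = {
    "a": (1, 1, 2, 2),          # fully inside P1
    "b": (8, 1, 14, 2),         # straddles both, more area in P2
    "c": (25, 0, 26, 1),        # overlaps nothing, nearest to P2
    "d": (500, 500, 501, 501),  # farther than max_distance from every parent
}
for row in tag_regions(children, parents):
    print(row)
```

As in the macro, a child with no parent within `max_distance` produces no row at all, so downstream joins should expect missing children.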
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
{{ | ||
config( | ||
materialized='table', | ||
indexes = [ | ||
{'columns': ['census_block_group', 'year_', 'name_'], 'unique': true}, | ||
] | ||
) | ||
}} | ||
|
||
select | ||
year::smallint as year_, | ||
code as name_, | ||
statefp || countyfp || tractce || blkgrpce as census_block_group, | ||
case when "value" < 0 then null else "value" end as value_ | ||
from {{ source('minneapolis', 'acs_bg_raw') }} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
{{ | ||
config( | ||
materialized='table', | ||
indexes = [ | ||
{'columns': ['census_tract', 'year_', 'name_'], 'unique': true}, | ||
] | ||
) | ||
}} | ||
|
||
select | ||
year::smallint as year_, | ||
code as name_, | ||
statefp || countyfp || tractce as census_tract, | ||
case when "value" < 0 then null else "value" end as value_ | ||
from {{ source('minneapolis', 'acs_tract_raw') }} |
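Reviewer sketch: the two ACS staging models above do the same two things per row: concatenate the FIPS components into a single identifier and turn negative values into NULL. The Python below mirrors that logic for illustration. The component values are made-up examples, and the reading of negative values as missing-data placeholders is an assumption based on the `case` expression, not something stated in the commit.

```python
def tract_id(statefp, countyfp, tractce):
    """Concatenate FIPS components, like statefp || countyfp || tractce."""
    return statefp + countyfp + tractce


def block_group_id(statefp, countyfp, tractce, blkgrpce):
    """A block group id is the tract id plus the block group code."""
    return tract_id(statefp, countyfp, tractce) + blkgrpce


def clean_value(value):
    """Map negative values to None, leaving existing None untouched."""
    if value is None or value < 0:
        return None
    return value


# Illustrative components only; they are not taken from the dataset.
print(tract_id("27", "053", "000101"))
print(block_group_id("27", "053", "000101", "2"))
print(clean_value(-666666666), clean_value(1234))
```

Keeping the components as strings matters here, since the leading zeros would be lost if they were cast to integers before concatenation.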
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
{{ | ||
config( | ||
materialized='table', | ||
indexes = [ | ||
{'columns': ['year_']} | ||
] | ||
) | ||
}} | ||
|
||
with census_tracts as (select * from {{ ref('census_tracts_in_city_boundary') }}) | ||
select | ||
census_tract | ||
, year_ | ||
, st_transform(geom, 4269) as geom | ||
from | ||
census_tracts |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
{{ | ||
config( | ||
materialized='table', | ||
indexes = [ | ||
{'columns': ['description']} | ||
] | ||
) | ||
}} | ||
|
||
-- This is used by the web app. It has a row for each tract, demographic | ||
-- variable pair and a column for each year. | ||
with | ||
demographics as (select * from {{ ref('demographics') }}), | ||
census_tracts as (select * from {{ ref('census_tracts_in_city_boundary') }}), | ||
demographics_filtered as ( | ||
select demographics.* | ||
from demographics | ||
inner join census_tracts using (census_tract, year_) | ||
), | ||
final_ as ( | ||
select | ||
description, | ||
census_tract as tract_id, | ||
{{ dbt_utils.pivot('year_', | ||
dbt_utils.get_column_values(ref('demographics'), | ||
'year_', | ||
order_by='year_'), | ||
then_value='value_', | ||
else_value='null', | ||
agg='max') }} | ||
from demographics_filtered | ||
group by 1, 2 | ||
) | ||
select * from final_ |
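Reviewer sketch: the `dbt_utils.pivot` call turns the long `(description, tract, year, value)` rows into one row per description and tract with a column per year, taking `max` within each cell and leaving NULL where a year has no data. The Python below illustrates that reshaping with plain dictionaries; the function name and the sample description are hypothetical.

```python
def pivot_by_year(rows, years):
    """Pivot (description, tract_id, year, value) rows into one dict per pair.

    Each result maps every year in `years` to the maximum non-null value seen
    for that (description, tract_id) pair, or None when there was none.
    """
    wide = {}
    for description, tract_id, year, value in rows:
        record = wide.setdefault((description, tract_id), dict.fromkeys(years))
        if year in record and value is not None:
            current = record[year]
            record[year] = value if current is None else max(current, value)
    return wide


rows = [
    ("median_income", "27053000101", 2021, 50000),
    ("median_income", "27053000101", 2022, 52000),
    ("median_income", "27053000101", 2022, None),
]
print(pivot_by_year(rows, [2021, 2022, 2023]))
```

Because the year columns come from `get_column_values` at compile time, a year that appears in `demographics` only after the model was last compiled would not get a column until the next run.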