How-to: Count total number of housing units in Cook County with public data #562

Open
dfsnow opened this issue Jul 31, 2024 · 11 comments

@dfsnow
Member

dfsnow commented Jul 31, 2024

Researchers often want to know the total number of housing units in Cook County in a specific tax year, according to the Assessor's Office.

Right now, that question is at least partially answerable, but the data are spread across multiple sources. We need to account for the number of cards on a residential PIN, livable condo spaces, and large multi-unit properties.

The goal of this project is to get a count of the number of housing units in Cook County in tax year 2023 using open data, provide advice on making this data more accessible to the public, and perhaps create a contribution to our reporting database that institutionalizes these housing counts.

Data sources that are publicly available (happy to discuss more)

  • Single- and Multi-Unit Characteristics data is unique to each Tax Year, PIN, and card_num (the number of cards). A card is a building, because each parcel of land can have multiple buildings on it. Each card that is a residential property class should be counted as a housing unit.
  • Condo characteristics data is unique to each Tax Year and PIN, for PINs that are class 299. However, not all class 299s are livable. If is_parking_space or is_common_area is true, then that PIN isn't a housing unit.
  • Commercial valuation data is produced by our Commercial Valuations team. Note that, at the time of writing, this dataset contains data from each tri: City Tri in 2021, North Tri in 2022, and South Tri in 2023. It also contains many property types, from gas stations to hotels to multi-family. Let's start by only looking at properties where modelgroup or the property_type/use indicates some kind of multifamily property. Then, for those properties, we can count the tot_units.

Note that these are on Open Data. The portal allows you to filter and count.

Other things to look at

  • Are there other data sources we can compare the final totals against, to check whether the numbers are in the same ballpark?

Suggested outputs

  • First, any small changes you would suggest to the above 3 datasets to make the counting easier
  • Then, a document that explains how to use Open Data to count these totals.
@dfsnow added the "new data/feature" label (Create or edit a column/feature or collect new data) on Jul 31, 2024
@ccao-jardine
Member

Added details:
Single- and Multi-Unit Characteristics: the num_apartments column is useful. If it's "None" or NULL, let's assume that this property has 1 living unit (because it's probably a condo or single-family home, without any apartments). For properties with multiple apartments, let's count num_apartments living units instead of 1. Filter to tax year = 2023.
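
A minimal Python sketch of the rule above, assuming the rows have been exported from Open Data and loaded as dictionaries. The `year` key, the helper names, and the word-to-number mapping are illustrative assumptions, not confirmed details of the dataset; check how num_apartments is actually encoded before relying on it.

```python
# Assumption: num_apartments may be spelled out ("Two") or numeric ("2").
WORD_TO_COUNT = {"two": 2, "three": 3, "four": 4, "five": 5, "six": 6}


def units_for_card(num_apartments):
    """Return the living units for one card.

    "None", NULL, or blank means a single living unit.
    """
    if num_apartments is None:
        return 1
    text = str(num_apartments).strip().lower()
    if text in ("", "none", "null"):
        return 1
    if text in WORD_TO_COUNT:
        return WORD_TO_COUNT[text]
    # Raises ValueError on unexpected values so they get noticed.
    return int(float(text))


def count_res_units(rows, tax_year=2023):
    """Sum living units over all cards in the given tax year."""
    return sum(
        units_for_card(row.get("num_apartments"))
        for row in rows
        if row["year"] == tax_year
    )


# Hypothetical sample rows, not real parcels.
sample = [
    {"year": 2023, "num_apartments": None},
    {"year": 2023, "num_apartments": "None"},
    {"year": 2023, "num_apartments": "Three"},
    {"year": 2023, "num_apartments": 2},
    {"year": 2022, "num_apartments": "Two"},  # dropped by the year filter
]
print(count_res_units(sample))
```

Raising on unrecognized values, rather than silently counting them as 1, makes unexpected encodings visible during exploration.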

Condos: each row is one living unit, unless it's parking or nonlivable space. Filter to tax year = 2023.
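
As a rough sketch of the condo rule, assuming exported rows are dictionaries with a `year` key (an assumption) and the is_parking_space and is_common_area flags mentioned earlier in this issue. How the flags are encoded in an export (booleans vs. strings) is also an assumption, so the helper accepts both.

```python
def is_true(value):
    """Treat True, "true", "1", and "yes" as true; everything else as false."""
    return str(value).strip().lower() in ("true", "1", "yes")


def count_condo_units(rows, tax_year=2023):
    """Count condo PINs that are livable units in the given tax year."""
    return sum(
        1
        for row in rows
        if row["year"] == tax_year
        and not is_true(row.get("is_parking_space"))
        and not is_true(row.get("is_common_area"))
    )


# Hypothetical sample rows, not real PINs.
condos = [
    {"year": 2023, "is_parking_space": False, "is_common_area": False},
    {"year": 2023, "is_parking_space": "true", "is_common_area": "false"},
    {"year": 2023, "is_parking_space": None, "is_common_area": True},
    {"year": 2023},  # missing flags are treated as livable here
    {"year": 2022, "is_parking_space": False, "is_common_area": False},
]
print(count_condo_units(condos))
```

Treating a missing flag as "livable" is a judgment call; it may be worth counting such rows separately during QC.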

For commercial data: @wrridgeway might have a helpful data dictionary, but for now, use the sum of apt (instead of tot_units). For this one, note that you do not want to filter to tax year = 2023, because each tri's data falls in a different year.
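
A small sketch of the commercial step, assuming exported rows are dictionaries keyed by column name. It sums apt for rows whose modelgroup is "Multifamily" and deliberately applies no year filter. The `to_number` helper and its handling of blanks and thousands separators are assumptions about how an export might look.

```python
def to_number(value):
    """Convert a cell to a float, treating None and blanks as 0."""
    if value is None:
        return 0.0
    text = str(value).strip().replace(",", "")
    return float(text) if text else 0.0


def count_multifamily_units(rows, unit_column="apt", modelgroup="Multifamily"):
    """Sum a unit column over one modelgroup, across all years."""
    return sum(
        to_number(row.get(unit_column))
        for row in rows
        if row.get("modelgroup") == modelgroup
    )


# Hypothetical sample rows spanning the three tri years.
commercial = [
    {"year": 2021, "modelgroup": "Multifamily", "apt": "12"},
    {"year": 2022, "modelgroup": "Multifamily", "apt": None},
    {"year": 2023, "modelgroup": "Multifamily", "apt": "1,200"},
    {"year": 2023, "modelgroup": "GasStation", "apt": "5"},
]
print(count_multifamily_units(commercial))
```

Passing `unit_column="tot_units"` gives the alternative total, which is useful for comparing the two columns.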

For the exploratory data, feel free to use Excel.

@yufeinancyliu

yufeinancyliu commented Aug 1, 2024

Here is the result I got, in the sheet 'final_output' of 'output.xlsx'; the plan and calculation process are in this google doc. You can click the links to open them. If anything is wrong, please tell me and I will revise it. Dan also suggested turning this into a short how-to document. Could you provide some examples of those? Thanks! @ccao-jardine

@ccao-jardine
Member

ccao-jardine commented Aug 6, 2024

Thanks -- this is a good start! Some overall feedback:

google doc:

This is a good start -- let's make it more accessible to a general reader, as if it were a how-to blog post. I've attached an example how-to: HowTo_Sales.pdf

  • Please add an intro paragraph, stating the overall goal of what we're trying to answer. (Feel free to do your own version of this issue ticket.)
  • Let's build on each section (Single- and Multi-Unit Characteristics, Condo, and Commercial). In each section, please add a sentence or two explaining what the dataset is. Then, explain the filters. Finally, explain which column(s) to sum.
  • I like the screenshots; please integrate them in each appropriate section.

output.xlsx

Great start! I just want some QC, I think; this is the first time we've tried to use the data this way.

  • Can you do some manual checking to confirm the multi tab values? Specifically, are there any multifamily properties with data missing from this column? (You might group unit count by model group or property use.)
  • Do we have this data for every township? (Group by township.)
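
One possible way to run these QC checks outside of Excel is sketched below. It groups rows by any column (modelgroup, property use, or township) and reports, per group, the row count, how many rows are missing the unit column, and the unit total. The `township` key and the sample values are hypothetical.

```python
from collections import defaultdict


def summarize_by(rows, group_column, unit_column="apt"):
    """Per group: row count, rows missing the unit column, and unit total."""
    summary = defaultdict(lambda: {"rows": 0, "missing": 0, "units": 0.0})
    for row in rows:
        group = summary[row.get(group_column) or "(blank)"]
        group["rows"] += 1
        value = row.get(unit_column)
        if value is None or str(value).strip() == "":
            group["missing"] += 1
        else:
            group["units"] += float(value)
    return dict(summary)


# Hypothetical sample rows, not real properties.
multi = [
    {"township": "Township A", "modelgroup": "Multifamily", "apt": 10},
    {"township": "Township A", "modelgroup": "Multifamily", "apt": None},
    {"township": "Township B", "modelgroup": "Multifamily", "apt": 24},
    {"township": None, "modelgroup": "Multifamily", "apt": ""},
]
by_township = summarize_by(multi, "township")
for name, stats in sorted(by_township.items()):
    print(name, stats)
```

A township that is absent from the result, or a group with a high `missing` count, is a candidate for manual checking.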

general:

I'm curious to see whether these numbers match up to other sources of housing counts, like the US Census. Can you find a few external housing counts, and a description of their methods, so we might compare our totals?

I've requested edit access so I can give more specific feedback.

@yufeinancyliu

yufeinancyliu commented Aug 7, 2024

Here is the formatted draft of the instruction file. Please tell me if you have any suggestions. :) @ccao-jardine @dfsnow
instruction_housing_unit0807.pdf

@yufeinancyliu

The updated (considering apt, sum of unit columns and tot_units) instructions.
The recommendation.
The updated output table.
Please tell me if you have any suggestions :) @dfsnow @ccao-jardine

@ccao-jardine
Member

Thanks, the instructions are looking very clear -- nice work! I've requested edit access to the google doc to make small wording revisions.

One thing we should dig into is the instructions for section 3, Large multi-unit properties. All steps say we should first filter the modelgroup to "Multifamily." But one hypothesis is that there might be other kinds of housing units with different modelgroup names. If so, this filter might be excluding some housing units in different model groups.

To test this hypothesis, I looked at sum(apt), sum(tot_units), and sum(2brunits) grouped by modelgroup, without filtering modelgroup. (I chose 2brunits at random from the studiounits, 1brunits, etc.) By doing this, I found a couple of types of multi-family housing with tot_units > 0 that don't have modelgroup = "Multifamily." Specifically:

  • It looks like there may be some other modelgroup names to consider! modelgroup names like "NursingHomes" and "T28-SpecialNursing" have tot_units > 0. Nursing homes and skilled nursing facilities provide specialized housing to the aging population. But I'm actually not sure whether these are typically counted as "housing units" by others, such as the US census.
  • modelgroup "Affordable Housing" generally doesn't seem to have apt or tot_units values, but can have studiounits, 1brunits, 2brunits, etc. Affordable housing properties provide housing at a reduced cost...but again, I'm actually not sure whether these are typically counted as "housing units" by the US census!
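
The check described above (summing apt, tot_units, and 2brunits by modelgroup, with no modelgroup filter) could be sketched in Python roughly as follows. It assumes exported rows are dictionaries keyed by column name; the sample rows and unit values are made up for illustration.

```python
from collections import defaultdict

UNIT_COLUMNS = ("apt", "tot_units", "2brunits")


def sum_units_by_modelgroup(rows, columns=UNIT_COLUMNS):
    """Sum each unit column per modelgroup, skipping blank cells."""
    totals = defaultdict(lambda: dict.fromkeys(columns, 0.0))
    for row in rows:
        group_totals = totals[row.get("modelgroup") or "(blank)"]
        for column in columns:
            value = row.get(column)
            if value not in (None, ""):
                group_totals[column] += float(value)
    return dict(totals)


def groups_with_units(totals, exclude="Multifamily"):
    """List modelgroups, other than `exclude`, with any unit total > 0."""
    return sorted(
        group
        for group, sums in totals.items()
        if group != exclude and any(v > 0 for v in sums.values())
    )


# Hypothetical sample rows, not real properties.
commercial_rows = [
    {"modelgroup": "Multifamily", "apt": 10, "tot_units": 10},
    {"modelgroup": "NursingHomes", "tot_units": 80},
    {"modelgroup": "Affordable Housing", "2brunits": 6},
    {"modelgroup": "GasStation"},
]
totals = sum_units_by_modelgroup(commercial_rows)
print(groups_with_units(totals))
```

Any modelgroup this returns is one the "Multifamily" filter would silently drop, so it deserves a decision about inclusion.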

For completeness, I think we should show how to count these, but let users make a decision whether to include nursing homes and affordable housing. Let's do the following steps:

  1. Step 1: add two more sections to the instructions (Nursing Homes and Affordable Housing) to count these, where you apply the right modelgroup filter and then sum the appropriate columns. Please also add these as two more tabs to the google sheet.
  2. Step 2: one question is whether nursing homes and affordable housing qualify as "housing units" in the way these are typically measured! Please spend 1-2 hours researching whether other agencies that count housing units (Census Bureau, American Community Survey, FRED aka Federal Reserve Bank of St. Louis, and Statista) each count nursing homes and affordable housing as housing units. Please add a table to the instructions doc that looks like the below to summarize your findings. Note that the below is just a demo:

| Data source | Nursing Homes counted as housing units? | Affordable Housing counted as housing units? | Sources |
| --- | --- | --- | --- |
| Census Bureau | no, doesn't count | not sure | http://... |
| FRED | no, doesn't count | yes, this counts | http:// |

  3. Step 3: After completing step 2, we'll have an idea of whether these other agencies do or don't count Nursing Homes and Affordable Housing in their housing count totals. In your instruction doc for the "Total Count of Housing Units in Cook County with Public Data" section, let's give two examples! First, we can keep your existing example to exclude Nursing Homes and Affordable Housing. Then, please add another example to add in Nursing Homes and Affordable Housing.
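
To illustrate the two examples in step 3, here is a small sketch that reports both totals from per-section counts. The dictionary keys and the numbers are placeholders chosen for the example, not results from the data.

```python
# Sections that are always counted.
CORE = ("single_and_multi_unit", "condo", "multifamily")
# Sections the reader may choose to include.
OPTIONAL = ("nursing_homes", "affordable_housing")


def housing_totals(counts):
    """Return totals excluding and including the optional sections."""
    excluding = sum(counts[key] for key in CORE)
    including = excluding + sum(counts.get(key, 0) for key in OPTIONAL)
    return {"excluding_optional": excluding, "including_optional": including}


# Placeholder counts for illustration only.
example_counts = {
    "single_and_multi_unit": 100,
    "condo": 50,
    "multifamily": 30,
    "nursing_homes": 5,
    "affordable_housing": 7,
}
print(housing_totals(example_counts))
```

Reporting both totals side by side leaves the inclusion decision to the reader, as suggested above.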

Thoughts?

@yufeinancyliu

I found housing unit count data from the [Census Bureau](https://www.census.gov/) (file: https://www2.census.gov/programs-surveys/popest/tables/2020-2023/housing/totals/CO-EST2023-HU-17.xlsx). The count for 2023 is 2,280,981, calculated by survey and sampling, which is higher than what we got.

@yufeinancyliu

I completed the tasks above, and the links are the same as in my previous comment.

After reading through the survey instructions for the Census Bureau's AHS, I found that they consider nursing homes, and I saw no sign that they exclude affordable housing. FRED uses data from the Census Bureau. The American Community Survey shows no sign of excluding either category. For Statista, I would need to pay to see the data source.

Please tell me if you have any suggestions! Thanks.

@ccao-jardine
Member

Great, thanks! I made some minor suggested text modifications in the google doc, such as some minor word/organization fixes, and adding links to the google sheet. When you have time, for each suggestion, please accept the suggestions, or revise them if I've introduced any errors. And definitely be sure to add authorship information to the top of the document too so that you receive appropriate credit.

Then that's a wrap on this issue!

@yufeinancyliu

Thank you! I updated the files with your suggestions. Do I need to open the google sheet accessibility? I noticed that you added the link in the instruction document.

@ccao-jardine
Member

Do I need to open the google sheet accessibility?

Oh, good catch 😅 Yes please!
