Meta Data Name: Yu Group at UC Berkeley
Last Modified: 02/28/2021
Author: Laura Chen
- Berkeley Severity Forecasting berkeley_predictions.csv
The Yu group at UC Berkeley Statistics and EECS has compiled, cleaned and documented a large corpus of hospital- and county-level data from a variety of public sources to aid data science efforts to combat COVID-19.
Data are taken directly from the output of the Yu Group at UC Berkeley's model.
berkeley_predictions.csv
Variable | Variable ID in .csv | Description |
---|---|---|
FIPS (Join Column) | fips | County geographic identifier used to join to geospatial data |
COVID Hospital Severity Index | severity_index | A scale of 1-3 indicating the current severity of COVID-19 in each county. |
Projected Deaths (5 day forecast) | deaths_YYYY_MM_DD (time-series) | The number of deaths estimated to occur in the next five days. |
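
The wide `deaths_YYYY_MM_DD` columns can be awkward to work with directly. The following is a minimal sketch, not the Yu Group's own code, of reshaping the file into one record per county and forecast date using only the Python standard library. It assumes the column names match the pattern in the table exactly, that `severity_index` holds whole numbers, and that `fips` may need zero-padding to the standard five characters if leading zeros were lost. The sample rows are made up for illustration and are not real model output.

```python
import csv
import io
import re
from datetime import date

# Pattern for the time-series columns described in the table: deaths_YYYY_MM_DD.
DEATHS_COLUMN = re.compile(r"^deaths_(\d{4})_(\d{2})_(\d{2})$")


def normalize_fips(raw):
    """Return a 5-character, zero-padded county FIPS string."""
    return str(raw).strip().zfill(5)


def read_predictions(handle):
    """Reshape wide rows into one record per (county, forecast date)."""
    records = []
    for row in csv.DictReader(handle):
        fips = normalize_fips(row["fips"])
        # Assumption: the 1-3 index is stored as a whole number.
        severity = int(float(row["severity_index"]))
        for column, value in row.items():
            match = DEATHS_COLUMN.match(column or "")
            if match is None or not value:
                continue  # not a deaths column, or no forecast for that date
            year, month, day = (int(part) for part in match.groups())
            records.append({
                "fips": fips,
                "severity_index": severity,
                "date": date(year, month, day),
                "projected_deaths": float(value),
            })
    return records


# Made-up sample rows, for illustration only (not real model output).
SAMPLE = """fips,severity_index,deaths_2021_03_01,deaths_2021_03_02
6001,2,10.5,11.0
36061,3,40.0,
"""

records = read_predictions(io.StringIO(SAMPLE))
```

To read the actual file, pass an open file handle instead of the in-memory sample, for example `open("berkeley_predictions.csv", newline="")`.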
At the hospital level, the data includes the location of the hospital, the number of ICU beds, the total number of employees, and the hospital type. At the county level, the data includes COVID-19 cases/deaths from USA Facts and NYT, automatically updated every day, along with demographic information, health resource availability, COVID-19 health risk factors, and social mobility information.
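
Because `fips` is the join column, county-level attributes can be attached to the predictions by matching on it. The sketch below shows one way to do a left join in plain Python. The function name `left_join_on_fips`, the `icu_beds` field, and the sample rows are hypothetical, and the county table is assumed to carry a `fips` key of its own, which may not match the column names in the underlying datasets.

```python
def left_join_on_fips(predictions, county_rows):
    """Attach county attributes to each prediction row.

    Rows with no matching county record keep county=None (left join).
    """
    # Index county records by zero-padded FIPS so 6001 and "06001" match.
    by_fips = {str(row["fips"]).zfill(5): row for row in county_rows}
    joined = []
    for pred in predictions:
        key = str(pred["fips"]).zfill(5)
        merged = dict(pred)
        merged["fips"] = key
        merged["county"] = by_fips.get(key)  # None when nothing matches
        joined.append(merged)
    return joined


# Hypothetical rows, for illustration only.
predictions = [
    {"fips": 6001, "severity_index": 2},
    {"fips": "36061", "severity_index": 3},
]
county_rows = [{"fips": "06001", "icu_beds": 100}]

joined = left_join_on_fips(predictions, county_rows)
```

Normalizing both sides to the same zero-padded string before matching avoids silent mismatches when one source stores FIPS codes as integers.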
See the Yu-Group/covid19-severity-prediction repository for more information.
- Hospital Level Data
- cms_cmi: Case Mix Index for hospitals from CMS
- cms_hospitalpayment: Teaching Hospital info from CMS
- DH_hospital: US Hospital info from Definitive Healthcare
- dhhs_hospitalcapacity: US hospital capacity data, updated weekly, from US DHHS
- hifld_hospital: Hospital info from Homeland Infrastructure Foundation-Level Data (HIFLD)
- Nursing Homes Level Data
- nyt_nursinghomes: number of COVID-19-related cases and deaths from nursing homes, as reported by NYT
- hifld_nursinghomes: database of nursing homes/assisted living facilities, populated via open source authoritative sources
- County Level Data
- COVID-19 Cases/Deaths Data
- nytimes_infections: COVID-19-related death/case counts per day per county from NYT
- usafacts_infections: COVID-19-related death/case counts per day per county from USA Facts
- ccd_daily: COVID-19-related deaths, cases, hospitalizations, and testing statistics
- Demographics and Health Resource Availability
- ahrf_health: contains county-level information on health facilities, health professions, measures of resource scarcity, health status, economic activity, health training programs, and socioeconomic and environmental characteristics from Area Health Resources Files
- cdc_svi: Social Vulnerability Index for counties from CDC
- hpsa_shortage: information on areas with shortages of primary care, as designated by the Health Resources & Services Administration (HRSA)
- khn_icu: information on number of ICU beds and hospitals per county from Kaiser Health News
- usda_poverty: county-level poverty estimates from the United States Department of Agriculture, Economic Research Service
- Health Risk Factors
- chrr_health: contains estimates of various health outcomes and health behaviors (e.g., percentage of adult smokers) for each county from County Health Rankings & Roadmaps
- dhdsp_heart: cardiovascular disease mortality rates from CDC DHDSP
- dhdsp_stroke: stroke mortality rates from CDC DHDSP
- ihme_respiratory: chronic respiratory disease mortality rates from IHME
- medicare_chronic: Medicare claims data for 21 chronic conditions
- nchs_mortality: overall mortality rates for each county from National Center for Health Statistics
- usdss_diabetes: diagnosed diabetes in each county from CDC USDSS
- kinsa_ili: measures of anomalous influenza-like illness incidence (ILI) outbreaks in real-time using Kinsa’s county-level illness signals, developed from real-time geospatial thermometer data (private data)
- cmu_covidcast: epidemiological data from the CMU Delphi COVIDcast, which includes data on COVID-like symptoms from Facebook surveys, estimated COVID-related doctor visits and hospital admissions, and other indicators
- Social Distancing and Mobility/Miscellaneous
- nytimes_masks: mask-wearing survey data from NYT and Dynata
- google_mobility: community mobility reports from Google
- apple_mobility: mobility trends from Apple maps direction requests
- unacast_mobility: county-level estimates of the change in mobility from pre-COVID-19 baseline from Unacast (private data)
- streetlight_vmt: estimates of total vehicle miles travelled (VMT) by residents of each county, each day; provided by Streetlight Data (private data)
- safegraph_socialdistancing: aggregated daily views of USA foot-traffic summarizing movement between counties from SafeGraph (private data)
- safegraph_weeklypatterns: place foot-traffic and demographic aggregations that answer: how often people visit, where they came from, where else they go, and more; from SafeGraph (private data)
- jhu_interventions: contains the dates that counties (or states governing them) took measures to mitigate the spread by restricting gatherings (e.g., travel bans, stay at home orders)
- mit_voting: county-level returns for presidential elections from 2000 to 2016 according to official state election data records
- Miscellaneous Data
- bts_airtravel: survey data including origin, destination, and itinerary details from a 10% sample of airline tickets from the Bureau of Transportation Statistics
- fb_socialconnectedness: an anonymized snapshot of all active Facebook users and their friendship networks as a measure of social connectedness between two different places
No limitations to report.
n/a