Skip to content

LIVES Flattened Schema

Malinda Curtis edited this page Feb 24, 2016 · 6 revisions

##Simplified LIVES Schema This document represents a simplified and optional alternative to the original multi-table LIVES schema. It is backwards-compatible with the original LIVES schema for restaurant inspection data.

Why an additional schema?

In many cases, it may be simpler to provide a single data file for LIVES instead of the more complicated multi-file zip bundle specified by the original specification. While the multi-table schema more closely replicates the normalized database tables that would represent the data, it is:

  • More difficult for non-experts to analyze using tools like Excel, since it requires doing joins or lookups between different tables to relate the data
  • More difficult for developers to make use of via download, since they must either load the data into a database to perform joins on, or perform them in memory
  • More difficult for developers to use via an API, since it requires multiple API calls to simulate joins
  • More difficult to maintain, since it requires the creation of multiple data files and bundling them together into a zip file

####Links

####Differences from the original LIVES Schema

  • The three key data files - Businesses, Inspections, and Violations, are combined into one dataset
  • Dates are formatted as ISO8601 dates, instead of YYYYMMDD dates
  • Field names have been prepended with their original table name. For example “name” from the “Businesses” table is now “business_name”
  • The “business_id” fields from the Inspections and Violations tables are repetitive and removed (The “Score Legend” table has been omitted for simplicity. Instead of providing a legend, specify the result via the “inspection_result” field
  • Scores can now fall into one of three scoring methods - “graded”, “cumulative”, or “unscored”

####Important Notes

  • All of the columns associated with the listing of violations are optional. If a data provider does not collect or does not wish to provide details of violations, those columns can be blank or omitted
    • If the data provider includes violations then the dataset will have one record for each violation and the associated inspection and business information will be duplicated for each violation, but can be differentiated by their business_ids and inspection_ids. Any inspection that resulted in no violations can be represented as a single row for the inspection with blank data in the violations fields.
    • If violations are not included then the dataset will have one record for each inspection and the associated business information will be duplicated for each inspection
  • Additional fields beyond those required by the schema may be included in your data file, but they must not override field names that are part of the schema.

####Feed Information Along with your data file, the following metadata must be provided. It should be provided as a single line CSV as long as it has these fields. This is the same as the “Feed Information” file from the original spec, minus the feed_date, and with some additional metadata:

FieldName DataType Required Description
feed_version string Yes Version
municipality_name string Yes Name of the municipality providing this feed.
Examples: ‘San Francisco’ or ‘Multnomah County’
municipality_url string Yes URL of the publishing municipality’s website.
contact_email string Yes Email address to contact regarding invalid data in this feed.
scoring_method string Yes The scoring method used by this city. Can be one of the following:
  • graded - Within a given range, like 0-5, 0-100, with the highest value being a perfect score
  • cumulative - Points from violations are summed, and a score of zero is a perfect score
  • unscored - No score is published, simply the results of the inspection

####Data File Schema

FieldName DataType Required Description
business_id string Yes Unique identifier for the business. For many cities, this may be the license number.
business_name string Yes Common name of the business.
business_address string Yes Street address of the business.
Example: 706 Mission St.
business_city string Yes City of the business. This field must be included if the file contains businesses from multiple cities.
business_state string Yes State or province for the business. In the U.S. this should be the two-letter code for the state.
business_postal_code string No Zip code or other postal code.
business_latitude number No Latitude of the business. This field must be a valid WGS 84 latitude.
Example: 37.7859547
business_longitude number No Longitude of the business. This field must be a valid WGS 84 longitude.
Example: -122.4024658
business_location Point or Location No If you wish to have geospatial API capabilities, include this as a Point datatype column.
Example: POINT(-122.4024658 37.7859547)

If you wish to have geospatial API capabilities and you need to geocode addresses you should include this as a Location datatype column.
Examples:
(37.7859547, -122.4024658)
600 Fourth Ave,
Seattle, WA 98104
business_phone_number string No Phone number for a business including country specific dialing information.
Example: +14159083801
inspection_id string Yes A unique identifier for a given inspection
inspection_date date Yes Date of the inspection in YYYY-MM-DD format.
Example: 2015-08-22
inspection_score number No Calculated inspection score, may be either graded (0-5, 0-100), or cumulative, and this should be defined in your feed metadata.
inspection_result string No* For jurisdictions that do not capture a score, this string represents the non-numeric result of the inspection, for example “Pass” or “Fail”. The original LIVES standard requires this field to contain 4 characters or fewer. For broader use of LIVES data, we suggest shortened terms for this field.
*If inspections are unscored, this value must be provided.
inspection_description string No Single line description containing details on the outcome of an inspection.
inspection_type string No String representing the type of inspection.
Must be one of:
  • initial
  • routine
  • follow-up
  • complaint
violation_id string No A unique identifier for a given violation
violation_description string No* One line description of the violation.
*If violation data is provided then this field is required
violation_code string No Code for the violation. It is recommended that this be based on the FDA Food Code. However, municipalities can decide to use pre-existing codes for this field.
violation_critical boolean No Describes whether the violation is critical (i.e., if it would cause the restaurant to fail their inspection).
Must be one of:
  • true
  • false
Clone this wiki locally