Skip to content

LIVES Flattened Schema

Malinda Curtis edited this page Jan 5, 2016 · 6 revisions

##Simplified LIVES Schema This document represents a simplified and optional alternative to the original multi-table LIVES schema. It is backwards-compatible with the original LIVES schema for restaurant inspection data.

Why an additional schema?

In many cases, it may be simpler to provide a single data file for LIVES instead of the more complicated multi-file zip bundle specified by the original specification. While the multi-table schema more closely replicates the normalized database tables that would represent the data, it is:

  • More difficult for non-experts to analyze using tools like Excel, since it requires doing joins or lookups between different tables to relate the data
  • More difficult for developers to make use of via download, since they must either load the data into a database to perform joins on, or perform them in memory
  • More difficult for developers to use via an API, since it requires multiple API calls to simulate joins
  • More difficult to maintain, since it requires the creation of multiple data files and bundling them together into a zip file

####Links

####Differences from the original LIVES Schema

  • The three key data files - Businesses, Inspections, and Violations, are combined into one dataset
  • Dates are formatted as ISO8601 dates, instead of YYYYMMDD dates
  • Field names have been prepended with their original table name. For example “name” from the “Businesses” table is now “business_name”
  • The “business_id” fields from the Inspections and Violations tables are repetitive and removed (The “Score Legend” table has been omitted for simplicity. Instead of providing a legend, specify the result via the “inspection_result” field
  • Scores can now fall into one of three scoring methods - “graded”, “cumulative”, or “unscored” ####Important Notes
  • All of the columns associated with the listing of violations are optional. If a data provider does not collect or does not wish to provide details of violations, those columns can be blank or omitted
    • If the data provider includes violations then the dataset will have one record for each violation and the associated inspection and business information will be duplicated for each violation, but can be differentiated by their business_ids and inspection_ids. Any inspection that resulted in no violations can be represented as a single row for the inspection with blank data in the violations fields.
    • If violations are not included then the dataset will have one record for each inspection and the associated business information will be duplicated for each inspection
  • Additional fields beyond those required by the schema may be included in your data file, but they must not override field names that are part of the schema.

####Feed Information Along with your data file, the following metadata must be provided. It should be provided as a single line CSV as long as it has these fields. This is the same as the “Feed Information” file from the original spec, minus the feed_date, and with some additional metadata:

|Field Name|Data Type|Required|Description| |:------ |:------ |:------ | |feed_version|string|Yes|Version| |municipality_name|string|Yes|Name of the municipality providing this feed. Examples: ‘San Francisco’ or ‘Multnomah County’| |municipality_url|string|Yes|URL of the publishing municipality’s website.| |contact_email|string|Yes|Email address to contact regarding invalid data in this feed.| |scoring_method|string|Yes|The scoring method used by this city. Can be one of the following: graded - Within a given range, like 0-5, 0-100, with the highest value being a perfect score cumulative - Points from violations are summed, and a score of zero is a perfect score. unscored - No score is published, simply the results of the inspection|

####Data File Schema

|Field Name|Data Type|Required|Description| |:------ |:------ |:------ | |business_id|string|Yes|Unique identifier for the business. For many cities, this may be the license number.| |business_name|string|Yes|Common name of the business.| |business_address|string|Yes|Street address of the business. Example: 706 Mission St.| |business_city|string|Yes|City of the business. This field must be included if the file contains businesses from multiple cities.| |business_state|string|Yes|State or province for the business. In the U.S. this should be the two-letter code for the state.| |business_postal_code|string|No|Zip code or other postal code.| |business_latitude|number|No|Latitude of the business. This field must be a valid WGS 84 latitude. Example: 37.7859547| |business_longitude|number|No|Longitude of the business. This field must be a valid WGS 84 longitude. Example: -122.4024658| |business_location|Location|No|If you wish to have geospatial API capabilities or wish to geocode addresses you should include this Location datatype column. Examples: (37.7859547, -122.4024658) 600 Fourth Ave, Seattle, WA 98104| |business_phone_number|string|No|Phone number for a business including country specific dialing information. Example: +14159083801| |inspection_id|string|Yes|A unique identifier for a given inspection| |inspection_date|date|Yes|Date of the inspection in YYYY-MM-DD format. Example: 2015-08-22| |inspection_score|number|No|Calculated inspection score, may be either graded (0-5, 0-100), or cumulative, and this should be defined in your feed metadata.| |inspection_result|string|No*|For jurisdictions that do not capture a score, this string represents the non-numeric result of the inspection, for example “Pass” or “Fail”. The original LIVES standard requires this field to contain 4 characters or fewer. For broader use of LIVES data, we suggest shortened terms for this field. *If inspections are unscored, this value must be provided.| |inspection_description|string|No|Single line description containing details on the outcome of an inspection.| |inspection_type|string|No|String representing the type of inspection. Must be one of: initial, routine, follow-up, complaint| |violation_id|string|No|A unique identifier for a given violation| |violation_description|string|No*|One line description of the violation. *If violation data is provided then this field is required| |violation_code|string|No|Code for the violation. It is recommended that this be based on the FDA Food Code. However, municipalities can decide to use pre-existing codes for this field.| |violation_critical|boolean|No|Describes whether the violation is critical (i.e., if it would cause the restaurant to fail their inspection). Must be one of: true, false|

Clone this wiki locally