Best practices: adding Dataset Publishing guidelines and Practice Recommendations for all files (#406)

* BP merge second iteration 

Incorporating the following Best Practices into the GTFS Reference document:

- Dataset Publishing & General Practice guidelines
- Practice Recommendations for all files

* BP merge second iteration update

Moving Dataset Publishing & General Practice guidelines before Field Definitions section.

Merging File Recommendations with File Requirements section.

* Update revision date in reference.md

Updated revision date to Nov 16, 2023
Sergiodero authored Nov 16, 2023
1 parent bf449a4 commit 8268169
Showing 1 changed file with 15 additions and 1 deletion.
16 changes: 15 additions & 1 deletion gtfs/spec/en/reference.md
@@ -1,6 +1,6 @@
## General Transit Feed Specification Reference

-**Revised Dec 8, 2022. See [Revision History](../../CHANGES.md) for more details.**
+**Revised Nov 16, 2023. See [Revision History](../../CHANGES.md) for more details.**

This document defines the format and structure of the files that comprise a GTFS dataset.

@@ -145,6 +145,20 @@ The following example demonstrates how a field value would appear in a comma-del
* Each line must end with a CRLF or LF linebreak character.
* Files should be encoded in UTF-8 to support all Unicode characters. Files that include the Unicode byte-order mark (BOM) character are acceptable. See [http://unicode.org/faq/utf_bom.html#BOM](http://unicode.org/faq/utf_bom.html#BOM) for more information on the BOM character and UTF-8.
* All dataset files must be zipped together. The files must reside at the root level directly, not in a subfolder.
* All customer-facing text strings (including stop names, route names, and headsigns) should use Mixed Case (not ALL CAPS), following local conventions for capitalization of place names on displays capable of displaying lower case characters (e.g. “Brighton Churchill Square”, “Villiers-sur-Marne”, “Market Street”).
* The use of abbreviations should be avoided throughout the feed for names and other text (e.g. St. for Street) unless a location is called by its abbreviated name (e.g. “JFK Airport”). Abbreviations may be problematic for accessibility by screen reader software and voice user interfaces. Consuming software can be engineered to reliably convert full words to abbreviations for display, but converting from abbreviations to full words is prone to more risk of error.
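
As an illustration of two of the requirements above (files at the root of the zip, and Mixed Case rather than ALL CAPS), the following is a minimal sketch and not part of the specification. The helper names `members_outside_root`, `is_all_caps`, and `build_demo_zip` are hypothetical, and the in-memory archive is made-up demo data. The ALL CAPS check is only a heuristic: a legitimate standalone acronym such as "JFK" would also be flagged.

```python
import io
import zipfile


def members_outside_root(archive: zipfile.ZipFile) -> list[str]:
    """Return archive members that sit in a subfolder instead of the root."""
    # Zip member names always use "/" as the separator, so any "/" in a
    # name means the entry is a directory or lives inside one.
    return [name for name in archive.namelist() if "/" in name]


def is_all_caps(text: str) -> bool:
    """Heuristic: True when every cased character in the text is uppercase."""
    return text.isupper()


def build_demo_zip() -> bytes:
    """Build a small in-memory archive (made-up data) with one misplaced file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("stops.txt", "stop_id,stop_name\n1,MARKET STREET\n")
        archive.writestr("gtfs/routes.txt", "route_id\nA\n")
    return buffer.getvalue()


if __name__ == "__main__":
    with zipfile.ZipFile(io.BytesIO(build_demo_zip())) as demo:
        print("Not at root:", members_outside_root(demo))
    for stop_name in ("MARKET STREET", "Market Street", "JFK Airport"):
        print(stop_name, "-> all caps:", is_all_caps(stop_name))
```

A check like this could run before publishing a dataset; flagged names would still need human review against local capitalization conventions.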

## Dataset Publishing & General Practices

* Datasets should be published at a public, permanent URL, including the zip file name (e.g., www.agency.org/gtfs/gtfs.zip). Ideally, the URL should be directly downloadable without requiring login to access the file, to facilitate download by consuming software applications. While it is recommended (and the most common practice) to make a GTFS dataset openly downloadable, if a data provider does need to control access to GTFS for licensing or other reasons, it is recommended to control access to the GTFS dataset using API keys, which will facilitate automatic downloads.
* GTFS data should be published in iterations so that a single file at a stable location always contains the latest official description of service for a transit agency (or agencies).
* Datasets should maintain persistent identifiers (id fields) for `stop_id`, `route_id`, and `agency_id` across data iterations whenever possible.
* One GTFS dataset should contain current and upcoming service (sometimes called a “merged” dataset). There are multiple [merge tools](https://gtfs.org/resources/gtfs/#gtfs-merge-tools) available that can be used to create a merged dataset from two different GTFS feeds.
* At any time, the published GTFS dataset should be valid for at least the next 7 days, and ideally for as long as the operator is confident that the schedule will continue to be operated.
* If possible, the GTFS dataset should cover at least the next 30 days of service.
* Old services (expired calendars) should be removed from the feed.
* If a service modification will go into effect in 7 days or fewer, this service change should be expressed through a GTFS-realtime feed (service advisories or trip updates) rather than a static GTFS dataset.
* The web-server hosting GTFS data should be configured to correctly report the file modification date (see [HTTP/1.1 - Request for Comments 2616, under Section 14.29](https://tools.ietf.org/html/rfc2616#section-14.29)).
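
The 7-day and 30-day coverage recommendations and the removal of expired calendars can be checked mechanically. The sketch below is one possible approach under simplifying assumptions, not a reference implementation: it reads only the `service_id` and `end_date` columns of `calendar.txt`, ignores `calendar_dates.txt`, and only tests whether the latest `end_date` reaches far enough, not whether service is defined on every day in between. The function names and the sample rows are hypothetical.

```python
import csv
import io
from datetime import date, datetime, timedelta


def parse_gtfs_date(value: str) -> date:
    """Parse a GTFS date in YYYYMMDD form."""
    return datetime.strptime(value, "%Y%m%d").date()


def coverage_report(calendar_csv: str, today: date) -> dict:
    """Summarize how far calendar.txt reaches and which services have expired."""
    rows = csv.DictReader(io.StringIO(calendar_csv))
    end_dates = {row["service_id"]: parse_gtfs_date(row["end_date"]) for row in rows}
    if not end_dates:
        raise ValueError("calendar.txt contains no service rows")
    last_day = max(end_dates.values())
    return {
        "covers_7_days": last_day >= today + timedelta(days=7),
        "covers_30_days": last_day >= today + timedelta(days=30),
        # Services whose end_date is already in the past are removal candidates.
        "expired_service_ids": sorted(
            sid for sid, end in end_dates.items() if end < today
        ),
    }


# Made-up sample data for illustration only.
SAMPLE_CALENDAR = (
    "service_id,start_date,end_date\n"
    "WINTER,20230101,20230331\n"
    "SPRING,20230401,20230630\n"
)

if __name__ == "__main__":
    print(coverage_report(SAMPLE_CALENDAR, today=date(2023, 6, 1)))
```

With the sample rows and a reference date of June 1, 2023, the feed reaches 7 days ahead but not 30, and `WINTER` is reported as expired.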

## Field Definitions
