transitland extract

Extract a subset of a GTFS feed

Synopsis

Extract a subset of a GTFS feed

The extract command extends the basic copy command with additional options and transformations. It can pull out a single route or trip, interpolate stop times, or override a single value on an entity, among other things. It is kept as a separate command so that the basic copy command stays simple while extract can gain features over time.

transitland extract [flags] <reader> <writer>

Examples


# Extract a single trip from the BART GTFS, and rename the agency to "test".
% transitland extract --extract-trip "3050453" --set "agency.txt,BART,agency_id,test" "https://www.bart.gov/dev/schedules/google_transit.zip" output2.zip

# Note renamed agency
% unzip -p output2.zip agency.txt
agency_id,agency_name,agency_url,agency_timezone,agency_lang,agency_phone,agency_fare_url,agency_email
test,Bay Area Rapid Transit,https://www.bart.gov/,America/Los_Angeles,,510-464-6000,,

# Only entities related to the specified trip are included in the output.
% unzip -p output2.zip trips.txt
route_id,service_id,trip_id,trip_headsign,trip_short_name,direction_id,block_id,shape_id,wheelchair_accessible,bikes_allowed
1,2020_09_14-DX-MVS-Weekday-15,3050453,San Francisco International Airport,,1,,01_shp,0,0

% unzip -p output2.zip routes.txt
route_id,agency_id,route_short_name,route_long_name,route_desc,route_type,route_url,route_color,route_text_color,route_sort_order
1,test,YL-S,Antioch to SFIA/Millbrae,,1,http://www.bart.gov/schedules/bylineresults?route=1,FFFF33,,0

% unzip -p output2.zip stop_times.txt
trip_id,arrival_time,departure_time,stop_id,stop_sequence,stop_headsign,pickup_type,drop_off_type,shape_dist_traveled,timepoint
3050453,04:53:00,04:53:00,CONC,0,,0,0,0.00000,0
3050453,04:58:00,04:58:00,PHIL,2,,0,0,4.06000,0
3050453,05:01:00,05:02:00,WCRK,3,,0,0,5.77000,0
3050453,05:06:00,05:07:00,LAFY,4,,0,0,9.23000,0
3050453,05:11:00,05:12:00,ORIN,5,,0,0,12.99000,0
3050453,05:17:00,05:18:00,ROCK,6,,0,0,17.38000,0
...
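
The `--set` value in the example above follows the `filename,id,key,value` format listed under Options. As a minimal sketch, the helper below assembles that value from its four parts and prints the resulting command instead of running it. The function name `set_arg` is hypothetical and not part of transitland, and the sketch assumes none of the four fields contains a comma.

```shell
#!/bin/sh
# Hypothetical helper (not part of transitland): joins the four fields of a
# --set value in the documented order: filename,id,key,value.
# Assumes no field contains a comma.
set_arg() {
    printf '%s,%s,%s,%s\n' "$1" "$2" "$3" "$4"
}

# Same override as the example above: change agency_id of agency BART to "test".
SET_VALUE=$(set_arg agency.txt BART agency_id test)

# Print the command rather than executing it, so the sketch runs
# without the transitland binary or network access.
echo "transitland extract --extract-trip \"3050453\" --set \"$SET_VALUE\" <reader> <writer>"
```

Since `--set` is a `stringArray` flag, it can presumably be repeated, with one value built this way per override.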

Options

      --allow-entity-errors              Allow entities with errors to be copied
      --allow-reference-errors           Allow entities with reference errors to be copied
      --bbox string                      Extract bbox as (min lon, min lat, max lon, max lat), e.g. -122.276,37.794,-122.259,37.834
      --create                           Create a basic database schema if none exists
      --create-missing-shapes            Create missing Shapes from Trip stop-to-stop geometries
      --deduplicate-stop-times           Deduplicate StopTimes using Journey Patterns
      --error-limit int                  Max number of detailed errors per error group (default 10)
      --exclude-agency stringArray       Exclude Agency
      --exclude-calendar stringArray     Exclude Calendar
      --exclude-route stringArray        Exclude Route
      --exclude-route-type stringArray   Exclude Routes matching route_type
      --exclude-stop stringArray         Exclude Stop
      --exclude-trip stringArray         Exclude Trip
      --ext stringArray                  Include GTFS Extension
      --extract-agency stringArray       Extract Agency
      --extract-calendar stringArray     Extract Calendar
      --extract-route stringArray        Extract Route
      --extract-route-type stringArray   Extract Routes matching route_type
      --extract-stop stringArray         Extract Stop
      --extract-trip stringArray         Extract Trip
      --fvid int                         Specify FeedVersionID when writing to a database
  -h, --help                             help for extract
      --interpolate-stop-times           Interpolate missing StopTime arrival/departure values
      --normalize-service-ids            Create any missing Calendar entities for CalendarDate service_id's
      --normalize-timezones              Normalize timezones and apply default stop timezones based on agency and parent stops
      --prefix string                    Prefix entities in this feed
      --set stringArray                  Set values on output; format is filename,id,key,value
      --simplify-calendars               Attempt to simplify CalendarDates into regular Calendars
      --simplify-shapes float            Simplify shapes with this tolerance (ex. 0.000005)
      --use-basic-route-types            Collapse extended route_type's into basic GTFS values
      --write-extra-columns              Include extra columns in output
      --write-extra-files                Copy additional files found in source to destination
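
The `--bbox` value is order-sensitive: min lon, min lat, max lon, max lat. The following sketch builds that string and rejects swapped corners before anything is passed to the command. The helper `bbox_arg` and the file names `input.zip` and `output.zip` are hypothetical; the sanity check is an assumption made here for illustration, not something transitland is documented to do.

```shell
#!/bin/sh
# Hypothetical helper (not part of transitland): formats a --bbox value in the
# documented order (min lon, min lat, max lon, max lat).
# Returns non-zero and prints nothing if a minimum is not below its maximum.
bbox_arg() {
    awk -v minlon="$1" -v minlat="$2" -v maxlon="$3" -v maxlat="$4" 'BEGIN {
        if (minlon + 0 >= maxlon + 0 || minlat + 0 >= maxlat + 0) exit 1
        printf "%s,%s,%s,%s\n", minlon, minlat, maxlon, maxlat
    }'
}

# Coordinates taken from the --bbox example in the option description.
if BBOX=$(bbox_arg -122.276 37.794 -122.259 37.834); then
    # Printed, not executed; input.zip and output.zip are placeholder names.
    echo "transitland extract --bbox=$BBOX input.zip output.zip"
else
    echo "invalid bounding box" >&2
fi
```

The `--bbox=value` form is used so the leading minus sign of a negative longitude cannot be mistaken for the start of another flag.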

SEE ALSO

Auto generated by spf13/cobra on 13-Dec-2024