Warehouse: Where to Begin
There is a large selection of data available in the warehouse. Consider this a short guide to the most commonly used tables in our work.
On a given day:
How many trips is a provider scheduled to run, and how many trips may a particular route make?
Realtime observations of trips to get a full picture of what occurred.
Find a trip’s start time, where it went, and which route it is associated with.
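As a rough illustration of the trip-count questions above, the sketch below tallies scheduled trips per provider and per route for a single day. The rows and field names (service_date, provider, route_id, trip_id) are hypothetical stand-ins and not the warehouse's actual schema; in practice the rows would come from a warehouse query.

```python
from collections import Counter

# Hypothetical rows standing in for scheduled trips.
# Field names are assumptions for illustration, not the warehouse schema.
scheduled_trips = [
    {"service_date": "2023-01-10", "provider": "Provider A", "route_id": "1", "trip_id": "a1"},
    {"service_date": "2023-01-10", "provider": "Provider A", "route_id": "1", "trip_id": "a2"},
    {"service_date": "2023-01-10", "provider": "Provider A", "route_id": "1X", "trip_id": "a3"},
    {"service_date": "2023-01-10", "provider": "Provider B", "route_id": "10", "trip_id": "b1"},
    {"service_date": "2023-01-11", "provider": "Provider A", "route_id": "1", "trip_id": "a4"},
]


def count_trips(rows, service_date):
    """Return (trips per provider, trips per (provider, route)) for one date."""
    on_date = [r for r in rows if r["service_date"] == service_date]
    per_provider = Counter(r["provider"] for r in on_date)
    per_route = Counter((r["provider"], r["route_id"]) for r in on_date)
    return per_provider, per_route


per_provider, per_route = count_trips(scheduled_trips, "2023-01-10")
print(per_provider)
print(per_route)
```

Keying the per-route count on the (provider, route) pair keeps routes from different providers apart even if they happen to share a route_id.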
Each shape has its own point geometry, so you can see the length and location of a route a provider can run on a given date.
An express version and the regular version of a route are considered two different shapes.
For a given day:
How many stops did a provider make? Where did they stop?
How many stops did a particular transit type (streetcar, rail, ferry…) make?
Detailed information, such as how passengers embark onto or disembark from a vehicle (e.g., at a stop or at a station).
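To make the stop-count questions above concrete, here is a minimal sketch that counts unique stops per transit type for one day. The rows and field names (provider, stop_id, route_type) are hypothetical and only illustrate the idea; the numeric codes are the basic route_type values from the GTFS Schedule reference.

```python
from collections import defaultdict

# Basic GTFS route_type codes (GTFS Schedule reference, routes.txt).
ROUTE_TYPE_LABELS = {
    0: "streetcar/light rail",
    1: "subway/metro",
    2: "rail",
    3: "bus",
    4: "ferry",
}

# Hypothetical rows: one per stop served on the day of interest.
# Field names are assumptions for illustration, not the warehouse schema.
daily_stops = [
    {"provider": "Provider A", "stop_id": "s1", "route_type": 3},
    {"provider": "Provider A", "stop_id": "s2", "route_type": 3},
    {"provider": "Provider A", "stop_id": "s3", "route_type": 0},
    {"provider": "Provider B", "stop_id": "f1", "route_type": 4},
    {"provider": "Provider B", "stop_id": "f1", "route_type": 4},  # repeated row
]


def stops_by_transit_type(rows):
    """Count unique (provider, stop) pairs for each transit type."""
    unique = defaultdict(set)
    for r in rows:
        label = ROUTE_TYPE_LABELS.get(r["route_type"], "other")
        unique[label].add((r["provider"], r["stop_id"]))
    return {label: len(stops) for label, stops in unique.items()}


stop_counts = stops_by_transit_type(daily_stops)
print(stop_counts)
```

Collecting stops in a set means a stop that appears in several rows is counted once per provider.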
Use gtfs_utils_v2.schedule_daily_feed_to_organization() to find feed names, regional feed type, and gtfs dataset key.
The name column returned from the function above refers to the name of the feed, not to a provider.
Use gtfs_utils_v2.schedule_daily_feed_to_organization() to find regional feed type, gtfs dataset key, and feed type for an organization.
View some of the data produced by the US Department of Transportation for the National Transit Database.
Information from 2018-2021 is available.
Includes information such as reporter type, organization type, website, and address.
Not every provider is required to report their data to the NTD, so this is not a comprehensive dataset.
Understand GTFS quality - how well a transit provider’s GTFS data conforms to California’s Transit Data Guidelines.
Each provider has one row per guideline check. Each row details how well a provider’s GTFS data conforms to a certain guideline (availability on website, accurate accessibility data, etc.).
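Because each provider has one row per guideline check, a per-provider summary is a simple group-and-count. The sketch below computes the share of checks each provider passes. The rows, the field names (provider, check, status), and the PASS/FAIL values are hypothetical and chosen for illustration; the actual table may encode results differently.

```python
from collections import defaultdict

# Hypothetical rows: one per provider per guideline check.
# Field names and status values are assumptions, not the warehouse schema.
guideline_checks = [
    {"provider": "Provider A", "check": "Availability on website", "status": "PASS"},
    {"provider": "Provider A", "check": "Accurate accessibility data", "status": "FAIL"},
    {"provider": "Provider B", "check": "Availability on website", "status": "PASS"},
    {"provider": "Provider B", "check": "Accurate accessibility data", "status": "PASS"},
]


def pass_rate_by_provider(rows):
    """Return the fraction of guideline checks each provider passes."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for r in rows:
        totals[r["provider"]] += 1
        if r["status"] == "PASS":
            passed[r["provider"]] += 1
    return {provider: passed[provider] / totals[provider] for provider in totals}


pass_rates = pass_rate_by_provider(guideline_checks)
print(pass_rates)
```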