Warehouse: Where to Begin#
There is a large selection of data available in the warehouse. Consider this a short guide to the most commonly used tables in our work.
Important Links#
DBT Docs Cal-ITP contains information on all the tables in the warehouse.
Example notebook uses functions in shared_utils.gtfs_utils_v2 that query some of the tables below.
Trips#
On a given day:
-
Use gtfs_utils_v2.get_trips().
Answer how many trips a provider is scheduled to run and how many trips a particular route may make.
-
Realtime observations of trips to get a full picture of what occurred.
Find a trip’s start time, where it went, and which route it is associated with.
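As a rough illustration of the trip-count questions above, the sketch below tallies scheduled trips per feed and per route from a handful of made-up rows. The field names (name, route_id, trip_id) and the rows themselves are assumptions for illustration only; check the columns that gtfs_utils_v2.get_trips() actually returns before adapting it.

```python
from collections import Counter

# Hypothetical rows standing in for one day of scheduled trips.
# Field names are illustrative, not the warehouse schema.
trips = [
    {"name": "Feed A", "route_id": "10", "trip_id": "t1"},
    {"name": "Feed A", "route_id": "10", "trip_id": "t2"},
    {"name": "Feed A", "route_id": "20", "trip_id": "t3"},
    {"name": "Feed B", "route_id": "1", "trip_id": "t4"},
]

# How many trips is each feed scheduled to run?
trips_per_feed = Counter(row["name"] for row in trips)

# How many trips does each route make? Routes are keyed by (feed, route_id)
# because route_id values are only unique within a feed.
trips_per_route = Counter((row["name"], row["route_id"]) for row in trips)

print(trips_per_feed)
print(trips_per_route)
```

With a real table the same two groupings would typically be done with a dataframe group-by; the counting logic is unchanged.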
Shapes#
-
Use gtfs_utils_v2.get_shapes().
Contains point geometry, so you can see the length and location of a route a provider can run on a given date.
Each shape has its own shape_id and shape_array_key.
An express version and the regular version of a route are considered two different shapes.
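To make the idea of a shape's length concrete, here is a minimal sketch that sums great-circle (haversine) distances along an ordered list of longitude/latitude points. It does not use the warehouse or gtfs_utils_v2; the shape IDs and coordinates are invented, and in practice a geospatial library with a projected coordinate system would usually be the better tool.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def shape_length_m(points):
    """Sum segment lengths along an ordered list of (lon, lat) points."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))


# Hypothetical shapes: a regular and an express version of one route are
# stored as two separate shapes, each with its own ID.
shapes = {
    "regular": [(-121.50, 38.58), (-121.49, 38.58), (-121.48, 38.58)],
    "express": [(-121.50, 38.58), (-121.48, 38.58)],
}

lengths = {shape_id: shape_length_m(pts) for shape_id, pts in shapes.items()}
print(lengths)
```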
Daily#
For a given day:
-
Use gtfs_utils_v2.get_stops().
Contains point geometry.
How many stops did a provider make? Where did they stop?
How many stops did a particular transit type (streetcar, rail, ferry…) make?
Find detailed information, such as how passengers embark onto or disembark from a vehicle (ex: at a stop or at a station).
-
Use gtfs_utils_v2.schedule_daily_feed_to_organization() to find feed names, regional feed type, and gtfs dataset key.
Please note, the name column returned from the function above refers to the name of the feed, not to a provider.
Use gtfs_utils_v2.schedule_daily_feed_to_organization() to find regional feed type, gtfs dataset key, and feed type for an organization.
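Because the name column identifies a feed rather than a provider, stop counts usually need a feed-to-organization lookup before they can be reported per provider. The sketch below shows that step on invented data. The field names, feed names, organization names, and the assumption of one route type per stop are all hypothetical; only the route_type codes come from the GTFS Schedule reference.

```python
from collections import Counter

# Subset of route_type codes defined in the GTFS Schedule reference.
ROUTE_TYPE_LABELS = {0: "streetcar/light rail", 2: "rail", 3: "bus", 4: "ferry"}

# Hypothetical stop rows for one day. Assumes a single route type per stop
# for simplicity; a real stop can be served by several.
stops = [
    {"name": "Feed A Schedule", "stop_id": "s1", "route_type": 3},
    {"name": "Feed A Schedule", "stop_id": "s2", "route_type": 3},
    {"name": "Feed A Schedule", "stop_id": "s3", "route_type": 0},
    {"name": "Feed B Schedule", "stop_id": "s1", "route_type": 4},
]

# Hypothetical lookup from feed name to organization, standing in for the
# kind of mapping a feed-to-organization query would provide.
feed_to_org = {
    "Feed A Schedule": "Agency A",
    "Feed B Schedule": "Agency B",
}

# How many stops did each provider make?
stops_per_org = Counter(feed_to_org[row["name"]] for row in stops)

# How many stops did each transit type make?
stops_per_mode = Counter(
    ROUTE_TYPE_LABELS.get(row["route_type"], "other") for row in stops
)

print(stops_per_org)
print(stops_per_mode)
```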
Other#
dim_annual_ntd_agency_information
View some of the data produced by the US Department of Transportation for the National Transit Database.
Information from 2018 to 2021 is available.
Includes information such as reporter type, organization type, website, and address.
Not every provider is required to report their data to the NTD, so this is not a comprehensive dataset.
fct_daily_organization_combined_guideline_checks
Understand GTFS quality - how well a transit provider’s GTFS data conforms to California’s Transit Data Guidelines.
Each provider has one row per guideline check. Each row details how well a provider’s GTFS data conforms to a certain guideline (availability on website, accurate accessibility data, etc).
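Since each provider has one row per guideline check, a provider's overall conformance can be summarized as the share of checks it passes. The sketch below illustrates that aggregation on invented rows. The field names (organization_name, check, status) and the PASS/FAIL values are assumptions, not the table's actual schema.

```python
from collections import defaultdict

# Hypothetical rows: one per provider per guideline check.
checks = [
    {"organization_name": "Agency A", "check": "availability on website", "status": "PASS"},
    {"organization_name": "Agency A", "check": "accurate accessibility data", "status": "FAIL"},
    {"organization_name": "Agency B", "check": "availability on website", "status": "PASS"},
    {"organization_name": "Agency B", "check": "accurate accessibility data", "status": "PASS"},
]

# Tally passed and total checks for each provider.
passed = defaultdict(int)
total = defaultdict(int)
for row in checks:
    total[row["organization_name"]] += 1
    if row["status"] == "PASS":
        passed[row["organization_name"]] += 1

# Share of guideline checks each provider passes, between 0 and 1.
pass_rate = {org: passed[org] / total[org] for org in total}
print(pass_rate)
```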