data-analyses

Place for sharing quick reports, and works in progress

HQ Transit Corridors/Major Transit Stops Technical Notes

Parameter Definitions

Mostly located in update_vars.py.

Script Sequence and Notes

Defined via Makefile

create_aggregate_stop_frequencies.py

This script finds similar (collinear) routes.

Initial steps

Get frequencies at each stop for the defined peak periods, in two versions: one considering only the single most frequent route per stop, and another considering all routes per stop.

Find stops in the multi-route version that qualify at a higher threshold than they do in the single-route version. These are the stops (and thus routes and feeds) that we want to check for collinearity.
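As a rough illustration of the comparison described above, the sketch below uses plain dictionaries rather than the project's actual data structures. The tier labels, the threshold values, and the helper names `qualifying_tier` and `stops_to_check` are all hypothetical, chosen only for this example; the real parameters are mostly in update_vars.py.

```python
# Hypothetical tiers: (label, minimum peak trips per hour), ordered low to high.
# These numbers are assumptions for illustration, not the project's values.
TIERS = [("lower", 3), ("higher", 4)]


def qualifying_tier(trips_per_hour):
    """Return the index of the highest tier met, or -1 if no tier is met."""
    met = -1
    for i, (_, minimum) in enumerate(TIERS):
        if trips_per_hour >= minimum:
            met = i
    return met


def stops_to_check(single_route_freq, multi_route_freq):
    """Stops that reach a higher tier with all routes than with one route."""
    return sorted(
        stop
        for stop, multi in multi_route_freq.items()
        if qualifying_tier(multi) > qualifying_tier(single_route_freq.get(stop, 0))
    )


# Made-up stop-level frequencies (trips per hour in the peak).
single = {"A": 4, "B": 2, "C": 3, "D": 1}
multi = {"A": 6, "B": 4, "C": 4, "D": 2}

# "A" already qualifies at the top tier alone and "D" never qualifies,
# so only "B" and "C" would go on to the collinearity check.
candidates = stops_to_check(single, multi)
```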

Detailed collinearity evaluation

  1. Get a list of unique feeds where at least one route_directions pair qualifies to evaluate.
  2. Get stop_times filtered to that feed, then filter further to stops that only qualify with multiple routes and to route_directions that pair with at least one other route_direction. Do not consider pairs between a route in one direction and the same route in the opposite direction.
  3. After that filtering, check again whether the remaining stop_times still meet the minimum frequency to qualify at each stop. Exclude stops where they don’t.
  4. Then evaluate which route_directions can be aggregated at each remaining stop. From the full list of route_directions serving the stop (sorted by frequency), use list(itertools.combinations(this_stop_route_dirs, 2)) to get each unique pair of route_directions. Check each unique pair against the SHARED_STOP_THRESHOLD. If every pair meets it, keep all stop_times entries for that stop, since the different route_directions can be aggregated together there. If any pair does not, remove the least frequent route_direction and try again, until a subset passes (keep stop_times only for that subset) or until all are eliminated. Currently implemented recursively, with log output like the following:

     attempting ['103_1', '101_1', '102_1', '104_1']... subsetting...
     attempting ['103_1', '101_1', '102_1']... subsetting...
     attempting ['103_1', '101_1']... matched!
    
     attempting ['103_1', '101_0', '101_1', '103_0']... subsetting...
     attempting ['103_1', '101_0', '101_1']... subsetting...
     attempting ['103_1', '101_0']... subsetting...
     exhausted!
    
  5. With that filtered stop_times, recalculate stop-level frequencies as before. Only keep stops meeting the minimum frequency threshold for a major stop or HQ corridor.
  6. Finally, once again apply the SHARED_STOP_THRESHOLD after aggregation (by ensuring at least one route_dir at each stop has >= SHARED_STOP_THRESHOLD frequent stops). Exclude stops that don’t meet this criterion.
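The subsetting in step 4 can be sketched as below. This is a minimal illustration under stated assumptions, not the script's actual code: the threshold value, the function names, and the `shared_stop_counts` lookup (number of shared stops per pair of route_directions) are hypothetical. Pairs that must not be compared, such as the same route in opposite directions, are simply left out of the lookup here, so they count as failing.

```python
import itertools

# Hypothetical value for illustration; the real one is a project parameter.
SHARED_STOP_THRESHOLD = 8


def all_pairs_qualify(route_dirs, shared_stop_counts):
    """True if every unique pair of route_directions shares enough stops."""
    for pair in itertools.combinations(route_dirs, 2):
        # Pairs missing from the lookup are treated as sharing zero stops.
        if shared_stop_counts.get(frozenset(pair), 0) < SHARED_STOP_THRESHOLD:
            return False
    return True


def aggregable_subset(route_dirs, shared_stop_counts):
    """Return the route_directions to keep at a stop, or [] if exhausted.

    route_dirs must be sorted from most to least frequent, so dropping the
    last element removes the least frequent route_direction.
    """
    if len(route_dirs) < 2:
        return []  # nothing left to aggregate
    if all_pairs_qualify(route_dirs, shared_stop_counts):
        return list(route_dirs)
    return aggregable_subset(route_dirs[:-1], shared_stop_counts)


# Made-up shared stop counts, arranged to mirror the two log examples above.
counts = {
    frozenset(("103_1", "101_1")): 12,
    frozenset(("103_1", "102_1")): 10,
    frozenset(("101_1", "102_1")): 3,
    frozenset(("103_1", "104_1")): 2,
    frozenset(("103_1", "101_0")): 4,
}

matched = aggregable_subset(["103_1", "101_1", "102_1", "104_1"], counts)
exhausted = aggregable_subset(["103_1", "101_0", "101_1", "103_0"], counts)
```

With these made-up counts, `matched` is `['103_1', '101_1']` and `exhausted` is an empty list. Dropping only the least frequent route_direction at each step keeps the search linear in the number of route_directions, at the cost of not exploring every possible subset.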

Edge cases

AC Transit 45: Opposite directions share a same-direction loop. Solved by preventing the same route from being compared with itself in the opposite direction.

SDMTS 944/945: Shared frequent stops are few, and these routes are isolated. The topology is complex and includes a loop route: each pair of [944, 945, 945A(946)] has >= SHARED_STOP_THRESHOLD shared stops, but not actually in the same places. Solved by once again applying the SHARED_STOP_THRESHOLD after aggregation (by ensuring at least one route_dir at each stop has >= SHARED_STOP_THRESHOLD frequent stops).
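One possible reading of the post-aggregation check is sketched below: count how many qualifying stops each route_direction still serves after aggregation, then keep a stop only if at least one of its route_directions reaches the threshold. The function name, the input mapping, and the small threshold used in the example are assumptions for illustration, and the stop and route identifiers are invented.

```python
from collections import Counter


def stops_passing_recheck(stop_route_dirs, threshold):
    """Keep stops where some route_direction serves >= threshold qualifying stops.

    stop_route_dirs maps each qualifying stop to the route_directions that
    were aggregated there (a hypothetical structure for this sketch).
    """
    # Number of qualifying stops served by each route_direction.
    stops_per_route_dir = Counter(
        rd for route_dirs in stop_route_dirs.values() for rd in set(route_dirs)
    )
    return {
        stop
        for stop, route_dirs in stop_route_dirs.items()
        if any(stops_per_route_dir[rd] >= threshold for rd in route_dirs)
    }


# Invented data: R1/R2 only overlap at two isolated stops, while R3 is
# aggregated with other routes at three stops.
aggregated = {
    "s1": ["R1_0", "R2_0"],
    "s2": ["R1_0", "R2_0"],
    "s3": ["R3_0", "R4_0"],
    "s4": ["R3_0", "R4_0"],
    "s5": ["R3_0", "R5_0"],
}

# A threshold of 3 is used only to keep the example small.
kept = stops_passing_recheck(aggregated, threshold=3)
```

Here `s1` and `s2` are dropped because neither of their route_directions serves three qualifying stops, which is the kind of isolated overlap this edge case describes.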

Export

Export stop-level frequencies that combine the single-route results with the multi-route results that pass the collinearity evaluation. These are the stop-level frequencies used in subsequent steps.

sjoin_stops_to_segments.py